Sample records for accurately predicted based

  1. Accurate prediction of energy expenditure using a shoe-based activity monitor.

    PubMed

    Sazonova, Nadezhda; Browning, Raymond C; Sazonov, Edward

    2011-07-01

    The aim of this study was to develop and validate a method for predicting energy expenditure (EE) using a footwear-based system with integrated accelerometer and pressure sensors. We developed a footwear-based device with an embedded accelerometer and insole pressure sensors for the prediction of EE. The data from the device can be used to perform accurate recognition of major postures and activities and to estimate EE using the acceleration, pressure, and posture/activity classification information in a branched algorithm without the need for individual calibration. We measured EE via indirect calorimetry as 16 adults (body mass index=19-39 kg·m(-2)) performed various low- to moderate-intensity activities and compared measured versus predicted EE using several models based on the acceleration and pressure signals. Inclusion of pressure data resulted in better accuracy of EE prediction during static postures such as sitting and standing. The activity-based branched model that included predictors from accelerometer and pressure sensors (BACC-PS) achieved the lowest error (e.g., root mean squared error (RMSE)=0.69 METs) compared with the accelerometer-only-based branched model BACC (RMSE=0.77 METs) and nonbranched model (RMSE=0.94-0.99 METs). Comparison of EE prediction models using data from both legs versus models using data from a single leg indicates that only one shoe needs to be equipped with sensors. These results suggest that foot acceleration combined with insole pressure measurement, when used in an activity-specific branched model, can accurately estimate the EE associated with common daily postures and activities. The accuracy and unobtrusiveness of a footwear-based device may make it an effective physical activity monitoring tool.
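
    A minimal sketch of the branched (activity-specific) EE estimation idea described in this record: classify the posture/activity first, then apply an activity-specific linear model to the sensor features. The activity classes, thresholds and coefficients below are illustrative placeholders, not values from the study.

    ```python
    # Hedged sketch of a "branched" energy-expenditure (EE) estimator:
    # first classify the posture/activity, then apply an activity-specific
    # linear model to sensor features. Coefficients below are illustrative
    # placeholders, NOT the values fitted in the study above.

    # hypothetical per-activity models: MET = intercept + w_acc*acc + w_press*press
    BRANCH_MODELS = {
        "sit":   {"intercept": 1.0, "w_acc": 0.5, "w_press": 0.2},
        "stand": {"intercept": 1.2, "w_acc": 0.6, "w_press": 0.3},
        "walk":  {"intercept": 1.5, "w_acc": 2.0, "w_press": 0.8},
    }

    def classify_activity(acc_rms: float, pressure_ratio: float) -> str:
        """Very simple stand-in for the posture/activity classifier."""
        if acc_rms > 0.3:
            return "walk"
        return "stand" if pressure_ratio > 0.5 else "sit"

    def predict_mets(acc_rms: float, pressure_ratio: float) -> float:
        m = BRANCH_MODELS[classify_activity(acc_rms, pressure_ratio)]
        return m["intercept"] + m["w_acc"] * acc_rms + m["w_press"] * pressure_ratio

    if __name__ == "__main__":
        print(predict_mets(acc_rms=0.45, pressure_ratio=0.9))  # e.g. a walking epoch
    ```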

  2. Searching for an Accurate Marker-Based Prediction of an Individual Quantitative Trait in Molecular Plant Breeding

    PubMed Central

    Fu, Yong-Bi; Yang, Mo-Hua; Zeng, Fangqin; Biligetu, Bill

    2017-01-01

    Molecular plant breeding with the aid of molecular markers has played an important role in modern plant breeding over the last two decades. Many marker-based predictions for quantitative traits have been made to enhance parental selection, but the trait prediction accuracy remains generally low, even with the aid of dense, genome-wide SNP markers. To search for more accurate trait-specific prediction with informative SNP markers, we conducted a literature review on the prediction issues in molecular plant breeding and on the applicability of an RNA-Seq technique for developing function-associated specific trait (FAST) SNP markers. To understand whether and how FAST SNP markers could enhance trait prediction, we also performed a theoretical reasoning on the effectiveness of these markers in a trait-specific prediction, and verified the reasoning through computer simulation. To the end, the search yielded an alternative to regular genomic selection with FAST SNP markers that could be explored to achieve more accurate trait-specific prediction. Continuous search for better alternatives is encouraged to enhance marker-based predictions for an individual quantitative trait in molecular plant breeding. PMID:28729875

  3. SIFTER search: a web server for accurate phylogeny-based protein function prediction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.

    We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.

  4. SIFTER search: a web server for accurate phylogeny-based protein function prediction

    DOE PAGES

    Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.

    2015-05-15

    We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.

  5. Are EMS call volume predictions based on demand pattern analysis accurate?

    PubMed

    Brown, Lawrence H; Lerner, E Brooke; Larmon, Baxter; LeGassick, Todd; Taigman, Michael

    2007-01-01

    Most EMS systems determine the number of crews they will deploy in their communities and when those crews will be scheduled based on anticipated call volumes. Many systems use historical data to calculate their anticipated call volumes, a method of prediction known as demand pattern analysis. To evaluate the accuracy of call volume predictions calculated using demand pattern analysis. Seven EMS systems provided 73 consecutive weeks of hourly call volume data. The first 20 weeks of data were used to calculate three common demand pattern analysis constructs for call volume prediction: average peak demand (AP), smoothed average peak demand (SAP), and 90th percentile rank (90%R). The 21st week served as a buffer. Actual call volumes in the last 52 weeks were then compared to the predicted call volumes by using descriptive statistics. There were 61,152 hourly observations in the test period. All three constructs accurately predicted peaks and troughs in call volume but not exact call volume. Predictions were accurate (±1 call) 13% of the time using AP, 10% using SAP, and 19% using 90%R. Call volumes were overestimated 83% of the time using AP, 86% using SAP, and 74% using 90%R. When call volumes were overestimated, predictions exceeded actual call volume by a median (interquartile range) of 4 (2-6) calls for AP, 4 (2-6) for SAP, and 3 (2-5) for 90%R. Call volumes were underestimated 4% of the time using AP, 4% using SAP, and 7% using 90%R predictions. When call volumes were underestimated, call volumes exceeded predictions by a median (interquartile range; maximum underestimation) of 1 (1-2; 18) call for AP, 1 (1-2; 18) for SAP, and 2 (1-3; 20) for 90%R. Results did not vary between systems. Generally, demand pattern analysis estimated or overestimated call volume, making it a reasonable predictor for ambulance staffing patterns. However, it did underestimate call volume between 4% and 7% of the time. Communities need to determine if these rates of over
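
    A hedged sketch of the three demand-pattern-analysis constructs named in this record, computed per hour-of-week from historical hourly call counts. The exact definitions used in the study may differ; here AP, SAP and 90%R are approximated as the mean, a smoothed mean, and the 90th percentile of the historical volumes.

    ```python
    # Hedged sketch of demand-pattern-analysis predictions per hour-of-week.
    # Assumed (not the paper's exact) definitions: AP = mean, SAP = mean after
    # smoothing across adjacent hours, 90%R = 90th percentile.

    import numpy as np
    import pandas as pd

    def demand_pattern_predictions(df: pd.DataFrame) -> pd.DataFrame:
        """df has a DatetimeIndex and a 'calls' column of hourly call counts."""
        hist = df.copy()
        hist["hour_of_week"] = hist.index.dayofweek * 24 + hist.index.hour

        ap = hist.groupby("hour_of_week")["calls"].mean()
        p90 = hist.groupby("hour_of_week")["calls"].quantile(0.90)
        # smooth AP with a centered 3-hour rolling mean that wraps around the week
        padded = pd.concat([ap.iloc[-1:], ap, ap.iloc[:1]])
        sap = padded.rolling(3, center=True).mean().iloc[1:-1]

        return pd.DataFrame({"AP": ap, "SAP": sap.values, "90%R": p90})

    if __name__ == "__main__":
        idx = pd.date_range("2024-01-01", periods=20 * 168, freq="h")  # 20 weeks of hours
        rng = np.random.default_rng(0)
        demo = pd.DataFrame({"calls": rng.poisson(4, size=len(idx))}, index=idx)
        print(demand_pattern_predictions(demo).head())
    ```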

  6. Accurate Binding Free Energy Predictions in Fragment Optimization.

    PubMed

    Steinbrecher, Thomas B; Dahlgren, Markus; Cappel, Daniel; Lin, Teng; Wang, Lingle; Krilov, Goran; Abel, Robert; Friesner, Richard; Sherman, Woody

    2015-11-23

    Predicting protein-ligand binding free energies is a central aim of computational structure-based drug design (SBDD); improved accuracy in binding free energy predictions could significantly reduce costs and accelerate project timelines in lead discovery and optimization. The recent development and validation of advanced free energy calculation methods represents a major step toward this goal. Accurately predicting the relative binding free energy changes of modifications to ligands is especially valuable in the field of fragment-based drug design, since fragment screens tend to deliver initial hits of low binding affinity that require multiple rounds of synthesis to gain the requisite potency for a project. In this study, we show that a free energy perturbation protocol, FEP+, which was previously validated on drug-like lead compounds, is suitable for the calculation of relative binding strengths of fragment-sized compounds as well. We study several pharmaceutically relevant targets with a total of more than 90 fragments and find that the FEP+ methodology, which uses explicit solvent molecular dynamics and physics-based scoring with no parameters adjusted, can accurately predict relative fragment binding affinities. The calculations afford R² values on average greater than 0.5 compared to experimental data and RMS errors of ca. 1.1 kcal/mol overall, demonstrating significant improvements over the docking and MM-GBSA methods tested in this work and indicating that FEP+ has the requisite predictive power to impact fragment-based affinity optimization projects.

  7. Towards First Principles-Based Prediction of Highly Accurate Electrochemical Pourbaix Diagrams

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zeng, Zhenhua; Chan, Maria K. Y.; Zhao, Zhi-Jian

    2015-08-13

    Electrochemical potential/pH (Pourbaix) diagrams underpin many aqueous electrochemical processes and are central to the identification of stable phases of metals for processes ranging from electrocatalysis to corrosion. Even though standard DFT calculations are potentially powerful tools for the prediction of such diagrams, inherent errors in the description of transition metal (hydroxy)oxides, together with neglect of van der Waals interactions, have limited the reliability of such predictions for even the simplest pure metal bulk compounds, and corresponding predictions for more complex alloy or surface structures are even more challenging. In the present work, through synergistic use of a Hubbard U correction, a state-of-the-art dispersion correction, and a water-based bulk reference state for the calculations, these errors are systematically corrected. The approach describes the weak binding that occurs between hydroxyl-containing functional groups in certain compounds in Pourbaix diagrams, corrects for self-interaction errors in transition metal compounds, and reduces residual errors on oxygen atoms by preserving a consistent oxidation state between the reference state, water, and the relevant bulk phases. The strong performance is illustrated on a series of bulk transition metal (Mn, Fe, Co and Ni) hydroxides, oxyhydroxides, binary, and ternary oxides, where the corresponding thermodynamics of redox and (de)hydration are described with standard errors of 0.04 eV per (reaction) formula unit. The approach further preserves accurate descriptions of the overall thermodynamics of electrochemically-relevant bulk reactions, such as water formation, which is an essential condition for facilitating accurate analysis of reaction energies for electrochemical processes on surfaces. The overall generality and transferability of the scheme suggests that it may find useful application in the construction of a broad array of electrochemical phase diagrams, including

  8. Mental models accurately predict emotion transitions.

    PubMed

    Thornton, Mark A; Tamir, Diana I

    2017-06-06

    Successful social interactions depend on people's ability to predict others' future actions and emotions. People possess many mechanisms for perceiving others' current emotional states, but how might they use this information to predict others' future states? We hypothesized that people might capitalize on an overlooked aspect of affective experience: current emotions predict future emotions. By attending to regularities in emotion transitions, perceivers might develop accurate mental models of others' emotional dynamics. People could then use these mental models of emotion transitions to predict others' future emotions from currently observable emotions. To test this hypothesis, studies 1-3 used data from three extant experience-sampling datasets to establish the actual rates of emotional transitions. We then collected three parallel datasets in which participants rated the transition likelihoods between the same set of emotions. Participants' ratings of emotion transitions predicted others' experienced transitional likelihoods with high accuracy. Study 4 demonstrated that four conceptual dimensions of mental state representation-valence, social impact, rationality, and human mind-inform participants' mental models. Study 5 used 2 million emotion reports on the Experience Project to replicate both of these findings: again people reported accurate models of emotion transitions, and these models were informed by the same four conceptual dimensions. Importantly, neither these conceptual dimensions nor holistic similarity could fully explain participants' accuracy, suggesting that their mental models contain accurate information about emotion dynamics above and beyond what might be predicted by static emotion knowledge alone.

  9. Mental models accurately predict emotion transitions

    PubMed Central

    Thornton, Mark A.; Tamir, Diana I.

    2017-01-01

    Successful social interactions depend on people’s ability to predict others’ future actions and emotions. People possess many mechanisms for perceiving others’ current emotional states, but how might they use this information to predict others’ future states? We hypothesized that people might capitalize on an overlooked aspect of affective experience: current emotions predict future emotions. By attending to regularities in emotion transitions, perceivers might develop accurate mental models of others’ emotional dynamics. People could then use these mental models of emotion transitions to predict others’ future emotions from currently observable emotions. To test this hypothesis, studies 1–3 used data from three extant experience-sampling datasets to establish the actual rates of emotional transitions. We then collected three parallel datasets in which participants rated the transition likelihoods between the same set of emotions. Participants’ ratings of emotion transitions predicted others’ experienced transitional likelihoods with high accuracy. Study 4 demonstrated that four conceptual dimensions of mental state representation—valence, social impact, rationality, and human mind—inform participants’ mental models. Study 5 used 2 million emotion reports on the Experience Project to replicate both of these findings: again people reported accurate models of emotion transitions, and these models were informed by the same four conceptual dimensions. Importantly, neither these conceptual dimensions nor holistic similarity could fully explain participants’ accuracy, suggesting that their mental models contain accurate information about emotion dynamics above and beyond what might be predicted by static emotion knowledge alone. PMID:28533373
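
    A small illustrative sketch of the core measurement in these two records: estimate an empirical emotion-transition matrix from an experience-sampling sequence and correlate it with rated transition likelihoods. The emotion labels and data below are synthetic stand-ins, not the study's datasets.

    ```python
    # Hedged sketch: empirical emotion-transition matrix vs. rated likelihoods.
    # Labels and data are illustrative placeholders.

    import numpy as np
    from scipy.stats import pearsonr

    EMOTIONS = ["happy", "calm", "sad", "anxious"]
    IDX = {e: i for i, e in enumerate(EMOTIONS)}

    def transition_matrix(sequence):
        """Row-normalized counts of consecutive emotion reports."""
        counts = np.zeros((len(EMOTIONS), len(EMOTIONS)))
        for a, b in zip(sequence, sequence[1:]):
            counts[IDX[a], IDX[b]] += 1
        return counts / counts.sum(axis=1, keepdims=True)

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        sampled = list(rng.choice(EMOTIONS, size=500))       # stand-in for experience sampling
        experienced = transition_matrix(sampled)
        # stand-in for participants' rated transition likelihoods
        rated = np.clip(experienced + rng.normal(0, 0.05, experienced.shape), 0, 1)
        r, _ = pearsonr(experienced.ravel(), rated.ravel())
        print(f"accuracy of the 'mental model' (Pearson r): {r:.2f}")
    ```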

  10. Radiomics biomarkers for accurate tumor progression prediction of oropharyngeal cancer

    NASA Astrophysics Data System (ADS)

    Hadjiiski, Lubomir; Chan, Heang-Ping; Cha, Kenny H.; Srinivasan, Ashok; Wei, Jun; Zhou, Chuan; Prince, Mark; Papagerakis, Silvana

    2017-03-01

    Accurate tumor progression prediction for oropharyngeal cancers is crucial for identifying patients who would best be treated with optimized treatment and therefore minimize the risk of under- or over-treatment. An objective decision support system that can merge the available radiomics, histopathologic and molecular biomarkers in a predictive model based on statistical outcomes of previous cases and machine learning may assist clinicians in making more accurate assessment of oropharyngeal tumor progression. In this study, we evaluated the feasibility of developing individual and combined predictive models based on quantitative image analysis from radiomics, histopathology and molecular biomarkers for oropharyngeal tumor progression prediction. With IRB approval, 31, 84, and 127 patients with head and neck CT (CT-HN), tumor tissue microarrays (TMAs) and molecular biomarker expressions, respectively, were collected. For 8 of the patients all 3 types of biomarkers were available and they were sequestered in a test set. The CT-HN lesions were automatically segmented using our level sets based method. Morphological, texture and molecular based features were extracted from CT-HN and TMA images, and selected features were merged by a neural network. The classification accuracy was quantified using the area under the ROC curve (AUC). Test AUCs of 0.87, 0.74, and 0.71 were obtained with the individual predictive models based on radiomics, histopathologic, and molecular features, respectively. Combining the radiomics and molecular models increased the test AUC to 0.90. Combining all 3 models increased the test AUC further to 0.94. This preliminary study demonstrates that the individual domains of biomarkers are useful and the integrated multi-domain approach is most promising for tumor progression prediction.
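
    A hedged sketch of the model-combination step described in this record: per-domain classifier scores are merged by a simple logistic model (a stand-in for the paper's neural network) and compared by test AUC on synthetic data.

    ```python
    # Hedged sketch of fusing per-domain classifier scores and comparing AUCs.
    # Scores and labels are synthetic placeholders.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 200
    y = rng.integers(0, 2, n)                        # tumor progression labels (synthetic)
    # stand-ins for per-domain scores (radiomics, histopathology, molecular)
    scores = np.column_stack([y + rng.normal(0, s, n) for s in (0.8, 1.2, 1.4)])

    train, test = np.arange(0, 150), np.arange(150, n)
    for k, name in enumerate(["radiomics", "histopathology", "molecular"]):
        print(name, "AUC:", round(roc_auc_score(y[test], scores[test, k]), 2))

    fusion = LogisticRegression().fit(scores[train], y[train])
    combined = fusion.predict_proba(scores[test])[:, 1]
    print("combined AUC:", round(roc_auc_score(y[test], combined), 2))
    ```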

  11. SCPRED: accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences.

    PubMed

    Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke

    2008-05-01

    Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are

  12. SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences

    PubMed Central

    Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke

    2008-01-01

    Background Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. Results SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. Conclusion The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of

  13. Can phenological models predict tree phenology accurately under climate change conditions?

    NASA Astrophysics Data System (ADS)

    Chuine, Isabelle; Bonhomme, Marc; Legave, Jean Michel; García de Cortázar-Atauri, Inaki; Charrier, Guillaume; Lacointe, André; Améglio, Thierry

    2014-05-01

    The onset of the growing season of trees has advanced globally by 2.3 days/decade during the last 50 years because of global warming, and this trend is predicted to continue according to climate forecasts. The effect of temperature on plant phenology is however not linear, because temperature has a dual effect on bud development. On one hand, low temperatures are necessary to break bud dormancy, and on the other hand higher temperatures are necessary to promote bud cell growth afterwards. Increasing phenological changes in temperate woody species have strong impacts on the distribution and productivity of forest trees, as well as on crop cultivation areas. Accurate predictions of tree phenology are therefore a prerequisite to understand and foresee the impacts of climate change on forests and agrosystems. Different process-based models have been developed in the last two decades to predict the date of budburst or flowering of woody species. There are two main families: (1) one-phase models, which consider only the ecodormancy phase and make the assumption that endodormancy is always broken before adequate climatic conditions for cell growth occur; and (2) two-phase models, which consider both the endodormancy and ecodormancy phases and predict a date of dormancy break which varies from year to year. So far, one-phase models have been able to accurately predict tree bud break and flowering under historical climate. However, because they do not consider what happens prior to ecodormancy, and especially the possible negative effect of winter temperature warming on dormancy break, it seems unlikely that they can provide accurate predictions under future climate conditions. It is indeed well known that a lack of low temperature results in abnormal patterns of bud break and development in temperate fruit trees. Accurate modelling of the dormancy break date has thus become a major issue in phenology modelling. Two-phase phenological models predict that global warming should delay
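
    A minimal sketch of a one-phase (forcing-only) phenological model of the kind discussed in this record: daily forcing above a base temperature is accumulated from a fixed start day until a threshold F* is reached. The base temperature, start day and F* are placeholders, not fitted values.

    ```python
    # Hedged sketch of a "one-phase" budburst model: accumulate daily forcing
    # (growing-degree-days above t_base) from start_doy until F* is reached.
    # t_base, start_doy and f_star are illustrative placeholders.

    import numpy as np

    def one_phase_budburst_day(daily_mean_temp, t_base=5.0, start_doy=1, f_star=150.0):
        """Return the day of year at which accumulated forcing reaches F*, or None."""
        forcing = 0.0
        for doy, temp in enumerate(daily_mean_temp, start=1):
            if doy < start_doy:
                continue
            forcing += max(temp - t_base, 0.0)      # growing-degree-day forcing
            if forcing >= f_star:
                return doy
        return None

    if __name__ == "__main__":
        doys = np.arange(1, 366)
        temps = 10.0 + 10.0 * np.sin(2 * np.pi * (doys - 105) / 365)   # toy annual cycle
        print("predicted budburst day of year:", one_phase_budburst_day(temps))
    ```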

  14. A Weibull statistics-based lignocellulose saccharification model and a built-in parameter accurately predict lignocellulose hydrolysis performance.

    PubMed

    Wang, Mingyu; Han, Lijuan; Liu, Shasha; Zhao, Xuebing; Yang, Jinghua; Loh, Soh Kheang; Sun, Xiaomin; Zhang, Chenxi; Fang, Xu

    2015-09-01

    Renewable energy from lignocellulosic biomass has been deemed an alternative to depleting fossil fuels. In order to improve this technology, we aim to develop robust mathematical models for the enzymatic lignocellulose degradation process. By analyzing 96 groups of previously published and newly obtained lignocellulose saccharification results and fitting them to the Weibull distribution, we discovered that Weibull statistics can accurately predict lignocellulose saccharification data, regardless of the type of substrates, enzymes and saccharification conditions. A mathematical model for enzymatic lignocellulose degradation was subsequently constructed based on Weibull statistics. Further analysis of the mathematical structure of the model and experimental saccharification data showed the significance of the two parameters in this model. In particular, the λ value, defined as the characteristic time, represents the overall performance of the saccharification system. This suggestion was further supported by statistical analysis of experimental saccharification data and analysis of the glucose production levels when λ and n values change. In conclusion, the constructed Weibull statistics-based model can accurately predict lignocellulose hydrolysis behavior and we can use the λ parameter to assess the overall performance of enzymatic lignocellulose degradation. Advantages and potential applications of the model and the λ value in saccharification performance assessment were discussed. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
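
    A hedged sketch of fitting a Weibull-type saccharification curve, Y(t) = Y_inf(1 - exp(-(t/λ)^n)), and reading off λ, the characteristic time discussed in this record; the time course is synthetic and the paper's exact parameterization may differ.

    ```python
    # Hedged sketch: fit Y(t) = Y_inf * (1 - exp(-(t/lam)**n)) to a hydrolysis
    # time course and report lam, the "characteristic time". Data are synthetic.

    import numpy as np
    from scipy.optimize import curve_fit

    def weibull_yield(t, y_inf, lam, n):
        return y_inf * (1.0 - np.exp(-(t / lam) ** n))

    t = np.array([2, 4, 8, 12, 24, 48, 72], dtype=float)        # hours
    y = np.array([8, 15, 27, 35, 52, 68, 74], dtype=float)      # % glucose yield (synthetic)

    (p_yinf, p_lam, p_n), _ = curve_fit(weibull_yield, t, y, p0=[80, 24, 1.0])
    print(f"Y_inf={p_yinf:.1f}%, lambda={p_lam:.1f} h, n={p_n:.2f}")
    ```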

  15. Accurate RNA 5-methylcytosine site prediction based on heuristic physical-chemical properties reduction and classifier ensemble.

    PubMed

    Zhang, Ming; Xu, Yan; Li, Lei; Liu, Zi; Yang, Xibei; Yu, Dong-Jun

    2018-06-01

    RNA 5-methylcytosine (m5C) is an important post-transcriptional modification that plays an indispensable role in biological processes. The accurate identification of m5C sites from primary RNA sequences is especially useful for deeply understanding the mechanisms and functions of m5C. Due to the difficulty and expensive costs of identifying m5C sites with wet-lab techniques, developing fast and accurate machine-learning-based prediction methods is urgently needed. In this study, we proposed a new m5C site predictor, called M5C-HPCR, by introducing a novel heuristic nucleotide physicochemical property reduction (HPCR) algorithm and classifier ensemble. HPCR extracts multiple reducts of physical-chemical properties for encoding discriminative features, while the classifier ensemble is applied to integrate multiple base predictors, each of which is trained based on a separate reduct of the physical-chemical properties obtained from HPCR. Rigorous jackknife tests on two benchmark datasets demonstrate that M5C-HPCR outperforms state-of-the-art m5C site predictors, with the highest values of MCC (0.859) and AUC (0.962). We also implemented the webserver of M5C-HPCR, which is freely available at http://cslab.just.edu.cn:8080/M5C-HPCR/. Copyright © 2018 Elsevier Inc. All rights reserved.
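
    A hedged sketch of the ensemble idea in this record: one base predictor per feature "reduct", with averaged probabilities evaluated by MCC and AUC. The feature subsets are hand-picked stand-ins for HPCR-selected physicochemical reducts, and the data are synthetic.

    ```python
    # Hedged sketch: train one base classifier per feature "reduct" and average
    # their predicted probabilities. Feature subsets and data are placeholders.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import matthews_corrcoef, roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(600, 12))
    y = (X[:, 0] + 0.8 * X[:, 5] + rng.normal(0, 1, 600) > 0).astype(int)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    reducts = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]      # stand-in feature subsets
    probas = []
    for cols in reducts:
        clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr[:, cols], y_tr)
        probas.append(clf.predict_proba(X_te[:, cols])[:, 1])

    p_ens = np.mean(probas, axis=0)
    print("MCC:", round(matthews_corrcoef(y_te, (p_ens > 0.5).astype(int)), 3))
    print("AUC:", round(roc_auc_score(y_te, p_ens), 3))
    ```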

  16. Biomarker Surrogates Do Not Accurately Predict Sputum Eosinophils and Neutrophils in Asthma

    PubMed Central

    Hastie, Annette T.; Moore, Wendy C.; Li, Huashi; Rector, Brian M.; Ortega, Victor E.; Pascual, Rodolfo M.; Peters, Stephen P.; Meyers, Deborah A.; Bleecker, Eugene R.

    2013-01-01

    Background Sputum eosinophils (Eos) are a strong predictor of airway inflammation and exacerbations, and aid asthma management, whereas sputum neutrophils (Neu) indicate a different severe asthma phenotype, potentially less responsive to TH2-targeted therapy. Variables such as blood Eos, total IgE, fractional exhaled nitric oxide (FeNO) or FEV1% predicted may predict airway Eos, while age, FEV1% predicted, or blood Neu may predict sputum Neu. Availability and ease of measurement are useful characteristics, but accuracy in predicting airway Eos and Neu, individually or combined, is not established. Objectives To determine whether blood Eos, FeNO, and IgE accurately predict sputum eosinophils, and whether age, FEV1% predicted, and blood Neu accurately predict sputum neutrophils (Neu). Methods Subjects in the Wake Forest Severe Asthma Research Program (N=328) were characterized by blood and sputum cells, healthcare utilization, lung function, FeNO, and IgE. Multiple analytical techniques were utilized. Results Despite significant association with sputum Eos, blood Eos, FeNO and total IgE did not accurately predict sputum Eos, and combinations of these variables failed to improve prediction. Age, FEV1% predicted and blood Neu were similarly unsatisfactory for prediction of sputum Neu. Factor analysis and stepwise selection found that FeNO, IgE and FEV1% predicted, but not blood Eos, correctly predicted 69% of sputum Eos; a parallel model correctly predicted 64% of sputum Neu; and a model to predict both sputum Eos and Neu accurately assigned only 41% of samples. Conclusion Despite statistically significant associations, FeNO, IgE, blood Eos and Neu, FEV1% predicted, and age are poor surrogates, separately and combined, for accurately predicting sputum eosinophils and neutrophils. PMID:23706399

  17. Measuring the value of accurate link prediction for network seeding.

    PubMed

    Wei, Yijin; Spencer, Gwen

    2017-01-01

    The influence-maximization literature seeks small sets of individuals whose structural placement in the social network can drive large cascades of behavior. Optimization efforts to find the best seed set often assume perfect knowledge of the network topology. Unfortunately, social network links are rarely known in an exact way. When do seeding strategies based on less-than-accurate link prediction provide valuable insight? We introduce optimized-against-a-sample ([Formula: see text]) performance to measure the value of optimizing seeding based on a noisy observation of a network. Our computational study investigates [Formula: see text] under several threshold-spread models in synthetic and real-world networks. Our focus is on measuring the value of imprecise link information. The level of investment in link prediction that is strategic appears to depend closely on spread model: in some parameter ranges investments in improving link prediction can pay substantial premiums in cascade size. For other ranges, such investments would be wasted. Several trends were remarkably consistent across topologies.

  18. Basophile: Accurate Fragment Charge State Prediction Improves Peptide Identification Rates

    DOE PAGES

    Wang, Dong; Dasari, Surendra; Chambers, Matthew C.; ...

    2013-03-07

    In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly-charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model, when analyzing triply-charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.

  19. PredSTP: a highly accurate SVM based model to predict sequential cystine stabilized peptides.

    PubMed

    Islam, S M Ashiqul; Sajed, Tanvir; Kearney, Christopher Michel; Baker, Erich J

    2015-07-05

    Numerous organisms have evolved a wide range of toxic peptides for self-defense and predation. Their effective interstitial and macro-environmental use requires energetic and structural stability. One successful group of these peptides includes a tri-disulfide domain arrangement that offers toxicity and high stability. Sequential tri-disulfide connectivity variants create highly compact disulfide folds capable of withstanding a variety of environmental stresses. Their combination of toxicity and stability make these peptides remarkably valuable for their potential as bio-insecticides, antimicrobial peptides and peptide drug candidates. However, the wide sequence variation, sources and modalities of group members impose serious limitations on our ability to rapidly identify potential members. As a result, there is a need for automated high-throughput member classification approaches that leverage their demonstrated tertiary and functional homology. We developed an SVM-based model to predict sequential tri-disulfide peptide (STP) toxins from peptide sequences. One optimized model, called PredSTP, predicted STPs from training set with sensitivity, specificity, precision, accuracy and a Matthews correlation coefficient of 94.86%, 94.11%, 84.31%, 94.30% and 0.86, respectively, using 200 fold cross validation. The same model outperforms existing prediction approaches in three independent out of sample testsets derived from PDB. PredSTP can accurately identify a wide range of cystine stabilized peptide toxins directly from sequences in a species-agnostic fashion. The ability to rapidly filter sequences for potential bioactive peptides can greatly compress the time between peptide identification and testing structural and functional properties for possible antimicrobial and insecticidal candidates. A web interface is freely available to predict STP toxins from http://crick.ecs.baylor.edu/.
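
    A hedged sketch of training and cross-validating an SVM classifier and reporting the metrics quoted in this record (sensitivity, specificity, precision, accuracy, MCC); the features are random placeholders rather than a real peptide-sequence encoding.

    ```python
    # Hedged sketch: cross-validated SVM with the metrics quoted above.
    # Features and labels are synthetic, not a peptide-sequence encoding.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import confusion_matrix, matthews_corrcoef

    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 20))
    y = (X[:, 0] - X[:, 3] + rng.normal(0, 0.8, 400) > 0).astype(int)

    pred = cross_val_predict(SVC(kernel="rbf", C=1.0), X, y, cv=10)
    tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
    print("sensitivity:", round(tp / (tp + fn), 3))
    print("specificity:", round(tn / (tn + fp), 3))
    print("precision:  ", round(tp / (tp + fp), 3))
    print("accuracy:   ", round((tp + tn) / len(y), 3))
    print("MCC:        ", round(matthews_corrcoef(y, pred), 3))
    ```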

  20. ILT based defect simulation of inspection images accurately predicts mask defect printability on wafer

    NASA Astrophysics Data System (ADS)

    Deep, Prakash; Paninjath, Sankaranarayanan; Pereira, Mark; Buck, Peter

    2016-05-01

    At advanced technology nodes, mask complexity has increased because of the large-scale use of resolution enhancement technologies (RET), which include Optical Proximity Correction (OPC), Inverse Lithography Technology (ILT) and Source Mask Optimization (SMO). The number of defects detected during inspection of such masks has increased drastically, and differentiation of critical and non-critical defects is more challenging, complex and time consuming. Because of the significant defectivity of EUVL masks and the non-availability of actinic inspection, it is important and also challenging to predict the criticality of defects for printability on wafer. This is one of the significant barriers for the adoption of EUVL for semiconductor manufacturing. Techniques to decide the criticality of defects from images captured using non-actinic inspection are desired until actinic inspection becomes available. High resolution inspection of photomask images detects many defects which are used for process and mask qualification. Repairing all defects is not practical and probably not required; however, it is imperative to know which defects are severe enough to impact the wafer before repair. Additionally, a wafer printability check is always desired after repairing a defect. AIMS™ review is the industry standard for this; however, doing AIMS™ review for all defects is expensive and very time consuming. A fast, accurate and economical mechanism is desired which can predict defect printability on wafer accurately and quickly from images captured using a high resolution inspection machine. Predicting defect printability from such images is challenging due to the fact that the high resolution images do not correlate with actual mask contours. The challenge is increased due to the use of optical conditions during inspection other than the actual scanner condition, and defects found in such images do not have a correlation with the actual impact on wafer. Our automated defect simulation tool predicts

  1. Accurate Identification of Fear Facial Expressions Predicts Prosocial Behavior

    PubMed Central

    Marsh, Abigail A.; Kozak, Megan N.; Ambady, Nalini

    2009-01-01

    The fear facial expression is a distress cue that is associated with the provision of help and prosocial behavior. Prior psychiatric studies have found deficits in the recognition of this expression by individuals with antisocial tendencies. However, no prior study has shown accuracy for recognition of fear to predict actual prosocial or antisocial behavior in an experimental setting. In 3 studies, the authors tested the prediction that individuals who recognize fear more accurately will behave more prosocially. In Study 1, participants who identified fear more accurately also donated more money and time to a victim in a classic altruism paradigm. In Studies 2 and 3, participants’ ability to identify the fear expression predicted prosocial behavior in a novel task designed to control for confounding variables. In Study 3, accuracy for recognizing fear proved a better predictor of prosocial behavior than gender, mood, or scores on an empathy scale. PMID:17516803

  2. Accurate identification of fear facial expressions predicts prosocial behavior.

    PubMed

    Marsh, Abigail A; Kozak, Megan N; Ambady, Nalini

    2007-05-01

    The fear facial expression is a distress cue that is associated with the provision of help and prosocial behavior. Prior psychiatric studies have found deficits in the recognition of this expression by individuals with antisocial tendencies. However, no prior study has shown accuracy for recognition of fear to predict actual prosocial or antisocial behavior in an experimental setting. In 3 studies, the authors tested the prediction that individuals who recognize fear more accurately will behave more prosocially. In Study 1, participants who identified fear more accurately also donated more money and time to a victim in a classic altruism paradigm. In Studies 2 and 3, participants' ability to identify the fear expression predicted prosocial behavior in a novel task designed to control for confounding variables. In Study 3, accuracy for recognizing fear proved a better predictor of prosocial behavior than gender, mood, or scores on an empathy scale.

  3. Heart rate during basketball game play and volleyball drills accurately predicts oxygen uptake and energy expenditure.

    PubMed

    Scribbans, T D; Berg, K; Narazaki, K; Janssen, I; Gurd, B J

    2015-09-01

    There is currently little information regarding the ability of metabolic prediction equations to accurately predict oxygen uptake and exercise intensity from heart rate (HR) during intermittent sport. The purpose of the present study was to develop and cross-validate equations appropriate for accurately predicting oxygen cost (VO2) and energy expenditure from HR during intermittent sport participation. Eleven healthy adult males (19.9±1.1 yrs) were recruited to establish the relationship between %VO2peak and %HRmax during low-intensity steady state endurance (END), moderate-intensity interval (MOD) and high-intensity interval exercise (HI), as performed on a cycle ergometer. Three equations (END, MOD, and HI) for predicting %VO2peak based on %HRmax were developed. HR and VO2 were directly measured during basketball games (6 male, 20.8±1.0 yrs; 6 female, 20.0±1.3 yrs) and volleyball drills (12 female; 20.8±1.0 yrs). Comparisons were made between measured and predicted VO2 and energy expenditure using the 3 equations developed and 2 previously published equations. The END and MOD equations accurately predicted VO2 and energy expenditure, while the HI equation underestimated, and the previously published equations systematically overestimated, VO2 and energy expenditure. Intermittent sport VO2 and energy expenditure can be accurately predicted from heart rate data using either the END (%VO2peak=%HRmax x 1.008-17.17) or MOD (%VO2peak=%HRmax x 1.2-32) equations. These 2 simple equations provide an accessible and cost-effective method for accurate estimation of exercise intensity and energy expenditure during intermittent sport.
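
    A worked sketch applying the two equations quoted in this record. Converting %VO2peak to energy expenditure additionally assumes the individual's VO2peak and a caloric equivalent of roughly 5 kcal per litre of O2; both assumptions are flagged in the code and are not values from the study.

    ```python
    # Worked sketch of the two prediction equations quoted in the record:
    #   END: %VO2peak = %HRmax * 1.008 - 17.17
    #   MOD: %VO2peak = %HRmax * 1.2   - 32
    # Converting the result to energy expenditure requires the individual's
    # VO2peak and a caloric equivalent (~5 kcal per litre O2); both of those
    # are assumptions added here, not values from the study.

    def pct_vo2peak(pct_hrmax: float, equation: str = "END") -> float:
        if equation == "END":
            return pct_hrmax * 1.008 - 17.17
        if equation == "MOD":
            return pct_hrmax * 1.2 - 32.0
        raise ValueError("equation must be 'END' or 'MOD'")

    def energy_expenditure_kcal_per_min(pct_hrmax, vo2peak_l_min, equation="END",
                                        kcal_per_litre_o2=5.0):
        vo2 = pct_vo2peak(pct_hrmax, equation) / 100.0 * vo2peak_l_min   # L O2 / min
        return vo2 * kcal_per_litre_o2

    if __name__ == "__main__":
        # e.g. a player at 80% of HRmax with an assumed VO2peak of 3.5 L/min
        print(round(energy_expenditure_kcal_per_min(80.0, 3.5), 1), "kcal/min")
    ```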

  4. Base Rates, Contingencies, and Prediction Behavior

    ERIC Educational Resources Information Center

    Kareev, Yaakov; Fiedler, Klaus; Avrahami, Judith

    2009-01-01

    A skew in the base rate of upcoming events can often provide a better cue for accurate predictions than a contingency between signals and events. The authors study prediction behavior and test people's sensitivity to both base rate and contingency; they also examine people's ability to compare the benefits of both for prediction. They formalize…

  5. ASTRAL, DRAGON and SEDAN scores predict stroke outcome more accurately than physicians.

    PubMed

    Ntaios, G; Gioulekas, F; Papavasileiou, V; Strbian, D; Michel, P

    2016-11-01

    ASTRAL, SEDAN and DRAGON scores are three well-validated scores for stroke outcome prediction. Whether these scores predict stroke outcome more accurately compared with physicians interested in stroke was investigated. Physicians interested in stroke were invited to an online anonymous survey to provide outcome estimates in randomly allocated structured scenarios of recent real-life stroke patients. Their estimates were compared to scores' predictions in the same scenarios. An estimate was considered accurate if it was within 95% confidence intervals of actual outcome. In all, 244 participants from 32 different countries responded assessing 720 real scenarios and 2636 outcomes. The majority of physicians' estimates were inaccurate (1422/2636, 53.9%). 400 (56.8%) of physicians' estimates about the percentage probability of 3-month modified Rankin score (mRS) > 2 were accurate compared with 609 (86.5%) of ASTRAL score estimates (P < 0.0001). 394 (61.2%) of physicians' estimates about the percentage probability of post-thrombolysis symptomatic intracranial haemorrhage were accurate compared with 583 (90.5%) of SEDAN score estimates (P < 0.0001). 160 (24.8%) of physicians' estimates about post-thrombolysis 3-month percentage probability of mRS 0-2 were accurate compared with 240 (37.3%) DRAGON score estimates (P < 0.0001). 260 (40.4%) of physicians' estimates about the percentage probability of post-thrombolysis mRS 5-6 were accurate compared with 518 (80.4%) DRAGON score estimates (P < 0.0001). ASTRAL, DRAGON and SEDAN scores predict outcome of acute ischaemic stroke patients with higher accuracy compared to physicians interested in stroke. © 2016 EAN.

  6. Base pair probability estimates improve the prediction accuracy of RNA non-canonical base pairs

    PubMed Central

    2017-01-01

    Prediction of RNA tertiary structure from sequence is an important problem, but generating accurate structure models for even short sequences remains difficult. Predictions of RNA tertiary structure tend to be least accurate in loop regions, where non-canonical pairs are important for determining the details of structure. Non-canonical pairs can be predicted using a knowledge-based model of structure that scores nucleotide cyclic motifs, or NCMs. In this work, a partition function algorithm is introduced that allows the estimation of base pairing probabilities for both canonical and non-canonical interactions. Pairs that are predicted to be probable are more likely to be found in the true structure than pairs of lower probability. Pair probability estimates can be further improved by predicting the structure conserved across multiple homologous sequences using the TurboFold algorithm. These pairing probabilities, used in concert with prior knowledge of the canonical secondary structure, allow accurate inference of non-canonical pairs, an important step towards accurate prediction of the full tertiary structure. Software to predict non-canonical base pairs and pairing probabilities is now provided as part of the RNAstructure software package. PMID:29107980

  7. XenoSite: accurately predicting CYP-mediated sites of metabolism with neural networks.

    PubMed

    Zaretzki, Jed; Matlock, Matthew; Swamidass, S Joshua

    2013-12-23

    Understanding how xenobiotic molecules are metabolized is important because it influences the safety, efficacy, and dose of medicines and how they can be modified to improve these properties. The cytochrome P450s (CYPs) are proteins responsible for metabolizing 90% of drugs on the market, and many computational methods can predict which atomic sites of a molecule--sites of metabolism (SOMs)--are modified during CYP-mediated metabolism. This study improves on prior methods of predicting CYP-mediated SOMs by using new descriptors and machine learning based on neural networks. The new method, XenoSite, is faster to train and more accurate by as much as 4% or 5% for some isozymes. Furthermore, some "incorrect" predictions made by XenoSite were subsequently validated as correct predictions by revaluation of the source literature. Moreover, XenoSite output is interpretable as a probability, which reflects both the confidence of the model that a particular atom is metabolized and the statistical likelihood that its prediction for that atom is correct.

  8. Accurate prediction of secondary metabolite gene clusters in filamentous fungi.

    PubMed

    Andersen, Mikael R; Nielsen, Jakob B; Klitgaard, Andreas; Petersen, Lene M; Zachariasen, Mia; Hansen, Tilde J; Blicher, Lene H; Gotfredsen, Charlotte H; Larsen, Thomas O; Nielsen, Kristian F; Mortensen, Uffe H

    2013-01-02

    Biosynthetic pathways of secondary metabolites from fungi are currently subject to an intense effort to elucidate the genetic basis for these compounds due to their large potential within pharmaceutics and synthetic biochemistry. The preferred method is methodical gene deletions to identify supporting enzymes for key synthases one cluster at a time. In this study, we design and apply a DNA expression array for Aspergillus nidulans in combination with legacy data to form a comprehensive gene expression compendium. We apply a guilt-by-association-based analysis to predict the extent of the biosynthetic clusters for the 58 synthases active in our set of experimental conditions. A comparison with legacy data shows the method to be accurate in 13 of 16 known clusters and nearly accurate for the remaining 3 clusters. Furthermore, we apply a data clustering approach, which identifies cross-chemistry between physically separate gene clusters (superclusters), and validate this both with legacy data and experimentally by prediction and verification of a supercluster consisting of the synthase AN1242 and the prenyltransferase AN11080, as well as identification of the product compound nidulanin A. We have used A. nidulans for our method development and validation due to the wealth of available biochemical data, but the method can be applied to any fungus with a sequenced and assembled genome, thus supporting further secondary metabolite pathway elucidation in the fungal kingdom.

  9. Accurate Prediction of Contact Numbers for Multi-Spanning Helical Membrane Proteins

    PubMed Central

    Li, Bian; Mendenhall, Jeffrey; Nguyen, Elizabeth Dong; Weiner, Brian E.; Fischer, Axel W.; Meiler, Jens

    2017-01-01

    Prediction of the three-dimensional (3D) structures of proteins by computational methods is acknowledged as an unsolved problem. Accurate prediction of important structural characteristics such as contact number is expected to accelerate the otherwise slow progress being made in the prediction of 3D structure of proteins. Here, we present a dropout neural network-based method, TMH-Expo, for predicting the contact number of transmembrane helix (TMH) residues from sequence. Neuronal dropout is a strategy where certain neurons of the network are excluded from back-propagation to prevent co-adaptation of hidden-layer neurons. By using neuronal dropout, overfitting was significantly reduced and performance was noticeably improved. For multi-spanning helical membrane proteins, TMH-Expo achieved a remarkable Pearson correlation coefficient of 0.69 between predicted and experimental values and a mean absolute error of only 1.68. In addition, among those membrane protein–membrane protein interface residues, 76.8% were correctly predicted. Mapping of predicted contact numbers onto structures indicates that contact numbers predicted by TMH-Expo reflect the exposure patterns of TMHs and reveal membrane protein–membrane protein interfaces, reinforcing the potential of predicted contact numbers to be used as restraints for 3D structure prediction and protein–protein docking. TMH-Expo can be accessed via a Web server at www.meilerlab.org. PMID:26804342

  10. A Critical Review for Developing Accurate and Dynamic Predictive Models Using Machine Learning Methods in Medicine and Health Care.

    PubMed

    Alanazi, Hamdan O; Abdullah, Abdul Hanan; Qureshi, Kashif Naseer

    2017-04-01

    Recently, Artificial Intelligence (AI) has been used widely in medicine and the health care sector. In machine learning, classification or prediction is a major field of AI. Today, the study of existing predictive models based on machine learning methods is extremely active. Doctors need accurate predictions for the outcomes of their patients' diseases. In addition, for accurate predictions, timing is another significant factor that influences treatment decisions. In this paper, existing predictive models in medicine and health care have been critically reviewed. Furthermore, the most famous machine learning methods have been explained, and the confusion between a statistical approach and machine learning has been clarified. A review of related literature reveals that the predictions of existing predictive models differ even when the same dataset is used. Therefore, existing predictive models are essential, and current methods must be improved.

  11. Competitive Abilities in Experimental Microcosms Are Accurately Predicted by a Demographic Index for R*

    PubMed Central

    Murrell, Ebony G.; Juliano, Steven A.

    2012-01-01

    Resource competition theory predicts that R*, the equilibrium resource amount yielding zero growth of a consumer population, should predict species' competitive abilities for that resource. This concept has been supported for unicellular organisms, but has not been well-tested for metazoans, probably due to the difficulty of raising experimental populations to equilibrium and measuring population growth rates for species with long or complex life cycles. We developed an index (Rindex) of R* based on demography of one insect cohort, growing from egg to adult in a non-equilibrium setting, and tested whether Rindex yielded accurate predictions of competitive abilities using mosquitoes as a model system. We estimated finite rate of increase (λ′) from demographic data for cohorts of three mosquito species raised with different detritus amounts, and estimated each species' Rindex using nonlinear regressions of λ′ vs. initial detritus amount. All three species' Rindex differed significantly, and accurately predicted competitive hierarchy of the species determined in simultaneous pairwise competition experiments. Our Rindex could provide estimates and rigorous statistical comparisons of competitive ability for organisms for which typical chemostat methods and equilibrium population conditions are impractical. PMID:22970128
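
    A hedged sketch of the Rindex idea in this record: fit λ′ as a saturating function of initial detritus amount, then solve for the detritus level at which λ′ = 1 (zero cohort growth). The functional form and data points are illustrative only.

    ```python
    # Hedged sketch: fit lambda' vs. detritus with a saturating curve and solve
    # for the resource level giving lambda' = 1. Data and form are illustrative.

    import numpy as np
    from scipy.optimize import curve_fit, brentq

    def lam_model(resource, lam_max, half_sat):
        return lam_max * resource / (half_sat + resource)      # Monod-like saturation

    detritus = np.array([0.1, 0.25, 0.5, 1.0, 2.0, 4.0])       # g per container (synthetic)
    lam_prime = np.array([0.3, 0.6, 0.9, 1.3, 1.6, 1.8])       # estimated from cohort demography

    (lam_max, half_sat), _ = curve_fit(lam_model, detritus, lam_prime, p0=[2.0, 1.0])
    r_index = brentq(lambda r: lam_model(r, lam_max, half_sat) - 1.0, 1e-6, 10.0)
    print(f"Rindex ~ {r_index:.2f} g detritus (lower = better competitor)")
    ```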

  12. Fast and Accurate Prediction of Stratified Steel Temperature During Holding Period of Ladle

    NASA Astrophysics Data System (ADS)

    Deodhar, Anirudh; Singh, Umesh; Shukla, Rishabh; Gautham, B. P.; Singh, Amarendra K.

    2017-04-01

    Thermal stratification of liquid steel in a ladle during the holding period and the teeming operation has a direct bearing on the superheat available at the caster and hence on the caster set points such as casting speed and cooling rates. The changes in the caster set points are typically carried out based on temperature measurements at the end of tundish outlet. Thermal prediction models provide advance knowledge of the influence of process and design parameters on the steel temperature at various stages. Therefore, they can be used in making accurate decisions about the caster set points in real time. However, this requires both fast and accurate thermal prediction models. In this work, we develop a surrogate model for the prediction of thermal stratification using data extracted from a set of computational fluid dynamics (CFD) simulations, pre-determined using design of experiments technique. Regression method is used for training the predictor. The model predicts the stratified temperature profile instantaneously, for a given set of process parameters such as initial steel temperature, refractory heat content, slag thickness, and holding time. More than 96 pct of the predicted values are within an error range of ±5 K (±5 °C), when compared against corresponding CFD results. Considering its accuracy and computational efficiency, the model can be extended for thermal control of casting operations. This work also sets a benchmark for developing similar thermal models for downstream processes such as tundish and caster.
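
    A hedged sketch of a CFD surrogate of the kind described in this record: a polynomial regression trained on a synthetic design-of-experiments table that maps process parameters to a temperature response. The features and response are placeholders, not the paper's data.

    ```python
    # Hedged sketch: polynomial-regression surrogate trained on a synthetic
    # design-of-experiments table standing in for CFD results.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    n = 120
    X = np.column_stack([
        rng.uniform(1830, 1900, n),    # initial steel temperature, K (placeholder)
        rng.uniform(0.05, 0.20, n),    # slag thickness, m (placeholder)
        rng.uniform(10, 90, n),        # holding time, min (placeholder)
    ])
    # synthetic "CFD" response: temperature drop at a reference height, K
    y = 0.4 * X[:, 2] - 60 * X[:, 1] + 0.01 * (X[:, 0] - 1860) + rng.normal(0, 1, n)

    surrogate = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)
    pred = surrogate.predict([[1870.0, 0.10, 45.0]])[0]
    print("predicted temperature drop [K] for (1870 K, 0.10 m slag, 45 min hold):",
          round(float(pred), 1))
    ```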

  13. Identification of fidgety movements and prediction of CP by the use of computer-based video analysis is more accurate when based on two video recordings.

    PubMed

    Adde, Lars; Helbostad, Jorunn; Jensenius, Alexander R; Langaas, Mette; Støen, Ragnhild

    2013-08-01

    This study evaluates the role of postterm age at assessment and the use of one or two video recordings for the detection of fidgety movements (FMs) and prediction of cerebral palsy (CP) using computer vision software. Recordings between 9 and 17 weeks postterm age from 52 preterm and term infants (24 boys, 28 girls; 26 born preterm) were used. Recordings were analyzed using computer vision software. Movement variables, derived from differences between subsequent video frames, were used for quantitative analysis. Sensitivities, specificities, and area under curve were estimated for the first and second recording, or a mean of both. FMs were classified based on the Prechtl approach of general movement assessment. CP status was reported at 2 years. Nine children developed CP; all of their recordings showed absent FMs. The mean variability of the centroid of motion (CSD) from two recordings was more accurate than using only one recording, and identified all children who were diagnosed with CP at 2 years. Age at assessment did not influence the detection of FMs or prediction of CP. The accuracy of computer vision techniques in identifying FMs and predicting CP based on two recordings should be confirmed in future studies.
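
    A hedged sketch of the movement variable highlighted in this record: the centroid of motion is computed per frame from absolute frame differences and its variability (CSD) summarized over the recording; the frames are synthetic and the software's exact definitions may differ.

    ```python
    # Hedged sketch: centroid of motion from frame differences and its
    # variability (CSD). Frames are synthetic; definitions are approximate.

    import numpy as np

    def centroid_of_motion(prev_frame, frame):
        diff = np.abs(frame.astype(float) - prev_frame.astype(float))
        total = diff.sum()
        if total == 0:
            return None
        ys, xs = np.indices(diff.shape)
        return (xs * diff).sum() / total, (ys * diff).sum() / total

    def csd(frames):
        """Standard deviation of the motion centroid across a video."""
        cents = [c for c in (centroid_of_motion(a, b) for a, b in zip(frames, frames[1:]))
                 if c is not None]
        return np.std(np.array(cents), axis=0)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        frames = [rng.integers(0, 255, (64, 64), dtype=np.uint8) for _ in range(30)]
        print("CSD (x, y):", csd(frames))
    ```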

  14. Reliable and accurate point-based prediction of cumulative infiltration using soil readily available characteristics: A comparison between GMDH, ANN, and MLR

    NASA Astrophysics Data System (ADS)

    Rahmati, Mehdi

    2017-08-01

    Developing accurate and reliable pedo-transfer functions (PTFs) to predict soil non-readily available characteristics is one of the most concerned topic in soil science and selecting more appropriate predictors is a crucial factor in PTFs' development. Group method of data handling (GMDH), which finds an approximate relationship between a set of input and output variables, not only provide an explicit procedure to select the most essential PTF input variables, but also results in more accurate and reliable estimates than other mostly applied methodologies. Therefore, the current research was aimed to apply GMDH in comparison with multivariate linear regression (MLR) and artificial neural network (ANN) to develop several PTFs to predict soil cumulative infiltration point-basely at specific time intervals (0.5-45 min) using soil readily available characteristics (RACs). In this regard, soil infiltration curves as well as several soil RACs including soil primary particles (clay (CC), silt (Si), and sand (Sa)), saturated hydraulic conductivity (Ks), bulk (Db) and particle (Dp) densities, organic carbon (OC), wet-aggregate stability (WAS), electrical conductivity (EC), and soil antecedent (θi) and field saturated (θfs) water contents were measured at 134 different points in Lighvan watershed, northwest of Iran. Then, applying GMDH, MLR, and ANN methodologies, several PTFs have been developed to predict cumulative infiltrations using two sets of selected soil RACs including and excluding Ks. According to the test data, results showed that developed PTFs by GMDH and MLR procedures using all soil RACs including Ks resulted in more accurate (with E values of 0.673-0.963) and reliable (with CV values lower than 11 percent) predictions of cumulative infiltrations at different specific time steps. In contrast, ANN procedure had lower accuracy (with E values of 0.356-0.890) and reliability (with CV values up to 50 percent) compared to GMDH and MLR. The results also revealed

  15. High Order Schemes in Bats-R-US for Faster and More Accurate Predictions

    NASA Astrophysics Data System (ADS)

    Chen, Y.; Toth, G.; Gombosi, T. I.

    2014-12-01

    BATS-R-US is a widely used global magnetohydrodynamics model that originally employed second order accurate TVD schemes combined with block based Adaptive Mesh Refinement (AMR) to achieve high resolution in the regions of interest. In recent years we have implemented fifth order accurate finite difference schemes, CWENO5 and MP5, for uniform Cartesian grids. The high order schemes have now been extended to generalized coordinates, including spherical grids, and to non-uniform AMR grids with dynamic regridding. We present numerical tests that verify the preservation of the free-stream solution and high-order accuracy as well as robust oscillation-free behavior near discontinuities. We apply the new high order accurate schemes to both heliospheric and magnetospheric simulations and show that they are robust and can achieve the same accuracy as the second order scheme with far fewer computational resources. This is especially important for space weather prediction, which requires faster than real time code execution.

  16. Accurate Prediction of Motor Failures by Application of Multi CBM Tools: A Case Study

    NASA Astrophysics Data System (ADS)

    Dutta, Rana; Singh, Veerendra Pratap; Dwivedi, Jai Prakash

    2018-02-01

    Motor failures are very difficult to predict accurately with a single condition-monitoring tool because the electrical and mechanical systems are closely related. Electrical problems, such as phase unbalance and stator winding insulation failures, can at times lead to vibration problems, while mechanical failures such as bearing failure lead to rotor eccentricity. In this case study of a 550 kW blower motor, a rotor bar crack was detected by current signature analysis and vibration monitoring confirmed the finding. In later months, in a similar motor, vibration monitoring predicted a bearing failure and current signature analysis confirmed it. In both cases, after dismantling the motor, the predictions were found to be accurate. In this paper we discuss the accurate prediction of motor failures through the use of multiple condition-monitoring tools, illustrated with two case studies.

  17. Computer-based personality judgments are more accurate than those made by humans.

    PubMed

    Youyou, Wu; Kosinski, Michal; Stillwell, David

    2015-01-27

    Judging others' personalities is an essential skill in successful social living, as personality is a key driver behind people's interactions, behaviors, and emotions. Although accurate personality judgments stem from social-cognitive skills, developments in machine learning show that computer models can also make valid judgments. This study compares the accuracy of human and computer-based personality judgments, using a sample of 86,220 volunteers who completed a 100-item personality questionnaire. We show that (i) computer predictions based on a generic digital footprint (Facebook Likes) are more accurate (r = 0.56) than those made by the participants' Facebook friends using a personality questionnaire (r = 0.49); (ii) computer models show higher interjudge agreement; and (iii) computer personality judgments have higher external validity when predicting life outcomes such as substance use, political attitudes, and physical health; for some outcomes, they even outperform the self-rated personality scores. Computers outpacing humans in personality judgment presents significant opportunities and challenges in the areas of psychological assessment, marketing, and privacy.
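
    The headline numbers above are Pearson correlations between judged and self-reported trait scores. A minimal sketch of that comparison, on synthetic data with effect sizes chosen to echo the reported values, is given below; it is illustrative only and not the authors' analysis pipeline.

      # Compare two judges (computer model vs. human rater) by their correlation
      # with self-reported scores; data are synthetic.
      import numpy as np
      from scipy.stats import pearsonr

      rng = np.random.default_rng(1)
      self_report = rng.normal(size=500)                        # questionnaire trait scores
      computer = 0.56 * self_report + rng.normal(scale=0.83, size=500)
      friend   = 0.49 * self_report + rng.normal(scale=0.87, size=500)

      r_computer, _ = pearsonr(computer, self_report)
      r_friend, _ = pearsonr(friend, self_report)
      print(f"computer r = {r_computer:.2f}, friend r = {r_friend:.2f}")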

  18. An Accurate GPS-IMU/DR Data Fusion Method for Driverless Car Based on a Set of Predictive Models and Grid Constraints

    PubMed Central

    Wang, Shiyao; Deng, Zhidong; Yin, Gang

    2016-01-01

    A high-performance differential global positioning system (GPS) receiver with real time kinematics provides absolute localization for driverless cars. However, it is not only susceptible to multipath effect but also unable to effectively fulfill precise error correction in a wide range of driving areas. This paper proposes an accurate GPS–inertial measurement unit (IMU)/dead reckoning (DR) data fusion method based on a set of predictive models and occupancy grid constraints. First, we employ a set of autoregressive and moving average (ARMA) equations that have different structural parameters to build maximum likelihood models of raw navigation. Second, both grid constraints and spatial consensus checks on all predictive results and current measurements are required to remove outliers. Navigation data that satisfy a stationary stochastic process are further fused to achieve accurate localization results. Third, the standard deviation of multimodal data fusion can be pre-specified by grid size. Finally, we performed extensive field tests in a variety of real urban scenarios. The experimental results demonstrate that the method can significantly smooth small jumps in bias and considerably reduce accumulated position errors due to DR. With low computational complexity, the position accuracy of our method surpasses existing state-of-the-art methods on the same dataset, and the new data fusion method is practically applied in our driverless car. PMID:26927108

  19. An Accurate GPS-IMU/DR Data Fusion Method for Driverless Car Based on a Set of Predictive Models and Grid Constraints.

    PubMed

    Wang, Shiyao; Deng, Zhidong; Yin, Gang

    2016-02-24

    A high-performance differential global positioning system (GPS) receiver with real time kinematics provides absolute localization for driverless cars. However, it is not only susceptible to multipath effect but also unable to effectively fulfill precise error correction in a wide range of driving areas. This paper proposes an accurate GPS-inertial measurement unit (IMU)/dead reckoning (DR) data fusion method based on a set of predictive models and occupancy grid constraints. First, we employ a set of autoregressive and moving average (ARMA) equations that have different structural parameters to build maximum likelihood models of raw navigation. Second, both grid constraints and spatial consensus checks on all predictive results and current measurements are required to remove outliers. Navigation data that satisfy a stationary stochastic process are further fused to achieve accurate localization results. Third, the standard deviation of multimodal data fusion can be pre-specified by grid size. Finally, we performed extensive field tests in a variety of real urban scenarios. The experimental results demonstrate that the method can significantly smooth small jumps in bias and considerably reduce accumulated position errors due to DR. With low computational complexity, the position accuracy of our method surpasses existing state-of-the-art methods on the same dataset, and the new data fusion method is practically applied in our driverless car.
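
    One ingredient of the method described above is a bank of ARMA predictive models used, together with grid constraints, to reject outlying measurements before fusion. The sketch below shows only that idea in miniature: an ARMA model fitted to a window of a navigation signal flags a new measurement whose one-step prediction residual is too large. The data, model order, and threshold are invented for illustration; this is not the authors' implementation.

      # ARMA one-step prediction as an outlier gate for a navigation signal
      # (synthetic data; assumed model order and 3-sigma threshold).
      import numpy as np
      from statsmodels.tsa.arima.model import ARIMA

      rng = np.random.default_rng(2)
      window = np.cumsum(rng.normal(0.1, 0.02, size=200))   # smooth trajectory history (m)
      new_measurement = window[-1] + 5.0                    # simulated GPS multipath jump

      fit = ARIMA(window, order=(2, 0, 1), trend="t").fit()
      predicted = fit.forecast(steps=1)[0]
      residual = abs(new_measurement - predicted)
      threshold = 3 * np.std(fit.resid)
      print("reject as outlier" if residual > threshold else "accept and fuse")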

  20. A Machine Learned Classifier That Uses Gene Expression Data to Accurately Predict Estrogen Receptor Status

    PubMed Central

    Bastani, Meysam; Vos, Larissa; Asgarian, Nasimeh; Deschenes, Jean; Graham, Kathryn; Mackey, John; Greiner, Russell

    2013-01-01

    Background Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin-embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER status based on RNA expression can provide more objective, quantitative and reproducible test results. Methods To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin-fixed tumors. Results This produced a three-gene classifier that can predict the ER status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER status. Conclusions Our efficient and parsimonious classifier lends itself to high-throughput, highly accurate and low-cost RNA-based assessment of ER status, suitable for routine clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions. PMID:24312637
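
    For readers unfamiliar with the validation scheme, the snippet below shows what cross-validating a small gene-panel classifier looks like in practice. It uses synthetic expression values for three genes and ordinary logistic regression; the published classifier, its genes, and its learning algorithm are not reproduced here.

      # Cross-validated small-gene-panel classifier of ER status (synthetic data).
      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(3)
      n_tumors, n_genes = 176, 3
      er_status = rng.integers(0, 2, size=n_tumors)                 # 0 = ER-, 1 = ER+
      expression = rng.normal(size=(n_tumors, n_genes)) + er_status[:, None] * 1.5

      scores = cross_val_score(LogisticRegression(), expression, er_status, cv=10)
      print(f"10-fold CV accuracy: {scores.mean():.1%} +/- {scores.std():.1%}")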

  1. Computer-based personality judgments are more accurate than those made by humans

    PubMed Central

    Youyou, Wu; Kosinski, Michal; Stillwell, David

    2015-01-01

    Judging others’ personalities is an essential skill in successful social living, as personality is a key driver behind people’s interactions, behaviors, and emotions. Although accurate personality judgments stem from social-cognitive skills, developments in machine learning show that computer models can also make valid judgments. This study compares the accuracy of human and computer-based personality judgments, using a sample of 86,220 volunteers who completed a 100-item personality questionnaire. We show that (i) computer predictions based on a generic digital footprint (Facebook Likes) are more accurate (r = 0.56) than those made by the participants’ Facebook friends using a personality questionnaire (r = 0.49); (ii) computer models show higher interjudge agreement; and (iii) computer personality judgments have higher external validity when predicting life outcomes such as substance use, political attitudes, and physical health; for some outcomes, they even outperform the self-rated personality scores. Computers outpacing humans in personality judgment presents significant opportunities and challenges in the areas of psychological assessment, marketing, and privacy. PMID:25583507

  2. WegoLoc: accurate prediction of protein subcellular localization using weighted Gene Ontology terms.

    PubMed

    Chi, Sang-Mun; Nam, Dougu

    2012-04-01

    We present an accurate and fast web server, WegoLoc, for predicting the subcellular localization of proteins based on sequence similarity and weighted Gene Ontology (GO) information. A term weighting method from the text categorization process is applied to GO terms for a support vector machine classifier. As a result, WegoLoc surpasses the state-of-the-art methods on previously used test datasets. WegoLoc supports three eukaryotic kingdoms (animals, fungi and plants), provides human-specific analysis, and covers several sets of cellular locations. In addition, WegoLoc provides (i) multiple possible localizations of input protein(s) as well as their corresponding probability scores, (ii) weights of GO terms representing the contribution of each GO term to the prediction, and (iii) a BLAST E-value for the best hit with GO terms. If the similarity score does not meet a given threshold, an amino acid composition-based prediction is applied as a backup method. WegoLoc and a user's guide are freely available at http://www.btool.org/WegoLoc. Contact: smchiks@ks.ac.kr; dougnam@unist.ac.kr. Supplementary data are available at http://www.btool.org/WegoLoc.

  3. PredictSNP: Robust and Accurate Consensus Classifier for Prediction of Disease-Related Mutations

    PubMed Central

    Bendl, Jaroslav; Stourac, Jan; Salanda, Ondrej; Pavelka, Antonin; Wieben, Eric D.; Zendulka, Jaroslav; Brezovsky, Jan; Damborsky, Jiri

    2014-01-01

    Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement is hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicities, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, resulting in significantly improved prediction performance while returning results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp. PMID:24453961
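
    The simplest form of the consensus idea above is a majority vote over the binary calls of the individual tools; the published PredictSNP classifier additionally weights tools by their confidence scores. The toy sketch below shows only the unweighted vote.

      # Majority-vote consensus over per-tool calls (toy values, not PredictSNP itself).
      import numpy as np

      # rows = mutations, columns = tools (1 = deleterious, 0 = neutral)
      tool_calls = np.array([
          [1, 1, 0, 1, 1, 0],
          [0, 0, 0, 1, 0, 0],
          [1, 1, 1, 1, 0, 1],
      ])
      consensus = (tool_calls.mean(axis=1) >= 0.5).astype(int)
      print(consensus)   # -> [1 0 1]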

  4. Accurate and computationally efficient prediction of thermochemical properties of biomolecules using the generalized connectivity-based hierarchy.

    PubMed

    Sengupta, Arkajyoti; Ramabhadran, Raghunath O; Raghavachari, Krishnan

    2014-08-14

    In this study we have used the connectivity-based hierarchy (CBH) method to derive accurate heats of formation of a range of biomolecules, 18 amino acids and 10 barbituric acid/uracil derivatives. The hierarchy is based on the connectivity of the different atoms in a large molecule. It results in error-cancellation reaction schemes that are automated, general, and can be readily used for a broad range of organic molecules and biomolecules. Herein, we first locate stable conformational and tautomeric forms of these biomolecules using an accurate level of theory (viz. CCSD(T)/6-311++G(3df,2p)). Subsequently, the heats of formation of the amino acids are evaluated using the CBH-1 and CBH-2 schemes and routinely employed density functionals or wave function-based methods. The calculated heats of formation obtained herein using modest levels of theory are in very good agreement with those obtained using the more expensive W1-F12 and W2-F12 methods for the amino acids and with G3 results for the barbituric acid derivatives. Overall, the present study (a) highlights the small effect of including multiple conformers in determining the heats of formation of biomolecules and (b), in concurrence with previous CBH studies, proves that use of the more effective error-cancelling isoatomic scheme (CBH-2) results in more accurate heats of formation with modestly sized basis sets along with common density functionals or wave function-based methods.

  5. Kinetic approach to degradation mechanisms in polymer solar cells and their accurate lifetime predictions

    NASA Astrophysics Data System (ADS)

    Arshad, Muhammad Azeem; Maaroufi, AbdelKrim

    2018-07-01

    The present study makes a first step toward accurate lifetime predictions of polymer solar cells. Reservations about the conventionally employed temperature-accelerated lifetime measurement test, and its inability to predict reliable lifetimes of polymer solar cells, are brought to light. Critical issues concerning accelerated lifetime testing include assuming a reaction mechanism instead of determining it, and relying solely on the temperature acceleration of a single material property. An advanced approach comprising a set of theoretical models to estimate accurate lifetimes of polymer solar cells is therefore suggested as a suitable alternative to accelerated lifetime testing. This approach relies on systematic kinetic modeling of the various possible polymer degradation mechanisms under natural weathering conditions. The proposed kinetic approach is substantiated by its application to experimental aging data sets of polymer solar materials and solar cells, including P3HT polymer film, bulk heterojunction (MDMO-PPV:PCBM) and dye-sensitized solar cells. Based on the suggested approach, an efficacious lifetime determination formula for polymer solar cells is derived and tested on dye-sensitized solar cells. Some important merits of the proposed method are also pointed out and its prospective applications are discussed.

  6. Can phenological models predict tree phenology accurately in the future? The unrevealed hurdle of endodormancy break.

    PubMed

    Chuine, Isabelle; Bonhomme, Marc; Legave, Jean-Michel; García de Cortázar-Atauri, Iñaki; Charrier, Guillaume; Lacointe, André; Améglio, Thierry

    2016-10-01

    The onset of the growing season of trees has advanced by 2.3 days per decade during the last 40 years in temperate Europe because of global warming. The effect of temperature on plant phenology is, however, not linear because temperature has a dual effect on bud development. On one hand, low temperatures are necessary to break bud endodormancy, and, on the other hand, higher temperatures are necessary to promote bud cell growth afterward. Different process-based models have been developed in the last decades to predict the date of budbreak of woody species. They predict that global warming should delay or compromise endodormancy break at the species' equatorward range limits, leading to delayed, or even prevented, flowering or leaf set. These models are classically parameterized with flowering or budbreak dates only, with no information on the endodormancy break date because this information is very scarce. Here, we evaluated the efficiency of a set of phenological models to accurately predict the endodormancy break dates of three fruit trees. Our results show that models calibrated solely with budbreak dates usually do not accurately predict the endodormancy break date. Providing the endodormancy break date for model parameterization results in much more accurate prediction of this date, although with a higher error than for budbreak dates. Most importantly, we show that models not calibrated with endodormancy break dates can generate large discrepancies in forecasted budbreak dates when using climate scenarios, as compared to models calibrated with endodormancy break dates. This discrepancy increases with mean annual temperature and is therefore strongest after 2050 in the southernmost regions. Our results point to an urgent need for large-scale measurements of endodormancy break dates in forest and fruit trees to yield more robust projections of phenological changes in the near future. © 2016 John Wiley & Sons Ltd.
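
    The process-based models discussed above typically chain a chilling phase (endodormancy) to a forcing phase (ecodormancy). The sketch below is a deliberately simple two-phase model of that kind, with invented chill and forcing thresholds and a synthetic temperature series; it is not one of the calibrated models evaluated in the paper.

      # Two-phase chilling/forcing budbreak model (illustrative parameters only).
      import numpy as np

      rng = np.random.default_rng(4)
      days = np.arange(365)
      temp = 10 + 10 * np.sin(2 * np.pi * (days - 200) / 365) + rng.normal(0, 2, 365)

      C_STAR, F_STAR, T_BASE = 60.0, 150.0, 5.0
      chill = forcing = 0.0
      endo_break_day = budbreak_day = None
      for day, t in zip(days, temp):
          if endo_break_day is None:
              chill += 1.0 if 0.0 <= t <= 7.2 else 0.0       # simple chill-unit rule
              if chill >= C_STAR:
                  endo_break_day = day
          else:
              forcing += max(t - T_BASE, 0.0)                 # growing degree days
              if forcing >= F_STAR:
                  budbreak_day = day
                  break
      print("endodormancy break:", endo_break_day, "budbreak:", budbreak_day)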

  7. How accurate is our clinical prediction of "minimal prostate cancer"?

    PubMed

    Leibovici, Dan; Shikanov, Sergey; Gofrit, Ofer N; Zagaja, Gregory P; Shilo, Yaniv; Shalhav, Arieh L

    2013-07-01

    Recommendations for active surveillance versus immediate treatment for low risk prostate cancer are based on biopsy and clinical data, assuming that a low volume of well-differentiated carcinoma will be associated with a low progression risk. However, the accuracy of clinical prediction of minimal prostate cancer (MPC) is unclear. To define preoperative predictors for MPC in prostatectomy specimens and to examine the accuracy of such prediction. Data collected on 1526 consecutive radical prostatectomy patients operated on in a single center between 2003 and 2008 included: age, body mass index, preoperative prostate-specific antigen level, biopsy Gleason score, clinical stage, percentage of positive biopsy cores, and maximal core length (MCL) involvement. MPC was defined as < 5% of prostate volume involvement with organ-confined Gleason score < or = 6. Univariate and multivariate logistic regression analyses were used to define independent predictors of minimal disease. Classification and Regression Tree (CART) analysis was used to define cutoff values for the predictors and measure the accuracy of prediction. MPC was found in 241 patients (15.8%). Clinical stage, biopsy Gleason score, percent of positive biopsy cores, and maximal involved core length were associated with minimal disease (OR 0.42, 0.1, 0.92, and 0.9, respectively). Independent predictors of MPC included: biopsy Gleason score, percent of positive cores and MCL (OR 0.21, 0.95 and 0.95, respectively). CART showed that when the MCL exceeded 11.5%, the likelihood of MPC was 3.8%. Conversely, when applying the most favorable preoperative conditions (Gleason < or = 6, < 20% positive cores, MCL < or = 11.5%) the chance of minimal disease was 41%. Biopsy Gleason score, the percent of positive cores and MCL are independently associated with MPC. While preoperative prediction of significant prostate cancer was accurate, clinical prediction of MPC was incorrect 59% of the time. Caution is necessary when
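
    The analysis above combines multivariate logistic regression (to identify independent predictors) with CART (to derive clinically usable cutoffs). The sketch below reproduces only that general workflow on synthetic data; the variables, effect sizes, and cutoffs are invented and do not correspond to the study's results.

      # Logistic regression + shallow CART on synthetic prostatectomy-style data.
      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.tree import DecisionTreeClassifier, export_text

      rng = np.random.default_rng(5)
      n = 1526
      X = np.column_stack([
          rng.integers(6, 10, n),        # biopsy Gleason score
          rng.uniform(5, 100, n),        # percent positive cores
          rng.uniform(1, 80, n),         # maximal core length involvement (%)
      ])
      logit = 1.0 - 0.5 * (X[:, 0] - 6) - 0.03 * X[:, 1] - 0.02 * X[:, 2]
      mpc = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

      print(LogisticRegression(max_iter=1000).fit(X, mpc).coef_)   # log-odds per predictor
      tree = DecisionTreeClassifier(max_depth=2).fit(X, mpc)
      print(export_text(tree, feature_names=["gleason", "pct_pos_cores", "mcl"]))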

  8. Toward accurate prediction of pKa values for internal protein residues: the importance of conformational relaxation and desolvation energy.

    PubMed

    Wallace, Jason A; Wang, Yuhang; Shi, Chuanyin; Pastoor, Kevin J; Nguyen, Bao-Linh; Xia, Kai; Shen, Jana K

    2011-12-01

    Proton uptake or release controls many important biological processes, such as energy transduction, virus replication, and catalysis. Accurate pK(a) prediction informs about proton pathways, thereby revealing detailed acid-base mechanisms. Physics-based methods in the framework of molecular dynamics simulations not only offer pK(a) predictions but also inform about the physical origins of pK(a) shifts and provide details of ionization-induced conformational relaxation and large-scale transitions. One such method is the recently developed continuous constant pH molecular dynamics (CPHMD) method, which has been shown to be an accurate and robust pK(a) prediction tool for naturally occurring titratable residues. To further examine the accuracy and limitations of CPHMD, we blindly predicted the pK(a) values for 87 titratable residues introduced in various hydrophobic regions of staphylococcal nuclease and variants. The predictions gave a root-mean-square deviation of 1.69 pK units from experiment, and there were only two pK(a)'s with errors greater than 3.5 pK units. Analysis of the conformational fluctuation of titrating side-chains in the context of the errors of calculated pK(a) values indicates that explicit treatment of conformational flexibility and the associated dielectric relaxation gives CPHMD a distinct advantage. Analysis of the sources of errors suggests that more accurate pK(a) predictions can be obtained for the most deeply buried residues by improving the accuracy in calculating desolvation energies. Furthermore, it is found that the generalized Born implicit-solvent model underlying the current CPHMD implementation slightly distorts the local conformational environment such that the inclusion of an explicit-solvent representation may offer improvement of accuracy. Copyright © 2011 Wiley-Liss, Inc.

  9. Intermolecular potentials and the accurate prediction of the thermodynamic properties of water

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shvab, I.; Sadus, Richard J., E-mail: rsadus@swin.edu.au

    2013-11-21

    The ability of intermolecular potentials to correctly predict the thermodynamic properties of liquid water at a density of 0.998 g/cm³ for a wide range of temperatures (298–650 K) and pressures (0.1–700 MPa) is investigated. Molecular dynamics simulations are reported for the pressure, thermal pressure coefficient, thermal expansion coefficient, isothermal and adiabatic compressibilities, isobaric and isochoric heat capacities, and Joule-Thomson coefficient of liquid water using the non-polarizable SPC/E and TIP4P/2005 potentials. The results are compared with both experimental data and results obtained from the ab initio-based Matsuoka-Clementi-Yoshimine non-additive (MCYna) [J. Li, Z. Zhou, and R. J. Sadus, J. Chem. Phys. 127, 154509 (2007)] potential, which includes polarization contributions. The data clearly indicate that both the SPC/E and TIP4P/2005 potentials are only in qualitative agreement with experiment, whereas the polarizable MCYna potential predicts some properties within experimental uncertainty. This highlights the importance of polarizability for the accurate prediction of the thermodynamic properties of water, particularly at temperatures beyond 298 K.

  10. Accurate prediction of protein–protein interactions from sequence alignments using a Bayesian method

    PubMed Central

    Burger, Lukas; van Nimwegen, Erik

    2008-01-01

    Accurate and large-scale prediction of protein–protein interactions directly from amino-acid sequences is one of the great challenges in computational biology. Here we present a new Bayesian network method that predicts interaction partners using only multiple alignments of amino-acid sequences of interacting protein domains, without tunable parameters, and without the need for any training examples. We first apply the method to bacterial two-component systems and comprehensively reconstruct two-component signaling networks across all sequenced bacteria. Comparisons of our predictions with known interactions show that our method infers interaction partners genome-wide with high accuracy. To demonstrate the general applicability of our method we show that it also accurately predicts interaction partners in a recent dataset of polyketide synthases. Analysis of the predicted genome-wide two-component signaling networks shows that cognates (interacting kinase/regulator pairs, which lie adjacent on the genome) and orphans (which lie isolated) form two relatively independent components of the signaling network in each genome. In addition, while most genes are predicted to have only a small number of interaction partners, we find that 10% of orphans form a separate class of ‘hub' nodes that distribute and integrate signals to and from up to tens of different interaction partners. PMID:18277381

  11. Accurate indel prediction using paired-end short reads

    PubMed Central

    2013-01-01

    Background One of the major open challenges in next generation sequencing (NGS) is the accurate identification of structural variants such as insertions and deletions (indels). Current methods for indel calling assign scores to different types of evidence or counter-evidence for the presence of an indel, such as the number of split read alignments spanning the boundaries of a deletion candidate or reads that map within a putative deletion. Candidates with a score above a manually defined threshold are then predicted to be true indels. As a consequence, structural variants detected in this manner contain many false positives. Results Here, we present a machine learning based method which is able to discover and distinguish true from false indel candidates in order to reduce the false positive rate. Our method identifies indel candidates using a discriminative classifier based on features of split read alignment profiles and trained on true and false indel candidates that were validated by Sanger sequencing. We demonstrate the usefulness of our method with paired-end Illumina reads from 80 genomes of the first phase of the 1001 Genomes Project ( http://www.1001genomes.org) in Arabidopsis thaliana. Conclusion In this work we show that indel classification is a necessary step to reduce the number of false positive candidates. We demonstrate that missing classification may lead to spurious biological interpretations. The software is available at: http://agkb.is.tuebingen.mpg.de/Forschung/SV-M/. PMID:23442375

  12. A Novel Method for Accurate Operon Predictions in All SequencedProkaryotes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Price, Morgan N.; Huang, Katherine H.; Alm, Eric J.

    2004-12-01

    We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia coli K12 and Bacillus subtilis, our method is 85 and 83% accurate, respectively, which is similar to the accuracy of methods that use the same features but are trained on experimentally characterized transcripts. In Halobacterium NRC-1 and in Helicobacter pylori, our method correctly infers that genes in operons are separated by shorter distances than they are in E. coli, and its predictions using distance alone are more accurate than distance-only predictions trained on a database of E. coli transcripts. We use microarray data from six phylogenetically diverse prokaryotes to show that combining intergenic distance with comparative genomic measures further improves accuracy and that our method is broadly effective. Finally, we survey operon structure across 124 genomes, and find several surprises: H. pylori has many operons, contrary to previous reports; Bacillus anthracis has an unusual number of pseudogenes within conserved operons; and Synechocystis PCC6803 has many operons even though it has unusually wide spacings between conserved adjacent genes.
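
    At its core, the approach above scores adjacent same-strand gene pairs by combining intergenic distance with comparative genomic evidence. The toy function below illustrates such a combined log-odds score; the functional form and parameters are invented, not the paper's fitted distributions.

      # Toy operon score combining intergenic distance with a conservation flag.
      def operon_log_odds(intergenic_bp: int, conserved_adjacent: bool) -> float:
          # shorter gaps and conserved adjacency both favour co-transcription
          distance_term = 1.0 - intergenic_bp / 50.0
          conservation_term = 1.5 if conserved_adjacent else -0.5
          return distance_term + conservation_term

      for g1, g2, gap, cons in [("geneA", "geneB", 12, True), ("geneB", "geneC", 230, False)]:
          score = operon_log_odds(gap, cons)
          print(f"{g1}-{g2}: log-odds {score:+.2f} -> {'same operon' if score > 0 else 'separate'}")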

  13. Microarray-based cancer prediction using soft computing approach.

    PubMed

    Wang, Xiaosheng; Gotoh, Osamu

    2009-05-26

    One of the difficulties in using gene expression profiles to predict cancer is how to effectively select a few informative genes to construct accurate prediction models from thousands or tens of thousands of genes. We screen highly discriminative genes and gene pairs to create simple prediction models involving single genes or gene pairs on the basis of a soft computing approach and rough set theory. Accurate cancer prediction is obtained when we apply the simple prediction models to four cancer gene expression datasets: CNS tumor, colon tumor, lung cancer and DLBCL. Some genes closely correlated with the pathogenesis of specific or general cancers are identified. In contrast with other models, our models are simple, effective and robust. Meanwhile, our models are interpretable because they are based on decision rules. Our results demonstrate that very simple models may perform well on molecular cancer prediction and that important gene markers of cancer can be detected if the gene selection approach is chosen reasonably.

  14. Accurately predicting the structure, density, and hydrostatic compression of crystalline β-1,3,5,7-tetranitro-1,3,5,7-tetraazacyclooctane based on its wave-function-based potential

    NASA Astrophysics Data System (ADS)

    Song, H.-J.; Huang, F.

    2011-09-01

    A wave-function-based intermolecular potential of the β phase 1,3,5,7-tetranitro-1,3,5,7-tetraazacyclooctane (HMX) molecule has been constructed from first principles using the Williams-Stone-Misquitta method and symmetry-adapted perturbation theory. Using the potential and its derivatives, we have accurately predicted not only the structure and lattice energy of crystalline β-HMX at 0 K, but also its densities at temperatures of 0-403 K to within 1%. The calculated densities at pressures of 0-6 GPa agree excellently with the results from experiments on hydrostatic compression.

  15. Accurate and robust genomic prediction of celiac disease using statistical learning.

    PubMed

    Abraham, Gad; Tye-Din, Jason A; Bhalala, Oneil G; Kowalczyk, Adam; Zobel, Justin; Inouye, Michael

    2014-02-01

    Practical application of genomic-based risk stratification to clinical diagnosis is appealing yet performance varies widely depending on the disease and genomic risk score (GRS) method. Celiac disease (CD), a common immune-mediated illness, is strongly genetically determined and requires specific HLA haplotypes. HLA testing can exclude diagnosis but has low specificity, providing little information suitable for clinical risk stratification. Using six European cohorts, we provide a proof-of-concept that statistical learning approaches which simultaneously model all SNPs can generate robust and highly accurate predictive models of CD based on genome-wide SNP profiles. The high predictive capacity replicated both in cross-validation within each cohort (AUC of 0.87-0.89) and in independent replication across cohorts (AUC of 0.86-0.9), despite differences in ethnicity. The models explained 30-35% of disease variance and up to ∼43% of heritability. The GRS's utility was assessed in different clinically relevant settings. Comparable to HLA typing, the GRS can be used to identify individuals without CD with ≥99.6% negative predictive value; however, unlike HLA typing, fine-scale stratification of individuals into categories of higher risk for CD can identify those who would benefit from more invasive and costly definitive testing. The GRS is flexible and its performance can be adapted to the clinical situation by adjusting the threshold cut-off. Despite explaining a minority of disease heritability, our findings indicate a genomic risk score provides clinically relevant information to improve upon current diagnostic pathways for CD and support further studies evaluating the clinical utility of this approach in CD and other complex diseases.
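
    Conceptually, applying a genomic risk score of this kind amounts to a weighted sum of SNP allele dosages followed by a threshold chosen for the clinical question (e.g., maximizing negative predictive value for exclusion). The sketch below shows that application step on synthetic genotypes with invented weights; the published GRS is learned from genome-wide profiles, not hand-set.

      # Apply a (synthetic) genomic risk score and evaluate AUC and exclusion NPV.
      import numpy as np
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(6)
      n_people, n_snps = 2000, 500
      genotypes = rng.integers(0, 3, size=(n_people, n_snps)).astype(float)  # 0/1/2 dosages
      weights = rng.normal(scale=0.05, size=n_snps)
      liability = genotypes @ weights + rng.normal(scale=1.0, size=n_people)
      disease = (liability > np.quantile(liability, 0.99)).astype(int)       # ~1% prevalence

      grs = genotypes @ weights
      print("AUC:", round(roc_auc_score(disease, grs), 2))

      excluded = grs <= np.quantile(grs, 0.10)       # call the lowest-risk decile "excluded"
      print("NPV among excluded:", round(1 - disease[excluded].mean(), 3))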

  16. Rapid and accurate prediction and scoring of water molecules in protein binding sites.

    PubMed

    Ross, Gregory A; Morris, Garrett M; Biggin, Philip C

    2012-01-01

    Water plays a critical role in ligand-protein interactions. However, it is still challenging to predict accurately not only where water molecules prefer to bind, but also which of those water molecules might be displaceable. The latter is often seen as a route to optimizing affinity of potential drug candidates. Using a protocol we call WaterDock, we show that the freely available AutoDock Vina tool can be used to predict accurately the binding sites of water molecules. WaterDock was validated using data from X-ray crystallography, neutron diffraction and molecular dynamics simulations and correctly predicted 97% of the water molecules in the test set. In addition, we combined data-mining, heuristic and machine learning techniques to develop probabilistic water molecule classifiers. When applied to WaterDock predictions in the Astex Diverse Set of protein ligand complexes, we could identify whether a water molecule was conserved or displaced to an accuracy of 75%. A second model predicted whether water molecules were displaced by polar groups or by non-polar groups to an accuracy of 80%. These results should prove useful for anyone wishing to undertake rational design of new compounds where the displacement of water molecules is being considered as a route to improved affinity.

  17. Quasi-closed phase forward-backward linear prediction analysis of speech for accurate formant detection and estimation.

    PubMed

    Gowda, Dhananjaya; Airaksinen, Manu; Alku, Paavo

    2017-09-01

    Recently, a quasi-closed phase (QCP) analysis of speech signals for accurate glottal inverse filtering was proposed. However, the QCP analysis which belongs to the family of temporally weighted linear prediction (WLP) methods uses the conventional forward type of sample prediction. This may not be the best choice especially in computing WLP models with a hard-limiting weighting function. A sample selective minimization of the prediction error in WLP reduces the effective number of samples available within a given window frame. To counter this problem, a modified quasi-closed phase forward-backward (QCP-FB) analysis is proposed, wherein each sample is predicted based on its past as well as future samples thereby utilizing the available number of samples more effectively. Formant detection and estimation experiments on synthetic vowels generated using a physical modeling approach as well as natural speech utterances show that the proposed QCP-FB method yields statistically significant improvements over the conventional linear prediction and QCP methods.

  18. Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction

    PubMed Central

    Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K.; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G.; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H.

    2017-01-01

    The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. PMID:27899623

  19. Accurate prediction of vaccine stability under real storage conditions and during temperature excursions.

    PubMed

    Clénet, Didier

    2018-04-01

    Due to their thermosensitivity, most vaccines must be kept refrigerated from production to use. To successfully carry out global immunization programs, ensuring the stability of vaccines is crucial. In this context, two important issues are critical, namely: (i) predicting vaccine stability and (ii) preventing product damage due to excessive temperature excursions outside of the recommended storage conditions (cold chain break). We applied a combination of advanced kinetics and statistical analyses on vaccine forced degradation data to accurately describe the loss of antigenicity for a multivalent freeze-dried inactivated virus vaccine containing three variants. The screening of large amounts of kinetic models combined with a statistical model selection approach resulted in the identification of two-step kinetic models. Predictions based on kinetic analysis and experimental stability data were in agreement, with approximately five percentage points difference from real values for long-term stability storage conditions, after excursions of temperature and during experimental shipments of freeze-dried products. Results showed that modeling a few months of forced degradation can be used to predict various time and temperature profiles endured by vaccines, i.e. long-term stability, short time excursions outside the labeled storage conditions or shipments at ambient temperature, with high accuracy. Pharmaceutical applications of the presented kinetics-based approach are discussed. Copyright © 2018 The Author. Published by Elsevier B.V. All rights reserved.
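
    To make the kinetic fitting step concrete, the sketch below fits a simple two-step (bi-exponential) antigenicity-loss model to a handful of forced-degradation points with scipy; extrapolation across temperatures via Arrhenius-type parameters, as in the paper, would build on such fits. The data points, model form, and rate constants here are invented, not the vaccine data or the selected model from the study.

      # Fit a two-step (bi-exponential) degradation model to synthetic stability data.
      import numpy as np
      from scipy.optimize import curve_fit

      def two_step(t, a, k1, k2):
          # fast initial step (fraction a, rate k1) followed by a slower step (rate k2)
          return a * np.exp(-k1 * t) + (1 - a) * np.exp(-k2 * t)

      t_days = np.array([0, 3, 7, 14, 28, 56, 90], dtype=float)
      antigenicity = np.array([1.00, 0.92, 0.88, 0.84, 0.80, 0.74, 0.69])

      popt, _ = curve_fit(two_step, t_days, antigenicity,
                          p0=(0.2, 0.2, 0.005), bounds=([0, 0, 0], [1, 5, 1]))
      print("fitted (a, k1, k2):", popt)
      print("predicted fraction remaining at 1 year:", two_step(365.0, *popt))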

  20. Accurate prediction of severe allergic reactions by a small set of environmental parameters (NDVI, temperature).

    PubMed

    Notas, George; Bariotakis, Michail; Kalogrias, Vaios; Andrianaki, Maria; Azariadis, Kalliopi; Kampouri, Errika; Theodoropoulou, Katerina; Lavrentaki, Katerina; Kastrinakis, Stelios; Kampa, Marilena; Agouridakis, Panagiotis; Pirintsos, Stergios; Castanas, Elias

    2015-01-01

    Severe allergic reactions of unknown etiology, necessitating a hospital visit, have an important impact on the life of affected individuals and impose a major economic burden on societies. The prediction of clinically severe allergic reactions would be of great importance, but current attempts have been limited by the lack of a well-founded applicable methodology and the wide spatiotemporal distribution of allergic reactions. The valid prediction of severe allergies (and especially those needing hospital treatment) in a region could alert health authorities and implicated individuals to take appropriate preemptive measures. In the present report we collected visits for serious allergic reactions of unknown etiology from two major hospitals on the island of Crete, for two distinct time periods (validation and test sets). We used the Normalized Difference Vegetation Index (NDVI), a satellite-based, freely available measurement, which is an indicator of live green vegetation in a given geographic area, and a set of meteorological data to develop a model capable of describing and predicting severe allergic reaction frequency. Our analysis retained NDVI and temperature as accurate identifiers and predictors of increased hospital visits for severe allergic reactions. Our approach may contribute towards the development of satellite-based modules for the prediction of severe allergic reactions in specific, well-defined geographical areas. It could probably also be used for the prediction of other environment-related diseases and conditions.
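
    A minimal sketch of the kind of environmental regression described above is shown below: weekly counts of severe-reaction visits modelled with a Poisson regression on NDVI and mean temperature. The data are synthetic and the coefficients invented; the study's actual model and validation scheme are not reproduced.

      # Poisson regression of weekly visit counts on NDVI and temperature (synthetic data).
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(7)
      weeks = 104
      ndvi = np.clip(0.4 + 0.2 * np.sin(2 * np.pi * np.arange(weeks) / 52)
                     + rng.normal(0, 0.05, weeks), 0, 1)
      temperature = 18 + 8 * np.sin(2 * np.pi * (np.arange(weeks) - 8) / 52) + rng.normal(0, 1, weeks)
      visits = rng.poisson(np.exp(0.5 + 2.0 * ndvi + 0.05 * temperature))

      X = sm.add_constant(np.column_stack([ndvi, temperature]))
      model = sm.GLM(visits, X, family=sm.families.Poisson()).fit()
      print(model.params)        # intercept, NDVI and temperature coefficients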

  1. Accurate Prediction of Severe Allergic Reactions by a Small Set of Environmental Parameters (NDVI, Temperature)

    PubMed Central

    Andrianaki, Maria; Azariadis, Kalliopi; Kampouri, Errika; Theodoropoulou, Katerina; Lavrentaki, Katerina; Kastrinakis, Stelios; Kampa, Marilena; Agouridakis, Panagiotis; Pirintsos, Stergios; Castanas, Elias

    2015-01-01

    Severe allergic reactions of unknown etiology, necessitating a hospital visit, have an important impact on the life of affected individuals and impose a major economic burden on societies. The prediction of clinically severe allergic reactions would be of great importance, but current attempts have been limited by the lack of a well-founded applicable methodology and the wide spatiotemporal distribution of allergic reactions. The valid prediction of severe allergies (and especially those needing hospital treatment) in a region could alert health authorities and implicated individuals to take appropriate preemptive measures. In the present report we collected visits for serious allergic reactions of unknown etiology from two major hospitals on the island of Crete, for two distinct time periods (validation and test sets). We used the Normalized Difference Vegetation Index (NDVI), a satellite-based, freely available measurement, which is an indicator of live green vegetation in a given geographic area, and a set of meteorological data to develop a model capable of describing and predicting severe allergic reaction frequency. Our analysis retained NDVI and temperature as accurate identifiers and predictors of increased hospital visits for severe allergic reactions. Our approach may contribute towards the development of satellite-based modules for the prediction of severe allergic reactions in specific, well-defined geographical areas. It could probably also be used for the prediction of other environment-related diseases and conditions. PMID:25794106

  2. Exchange-Hole Dipole Dispersion Model for Accurate Energy Ranking in Molecular Crystal Structure Prediction.

    PubMed

    Whittleton, Sarah R; Otero-de-la-Roza, A; Johnson, Erin R

    2017-02-14

    Accurate energy ranking is a key facet to the problem of first-principles crystal-structure prediction (CSP) of molecular crystals. This work presents a systematic assessment of B86bPBE-XDM, a semilocal density functional combined with the exchange-hole dipole moment (XDM) dispersion model, for energy ranking using 14 compounds from the first five CSP blind tests. Specifically, the set of crystals studied comprises 11 rigid, planar compounds and 3 co-crystals. The experimental structure was correctly identified as the lowest in lattice energy for 12 of the 14 total crystals. One of the exceptions is 4-hydroxythiophene-2-carbonitrile, for which the experimental structure was correctly identified once a quasi-harmonic estimate of the vibrational free-energy contribution was included, evidencing the occasional importance of thermal corrections for accurate energy ranking. The other exception is an organic salt, where charge-transfer error (also called delocalization error) is expected to cause the base density functional to be unreliable. Provided the choice of base density functional is appropriate and an estimate of temperature effects is used, XDM-corrected density-functional theory is highly reliable for the energetic ranking of competing crystal structures.

  3. Multi-fidelity machine learning models for accurate bandgap predictions of solids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pilania, Ghanshyam; Gubernatis, James E.; Lookman, Turab

    Here, we present a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level. Additionally, the adopted Gaussian process regression formulation allows us to predict the underlying uncertainties as a measure of our confidence in the predictions. Using a set of 600 elpasolite compounds as an example dataset and using semi-local and hybrid exchange correlation functionals within density functional theory as two levels of fidelity, we demonstrate the excellent learning performance of the method against actual high fidelity quantum mechanical calculations of the bandgaps. The presented statistical learning method is not restricted to bandgaps or electronic structure methods and extends the utility of high throughput property predictions in a significant way.

  4. Multi-fidelity machine learning models for accurate bandgap predictions of solids

    DOE PAGES

    Pilania, Ghanshyam; Gubernatis, James E.; Lookman, Turab

    2016-12-28

    Here, we present a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level. Additionally, the adopted Gaussian process regression formulation allows us to predict the underlying uncertainties as a measure of our confidence in the predictions. Using a set of 600 elpasolite compounds as an example dataset and using semi-local and hybrid exchange correlation functionals within density functional theory as two levels of fidelity, we demonstrate the excellent learning performance of the method against actual high fidelity quantum mechanical calculations of the bandgaps. The presented statistical learning method is not restricted to bandgaps or electronic structure methods and extends the utility of high throughput property predictions in a significant way.

  5. Combining Structural Modeling with Ensemble Machine Learning to Accurately Predict Protein Fold Stability and Binding Affinity Effects upon Mutation

    PubMed Central

    Garcia Lopez, Sebastian; Kim, Philip M.

    2014-01-01

    Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases. PMID:25243403

  6. Accurate electrical prediction of memory array through SEM-based edge-contour extraction using SPICE simulation

    NASA Astrophysics Data System (ADS)

    Shauly, Eitan; Rotstein, Israel; Peltinov, Ram; Latinski, Sergei; Adan, Ofer; Levi, Shimon; Menadeva, Ovadya

    2009-03-01

    The continuing transistor scaling efforts toward smaller devices with similar (or larger) drive current per micron and faster switching increase the challenge of predicting and controlling the transistor off-state current. Typically, electrical simulators such as SPICE use the design intent (as-drawn GDS data). In more sophisticated cases, the simulators are fed with the pattern obtained after lithography and etch process simulations. As the importance of electrical simulation accuracy increases and leakage becomes more dominant, there is a need to feed these simulators with more accurate information extracted from physical on-silicon transistors. In this paper we used our methodology for predicting changes in device performance due to systematic lithography and etch effects. In general, the methodology consists of using OPCCmaxTM for systematic Edge-Contour-Extraction (ECE) from transistors, capturing the manufacturing process and including image distortions such as line-end shortening, corner rounding and line-edge roughness. These measurements are then used for SPICE modeling. A possible application of this new metrology is to provide, ahead of time, physical and electrical statistical data that improve time to market. In this work, we applied our methodology to analyze small and large arrays of 2.14 um2 6T-SRAM, manufactured using the Tower Standard Logic for General Purposes Platform. Four of the six transistors used a "U-Shape AA", known to have higher variability. The predicted electrical performance of the transistors, drive current and leakage current, in terms of nominal values and variability, is presented. We also used the methodology to analyze an entire SRAM block array, and a study of isolation leakage and variability is presented.

  7. Simple prediction scores predict good and devastating outcomes after stroke more accurately than physicians.

    PubMed

    Reid, John Michael; Dai, Dingwei; Delmonte, Susanna; Counsell, Carl; Phillips, Stephen J; MacLeod, Mary Joan

    2017-05-01

    Physicians are often asked to prognosticate soon after a patient presents with stroke. This study aimed to compare two outcome prediction scores (the Five Simple Variables [FSV] score and the PLAN [Preadmission comorbidities, Level of consciousness, Age, and focal Neurologic deficit] score) with informal prediction by physicians. Demographic and clinical variables were prospectively collected from consecutive patients hospitalised with acute ischaemic or haemorrhagic stroke (2012-13). In-person or telephone follow-up at 6 months established vital and functional status (modified Rankin score [mRS]). Area under the receiver operating characteristic curve (AUC) was used to establish prediction score performance. Five hundred and seventy-five patients were included; 46% female, median age 76 years, 88% ischaemic stroke. Six months after stroke, 47% of patients had a good outcome (alive and independent, mRS 0-2) and 26% a devastating outcome (dead or severely dependent, mRS 5-6). The FSV and PLAN scores were superior to physician prediction (AUCs of 0.823-0.863 versus 0.773-0.805, P < 0.0001) for good and devastating outcomes. The FSV score was superior to the PLAN score for predicting good outcomes and vice versa for devastating outcomes (P < 0.001). Outcome prediction was more accurate for those with later presentations (>24 hours from onset). The FSV and PLAN scores are validated in this population for outcome prediction after both ischaemic and haemorrhagic stroke. The FSV score is the least complex of all developed scores and can assist outcome prediction by physicians. © The Author 2016. Published by Oxford University Press on behalf of the British Geriatrics Society. All rights reserved. For permissions, please email: journals.permissions@oup.com
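
    The comparison above boils down to computing the area under the ROC curve for each predictor against the observed 6-month outcome. The snippet below shows that computation on synthetic data; the outcome rates and score distributions are invented and do not reproduce the study's results.

      # AUC comparison of a clinical prediction score versus physician judgement (synthetic).
      import numpy as np
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(8)
      n = 575
      good_outcome = rng.integers(0, 2, size=n)               # mRS 0-2 at 6 months
      score = good_outcome * 1.2 + rng.normal(size=n)         # e.g. FSV-style continuous score
      physician = good_outcome * 0.9 + rng.normal(size=n)     # physician's graded judgement

      print("score AUC:    ", round(roc_auc_score(good_outcome, score), 3))
      print("physician AUC:", round(roc_auc_score(good_outcome, physician), 3))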

  8. A gene expression biomarker accurately predicts estrogen ...

    EPA Pesticide Factsheets

    The EPA’s vision for the Endocrine Disruptor Screening Program (EDSP) in the 21st Century (EDSP21) includes utilization of high-throughput screening (HTS) assays coupled with computational modeling to prioritize chemicals with the goal of eventually replacing current Tier 1 screening tests. The ToxCast program currently includes 18 HTS in vitro assays that evaluate the ability of chemicals to modulate estrogen receptor α (ERα), an important endocrine target. We propose microarray-based gene expression profiling as a complementary approach to predict ERα modulation and have developed computational methods to identify ERα modulators in an existing database of whole-genome microarray data. The ERα biomarker consisted of 46 ERα-regulated genes with consistent expression patterns across 7 known ER agonists and 3 known ER antagonists. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression data sets from experiments in MCF-7 cells. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% or 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) OECD ER reference chemicals including “very weak” agonists and replicated predictions based on 18 in vitro ER-associated HTS assays. For 114 chemicals present in both the HTS data and the MCF-7 c

  9. Accurate prediction of collapse temperature using optical coherence tomography-based freeze-drying microscopy.

    PubMed

    Greco, Kristyn; Mujat, Mircea; Galbally-Kinney, Kristin L; Hammer, Daniel X; Ferguson, R Daniel; Iftimia, Nicusor; Mulhall, Phillip; Sharma, Puneet; Kessler, William J; Pikal, Michael J

    2013-06-01

    The objective of this study was to assess the feasibility of developing and applying a laboratory tool that can provide three-dimensional product structural information during freeze-drying and which can accurately characterize the collapse temperature (Tc) of pharmaceutical formulations designed for freeze-drying. A single-vial freeze dryer coupled with optical coherence tomography freeze-drying microscopy (OCT-FDM) was developed to investigate the structure and Tc of formulations in pharmaceutically relevant product containers (i.e., freeze-drying in vials). OCT-FDM was used to measure the Tc and eutectic melt of three formulations in freeze-drying vials. The Tc as measured by OCT-FDM was found to be predictive of freeze-drying with a batch of vials in a conventional laboratory freeze dryer. The freeze-drying cycles developed using OCT-FDM data, as compared with traditional light transmission freeze-drying microscopy (LT-FDM), resulted in a significant reduction in primary drying time, which could result in a substantial reduction of manufacturing costs while maintaining product quality. OCT-FDM provides quantitative data to justify freeze-drying at temperatures higher than the Tc measured by LT-FDM and provides a reliable upper limit for setting the product temperature in primary drying. Copyright © 2013 Wiley Periodicals, Inc.

  10. Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction.

    PubMed

    Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H

    2017-01-09

    The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
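
    As an illustration of the kind of quantitative scoring TEPIC builds on, the sketch below computes a soft, Boltzmann-weighted PWM score over every window of an open-chromatin region and sums it into a region-level affinity, loosely following the idea of affinity-based (rather than hit-based) scoring as in TRAP. The toy motif, the scale parameter and the example region sequence are illustrative assumptions, not values or file formats from the paper.

        import numpy as np

        BASES = {"A": 0, "C": 1, "G": 2, "T": 3}

        def window_affinities(sequence, pwm, scale=0.7):
            """Boltzmann-weighted, affinity-like PWM score over every window of a region.

            pwm: (motif_length, 4) array of log-odds scores; sequence: the region's DNA string.
            Summing the per-window values gives one affinity per open-chromatin region, which
            can then be weighted by that region's open-chromatin or histone-mark signal.
            """
            k = pwm.shape[0]
            idx = np.array([BASES[b] for b in sequence.upper()])
            scores = []
            for start in range(len(idx) - k + 1):
                window = idx[start:start + k]
                log_odds = pwm[np.arange(k), window].sum()   # PWM score of this window
                scores.append(np.exp(scale * log_odds))      # soft contribution, no hard cut-off
            return np.array(scores)

        # Toy 2-bp motif preferring "GC", applied to a short open-chromatin region.
        toy_pwm = np.log(np.array([[0.1, 0.1, 0.7, 0.1],
                                   [0.1, 0.7, 0.1, 0.1]]) / 0.25)
        print(window_affinities("AGCGTTGC", toy_pwm).sum())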

  11. Accurate First-Principles Spectra Predictions for Planetological and Astrophysical Applications at Various T-Conditions

    NASA Astrophysics Data System (ADS)

    Rey, M.; Nikitin, A. V.; Tyuterev, V.

    2014-06-01

    Knowledge of near infrared intensities of rovibrational transitions of polyatomic molecules is essential for the modeling of various planetary atmospheres, brown dwarfs and for other astrophysical applications [1,2,3]. For example, to analyze exoplanets, atmospheric models have been developed, creating the need for accurate spectroscopic data. Consequently, the spectral characterization of such planetary objects relies on having adequate and reliable molecular data in extreme conditions (temperature, optical path length, pressure). On the other hand, in the modeling of astrophysical opacities, millions of lines are generally involved and line-by-line extraction is clearly not feasible in laboratory measurements. This large amount of data can thus be interpreted only through reliable theoretical predictions. There exist essentially two theoretical approaches for the computation and prediction of spectra. The first one is based on empirically-fitted effective spectroscopic models. Another way of computing energies, line positions and intensities is based on global variational calculations using ab initio surfaces. These do not yet reach spectroscopic accuracy stricto sensu but implicitly account for all intramolecular interactions, including resonance couplings, in a wide spectral range. The final aim of this work is to provide reliable predictions which could be quantitatively accurate with respect to the precision of available observations and as complete as possible. All this requires extensive first-principles quantum mechanical calculations essentially based on three necessary ingredients, which are (i) accurate intramolecular potential energy surface and dipole moment surface components well-defined in a large range of vibrational displacements and (ii) efficient computational methods combined with suitable choices of coordinates to account for molecular symmetry properties and to achieve a good numerical

  12. An accurate model for predicting high frequency noise of nanoscale NMOS SOI transistors

    NASA Astrophysics Data System (ADS)

    Shen, Yanfei; Cui, Jie; Mohammadi, Saeed

    2017-05-01

    A nonlinear and scalable model suitable for predicting high frequency noise of N-type Metal Oxide Semiconductor (NMOS) transistors is presented. The model is developed for a commercial 45 nm CMOS SOI technology and its accuracy is validated through comparison with measured performance of a microwave low noise amplifier. The model employs the virtual source nonlinear core and adds parasitic elements to accurately simulate the RF behavior of multi-finger NMOS transistors up to 40 GHz. For the first time, the traditional long-channel thermal noise model is supplemented with an injection noise model to accurately represent the noise behavior of these short-channel transistors up to 26 GHz. The developed model is simple and easy to extract, yet very accurate.

  13. Improving Computational Efficiency of Prediction in Model-Based Prognostics Using the Unscented Transform

    NASA Technical Reports Server (NTRS)

    Daigle, Matthew John; Goebel, Kai Frank

    2010-01-01

    Model-based prognostics captures system knowledge in the form of physics-based models of components, and how they fail, in order to obtain accurate predictions of end of life (EOL). EOL is predicted based on the estimated current state distribution of a component and expected profiles of future usage. In general, this requires simulations of the component using the underlying models. In this paper, we develop a simulation-based prediction methodology that achieves computational efficiency by performing only the minimal number of simulations needed in order to accurately approximate the mean and variance of the complete EOL distribution. This is performed through the use of the unscented transform, which predicts the means and covariances of a distribution passed through a nonlinear transformation. In this case, the EOL simulation acts as that nonlinear transformation. In this paper, we review the unscented transform, and describe how this concept is applied to efficient EOL prediction. As a case study, we develop a physics-based model of a solenoid valve, and perform simulation experiments to demonstrate improved computational efficiency without sacrificing prediction accuracy.
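
    A minimal sketch of the unscented transform step described above: sigma points are drawn deterministically from the current state estimate, each is run through an EOL simulation, and the weighted results approximate the mean and variance of the EOL distribution. The toy degradation model, its parameter values and the scaled-UT settings below are assumptions for illustration, not the authors' valve model.

        import numpy as np

        def unscented_transform(mean, cov, f, alpha=0.1, beta=2.0, kappa=0.0):
            """Approximate the mean and variance of f(x) for x ~ N(mean, cov)
            using 2n+1 deterministically chosen sigma points (scaled UT)."""
            n = len(mean)
            lam = alpha**2 * (n + kappa) - n
            # Sigma points: the mean plus/minus columns of the scaled covariance square root.
            root = np.linalg.cholesky((n + lam) * cov)
            sigma = np.vstack([mean, mean + root.T, mean - root.T])          # shape (2n+1, n)
            # Weights for the mean and covariance estimates.
            wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
            wc = wm.copy()
            wm[0] = lam / (n + lam)
            wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
            # Propagate every sigma point through the nonlinear transformation (here an EOL simulation).
            y = np.array([f(s) for s in sigma])
            y_mean = wm @ y
            y_var = wc @ (y - y_mean) ** 2
            return y_mean, y_var

        # Toy stand-in for an EOL simulation: time until damage reaches a failure threshold.
        def eol_simulation(state, threshold=1.0, growth=0.05):
            damage, usage_rate = state
            return max(threshold - damage, 0.0) / (usage_rate * growth + 1e-9)

        state_mean = np.array([0.2, 1.0])        # current damage estimate and expected usage rate
        state_cov = np.diag([0.01, 0.04])        # uncertainty of the current state estimate
        eol_mean, eol_var = unscented_transform(state_mean, state_cov, eol_simulation)
        print("EOL mean:", round(float(eol_mean), 1), "EOL std:", round(float(np.sqrt(eol_var)), 1))

    Only 2n+1 simulations are needed for an n-dimensional state, which is where the computational saving over Monte Carlo sampling comes from.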

  14. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins.

    PubMed

    Jones, David T; Singh, Tanya; Kosciolek, Tomasz; Tetchner, Stuart

    2015-04-01

    Recent developments of statistical techniques to infer direct evolutionary couplings between residue pairs have rendered covariation-based contact prediction a viable means for accurate 3D modelling of proteins, with no information other than the sequence required. To extend the usefulness of contact prediction, we have designed a new meta-predictor (MetaPSICOV) which combines three distinct approaches for inferring covariation signals from multiple sequence alignments, considers a broad range of other sequence-derived features and, uniquely, a range of metrics which describe both the local and global quality of the input multiple sequence alignment. Finally, we use a two-stage predictor, where the second stage filters the output of the first stage. This two-stage predictor is additionally evaluated on its ability to accurately predict the long range network of hydrogen bonds, including correctly assigning the donor and acceptor residues. Using the original PSICOV benchmark set of 150 protein families, MetaPSICOV achieves a mean precision of 0.54 for top-L predicted long range contacts, around 60% higher than PSICOV, and around 40% better than CCMpred. In de novo protein structure prediction using FRAGFOLD, MetaPSICOV is able to improve the TM-scores of models by a median of 0.05 compared with PSICOV. Lastly, for predicting long range hydrogen bonding, MetaPSICOV-HB achieves a precision of 0.69 for the top-L/10 hydrogen bonds compared with just 0.26 for the baseline MetaPSICOV. MetaPSICOV is available as a free web server at http://bioinf.cs.ucl.ac.uk/MetaPSICOV. Raw data (predicted contact lists and 3D models) and source code can be downloaded from http://bioinf.cs.ucl.ac.uk/downloads/MetaPSICOV. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
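
    The headline numbers above are precisions over the top-L (or top-L/10) scored long-range pairs, where L is the sequence length. A small, generic sketch of that metric follows; the pair representation and the toy values are illustrative, not MetaPSICOV's output format.

        def top_l_precision(scores, native_contacts, seq_len, top_frac=1.0, min_sep=24):
            """Precision of the top (seq_len * top_frac) predicted long-range contacts.

            scores:          dict mapping residue pairs (i, j), i < j, to predictor confidence
            native_contacts: set of pairs that are in contact in the native structure
            min_sep:         minimum sequence separation for a pair to count as long range
            """
            long_range = [(pair, s) for pair, s in scores.items() if pair[1] - pair[0] >= min_sep]
            long_range.sort(key=lambda item: item[1], reverse=True)
            top = long_range[:max(1, int(seq_len * top_frac))]
            hits = sum(1 for pair, _ in top if pair in native_contacts)
            return hits / len(top)

        # Toy example: two scored long-range pairs, one of which is correct.
        predicted = {(1, 30): 0.9, (2, 40): 0.8, (5, 10): 0.99}   # (5, 10) is short range and ignored
        print(top_l_precision(predicted, {(1, 30)}, seq_len=40, top_frac=0.05))   # -> 0.5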

  15. Predicting volume of distribution with decision tree-based regression methods using predicted tissue:plasma partition coefficients.

    PubMed

    Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat

    2015-01-01

    Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p) - often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical Abstract: Decision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.

  16. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model.

    PubMed

    Wang, Sheng; Sun, Siqi; Li, Zhen; Zhang, Renyu; Xu, Jinbo

    2017-01-01

    Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs are still of low quality and not very useful for de novo structure prediction. This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. Our contact-assisted models also have
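
    A minimal sketch of one 2-D residual block of the kind stacked in the pairwise part of such a network, written with PyTorch. The channel count, the normalisation choice and the toy tensor shape are assumptions for illustration and not the authors' exact architecture.

        import torch
        from torch import nn

        class ResidualBlock2D(nn.Module):
            """One 2-D residual block applied to an L x L pairwise feature map."""
            def __init__(self, channels, kernel_size=3):
                super().__init__()
                padding = kernel_size // 2
                self.conv1 = nn.Conv2d(channels, channels, kernel_size, padding=padding)
                self.conv2 = nn.Conv2d(channels, channels, kernel_size, padding=padding)
                self.norm1 = nn.BatchNorm2d(channels)
                self.norm2 = nn.BatchNorm2d(channels)
                self.act = nn.ReLU()

            def forward(self, x):
                # Two convolutions over the pairwise map, plus the skip connection
                # that lets many such blocks be stacked into a very deep network.
                out = self.act(self.norm1(self.conv1(x)))
                out = self.norm2(self.conv2(out))
                return self.act(out + x)

        # Pairwise features for a protein of length 64 with 40 channels
        # (e.g. coevolution scores, pairwise potentials, tiled 1-D features).
        features = torch.randn(1, 40, 64, 64)
        block = ResidualBlock2D(40)
        print(block(features).shape)    # torch.Size([1, 40, 64, 64])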

  17. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model

    PubMed Central

    Li, Zhen; Zhang, Renyu

    2017-01-01

    Motivation Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. Method This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Results Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. Our contact

  18. A Micromechanics-Based Method for Multiscale Fatigue Prediction

    NASA Astrophysics Data System (ADS)

    Moore, John Allan

    An estimated 80% of all structural failures are due to mechanical fatigue, often resulting in catastrophic, dangerous and costly failure events. However, an accurate model to predict fatigue remains an elusive goal. One of the major challenges is that fatigue is intrinsically a multiscale process, which is dependent on a structure's geometric design as well as its material's microscale morphology. The following work begins with a microscale study of fatigue nucleation around non-metallic inclusions. Based on this analysis, a novel multiscale method for fatigue predictions is developed. This method simulates macroscale geometries explicitly while concurrently calculating the simplified response of microscale inclusions, thus providing adequate detail on multiple scales for accurate fatigue life predictions. The methods herein provide insight into the multiscale nature of fatigue, while also developing a tool to aid in geometric design and material optimization for fatigue critical devices such as biomedical stents and artificial heart valves.

  19. Accurate prediction of personalized olfactory perception from large-scale chemoinformatic features.

    PubMed

    Li, Hongyang; Panwar, Bharat; Omenn, Gilbert S; Guan, Yuanfang

    2018-02-01

    The olfactory stimulus-percept problem has been studied for more than a century, yet it is still hard to precisely predict the odor given the large-scale chemoinformatic features of an odorant molecule. A major challenge is that the perceived qualities vary greatly among individuals due to different genetic and cultural backgrounds. Moreover, the combinatorial interactions between multiple odorant receptors and diverse molecules significantly complicate the olfaction prediction. Many attempts have been made to establish structure-odor relationships for intensity and pleasantness, but no models are available to predict the personalized multi-odor attributes of molecules. In this study, we describe our winning algorithm for predicting individual and population perceptual responses to various odorants in the DREAM Olfaction Prediction Challenge. We find that a random forest model consisting of multiple decision trees is well suited to this prediction problem, given the large feature spaces and high variability of perceptual ratings among individuals. Integrating both population and individual perceptions into our model effectively reduces the influence of noise and outliers. By analyzing the importance of each chemical feature, we find that a small set of low- and nondegenerative features is sufficient for accurate prediction. Our random forest model successfully predicts personalized odor attributes of structurally diverse molecules. This model together with the top discriminative features has the potential to extend our understanding of olfactory perception mechanisms and provide an alternative for rational odorant design.
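
    A compact sketch of the modelling idea, using scikit-learn's random forest on placeholder data: a single forest can predict several perceptual attributes at once, and its feature importances can be inspected to find a small set of informative chemoinformatic descriptors. The array shapes and descriptor counts below are illustrative assumptions, not the challenge data.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        # Placeholder chemoinformatic descriptors (X) and perceptual ratings (y),
        # e.g. intensity, pleasantness and one further semantic descriptor per molecule.
        rng = np.random.default_rng(1)
        X = rng.normal(size=(300, 100))
        y = rng.normal(size=(300, 3))

        # Random forests tolerate large, partly redundant feature spaces and support
        # multi-output targets; feature importances help prune the descriptor set.
        forest = RandomForestRegressor(n_estimators=200, random_state=0)
        forest.fit(X, y)
        top_features = np.argsort(forest.feature_importances_)[::-1][:10]
        print("ten most informative descriptors:", top_features)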

  20. PSSP-RFE: accurate prediction of protein structural class by recursive feature extraction from PSI-BLAST profile, physical-chemical property and functional annotations.

    PubMed

    Li, Liqi; Cui, Xiang; Yu, Sanjiu; Zhang, Yuan; Luo, Zhong; Yang, Hua; Zhou, Yue; Zheng, Xiaoqi

    2014-01-01

    Protein structure prediction is critical to functional annotation of the massively accumulated biological sequences, which prompts an imperative need for the development of high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction becomes an increasingly challenging task. Amongst most homology-based approaches, the accuracies of protein structural class prediction are sufficiently high for high similarity datasets, but still far from being satisfactory for low similarity datasets, i.e., below 40% in pairwise sequence similarity. Therefore, we present a novel method for accurate and reliable protein structural class prediction for both high and low similarity datasets. This method is based on Support Vector Machine (SVM) in conjunction with integrated features from position-specific score matrix (PSSM), PROFEAT and Gene Ontology (GO). A feature selection approach, SVM-RFE, is also used to rank the integrated feature vectors through recursively removing the feature with the lowest ranking score. The definitive top features selected by SVM-RFE are input into the SVM engines to predict the structural class of a query protein. To validate our method, jackknife tests were applied to seven widely used benchmark datasets, reaching overall accuracies between 84.61% and 99.79%, which are significantly higher than those achieved by state-of-the-art tools. These results suggest that our method could serve as an accurate and cost-effective alternative to existing methods in protein structural classification, especially for low similarity datasets.
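
    The SVM-RFE step can be sketched with scikit-learn: a linear SVM is fitted repeatedly, the lowest-ranked features are removed, and the retained subset feeds the final SVM classifier. The placeholder arrays and the number of retained features below are assumptions for illustration, not the paper's feature set.

        import numpy as np
        from sklearn.feature_selection import RFE
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        # X: integrated feature vectors (e.g. PSSM + PROFEAT + GO features); y: structural class labels.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 60))      # placeholder data, for illustration only
        y = rng.integers(0, 4, size=200)    # four structural classes

        # Recursively eliminate the lowest-ranked features using a linear SVM,
        # then train the final SVM on the selected subset.
        selector = RFE(estimator=SVC(kernel="linear"), n_features_to_select=20, step=1)
        model = make_pipeline(StandardScaler(), selector, SVC(kernel="linear"))
        model.fit(X, y)
        print(model.score(X, y))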

  1. CUFID-query: accurate network querying through random walk based network flow estimation.

    PubMed

    Jeong, Hyundoo; Qian, Xiaoning; Yoon, Byung-Jun

    2017-12-28

    Functional modules in biological networks consist of numerous biomolecules and their complicated interactions. Recent studies have shown that biomolecules in a functional module tend to have similar interaction patterns and that such modules are often conserved across biological networks of different species. As a result, such conserved functional modules can be identified through comparative analysis of biological networks. In this work, we propose a novel network querying algorithm based on the CUFID (Comparative network analysis Using the steady-state network Flow to IDentify orthologous proteins) framework combined with an efficient seed-and-extension approach. The proposed algorithm, CUFID-query, can accurately detect conserved functional modules as small subnetworks in the target network that are expected to perform similar functions to the given query functional module. The CUFID framework was recently developed for probabilistic pairwise global comparison of biological networks, and it has been applied to pairwise global network alignment, where the framework was shown to yield accurate network alignment results. In the proposed CUFID-query algorithm, we adopt the CUFID framework and extend it for local network alignment, specifically to solve network querying problems. First, in the seed selection phase, the proposed method utilizes the CUFID framework to compare the query and the target networks and to predict the probabilistic node-to-node correspondence between the networks. Next, the algorithm selects and greedily extends the seed in the target network by iteratively adding nodes that have frequent interactions with other nodes in the seed network, in a way that the conductance of the extended network is maximally reduced. Finally, CUFID-query removes irrelevant nodes from the querying results based on the personalized PageRank vector for the induced network that includes the fully extended network and its neighboring nodes. Through extensive

  2. Accurate prediction of cation-π interaction energy using substituent effects.

    PubMed

    Sayyed, Fareed Bhasha; Suresh, Cherumuttathu H

    2012-06-14

    Substituent effects on cation-π interactions have been quantified using a variety of Φ-X···M+ complexes where Φ, X, and M+ are the π-system, substituent, and cation, respectively. The cation-π interaction energy, E(M+), showed a strong linear correlation with the molecular electrostatic potential (MESP) based measure of the substituent effect, ΔVmin (the difference between the MESP minimum (Vmin) on the π-region of a substituted system and that of the corresponding unsubstituted system). This linear relationship is E(M+) = C(M+)·ΔVmin + E(M+)', where C(M+) is the reaction constant and E(M+)' is the cation-π interaction energy of the unsubstituted complex. This relationship is similar to the Hammett equation and its first term yields the substituent contribution to the cation-π interaction energy. Further, a linear correlation between C(M+) and E(M+)' has been established, which facilitates the prediction of C(M+) for unknown cations. Thus, a prediction of E(M+) for any Φ-X···M+ complex is achieved by knowing the values of E(M+)' and ΔVmin. The generality of the equation is tested for a variety of cations (Li+, Na+, K+, Mg+, BeCl+, MgCl+, CaCl+, TiCl3+, CrCl2+, NiCl+, Cu+, ZnCl+, NH4+, CH3NH3+, N(CH3)4+, C(NH2)3+), substituents (N(CH3)2, NH2, OCH3, CH3, OH, H, SCH3, SH, CCH, F, Cl, COOH, CHO, CF3, CN, NO2), and a large number of π-systems. The tested systems also include multiply substituted π-systems, viz. ethylene, acetylene, hexa-1,3,5-triene, benzene, naphthalene, indole, pyrrole, phenylalanine, tryptophan, tyrosine, azulene, pyrene, [6]-cyclacene, and corannulene, and it is found that E(M+) follows the additivity of substituent effects. Further, the substituent effects on cationic sandwich complexes of the type C6H6···M+···C6H5X have been assessed and it is found that E(M+) can be predicted with 97.7% accuracy using the values of E
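
    Written out in display form, the Hammett-like relation from the abstract (with ΔVmin taken relative to the unsubstituted π-system) reads:

        \[
          E_{\mathrm{M^{+}}} \;=\; C_{\mathrm{M^{+}}}\,\Delta V_{\min} \;+\; E'_{\mathrm{M^{+}}},
          \qquad
          \Delta V_{\min} \;=\; V_{\min}(\Phi\!-\!X) \;-\; V_{\min}(\Phi\!-\!\mathrm{H})
        \]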

  3. The prediction of drug metabolism, tissue distribution, and bioavailability of 50 structurally diverse compounds in rat using mechanism-based absorption, distribution, and metabolism prediction tools.

    PubMed

    De Buck, Stefan S; Sinha, Vikash K; Fenu, Luca A; Gilissen, Ron A; Mackie, Claire E; Nijsen, Marjoleen J

    2007-04-01

    The aim of this study was to assess a physiologically based modeling approach for predicting drug metabolism, tissue distribution, and bioavailability in rat for a structurally diverse set of neutral and moderate-to-strong basic compounds (n = 50). Hepatic blood clearance (CLh) was projected using microsomal data and shown to be well predicted, irrespective of the type of hepatic extraction model (80% within 2-fold). Best predictions of CLh were obtained disregarding both plasma and microsomal protein binding, whereas strong bias was seen using either blood binding only or both plasma and microsomal protein binding. Two mechanistic tissue composition-based equations were evaluated for predicting volume of distribution (Vdss) and tissue-to-plasma partitioning (Ptp). A first approach, which accounted for ionic interactions with acidic phospholipids, resulted in accurate predictions of Vdss (80% within 2-fold). In contrast, a second approach, which disregarded ionic interactions, was a poor predictor of Vdss (60% within 2-fold). The first approach also yielded accurate predictions of Ptp in muscle, heart, and kidney (80% within 3-fold), whereas in lung, liver, and brain, predictions ranged from 47% to 62% within 3-fold. Using the second approach, Ptp prediction accuracy in muscle, heart, and kidney was on average 70% within 3-fold, and ranged from 24% to 54% in all other tissues. Combining all methods for predicting Vdss and CLh resulted in accurate predictions of the in vivo half-life (70% within 2-fold). Oral bioavailability was well predicted using CLh data and GastroPlus software (80% within 2-fold). These results illustrate that physiologically based prediction tools can provide accurate predictions of rat pharmacokinetics.
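
    One standard hepatic extraction model of the kind referred to above is the well-stirred liver model; a small sketch is given below. Whether this matches the authors' exact parameterisation is an assumption, and the flow and clearance values in the example call are illustrative only (a hepatic blood flow around 55 mL/min/kg is a commonly cited rat value).

        def well_stirred_clh(q_h, fu_b, cl_int):
            """Hepatic blood clearance under the well-stirred liver model.

            q_h:    hepatic blood flow (mL/min/kg)
            fu_b:   unbound fraction in blood (set to 1.0 to disregard binding,
                    as in the best-performing predictions reported above)
            cl_int: intrinsic clearance scaled up from microsomal data (mL/min/kg)
            """
            return q_h * fu_b * cl_int / (q_h + fu_b * cl_int)

        # Illustrative example in rat-sized units.
        print(well_stirred_clh(q_h=55.0, fu_b=1.0, cl_int=80.0))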

  4. Disturbance observer based model predictive control for accurate atmospheric entry of spacecraft

    NASA Astrophysics Data System (ADS)

    Wu, Chao; Yang, Jun; Li, Shihua; Li, Qi; Guo, Lei

    2018-05-01

    Facing the complex aerodynamic environment of Mars atmosphere, a composite atmospheric entry trajectory tracking strategy is investigated in this paper. External disturbances, initial states uncertainties and aerodynamic parameters uncertainties are the main problems. The composite strategy is designed to solve these problems and improve the accuracy of Mars atmospheric entry. This strategy includes a model predictive control for optimized trajectory tracking performance, as well as a disturbance observer based feedforward compensation for external disturbances and uncertainties attenuation. 500-run Monte Carlo simulations show that the proposed composite control scheme achieves more precise Mars atmospheric entry (3.8 km parachute deployment point distribution error) than the baseline control scheme (8.4 km) and integral control scheme (5.8 km).

  5. Ensemble predictive model for more accurate soil organic carbon spectroscopic estimation

    NASA Astrophysics Data System (ADS)

    Vašát, Radim; Kodešová, Radka; Borůvka, Luboš

    2017-07-01

    A myriad of signal pre-processing strategies and multivariate calibration techniques has been explored in an attempt to improve the spectroscopic prediction of soil organic carbon (SOC) over the last few decades. Coming up with a novel, more powerful and more accurate predictive approach that outperforms the existing ones has therefore become a challenging task. One way forward is to combine several individual predictions into a single final one (in line with ensemble learning theory). As this approach performs best when it combines predictive algorithms of different natures that are calibrated with structurally different predictor variables, we tested predictors of two different kinds: 1) reflectance values (or transforms) at each wavelength and 2) absorption feature parameters. Consequently we applied four different calibration techniques, two per type of predictor: a) partial least squares regression and support vector machines for type 1, and b) multiple linear regression and random forest for type 2. The weights assigned to individual predictions within the ensemble model (constructed as a weighted average) were determined by an automated procedure that ensured the best solution among all possible was selected. The approach was tested on soil samples taken from the surface horizon of four sites differing in the prevailing soil units. By employing the ensemble predictive model, the prediction accuracy of SOC improved at all four sites. The coefficient of determination in cross-validation (R2cv) increased from 0.849, 0.611, 0.811 and 0.644 (the best individual predictions) to 0.864, 0.650, 0.824 and 0.698 for Sites 1, 2, 3 and 4, respectively. Generally, the ensemble model affected the final prediction so that the maximal deviations of predicted vs. observed values of the individual predictions were reduced, and the correlation cloud thus became thinner, as desired.
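
    A rough sketch of the weighted-average ensemble idea: given cross-validated predictions from each calibration technique, search convex weights that maximise R2 against the observed SOC values. The coarse grid search below merely stands in for the paper's automated weighting procedure, whose exact form is not specified here, and the toy arrays are illustrative.

        import itertools
        import numpy as np

        def best_weighted_average(predictions, observed, step=0.1):
            """Search convex weights for a weighted-average ensemble that maximises R2.

            predictions: (n_models, n_samples) array of cross-validated predictions
            observed:    (n_samples,) array of measured values (e.g. SOC)
            """
            n_models = predictions.shape[0]
            grid = np.arange(0.0, 1.0 + 1e-9, step)
            best_weights, best_r2 = None, -np.inf
            ss_tot = np.sum((observed - observed.mean()) ** 2)
            # Enumerate weight vectors on a coarse simplex grid (weights sum to one).
            for w in itertools.product(grid, repeat=n_models):
                if abs(sum(w) - 1.0) > 1e-9:
                    continue
                ensemble = np.asarray(w) @ predictions
                r2 = 1.0 - np.sum((observed - ensemble) ** 2) / ss_tot
                if r2 > best_r2:
                    best_weights, best_r2 = np.asarray(w), r2
            return best_weights, best_r2

        # Toy demonstration with two models and five samples.
        preds = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                          [1.2, 1.9, 3.3, 3.8, 5.2]])
        obs = np.array([1.1, 2.0, 3.1, 4.0, 5.0])
        weights, r2 = best_weighted_average(preds, obs)
        print(weights, round(r2, 3))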

  6. Development and Validation of a Multidisciplinary Tool for Accurate and Efficient Rotorcraft Noise Prediction (MUTE)

    NASA Technical Reports Server (NTRS)

    Liu, Yi; Anusonti-Inthra, Phuriwat; Diskin, Boris

    2011-01-01

    A physics-based, systematically coupled, multidisciplinary prediction tool (MUTE) for rotorcraft noise was developed and validated with a wide range of flight configurations and conditions. MUTE is an aggregation of multidisciplinary computational tools that accurately and efficiently model the physics of the source of rotorcraft noise, and predict the noise at far-field observer locations. It uses systematic coupling approaches among multiple disciplines including Computational Fluid Dynamics (CFD), Computational Structural Dynamics (CSD), and high fidelity acoustics. Within MUTE, advanced high-order CFD tools are used around the rotor blade to predict the transonic flow (shock wave) effects, which generate the high-speed impulsive noise. Predictions of the blade-vortex interaction noise in low speed flight are also improved by using the Particle Vortex Transport Method (PVTM), which preserves the wake flow details required for blade/wake and fuselage/wake interactions. The accuracy of the source noise prediction is further improved by utilizing a coupling approach between CFD and CSD, so that the effects of key structural dynamics, elastic blade deformations, and trim solutions are correctly represented in the analysis. The blade loading information and/or the flow field parameters around the rotor blade predicted by the CFD/CSD coupling approach are used to predict the acoustic signatures at far-field observer locations with a high-fidelity noise propagation code (WOPWOP3). The predicted results from the MUTE tool for rotor blade aerodynamic loading and far-field acoustic signatures are compared and validated against a variety of experimental data sets, such as UH-60A data, DNW test data and HART II test data.

  7. Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome.

    PubMed

    Li, Fuyi; Li, Chen; Marquez-Lago, Tatiana T; Leier, André; Akutsu, Tatsuya; Purcell, Anthony W; Smith, A Ian; Lithgow, Trevor; Daly, Roger J; Song, Jiangning; Chou, Kuo-Chen

    2018-06-27

    Kinase-regulated phosphorylation is a ubiquitous type of post-translational modification (PTM) in both eukaryotic and prokaryotic cells. Phosphorylation plays fundamental roles in many signalling pathways and biological processes, such as protein degradation and protein-protein interactions. Experimental studies have revealed that signalling defects caused by aberrant phosphorylation are highly associated with a variety of human diseases, especially cancers. In light of this, a number of computational methods aiming to accurately predict protein kinase family-specific or kinase-specific phosphorylation sites have been established, thereby facilitating phosphoproteomic data analysis. In this work, we present Quokka, a novel bioinformatics tool that allows users to rapidly and accurately identify human kinase family-regulated phosphorylation sites. Quokka was developed by using a variety of sequence scoring functions combined with an optimized logistic regression algorithm. We evaluated Quokka based on well-prepared up-to-date benchmark and independent test datasets, curated from the Phospho.ELM and UniProt databases, respectively. The independent test demonstrates that Quokka improves the prediction performance compared with state-of-the-art computational tools for phosphorylation prediction. In summary, our tool provides users with high-quality predicted human phosphorylation sites for hypothesis generation and biological validation. The Quokka webserver and datasets are freely available at http://quokka.erc.monash.edu/. Supplementary data are available at Bioinformatics online.

  8. Highly accurate prediction of emotions surrounding the attacks of September 11, 2001 over 1-, 2-, and 7-year prediction intervals.

    PubMed

    Doré, Bruce P; Meksin, Robert; Mather, Mara; Hirst, William; Ochsner, Kevin N

    2016-06-01

    In the aftermath of a national tragedy, important decisions are predicated on judgments of the emotional significance of the tragedy in the present and future. Research in affective forecasting has largely focused on ways in which people fail to make accurate predictions about the nature and duration of feelings experienced in the aftermath of an event. Here we ask a related but understudied question: can people forecast how they will feel in the future about a tragic event that has already occurred? We found that people were strikingly accurate when predicting how they would feel about the September 11 attacks over 1-, 2-, and 7-year prediction intervals. Although people slightly under- or overestimated their future feelings at times, they nonetheless showed high accuracy in forecasting (a) the overall intensity of their future negative emotion, and (b) the relative degree of different types of negative emotion (i.e., sadness, fear, or anger). Using a path model, we found that the relationship between forecasted and actual future emotion was partially mediated by current emotion and remembered emotion. These results extend theories of affective forecasting by showing that emotional responses to an event of ongoing national significance can be predicted with high accuracy, and by identifying current and remembered feelings as independent sources of this accuracy. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  9. Highly accurate prediction of emotions surrounding the attacks of September 11, 2001 over 1-, 2-, and 7-year prediction intervals

    PubMed Central

    Doré, B.P.; Meksin, R.; Mather, M.; Hirst, W.; Ochsner, K.N

    2016-01-01

    In the aftermath of a national tragedy, important decisions are predicated on judgments of the emotional significance of the tragedy in the present and future. Research in affective forecasting has largely focused on ways in which people fail to make accurate predictions about the nature and duration of feelings experienced in the aftermath of an event. Here we ask a related but understudied question: can people forecast how they will feel in the future about a tragic event that has already occurred? We found that people were strikingly accurate when predicting how they would feel about the September 11 attacks over 1-, 2-, and 7-year prediction intervals. Although people slightly under- or overestimated their future feelings at times, they nonetheless showed high accuracy in forecasting 1) the overall intensity of their future negative emotion, and 2) the relative degree of different types of negative emotion (i.e., sadness, fear, or anger). Using a path model, we found that the relationship between forecasted and actual future emotion was partially mediated by current emotion and remembered emotion. These results extend theories of affective forecasting by showing that emotional responses to an event of ongoing national significance can be predicted with high accuracy, and by identifying current and remembered feelings as independent sources of this accuracy. PMID:27100309

  10. Accurate lithography simulation model based on convolutional neural networks

    NASA Astrophysics Data System (ADS)

    Watanabe, Yuki; Kimura, Taiki; Matsunawa, Tetsuaki; Nojima, Shigeki

    2017-07-01

    Lithography simulation is an essential technique for today's semiconductor manufacturing process. In order to simulate an entire chip in realistic time, a compact resist model is commonly used; such a model is formulated for fast calculation. To obtain an accurate compact resist model, it is necessary to settle on a complicated non-linear model function, but it is difficult to choose an appropriate function manually because there are many options. This paper proposes a new compact resist model using convolutional neural networks (CNNs), one of the principal deep learning techniques. The CNN model makes it possible to determine an appropriate model function and achieve accurate simulation. Experimental results show that the CNN model can reduce CD prediction errors by 70% compared with the conventional model.

  11. Fast and accurate predictions of covalent bonds in chemical space.

    PubMed

    Chang, K Y Samuel; Fias, Stijn; Ramakrishnan, Raghunathan; von Lilienfeld, O Anatole

    2016-05-07

    We assess the predictive accuracy of perturbation theory based estimates of changes in covalent bonding due to linear alchemical interpolations among molecules. We have investigated σ bonding to hydrogen, as well as σ and π bonding between main-group elements, occurring in small sets of iso-valence-electronic molecules with elements drawn from second to fourth rows in the p-block of the periodic table. Numerical evidence suggests that first order Taylor expansions of covalent bonding potentials can achieve high accuracy if (i) the alchemical interpolation is vertical (fixed geometry), (ii) it involves elements from the third and fourth rows of the periodic table, and (iii) an optimal reference geometry is used. This leads to near linear changes in the bonding potential, resulting in analytical predictions with chemical accuracy (∼1 kcal/mol). Second order estimates deteriorate the prediction. If initial and final molecules differ not only in composition but also in geometry, all estimates become substantially worse, with second order being slightly more accurate than first order. The independent particle approximation based second order perturbation theory performs poorly when compared to the coupled perturbed or finite difference approach. Taylor series expansions up to fourth order of the potential energy curve of highly symmetric systems indicate a finite radius of convergence, as illustrated for the alchemical stretching of H2+. Results are presented for (i) covalent bonds to hydrogen in 12 molecules with 8 valence electrons (CH4, NH3, H2O, HF, SiH4, PH3, H2S, HCl, GeH4, AsH3, H2Se, HBr); (ii) main-group single bonds in 9 molecules with 14 valence electrons (CH3F, CH3Cl, CH3Br, SiH3F, SiH3Cl, SiH3Br, GeH3F, GeH3Cl, GeH3Br); (iii) main-group double bonds in 9 molecules with 12 valence electrons (CH2O, CH2S, CH2Se, SiH2O, SiH2S, SiH2Se, GeH2O, GeH2S, GeH2Se); (iv) main-group triple bonds in 9 molecules with 10 valence electrons (HCN, HCP, HCAs, HSiN, HSi

  12. A novel fibrosis index comprising a non-cholesterol sterol accurately predicts HCV-related liver cirrhosis.

    PubMed

    Ydreborg, Magdalena; Lisovskaja, Vera; Lagging, Martin; Brehm Christensen, Peer; Langeland, Nina; Buhl, Mads Rauning; Pedersen, Court; Mørch, Kristine; Wejstål, Rune; Norkrans, Gunnar; Lindh, Magnus; Färkkilä, Martti; Westin, Johan

    2014-01-01

    Diagnosis of liver cirrhosis is essential in the management of chronic hepatitis C virus (HCV) infection. Liver biopsy is invasive and thus entails a risk of complications as well as a potential risk of sampling error. Therefore, non-invasive diagnostic tools are preferable. The aim of the present study was to create a model for accurate prediction of liver cirrhosis based on patient characteristics and biomarkers of liver fibrosis, including a panel of non-cholesterol sterols reflecting cholesterol synthesis and absorption and secretion. We evaluated variables with potential predictive significance for liver fibrosis in 278 patients originally included in a multicenter phase III treatment trial for chronic HCV infection. A stepwise multivariate logistic model selection was performed with liver cirrhosis, defined as Ishak fibrosis stage 5-6, as the outcome variable. A new index, referred to as the Nordic Liver Index (NoLI) in the paper, was based on the model: Log-odds (predicting cirrhosis) = -12.17 + (age × 0.11) + (BMI (kg/m2) × 0.23) + (D7-lathosterol (μg/100 mg cholesterol) × (-0.013)) + (platelet count (×10^9/L) × (-0.018)) + (prothrombin-INR × 3.69). The area under the ROC curve (AUROC) for prediction of cirrhosis was 0.91 (95% CI 0.86-0.96). The index was validated in a separate cohort of 83 patients and the AUROC for this cohort was similar (0.90; 95% CI: 0.82-0.98). In conclusion, the new index may complement other methods in diagnosing cirrhosis in patients with chronic HCV infection.
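
    The published NoLI coefficients can be applied directly; the sketch below computes the log-odds and converts them to a predicted probability of cirrhosis with the logistic function. The patient values in the example call are hypothetical, and age is assumed to be entered in years.

        import math

        def noli_log_odds(age_years, bmi_kg_m2, lathosterol_ug_per_100mg_chol,
                          platelets_e9_per_l, prothrombin_inr):
            """Log-odds of cirrhosis under the NoLI model reported in the abstract."""
            return (-12.17
                    + 0.11 * age_years
                    + 0.23 * bmi_kg_m2
                    - 0.013 * lathosterol_ug_per_100mg_chol
                    - 0.018 * platelets_e9_per_l
                    + 3.69 * prothrombin_inr)

        def noli_probability(*args, **kwargs):
            """Convert the log-odds to a predicted probability of cirrhosis."""
            z = noli_log_odds(*args, **kwargs)
            return 1.0 / (1.0 + math.exp(-z))

        # Hypothetical patient values, purely for illustration.
        print(round(noli_probability(55, 27.0, 120.0, 180.0, 1.1), 3))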

  13. Accurate in silico prediction of species-specific methylation sites based on information gain feature optimization.

    PubMed

    Wen, Ping-Ping; Shi, Shao-Ping; Xu, Hao-Dong; Wang, Li-Na; Qiu, Jian-Ding

    2016-10-15

    As one of the most important reversible types of post-translational modification, protein methylation catalyzed by methyltransferases carries many pivotal biological functions and is involved in many essential biological processes. Identification of methylation sites is a prerequisite for decoding methylation regulatory networks in living cells and understanding their physiological roles. Experimental methods are labor-intensive and time-consuming. In silico approaches offer a cost-effective, high-throughput way to predict potential methylation sites, but previous predictors are built on a single mixed model and their prediction performance is not yet fully satisfactory. Recently, with the increasing availability of quantitative methylation datasets in diverse species (especially in eukaryotes), there is a growing need to develop species-specific predictors. Here, we designed a tool named PSSMe, based on an information gain (IG) feature optimization method, for species-specific methylation site prediction. The IG method was adopted to analyze the importance and contribution of each feature and then select the most valuable feature dimensions to reconstitute a new, ordered feature vector, which was used to build the final prediction model. Our method improves prediction accuracy by about 15% compared with single features. Furthermore, our species-specific models significantly improve predictive performance compared with other general methylation prediction tools. Hence, our prediction results serve as useful resources to elucidate the mechanism of arginine or lysine methylation and to facilitate hypothesis-driven experimental design and validation. The tool's online service is implemented in C# and freely available at http://bioinfo.ncu.edu.cn/PSSMe.aspx. Contact: jdqiu@ncu.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights
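
    The feature-optimisation step can be approximated with scikit-learn by ranking features on their estimated mutual information with the methylation label and keeping only the top-scoring dimensions; mutual information is used here as a stand-in for the paper's information gain measure, and the placeholder arrays and cut-off are assumptions.

        import numpy as np
        from sklearn.feature_selection import mutual_info_classif

        # Placeholder encoded peptides: rows are methylation-site candidates, columns features.
        rng = np.random.default_rng(4)
        X = rng.normal(size=(500, 40))
        y = rng.integers(0, 2, size=500)      # 1 = methylated, 0 = not

        # Rank features by estimated mutual information with the class label and
        # keep only the highest-scoring dimensions for the final prediction model.
        ig = mutual_info_classif(X, y, random_state=0)
        keep = np.argsort(ig)[::-1][:15]
        X_reduced = X[:, keep]
        print(X_reduced.shape)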

  14. Accurate high-throughput structure mapping and prediction with transition metal ion FRET

    PubMed Central

    Yu, Xiaozhen; Wu, Xiongwu; Bermejo, Guillermo A.; Brooks, Bernard R.; Taraska, Justin W.

    2013-01-01

    Mapping the landscape of a protein’s conformational space is essential to understanding its functions and regulation. The limitations of many structural methods have made this process challenging for most proteins. Here, we report that transition metal ion FRET (tmFRET) can be used in a rapid, highly parallel screen to determine distances from multiple locations within a protein at extremely low concentrations. The distances generated through this screen for maltose binding protein (MBP) match distances from the crystal structure to within a few angstroms. Furthermore, energy transfer accurately detects structural changes during ligand binding. Finally, fluorescence-derived distances can be used to guide molecular simulations to find low energy states. Our results open the door to rapid, accurate mapping and prediction of protein structures at low concentrations, in large complex systems, and in living cells. PMID:23273426

  15. Accurate approximation method for prediction of class I MHC affinities for peptides of length 8, 10 and 11 using prediction tools trained on 9mers.

    PubMed

    Lundegaard, Claus; Lund, Ole; Nielsen, Morten

    2008-06-01

    Several accurate prediction systems have been developed for prediction of class I major histocompatibility complex (MHC):peptide binding. Most of these are trained on binding affinity data of primarily 9mer peptides. Here, we show how prediction methods trained on 9mer data can be used for accurate binding affinity prediction of peptides of length 8, 10 and 11. The method makes it possible to predict binding for peptides of lengths other than nine for MHC alleles where no such peptides have been measured. As validation, the performance of this approach is compared to predictors trained on peptides of the peptide length in question. In this validation, the approximation method has an accuracy that is comparable to or better than methods trained on a peptide length identical to the predicted peptides. The algorithm has been implemented in the web-accessible servers NetMHC-3.0: http://www.cbs.dtu.dk/services/NetMHC-3.0, and NetMHCpan-1.1: http://www.cbs.dtu.dk/services/NetMHCpan-1.1

  16. Ensemble-based prediction of RNA secondary structures.

    PubMed

    Aghaeepour, Nima; Hoos, Holger H

    2013-04-24

    Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between

  17. Accurate prediction of cardiorespiratory fitness using cycle ergometry in minimally disabled persons with relapsing-remitting multiple sclerosis.

    PubMed

    Motl, Robert W; Fernhall, Bo

    2012-03-01

    To examine the accuracy of predicting peak oxygen consumption (VO2peak) primarily from peak work rate (WRpeak) recorded during a maximal, incremental exercise test on a cycle ergometer among persons with relapsing-remitting multiple sclerosis (RRMS) who had minimal disability. Cross-sectional study. Clinical research laboratory. Women with RRMS (n=32) and sex-, age-, height-, and weight-matched healthy controls (n=16) completed an incremental exercise test on a cycle ergometer to volitional termination. Not applicable. Measured and predicted VO2peak and WRpeak. There were strong, statistically significant associations between measured and predicted VO2peak in the overall sample (R2=.89, standard error of the estimate=127.4 mL/min) and subsamples with (R2=.89, standard error of the estimate=131.3 mL/min) and without (R2=.85, standard error of the estimate=126.8 mL/min) multiple sclerosis (MS) based on the linear regression analyses. Based on the 95% confidence limits for worst-case errors, the equation predicted VO2peak within 10% of its true value in 95 of every 100 subjects with MS. Peak VO2 can be accurately predicted in persons with RRMS who have minimal disability, as it is in controls, by using established equations and the WRpeak recorded from a maximal, incremental exercise test on a cycle ergometer. Copyright © 2012 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  18. Structure-Based Predictions of Activity Cliffs

    PubMed Central

    Husby, Jarmila; Bottegoni, Giovanni; Kufareva, Irina; Abagyan, Ruben; Cavalli, Andrea

    2015-01-01

    In drug discovery, it is generally accepted that neighboring molecules in a given descriptors' space display similar activities. However, even in regions that provide strong predictability, structurally similar molecules can occasionally display large differences in potency. In QSAR jargon, these discontinuities in the activity landscape are known as ‘activity cliffs’. In this study, we assessed the reliability of ligand docking and virtual ligand screening schemes in predicting activity cliffs. We performed our calculations on a diverse, independently collected database of cliff-forming co-crystals. Starting from ideal situations, which allowed us to establish our baseline, we progressively moved toward simulating more realistic scenarios. Ensemble- and template-docking achieved a significant level of accuracy, suggesting that, despite the well-known limitations of empirical scoring schemes, activity cliffs can be accurately predicted by advanced structure-based methods. PMID:25918827

  19. Model-based prediction of myelosuppression and recovery based on frequent neutrophil monitoring.

    PubMed

    Netterberg, Ida; Nielsen, Elisabet I; Friberg, Lena E; Karlsson, Mats O

    2017-08-01

    To investigate whether a more frequent monitoring of the absolute neutrophil counts (ANC) during myelosuppressive chemotherapy, together with model-based predictions, can improve therapy management, compared to the limited clinical monitoring typically applied today. Daily ANC in chemotherapy-treated cancer patients were simulated from a previously published population model describing docetaxel-induced myelosuppression. The simulated values were used to generate predictions of the individual ANC time-courses, given the myelosuppression model. The accuracy of the predicted ANC was evaluated under a range of conditions with reduced amount of ANC measurements. The predictions were most accurate when more data were available for generating the predictions and when making short forecasts. The inaccuracy of ANC predictions was highest around nadir, although a high sensitivity (≥90%) was demonstrated to forecast Grade 4 neutropenia before it occurred. The time for a patient to recover to baseline could be well forecasted 6 days (±1 day) before the typical value occurred on day 17. Daily monitoring of the ANC, together with model-based predictions, could improve anticancer drug treatment by identifying patients at risk for severe neutropenia and predicting when the next cycle could be initiated.

  20. Development of a noise prediction model based on advanced fuzzy approaches in typical industrial workrooms.

    PubMed

    Aliabadi, Mohsen; Golmohammadi, Rostam; Khotanlou, Hassan; Mansoorizadeh, Muharram; Salarpour, Amir

    2014-01-01

    Noise prediction is considered to be the best method for evaluating cost-preventative noise controls in industrial workrooms. One of the most important issues is the development of accurate models for analysis of the complex relationships among acoustic features affecting noise level in workrooms. In this study, advanced fuzzy approaches were employed to develop relatively accurate models for predicting noise in noisy industrial workrooms. The data were collected from 60 industrial embroidery workrooms in the Khorasan Province, East of Iran. The main acoustic and embroidery process features that influence the noise were used to develop prediction models using MATLAB software. A multiple regression technique was also employed and its results were compared with those of the fuzzy approaches. Prediction errors of all prediction models based on fuzzy approaches were within the acceptable level (lower than one dB). However, the neuro-fuzzy model (RMSE = 0.53 dB and R2 = 0.88) slightly improved the accuracy of noise prediction compared with the generated fuzzy model. Moreover, fuzzy approaches provided more accurate predictions than did the regression technique. The developed models based on fuzzy approaches are useful prediction tools that give professionals the opportunity to make an informed decision about the effectiveness of acoustic treatment scenarios in embroidery workrooms.

  1. NMRDSP: an accurate prediction of protein shape strings from NMR chemical shifts and sequence data.

    PubMed

    Mao, Wusong; Cong, Peisheng; Wang, Zhiheng; Lu, Longjian; Zhu, Zhongliang; Li, Tonghua

    2013-01-01

    A shape string is a structural sequence and an extremely important representation of protein backbone conformations. Nuclear magnetic resonance chemical shifts correlate strongly with local protein structure and are exploited to predict protein structures in conjunction with computational approaches. Here we demonstrate a novel approach, NMRDSP, which can accurately predict the protein shape string based on nuclear magnetic resonance chemical shifts and structural profiles obtained from sequence data. The NMRDSP uses six chemical shifts (HA, H, N, CA, CB and C) and eight elements of structure profiles as features, a non-redundant set (1,003 entries) as the training set, and a conditional random field as the classification algorithm. For an independent testing set (203 entries), we achieved an accuracy of 75.8% for S8 (the eight-state accuracy) and 87.8% for S3 (the three-state accuracy). This is higher than using only chemical shifts or only sequence data, and confirms that the chemical shift and the structure profile are significant features for shape string prediction and that their combination prominently improves the accuracy of the predictor. We have constructed the NMRDSP web server and believe it could be employed to provide a solid platform to predict other protein structures and functions. The NMRDSP web server is freely available at http://cal.tongji.edu.cn/NMRDSP/index.jsp.

  2. Accurate multimodal probabilistic prediction of conversion to Alzheimer's disease in patients with mild cognitive impairment.

    PubMed

    Young, Jonathan; Modat, Marc; Cardoso, Manuel J; Mendelson, Alex; Cash, Dave; Ourselin, Sebastien

    2013-01-01

    Accurately identifying the patients with mild cognitive impairment (MCI) who will go on to develop Alzheimer's disease (AD) will become essential as new treatments will require identification of AD patients at earlier stages in the disease process. Most previous work in this area has centred around the same automated techniques used to diagnose AD patients from healthy controls, by coupling high dimensional brain image data or other relevant biomarker data to modern machine learning techniques. Such studies can now distinguish between AD patients and controls as accurately as an experienced clinician. Models trained on patients with AD and control subjects can also distinguish between MCI patients that will convert to AD within a given timeframe (MCI-c) and those that remain stable (MCI-s), although differences between these groups are smaller and thus, the corresponding accuracy is lower. The most common type of classifier used in these studies is the support vector machine, which gives categorical class decisions. In this paper, we introduce Gaussian process (GP) classification to the problem. This fully Bayesian method produces naturally probabilistic predictions, which we show correlate well with the actual chances of converting to AD within 3 years in a population of 96 MCI-s and 47 MCI-c subjects. Furthermore, we show that GPs can integrate multimodal data (in this study volumetric MRI, FDG-PET, cerebrospinal fluid, and APOE genotype) with the classification process through the use of a mixed kernel. The GP approach aids combination of different data sources by learning parameters automatically from training data via type-II maximum likelihood, which we compare to a more conventional method based on cross validation and an SVM classifier. When the resulting probabilities from the GP are dichotomised to produce a binary classification, the results for predicting MCI conversion based on the combination of all three types of data show a balanced accuracy
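
    As a rough illustration of the modelling idea (not the authors' implementation), the sketch below fits a scikit-learn Gaussian process classifier with a summed kernel to synthetic multimodal features and dichotomises the resulting probabilities; all data sizes and feature values are placeholders.

    ```python
    # Minimal sketch: probabilistic classification with a GP and a summed kernel,
    # hyperparameters fitted by type-II maximum likelihood inside the classifier.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessClassifier
    from sklearn.gaussian_process.kernels import RBF, DotProduct, ConstantKernel as C
    from sklearn.metrics import balanced_accuracy_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 143                                  # e.g. 96 MCI-s + 47 MCI-c subjects
    X = rng.normal(size=(n, 15))             # placeholder multimodal feature matrix
    y = (X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=n) > 0).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

    # Sum of a smooth (RBF) and a linear (DotProduct) component as a stand-in
    # for the mixed multimodal kernel described in the abstract.
    kernel = C(1.0) * RBF(length_scale=1.0) + C(1.0) * DotProduct()
    gpc = GaussianProcessClassifier(kernel=kernel, random_state=0).fit(X_tr, y_tr)

    proba = gpc.predict_proba(X_te)[:, 1]    # probabilistic predictions
    labels = (proba >= 0.5).astype(int)      # dichotomised for a binary decision
    print("balanced accuracy:", balanced_accuracy_score(y_te, labels))
    ```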

  3. SnowyOwl: accurate prediction of fungal genes by using RNA-Seq and homology information to select among ab initio models

    PubMed Central

    2014-01-01

    Background Locating the protein-coding genes in novel genomes is essential to understanding and exploiting the genomic information but it is still difficult to accurately predict all the genes. The recent availability of detailed information about transcript structure from high-throughput sequencing of messenger RNA (RNA-Seq) delineates many expressed genes and promises increased accuracy in gene prediction. Computational gene predictors have been intensively developed for and tested in well-studied animal genomes. Hundreds of fungal genomes are now or will soon be sequenced. The differences of fungal genomes from animal genomes and the phylogenetic sparsity of well-studied fungi call for gene-prediction tools tailored to them. Results SnowyOwl is a new gene prediction pipeline that uses RNA-Seq data to train and provide hints for the generation of Hidden Markov Model (HMM)-based gene predictions and to evaluate the resulting models. The pipeline has been developed and streamlined by comparing its predictions to manually curated gene models in three fungal genomes and validated against the high-quality gene annotation of Neurospora crassa; SnowyOwl predicted N. crassa genes with 83% sensitivity and 65% specificity. SnowyOwl gains sensitivity by repeatedly running the HMM gene predictor Augustus with varied input parameters and selectivity by choosing the models with best homology to known proteins and best agreement with the RNA-Seq data. Conclusions SnowyOwl efficiently uses RNA-Seq data to produce accurate gene models in both well-studied and novel fungal genomes. The source code for the SnowyOwl pipeline (in Python) and a web interface (in PHP) is freely available from http://sourceforge.net/projects/snowyowl/. PMID:24980894

  4. Protein asparagine deamidation prediction based on structures with machine learning methods.

    PubMed

    Jia, Lei; Sun, Yaxiong

    2017-01-01

    Chemical stability is a major concern in the development of protein therapeutics due to its impact on both efficacy and safety. Protein "hotspots" are amino acid residues that are subject to various chemical modifications, including deamidation, isomerization, glycosylation, oxidation, etc. A more accurate prediction method for potential hotspot residues would allow their elimination or reduction as early as possible in the drug discovery process. In this work, we focus on prediction models for asparagine (Asn) deamidation. The sequence-based prediction method simply identifies the NG motif (asparagine followed by a glycine) as liable to deamidation. It still dominates the deamidation evaluation process in most pharmaceutical settings due to its convenience. However, the simple sequence-based method is less accurate and often leads to over-engineering a protein. We introduce structure-based prediction models by mining available experimental and structural data of deamidated proteins. Our training set contains 194 Asn residues from 25 proteins that all have available high-resolution crystal structures. Experimentally measured deamidation half-lives of Asn in penta-peptides as well as 3D structure-based properties, such as solvent exposure, crystallographic B-factors, local secondary structure, and dihedral angles, were used to train prediction models with several machine learning algorithms. The prediction tools were cross-validated as well as tested with an external test data set. The random forest model had high enrichment in ranking deamidated residues higher than non-deamidated residues while effectively eliminating false positive predictions. It is possible that such quantitative protein structure-function relationship tools can also be applied to other protein hotspot predictions. In addition, we extensively discussed the metrics used to evaluate the performance of predictions on unbalanced data sets such as the deamidation case.
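
    The following sketch is a hypothetical, minimal stand-in for the structure-based approach described above: a random forest trained on invented structure-derived descriptors of Asn residues and scored by cross-validated ROC AUC. None of the feature values or labels come from the paper.

    ```python
    # Illustrative sketch only: structure-derived features for Asn residues
    # (synthetic stand-ins for solvent exposure, B-factor, penta-peptide half-life
    # and a backbone dihedral) fed to a class-weighted random forest.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    n = 194                                   # number of Asn residues, as in the training set
    X = pd.DataFrame({
        "solvent_exposure": rng.uniform(0, 1, n),
        "b_factor":         rng.normal(30, 10, n),
        "half_life_penta":  rng.lognormal(3, 1, n),
        "psi_angle":        rng.uniform(-180, 180, n),
    })
    # Synthetic labels loosely tied to exposure so the example is non-trivial.
    y = (X["solvent_exposure"] + rng.normal(0, 0.25, n) > 0.6).astype(int)

    clf = RandomForestClassifier(n_estimators=500, class_weight="balanced",
                                 random_state=1)
    # ROC AUC rewards ranking deamidated residues above non-deamidated ones,
    # mirroring the enrichment criterion discussed in the abstract.
    print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
    ```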

  5. Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space

    DOE PAGES

    Hansen, Katja; Biegler, Franziska; Ramakrishnan, Raghunathan; ...

    2015-06-04

    Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the “holy grail” of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. The same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.
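
    A schematic example of this model family, under the assumption that a fixed-length, bond-based descriptor vector is available for each molecule: kernel ridge regression with a Laplacian kernel on synthetic data. It illustrates the workflow only and does not reproduce the paper's Bag of Bonds featurisation or benchmark energies.

    ```python
    # Kernel ridge regression on a synthetic "bag of bonds"-style descriptor.
    import numpy as np
    from sklearn.kernel_ridge import KernelRidge
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    n_molecules, n_bond_features = 400, 60
    X = rng.exponential(scale=1.0, size=(n_molecules, n_bond_features))  # placeholder descriptors
    y = X @ rng.normal(size=n_bond_features) + rng.normal(scale=0.1, size=n_molecules)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)
    model = KernelRidge(kernel="laplacian", alpha=1e-6, gamma=1e-3).fit(X_tr, y_tr)
    # "Chemical accuracy" is usually quoted as ~1 kcal/mol mean absolute error.
    print("MAE:", mean_absolute_error(y_te, model.predict(X_te)))
    ```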

  6. Machine Learning Predictions of Molecular Properties: Accurate Many-Body Potentials and Nonlocality in Chemical Space

    PubMed Central

    2015-01-01

    Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the “holy grail” of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. In addition, the same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies. PMID:26113956

  7. Non-Fourier based thermal-mechanical tissue damage prediction for thermal ablation.

    PubMed

    Li, Xin; Zhong, Yongmin; Smith, Julian; Gu, Chengfan

    2017-01-02

    Prediction of tissue damage under thermal loads plays an important role in thermal ablation planning. A new methodology is presented in this paper by combining non-Fourier bio-heat transfer, constitutive elastic mechanics, and non-rigid motion dynamics to predict and analyze thermal distribution, thermal-induced mechanical deformation, and thermal-mechanical damage of soft tissues under thermal loads. Simulations and comparison analysis demonstrate that the proposed methodology based on non-Fourier bio-heat transfer can account for the thermal-induced mechanical behaviors of soft tissues and predict tissue thermal damage more accurately than the classical Fourier bio-heat transfer based model.
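
    To make the distinction from classical Fourier conduction concrete, here is a small, self-contained toy example (not the authors' model) that integrates the 1-D Cattaneo-Vernotte form of non-Fourier conduction, tau*T_tt + T_t = alpha*T_xx, with an explicit finite-difference scheme; the tissue parameters and boundary conditions are assumed for illustration only.

    ```python
    # 1-D non-Fourier (Cattaneo-Vernotte) heat conduction, explicit finite differences.
    # With tau = 0 the scheme reduces to the classical Fourier case.
    import numpy as np

    alpha = 1.4e-7       # thermal diffusivity of soft tissue, m^2/s (assumed)
    tau   = 10.0         # thermal relaxation time, s (assumed)
    L, nx = 0.01, 101    # 1 cm domain, 0.1 mm grid spacing
    dx = L / (nx - 1)
    dt = 0.1             # s; satisfies the CFL limit dt <= dx / sqrt(alpha/tau)

    T_prev = np.full(nx, 37.0)        # body-temperature initial condition, deg C
    T_curr = T_prev.copy()
    T_curr[0] = 80.0                  # heated boundary (ablation probe surface)

    a, b = tau / dt**2, 1.0 / (2.0 * dt)
    for _ in range(3000):             # advance 300 s
        lap = np.zeros(nx)
        lap[1:-1] = (T_curr[2:] - 2 * T_curr[1:-1] + T_curr[:-2]) / dx**2
        T_next = (alpha * lap + 2 * a * T_curr - (a - b) * T_prev) / (a + b)
        T_next[0], T_next[-1] = 80.0, 37.0   # fixed-temperature boundaries
        T_prev, T_curr = T_curr, T_next

    print("temperature 1 mm from the probe after 300 s:", round(T_curr[10], 2), "deg C")
    ```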

  8. Accurate prediction of acute fish toxicity of fragrance chemicals with the RTgill-W1 cell assay.

    PubMed

    Natsch, Andreas; Laue, Heike; Haupt, Tina; von Niederhäusern, Valentin; Sanders, Gordon

    2018-03-01

    Testing for acute fish toxicity is an integral part of the environmental safety assessment of chemicals. A true replacement of primary fish tissue was recently proposed using cell viability in a fish gill cell line (RTgill-W1) as a means of predicting acute toxicity, showing good predictivity on 35 chemicals. To promote regulatory acceptance, the predictivity and applicability domain of novel tests need to be carefully evaluated on chemicals with existing high-quality in vivo data. We applied the RTgill-W1 cell assay to 38 fragrance chemicals with a wide range of both physicochemical properties and median lethal concentration (LC50) values and representing a diverse range of chemistries. A strong correlation (R² = 0.90-0.94) between the logarithmic in vivo LC50 values, based on fish mortality, and the logarithmic in vitro median effect concentration (EC50) values based on cell viability was observed. A leave-one-out analysis illustrates a median under-/overprediction from in vitro EC50 values to in vivo LC50 values by a factor of 1.5. This assay offers a simple, accurate, and reliable alternative to in vivo acute fish toxicity testing for chemicals, presumably acting mainly by a narcotic mode of action. Furthermore, the present study provides validation of the predictivity of the RTgill-W1 assay on a completely independent set of chemicals that had not been previously tested and indicates that fragrance chemicals are clearly within the applicability domain. Environ Toxicol Chem 2018;37:931-941. © 2017 SETAC.

  9. How accurate are resting energy expenditure prediction equations in obese trauma and burn patients?

    PubMed

    Stucky, Chee-Chee H; Moncure, Michael; Hise, Mary; Gossage, Clint M; Northrop, David

    2008-01-01

    While the prevalence of obesity continues to increase in our society, outdated resting energy expenditure (REE) prediction equations may overpredict energy requirements in obese patients. Accurate feeding is essential since overfeeding has been demonstrated to adversely affect outcomes. The first objective was to compare REE calculated by prediction equations to the measured REE in obese trauma and burn patients. Our hypothesis was that an equation using fat-free mass would give a more accurate prediction. The second objective was to consider the effect of a commonly used injury factor on the predicted REE. A retrospective chart review was performed on 28 patients. REE was measured using indirect calorimetry and compared with the Harris-Benedict and Cunningham equations, and an equation using type II diabetes as a factor. Statistical analyses used were the paired t test, ±95% confidence interval, and the Bland-Altman method. Measured average REE in trauma and burn patients was 21.37 ± 5.26 and 21.81 ± 3.35 kcal/kg/d, respectively. Harris-Benedict underpredicted REE in trauma and burn patients to the least extent, while the Cunningham equation underpredicted REE in both populations to the greatest extent. Using an injury factor of 1.2, Cunningham continued to underestimate REE in both populations, while the Harris-Benedict and Diabetic equations overpredicted REE in both populations. The measured average REE is significantly less than current guidelines. This finding suggests that a hypocaloric regimen is worth considering for ICU patients. Also, if an injury factor of 1.2 is incorporated in certain equations, patients may be given too many calories.
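
    For reference, a short worked example of the two prediction equations named above, using the commonly cited Harris-Benedict (1919) and Cunningham coefficients and a hypothetical patient; the coefficients should be verified against a primary source before any clinical use.

    ```python
    # Worked example of the Harris-Benedict and Cunningham REE equations.
    def harris_benedict(weight_kg, height_cm, age_yr, sex):
        """Resting energy expenditure, kcal/day (1919 coefficients, assumed)."""
        if sex == "male":
            return 66.47 + 13.75 * weight_kg + 5.003 * height_cm - 6.755 * age_yr
        return 655.1 + 9.563 * weight_kg + 1.850 * height_cm - 4.676 * age_yr

    def cunningham(fat_free_mass_kg):
        """Resting energy expenditure from fat-free mass, kcal/day."""
        return 500 + 22 * fat_free_mass_kg

    # Hypothetical obese trauma patient (values invented for illustration).
    weight, height, age, ffm = 120.0, 175.0, 45.0, 70.0
    hb = harris_benedict(weight, height, age, "male")
    cu = cunningham(ffm)
    print(f"Harris-Benedict: {hb:.0f} kcal/d ({hb / weight:.1f} kcal/kg/d)")
    print(f"Cunningham:      {cu:.0f} kcal/d ({cu / weight:.1f} kcal/kg/d)")
    print(f"With injury factor 1.2: {1.2 * hb:.0f} vs {1.2 * cu:.0f} kcal/d")
    ```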

  10. Simulated Annealing Based Hybrid Forecast for Improving Daily Municipal Solid Waste Generation Prediction

    PubMed Central

    Song, Jingwei; He, Jiaying; Zhu, Menghua; Tan, Debao; Zhang, Yu; Ye, Song; Shen, Dingtao; Zou, Pengfei

    2014-01-01

    A simulated annealing (SA) based variable weighted forecast model is proposed to combine and weigh a local chaotic model, an artificial neural network (ANN), and a partial least square support vector machine (PLS-SVM) to build a more accurate forecast model. The hybrid model was built and its multistep-ahead prediction ability was tested based on daily MSW generation data from Seattle, Washington, the United States. The hybrid forecast model was shown to produce more accurate and reliable results and to degrade less in longer predictions than the three individual models. The average one-week-ahead prediction error was reduced from 11.21% (chaotic model), 12.93% (ANN), and 12.94% (PLS-SVM) to 9.38%. The five-week average error was reduced from 13.02% (chaotic model), 15.69% (ANN), and 15.92% (PLS-SVM) to 11.27%.
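
    A conceptual sketch of the weighting step (not the published hybrid model): simulated annealing searches for convex combination weights that minimise the error of a combined forecast against synthetic daily-generation data.

    ```python
    # Simulated annealing over forecast-combination weights (toy data).
    import numpy as np

    rng = np.random.default_rng(3)
    truth = 600 + 50 * np.sin(np.arange(200) / 7) + rng.normal(0, 15, 200)   # tons/day
    forecasts = np.vstack([truth + rng.normal(0, s, 200) for s in (20, 30, 35)])

    def mape(w):
        combined = w @ forecasts
        return np.mean(np.abs(combined - truth) / truth) * 100

    def anneal(n_iter=20000, temp0=1.0):
        w = np.full(3, 1 / 3)
        best_w, best_e = w.copy(), mape(w)
        e = best_e
        for i in range(n_iter):
            temp = temp0 * (1 - i / n_iter) + 1e-6
            cand = np.abs(w + rng.normal(0, 0.05, 3))
            cand /= cand.sum()                       # keep weights on the simplex
            ce = mape(cand)
            # Always accept improvements; accept worse moves with Boltzmann probability.
            if ce < e or rng.random() < np.exp((e - ce) / temp):
                w, e = cand, ce
                if e < best_e:
                    best_w, best_e = w.copy(), e
        return best_w, best_e

    weights, err = anneal()
    print("weights:", np.round(weights, 3), "MAPE: %.2f%%" % err)
    ```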

  11. ChIP-seq Accurately Predicts Tissue-Specific Activity of Enhancers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Visel, Axel; Blow, Matthew J.; Li, Zirong

    2009-02-01

    A major yet unresolved quest in decoding the human genome is the identification of the regulatory sequences that control the spatial and temporal expression of genes. Distant-acting transcriptional enhancers are particularly challenging to uncover since they are scattered amongst the vast non-coding portion of the genome. Evolutionary sequence constraint can facilitate the discovery of enhancers, but fails to predict when and where they are active in vivo. Here, we performed chromatin immunoprecipitation with the enhancer-associated protein p300, followed by massively-parallel sequencing, to map several thousand in vivo binding sites of p300 in mouse embryonic forebrain, midbrain, and limb tissue. We tested 86 of these sequences in a transgenic mouse assay, which in nearly all cases revealed reproducible enhancer activity in those tissues predicted by p300 binding. Our results indicate that in vivo mapping of p300 binding is a highly accurate means for identifying enhancers and their associated activities and suggest that such datasets will be useful to study the role of tissue-specific enhancers in human biology and disease on a genome-wide scale.

  12. Does the emergency surgery score accurately predict outcomes in emergent laparotomies?

    PubMed

    Peponis, Thomas; Bohnen, Jordan D; Sangji, Naveen F; Nandan, Anirudh R; Han, Kelsey; Lee, Jarone; Yeh, D Dante; de Moya, Marc A; Velmahos, George C; Chang, David C; Kaafarani, Haytham M A

    2017-08-01

    The emergency surgery score is a mortality-risk calculator for emergency general operation patients. We sought to examine whether the emergency surgery score predicts 30-day morbidity and mortality in a high-risk group of patients undergoing emergent laparotomy. Using the 2011-2012 American College of Surgeons National Surgical Quality Improvement Program database, we identified all patients who underwent emergent laparotomy using (1) the American College of Surgeons National Surgical Quality Improvement Program definition of "emergent," and (2) all Current Procedural Terminology codes denoting a laparotomy, excluding aortic aneurysm rupture. Multivariable logistic regression analyses were performed to measure the correlation (c-statistic) between the emergency surgery score and (1) 30-day mortality, and (2) 30-day morbidity after emergent laparotomy. As sensitivity analyses, the correlation between the emergency surgery score and 30-day mortality was also evaluated in prespecified subgroups based on Current Procedural Terminology codes. A total of 26,410 emergent laparotomy patients were included. Thirty-day mortality and morbidity were 10.2% and 43.8%, respectively. The emergency surgery score correlated well with mortality (c-statistic = 0.84); scores of 1, 11, and 22 correlated with mortalities of 0.4%, 39%, and 100%, respectively. Similarly, the emergency surgery score correlated well with morbidity (c-statistic = 0.74); scores of 0, 7, and 11 correlated with complication rates of 13%, 58%, and 79%, respectively. The morbidity rates plateaued for scores higher than 11. Sensitivity analyses demonstrated that the emergency surgery score effectively predicts mortality in patients undergoing emergent (1) splenic, (2) gastroduodenal, (3) intestinal, (4) hepatobiliary, or (5) incarcerated ventral hernia operation. The emergency surgery score accurately predicts outcomes in all types of emergent laparotomy patients and may prove valuable as a bedside decision
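
    The c-statistic reported above is the area under the ROC curve of the risk score against the binary outcome; a minimal illustration with made-up scores and outcomes (not NSQIP data):

    ```python
    # Hypothetical emergency-surgery scores and 30-day outcomes (1 = died).
    from sklearn.metrics import roc_auc_score

    ess_score = [1, 2, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 22]
    died_30d  = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1,  0,  1,  1,  1,  1]

    # The c-statistic is the probability that a randomly chosen non-survivor
    # has a higher score than a randomly chosen survivor.
    print("c-statistic:", round(roc_auc_score(died_30d, ess_score), 2))
    ```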

  13. MicroRNAfold: pre-microRNA secondary structure prediction based on modified NCM model with thermodynamics-based scoring strategy.

    PubMed

    Han, Dianwei; Zhang, Jun; Tang, Guiliang

    2012-01-01

    An accurate prediction of the pre-microRNA secondary structure is important in miRNA informatics. Based on a recently proposed model, nucleotide cyclic motifs (NCM), to predict RNA secondary structure, we propose and implement a Modified NCM (MNCM) model with a physics-based scoring strategy to tackle the problem of pre-microRNA folding. Our microRNAfold is implemented using a global optimal algorithm based on the bottom-up local optimal solutions. Our experimental results show that microRNAfold outperforms the current leading prediction tools in terms of true negative rate, false negative rate, specificity, and the Matthews correlation coefficient.

  14. Simple Mathematical Models Do Not Accurately Predict Early SIV Dynamics

    PubMed Central

    Noecker, Cecilia; Schaefer, Krista; Zaccheo, Kelly; Yang, Yiding; Day, Judy; Ganusov, Vitaly V.

    2015-01-01

    Upon infection of a new host, human immunodeficiency virus (HIV) replicates in the mucosal tissues and is generally undetectable in circulation for 1–2 weeks post-infection. Several interventions against HIV including vaccines and antiretroviral prophylaxis target virus replication at this earliest stage of infection. Mathematical models have been used to understand how HIV spreads from mucosal tissues systemically and what impact vaccination and/or antiretroviral prophylaxis has on viral eradication. Because predictions of such models have been rarely compared to experimental data, it remains unclear which processes included in these models are critical for predicting early HIV dynamics. Here we modified the “standard” mathematical model of HIV infection to include two populations of infected cells: cells that are actively producing the virus and cells that are transitioning into virus production mode. We evaluated the effects of several poorly known parameters on infection outcomes in this model and compared model predictions to experimental data on infection of non-human primates with variable doses of simian immunodeficiency virus (SIV). First, we found that the mode of virus production by infected cells (budding vs. bursting) has a minimal impact on the early virus dynamics for a wide range of model parameters, as long as the parameters are constrained to provide the observed rate of SIV load increase in the blood of infected animals. Interestingly, and in contrast with previous results, we found that the bursting mode of virus production generally results in a higher probability of viral extinction than the budding mode of virus production. Second, this mathematical model was not able to accurately describe the change in experimentally determined probability of host infection with increasing viral doses. Third and finally, the model was also unable to accurately explain the decline in the time to virus detection with increasing viral dose. These results
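
    A toy version of the modified standard model described above, with an eclipse/transition compartment between infection and virus production; all rate constants, the inoculum and the detection limit are assumed values for illustration only.

    ```python
    # Target cells T, transitioning cells E, productively infected cells I, virus V.
    import numpy as np
    from scipy.integrate import solve_ivp

    beta, k, delta, p, c = 1e-7, 1.0, 0.5, 1e3, 10.0   # assumed rate constants (per day)

    def rhs(t, y):
        T, E, I, V = y
        return [-beta * T * V,                 # loss of target cells to infection
                beta * T * V - k * E,          # cells transitioning to production
                k * E - delta * I,             # productively infected cells
                p * I - c * V]                 # virion production and clearance

    y0 = [1e7, 0.0, 0.0, 1.0]                  # small initial inoculum (assumed)
    sol = solve_ivp(rhs, (0, 21), y0, dense_output=True, rtol=1e-8)

    t = np.linspace(0, 21, 22)
    V = sol.sol(t)[3]
    detection_limit = 50.0                     # assumed assay threshold
    print("first day with detectable virus:", int(t[np.argmax(V > detection_limit)]))
    ```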

  15. Improving medical decisions for incapacitated persons: does focusing on "accurate predictions" lead to an inaccurate picture?

    PubMed

    Kim, Scott Y H

    2014-04-01

    The Patient Preference Predictor (PPP) proposal places a high priority on the accuracy of predicting patients' preferences and finds the performance of surrogates inadequate. However, the quest to develop a highly accurate, individualized statistical model has significant obstacles. First, it will be impossible to validate the PPP beyond the limit imposed by 60%-80% reliability of people's preferences for future medical decisions--a figure no better than the known average accuracy of surrogates. Second, evidence supports the view that a sizable minority of persons may not even have preferences to predict. Third, many, perhaps most, people express their autonomy just as much by entrusting their loved ones to exercise their judgment than by desiring to specifically control future decisions. Surrogate decision making faces none of these issues and, in fact, it may be more efficient, accurate, and authoritative than is commonly assumed.

  16. Probability-based collaborative filtering model for predicting gene-disease associations.

    PubMed

    Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan

    2017-12-28

    Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene-disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expenses. We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data of humans and data of other nonhuman species, are integrated in our model. Firstly, on the basis of a typical latent factorization model, we propose model I with an average heterogeneous regularization. Secondly, we develop modified model II with personal heterogeneous regularization to enhance the accuracy of the aforementioned models. In this model, vector space similarity or Pearson correlation coefficient metrics and data on related species are also used. We compared the results of PCFM with the results of four state-of-the-art approaches. The results show that PCFM performs better than other advanced approaches. The PCFM model can be leveraged for predictions of disease genes, especially for new human genes or diseases with no known relationships.
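
    A bare-bones latent-factor sketch of the collaborative-filtering idea (not PCFM itself, and without its heterogeneous regularization): a sparse gene-disease association matrix is factorised and unobserved pairs are ranked by the reconstructed scores; the data are random placeholders.

    ```python
    # Toy latent-factor model: R ~ G @ D.T fitted by gradient descent.
    import numpy as np

    rng = np.random.default_rng(4)
    n_genes, n_diseases, rank = 300, 80, 10
    R = (rng.random((n_genes, n_diseases)) < 0.02).astype(float)   # known associations

    G = 0.1 * rng.normal(size=(n_genes, rank))      # latent gene factors
    D = 0.1 * rng.normal(size=(n_diseases, rank))   # latent disease factors
    lr, reg = 0.02, 0.02                            # learning rate, L2 penalty

    for _ in range(500):
        err = R - G @ D.T                           # residual over all entries
        G, D = G + lr * (err @ D - reg * G), D + lr * (err.T @ G - reg * D)

    scores = G @ D.T                                # reconstructed association strengths
    masked = np.where(R > 0, -np.inf, scores)       # rank only the unobserved pairs
    gene, disease = np.unravel_index(np.argmax(masked), masked.shape)
    print("top candidate gene-disease pair:", gene, disease)
    ```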

  17. Saccades to future ball location reveal memory-based prediction in a virtual-reality interception task

    PubMed Central

    Diaz, Gabriel; Cooper, Joseph; Rothkopf, Constantin; Hayhoe, Mary

    2013-01-01

    Despite general agreement that prediction is a central aspect of perception, there is relatively little evidence concerning the basis on which visual predictions are made. Although both saccadic and pursuit eye-movements reveal knowledge of the future position of a moving visual target, in many of these studies targets move along simple trajectories through a fronto-parallel plane. Here, using a naturalistic and racquet-based interception task in a virtual environment, we demonstrate that subjects make accurate predictions of visual target motion, even when targets follow trajectories determined by the complex dynamics of physical interactions and the head and body are unrestrained. Furthermore, we found that, following a change in ball elasticity, subjects were able to accurately adjust their prebounce predictions of the ball's post-bounce trajectory. This suggests that prediction is guided by experience-based models of how information in the visual image will change over time. PMID:23325347

  18. Saccades to future ball location reveal memory-based prediction in a virtual-reality interception task.

    PubMed

    Diaz, Gabriel; Cooper, Joseph; Rothkopf, Constantin; Hayhoe, Mary

    2013-01-16

    Despite general agreement that prediction is a central aspect of perception, there is relatively little evidence concerning the basis on which visual predictions are made. Although both saccadic and pursuit eye-movements reveal knowledge of the future position of a moving visual target, in many of these studies targets move along simple trajectories through a fronto-parallel plane. Here, using a naturalistic and racquet-based interception task in a virtual environment, we demonstrate that subjects make accurate predictions of visual target motion, even when targets follow trajectories determined by the complex dynamics of physical interactions and the head and body are unrestrained. Furthermore, we found that, following a change in ball elasticity, subjects were able to accurately adjust their prebounce predictions of the ball's post-bounce trajectory. This suggests that prediction is guided by experience-based models of how information in the visual image will change over time.

  19. Rapid and accurate prediction of degradant formation rates in pharmaceutical formulations using high-performance liquid chromatography-mass spectrometry.

    PubMed

    Darrington, Richard T; Jiao, Jim

    2004-04-01

    Rapid and accurate stability prediction is essential to pharmaceutical formulation development. Commonly used stability prediction methods include monitoring parent drug loss at intended storage conditions or initial rate determination of degradants under accelerated conditions. Monitoring parent drug loss at the intended storage condition does not provide a rapid and accurate stability assessment because often <0.5% drug loss is all that can be observed in a realistic time frame, while the accelerated initial rate method in conjunction with extrapolation of rate constants using the Arrhenius or Eyring equations often introduces large errors in shelf-life prediction. In this study, the shelf life prediction of a model pharmaceutical preparation utilizing sensitive high-performance liquid chromatography-mass spectrometry (LC/MS) to directly quantitate degradant formation rates at the intended storage condition is proposed. This method was compared to traditional shelf life prediction approaches in terms of time required to predict shelf life and associated error in shelf life estimation. Results demonstrated that the proposed LC/MS method using initial rates analysis provided significantly improved confidence intervals for the predicted shelf life and required less overall time and effort to obtain the stability estimation compared to the other methods evaluated. Copyright 2004 Wiley-Liss, Inc. and the American Pharmacists Association.
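
    A simple numerical illustration of the initial-rate idea: fit the early, linear portion of degradant growth measured at the storage condition and extrapolate to the specification limit. The numbers are invented, and the plain linear fit ignores the confidence-interval treatment discussed in the paper.

    ```python
    # Shelf-life estimate from a degradant initial-rate fit (illustrative values).
    import numpy as np

    weeks     = np.array([0, 2, 4, 6, 8, 12])                          # sampling times
    degradant = np.array([0.000, 0.008, 0.017, 0.026, 0.033, 0.050])   # % of label claim

    slope, intercept = np.polyfit(weeks, degradant, 1)   # %/week, initial-rate fit
    spec_limit = 0.5                                     # % specification (assumed)
    shelf_life_weeks = (spec_limit - intercept) / slope

    print(f"degradant formation rate: {slope:.4f} %/week")
    print(f"predicted shelf life: {shelf_life_weeks / 52:.1f} years")
    ```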

  20. Non-Fourier based thermal-mechanical tissue damage prediction for thermal ablation

    PubMed Central

    Li, Xin; Zhong, Yongmin; Smith, Julian; Gu, Chengfan

    2017-01-01

    Prediction of tissue damage under thermal loads plays an important role in thermal ablation planning. A new methodology is presented in this paper by combining non-Fourier bio-heat transfer, constitutive elastic mechanics, and non-rigid motion dynamics to predict and analyze thermal distribution, thermal-induced mechanical deformation, and thermal-mechanical damage of soft tissues under thermal loads. Simulations and comparison analysis demonstrate that the proposed methodology based on non-Fourier bio-heat transfer can account for the thermal-induced mechanical behaviors of soft tissues and predict tissue thermal damage more accurately than the classical Fourier bio-heat transfer based model. PMID:27690290

  1. Accurate disulfide-bonding network predictions improve ab initio structure prediction of cysteine-rich proteins

    PubMed Central

    Yang, Jing; He, Bao-Ji; Jang, Richard; Zhang, Yang; Shen, Hong-Bin

    2015-01-01

    Abstract Motivation: Cysteine-rich proteins cover many important families in nature but there are currently no methods specifically designed for modeling the structure of these proteins. The accuracy of disulfide connectivity pattern prediction, particularly for the proteins of higher-order connections, e.g. >3 bonds, is too low to effectively assist structure assembly simulations. Results: We propose a new hierarchical order reduction protocol called Cyscon for disulfide-bonding prediction. The most confident disulfide bonds are first identified and bonding prediction is then focused on the remaining cysteine residues based on SVR training. Compared with purely machine learning-based approaches, Cyscon improved the average accuracy of connectivity pattern prediction by 21.9%. For proteins with more than 5 disulfide bonds, Cyscon improved the accuracy by 585% on the benchmark set of PDBCYS. When applied to 158 non-redundant cysteine-rich proteins, Cyscon predictions helped increase (or decrease) the TM-score (or RMSD) of the ab initio QUARK modeling by 12.1% (or 14.4%). This result demonstrates a new avenue to improve the ab initio structure modeling for cysteine-rich proteins. Availability and implementation: http://www.csbio.sjtu.edu.cn/bioinf/Cyscon/ Contact: zhng@umich.edu or hbshen@sjtu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254435

  2. Alignment-Based Prediction of Sites of Metabolism.

    PubMed

    de Bruyn Kops, Christina; Friedrich, Nils-Ole; Kirchmair, Johannes

    2017-06-26

    Prediction of metabolically labile atom positions in a molecule (sites of metabolism) is a key component of the simulation of xenobiotic metabolism as a whole, providing crucial information for the development of safe and effective drugs. In 2008, an exploratory study was published in which sites of metabolism were derived based on molecular shape- and chemical feature-based alignment to a molecule whose site of metabolism (SoM) had been determined by experiments. We present a detailed analysis of the breadth of applicability of alignment-based SoM prediction, including transfer of the approach from a structure- to ligand-based method and extension of the applicability of the models from cytochrome P450 2C9 to all cytochrome P450 isozymes involved in drug metabolism. We evaluate the effect of molecular similarity of the query and reference molecules on the ability of this approach to accurately predict SoMs. In addition, we combine the alignment-based method with a leading chemical reactivity model to take reactivity into account. The combined model yielded superior performance in comparison to the alignment-based approach and the reactivity models with an average area under the receiver operating characteristic curve of 0.85 in cross-validation experiments. In particular, early enrichment was improved, as evidenced by higher BEDROC scores (mean BEDROC = 0.59 for α = 20.0, mean BEDROC = 0.73 for α = 80.5).

  3. Modeling methodology for the accurate and prompt prediction of symptomatic events in chronic diseases.

    PubMed

    Pagán, Josué; Risco-Martín, José L; Moya, José M; Ayala, José L

    2016-08-01

    Prediction of symptomatic crises in chronic diseases allows decisions to be taken before the symptoms occur, such as the intake of drugs to avoid the symptoms or the activation of medical alarms. The prediction horizon is in this case an important parameter, since it must accommodate the pharmacokinetics of medications or the response time of medical services. This paper presents a study of the prediction limits of a chronic disease with symptomatic crises: the migraine. For that purpose, this work develops a methodology to build predictive migraine models and to improve these predictions beyond the limits of the initial models. The maximum prediction horizon is analyzed, and its dependency on the selected features is studied. A strategy for model selection is proposed to tackle the trade-off between conservative but robust predictive models and less accurate predictions with longer horizons. The obtained results show a prediction horizon close to 40 min, which is in the time range of the drug pharmacokinetics. Experiments have been performed in a realistic scenario where input data have been acquired in an ambulatory clinical study by the deployment of a non-intrusive Wireless Body Sensor Network. Our results provide an effective methodology for the selection of the future horizon in the development of prediction algorithms for diseases experiencing symptomatic crises. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. Accurate prediction of interfacial residues in two-domain proteins using evolutionary information: implications for three-dimensional modeling.

    PubMed

    Bhaskara, Ramachandra M; Padhi, Amrita; Srinivasan, Narayanaswamy

    2014-07-01

    With the preponderance of multidomain proteins in eukaryotic genomes, it is essential to recognize the constituent domains and their functions. Often function involves communications across the domain interfaces, and knowledge of the interacting sites is essential to our understanding of the structure-function relationship. Using evolutionary information extracted from homologous domains in at least two diverse domain architectures (single and multidomain), we predict the interface residues corresponding to domains from the two-domain proteins. We also use information from the three-dimensional structures of individual domains of two-domain proteins to train a naïve Bayes classifier model to predict the interfacial residues. Our predictions are highly accurate (∼85%) and specific (∼95%) to the domain-domain interfaces. This method is specific to multidomain proteins which contain domains in more than one protein architectural context. Using predicted residues to constrain domain-domain interaction, rigid-body docking was able to provide us with accurate full-length protein structures with correct orientation of domains. We believe that these results can be of considerable interest toward rational protein and interaction design, apart from providing us with valuable information on the nature of interactions. © 2013 Wiley Periodicals, Inc.

  5. PrePhyloPro: phylogenetic profile-based prediction of whole proteome linkages

    PubMed Central

    Niu, Yulong; Liu, Chengcheng; Moghimyfiroozabad, Shayan; Yang, Yi

    2017-01-01

    Direct and indirect functional links between proteins as well as their interactions as part of larger protein complexes or common signaling pathways may be predicted by analyzing the correlation of their evolutionary patterns. Based on phylogenetic profiling, here we present a highly scalable and time-efficient computational framework for predicting linkages within the whole human proteome. We have validated this method through analysis of 3,697 human pathways and molecular complexes and a comparison of our results with the prediction outcomes of previously published co-occurrency model-based and normalization methods. Here we also introduce PrePhyloPro, a web-based software that uses our method for accurately predicting proteome-wide linkages. We present data on interactions of human mitochondrial proteins, verifying the performance of this software. PrePhyloPro is freely available at http://prephylopro.org/phyloprofile/. PMID:28875072
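
    A minimal toy example of the phylogenetic-profiling principle underlying the method (my own sketch, not PrePhyloPro): proteins whose presence/absence patterns across genomes correlate strongly are predicted to be functionally linked.

    ```python
    # Correlation of binary phylogenetic profiles as a linkage score.
    import numpy as np

    rng = np.random.default_rng(5)
    n_genomes = 40

    profile_a = rng.integers(0, 2, size=n_genomes)               # presence/absence of protein A
    profile_b = profile_a.copy()
    profile_b[rng.choice(n_genomes, 4, replace=False)] ^= 1      # B mostly co-occurs with A
    profile_c = rng.integers(0, 2, size=n_genomes)               # unrelated protein C

    def link_score(p, q):
        """Pearson correlation between two binary phylogenetic profiles."""
        return np.corrcoef(p, q)[0, 1]

    print("A-B:", round(link_score(profile_a, profile_b), 2))    # high -> predicted linkage
    print("A-C:", round(link_score(profile_a, profile_c), 2))    # near zero -> no evidence
    ```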

  6. Do dual-route models accurately predict reading and spelling performance in individuals with acquired alexia and agraphia?

    PubMed

    Rapcsak, Steven Z; Henry, Maya L; Teague, Sommer L; Carnahan, Susan D; Beeson, Pélagie M

    2007-06-18

    Coltheart and co-workers [Castles, A., Bates, T. C., & Coltheart, M. (2006). John Marshall and the developmental dyslexias. Aphasiology, 20, 871-892; Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204-256] have demonstrated that an equation derived from dual-route theory accurately predicts reading performance in young normal readers and in children with reading impairment due to developmental dyslexia or stroke. In this paper, we present evidence that the dual-route equation and a related multiple regression model also accurately predict both reading and spelling performance in adult neurological patients with acquired alexia and agraphia. These findings provide empirical support for dual-route theories of written language processing.

  7. Deep-Learning-Based Drug-Target Interaction Prediction.

    PubMed

    Wen, Ming; Zhang, Zhimin; Niu, Shaoyu; Sha, Haozhi; Yang, Ruihan; Yun, Yonghuan; Lu, Hongmei

    2017-04-07

    Identifying interactions between known drugs and targets is a major challenge in drug repositioning. In silico prediction of drug-target interaction (DTI) can speed up the expensive and time-consuming experimental work by providing the most potent DTIs. In silico prediction of DTI can also provide insights about potential drug-drug interactions and promote the exploration of drug side effects. Traditionally, the performance of DTI prediction depends heavily on the descriptors used to represent the drugs and the target proteins. In this paper, to accurately predict new DTIs between approved drugs and targets without separating the targets into different classes, we developed a deep-learning-based algorithmic framework named DeepDTIs. It first abstracts representations from raw input descriptors using unsupervised pretraining and then applies known label pairs of interaction to build a classification model. Compared with other methods, DeepDTIs is found to reach or outperform other state-of-the-art methods. DeepDTIs can be further used to predict whether a new drug targets some existing targets or whether a new target interacts with some existing drugs.

  8. Beating Heart Motion Accurate Prediction Method Based on Interactive Multiple Model: An Information Fusion Approach

    PubMed Central

    Xie, Weihong; Yu, Yang

    2017-01-01

    Robot-assisted motion-compensated beating heart surgery has the advantage over conventional Coronary Artery Bypass Graft (CABG) surgery in terms of reduced trauma to the surrounding structures, which leads to shortened recovery time. The severe nonlinear and diverse nature of irregular heart rhythm makes it very difficult for the robot to meet clinical requirements, especially under arrhythmias. In this paper, we propose a fusion prediction framework based on an Interactive Multiple Model (IMM) estimator, allowing each model to cover a distinguishing feature of the heart motion in the underlying dynamics. We find that, at the normal state, the nonlinearity of the heart motion with slow time-variant changes dominates the beating process. When an arrhythmia occurs, the irregularity mode, the fast uncertainties with random patterns, becomes the leading factor of the heart motion. We deal with the prediction problem in the case of arrhythmias by estimating the state with two behavior modes which can adaptively “switch” from one to the other. Also, we employed the signal quality index to adaptively determine the switch transition probability in the framework of IMM. We conduct comparative experiments to evaluate the proposed approach with four distinct datasets. The test results indicate that the new proposed approach reduces prediction errors significantly. PMID:29124062

  9. Beating Heart Motion Accurate Prediction Method Based on Interactive Multiple Model: An Information Fusion Approach.

    PubMed

    Liang, Fan; Xie, Weihong; Yu, Yang

    2017-01-01

    Robot-assisted motion-compensated beating heart surgery has the advantage over conventional Coronary Artery Bypass Graft (CABG) surgery in terms of reduced trauma to the surrounding structures, which leads to shortened recovery time. The severe nonlinear and diverse nature of irregular heart rhythm makes it very difficult for the robot to meet clinical requirements, especially under arrhythmias. In this paper, we propose a fusion prediction framework based on an Interactive Multiple Model (IMM) estimator, allowing each model to cover a distinguishing feature of the heart motion in the underlying dynamics. We find that, at the normal state, the nonlinearity of the heart motion with slow time-variant changes dominates the beating process. When an arrhythmia occurs, the irregularity mode, the fast uncertainties with random patterns, becomes the leading factor of the heart motion. We deal with the prediction problem in the case of arrhythmias by estimating the state with two behavior modes which can adaptively "switch" from one to the other. Also, we employed the signal quality index to adaptively determine the switch transition probability in the framework of IMM. We conduct comparative experiments to evaluate the proposed approach with four distinct datasets. The test results indicate that the new proposed approach reduces prediction errors significantly.

  10. Radiometrically accurate scene-based nonuniformity correction for array sensors.

    PubMed

    Ratliff, Bradley M; Hayat, Majeed M; Tyo, J Scott

    2003-10-01

    A novel radiometrically accurate scene-based nonuniformity correction (NUC) algorithm is described. The technique combines absolute calibration with a recently reported algebraic scene-based NUC algorithm. The technique is based on the following principle: First, detectors that are along the perimeter of the focal-plane array are absolutely calibrated; then the calibration is transported to the remaining uncalibrated interior detectors through the application of the algebraic scene-based algorithm, which utilizes pairs of image frames exhibiting arbitrary global motion. The key advantage of this technique is that it can obtain radiometric accuracy during NUC without disrupting camera operation. Accurate estimates of the bias nonuniformity can be achieved with relatively few frames, which can be fewer than ten frame pairs. Advantages of this technique are discussed, and a thorough performance analysis is presented with use of simulated and real infrared imagery.

  11. Obtaining Accurate Probabilities Using Classifier Calibration

    ERIC Educational Resources Information Center

    Pakdaman Naeini, Mahdi

    2016-01-01

    Learning probabilistic classification and prediction models that generate accurate probabilities is essential in many prediction and decision-making tasks in machine learning and data mining. One way to achieve this goal is to post-process the output of classification models to obtain more accurate probabilities. These post-processing methods are…
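
    As a small, generic illustration of such post-processing (not the dissertation's method), scikit-learn's calibration wrapper can be applied to a classifier and compared with the raw model via the Brier score; the data here are synthetic.

    ```python
    # Post-processing a classifier's outputs with isotonic calibration.
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import brier_score_loss
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=3000, n_features=20, random_state=6)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=6)

    raw = RandomForestClassifier(n_estimators=200, random_state=6).fit(X_tr, y_tr)
    calibrated = CalibratedClassifierCV(
        RandomForestClassifier(n_estimators=200, random_state=6),
        method="isotonic", cv=5).fit(X_tr, y_tr)

    # Lower Brier score indicates probabilities closer to observed outcome frequencies.
    print("raw       :", brier_score_loss(y_te, raw.predict_proba(X_te)[:, 1]))
    print("calibrated:", brier_score_loss(y_te, calibrated.predict_proba(X_te)[:, 1]))
    ```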

  12. A micromechanics-based strength prediction methodology for notched metal matrix composites

    NASA Technical Reports Server (NTRS)

    Bigelow, C. A.

    1992-01-01

    An analytical micromechanics-based strength prediction methodology was developed to predict failure of notched metal matrix composites. The stress-strain behavior and notched strength of two metal matrix composites, boron/aluminum (B/Al) and silicon-carbide/titanium (SCS-6/Ti-15-3), were predicted. The prediction methodology combines analytical techniques ranging from a three-dimensional finite element analysis of a notched specimen to a micromechanical model of a single fiber. In the B/Al laminates, a fiber failure criterion based on the axial and shear stress in the fiber accurately predicted laminate failure for a variety of layups and notch-length to specimen-width ratios with both circular holes and sharp notches when matrix plasticity was included in the analysis. For the SCS-6/Ti-15-3 laminates, a fiber failure criterion based on the axial stress in the fiber correlated well with experimental results for static and post-fatigue residual strengths when fiber-matrix debonding and matrix cracking were included in the analysis. The micromechanics-based strength prediction methodology offers a direct approach to strength prediction by modeling behavior and damage on a constituent level, thus explicitly including matrix nonlinearity, fiber-matrix debonding, and matrix cracking.

  13. A micromechanics-based strength prediction methodology for notched metal-matrix composites

    NASA Technical Reports Server (NTRS)

    Bigelow, C. A.

    1993-01-01

    An analytical micromechanics-based strength prediction methodology was developed to predict failure of notched metal matrix composites. The stress-strain behavior and notched strength of two metal matrix composites, boron/aluminum (B/Al) and silicon-carbide/titanium (SCS-6/Ti-15-3), were predicted. The prediction methodology combines analytical techniques ranging from a three-dimensional finite element analysis of a notched specimen to a micromechanical model of a single fiber. In the B/Al laminates, a fiber failure criterion based on the axial and shear stress in the fiber accurately predicted laminate failure for a variety of layups and notch-length to specimen-width ratios with both circular holes and sharp notches when matrix plasticity was included in the analysis. For the SCS-6/Ti-15-3 laminates, a fiber failure criterion based on the axial stress in the fiber correlated well with experimental results for static and postfatigue residual strengths when fiber-matrix debonding and matrix cracking were included in the analysis. The micromechanics-based strength prediction methodology offers a direct approach to strength prediction by modeling behavior and damage on a constituent level, thus explicitly including matrix nonlinearity, fiber-matrix debonding, and matrix cracking.

  14. Utilizing Adjoint-Based Error Estimates for Surrogate Models to Accurately Predict Probabilities of Events

    DOE PAGES

    Butler, Troy; Wildey, Timothy

    2018-01-01

    In this study, we develop a procedure to utilize error estimates for samples of a surrogate model to compute robust upper and lower bounds on estimates of probabilities of events. We show that these error estimates can also be used in an adaptive algorithm to simultaneously reduce the computational cost and increase the accuracy in estimating probabilities of events using computationally expensive high-fidelity models. Specifically, we introduce the notion of reliability of a sample of a surrogate model, and we prove that utilizing the surrogate model for the reliable samples and the high-fidelity model for the unreliable samples gives precisely the same estimate of the probability of the output event as would be obtained by evaluation of the original model for each sample. The adaptive algorithm uses the additional evaluations of the high-fidelity model for the unreliable samples to locally improve the surrogate model near the limit state, which significantly reduces the number of high-fidelity model evaluations as the limit state is resolved. Numerical results based on a recently developed adjoint-based approach for estimating the error in samples of a surrogate are provided to demonstrate (1) the robustness of the bounds on the probability of an event, and (2) that the adaptive enhancement algorithm provides a more accurate estimate of the probability of the QoI event than standard response surface approximation methods at a lower computational cost.

  15. Utilizing Adjoint-Based Error Estimates for Surrogate Models to Accurately Predict Probabilities of Events

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Butler, Troy; Wildey, Timothy

    In this study, we develop a procedure to utilize error estimates for samples of a surrogate model to compute robust upper and lower bounds on estimates of probabilities of events. We show that these error estimates can also be used in an adaptive algorithm to simultaneously reduce the computational cost and increase the accuracy in estimating probabilities of events using computationally expensive high-fidelity models. Specifically, we introduce the notion of reliability of a sample of a surrogate model, and we prove that utilizing the surrogate model for the reliable samples and the high-fidelity model for the unreliable samples gives precisely the same estimate of the probability of the output event as would be obtained by evaluation of the original model for each sample. The adaptive algorithm uses the additional evaluations of the high-fidelity model for the unreliable samples to locally improve the surrogate model near the limit state, which significantly reduces the number of high-fidelity model evaluations as the limit state is resolved. Numerical results based on a recently developed adjoint-based approach for estimating the error in samples of a surrogate are provided to demonstrate (1) the robustness of the bounds on the probability of an event, and (2) that the adaptive enhancement algorithm provides a more accurate estimate of the probability of the QoI event than standard response surface approximation methods at a lower computational cost.

  16. Remaining Useful Life Prediction for Lithium-Ion Batteries Based on Gaussian Processes Mixture

    PubMed Central

    Li, Lingling; Wang, Pengchong; Chao, Kuei-Hsiang; Zhou, Yatong; Xie, Yang

    2016-01-01

    The remaining useful life (RUL) prediction of lithium-ion batteries is closely related to the capacity degeneration trajectories. Due to self-charging and capacity regeneration, the trajectories have the property of multimodality. Traditional prediction models such as support vector machines (SVM) or Gaussian process regression (GPR) cannot accurately characterize this multimodality. This paper proposes a novel RUL prediction method based on the Gaussian Process Mixture (GPM). It can handle multimodality by fitting different segments of trajectories with different GPR models separately, such that the tiny differences among these segments can be revealed. The method is demonstrated to be effective by the excellent predictive results of experiments on two commercial rechargeable Type 1850 lithium-ion batteries provided by NASA. The performance comparison among the models illustrates that the GPM is more accurate than the SVM and the GPR. In addition, GPM can yield a predictive confidence interval, which makes the prediction more reliable than that of traditional models. PMID:27632176

  17. A deep learning-based multi-model ensemble method for cancer prediction.

    PubMed

    Xiao, Yawen; Wu, Jun; Lin, Zongli; Zhao, Xiaodong

    2018-01-01

    Cancer is a complex worldwide health problem associated with high mortality. With the rapid development of the high-throughput sequencing technology and the application of various machine learning methods that have emerged in recent years, progress in cancer prediction has been increasingly made based on gene expression, providing insight into effective and accurate treatment decision making. Thus, developing machine learning methods, which can successfully distinguish cancer patients from healthy persons, is of great current interest. However, among the classification methods applied to cancer prediction so far, no one method outperforms all the others. In this paper, we demonstrate a new strategy, which applies deep learning to an ensemble approach that incorporates multiple different machine learning models. We supply informative gene data selected by differential gene expression analysis to five different classification models. Then, a deep learning method is employed to ensemble the outputs of the five classifiers. The proposed deep learning-based multi-model ensemble method was tested on three public RNA-seq data sets of three kinds of cancers, Lung Adenocarcinoma, Stomach Adenocarcinoma and Breast Invasive Carcinoma. The test results indicate that it increases the prediction accuracy of cancer for all the tested RNA-seq data sets as compared to using a single classifier or the majority voting algorithm. By taking full advantage of different classifiers, the proposed deep learning-based multi-model ensemble method is shown to be accurate and effective for cancer prediction. Copyright © 2017 Elsevier B.V. All rights reserved.
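
    A rough analogue of the strategy (not the paper's code): several base classifiers are trained on expression-like synthetic features and a small neural network combines their predicted probabilities, using scikit-learn's StackingClassifier as a stand-in for the deep-learning ensembling step.

    ```python
    # Multi-model ensemble: heterogeneous base learners + MLP meta-learner.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                                  StackingClassifier)
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    # Synthetic stand-in for differentially expressed gene features.
    X, y = make_classification(n_samples=600, n_features=300, n_informative=40,
                               random_state=7)

    base_learners = [
        ("lr",  LogisticRegression(max_iter=2000)),
        ("rf",  RandomForestClassifier(n_estimators=300, random_state=7)),
        ("gb",  GradientBoostingClassifier(random_state=7)),
        ("svm", SVC(probability=True, random_state=7)),
    ]
    ensemble = StackingClassifier(
        estimators=base_learners,
        final_estimator=MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                      random_state=7),
        stack_method="predict_proba", cv=5)

    print("ensemble accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
    ```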

  18. Ensemble MD simulations restrained via crystallographic data: Accurate structure leads to accurate dynamics

    PubMed Central

    Xue, Yi; Skrynnikov, Nikolai R

    2014-01-01

    Currently, the best existing molecular dynamics (MD) force fields cannot accurately reproduce the global free-energy minimum which realizes the experimental protein structure. As a result, long MD trajectories tend to drift away from the starting coordinates (e.g., crystallographic structures). To address this problem, we have devised a new simulation strategy aimed at protein crystals. An MD simulation of protein crystal is essentially an ensemble simulation involving multiple protein molecules in a crystal unit cell (or a block of unit cells). To ensure that average protein coordinates remain correct during the simulation, we introduced crystallography-based restraints into the MD protocol. Because these restraints are aimed at the ensemble-average structure, they have only minimal impact on conformational dynamics of the individual protein molecules. So long as the average structure remains reasonable, the proteins move in a native-like fashion as dictated by the original force field. To validate this approach, we have used the data from solid-state NMR spectroscopy, which is the orthogonal experimental technique uniquely sensitive to protein local dynamics. The new method has been tested on the well-established model protein, ubiquitin. The ensemble-restrained MD simulations produced lower crystallographic R factors than conventional simulations; they also led to more accurate predictions for crystallographic temperature factors, solid-state chemical shifts, and backbone order parameters. The predictions for 15N R1 relaxation rates are at least as accurate as those obtained from conventional simulations. Taken together, these results suggest that the presented trajectories may be among the most realistic protein MD simulations ever reported. In this context, the ensemble restraints based on high-resolution crystallographic data can be viewed as protein-specific empirical corrections to the standard force fields. PMID:24452989

  19. Combining first-principles and data modeling for the accurate prediction of the refractive index of organic polymers

    NASA Astrophysics Data System (ADS)

    Afzal, Mohammad Atif Faiz; Cheng, Chong; Hachmann, Johannes

    2018-06-01

    Organic materials with a high index of refraction (RI) are attracting considerable interest due to their potential application in optic and optoelectronic devices. However, most of these applications require an RI value of 1.7 or larger, while typical carbon-based polymers only exhibit values in the range of 1.3-1.5. This paper introduces an efficient computational protocol for the accurate prediction of RI values in polymers to facilitate in silico studies that can guide the discovery and design of next-generation high-RI materials. Our protocol is based on the Lorentz-Lorenz equation and is parametrized by the polarizability and number density values of a given candidate compound. In the proposed scheme, we compute the former using first-principles electronic structure theory and the latter using an approximation based on van der Waals volumes. The critical parameter in the number density approximation is the packing fraction of the bulk polymer, for which we have devised a machine learning model. We demonstrate the performance of the proposed RI protocol by testing its predictions against the experimentally known RI values of 112 optical polymers. Our approach to combine first-principles and data modeling emerges as both a successful and a highly economical path to determining the RI values for a wide range of organic polymers.
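
    A worked example of the Lorentz-Lorenz relation at the core of the protocol, (n² − 1)/(n² + 2) = (4π/3)·N·α, solved for n from an assumed polarizability and number density; the input values are illustrative (roughly polystyrene-like) and are not results from the paper.

    ```python
    # Refractive index from the Lorentz-Lorenz relation (illustrative inputs).
    import math

    def refractive_index(alpha_cm3, number_density_cm3):
        """n from molecular polarizability (cm^3) and number density (cm^-3)."""
        x = 4.0 * math.pi / 3.0 * number_density_cm3 * alpha_cm3
        return math.sqrt((1.0 + 2.0 * x) / (1.0 - x))

    # Hypothetical repeat unit: polarizability 1.2e-23 cm^3, molar mass 104 g/mol,
    # bulk density 1.05 g/cm^3 (values assumed for illustration).
    N_A = 6.022e23
    number_density = 1.05 / 104.0 * N_A      # repeat units per cm^3
    print("predicted n:", round(refractive_index(1.2e-23, number_density), 3))
    ```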

  20. Limb-Enhancer Genie: An accessible resource of accurate enhancer predictions in the developing limb

    DOE PAGES

    Monti, Remo; Barozzi, Iros; Osterwalder, Marco; ...

    2017-08-21

    Epigenomic mapping of enhancer-associated chromatin modifications facilitates the genome-wide discovery of tissue-specific enhancers in vivo. However, reliance on single chromatin marks leads to high rates of false-positive predictions. More sophisticated, integrative methods have been described, but commonly suffer from limited accessibility to the resulting predictions and reduced biological interpretability. Here we present the Limb-Enhancer Genie (LEG), a collection of highly accurate, genome-wide predictions of enhancers in the developing limb, available through a user-friendly online interface. We predict limb enhancers using a combination of > 50 published limb-specific datasets and clusters of evolutionarily conserved transcription factor binding sites, taking advantage of the patterns observed at previously in vivo validated elements. By combining different statistical models, our approach outperforms current state-of-the-art methods and provides interpretable measures of feature importance. Our results indicate that including a previously unappreciated score that quantifies tissue-specific nuclease accessibility significantly improves prediction performance. We demonstrate the utility of our approach through in vivo validation of newly predicted elements. Moreover, we describe general features that can guide the type of datasets to include when predicting tissue-specific enhancers genome-wide, while providing an accessible resource to the general biological community and facilitating the functional interpretation of genetic studies of limb malformations.

  1. A link prediction approach to cancer drug sensitivity prediction.

    PubMed

    Turki, Turki; Wei, Zhi

    2017-10-03

    Predicting the response to a drug for cancer disease patients based on genomic information is an important problem in modern clinical oncology. This problem arises in part because many available drug sensitivity prediction algorithms neither consider better-quality cancer cell lines nor adopt new feature representations, both of which would lead to more accurate prediction of drug responses. By predicting accurate drug responses to cancer, oncologists gain a more complete understanding of the effective treatments for each patient, which is a core goal in precision medicine. In this paper, we model cancer drug sensitivity as a link prediction problem, which is shown to be an effective technique. We evaluate our proposed link prediction algorithms and compare them with an existing drug sensitivity prediction approach based on clinical trial data. The experimental results based on the clinical trial data show the stability of our link prediction algorithms, which yield the highest area under the ROC curve (AUC) and are statistically significant. We propose a link prediction approach to obtain a new feature representation. Compared with an existing approach, the results show that incorporating the new feature representation into the link prediction algorithms significantly improves their performance.

  2. Predicted osteotomy planes are accurate when using patient-specific instrumentation for total knee arthroplasty in cadavers: a descriptive analysis.

    PubMed

    Kievit, A J; Dobbe, J G G; Streekstra, G J; Blankevoort, L; Schafroth, M U

    2018-06-01

    Malalignment of implants is a major source of failure during total knee arthroplasty. To achieve more accurate 3D planning and execution of the osteotomy cuts during surgery, the Signature (Biomet, Warsaw) patient-specific instrumentation (PSI) was used to produce pin guides for the positioning of the osteotomy blocks by means of computer-aided manufacture based on CT scan images. The research question of this study is: what is the transfer accuracy of osteotomy planes predicted by the Signature PSI system for preoperative 3D planning and intraoperative block-guided pin placement to perform total knee arthroplasty procedures? The transfer accuracy achieved by using the Signature PSI system was evaluated by comparing the osteotomy planes predicted preoperatively with the osteotomy planes seen intraoperatively in human cadaveric legs. Outcomes were measured in terms of translational and rotational errors (varus, valgus, flexion, extension and axial rotation) for both tibia and femur osteotomies. Average translational errors between the osteotomy planes predicted using the Signature system and the actual osteotomy planes achieved were 0.8 mm (± 0.5 mm) for the tibia and 0.7 mm (± 4.0 mm) for the femur. Average rotational errors in relation to predicted and achieved osteotomy planes were 0.1° (± 1.2°) of varus and 0.4° (± 1.7°) of anterior slope (extension) for the tibia, and 2.8° (± 2.0°) of varus, 0.9° (± 2.7°) of flexion and 1.4° (± 2.2°) of external rotation for the femur. The similarity between osteotomy planes predicted using the Signature system and osteotomy planes actually achieved was excellent for the tibia, although some discrepancies were seen for the femur. The use of 3D planning techniques in TKA surgery can provide accurate intraoperative guidance tailored to individual patients, especially those with deformed bone, and can ensure better placement of the implant.

  3. Prognostic breast cancer signature identified from 3D culture model accurately predicts clinical outcome across independent datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martin, Katherine J.; Patrick, Denis R.; Bissell, Mina J.

    2008-10-20

    One of the major tenets in breast cancer research is that early detection is vital for patient survival by increasing treatment options. To that end, we have previously used a novel unsupervised approach to identify a set of genes whose expression predicts prognosis of breast cancer patients. The predictive genes were selected in a well-defined three dimensional (3D) cell culture model of non-malignant human mammary epithelial cell morphogenesis as down-regulated during breast epithelial cell acinar formation and cell cycle arrest. Here we examine the ability of this gene signature (3D-signature) to predict prognosis in three independent breast cancer microarray datasets having 295, 286, and 118 samples, respectively. Our results show that the 3D-signature accurately predicts prognosis in three unrelated patient datasets. At 10 years, the probability of positive outcome was 52, 51, and 47 percent in the group with a poor-prognosis signature and 91, 75, and 71 percent in the group with a good-prognosis signature for the three datasets, respectively (Kaplan-Meier survival analysis, p<0.05). Hazard ratios for poor outcome were 5.5 (95% CI 3.0 to 12.2, p<0.0001), 2.4 (95% CI 1.6 to 3.6, p<0.0001) and 1.9 (95% CI 1.1 to 3.2, p = 0.016) and remained significant for the two larger datasets when corrected for estrogen receptor (ER) status. Hence the 3D-signature accurately predicts breast cancer outcome in both ER-positive and ER-negative tumors, though individual genes differed in their prognostic ability in the two subtypes. Genes that were prognostic in ER-positive patients are AURKA, CEP55, RRM2, EPHA2, FGFBP1, and VRK1, while genes prognostic in ER-negative patients include ACTB, FOXM1 and SERPINE2 (Kaplan-Meier p<0.05). Multivariable Cox regression analysis in the largest dataset showed that the 3D-signature was a strong independent factor in predicting breast cancer outcome. The 3D-signature accurately predicts breast cancer outcome across multiple datasets and holds

  4. Moving Toward Integrating Gene Expression Profiling Into High-Throughput Testing: A Gene Expression Biomarker Accurately Predicts Estrogen Receptor α Modulation in a Microarray Compendium

    PubMed Central

    Ryan, Natalia; Chorley, Brian; Tice, Raymond R.; Judson, Richard; Corton, J. Christopher

    2016-01-01

    Microarray profiling of chemical-induced effects is being increasingly used in medium- and high-throughput formats. Computational methods are described here to identify molecular targets from whole-genome microarray data using as an example the estrogen receptor α (ERα), often modulated by potential endocrine disrupting chemicals. ERα biomarker genes were identified by their consistent expression after exposure to 7 structurally diverse ERα agonists and 3 ERα antagonists in ERα-positive MCF-7 cells. Most of the biomarker genes were shown to be directly regulated by ERα as determined by ESR1 gene knockdown using siRNA as well as through chromatin immunoprecipitation coupled with DNA sequencing analysis of ERα-DNA interactions. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression datasets from experiments using MCF-7 cells, including those evaluating the transcriptional effects of hormones and chemicals. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% and 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) ER reference chemicals including “very weak” agonists. Importantly, the biomarker predictions accurately replicated predictions based on 18 in vitro high-throughput screening assays that queried different steps in ERα signaling. For 114 chemicals, the balanced accuracies were 95% and 98% for activation or suppression, respectively. These results demonstrate that the ERα gene expression biomarker can accurately identify ERα modulators in large collections of microarray data derived from MCF-7 cells. PMID:26865669

  5. Prediction of blast-induced air overpressure: a hybrid AI-based predictive model.

    PubMed

    Jahed Armaghani, Danial; Hajihassani, Mohsen; Marto, Aminaton; Shirani Faradonbeh, Roohollah; Mohamad, Edy Tonnizam

    2015-11-01

    Blast operations in the vicinity of residential areas usually produce significant environmental problems which may cause severe damage to the nearby areas. Blast-induced air overpressure (AOp) is one of the most important environmental impacts of blast operations which needs to be predicted to minimize the potential risk of damage. This paper presents an artificial neural network (ANN) optimized by the imperialist competitive algorithm (ICA) for the prediction of AOp induced by quarry blasting. For this purpose, 95 blasting operations were precisely monitored in a granite quarry site in Malaysia and AOp values were recorded in each operation. Furthermore, the most influential parameters on AOp, including the maximum charge per delay and the distance between the blast-face and monitoring point, were measured and used to train the ICA-ANN model. Based on the generalized predictor equation and considering the measured data from the granite quarry site, a new empirical equation was developed to predict AOp. For comparison purposes, conventional ANN models were developed and compared with the ICA-ANN results. The results demonstrated that the proposed ICA-ANN model is able to predict blast-induced AOp more accurately than other presented techniques.

  6. Using radiance predicted by the P3 approximation in a spherical geometry to predict tissue optical properties

    NASA Astrophysics Data System (ADS)

    Dickey, Dwayne J.; Moore, Ronald B.; Tulip, John

    2001-01-01

    For photodynamic therapy of solid tumors, such as prostatic carcinoma, to be achieved, an accurate model to predict tissue parameters and light dose must be found. Presently, most analytical light dosimetry models are fluence based and are not clinically viable for tissue characterization. Other methods of predicting optical properties, such as Monte Carlo simulation, are accurate but far too time consuming for clinical application. However, radiance predicted by the P3-Approximation, an analytical solution to the transport equation, may be a viable and accurate alternative. The P3-Approximation accurately predicts optical parameters in intralipid/methylene blue based phantoms in a spherical geometry. The optical parameters furnished by the radiance, when introduced into fluence predicted by both the P3-Approximation and Grosjean Theory, correlate well with experimental data. The P3-Approximation also predicts the optical properties of prostate tissue, agreeing with documented optical parameters. The P3-Approximation could be the clinical tool necessary to facilitate PDT of solid tumors because of the limited number of invasive measurements required and the speed with which accurate calculations can be performed.

  7. Predicting missing links in complex networks based on common neighbors and distance

    PubMed Central

    Yang, Jinxuan; Zhang, Xiao-Dong

    2016-01-01

    Algorithms that use the common-neighbors metric to predict missing links in complex networks are very popular, but most of them do not account for missing links between nodes that share no common neighbors. In some cases, especially when node pairs have few common neighbors, these methods are not accurate enough to reconstruct networks. We propose in this paper a new algorithm based on common neighbors and distance to improve the accuracy of link prediction. The proposed algorithm is remarkably effective at predicting missing links between nodes with no common neighbors and performs better than most existing, currently used methods for a variety of real-world networks without increasing complexity. PMID:27905526
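
    A minimal sketch of the idea, assuming NetworkX is available: the score below simply adds the common-neighbor count to an inverse-distance term so that node pairs without common neighbors still receive a nonzero score. The mixing weight and the exact combination are illustrative assumptions, not the formula of the paper.

      import networkx as nx

      def cn_distance_score(G, u, v, alpha=0.5):
          # Common-neighbor count plus an inverse shortest-path term; the
          # distance term rescues pairs that share no common neighbors.
          cn = len(list(nx.common_neighbors(G, u, v)))
          try:
              d = nx.shortest_path_length(G, u, v)
          except nx.NetworkXNoPath:
              return 0.0
          return cn + alpha / d

      G = nx.karate_club_graph()
      candidates = [(u, v) for u in G for v in G if u < v and not G.has_edge(u, v)]
      ranked = sorted(candidates, key=lambda e: cn_distance_score(G, *e), reverse=True)
      print(ranked[:5])   # most likely missing links under this toy score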

  8. User's manual for the ALS base heating prediction code, volume 2

    NASA Technical Reports Server (NTRS)

    Reardon, John E.; Fulton, Michael S.

    1992-01-01

    The Advanced Launch System (ALS) Base Heating Prediction Code is based on a generalization of first principles in the prediction of plume induced base convective heating and plume radiation. It should be considered to be an approximate method for evaluating trends as a function of configuration variables because the processes being modeled are too complex to allow an accurate generalization. The convective methodology is based upon generalizing trends from four nozzle configurations, so an extension to use the code with strap-on boosters, multiple nozzle sizes, and variations in the propellants and chamber pressure histories cannot be precisely treated. The plume radiation is more amenable to precise computer prediction, but simplified assumptions are required to model the various aspects of the candidate configurations. Perhaps the most difficult area to characterize is the variation of radiation with altitude. The theory in the radiation predictions is described in more detail. This report is intended to familiarize a user with the interface operation and options, to summarize the limitations and restrictions of the code, and to provide information to assist in installing the code.

  9. Size-independent neural networks based first-principles method for accurate prediction of heat of formation of fuels

    NASA Astrophysics Data System (ADS)

    Yang, GuanYa; Wu, Jiang; Chen, ShuGuang; Zhou, WeiJun; Sun, Jian; Chen, GuanHua

    2018-06-01

    A neural network-based first-principles method for predicting heat of formation (HOF) was previously demonstrated to achieve chemical accuracy across a broad spectrum of target molecules [L. H. Hu et al., J. Chem. Phys. 119, 11501 (2003)]. However, its accuracy deteriorates with increasing molecular size. A closer inspection reveals a systematic correlation between the prediction error and the molecular size, which appears correctable by further statistical analysis, calling for a more sophisticated machine learning algorithm. Despite the apparent difference between simple and complex molecules, all the essential physical information is already present in a carefully selected set of small-molecule representatives. A model that captures the fundamental physics should be able to predict large and complex molecules from information extracted only from a small-molecule database. To this end, a size-independent, multi-step multi-variable linear regression-neural network-B3LYP method is developed in this work, which successfully improves the overall prediction accuracy by training with smaller molecules only. In particular, the calculation errors for larger molecules are drastically reduced to the same magnitude as those of the smaller molecules. Specifically, the method is based on a 164-molecule database consisting of molecules made of hydrogen and carbon elements. Four molecular descriptors were selected to encode each molecule's characteristics, including the raw HOF calculated from B3LYP and the molecular size. Upon the size-independent machine learning correction, the mean absolute deviation (MAD) of the B3LYP/6-311+G(3df,2p)-calculated HOF is reduced from 16.58 to 1.43 kcal/mol and from 17.33 to 1.69 kcal/mol for the training and testing sets (small molecules), respectively. Furthermore, the MAD of the testing set (large molecules) is reduced from 28.75 to 1.67 kcal/mol.
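
    The correction stage can be pictured as a small supervised model trained on descriptors of small molecules and then applied to larger ones. The sketch below uses a plain linear regression and invented numbers purely to show the data flow; the published method adds a neural-network stage and trains on a curated 164-molecule database.

      import numpy as np
      from sklearn.linear_model import LinearRegression

      # Columns: raw B3LYP HOF (kcal/mol), molecular size, two further descriptors.
      X_train = np.array([[-20.3,  4, 0.0, 1.0],
                          [-35.7,  6, 1.0, 0.0],
                          [ 12.4,  8, 0.0, 2.0],
                          [ 40.1, 10, 2.0, 1.0]])
      y_train = np.array([-17.9, -32.1, 15.0, 45.2])   # reference HOF values (synthetic)

      model = LinearRegression().fit(X_train, y_train)  # trained on small molecules only

      X_large = np.array([[150.2, 30, 5.0, 4.0]])       # a larger molecule, same descriptors
      print(model.predict(X_large))                     # corrected HOF estimate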

  10. Streamflow Prediction based on Chaos Theory

    NASA Astrophysics Data System (ADS)

    Li, X.; Wang, X.; Babovic, V. M.

    2015-12-01

    Chaos theory is a popular approach to hydrologic time series prediction. The local model (LM) based on this theory uses time-delay embedding to reconstruct the phase-space diagram. Its efficacy depends on the embedding parameters, i.e. the embedding dimension, time lag, and number of nearest neighbors, so optimal estimation of these parameters is critical to the application of the local model. However, these embedding parameters are conventionally estimated using Average Mutual Information (AMI) and False Nearest Neighbors (FNN) separately. This may lead to locally optimal parameter choices and thus limit prediction accuracy. To address this limitation, this paper applies a local model combined with simulated annealing (SA) to find the globally optimal embedding parameters. It is also compared with another global optimization approach, the Genetic Algorithm (GA). The proposed hybrid methods are applied to daily and monthly streamflow time series for examination. The results show that global optimization enables the local model to provide more accurate predictions than local optimization. The LM combined with SA also shows advantages in terms of computational efficiency. The proposed scheme can also be applied to other fields such as prediction of hydro-climatic time series, error correction, etc.
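
    A minimal sketch of the local model itself (the part that the paper wraps in a simulated-annealing search over m, tau and k): reconstruct delay vectors, find the k nearest historical states, and average their one-step successors. The embedding parameters are fixed by hand here, and the synthetic series stands in for a streamflow record.

      import numpy as np

      def local_model_forecast(series, m=3, tau=2, k=5):
          # Time-delay embedding: state at time t is [x[t], x[t-tau], ..., x[t-(m-1)tau]].
          x = np.asarray(series, dtype=float)
          idx = np.arange((m - 1) * tau, len(x) - 1)            # leave one step for the target
          emb = np.stack([x[idx - j * tau] for j in range(m)], axis=1)
          targets = x[idx + 1]
          query = np.array([x[len(x) - 1 - j * tau] for j in range(m)])
          # k nearest neighbors of the current state; forecast = mean of their successors.
          nearest = np.argsort(np.linalg.norm(emb - query, axis=1))[:k]
          return targets[nearest].mean()

      flow = np.sin(np.linspace(0, 20, 400)) + 0.05 * np.random.randn(400)
      print(local_model_forecast(flow))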

  11. Sex-specific lean body mass predictive equations are accurate in the obese paediatric population

    PubMed Central

    Jackson, Lanier B.; Henshaw, Melissa H.; Carter, Janet; Chowdhury, Shahryar M.

    2015-01-01

    Background The clinical assessment of lean body mass (LBM) is challenging in obese children. A sex-specific predictive equation for LBM derived from anthropometric data was recently validated in children. Aim The purpose of this study was to independently validate these predictive equations in the obese paediatric population. Subjects and methods Obese subjects aged 4–21 were analysed retrospectively. Predicted LBM (LBMp) was calculated using equations previously developed in children. Measured LBM (LBMm) was derived from dual-energy x-ray absorptiometry. Agreement was expressed as [(LBMm-LBMp)/LBMm] with 95% limits of agreement. Results Of 310 enrolled patients, 195 (63%) were females. The mean age was 11.8 ± 3.4 years and mean BMI Z-score was 2.3 ± 0.4. The average difference between LBMm and LBMp was −0.6% (−17.0%, 15.8%). Pearson’s correlation revealed a strong linear relationship between LBMm and LBMp (r=0.97, p<0.01). Conclusion This study validates the use of these clinically-derived sex-specific LBM predictive equations in the obese paediatric population. Future studies should use these equations to improve the ability to accurately classify LBM in obese children. PMID:26287383
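
    The agreement statistic quoted above is straightforward to reproduce; the sketch below computes the percentage difference (LBMm - LBMp)/LBMm and its 95% limits of agreement (mean +/- 1.96 SD) for made-up measurements, just to make the reporting convention concrete.

      import numpy as np

      def agreement_stats(lbm_measured, lbm_predicted):
          m = np.asarray(lbm_measured, dtype=float)
          p = np.asarray(lbm_predicted, dtype=float)
          pct = 100.0 * (m - p) / m                     # percentage difference per subject
          mean, sd = pct.mean(), pct.std(ddof=1)
          return mean, (mean - 1.96 * sd, mean + 1.96 * sd)

      bias, loa = agreement_stats([32.1, 41.5, 28.7, 55.0], [33.0, 40.2, 29.5, 54.1])
      print(f"bias {bias:.1f}%, 95% limits of agreement {loa[0]:.1f}% to {loa[1]:.1f}%")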

  12. Local-search based prediction of medical image registration error

    NASA Astrophysics Data System (ADS)

    Saygili, Görkem

    2018-03-01

    Medical image registration is a crucial task in many different medical imaging applications. Hence, a considerable amount of work has recently been published that aims to predict the error in a registration without any human effort. If provided, these error predictions can be used as feedback to the registration algorithm to further improve its performance. Recent methods generally start by extracting image-based and deformation-based features, then apply feature pooling and finally train a Random Forest (RF) regressor to predict the real registration error. Image-based features can be calculated after applying a single registration but provide limited accuracy, whereas deformation-based features, such as the variation of the deformation vector field, may require up to 20 registrations, which is a considerably time-consuming task. This paper proposes to use features extracted from a local search algorithm as image-based features to estimate the error of a registration. The proposed method comprises a local search algorithm to find corresponding voxels between registered image pairs and, based on the amount of shift and on stereo confidence measures, it densely predicts the registration error in millimetres using an RF regressor. Compared to other algorithms in the literature, the proposed algorithm does not require multiple registrations, can be efficiently implemented on a Graphical Processing Unit (GPU) and can still provide highly accurate error predictions in the presence of large registration errors. Experimental results with real registrations on a public dataset indicate a substantially high accuracy achieved by using features from the local search algorithm.
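
    The final regression step is a standard Random Forest fit on per-voxel features. The sketch below uses synthetic features standing in for the local-search shift magnitudes and confidence measures, so only the overall pipeline shape is meaningful, not the feature definitions of the paper.

      import numpy as np
      from sklearn.ensemble import RandomForestRegressor

      rng = np.random.default_rng(0)
      # One row per voxel: e.g. local-search shift magnitude plus confidence measures.
      X = rng.random((500, 4))
      # Synthetic ground truth: registration error (mm) grows with the local shift.
      y = 3.0 * X[:, 0] + 0.3 * rng.standard_normal(500)

      rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
      print(rf.predict(X[:3]))   # dense per-voxel error estimates in millimetres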

  13. CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts.

    PubMed

    Testa, Alison C; Hane, James K; Ellwood, Simon R; Oliver, Richard P

    2015-03-11

    The impact of gene annotation quality on functional and comparative genomics makes gene prediction an important process, particularly in non-model species, including many fungi. Sets of homologous protein sequences are rarely complete with respect to the fungal species of interest and are often small or unreliable, especially when closely related species have not been sequenced or annotated in detail. In these cases, protein homology-based evidence fails to correctly annotate many genes, or significantly improve ab initio predictions. Generalised hidden Markov models (GHMM) have proven to be invaluable tools in gene annotation and, recently, RNA-seq has emerged as a cost-effective means to significantly improve the quality of automated gene annotation. As these methods do not require sets of homologous proteins, improving gene prediction from these resources is of benefit to fungal researchers. While many pipelines now incorporate RNA-seq data in training GHMMs, there has been relatively little investigation into additionally combining RNA-seq data at the point of prediction, and room for improvement in this area motivates this study. CodingQuarry is a highly accurate, self-training GHMM fungal gene predictor designed to work with assembled, aligned RNA-seq transcripts. RNA-seq data informs annotations both during gene-model training and in prediction. Our approach capitalises on the high quality of fungal transcript assemblies by incorporating predictions made directly from transcript sequences. Correct predictions are made despite transcript assembly problems, including those caused by overlap between the transcripts of adjacent gene loci. Stringent benchmarking against high-confidence annotation subsets showed CodingQuarry predicted 91.3% of Schizosaccharomyces pombe genes and 90.4% of Saccharomyces cerevisiae genes perfectly. These results are 4-5% better than those of AUGUSTUS, the next best performing RNA-seq driven gene predictor tested. Comparisons against

  14. Mass spectrometry-based protein identification with accurate statistical significance assignment.

    PubMed

    Alves, Gelio; Yu, Yi-Kuo

    2015-03-01

    Assigning statistical significance accurately has become increasingly important as metadata of many types, often assembled in hierarchies, are constructed and combined for further biological analyses. Statistical inaccuracy of metadata at any level may propagate to downstream analyses, undermining the validity of scientific conclusions thus drawn. From the perspective of mass spectrometry-based proteomics, even though accurate statistics for peptide identification can now be achieved, accurate protein level statistics remain challenging. We have constructed a protein ID method that combines peptide evidences of a candidate protein based on a rigorous formula derived earlier; in this formula the database P-value of every peptide is weighted, prior to the final combination, according to the number of proteins it maps to. We have also shown that this protein ID method provides accurate protein level E-value, eliminating the need of using empirical post-processing methods for type-I error control. Using a known protein mixture, we find that this protein ID method, when combined with the Sorić formula, yields accurate values for the proportion of false discoveries. In terms of retrieval efficacy, the results from our method are comparable with other methods tested. The source code, implemented in C++ on a linux system, is available for download at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbp/qmbp_ms/RAId/RAId_Linux_64Bit. Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.

  15. High accuracy operon prediction method based on STRING database scores.

    PubMed

    Taboada, Blanca; Verde, Cristina; Merino, Enrique

    2010-07-01

    We present a simple and highly accurate computational method for operon prediction, based on intergenic distances and functional relationships between the protein products of contiguous genes, as defined by the STRING database (Jensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T., Julien,P., Roth,A., Simonovic,M. et al. (2009) STRING 8-a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res., 37, D412-D416). These two parameters were used to train a neural network on a subset of experimentally characterized Escherichia coli and Bacillus subtilis operons. Our predictive model was successfully tested on the set of experimentally defined operons in E. coli and B. subtilis, with accuracies of 94.6 and 93.3%, respectively. As far as we know, these are the highest accuracies ever obtained for predicting bacterial operons. Furthermore, in order to evaluate the predictive accuracy of our model when using an organism's data set for the training procedure, and a different organism's data set for testing, we repeated the E. coli operon prediction analysis using a neural network trained with B. subtilis data, and a B. subtilis analysis using a neural network trained with E. coli data. Even for these cases, the accuracies reached with our method were outstandingly high, 91.5 and 93%, respectively. These results show the potential use of our method for accurately predicting the operons of any other organism. Our operon predictions for fully-sequenced genomes are available at http://operons.ibt.unam.mx/OperonPredictor/.
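
    In outline, the predictor is a small neural network fed with just two features per adjacent gene pair: the intergenic distance and the STRING functional-association score. The sketch below reproduces that shape with scikit-learn on invented toy examples (label 1 = same operon); it is not the trained model of the paper.

      import numpy as np
      from sklearn.neural_network import MLPClassifier

      # Features per adjacent gene pair: [intergenic distance (bp), STRING score].
      X = np.array([[-10, 0.95], [  5, 0.90], [120, 0.20],
                    [300, 0.05], [ 30, 0.70], [250, 0.10]])
      y = np.array([1, 1, 0, 0, 1, 0])               # 1 = pair belongs to the same operon

      clf = MLPClassifier(hidden_layer_sizes=(5,), max_iter=2000, random_state=0).fit(X, y)
      # Should separate the close, high-scoring pair from the distant, low-scoring one.
      print(clf.predict([[15, 0.85], [400, 0.02]]))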

  16. Association Rule-based Predictive Model for Machine Failure in Industrial Internet of Things

    NASA Astrophysics Data System (ADS)

    Kwon, Jung-Hyok; Lee, Sol-Bee; Park, Jaehoon; Kim, Eui-Jik

    2017-09-01

    This paper proposes an association rule-based predictive model for machine failure in the industrial Internet of things (IIoT), which can accurately predict machine failure in a real manufacturing environment by investigating the relationship between the cause and type of machine failure. To develop the predictive model, we consider three major steps: 1) binarization, 2) rule creation, and 3) visualization. The binarization step translates item values in a dataset into one or zero, then the rule creation step creates association rules as IF-THEN structures using the Lattice model and the Apriori algorithm. Finally, the created rules are visualized in various ways for users' understanding. An experimental implementation was conducted using R Studio version 3.3.2. The results show that the proposed predictive model realistically predicts machine failure based on association rules.
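
    A minimal, library-free sketch of steps 1 and 2: the event log is already binarised (each record is the set of flags that were 1), and rules of the form IF {causes} THEN failure are kept when they pass support and confidence thresholds. A real Apriori run would prune candidate itemsets level by level; the event names, data and thresholds here are invented.

      from itertools import combinations

      # Binarised machine-event records: each record is the set of flags equal to 1.
      records = [
          {"high_vibration", "overheating", "failure"},
          {"high_vibration", "overheating", "failure"},
          {"high_vibration"},
          {"overheating", "failure"},
          set(),
      ]

      def support(itemset):
          return sum(itemset <= r for r in records) / len(records)

      items = sorted(set().union(*records) - {"failure"})
      for k in (1, 2):
          for antecedent in combinations(items, k):
              a = set(antecedent)
              s_a, s_rule = support(a), support(a | {"failure"})
              if s_a and s_rule >= 0.4 and s_rule / s_a >= 0.8:
                  print(f"IF {sorted(a)} THEN failure "
                        f"(support={s_rule:.2f}, confidence={s_rule / s_a:.2f})")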

  17. Accurate and Reliable Prediction of the Binding Affinities of Macrocycles to Their Protein Targets.

    PubMed

    Yu, Haoyu S; Deng, Yuqing; Wu, Yujie; Sindhikara, Dan; Rask, Amy R; Kimura, Takayuki; Abel, Robert; Wang, Lingle

    2017-12-12

    Macrocycles have been emerging as a very important drug class in the past few decades largely due to their expanded chemical diversity benefiting from advances in synthetic methods. Macrocyclization has been recognized as an effective way to restrict the conformational space of acyclic small molecule inhibitors with the hope of improving potency, selectivity, and metabolic stability. Because of their relatively larger size as compared to typical small molecule drugs and the complexity of the structures, efficient sampling of the accessible macrocycle conformational space and accurate prediction of their binding affinities to their target protein receptors poses a great challenge of central importance in computational macrocycle drug design. In this article, we present a novel method for relative binding free energy calculations between macrocycles with different ring sizes and between the macrocycles and their corresponding acyclic counterparts. We have applied the method to seven pharmaceutically interesting data sets taken from recent drug discovery projects including 33 macrocyclic ligands covering a diverse chemical space. The predicted binding free energies are in good agreement with experimental data with an overall root-mean-square error (RMSE) of 0.94 kcal/mol. This is to our knowledge the first time where the free energy of the macrocyclization of linear molecules has been directly calculated with rigorous physics-based free energy calculation methods, and we anticipate the outstanding accuracy demonstrated here across a broad range of target classes may have significant implications for macrocycle drug discovery.

  18. Genetic algorithm based adaptive neural network ensemble and its application in predicting carbon flux

    USGS Publications Warehouse

    Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.

    2007-01-01

    To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.

  19. Learning a weighted sequence model of the nucleosome core and linker yields more accurate predictions in Saccharomyces cerevisiae and Homo sapiens.

    PubMed

    Reynolds, Sheila M; Bilmes, Jeff A; Noble, William Stafford

    2010-07-08

    DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono-, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence (301 base pairs, centered at the position to be scored) with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the
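
    A much-simplified stand-in for the feature-extraction step, shown only to make the 301-bp window and the mono-/di-/tri-nucleotide inputs concrete: it returns plain k-mer frequencies for the window centred on a candidate dyad and ignores the positional (oscillatory) structure and the learned feature weighting that the actual classifier uses.

      import random
      from collections import Counter
      from itertools import product

      def window_features(sequence, center, half_width=150):
          # Mono-, di- and tri-nucleotide frequencies of the 301-bp window
          # centred on the candidate dyad position.
          win = sequence[center - half_width: center + half_width + 1]
          feats = {}
          for k in (1, 2, 3):
              counts = Counter(win[i:i + k] for i in range(len(win) - k + 1))
              total = max(len(win) - k + 1, 1)
              for kmer in map("".join, product("ACGT", repeat=k)):
                  feats[kmer] = counts.get(kmer, 0) / total
          return feats

      random.seed(0)
      seq = "".join(random.choice("ACGT") for _ in range(1000))
      print(window_features(seq, 500)["AT"])   # frequency of the dinucleotide AT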

  20. Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens

    PubMed Central

    Reynolds, Sheila M.; Bilmes, Jeff A.; Noble, William Stafford

    2010-01-01

    DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence—301 base pairs, centered at the position to be scored—with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the

  1. Accurate predictions of iron redox state in silicate glasses: A multivariate approach using X-ray absorption spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dyar, M. Darby; McCanta, Molly; Breves, Elly

    2016-03-01

    Pre-edge features in the K absorption edge of X-ray absorption spectra are commonly used to predict Fe3+ valence state in silicate glasses. However, this study shows that using the entire spectral region from the pre-edge into the extended X-ray absorption fine-structure region provides more accurate results when combined with multivariate analysis techniques. The least absolute shrinkage and selection operator (lasso) regression technique yields %Fe3+ values that are accurate to ±3.6% absolute when the full spectral region is employed. This method can be used across a broad range of glass compositions, is easily automated, and is demonstrated to yield accurate results from different synchrotrons. It will enable future studies involving X-ray mapping of redox gradients on standard thin sections at 1 × 1 μm pixel sizes.
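
    The multivariate step is essentially lasso regression with every spectral channel as a candidate predictor. The sketch below runs cross-validated lasso on synthetic spectra in which the %Fe3+ signal is hidden in a couple of channels; with real standards the rows would be measured spectra spanning the pre-edge through the EXAFS region.

      import numpy as np
      from sklearn.linear_model import LassoCV

      rng = np.random.default_rng(1)
      spectra = rng.random((60, 500))            # rows = glass standards, cols = energy channels
      # Synthetic target: %Fe3+ driven by two channels plus noise.
      fe3 = 60 * spectra[:, 40] + 30 * spectra[:, 300] + rng.normal(0, 2, 60)

      lasso = LassoCV(cv=5).fit(spectra, fe3)    # lasso selects the informative channels
      unknown = rng.random((1, 500))
      print(lasso.predict(unknown))              # predicted %Fe3+ for an unknown glass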

  2. Accurate predictions of iron redox state in silicate glasses: A multivariate approach using X-ray absorption spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dyar, M. Darby; McCanta, Molly; Breves, Elly

    2016-03-01

    Pre-edge features in the K absorption edge of X-ray absorption spectra are commonly used to predict Fe3+ valence state in silicate glasses. However, this study shows that using the entire spectral region from the pre-edge into the extended X-ray absorption fine-structure region provides more accurate results when combined with multivariate analysis techniques. The least absolute shrinkage and selection operator (lasso) regression technique yields %Fe3+ values that are accurate to ±3.6% absolute when the full spectral region is employed. This method can be used across a broad range of glass compositions, is easily automated, and is demonstrated to yield accurate results from different synchrotrons. It will enable future studies involving X-ray mapping of redox gradients on standard thin sections at 1 × 1 μm pixel sizes.

  3. Efficient and Accurate Algorithm for Cleaved Fragments Prediction (CFPA) in Protein Sequences Dataset Based on Consensus and Its Variants: A Novel Degradomics Prediction Application.

    PubMed

    El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges; Hajj, Hazem; Kobeissy, Firas H

    2017-01-01

    Degradomics is a novel discipline that involves determination of the protease/substrate fragmentation profile, called the substrate degradome, and has recently been applied in different disciplines. A major application of degradomics is its utility in the field of biomarkers, where the breakdown products (BDPs) of different proteases have been investigated. Among the major proteases assessed, calpain and caspase proteases have been associated with the execution phases of pro-apoptotic and pro-necrotic cell death, generating caspase/calpain-specific cleaved fragments. The distinction between calpain and caspase protein fragments has been applied to distinguish injury mechanisms. Advanced proteomics technology has been used to identify these BDPs experimentally. However, it has been a challenge to identify these BDPs with high precision and efficiency, especially when targeting a number of proteins at one time. In this chapter, we present a novel bioinformatic detection method that identifies BDPs accurately and efficiently, with validation against experimental data. This method aims at predicting the consensus sequence occurrences and their variants in a large set of experimentally detected protein sequences based on state-of-the-art sequence matching and alignment algorithms. After detection, the method generates all the potential fragments cleaved by a specific protease. This space- and time-efficient algorithm is flexible enough to handle the different orientations that the consensus sequence and the protein sequence can take before cleaving. It is O(mn) in space complexity and O(Nmn) in time complexity, where N is the number of protein sequences, m the length of the consensus sequence, and n the length of each protein sequence. Ultimately, this knowledge will feed into the development of a novel tool for researchers to detect diverse types of selected BDPs as putative disease markers, contributing to the diagnosis and treatment of related disorders.
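
    To make the output of such a method concrete, the sketch below cuts a toy protein sequence after every occurrence of a caspase-like consensus motif and returns the resulting breakdown products. The motif, the toy sequence, and the use of regular expressions are illustrative assumptions; the published algorithm handles consensus variants and orientations with its own O(mn) matching scheme.

      import re

      def cleaved_fragments(protein_seq, consensus_regex):
          # Cut after every match of the consensus motif and return the fragments.
          cut_sites = [m.end() for m in re.finditer(consensus_regex, protein_seq)]
          bounds = [0] + cut_sites + [len(protein_seq)]
          return [protein_seq[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]

      toy = "MKTDEVDAAGGSIETDGLLKDEVDRRK"
      print(cleaved_fragments(toy, r"DEVD"))
      # -> ['MKTDEVD', 'AAGGSIETDGLLKDEVD', 'RRK']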

  4. The Satellite Clock Bias Prediction Method Based on Takagi-Sugeno Fuzzy Neural Network

    NASA Astrophysics Data System (ADS)

    Cai, C. L.; Yu, H. G.; Wei, Z. C.; Pan, J. D.

    2017-05-01

    The continuous improvement of the prediction accuracy of Satellite Clock Bias (SCB) is a key problem in precision navigation. In order to improve the precision of SCB prediction and better reflect the change characteristics of SCB, this paper proposes an SCB prediction method based on the Takagi-Sugeno fuzzy neural network. Firstly, the SCB values are pre-treated based on their characteristics. Then, an accurate Takagi-Sugeno fuzzy neural network model is established based on the preprocessed data to predict SCB. This paper uses the precise SCB data with different sampling intervals provided by IGS (International Global Navigation Satellite System Service) to carry out the short-term prediction experiment, and the results are compared with the ARIMA (Auto-Regressive Integrated Moving Average) model, the GM(1,1) model, and the quadratic polynomial model. The results show that the Takagi-Sugeno fuzzy neural network model is feasible and effective for short-term SCB prediction and performs well for different types of clocks. The prediction results of the proposed method are clearly better than those of the conventional methods.
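
    The core of a Takagi-Sugeno model is a set of IF-THEN rules whose consequents are local linear models, blended by fuzzy membership weights. The sketch below shows that mechanism for a one-input case (e.g. predicting a clock-bias value from epoch time); the membership centres, widths, and local coefficients are invented and would normally be learned from the preprocessed SCB series.

      import numpy as np

      def ts_fuzzy_predict(t, centers, widths, coeffs):
          # Rule i: IF t is about centers[i] THEN y_i = a_i * t + b_i.
          # Output is the membership-weighted average of the local linear models.
          w = np.exp(-((t - centers) ** 2) / (2.0 * widths ** 2))
          y_local = coeffs[:, 0] * t + coeffs[:, 1]
          return float(np.sum(w * y_local) / np.sum(w))

      centers = np.array([0.0, 12.0, 24.0])                       # hours
      widths = np.array([6.0, 6.0, 6.0])
      coeffs = np.array([[0.10, 1.0], [0.05, 2.0], [0.02, 3.5]])  # per-rule (a_i, b_i)

      print(ts_fuzzy_predict(18.0, centers, widths, coeffs))      # predicted bias (arbitrary units)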

  5. Accurate prediction of polarised high order electrostatic interactions for hydrogen bonded complexes using the machine learning method kriging.

    PubMed

    Hughes, Timothy J; Kandathil, Shaun M; Popelier, Paul L A

    2015-02-05

    As intermolecular interactions such as the hydrogen bond are electrostatic in origin, rigorous treatment of this term within force field methodologies should be mandatory. We present a method capable of accurately reproducing such interactions for seven van der Waals complexes. It uses atomic multipole moments up to the hexadecupole moment, mapped to the positions of the nuclear coordinates by the machine learning method kriging. Models were built at three levels of theory: HF/6-31G(**), B3LYP/aug-cc-pVDZ and M06-2X/aug-cc-pVDZ. The quality of the kriging models was measured by their ability to predict the electrostatic interaction energy between atoms in external test examples for which the true energies are known. At all levels of theory, >90% of test cases for small van der Waals complexes were predicted within 1 kJ mol(-1), decreasing to 60-70% of test cases for the larger base pair complexes. Models built on moments obtained at the B3LYP and M06-2X levels generally outperformed those at the HF level. For all systems the individual interactions were predicted with a mean unsigned error of less than 1 kJ mol(-1). Copyright © 2013 Elsevier B.V. All rights reserved.
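
    Kriging is closely related to Gaussian-process regression, so the mapping from nuclear-coordinate features to an atomic multipole component can be sketched with scikit-learn as below. The three-feature input, the RBF kernel, and the random training data are assumptions made for illustration; the actual models are trained per atom and per multipole component on ab initio data.

      import numpy as np
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF

      rng = np.random.default_rng(2)
      X = rng.random((80, 3))                      # stand-in nuclear-coordinate features
      y = np.sin(X[:, 0] * 3) + 0.5 * X[:, 1]      # stand-in atomic multipole component

      gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-4).fit(X, y)
      moment, sigma = gp.predict(rng.random((1, 3)), return_std=True)
      print(moment, sigma)                         # predicted moment and its uncertainty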

  6. End-of-Discharge and End-of-Life Prediction in Lithium-Ion Batteries with Electrochemistry-Based Aging Models

    NASA Technical Reports Server (NTRS)

    Daigle, Matthew; Kulkarni, Chetan S.

    2016-01-01

    As batteries become increasingly prevalent in complex systems such as aircraft and electric cars, monitoring and predicting battery state of charge and state of health becomes critical. In order to accurately predict the remaining battery power to support system operations for informed operational decision-making, age-dependent changes in dynamics must be accounted for. Using an electrochemistry-based model, we investigate how key parameters of the battery change as aging occurs, and develop models to describe aging through these key parameters. Using these models, we demonstrate how we can (i) accurately predict end-of-discharge for aged batteries, and (ii) predict the end-of-life of a battery as a function of anticipated usage. The approach is validated through an experimental set of randomized discharge profiles.

  7. City traffic flow breakdown prediction based on fuzzy rough set

    NASA Astrophysics Data System (ADS)

    Yang, Xu; Da-wei, Hu; Bing, Su; Duo-jia, Zhang

    2017-05-01

    In city traffic management, traffic breakdown is a very important issue; it is defined as a speed drop of a certain amount within a dense traffic situation. In order to predict city traffic flow breakdown accurately, in this paper we propose a novel city traffic flow breakdown prediction algorithm based on the fuzzy rough set. Firstly, we illustrate the city traffic flow breakdown problem, for which three definitions are given: 1) the pre-breakdown flow rate, 2) the rate, density, and speed of the traffic flow breakdown, and 3) the duration of the traffic flow breakdown. Moreover, we define a hazard function to represent the probability of the breakdown ending at a given time point. Secondly, as there are many redundant and irrelevant attributes in city flow breakdown prediction, we propose an attribute reduction algorithm using the fuzzy rough set. Thirdly, we discuss how to predict the city traffic flow breakdown based on attribute reduction and an SVM classifier. Finally, experiments are conducted by collecting data from the I-405 freeway, located in Irvine, California. Experimental results demonstrate that the proposed algorithm achieves a lower average error rate of city traffic flow breakdown prediction.

  8. Differential equation based method for accurate approximations in optimization

    NASA Technical Reports Server (NTRS)

    Pritchard, Jocelyn I.; Adelman, Howard M.

    1990-01-01

    This paper describes a method to efficiently and accurately approximate the effect of design changes on structural response. The key to this new method is to interpret sensitivity equations as differential equations that may be solved explicitly for closed form approximations, hence, the method is denoted the Differential Equation Based (DEB) method. Approximations were developed for vibration frequencies, mode shapes and static displacements. The DEB approximation method was applied to a cantilever beam and results compared with the commonly-used linear Taylor series approximations and exact solutions. The test calculations involved perturbing the height, width, cross-sectional area, tip mass, and bending inertia of the beam. The DEB method proved to be very accurate, and in most cases, was more accurate than the linear Taylor series approximation. The method is applicable to simultaneous perturbation of several design variables. Also, the approximations may be used to calculate other system response quantities. For example, the approximations for displacement are used to approximate bending stresses.
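
    A worked illustration of the idea, in the spirit of the cantilever-beam test case but with arbitrary numbers of my own: the tip deflection under an end load scales as the inverse cube of the beam height, so its sensitivity is d(delta)/dh = -3*delta/h. Integrating that sensitivity as a differential equation gives a closed-form approximation that happens to be exact for this response quantity, whereas the linear Taylor step degrades quickly for large height changes. This is a sketch of the DEB concept, not the paper's own calculation.

      h0, d0 = 0.10, 2.0      # baseline beam height (m) and tip deflection (mm), arbitrary

      def deb(h):
          # Integrate d(delta)/delta = -3 dh/h  ->  delta(h) = delta0 * (h0/h)**3
          return d0 * (h0 / h) ** 3

      def taylor(h):
          # Linear Taylor series built from the same sensitivity at h0
          return d0 * (1.0 - 3.0 * (h - h0) / h0)

      for h in (0.10, 0.11, 0.13, 0.15):
          exact = d0 * (h0 / h) ** 3               # deflection scales as h**-3
          print(f"h={h:.2f}  exact={exact:6.3f}  DEB={deb(h):6.3f}  Taylor={taylor(h):6.3f}")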

  9. Differential equation based method for accurate approximations in optimization

    NASA Technical Reports Server (NTRS)

    Pritchard, Jocelyn I.; Adelman, Howard M.

    1990-01-01

    A method to efficiently and accurately approximate the effect of design changes on structural response is described. The key to this method is to interpret sensitivity equations as differential equations that may be solved explicitly for closed form approximations, hence, the method is denoted the Differential Equation Based (DEB) method. Approximations were developed for vibration frequencies, mode shapes and static displacements. The DEB approximation method was applied to a cantilever beam and results compared with the commonly-used linear Taylor series approximations and exact solutions. The test calculations involved perturbing the height, width, cross-sectional area, tip mass, and bending inertia of the beam. The DEB method proved to be very accurate, and in most cases, was more accurate than the linear Taylor series approximation. The method is applicable to simultaneous perturbation of several design variables. Also, the approximations may be used to calculate other system response quantities. For example, the approximations for displacements are used to approximate bending stresses.

  10. Resting Energy Expenditure Prediction in Recreational Athletes of 18–35 Years: Confirmation of Cunningham Equation and an Improved Weight-Based Alternative

    PubMed Central

    ten Haaf, Twan; Weijs, Peter J. M.

    2014-01-01

    Introduction Resting energy expenditure (REE) is expected to be higher in athletes because of their relatively high fat free mass (FFM). Therefore, an REE predictive equation specific to recreational athletes may be required. The aim of this study was to validate existing REE predictive equations and to develop a new recreational-athlete-specific equation. Methods 90 (53M, 37F) adult athletes, exercising on average 9.1±5.0 hours a week and 5.0±1.8 times a week, were included. REE was measured using indirect calorimetry (Vmax Encore n29); FFM and FM were measured using air displacement plethysmography. Multiple linear regression analysis was used to develop a new FFM-based and a new weight-based REE predictive equation. The percentage of accurate predictions (within 10% of measured REE), percentage bias, root mean square error and limits of agreement were calculated. Results The Cunningham equation and the new weight-based and FFM-based equations performed equally well. De Lorenzo's equation predicted REE less accurately, but better than the other generally used REE predictive equations. Harris-Benedict, WHO, Schofield, Mifflin and Owen all showed less than 50% accuracy. Conclusion For a population of (Dutch) recreational athletes, REE can be accurately predicted with the existing Cunningham equation. Since body composition measurement is not always possible, and other generally used equations fail, the new weight-based equation is advised for use in sports nutrition. PMID:25275434
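
    For reference, the commonly cited form of the Cunningham equation that the study confirms is REE = 500 + 22 x FFM (kcal/day, with FFM in kg); the one-liner below simply evaluates it. The study's new weight-based alternative is not reproduced here because its coefficients are not given in the abstract.

      def cunningham_ree(ffm_kg):
          # Cunningham (1980): resting energy expenditure in kcal/day from fat-free mass in kg.
          return 500.0 + 22.0 * ffm_kg

      print(cunningham_ree(60.0))   # an athlete with 60 kg fat-free mass -> 1820 kcal/day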

  11. A range-based predictive localization algorithm for WSID networks

    NASA Astrophysics Data System (ADS)

    Liu, Yuan; Chen, Junjie; Li, Gang

    2017-11-01

    Most studies on localization algorithms are conducted on sensor networks with densely distributed nodes. However, non-localizable problems are prone to occur in networks with sparsely distributed sensor nodes. To solve this problem, a range-based predictive localization algorithm (RPLA) is proposed in this paper for wireless sensor networks integrating RFID, i.e. WSID networks. A Gaussian mixture model is established to predict the trajectory of a mobile target. Then, the received signal strength indication is used to reduce the residence area of the target location based on the approximate point-in-triangulation test algorithm. In addition, collaborative localization schemes are introduced to locate the target in the non-localizable situations. Simulation results verify that the RPLA achieves accurate localization for networks with sparsely distributed sensor nodes. The localization accuracy of the RPLA is 48.7% higher than that of the APIT algorithm, 16.8% higher than that of the single Gaussian model-based algorithm and 10.5% higher than that of the Kalman filtering-based algorithm.

  12. Hindered rotor models with variable kinetic functions for accurate thermodynamic and kinetic predictions

    NASA Astrophysics Data System (ADS)

    Reinisch, Guillaume; Leyssale, Jean-Marc; Vignoles, Gérard L.

    2010-10-01

    We present an extension of some popular hindered rotor (HR) models, namely, the one-dimensional HR (1DHR) and the degenerated two-dimensional HR (d2DHR) models, allowing for a simple and accurate treatment of internal rotations. This extension, based on the use of a variable kinetic function in the Hamiltonian instead of a constant reduced moment of inertia, is extremely suitable in the case of rocking/wagging motions involved in dissociation or atom transfer reactions. The variable kinetic function is first introduced in the framework of a classical 1DHR model. Then, an effective temperature and potential dependent constant is proposed in the cases of quantum 1DHR and classical d2DHR models. These methods are finally applied to the atom transfer reaction SiCl3+BCl3→SiCl4+BCl2. We show, for this particular case, that a proper accounting of internal rotations greatly improves the accuracy of thermodynamic and kinetic predictions. Moreover, our results confirm (i) that using a suitably defined kinetic function appears to be very adapted to such problems; (ii) that the separability assumption of independent rotations seems justified; and (iii) that a quantum mechanical treatment is not a substantial improvement with respect to a classical one.

  13. Feedback about More Accurate versus Less Accurate Trials: Differential Effects on Self-Confidence and Activation

    ERIC Educational Resources Information Center

    Badami, Rokhsareh; VaezMousavi, Mohammad; Wulf, Gabriele; Namazizadeh, Mahdi

    2012-01-01

    One purpose of the present study was to examine whether self-confidence or anxiety would be differentially affected by feedback from more accurate rather than less accurate trials. The second purpose was to determine whether arousal variations (activation) would predict performance. On Day 1, participants performed a golf putting task under one of…

  14. Tools for Early Prediction of Drug Loading in Lipid-Based Formulations

    PubMed Central

    2015-01-01

    Identification of the usefulness of lipid-based formulations (LBFs) for delivery of poorly water-soluble drugs is at date mainly experimentally based. In this work we used a diverse drug data set, and more than 2,000 solubility measurements to develop experimental and computational tools to predict the loading capacity of LBFs. Computational models were developed to enable in silico prediction of solubility, and hence drug loading capacity, in the LBFs. Drug solubility in mixed mono-, di-, triglycerides (Maisine 35-1 and Capmul MCM EP) correlated (R2 0.89) as well as the drug solubility in Carbitol and other ethoxylated excipients (PEG400, R2 0.85; Polysorbate 80, R2 0.90; Cremophor EL, R2 0.93). A melting point below 150 °C was observed to result in a reasonable solubility in the glycerides. The loading capacity in LBFs was accurately calculated from solubility data in single excipients (R2 0.91). In silico models, without the demand of experimentally determined solubility, also gave good predictions of the loading capacity in these complex formulations (R2 0.79). The framework established here gives a better understanding of drug solubility in single excipients and of LBF loading capacity. The large data set studied revealed that experimental screening efforts can be rationalized by solubility measurements in key excipients or from solid state information. For the first time it was shown that loading capacity in complex formulations can be accurately predicted using molecular information extracted from calculated descriptors and thermal properties of the crystalline drug. PMID:26568134

  15. Tools for Early Prediction of Drug Loading in Lipid-Based Formulations.

    PubMed

    Alskär, Linda C; Porter, Christopher J H; Bergström, Christel A S

    2016-01-04

    Identification of the usefulness of lipid-based formulations (LBFs) for delivery of poorly water-soluble drugs is at date mainly experimentally based. In this work we used a diverse drug data set, and more than 2,000 solubility measurements to develop experimental and computational tools to predict the loading capacity of LBFs. Computational models were developed to enable in silico prediction of solubility, and hence drug loading capacity, in the LBFs. Drug solubility in mixed mono-, di-, triglycerides (Maisine 35-1 and Capmul MCM EP) correlated (R(2) 0.89) as well as the drug solubility in Carbitol and other ethoxylated excipients (PEG400, R(2) 0.85; Polysorbate 80, R(2) 0.90; Cremophor EL, R(2) 0.93). A melting point below 150 °C was observed to result in a reasonable solubility in the glycerides. The loading capacity in LBFs was accurately calculated from solubility data in single excipients (R(2) 0.91). In silico models, without the demand of experimentally determined solubility, also gave good predictions of the loading capacity in these complex formulations (R(2) 0.79). The framework established here gives a better understanding of drug solubility in single excipients and of LBF loading capacity. The large data set studied revealed that experimental screening efforts can be rationalized by solubility measurements in key excipients or from solid state information. For the first time it was shown that loading capacity in complex formulations can be accurately predicted using molecular information extracted from calculated descriptors and thermal properties of the crystalline drug.

  16. An Interpretable Machine Learning Model for Accurate Prediction of Sepsis in the ICU.

    PubMed

    Nemati, Shamim; Holder, Andre; Razmi, Fereshteh; Stanley, Matthew D; Clifford, Gari D; Buchman, Timothy G

    2018-04-01

    Sepsis is among the leading causes of morbidity, mortality, and cost overruns in critically ill patients. Early intervention with antibiotics improves survival in septic patients. However, no clinically validated system exists for real-time prediction of sepsis onset. We aimed to develop and validate an Artificial Intelligence Sepsis Expert algorithm for early prediction of sepsis. Observational cohort study. Academic medical center from January 2013 to December 2015. Over 31,000 admissions to the ICUs at two Emory University hospitals (development cohort), in addition to over 52,000 ICU patients from the publicly available Medical Information Mart for Intensive Care-III ICU database (validation cohort). Patients who met the Third International Consensus Definitions for Sepsis (Sepsis-3) prior to or within 4 hours of their ICU admission were excluded, resulting in roughly 27,000 and 42,000 patients within our development and validation cohorts, respectively. None. High-resolution vital signs time series and electronic medical record data were extracted. A set of 65 features (variables) were calculated on an hourly basis and passed to the Artificial Intelligence Sepsis Expert algorithm to predict onset of sepsis in the following T hours (where T = 12, 8, 6, or 4). Artificial Intelligence Sepsis Expert was used to predict onset of sepsis in the following T hours and to produce a list of the most significant contributing factors. For the 12-, 8-, 6-, and 4-hour-ahead prediction of sepsis, Artificial Intelligence Sepsis Expert achieved an area under the receiver operating characteristic curve in the range of 0.83-0.85. Performance of the Artificial Intelligence Sepsis Expert on the development and validation cohorts was indistinguishable. Using data available in the ICU in real time, Artificial Intelligence Sepsis Expert can accurately predict the onset of sepsis in an ICU patient 4-12 hours prior to clinical recognition. A prospective study is necessary to determine the

  17. A new method for enhancer prediction based on deep belief network.

    PubMed

    Bu, Hongda; Gan, Yanglan; Wang, Yang; Zhou, Shuigeng; Guan, Jihong

    2017-10-16

    Studies have shown that enhancers are significant regulatory elements that play crucial roles in gene expression regulation. Since enhancer activity is independent of the orientation of, and distance to, their target genes, accurately predicting distal enhancers remains a challenging task for researchers. In recent years, with the development of high-throughput ChIP-seq technologies, several computational methods have emerged to predict enhancers using epigenetic or genomic features. Nevertheless, the inconsistency of computational models across different cell lines and the unsatisfactory prediction performance call for further research in this area. Here, we propose a new Deep Belief Network (DBN) based computational method for enhancer prediction, called EnhancerDBN. This method combines diverse features, composed of DNA sequence compositional features, DNA methylation and histone modifications. Our computational results indicate that 1) EnhancerDBN outperforms 13 existing methods in prediction, and 2) GC content and DNA methylation can serve as relevant features for enhancer prediction. Deep learning is effective in boosting the performance of enhancer prediction.
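    As a rough illustration of the DBN idea (not the authors' EnhancerDBN implementation), the sketch below stacks scikit-learn BernoulliRBM layers in front of a logistic classifier; the feature matrix, labels, and layer sizes are hypothetical placeholders, so the reported accuracy is only chance level.

```python
# Illustrative DBN-style sketch: stacked RBMs feeding a logistic classifier.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical input: rows = candidate regions, columns = features scaled to [0, 1]
# (k-mer composition, DNA methylation level, histone-mark signals).
X = rng.random((2000, 60))
y = rng.integers(0, 2, size=2000)          # 1 = enhancer, 0 = background (random here)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

dbn_like = Pipeline([
    ("rbm1", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
dbn_like.fit(X_train, y_train)
print("held-out accuracy:", dbn_like.score(X_test, y_test))
```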

  18. Moving Toward Integrating Gene Expression Profiling Into High-Throughput Testing: A Gene Expression Biomarker Accurately Predicts Estrogen Receptor α Modulation in a Microarray Compendium.

    PubMed

    Ryan, Natalia; Chorley, Brian; Tice, Raymond R; Judson, Richard; Corton, J Christopher

    2016-05-01

    Microarray profiling of chemical-induced effects is being increasingly used in medium- and high-throughput formats. Computational methods are described here to identify molecular targets from whole-genome microarray data using as an example the estrogen receptor α (ERα), often modulated by potential endocrine disrupting chemicals. ERα biomarker genes were identified by their consistent expression after exposure to 7 structurally diverse ERα agonists and 3 ERα antagonists in ERα-positive MCF-7 cells. Most of the biomarker genes were shown to be directly regulated by ERα as determined by ESR1 gene knockdown using siRNA as well as through chromatin immunoprecipitation coupled with DNA sequencing analysis of ERα-DNA interactions. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression datasets from experiments using MCF-7 cells, including those evaluating the transcriptional effects of hormones and chemicals. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% and 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) ER reference chemicals including "very weak" agonists. Importantly, the biomarker predictions accurately replicated predictions based on 18 in vitro high-throughput screening assays that queried different steps in ERα signaling. For 114 chemicals, the balanced accuracies were 95% and 98% for activation or suppression, respectively. These results demonstrate that the ERα gene expression biomarker can accurately identify ERα modulators in large collections of microarray data derived from MCF-7 cells. Published by Oxford University Press on behalf of the Society of Toxicology 2016. This work is written by US Government employees and is in the public domain in the US.

  19. Evaluation of three statistical prediction models for forensic age prediction based on DNA methylation.

    PubMed

    Smeers, Inge; Decorte, Ronny; Van de Voorde, Wim; Bekaert, Bram

    2018-05-01

    DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is the fact that prediction errors become larger with increasing age due to interindividual differences in epigenetic ageing rates. This phenomenon of non-constant variance or heteroscedasticity violates an assumption of the often used method of ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that do take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression is proposed as well as a quantile regression model. Their performances were compared against an OLS regression model based on the same dataset. Both models provided age-dependent prediction intervals which account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be a preferred method when dealing with a variance that is not only non-constant, but also not normally distributed. Ultimately the choice of which model to use should depend on the observed characteristics of the data. Copyright © 2018 Elsevier B.V. All rights reserved.
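    A minimal sketch of the weighted least squares idea discussed above, using statsmodels to obtain age-dependent prediction intervals; the single methylation predictor, the weighting scheme, and the data are hypothetical stand-ins, not the authors' dataset or model.

```python
# WLS sketch with age-dependent prediction intervals (hypothetical data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
age = rng.uniform(18, 80, 300)
# Hypothetical single methylation predictor whose noise grows with age
# (heteroscedasticity), mimicking inter-individual epigenetic ageing rates.
meth = 0.2 + 0.008 * age + rng.normal(0, 0.002 * age)

X = sm.add_constant(meth)
ols = sm.OLS(age, X).fit()

# Weight each observation by the inverse of its assumed error variance,
# here modelled as proportional to the squared OLS-predicted age.
w = 1.0 / np.maximum(ols.fittedvalues, 1.0) ** 2
wls = sm.WLS(age, X, weights=w).fit()

# Prediction intervals widen with age because the observation variance does.
pred = wls.get_prediction(X, weights=w).summary_frame(alpha=0.05)
print(pred[["mean", "obs_ci_lower", "obs_ci_upper"]].head())
```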

  20. Self-Adaptive Prediction of Cloud Resource Demands Using Ensemble Model and Subtractive-Fuzzy Clustering Based Fuzzy Neural Network

    PubMed Central

    Chen, Zhijia; Zhu, Yuanchang; Di, Yanqiang; Feng, Shaochong

    2015-01-01

    In IaaS (infrastructure as a service) cloud environments, users are provisioned with virtual machines (VMs). To allocate resources to users dynamically and effectively, accurately predicting resource demands is essential. For this purpose, this paper proposes a self-adaptive prediction method using an ensemble model and a subtractive-fuzzy clustering based fuzzy neural network (ESFCFNN). We analyze the characteristics of user preferences and demands. Then the architecture of the prediction model is constructed. We adopt several base predictors to compose the ensemble model, and then study the structure and learning algorithm of the fuzzy neural network. To obtain the number of fuzzy rules and the initial values of the premise and consequent parameters, this paper proposes fuzzy c-means combined with a subtractive clustering algorithm, that is, subtractive-fuzzy clustering. Finally, we adopt different criteria to evaluate the proposed method. The experimental results show that the method is accurate and effective in predicting resource demands. PMID:25691896

  1. A pilot study of NMR-based sensory prediction of roasted coffee bean extracts.

    PubMed

    Wei, Feifei; Furihata, Kazuo; Miyakawa, Takuya; Tanokura, Masaru

    2014-01-01

    Nuclear magnetic resonance (NMR) spectroscopy can be considered a kind of "magnetic tongue" for the characterisation and prediction of the tastes of foods, since it provides a wealth of information in a nondestructive and nontargeted manner. In the present study, the chemical substances in roasted coffee bean extracts that could distinguish and predict the different sensations of coffee taste were identified by combining NMR-based metabolomics with a human sensory test and applying the multivariate projection method of orthogonal projection to latent structures (OPLS). In addition, the tastes of commercial coffee beans were successfully predicted based on their NMR metabolite profiles using our OPLS model, suggesting that NMR-based metabolomics combined with multiple statistical models is a convenient, fast and accurate approach for the sensory evaluation of coffee. Copyright © 2013 Elsevier Ltd. All rights reserved.

  2. Prediction of Industrial Electric Energy Consumption in Anhui Province Based on GA-BP Neural Network

    NASA Astrophysics Data System (ADS)

    Zhang, Jiajing; Yin, Guodong; Ni, Youcong; Chen, Jinlan

    2018-01-01

    In order to improve the prediction accuracy of industrial electric energy consumption, a prediction model was proposed based on a genetic algorithm and a neural network. The model uses the genetic algorithm to optimize the weights and thresholds of a BP neural network, and it is applied to predict industrial electric energy consumption in Anhui Province. A comparison experiment between the GA-BP prediction model and a plain BP neural network model shows that the GA-BP model is more accurate while requiring fewer neurons in the hidden layer.
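    A rough sketch of the GA part of such a GA-BP scheme: a genetic algorithm (tournament selection, blend crossover, Gaussian mutation) searches for the weights of a small feed-forward network; in the full scheme these weights would then be fine-tuned by back-propagation, which is omitted here. The data, network size, and GA settings are hypothetical.

```python
# GA searching for the weights of a small feed-forward network (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 3))                       # hypothetical input features
y = (X @ np.array([0.5, -0.2, 0.8]) + 0.1 * np.sin(5 * X[:, 0]))[:, None]

N_IN, N_HID, N_OUT = 3, 6, 1
N_W = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT   # total number of parameters

def unpack(w):
    i = 0
    W1 = w[i:i + N_IN * N_HID].reshape(N_IN, N_HID); i += N_IN * N_HID
    b1 = w[i:i + N_HID]; i += N_HID
    W2 = w[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT); i += N_HID * N_OUT
    b2 = w[i:]
    return W1, b1, W2, b2

def forward(w, X):
    W1, b1, W2, b2 = unpack(w)
    return np.tanh(X @ W1 + b1) @ W2 + b2

def mse(w):
    return float(np.mean((forward(w, X) - y) ** 2))

# Genetic algorithm: elitism, tournament selection, blend crossover, Gaussian mutation.
pop = rng.normal(0, 1, (40, N_W))
for gen in range(100):
    fit = np.array([mse(ind) for ind in pop])
    new_pop = [pop[np.argmin(fit)].copy()]               # keep the best individual
    while len(new_pop) < len(pop):
        a, b = (pop[min(rng.integers(0, len(pop), 2), key=lambda k: fit[k])]
                for _ in range(2))                       # two tournament winners
        alpha = rng.random(N_W)
        child = alpha * a + (1 - alpha) * b              # blend crossover
        child += rng.normal(0, 0.1, N_W) * (rng.random(N_W) < 0.1)  # sparse mutation
        new_pop.append(child)
    pop = np.array(new_pop)

best = pop[np.argmin([mse(ind) for ind in pop])]
print("training MSE after GA search (before any BP fine-tuning):", mse(best))
```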

  3. Accurate prediction of X-ray pulse properties from a free-electron laser using machine learning

    DOE PAGES

    Sanchez-Gonzalez, A.; Micaelli, P.; Olivier, C.; ...

    2017-06-05

    Free-electron lasers providing ultra-short high-brightness pulses of X-ray radiation have great potential for a wide impact on science, and are a critical element for unravelling the structural dynamics of matter. To fully harness this potential, we must accurately know the X-ray properties: intensity, spectrum and temporal profile. Owing to the inherent fluctuations in free-electron lasers, this mandates a full characterization of the properties for each and every pulse. While diagnostics of these properties exist, they are often invasive and many cannot operate at a high-repetition rate. Here, we present a technique for circumventing this limitation. Employing a machine learning strategy, we can accurately predict X-ray properties for every shot using only parameters that are easily recorded at high-repetition rate, by training a model on a small set of fully diagnosed pulses. Lastly, this opens the door to fully realizing the promise of next-generation high-repetition rate X-ray lasers.

  4. Accurate prediction of X-ray pulse properties from a free-electron laser using machine learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanchez-Gonzalez, A.; Micaelli, P.; Olivier, C.

    Free-electron lasers providing ultra-short high-brightness pulses of X-ray radiation have great potential for a wide impact on science, and are a critical element for unravelling the structural dynamics of matter. To fully harness this potential, we must accurately know the X-ray properties: intensity, spectrum and temporal profile. Owing to the inherent fluctuations in free-electron lasers, this mandates a full characterization of the properties for each and every pulse. While diagnostics of these properties exist, they are often invasive and many cannot operate at a high-repetition rate. Here, we present a technique for circumventing this limitation. Employing a machine learning strategy, we can accurately predict X-ray properties for every shot using only parameters that are easily recorded at high-repetition rate, by training a model on a small set of fully diagnosed pulses. Lastly, this opens the door to fully realizing the promise of next-generation high-repetition rate X-ray lasers.

  5. Accurate and dynamic predictive model for better prediction in medicine and healthcare.

    PubMed

    Alanazi, H O; Abdullah, A H; Qureshi, K N; Ismail, A S

    2018-05-01

    Information and communication technologies (ICTs) have driven a trend toward new integrated operations and methods in all fields of life. The health sector has also adopted new technologies to improve its systems and provide better services to customers. Predictive models in health care are likewise influenced by new technologies used to predict different disease outcomes. However, existing predictive models still suffer from some limitations in terms of predictive performance. In order to improve predictive model performance, this paper proposes a predictive model that classifies disease predictions into different categories. To achieve this, the paper uses traumatic brain injury (TBI) datasets. TBI is one of the most serious diseases worldwide and needs more attention due to its severe impact on human life. The proposed predictive model improves the predictive performance for TBI. The TBI data set was developed and approved by neurologists to set its features. The experiment results show that the proposed model achieves significant results in terms of accuracy, sensitivity, and specificity.

  6. Resting energy expenditure prediction in recreational athletes of 18-35 years: confirmation of Cunningham equation and an improved weight-based alternative.

    PubMed

    ten Haaf, Twan; Weijs, Peter J M

    2014-01-01

    Resting energy expenditure (REE) is expected to be higher in athletes because of their relatively high fat-free mass (FFM). Therefore, an REE predictive equation specific to recreational athletes may be required. The aim of this study was to validate existing REE predictive equations and to develop a new recreational-athlete-specific equation. 90 (53 M, 37 F) adult athletes, exercising on average 9.1 ± 5.0 hours a week and 5.0 ± 1.8 times a week, were included. REE was measured using indirect calorimetry (Vmax Encore n29); FFM and FM were measured using air displacement plethysmography. Multiple linear regression analysis was used to develop a new FFM-based and a new weight-based REE predictive equation. The percentage of accurate predictions (within 10% of measured REE), percentage bias, root mean square error and limits of agreement were calculated. Results: The Cunningham equation, the new weight-based equation REE(kJ/d) = 49.940*weight(kg) + 2459.053*height(m) - 34.014*age(y) + 799.257*sex(M = 1, F = 0) + 122.502, and the new FFM-based equation REE(kJ/d) = 95.272*FFM(kg) + 2026.161 performed equally well. De Lorenzo's equation predicted REE less accurately, but better than the other generally used REE predictive equations. Harris-Benedict, WHO, Schofield, Mifflin and Owen all showed less than 50% accuracy. For a population of (Dutch) recreational athletes, REE can be accurately predicted with the existing Cunningham equation. Since body composition measurement is not always possible, and other generally used equations fail, the new weight-based equation is advised for use in sports nutrition.
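    The two new equations reported above translate directly into a small helper, sketched below with the units stated in the abstract (weight in kg, height in m, age in years, sex coded 1 for male and 0 for female, REE in kJ/day); the example subject values are hypothetical.

```python
def ree_weight_based(weight_kg: float, height_m: float, age_y: float, male: bool) -> float:
    """Weight-based REE equation quoted in the abstract above (kJ/day)."""
    return (49.940 * weight_kg + 2459.053 * height_m
            - 34.014 * age_y + 799.257 * (1 if male else 0) + 122.502)

def ree_ffm_based(ffm_kg: float) -> float:
    """Fat-free-mass-based REE equation quoted in the abstract above (kJ/day)."""
    return 95.272 * ffm_kg + 2026.161

# Hypothetical example: 75 kg, 1.80 m, 28-year-old male athlete with 63 kg FFM.
print(round(ree_weight_based(75, 1.80, 28, True)), "kJ/day")
print(round(ree_ffm_based(63)), "kJ/day")
```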

  7. A dynamic multi-scale Markov model based methodology for remaining life prediction

    NASA Astrophysics Data System (ADS)

    Yan, Jihong; Guo, Chaozhong; Wang, Xing

    2011-05-01

    The ability to accurately predict the remaining life of partially degraded components is crucial in prognostics. In this paper, a performance degradation index is designed using multi-feature fusion techniques to represent the deterioration severity of equipment. Based on this indicator, an improved Markov model is proposed for remaining life prediction. The Fuzzy C-Means (FCM) algorithm is employed to perform state division for the Markov model in order to avoid the uncertainty of state division caused by hard division approaches. Considering the influence of both historical and real-time data, a dynamic prediction method is introduced into the Markov model via a weighted coefficient. Multi-scale theory is employed to solve the state division problem of multi-sample prediction. Consequently, a dynamic multi-scale Markov model is constructed. An experiment based on a Bently-RK4 rotor testbed was designed to validate the dynamic multi-scale Markov model; the experimental results illustrate the effectiveness of the methodology.

  8. Efficient and accurate two-scale FE-FFT-based prediction of the effective material behavior of elasto-viscoplastic polycrystals

    NASA Astrophysics Data System (ADS)

    Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie

    2018-06-01

    Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.

  9. Efficient and accurate two-scale FE-FFT-based prediction of the effective material behavior of elasto-viscoplastic polycrystals

    NASA Astrophysics Data System (ADS)

    Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie

    2017-09-01

    Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.

  10. Combining Mean and Standard Deviation of Hounsfield Unit Measurements from Preoperative CT Allows More Accurate Prediction of Urinary Stone Composition Than Mean Hounsfield Units Alone.

    PubMed

    Tailly, Thomas; Larish, Yaniv; Nadeau, Brandon; Violette, Philippe; Glickman, Leonard; Olvera-Posada, Daniel; Alenezi, Husain; Amann, Justin; Denstedt, John; Razvi, Hassan

    2016-04-01

    The mineral composition of a urinary stone may influence its surgical and medical treatment. Previous attempts at identifying stone composition based on mean Hounsfield Units (HUm) have had varied success. We aimed to evaluate the additional use of the standard deviation of HU (HUsd) to more accurately predict stone composition. We identified patients from two centers who had undergone urinary stone treatment between 2006 and 2013 and had mineral stone analysis and a computed tomography (CT) available. HUm and HUsd of the stones were compared with ANOVA. Receiver operating characteristic analysis with area under the curve (AUC), Youden index, and likelihood ratio calculations were performed. Data were available for 466 patients. The major components were calcium oxalate monohydrate (COM), uric acid (UA), hydroxyapatite, struvite, brushite (Br), cystine, and calcium oxalate dihydrate (COD) in 41.4%, 19.3%, 12.4%, 7.5%, 5.8%, 5.4%, and 4.7% of patients, respectively. The HUm of UA and Br was significantly lower and higher, respectively, than the HUm of any other stone type. HUm and HUsd were most accurate in predicting uric acid, with an AUC of 0.969 and 0.851, respectively. The combined use of HUm and HUsd resulted in an increased positive predictive value and higher likelihood ratios for identifying a stone's mineral composition for all stone types but COM. To the best of our knowledge, this is the first report of CT data aiding in the prediction of brushite stone composition. Both HUm and HUsd can help predict stone composition, and their combined use results in higher likelihood ratios, influencing probability.
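    A minimal sketch of the kind of comparison described above: a simple classifier scored by ROC AUC using mean HU alone versus mean HU plus its standard deviation. The data below are synthetic placeholders, not the 466-patient cohort.

```python
# Compare ROC AUC of mean HU alone vs. mean HU + HU standard deviation
# for discriminating one stone type (synthetic data, illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 400
is_uric_acid = rng.integers(0, 2, n)
hu_mean = np.where(is_uric_acid, rng.normal(450, 80, n), rng.normal(900, 200, n))
hu_sd = np.where(is_uric_acid, rng.normal(100, 30, n), rng.normal(180, 60, n))

X_mean = hu_mean.reshape(-1, 1)
X_both = np.column_stack([hu_mean, hu_sd])

for name, X in [("HUm only", X_mean), ("HUm + HUsd", X_both)]:
    auc = cross_val_score(LogisticRegression(max_iter=1000), X, is_uric_acid,
                          scoring="roc_auc", cv=5).mean()
    print(f"{name}: cross-validated AUC = {auc:.3f}")
```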

  11. Research on the Wire Network Signal Prediction Based on the Improved NNARX Model

    NASA Astrophysics Data System (ADS)

    Zhang, Zipeng; Fan, Tao; Wang, Shuqing

    It is difficult to accurately obtain the wire network signal of a power system's high voltage transmission lines during monitoring and repair. To solve this problem, the signal measured in a remote substation or laboratory is employed to make multi-point predictions that supply the needed data; however, the power grid frequency signal obtained this way is delayed. To address this, an improved NNARX network that can predict the frequency signal based on multi-point data collected by remote substation PMUs is described in this paper. Because the error surface of the NNARX network is complicated, this paper uses the L-M (Levenberg-Marquardt) algorithm to train the network. The simulation results show that the NNARX network has good prediction performance, providing accurate real-time data for field testing and maintenance.

  12. Predicting drug hydrolysis based on moisture uptake in various packaging designs.

    PubMed

    Naversnik, Klemen; Bohanec, Simona

    2008-12-18

    An attempt was made to predict the stability of a moisture-sensitive drug product based on knowledge of the dependence of the degradation rate on tablet moisture. The moisture increase inside an HDPE bottle containing the drug formulation was simulated with the sorption-desorption moisture transfer model, which, in turn, allowed an accurate prediction of the drug degradation kinetics. The stability prediction, obtained by computer simulation, was made in a considerably shorter time frame and required fewer resources compared with a conventional stability study. The prediction was finally upgraded to a stochastic Monte Carlo simulation, which allowed quantitative incorporation of uncertainty stemming from various sources. The resulting distribution of the outcome of interest (amount of degradation product at expiry) is a comprehensive way of communicating the result along with its uncertainty, superior to single-value results or confidence intervals.
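    A minimal sketch of the Monte Carlo step described above, under the simplifying assumption that the degradation rate depends linearly on tablet moisture and that the inputs carry normally distributed uncertainty; every number and the model form are hypothetical placeholders, not the published sorption-desorption model.

```python
# Monte Carlo propagation of uncertainty into the degradant level at expiry
# (hypothetical linear rate-vs-moisture model and parameter values).
import numpy as np

rng = np.random.default_rng(3)
n_sim = 100_000
shelf_life_months = 24

# Uncertain inputs (mean, sd): slope and intercept of rate vs. moisture, and the
# average tablet moisture reached inside the HDPE bottle over storage.
slope = rng.normal(0.004, 0.0005, n_sim)      # % degradant / month per % moisture
intercept = rng.normal(0.001, 0.0002, n_sim)  # % degradant / month at zero moisture
moisture = rng.normal(2.5, 0.3, n_sim)        # average tablet moisture, % w/w

rate = intercept + slope * moisture           # assumed zero-order degradation rate
degradant_at_expiry = rate * shelf_life_months

print(f"median degradant at expiry: {np.median(degradant_at_expiry):.3f} %")
print(f"95th percentile: {np.percentile(degradant_at_expiry, 95):.3f} % "
      "(compare with the specification limit)")
```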

  13. Life prediction for high temperature low cycle fatigue of two kinds of titanium alloys based on exponential function

    NASA Astrophysics Data System (ADS)

    Mu, G. Y.; Mi, X. Z.; Wang, F.

    2018-01-01

    High temperature low cycle fatigue tests of TC4 and TC11 titanium alloys were carried out under strain control. The relationships between cyclic stress and life and between strain and life are analyzed. A high temperature low cycle fatigue life prediction model for the two titanium alloys is first established using the Manson-Coffin method. However, the relationship between the number of reversals to failure and the plastic strain range is nonlinear in double logarithmic coordinates, whereas the Manson-Coffin method assumes a linear relation, so a certain prediction error is unavoidable with that method. In order to solve this problem, a new method based on an exponential function is proposed. The results show that the fatigue life of the two titanium alloys can be predicted accurately and effectively by both methods, with prediction accuracy within a ±1.83-times scatter band. The new exponential-function method proves more effective and accurate than the Manson-Coffin method for both alloys, giving better fatigue life predictions with a smaller standard deviation and scatter band. The life prediction results of both methods for the TC4 titanium alloy are better than those for the TC11 titanium alloy.
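    A rough sketch of the kind of comparison made above: fitting a Manson-Coffin power law and an exponential-type strain-life function to the same strain-life data with scipy and comparing residuals. The data are synthetic and the exponential form is an illustrative assumption, not the paper's exact formulation.

```python
# Fit Manson-Coffin (power law) vs. an exponential-type strain-life function
# to synthetic plastic-strain-range / reversals-to-failure data.
import numpy as np
from scipy.optimize import curve_fit

def manson_coffin(two_nf, eps_f, c):
    # delta_eps_p / 2 = eps_f' * (2Nf)^c, with c negative
    return eps_f * two_nf ** c

def exponential_model(two_nf, a, b):
    # Illustrative exponential-type alternative; the paper's exact
    # exponential formulation is not reproduced here.
    return a * np.exp(-b * np.sqrt(two_nf))

rng = np.random.default_rng(4)
two_nf = np.logspace(2, 5, 20)                       # reversals to failure
strain = 0.05 * two_nf ** -0.55 * rng.lognormal(0, 0.08, two_nf.size)

for name, model, p0 in [("Manson-Coffin", manson_coffin, (0.05, -0.5)),
                        ("exponential", exponential_model, (0.01, 0.01))]:
    popt, _ = curve_fit(model, two_nf, strain, p0=p0, maxfev=10000)
    rms = np.sqrt(np.mean((model(two_nf, *popt) - strain) ** 2))
    print(f"{name}: params = {popt.round(4)}, RMS residual = {rms:.2e}")
```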

  14. Limited Sampling Strategy for Accurate Prediction of Pharmacokinetics of Saroglitazar: A 3-point Linear Regression Model Development and Successful Prediction of Human Exposure.

    PubMed

    Joshi, Shuchi N; Srinivas, Nuggehally R; Parmar, Deven V

    2018-03-01

    Our aim was to develop and validate the extrapolative performance of a regression model using a limited sampling strategy for accurate estimation of the area under the plasma concentration versus time curve for saroglitazar. Healthy subject pharmacokinetic data from a well-powered food-effect study (fasted vs fed treatments; n = 50) was used in this work. The first 25 subjects' serial plasma concentration data up to 72 hours and corresponding AUC(0-t) (i.e., 72 hours) from the fasting group comprised a training dataset to develop the limited sampling model. The internal datasets for prediction included the remaining 25 subjects from the fasting group and all 50 subjects from the fed condition of the same study. The external datasets included pharmacokinetic data for saroglitazar from previous single-dose clinical studies. Limited sampling models were composed of 1-, 2-, and 3-concentration-time points' correlation with AUC(0-t) of saroglitazar. Only models with regression coefficients (R²) > 0.90 were screened for further evaluation. The best R² model was validated for its utility based on mean prediction error, mean absolute prediction error, and root mean square error. Both correlations between predicted and observed AUC(0-t) of saroglitazar and verification of precision and bias using a Bland-Altman plot were carried out. None of the evaluated 1- and 2-concentration-time points models achieved R² > 0.90. Among the various 3-concentration-time points models, only 4 equations passed the predefined criterion of R² > 0.90. Limited sampling models with time points 0.5, 2, and 8 hours (R² = 0.9323) and 0.75, 2, and 8 hours (R² = 0.9375) were validated. Mean prediction error, mean absolute prediction error, and root mean square error were <30% (predefined criterion) and correlation (r) was at least 0.7950 for the consolidated internal and external datasets of 102 healthy subjects for the AUC(0-t) prediction of saroglitazar. The same models, when applied to the AUC(0-t)
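    A minimal sketch of a 3-concentration-time-point limited sampling model of the kind validated above: regress AUC(0-t) on concentrations at 0.5, 2, and 8 hours and report R², mean prediction error, mean absolute prediction error, and RMSE. The concentration and AUC values are synthetic placeholders, not the saroglitazar datasets.

```python
# 3-concentration-time-point limited sampling model (synthetic data).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
n = 50
c_05 = rng.lognormal(3.0, 0.3, n)     # concentration at 0.5 h
c_2 = rng.lognormal(3.4, 0.3, n)      # concentration at 2 h
c_8 = rng.lognormal(2.2, 0.4, n)      # concentration at 8 h
auc = 0.8 * c_05 + 2.5 * c_2 + 6.0 * c_8 + rng.normal(0, 15, n)

X = np.column_stack([c_05, c_2, c_8])
X_tr, X_te, y_tr, y_te = train_test_split(X, auc, test_size=0.5, random_state=0)

lsm = LinearRegression().fit(X_tr, y_tr)
pred = lsm.predict(X_te)

mpe = np.mean((pred - y_te) / y_te) * 100          # mean prediction error, %
mape = np.mean(np.abs(pred - y_te) / y_te) * 100   # mean absolute prediction error, %
rmse = np.sqrt(np.mean((pred - y_te) ** 2))
print(f"R2 (training) = {lsm.score(X_tr, y_tr):.4f}")
print(f"MPE = {mpe:.1f} %, MAPE = {mape:.1f} %, RMSE = {rmse:.1f}")
```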

  15. Accurate Prediction of Protein Contact Maps by Coupling Residual Two-Dimensional Bidirectional Long Short-Term Memory with Convolutional Neural Networks.

    PubMed

    Hanson, Jack; Paliwal, Kuldip; Litfin, Thomas; Yang, Yuedong; Zhou, Yaoqi

    2018-06-19

    Accurate prediction of a protein contact map depends greatly on capturing as much contextual information as possible from surrounding residues for a target residue pair. Recently, ultra-deep residual convolutional networks were found to be state-of-the-art in the latest Critical Assessment of Structure Prediction techniques (CASP12, (Schaarschmidt et al., 2018)) for protein contact map prediction by attempting to provide a protein-wide context at each residue pair. Recurrent neural networks, especially Long Short-Term Memory (LSTM) cells, have seen great success in recent protein residue classification problems due to their ability to propagate information through long protein sequences. Here we propose a novel protein contact map prediction method by stacking residual convolutional networks with two-dimensional residual bidirectional recurrent LSTM networks, and using both one-dimensional sequence-based and two-dimensional evolutionary coupling-based information. We show that the proposed method achieves a robust performance over validation and independent test sets with the Area Under the receiver operating characteristic Curve (AUC) > 0.95 in all tests. When compared to several state-of-the-art methods for independent testing of 228 proteins, the method yields an AUC value of 0.958, whereas the next-best method obtains an AUC of 0.909. More importantly, the improvement is over contacts at all sequence-position separations. Specifically, an 8.95%, 5.65% and 2.84% increase in precision was observed for the top L/10 predictions over the next best for short-, medium- and long-range contacts, respectively. This confirms the usefulness of ResNets to congregate the short-range relations and 2D-BRLSTM to propagate the long-range dependencies throughout the entire protein contact map 'image'. SPOT-Contact server url: http://sparks-lab.org/jack/server/SPOT-Contact/. Supplementary data is available at Bioinformatics online.

  16. Using In-Service and Coaching to Increase Teachers' Accurate Use of Research-Based Strategies

    ERIC Educational Resources Information Center

    Kretlow, Allison G.; Cooke, Nancy L.; Wood, Charles L.

    2012-01-01

    Increasing the accurate use of research-based practices in classrooms is a critical issue. Professional development is one of the most practical ways to provide practicing teachers with training related to research-based practices. This study examined the effects of in-service plus follow-up coaching on first grade teachers' accurate delivery of…

  17. Wind power prediction based on genetic neural network

    NASA Astrophysics Data System (ADS)

    Zhang, Suhan

    2017-04-01

    The scale of grid-connected wind farms keeps increasing. To ensure the stability of power system operation, make reasonable scheduling schemes and improve the competitiveness of wind farms in the electricity generation market, it is important to accurately forecast short-term wind power. To reduce the influence of the nonlinear relationship between the disturbance factors and the wind power, an improved prediction model based on a genetic algorithm and a neural network is established. To overcome the long training time of the BP neural network and its tendency to fall into local minima, and to improve its accuracy, the genetic algorithm is adopted to optimize the parameters and topology of the neural network. Historical data are used as input to predict short-term wind power. The effectiveness and feasibility of the method are verified with actual data from a wind farm as an example.

  18. Predicting neuroblastoma using developmental signals and a logic-based model.

    PubMed

    Kasemeier-Kulesa, Jennifer C; Schnell, Santiago; Woolley, Thomas; Spengler, Jennifer A; Morrison, Jason A; McKinney, Mary C; Pushel, Irina; Wolfe, Lauren A; Kulesa, Paul M

    2018-07-01

    Genomic information from human patient samples of pediatric neuroblastoma cancers and known outcomes have led to specific gene lists put forward as high risk for disease progression. However, the reliance on gene expression correlations rather than mechanistic insight has shown limited potential and suggests a critical need for molecular network models that better predict neuroblastoma progression. In this study, we construct and simulate a molecular network of developmental genes and downstream signals in a 6-gene input logic model that predicts a favorable/unfavorable outcome based on the outcome of the four cell states including cell differentiation, proliferation, apoptosis, and angiogenesis. We simulate the mis-expression of the tyrosine receptor kinases, trkA and trkB, two prognostic indicators of neuroblastoma, and find differences in the number and probability distribution of steady state outcomes. We validate the mechanistic model assumptions using RNAseq of the SHSY5Y human neuroblastoma cell line to define the input states and confirm the predicted outcome with antibody staining. Lastly, we apply input gene signatures from 77 published human patient samples and show that our model makes more accurate disease outcome predictions for early stage disease than any current neuroblastoma gene list. These findings highlight the predictive strength of a logic-based model based on developmental genes and offer a better understanding of the molecular network interactions during neuroblastoma disease progression. Copyright © 2018. Published by Elsevier B.V.
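    A toy sketch of a logic-based model of this general shape: enumerate all 2^6 input-gene states, evaluate Boolean rules for the four cell states, and classify the outcome. The gene names, rules, and outcome criterion below are invented placeholders, not the published 6-gene neuroblastoma model.

```python
# Toy Boolean-logic model: 6 hypothetical gene inputs -> 4 cell states -> outcome.
from itertools import product

INPUTS = ["g1", "g2", "g3", "g4", "g5", "g6"]   # hypothetical input genes

def cell_states(s):
    # Made-up Boolean rules for the four cell states (illustrative only).
    differentiation = s["g1"] and not s["g4"]
    proliferation = s["g2"] or (s["g3"] and s["g5"])
    apoptosis = s["g6"] and not s["g2"]
    angiogenesis = s["g3"] and s["g4"]
    return differentiation, proliferation, apoptosis, angiogenesis

favorable = 0
for values in product([False, True], repeat=len(INPUTS)):
    state = dict(zip(INPUTS, values))
    diff, prolif, apop, angio = cell_states(state)
    # Hypothetical outcome rule: favorable if differentiation or apoptosis
    # dominates over combined proliferation and angiogenesis.
    if (diff or apop) and not (prolif and angio):
        favorable += 1

print(f"{favorable} of {2 ** len(INPUTS)} input states map to a favorable outcome")
```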

  19. Two States Mapping Based Time Series Neural Network Model for Compensation Prediction Residual Error

    NASA Astrophysics Data System (ADS)

    Jung, Insung; Koo, Lockjo; Wang, Gi-Nam

    2008-11-01

    The objective of this paper was to design a human bio-signal prediction system that decreases prediction error, using a two-states-mapping-based time series neural network BP (back-propagation) model. Neural network models trained in a supervised manner with the error back-propagation algorithm have been widely applied in industry for time series prediction; however, a residual error remains between the real values and the prediction results. Therefore, we designed a two-state neural network model that compensates for this residual error, which can be used in the prevention of sudden death and metabolic syndrome diseases such as hypertension and obesity. We found that most of the simulation cases were handled satisfactorily by the two-states-mapping-based time series prediction model. In particular, for small time series sample sizes it was more accurate than the standard MLP model.

  20. Anthropometric measures in cardiovascular disease prediction: comparison of laboratory-based versus non-laboratory-based model.

    PubMed

    Dhana, Klodian; Ikram, M Arfan; Hofman, Albert; Franco, Oscar H; Kavousi, Maryam

    2015-03-01

    Body mass index (BMI) has been used to simplify cardiovascular risk prediction models by substituting total cholesterol and high-density lipoprotein cholesterol. In the elderly, the ability of BMI as a predictor of cardiovascular disease (CVD) declines. We aimed to find the most predictive anthropometric measure for CVD risk to construct a non-laboratory-based model and to compare it with the model including laboratory measurements. The study included 2675 women and 1902 men aged 55-79 years from the prospective population-based Rotterdam Study. We used Cox proportional hazard regression analysis to evaluate the association of BMI, waist circumference, waist-to-hip ratio and a body shape index (ABSI) with CVD, including coronary heart disease and stroke. The performance of the laboratory-based and non-laboratory-based models was evaluated by studying the discrimination, calibration, correlation and risk agreement. Among men, ABSI was the most informative measure associated with CVD, therefore ABSI was used to construct the non-laboratory-based model. Discrimination of the non-laboratory-based model was not different than laboratory-based model (c-statistic: 0.680-vs-0.683, p=0.71); both models were well calibrated (15.3% observed CVD risk vs 16.9% and 17.0% predicted CVD risks by the non-laboratory-based and laboratory-based models, respectively) and Spearman rank correlation and the agreement between non-laboratory-based and laboratory-based models were 0.89 and 91.7%, respectively. Among women, none of the anthropometric measures were independently associated with CVD. Among middle-aged and elderly where the ability of BMI to predict CVD declines, the non-laboratory-based model, based on ABSI, could predict CVD risk as accurately as the laboratory-based model among men. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  1. Prediction of human pharmacokinetics using physiologically based modeling: a retrospective analysis of 26 clinically tested drugs.

    PubMed

    De Buck, Stefan S; Sinha, Vikash K; Fenu, Luca A; Nijsen, Marjoleen J; Mackie, Claire E; Gilissen, Ron A H J

    2007-10-01

    The aim of this study was to evaluate different physiologically based modeling strategies for the prediction of human pharmacokinetics. Plasma profiles after intravenous and oral dosing were simulated for 26 clinically tested drugs. Two mechanism-based predictions of human tissue-to-plasma partitioning (P(tp)) from physicochemical input (method Vd1) were evaluated for their ability to describe human volume of distribution at steady state (V(ss)). This method was compared with a strategy that combined predicted and experimentally determined in vivo rat P(tp) data (method Vd2). Best V(ss) predictions were obtained using method Vd2, providing that rat P(tp) input was corrected for interspecies differences in plasma protein binding (84% within 2-fold). V(ss) predictions from physicochemical input alone were poor (32% within 2-fold). Total body clearance (CL) was predicted as the sum of scaled rat renal clearance and hepatic clearance projected from in vitro metabolism data. Best CL predictions were obtained by disregarding both blood and microsomal or hepatocyte binding (method CL2, 74% within 2-fold), whereas strong bias was seen using both blood and microsomal or hepatocyte binding (method CL1, 53% within 2-fold). The physiologically based pharmacokinetics (PBPK) model, which combined methods Vd2 and CL2 yielded the most accurate predictions of in vivo terminal half-life (69% within 2-fold). The Gastroplus advanced compartmental absorption and transit model was used to construct an absorption-disposition model and provided accurate predictions of area under the plasma concentration-time profile, oral apparent volume of distribution, and maximum plasma concentration after oral dosing, with 74%, 70%, and 65% within 2-fold, respectively. This evaluation demonstrates that PBPK models can lead to reasonable predictions of human pharmacokinetics.

  2. A data mining based approach to predict spatiotemporal changes in satellite images

    NASA Astrophysics Data System (ADS)

    Boulila, W.; Farah, I. R.; Ettabaa, K. Saheb; Solaiman, B.; Ghézala, H. Ben

    2011-06-01

    The interpretation of remotely sensed images in a spatiotemporal context is becoming a valuable research topic. However, the constant growth of data volume in remote sensing imaging makes reaching conclusions based on the collected data a challenging task. Recently, data mining has emerged as a promising research field, leading to several interesting discoveries in various areas such as marketing, surveillance, fraud detection and scientific discovery. By integrating data mining and image interpretation techniques, accurate and relevant information (i.e. the functional relation between observed parcels and a set of informational contents) can be automatically elicited. This study presents a new approach to predict spatiotemporal changes in satellite image databases. The proposed method exploits fuzzy sets and data mining concepts to build predictions and decisions for several remote sensing fields. It takes into account imperfections related to the spatiotemporal mining process in order to provide more accurate and reliable information about land cover changes in satellite images. The proposed approach is validated using SPOT images representing the Saint-Denis region, capital of Reunion Island. Results show the good performance of the proposed framework in predicting change in the urban zone.

  3. An Extrapolation of a Radical Equation More Accurately Predicts Shelf Life of Frozen Biological Matrices.

    PubMed

    De Vore, Karl W; Fatahi, Nadia M; Sass, John E

    2016-08-01

    Arrhenius modeling of analyte recovery at increased temperatures to predict long-term colder storage stability of biological raw materials, reagents, calibrators, and controls is standard practice in the diagnostics industry. Predicting subzero temperature stability using the same practice is frequently criticized but nevertheless heavily relied upon. We compared the ability to predict analyte recovery during frozen storage using 3 separate strategies: traditional accelerated studies with Arrhenius modeling, and extrapolation of recovery at 20% of shelf life using either ordinary least squares or a radical equation y = B1·x^0.5 + B0. Computer simulations were performed to establish equivalence of statistical power to discern the expected changes during frozen storage or accelerated stress. This was followed by actual predictive and follow-up confirmatory testing of 12 chemistry and immunoassay analytes. Linear extrapolations tended to be the most conservative in the predicted percent recovery, reducing customer and patient risk. However, the majority of analytes followed a rate of change that slowed over time, which was fit best to a radical equation of the form y = B1·x^0.5 + B0. Other evidence strongly suggested that the slowing of the rate was not due to higher-order kinetics, but to changes in the matrix during storage. Predicting shelf life of frozen products through extrapolation of early initial real-time storage analyte recovery should be considered the most accurate method. Although in this study the time required for a prediction was longer than a typical accelerated testing protocol, there are fewer potential sources of error, reduced costs, and a lower expenditure of resources. © 2016 American Association for Clinical Chemistry.
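    A minimal sketch of the radical-equation extrapolation described above: fit y = B1·x^0.5 + B0 to early real-time recovery data and extrapolate to the claimed shelf life. The data points and shelf-life horizon are hypothetical.

```python
# Fit the radical equation y = B1 * sqrt(x) + B0 to early frozen-storage
# recovery data and extrapolate to expiry (synthetic, illustrative data).
import numpy as np
from scipy.optimize import curve_fit

def radical(x, b1, b0):
    return b1 * np.sqrt(x) + b0

months = np.array([0.5, 1, 2, 3, 4, 5, 6])                        # ~20% of shelf life
recovery = np.array([99.8, 99.1, 98.4, 97.9, 97.5, 97.2, 96.9])   # % of initial value

(b1, b0), _ = curve_fit(radical, months, recovery)
shelf_life = 30                                                    # months, hypothetical
print(f"B1 = {b1:.3f}, B0 = {b0:.2f}")
print(f"predicted recovery at {shelf_life} months: {radical(shelf_life, b1, b0):.1f} %")
```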

  4. Physics-based enzyme design: predicting binding affinity and catalytic activity.

    PubMed

    Sirin, Sarah; Pearlman, David A; Sherman, Woody

    2014-12-01

    Computational enzyme design is an emerging field that has yielded promising success stories, but where numerous challenges remain. Accurate methods to rapidly evaluate possible enzyme design variants could provide significant value when combined with experimental efforts by reducing the number of variants needed to be synthesized and speeding the time to reach the desired endpoint of the design. To that end, extending our computational methods to model the fundamental physical-chemical principles that regulate activity in a protocol that is automated and accessible to a broad population of enzyme design researchers is essential. Here, we apply a physics-based implicit solvent MM-GBSA scoring approach to enzyme design and benchmark the computational predictions against experimentally determined activities. Specifically, we evaluate the ability of MM-GBSA to predict changes in affinity for a steroid binder protein, catalytic turnover for a Kemp eliminase, and catalytic activity for α-Gliadin peptidase variants. Using the enzyme design framework developed here, we accurately rank the most experimentally active enzyme variants, suggesting that this approach could provide enrichment of active variants in real-world enzyme design applications. © 2014 Wiley Periodicals, Inc.

  5. A new solar power output prediction based on hybrid forecast engine and decomposition model.

    PubMed

    Zhang, Weijiang; Dang, Hongshe; Simoes, Rolando

    2018-06-12

    Given the growing role of photovoltaic (PV) energy as a clean energy source in electrical networks and its uncertain nature, PV energy prediction has been studied by researchers in recent decades. This problem directly affects operation of the power network, and due to the high volatility of the signal an accurate prediction model is required. A new prediction model based on the Hilbert-Huang transform (HHT) and the integration of improved empirical mode decomposition (IEMD) with feature selection and a forecast engine is presented in this paper. The proposed approach is divided into three main sections. In the first section, the signal is decomposed by the proposed IEMD as an accurate decomposition tool. To increase the accuracy of the proposed method, a new interpolation method is used instead of cubic spline curve (CSC) fitting in EMD. The obtained output is then entered into a new feature selection procedure to choose the best candidate inputs. Finally, the signal is predicted by a hybrid forecast engine composed of support vector regression (SVR) based on an intelligent algorithm. The effectiveness of the proposed approach has been verified over a number of real-world engineering test cases in comparison with other well-known models. The obtained results prove the validity of the proposed method. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.

  6. Hadoop-Based Distributed System for Online Prediction of Air Pollution Based on Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Ghaemi, Z.; Farnaghi, M.; Alimohammadi, A.

    2015-12-01

    The critical impact of air pollution on human health and the environment on the one hand, and the complexity of pollutant concentration behavior on the other, lead scientists to look for advanced techniques for monitoring and predicting urban air quality. Additionally, recent developments in data measurement techniques have led to the collection of various types of data about air quality. Such data is extremely voluminous and, to be useful, it must be processed at high velocity. Due to the complexity of big data analysis, especially for dynamic applications, online forecasting of pollutant concentration trends within a reasonable processing time is still an open problem. The purpose of this paper is to present an online forecasting approach based on Support Vector Machines (SVM) to predict the air quality one day in advance. In order to meet the computational requirements of large-scale data analysis, distributed computing based on the Hadoop platform has been employed to leverage the processing power of multiple processing units. The MapReduce programming model is adopted for massive parallel processing in this study. Based on the online algorithm and the Hadoop framework, an online forecasting system is designed to predict the air pollution of Tehran for the next 24 hours. The results have been assessed on the basis of processing time and efficiency. Quite accurate predictions of air pollutant indicator levels within an acceptable processing time prove that the presented approach is very suitable for tackling large-scale air pollution prediction problems.
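    A minimal sketch of the forecasting core (without the Hadoop/MapReduce layer): support vector regression on lagged daily values to predict the next day's pollutant level. The data, lag depth, and kernel settings are illustrative assumptions.

```python
# One-day-ahead pollutant forecast with support vector regression on lagged
# daily values (synthetic data; features, lags and kernel are illustrative).
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(6)
days = 730
pm = 60 + 20 * np.sin(2 * np.pi * np.arange(days) / 365) + rng.normal(0, 8, days)

LAGS = 7
X = np.column_stack([pm[i:days - LAGS + i] for i in range(LAGS)])  # last 7 days
y = pm[LAGS:]                                                      # next-day value

split = int(0.8 * len(y))
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.5))
model.fit(X[:split], y[:split])

pred = model.predict(X[split:])
mae = np.mean(np.abs(pred - y[split:]))
print(f"mean absolute error on held-out days: {mae:.1f} µg/m³")
```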

  7. Data-Based Predictive Control with Multirate Prediction Step

    NASA Technical Reports Server (NTRS)

    Barlow, Jonathan S.

    2010-01-01

    Data-based predictive control is an emerging control method that stems from Model Predictive Control (MPC). MPC computes current control action based on a prediction of the system output a number of time steps into the future and is generally derived from a known model of the system. Data-based predictive control has the advantage of deriving predictive models and controller gains from input-output data. Thus, a controller can be designed from the outputs of complex simulation code or a physical system where no explicit model exists. If the output data happens to be corrupted by periodic disturbances, the designed controller will also have the built-in ability to reject these disturbances without the need to know them. When data-based predictive control is implemented online, it becomes a version of adaptive control. One challenge of MPC is computational requirements increasing with prediction horizon length. This paper develops a closed-loop dynamic output feedback controller that minimizes a multi-step-ahead receding-horizon cost function with multirate prediction step. One result is a reduced influence of prediction horizon and the number of system outputs on the computational requirements of the controller. Another result is an emphasis on portions of the prediction window that are sampled more frequently. A third result is the ability to include more outputs in the feedback path than in the cost function.

  8. Soil-pipe interaction modeling for pipe behavior prediction with super learning based methods

    NASA Astrophysics Data System (ADS)

    Shi, Fang; Peng, Xiang; Liu, Huan; Hu, Yafei; Liu, Zheng; Li, Eric

    2018-03-01

    Underground pipelines are subject to severe distress from the surrounding expansive soil. To investigate the structural response of water mains to varying soil movements, field data, including pipe wall strains, in situ soil water content, soil pressure and temperature, were collected. Research on monitoring data analysis has been reported, but the relationship between soil properties and pipe deformation has not been well interpreted. To characterize this relationship, this paper presents a super-learning-based approach combining feature selection algorithms to predict the structural behavior of water mains in different soil environments. Furthermore, an automatic variable selection method, i.e., the recursive feature elimination algorithm, was used to identify the critical predictors contributing to the pipe deformations. To investigate the adaptability of super learning to different predictive models, this research applied super-learning-based methods to three different datasets. The predictive performance was evaluated by R-squared, root mean square error and mean absolute error. Based on this evaluation, the superiority of super learning was validated and demonstrated by accurately predicting three types of pipe deformation. In addition, a comprehensive understanding of the working environments of water mains becomes possible.
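    A minimal sketch of the two ingredients named above, recursive feature elimination followed by a stacked ensemble, using scikit-learn's StackingRegressor as a stand-in for the authors' super learner; the data and feature set are synthetic.

```python
# RFE feature selection + stacked ensemble regression (synthetic data).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical predictors standing in for soil water content, pressure, temperature, etc.
X, y = make_regression(n_samples=300, n_features=10, n_informative=4,
                       noise=5.0, random_state=0)

stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
                ("ridge", Ridge(alpha=1.0))],
    final_estimator=LinearRegression(),   # meta-learner combining base predictions
)

# Keep the 4 most informative predictors, then fit the stacked ensemble.
model = make_pipeline(RFE(estimator=Ridge(alpha=1.0), n_features_to_select=4), stack)
r2 = cross_val_score(model, X, y, scoring="r2", cv=5).mean()
print(f"cross-validated R^2 of RFE + stacked ensemble: {r2:.3f}")
```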

  9. Ligand and structure-based methodologies for the prediction of the activity of G protein-coupled receptor ligands

    NASA Astrophysics Data System (ADS)

    Costanzi, Stefano; Tikhonova, Irina G.; Harden, T. Kendall; Jacobson, Kenneth A.

    2009-11-01

    Accurate in silico models for the quantitative prediction of the activity of G protein-coupled receptor (GPCR) ligands would greatly facilitate the process of drug discovery and development. Several methodologies have been developed based on the properties of the ligands, the direct study of the receptor-ligand interactions, or a combination of both approaches. Ligand-based three-dimensional quantitative structure-activity relationships (3D-QSAR) techniques, not requiring knowledge of the receptor structure, have been historically the first to be applied to the prediction of the activity of GPCR ligands. They are generally endowed with robustness and good ranking ability; however they are highly dependent on training sets. Structure-based techniques generally do not provide the level of accuracy necessary to yield meaningful rankings when applied to GPCR homology models. However, they are essentially independent from training sets and have a sufficient level of accuracy to allow an effective discrimination between binders and nonbinders, thus qualifying as viable lead discovery tools. The combination of ligand and structure-based methodologies in the form of receptor-based 3D-QSAR and ligand and structure-based consensus models results in robust and accurate quantitative predictions. The contribution of the structure-based component to these combined approaches is expected to become more substantial and effective in the future, as more sophisticated scoring functions are developed and more detailed structural information on GPCRs is gathered.

  10. Light Field Imaging Based Accurate Image Specular Highlight Removal

    PubMed Central

    Wang, Haoqian; Xu, Chenxue; Wang, Xingzheng; Zhang, Yongbing; Peng, Bo

    2016-01-01

    Specular reflection removal is indispensable to many computer vision tasks. However, most existing methods fail or degrade in complex real scenarios because of their individual drawbacks. Benefiting from light field imaging technology, this paper proposes a novel and accurate approach to remove specularity and improve image quality. We first capture images with specularity using a light field camera (Lytro ILLUM). After accurately estimating the image depth, a simple and concise threshold strategy is adopted to cluster the specular pixels into "unsaturated" and "saturated" categories. Finally, a color variance analysis of multiple views and a local color refinement are individually conducted on the two categories to recover diffuse color information. Experimental evaluation by comparison with existing methods, based on our light field dataset together with the Stanford light field archive, verifies the effectiveness of our proposed algorithm. PMID:27253083

  11. Improved regulatory element prediction based on tissue-specific local epigenomic signatures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    He, Yupeng; Gorkin, David U.; Dickel, Diane E.

    Accurate enhancer identification is critical for understanding the spatiotemporal transcriptional regulation during development as well as the functional impact of disease-related noncoding genetic variants. Computational methods have been developed to predict the genomic locations of active enhancers based on histone modifications, but the accuracy and resolution of these methods remain limited. Here, we present an algorithm, regulatory element prediction based on tissue-specific local epigenetic marks (REPTILE), which integrates histone modification and whole-genome cytosine DNA methylation profiles to identify the precise location of enhancers. We tested the ability of REPTILE to identify enhancers previously validated in reporter assays. Compared with existing methods, REPTILE shows consistently superior performance across diverse cell and tissue types, and the enhancer locations are significantly more refined. We show that, by incorporating base-resolution methylation data, REPTILE greatly improves upon current methods for annotation of enhancers across a variety of cell and tissue types.

  12. Improved regulatory element prediction based on tissue-specific local epigenomic signatures

    PubMed Central

    He, Yupeng; Gorkin, David U.; Dickel, Diane E.; Nery, Joseph R.; Castanon, Rosa G.; Lee, Ah Young; Shen, Yin; Visel, Axel; Pennacchio, Len A.; Ren, Bing; Ecker, Joseph R.

    2017-01-01

    Accurate enhancer identification is critical for understanding the spatiotemporal transcriptional regulation during development as well as the functional impact of disease-related noncoding genetic variants. Computational methods have been developed to predict the genomic locations of active enhancers based on histone modifications, but the accuracy and resolution of these methods remain limited. Here, we present an algorithm, regulatory element prediction based on tissue-specific local epigenetic marks (REPTILE), which integrates histone modification and whole-genome cytosine DNA methylation profiles to identify the precise location of enhancers. We tested the ability of REPTILE to identify enhancers previously validated in reporter assays. Compared with existing methods, REPTILE shows consistently superior performance across diverse cell and tissue types, and the enhancer locations are significantly more refined. We show that, by incorporating base-resolution methylation data, REPTILE greatly improves upon current methods for annotation of enhancers across a variety of cell and tissue types. REPTILE is available at https://github.com/yupenghe/REPTILE/. PMID:28193886

  13. Improved regulatory element prediction based on tissue-specific local epigenomic signatures

    DOE PAGES

    He, Yupeng; Gorkin, David U.; Dickel, Diane E.; ...

    2017-02-13

    Accurate enhancer identification is critical for understanding the spatiotemporal transcriptional regulation during development as well as the functional impact of disease-related noncoding genetic variants. Computational methods have been developed to predict the genomic locations of active enhancers based on histone modifications, but the accuracy and resolution of these methods remain limited. Here, we present an algorithm, regulatory element prediction based on tissue-specific local epigenetic marks (REPTILE), which integrates histone modification and whole-genome cytosine DNA methylation profiles to identify the precise location of enhancers. We tested the ability of REPTILE to identify enhancers previously validated in reporter assays. Compared with existing methods, REPTILE shows consistently superior performance across diverse cell and tissue types, and the enhancer locations are significantly more refined. We show that, by incorporating base-resolution methylation data, REPTILE greatly improves upon current methods for annotation of enhancers across a variety of cell and tissue types.

  14. Does mesenteric venous imaging assessment accurately predict pathologic invasion in localized pancreatic ductal adenocarcinoma?

    PubMed

    Clanton, Jesse; Oh, Stephen; Kaplan, Stephen J; Johnson, Emily; Ross, Andrew; Kozarek, Richard; Alseidi, Adnan; Biehl, Thomas; Picozzi, Vincent J; Helton, William S; Coy, David; Dorer, Russell; Rocha, Flavio G

    2018-05-09

    Accurate prediction of mesenteric venous involvement in pancreatic ductal adenocarcinoma (PDAC) is necessary for adequate staging and treatment. A retrospective cohort study was conducted in PDAC patients at a single institution. All patients with resected PDAC and staging CT and EUS between 2003 and 2014 were included and sub-divided into "upfront resected" and "neoadjuvant chemotherapy (NAC)" groups. Independent imaging re-review was correlated to venous resection and venous invasion. Sensitivity, specificity, positive and negative predictive values were then calculated. A total of 109 patients underwent analysis: 60 received upfront resection and 49 received NAC. Venous resection (30%) and vein invasion (13%) were less common in patients resected upfront than in those who received NAC (53% and 16%, respectively). Both CT and EUS had poor sensitivity (14-44%) but high specificity (75-95%) for detecting venous resection and vein invasion in patients resected upfront, whereas sensitivity was high (84-100%) and specificity was low (27-44%) after NAC. Preoperative CT and EUS in PDAC have similar efficacy but different predictive capacity in assessing mesenteric venous involvement depending on whether patients are resected upfront or received NAC. Both modalities appear to significantly overestimate true vascular involvement and should be interpreted in the appropriate clinical context. Copyright © 2018 International Hepato-Pancreato-Biliary Association Inc. Published by Elsevier Ltd. All rights reserved.
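
    For reference, the reported diagnostic statistics (sensitivity, specificity, positive and negative predictive values) follow directly from a 2x2 confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)   # fraction of true venous invasion flagged by imaging
    specificity = tn / (tn + fp)   # fraction of invasion-free cases read as negative
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return sensitivity, specificity, ppv, npv

# Hypothetical counts for illustration only (not the study's raw data).
print(diagnostic_metrics(tp=5, fp=6, fn=8, tn=41))
```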

  15. Accurate load prediction by BEM with airfoil data from 3D RANS simulations

    NASA Astrophysics Data System (ADS)

    Schneider, Marc S.; Nitzsche, Jens; Hennings, Holger

    2016-09-01

    In this paper, two methods for the extraction of airfoil coefficients from 3D CFD simulations of a wind turbine rotor are investigated, and these coefficients are used to improve the load prediction of a BEM code. The coefficients are extracted from a number of steady RANS simulations, using either averaging of velocities in annular sections, or an inverse BEM approach for determination of the induction factors in the rotor plane. It is shown that these 3D rotor polars are able to capture the rotational augmentation at the inner part of the blade as well as the load reduction by 3D effects close to the blade tip. They are used as input to a simple BEM code and the results of this BEM with 3D rotor polars are compared to the predictions of BEM with 2D airfoil coefficients plus common empirical corrections for stall delay and tip loss. While BEM with 2D airfoil coefficients produces a very different radial distribution of loads than the RANS simulation, the BEM with 3D rotor polars manages to reproduce the loads from RANS very accurately for a variety of load cases, as long as the blade pitch angle is not too different from the cases from which the polars were extracted.

  16. Accurate secondary structure prediction and fold recognition for circular dichroism spectroscopy

    PubMed Central

    Micsonai, András; Wien, Frank; Kernya, Linda; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2015-01-01

    Circular dichroism (CD) spectroscopy is a widely used technique for the study of protein structure. Numerous algorithms have been developed for the estimation of the secondary structure composition from the CD spectra. These methods often fail to provide acceptable results on α/β-mixed or β-structure–rich proteins. The problem arises from the spectral diversity of β-structures, which has hitherto been considered as an intrinsic limitation of the technique. The predictions are less reliable for proteins of unusual β-structures such as membrane proteins, protein aggregates, and amyloid fibrils. Here, we show that the parallel/antiparallel orientation and the twisting of the β-sheets account for the observed spectral diversity. We have developed a method called β-structure selection (BeStSel) for the secondary structure estimation that takes into account the twist of β-structures. This method can reliably distinguish parallel and antiparallel β-sheets and accurately estimates the secondary structure for a broad range of proteins. Moreover, the secondary structure components applied by the method are characteristic to the protein fold, and thus the fold can be predicted to the level of topology in the CATH classification from a single CD spectrum. By constructing a web server, we offer a general tool for a quick and reliable structure analysis using conventional CD or synchrotron radiation CD (SRCD) spectroscopy for the protein science research community. The method is especially useful when X-ray or NMR techniques fail. Using BeStSel on data collected by SRCD spectroscopy, we investigated the structure of amyloid fibrils of various disease-related proteins and peptides. PMID:26038575

  17. Accurate interatomic force fields via machine learning with covariant kernels

    NASA Astrophysics Data System (ADS)

    Glielmo, Aldo; Sollich, Peter; De Vita, Alessandro

    2017-06-01

    We present a novel scheme to accurately predict atomic forces as vector quantities, rather than sets of scalar components, by Gaussian process (GP) regression. This is based on matrix-valued kernel functions, on which we impose the requirements that the predicted force rotates with the target configuration and is independent of any rotations applied to the configuration database entries. We show that such covariant GP kernels can be obtained by integration over the elements of the rotation group SO(d) for the relevant dimensionality d. Remarkably, in specific cases the integration can be carried out analytically and yields a conservative force field that can be recast into a pair interaction form. Finally, we show that restricting the integration to a summation over the elements of a finite point group relevant to the target system is sufficient to recover an accurate GP. The accuracy of our kernels in predicting quantum-mechanical forces in real materials is investigated by tests on pure and defective Ni, Fe, and Si crystalline systems.
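
    A toy version of the finite-point-group construction can be sketched as follows: a matrix-valued kernel is built by summing a rotation-invariant scalar base kernel over the elements of a small rotation group, and the covariance property is checked numerically. The 2D setup, the C4 group and the Gaussian base kernel are simplifying assumptions, not the paper's implementation.

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Finite point group used for the symmetrization (here C4: rotations by 90 degrees).
GROUP = [rotation(k * np.pi / 2) for k in range(4)]

def base_kernel(x1, x2, sigma=1.0):
    """Rotation-invariant scalar base kernel on 2D configurations."""
    return np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma ** 2))

def covariant_kernel(x1, x2):
    """Matrix-valued kernel obtained by summing rotated base kernels over the group,
    so that predicted force vectors rotate with the input configuration
    (toy 2D version of the SO(d)-integrated / point-group-summed kernel)."""
    return sum(R * base_kernel(x1, R @ x2) for R in GROUP)

# Numerical check of covariance: K(Q x1, x2) == Q K(x1, x2) for Q in the group.
x1, x2 = np.array([0.8, 0.3]), np.array([-0.2, 0.6])
Q = GROUP[1]
print(np.allclose(covariant_kernel(Q @ x1, x2), Q @ covariant_kernel(x1, x2)))
```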

  18. A variable capacitance based modeling and power capability predicting method for ultracapacitor

    NASA Astrophysics Data System (ADS)

    Liu, Chang; Wang, Yujie; Chen, Zonghai; Ling, Qiang

    2018-01-01

    Methods for accurate modeling and power capability prediction of ultracapacitors are of great significance in the management and application of lithium-ion battery/ultracapacitor hybrid energy storage systems. To overcome the simulation error arising from a constant-capacitance model, an improved ultracapacitor model based on variable capacitance is proposed, in which the main capacitance varies with voltage according to a piecewise linear function. A novel state-of-charge calculation approach is developed accordingly. After that, a multi-constraint power capability prediction is developed for the ultracapacitor, in which a Kalman-filter-based state observer is designed to track the ultracapacitor's real-time behavior. Finally, experimental results verify the proposed methods. The accuracy of the proposed model is verified by terminal-voltage simulation results at different temperatures, and the effectiveness of the designed observer is demonstrated under various test conditions. Additionally, the power capability prediction results for different time scales and temperatures are compared to study their effects on the ultracapacitor's power capability.
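
    A minimal sketch of the variable-capacitance idea, assuming a hypothetical piecewise-linear C(V) and a simple coulomb-counting voltage update with a series-resistance drop; all parameter values are illustrative, and the Kalman-filter observer and multi-constraint power prediction are not reproduced.

```python
import numpy as np

def capacitance(v, c0=150.0, k=25.0, v_knee=1.5):
    """Piecewise-linear main capacitance C(v) in farads (hypothetical parameters):
    constant below a knee voltage, then increasing linearly with voltage."""
    return c0 if v < v_knee else c0 + k * (v - v_knee)

def simulate_terminal_voltage(i_load, dt=1.0, v0=2.0, r_series=0.02):
    """Coulomb-counting style update: dQ = -i*dt (i > 0 means discharging),
    dV = dQ / C(V); the measurable terminal voltage includes the IR drop."""
    v = v0
    terminal = []
    for i in i_load:
        v -= i * dt / capacitance(v)
        terminal.append(v - i * r_series)
    return np.array(terminal)

# One minute of 2 A discharge followed by one minute of 1.5 A recharge (hypothetical).
profile = np.concatenate([np.full(60, 2.0), np.full(60, -1.5)])
print(round(simulate_terminal_voltage(profile)[-1], 3))
```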

  19. DFT-based method for more accurate adsorption energies: An adaptive sum of energies from RPBE and vdW density functionals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hensley, Alyssa J. R.; Ghale, Kushal; Rieg, Carolin

    In recent years, the popularity of density functional theory with periodic boundary conditions (DFT) has surged for the design and optimization of functional materials. However, no single DFT exchange–correlation functional currently available gives accurate adsorption energies on transition metals both when bonding to the surface is dominated by strong covalent or ionic bonding and when it has strong contributions from van der Waals interactions (i.e., dispersion forces). Here we present a new, simple method for accurately predicting adsorption energies on transition-metal surfaces based on DFT calculations, using an adaptively weighted sum of energies from RPBE and optB86b-vdW (or optB88-vdW) density functionals. This method has been benchmarked against a set of 39 reliable experimental energies for adsorption reactions. Our results show that this method has a mean absolute error and root mean squared error relative to experiments of 13.4 and 19.3 kJ/mol, respectively, compared to 20.4 and 26.4 kJ/mol for the BEEF-vdW functional. For systems with large van der Waals contributions, this method decreases these errors to 11.6 and 17.5 kJ/mol. Furthermore, this method provides predictions of adsorption energies both for processes dominated by strong covalent or ionic bonding and for those dominated by dispersion forces that are more accurate than those of any current standard DFT functional alone.

  20. DFT-based method for more accurate adsorption energies: An adaptive sum of energies from RPBE and vdW density functionals

    DOE PAGES

    Hensley, Alyssa J. R.; Ghale, Kushal; Rieg, Carolin; ...

    2017-01-26

    In recent years, the popularity of density functional theory with periodic boundary conditions (DFT) has surged for the design and optimization of functional materials. However, no single DFT exchange–correlation functional currently available gives accurate adsorption energies on transition metals both when bonding to the surface is dominated by strong covalent or ionic bonding and when it has strong contributions from van der Waals interactions (i.e., dispersion forces). Here we present a new, simple method for accurately predicting adsorption energies on transition-metal surfaces based on DFT calculations, using an adaptively weighted sum of energies from RPBE and optB86b-vdW (or optB88-vdW) density functionals. This method has been benchmarked against a set of 39 reliable experimental energies for adsorption reactions. Our results show that this method has a mean absolute error and root mean squared error relative to experiments of 13.4 and 19.3 kJ/mol, respectively, compared to 20.4 and 26.4 kJ/mol for the BEEF-vdW functional. For systems with large van der Waals contributions, this method decreases these errors to 11.6 and 17.5 kJ/mol. Furthermore, this method provides predictions of adsorption energies both for processes dominated by strong covalent or ionic bonding and for those dominated by dispersion forces that are more accurate than those of any current standard DFT functional alone.
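
    The abstract does not state how the adaptive weight is computed, so the sketch below uses a hypothetical rule (weighting the vdW-inclusive result by the share of binding it attributes to dispersion) purely to illustrate the idea of an adaptively weighted sum of adsorption energies from two functionals. The numbers are also hypothetical.

```python
def adaptive_adsorption_energy(e_rpbe, e_vdw_dft):
    """Adaptively weighted sum of adsorption energies (kJ/mol) from an RPBE and a
    vdW-inclusive functional (e.g. optB86b-vdW).

    The weighting rule below is a hypothetical stand-in: the vdW-inclusive result
    is weighted by the fraction of the binding it attributes to dispersion,
    estimated as the extra binding it recovers relative to RPBE."""
    dispersion_share = max(0.0, e_rpbe - e_vdw_dft)            # extra binding from vdW
    w_vdw = min(1.0, dispersion_share / max(abs(e_vdw_dft), 1e-9))
    return (1.0 - w_vdw) * e_rpbe + w_vdw * e_vdw_dft

# Adsorption energies are negative for bound states (hypothetical numbers).
print(adaptive_adsorption_energy(e_rpbe=-35.0, e_vdw_dft=-60.0))
```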

  1. Accurate prediction of bacterial type IV secreted effectors using amino acid composition and PSSM profiles.

    PubMed

    Zou, Lingyun; Nan, Chonghan; Hu, Fuquan

    2013-12-15

    Various human pathogens secrete effector proteins into host cells via the type IV secretion system (T4SS). These proteins play important roles in the interaction between bacteria and hosts. Computational methods for T4SS effector prediction have been developed for screening experimental targets in several isolated bacterial species; however, widely applicable prediction approaches are still unavailable. In this work, four types of distinctive features, namely, amino acid composition, dipeptide composition, position-specific scoring matrix composition and auto covariance transformation of the position-specific scoring matrix, were calculated from primary sequences. A classifier, T4EffPred, was developed using the support vector machine with these features and their different combinations for effector prediction. Various theoretical tests were performed on a newly established dataset, and the results were measured with four indexes. We demonstrated that T4EffPred can discriminate IVA and IVB effectors in benchmark datasets with positive rates of 76.7% and 89.7%, respectively. The overall accuracy of 95.9% shows that the present method is accurate for distinguishing the T4SS effector in unidentified sequences. A classifier ensemble was designed to synthesize all single classifiers. Notable performance improvement was observed using this ensemble system in benchmark tests. To demonstrate the model's application, a genome-scale prediction of effectors was performed in Bartonella henselae, an important zoonotic pathogen. A number of putative candidates were distinguished. A web server implementing the prediction method and the source code are both available at http://bioinfo.tmmu.edu.cn/T4EffPred.
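
    The amino acid composition feature used by T4EffPred is straightforward to compute; the sketch below pairs it with a generic scikit-learn support vector classifier on toy sequences. The PSSM-based features and the actual training data are not reproduced, and the example sequences and labels are hypothetical.

```python
from collections import Counter
from sklearn.svm import SVC

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aa_composition(seq):
    """20-dimensional amino acid composition feature vector."""
    counts = Counter(seq.upper())
    total = max(len(seq), 1)
    return [counts.get(aa, 0) / total for aa in AMINO_ACIDS]

# Toy training data: hypothetical effector (1) and non-effector (0) sequences.
seqs = ["MKKLLDEEAK", "MSTEEDDDKL", "MAGVVLLIAV", "MLLIVAGAGT"]
labels = [1, 1, 0, 0]
X = [aa_composition(s) for s in seqs]

clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict([aa_composition("MKEDEDLLKA")]))
```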

  2. Bringing modeling to the masses: A web based system to predict potential species distributions

    USGS Publications Warehouse

    Graham, Jim; Newman, Greg; Kumar, Sunil; Jarnevich, Catherine S.; Young, Nick; Crall, Alycia W.; Stohlgren, Thomas J.; Evangelista, Paul

    2010-01-01

    Predicting current and potential species distributions and abundance is critical for managing invasive species, preserving threatened and endangered species, and conserving native species and habitats. Accurate predictive models are needed at local, regional, and national scales to guide field surveys, improve monitoring, and set priorities for conservation and restoration. Modeling capabilities, however, are often limited by access to software and environmental data required for predictions. To address these needs, we built a comprehensive web-based system that: (1) maintains a large database of field data; (2) provides access to field data and a wealth of environmental data; (3) accesses values in rasters representing environmental characteristics; (4) runs statistical spatial models; and (5) creates maps that predict the potential species distribution. The system is available online at www.niiss.org, and provides web-based tools for stakeholders to create potential species distribution models and maps under current and future climate scenarios.

  3. Towards accurate cosmological predictions for rapidly oscillating scalar fields as dark matter

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ureña-López, L. Arturo; Gonzalez-Morales, Alma X., E-mail: lurena@ugto.mx, E-mail: alma.gonzalez@fisica.ugto.mx

    2016-07-01

    As we are entering the era of precision cosmology, it is necessary to count on accurate cosmological predictions from any proposed model of dark matter. In this paper we present a novel approach to the cosmological evolution of scalar fields that eases their analytic and numerical analysis at the background and at the linear order of perturbations. The new method makes use of appropriate angular variables that simplify the writing of the equations of motion, and which also show that the usual field variables play a secondary role in the cosmological dynamics. We apply the method to a scalar field endowed with a quadratic potential and revisit its properties as dark matter. Some of the results known in the literature are recovered, and a better understanding of the physical properties of the model is provided. It is confirmed that there exists a Jeans wavenumber k_J, directly related to the suppression of linear perturbations at wavenumbers k > k_J, and which is verified to be k_J = a√(mH). We also discuss some semi-analytical results that are well satisfied by the full numerical solutions obtained from an amended version of the CMB code CLASS. Finally we draw some of the implications that this new treatment of the equations of motion may have in the prediction of cosmological observables from scalar field dark matter models.

  4. Adaptive Granulation-Based Prediction for Energy System of Steel Industry.

    PubMed

    Wang, Tianyu; Han, Zhongyang; Zhao, Jun; Wang, Wei

    2018-01-01

    The flow variation tendency of byproduct gas plays a crucial role in energy scheduling in the steel industry. An accurate prediction of its future trends will be significantly beneficial for the economic profits of a steel enterprise. In this paper, a long-term prediction model for the energy system is proposed by providing an adaptive granulation-based method that considers the production semantics involved in the fluctuation tendency of the energy data and partitions them into a series of information granules. To fully reflect the corresponding data characteristics of the formed unequal-length temporal granules, a 3-D feature space consisting of the timespan, the amplitude and the linetype is designed as linguistic descriptors. In particular, a collaborative-conditional fuzzy clustering method is proposed to granularize the tendency-based feature descriptors and specifically measure the amplitude variation of industrial data, which plays a dominant role in the feature space. To quantify the performance of the proposed method, a series of real-world industrial data coming from the energy data center of a steel plant is employed to conduct the comparative experiments. The experimental results demonstrate that the proposed method successfully satisfies the requirements of practically viable prediction.

  5. EP-DNN: A Deep Neural Network-Based Global Enhancer Prediction Algorithm.

    PubMed

    Kim, Seong Gon; Harwani, Mrudul; Grama, Ananth; Chaterji, Somali

    2016-12-08

    We present EP-DNN, a protocol for predicting enhancers based on chromatin features, in different cell types. Specifically, we use a deep neural network (DNN)-based architecture to extract enhancer signatures in a representative human embryonic stem cell type (H1) and a differentiated lung cell type (IMR90). We train EP-DNN using p300 binding sites, as enhancers, and TSS and random non-DHS sites, as non-enhancers. We perform same-cell and cross-cell predictions to quantify the validation rate and compare against two state-of-the-art methods, DEEP-ENCODE and RFECS. We find that EP-DNN has superior accuracy with a validation rate of 91.6%, relative to 85.3% for DEEP-ENCODE and 85.5% for RFECS, for a given number of enhancer predictions and also scales better for a larger number of enhancer predictions. Moreover, our H1 → IMR90 predictions turn out to be more accurate than IMR90 → IMR90, potentially because H1 exhibits a richer signature set and our EP-DNN model is expressive enough to extract these subtleties. Our work shows how to leverage the full expressivity of deep learning models, using multiple hidden layers, while avoiding overfitting on the training data. We also lay the foundation for exploration of cross-cell enhancer predictions, potentially reducing the need for expensive experimentation.

  6. EP-DNN: A Deep Neural Network-Based Global Enhancer Prediction Algorithm

    NASA Astrophysics Data System (ADS)

    Kim, Seong Gon; Harwani, Mrudul; Grama, Ananth; Chaterji, Somali

    2016-12-01

    We present EP-DNN, a protocol for predicting enhancers based on chromatin features, in different cell types. Specifically, we use a deep neural network (DNN)-based architecture to extract enhancer signatures in a representative human embryonic stem cell type (H1) and a differentiated lung cell type (IMR90). We train EP-DNN using p300 binding sites, as enhancers, and TSS and random non-DHS sites, as non-enhancers. We perform same-cell and cross-cell predictions to quantify the validation rate and compare against two state-of-the-art methods, DEEP-ENCODE and RFECS. We find that EP-DNN has superior accuracy with a validation rate of 91.6%, relative to 85.3% for DEEP-ENCODE and 85.5% for RFECS, for a given number of enhancer predictions and also scales better for a larger number of enhancer predictions. Moreover, our H1 → IMR90 predictions turn out to be more accurate than IMR90 → IMR90, potentially because H1 exhibits a richer signature set and our EP-DNN model is expressive enough to extract these subtleties. Our work shows how to leverage the full expressivity of deep learning models, using multiple hidden layers, while avoiding overfitting on the training data. We also lay the foundation for exploration of cross-cell enhancer predictions, potentially reducing the need for expensive experimentation.

  7. Accurate prediction of cellular co-translational folding indicates proteins can switch from post- to co-translational folding

    PubMed Central

    Nissley, Daniel A.; Sharma, Ajeet K.; Ahmed, Nabeel; Friedrich, Ulrike A.; Kramer, Günter; Bukau, Bernd; O'Brien, Edward P.

    2016-01-01

    The rates at which domains fold and codons are translated are important factors in determining whether a nascent protein will co-translationally fold and function or misfold and malfunction. Here we develop a chemical kinetic model that calculates a protein domain's co-translational folding curve during synthesis using only the domain's bulk folding and unfolding rates and codon translation rates. We show that this model accurately predicts the course of co-translational folding measured in vivo for four different protein molecules. We then make predictions for a number of different proteins in yeast and find that synonymous codon substitutions, which change translation-elongation rates, can switch some protein domains from folding post-translationally to folding co-translationally—a result consistent with previous experimental studies. Our approach explains essential features of co-translational folding curves and predicts how varying the translation rate at different codon positions along a transcript's coding sequence affects this self-assembly process. PMID:26887592
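
    The two-state chemical-kinetic picture described above can be sketched directly: during the dwell time of each codon, the folded probability relaxes toward its equilibrium value with rate k_f + k_u, once the domain has emerged from the ribosome. The rates, codon speeds and emergence point below are hypothetical illustrations, not the paper's parameters.

```python
import numpy as np

def cotranslational_folding_curve(codon_rates, k_f, k_u, domain_emerges_at):
    """Probability that the domain is folded when each codon finishes translating.

    codon_rates       : per-codon translation rates (1/s)
    k_f, k_u          : bulk folding / unfolding rates of the domain (1/s)
    domain_emerges_at : codon index after which the domain can start folding
    All numbers here are hypothetical illustrations of two-state kinetics."""
    p_folded = 0.0
    p_eq = k_f / (k_f + k_u)          # equilibrium folded fraction
    k_tot = k_f + k_u
    curve = []
    for i, rate in enumerate(codon_rates):
        dwell = 1.0 / rate            # time spent translating this codon
        if i >= domain_emerges_at:    # folding allowed only once the domain has emerged
            p_folded = p_eq + (p_folded - p_eq) * np.exp(-k_tot * dwell)
        curve.append(p_folded)
    return np.array(curve)

rates = np.full(120, 10.0)            # 120 codons translated at 10 codons/s
rates[60:70] = 2.0                    # a slow-codon stretch
print(cotranslational_folding_curve(rates, k_f=0.5, k_u=0.05, domain_emerges_at=50)[-1])
```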

  8. A data-driven SVR model for long-term runoff prediction and uncertainty analysis based on the Bayesian framework

    NASA Astrophysics Data System (ADS)

    Liang, Zhongmin; Li, Yujie; Hu, Yiming; Li, Binquan; Wang, Jun

    2017-06-01

    Accurate and reliable long-term forecasting plays an important role in water resources management and utilization. In this paper, a hybrid model called SVR-HUP is presented to predict long-term runoff and quantify the prediction uncertainty. The model is created based on three steps. First, appropriate predictors are selected according to the correlations between meteorological factors and runoff. Second, a support vector regression (SVR) model is structured and optimized based on the LibSVM toolbox and a genetic algorithm. Finally, using forecasted and observed runoff, a hydrologic uncertainty processor (HUP) based on a Bayesian framework is used to estimate the posterior probability distribution of the simulated values, and the associated uncertainty of prediction is quantitatively analyzed. Six precision evaluation indexes, including the correlation coefficient (CC), relative root mean square error (RRMSE), relative error (RE), mean absolute percentage error (MAPE), Nash-Sutcliffe efficiency (NSE), and qualification rate (QR), are used to measure the prediction accuracy. As a case study, the proposed approach is applied in the Han River basin, South Central China. Three types of SVR models are established to forecast the monthly, flood season and annual runoff volumes. The results indicate that SVR yields satisfactory accuracy and reliability at all three scales. In addition, the results suggest that the HUP can not only quantify the uncertainty of prediction based on a confidence interval but also provide a more accurate single-value prediction than the initial SVR forecasting result. Thus, the SVR-HUP model provides an alternative method for long-term runoff forecasting.
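
    A minimal sketch of the SVR step and the Nash-Sutcliffe efficiency (NSE) index on synthetic data; the predictor selection, LibSVM/genetic-algorithm tuning and the Bayesian HUP stage are not reproduced, and the data below are hypothetical.

```python
import numpy as np
from sklearn.svm import SVR

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Hypothetical predictors (e.g. lagged precipitation and temperature indices) and runoff.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 3))
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.2, size=120)

model = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(X[:100], y[:100])
y_hat = model.predict(X[100:])
print("NSE on hold-out:", round(nash_sutcliffe(y[100:], y_hat), 3))
```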

  9. A microRNA-based prediction model for lymph node metastasis in hepatocellular carcinoma.

    PubMed

    Zhang, Li; Xiang, Zuo-Lin; Zeng, Zhao-Chong; Fan, Jia; Tang, Zhao-You; Zhao, Xiao-Mei

    2016-01-19

    We developed an efficient microRNA (miRNA) model that could predict the risk of lymph node metastasis (LNM) in hepatocellular carcinoma (HCC). We first evaluated a training cohort of 192 HCC patients after hepatectomy and found five LNM-associated predictive factors: vascular invasion, Barcelona Clinic Liver Cancer stage, miR-145, miR-31, and miR-92a. The five statistically independent factors were used to develop a predictive model. The predictive value of the miRNA-based model was confirmed in a validation cohort of 209 consecutive HCC patients. The prediction model was scored for LNM risk from 0 to 8. The cutoff value 4 was used to distinguish high-risk and low-risk groups. The model sensitivity and specificity were 69.6% and 80.2%, respectively, during 5 years in the validation cohort, and the area under the curve (AUC) for the miRNA-based prognostic model was 0.860. The 5-year positive and negative predictive values of the model in the validation cohort were 30.3% and 95.5%, respectively. Cox regression analysis revealed that the LNM hazard ratio of the high-risk versus low-risk groups was 11.751 (95% CI, 5.110-27.021; P < 0.001) in the validation cohort. In conclusion, the miRNA-based model is reliable and accurate for the early prediction of LNM in patients with HCC.

  10. Comparison of numerical weather prediction based deterministic and probabilistic wind resource assessment methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jie; Draxl, Caroline; Hopson, Thomas

    Numerical weather prediction (NWP) models have been widely used for wind resource assessment. Model runs with higher spatial resolution are generally more accurate, yet extremely computationally expensive. An alternative approach is to use data generated by a low resolution NWP model, in conjunction with statistical methods. In order to analyze the accuracy and computational efficiency of different types of NWP-based wind resource assessment methods, this paper performs a comparison of three deterministic and probabilistic NWP-based wind resource assessment methodologies: (i) a coarse resolution (0.5 degrees x 0.67 degrees) global reanalysis data set, the Modern-Era Retrospective Analysis for Research and Applications (MERRA); (ii) an analog ensemble methodology based on the MERRA, which provides both deterministic and probabilistic predictions; and (iii) a fine resolution (2-km) NWP data set, the Wind Integration National Dataset (WIND) Toolkit, based on the Weather Research and Forecasting model. Results show that: (i) as expected, the analog ensemble and WIND Toolkit perform significantly better than MERRA, confirming their ability to downscale coarse estimates; (ii) the analog ensemble provides the best estimate of the multi-year wind distribution at seven of the nine sites, while the WIND Toolkit is the best at one site; (iii) the WIND Toolkit is more accurate in estimating the distribution of hourly wind speed differences, which characterizes the wind variability, at five of the available sites, with the analog ensemble being best at the remaining four locations; and (iv) the analog ensemble computational cost is negligible, whereas the WIND Toolkit requires large computational resources. Future efforts could focus on the combination of the analog ensemble with intermediate resolution (e.g., 10-15 km) NWP estimates, to considerably reduce the computational burden, while providing accurate deterministic estimates and reliable probabilistic assessments.
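
    The analog-ensemble idea can be sketched compactly: for a new coarse-model forecast, find the k most similar historical forecasts and use their paired observations as an ensemble. The feature choice, distance metric and data below are hypothetical simplifications of the published methodology.

```python
import numpy as np

def analog_ensemble(target_forecast, past_forecasts, past_observations, k=10):
    """Return the k observations whose paired coarse forecasts most resemble the
    target forecast (Euclidean distance in forecast-feature space)."""
    dists = np.linalg.norm(past_forecasts - target_forecast, axis=1)
    analog_idx = np.argsort(dists)[:k]
    members = past_observations[analog_idx]
    return members.mean(), members            # deterministic estimate + ensemble

# Hypothetical wind-speed archive: features here are (forecast speed, direction index).
rng = np.random.default_rng(2)
past_fc = rng.uniform(0, 15, size=(5000, 2))
past_obs = past_fc[:, 0] + rng.normal(scale=1.0, size=5000)   # obs ~ forecast + error
mean_est, ensemble = analog_ensemble(np.array([8.0, 3.0]), past_fc, past_obs)
print(round(mean_est, 2), ensemble.shape)
```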

  11. Identifying and Tracking Pedestrians Based on Sensor Fusion and Motion Stability Predictions

    PubMed Central

    Musleh, Basam; García, Fernando; Otamendi, Javier; Armingol, José Mª; de la Escalera, Arturo

    2010-01-01

    The lack of trustworthy sensors makes development of Advanced Driver Assistance System (ADAS) applications a tough task. It is necessary to develop intelligent systems by combining reliable sensors and real-time algorithms to send the proper, accurate messages to the drivers. In this article, an application to detect and predict the movement of pedestrians in order to prevent an imminent collision has been developed and tested under real conditions. The proposed application, first, accurately measures the position of obstacles using a two-sensor hybrid fusion approach: a stereo camera vision system and a laser scanner. Second, it correctly identifies pedestrians using intelligent algorithms based on polylines and pattern recognition related to leg positions (laser subsystem) and dense disparity maps and u-v disparity (vision subsystem). Third, it uses statistical validation gates and confidence regions to track the pedestrian within the detection zones of the sensors and predict their position in the upcoming frames. The intelligent sensor application has been experimentally tested with success while tracking pedestrians that cross and move in zigzag fashion in front of a vehicle. PMID:22163639

  12. Identifying and tracking pedestrians based on sensor fusion and motion stability predictions.

    PubMed

    Musleh, Basam; García, Fernando; Otamendi, Javier; Armingol, José Maria; de la Escalera, Arturo

    2010-01-01

    The lack of trustworthy sensors makes development of Advanced Driver Assistance System (ADAS) applications a tough task. It is necessary to develop intelligent systems by combining reliable sensors and real-time algorithms to send the proper, accurate messages to the drivers. In this article, an application to detect and predict the movement of pedestrians in order to prevent an imminent collision has been developed and tested under real conditions. The proposed application, first, accurately measures the position of obstacles using a two-sensor hybrid fusion approach: a stereo camera vision system and a laser scanner. Second, it correctly identifies pedestrians using intelligent algorithms based on polylines and pattern recognition related to leg positions (laser subsystem) and dense disparity maps and u-v disparity (vision subsystem). Third, it uses statistical validation gates and confidence regions to track the pedestrian within the detection zones of the sensors and predict their position in the upcoming frames. The intelligent sensor application has been experimentally tested with success while tracking pedestrians that cross and move in zigzag fashion in front of a vehicle.

  13. Weather-based prediction of Plasmodium falciparum malaria in epidemic-prone regions of Ethiopia II. Weather-based prediction systems perform comparably to early detection systems in identifying times for interventions.

    PubMed

    Teklehaimanot, Hailay D; Schwartz, Joel; Teklehaimanot, Awash; Lipsitch, Marc

    2004-11-19

    Timely and accurate information about the onset of malaria epidemics is essential for effective control activities in epidemic-prone regions. Early warning methods that provide earlier alerts (usually by the use of weather variables) may permit control measures to interrupt transmission earlier in the epidemic, perhaps at the expense of some level of accuracy. Expected case numbers were modeled using a Poisson regression with lagged weather factors in a 4th-degree polynomial distributed lag model. For each week, the numbers of malaria cases were predicted using coefficients obtained using all years except that for which the prediction was being made. The effectiveness of alerts generated by the prediction system was compared against that of alerts based on observed cases. The usefulness of the prediction system was evaluated in cold and hot districts. The system predicts the overall pattern of cases well, yet underestimates the height of the largest peaks. Relative to alerts triggered by observed cases, the alerts triggered by the predicted number of cases performed slightly worse, within 5% of the detection system. The prediction-based alerts were able to prevent 10-25% more cases at a given sensitivity in cold districts than in hot ones. The prediction of malaria cases using lagged weather performed well in identifying periods of increased malaria cases. Weather-derived predictions identified epidemics with reasonable accuracy and better timeliness than early detection systems; therefore, the prediction of malarial epidemics using weather is a plausible alternative to early detection systems.
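
    A minimal sketch of a Poisson regression of weekly case counts on lagged weather predictors, using statsmodels on synthetic data. The fourth-degree polynomial distributed-lag structure of the paper is not reproduced; the two fixed lags, the simulated rainfall series and the coefficients are assumptions for illustration.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical weekly data: rainfall (mm) and malaria case counts over five years.
rng = np.random.default_rng(3)
weeks = 260
rain = rng.gamma(shape=2.0, scale=10.0, size=weeks)
lagged_rain_8 = np.roll(rain, 8)                  # rainfall 8 weeks earlier
lagged_rain_10 = np.roll(rain, 10)                # rainfall 10 weeks earlier
cases = rng.poisson(np.exp(1.0 + 0.02 * lagged_rain_8))

# Drop the first rows where np.roll wrapped around, then fit a Poisson GLM.
keep = slice(12, None)
X = sm.add_constant(np.column_stack([lagged_rain_8, lagged_rain_10])[keep])
model = sm.GLM(cases[keep], X, family=sm.families.Poisson()).fit()
print(model.params.round(3))                      # should roughly recover [1.0, 0.02, 0.0]
expected_cases = model.predict(X)                 # weekly expected case counts
```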

  14. Copula based prediction models: an application to an aortic regurgitation study

    PubMed Central

    Kumar, Pranesh; Shoukri, Mohamed M

    2007-01-01

    Background: An important issue in prediction modeling of multivariate data is the measure of dependence structure. The use of Pearson's correlation as a dependence measure has several pitfalls and hence application of regression prediction models based on this correlation may not be an appropriate methodology. As an alternative, a copula-based methodology for prediction modeling and an algorithm to simulate data are proposed. Methods: The method consists of introducing copulas as an alternative to the correlation coefficient commonly used as a measure of dependence. An algorithm based on the marginal distributions of random variables is applied to construct the Archimedean copulas. Monte Carlo simulations are carried out to replicate datasets, estimate prediction model parameters and validate them using Lin's concordance measure. Results: We have carried out a correlation-based regression analysis on data from 20 patients aged 17–82 years on pre-operative and post-operative ejection fractions after surgery and estimated the prediction model: Post-operative ejection fraction = -0.0658 + 0.8403 × (Pre-operative ejection fraction); p = 0.0008; 95% confidence interval of the slope coefficient (0.3998, 1.2808). From the exploratory data analysis, it is noted that both the pre-operative and post-operative ejection fraction measurements have slight departures from symmetry and are skewed to the left. It is also noted that the measurements tend to be widely spread and have shorter tails compared to a normal distribution. Therefore predictions made from the correlation-based model corresponding to the pre-operative ejection fraction measurements in the lower range may not be accurate. Further, it is found that the best approximated marginal distributions of pre-operative and post-operative ejection fractions (using q-q plots) are gamma distributions. The copula-based prediction model is estimated as: Post-operative ejection fraction = -0.0933 + 0.8907 × (Pre-operative ejection fraction)
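
    As an illustration of simulating dependent data from an Archimedean copula with gamma marginals, the sketch below draws from a Clayton copula via the standard Marshall-Olkin sampler and maps the uniforms through gamma quantile functions. The copula family, its parameter and the marginal shapes are hypothetical; the study's fitted values are not reproduced.

```python
import numpy as np
from scipy import stats

def clayton_copula_sample(n, theta, rng):
    """Marshall-Olkin sampler for a bivariate Clayton (Archimedean) copula."""
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=n)       # latent frailty
    e = rng.exponential(size=(n, 2))
    return (1.0 + e / v[:, None]) ** (-1.0 / theta)           # uniforms U1, U2

# Gamma marginals for pre- and post-operative ejection fraction (hypothetical
# shapes/scales; the study fitted its marginals from q-q plots).
rng = np.random.default_rng(7)
u = clayton_copula_sample(n=2000, theta=2.0, rng=rng)
pre = stats.gamma.ppf(u[:, 0], a=30.0, scale=0.02)
post = stats.gamma.ppf(u[:, 1], a=28.0, scale=0.02)
print(np.corrcoef(pre, post)[0, 1].round(2))
```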

  15. Base excess is an accurate predictor of elevated lactate in ED septic patients.

    PubMed

    Montassier, Emmanuel; Batard, Eric; Segard, Julien; Hardouin, Jean-Benoît; Martinage, Arnaud; Le Conte, Philippe; Potel, Gille

    2012-01-01

    Prior studies showed that lactate is a useful marker in sepsis. However, lactate is often not routinely drawn or rapidly available in the emergency department (ED). The study aimed to determine if base excess (BE), widely and rapidly available in the ED, could be used as a surrogate marker for elevated lactate in ED septic patients. This was a prospective and observational cohort study. From March 2009 to March 2010, consecutive patients 18 years or older who presented to the ED with a suspected severe sepsis were enrolled in the study. Lactate and BE measurements were performed. We defined, a priori, a clinically significant lactate to be greater than 3 mmol/L and BE less than -4 mmol/L. A total of 224 patients were enrolled in the study. The average BE was -4.5 mmol/L (SD, 4.9) and the average lactate was 3.5 mmol/L (SD, 2.9). The sensitivity of a BE less than -4 mmol/L in predicting elevated lactate greater than 3 mmol/L was 91.1% (95% confidence interval, 85.5%-96.6%) and the specificity was 88.6% (95% confidence interval, 83.0%-94.2%). The area under the curve was 0.95. Base excess is an accurate marker for the prediction of elevated lactate in the ED. The measurement of BE, obtained in a few minutes in the ED, provides a secure and quick method, similar to the electrocardiogram at triage for patients with chest pain, to determine the patients with sepsis who need an early aggressive resuscitation. Copyright © 2012 Elsevier Inc. All rights reserved.

  16. Predicting DNA hybridization kinetics from sequence

    NASA Astrophysics Data System (ADS)

    Zhang, Jinny X.; Fang, John Z.; Duan, Wei; Wu, Lucia R.; Zhang, Angela W.; Dalchau, Neil; Yordanov, Boyan; Petersen, Rasmus; Phillips, Andrew; Zhang, David Yu

    2018-01-01

    Hybridization is a key molecular process in biology and biotechnology, but so far there is no predictive model for accurately determining hybridization rate constants based on sequence information. Here, we report a weighted neighbour voting (WNV) prediction algorithm, in which the hybridization rate constant of an unknown sequence is predicted based on similarity reactions with known rate constants. To construct this algorithm we first performed 210 fluorescence kinetics experiments to observe the hybridization kinetics of 100 different DNA target and probe pairs (36 nt sub-sequences of the CYCS and VEGF genes) at temperatures ranging from 28 to 55 °C. Automated feature selection and weighting optimization resulted in a final six-feature WNV model, which can predict hybridization rate constants of new sequences to within a factor of 3 with ∼91% accuracy, based on leave-one-out cross-validation. Accurate prediction of hybridization kinetics allows the design of efficient probe sequences for genomics research.
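
    A minimal sketch of the weighted-neighbour-voting idea: the rate constant of a query is a similarity-weighted average of known rate constants. The three features, the feature weights and the exponential similarity kernel below are hypothetical stand-ins for the paper's six optimized features.

```python
import numpy as np

def wnv_predict(query_features, known_features, known_log_k, weights=None, scale=1.0):
    """Weighted neighbour voting: predict log10(rate constant) of a query sequence as
    a similarity-weighted average over reactions with known rate constants."""
    if weights is None:
        weights = np.ones(known_features.shape[1])
    d = np.sqrt(((known_features - query_features) ** 2 * weights).sum(axis=1))
    similarity = np.exp(-d / scale)
    return np.sum(similarity * known_log_k) / np.sum(similarity)

# Hypothetical feature vectors: (GC fraction, hairpin dG, temperature / 100).
known_X = np.array([[0.45, -2.1, 0.37], [0.60, -4.0, 0.37], [0.52, -1.0, 0.55]])
known_log_k = np.array([5.8, 4.9, 6.2])                 # log10 of rate constants
print(wnv_predict(np.array([0.50, -2.5, 0.37]), known_X, known_log_k))
```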

  17. Accurate prediction of complex free surface flow around a high speed craft using a single-phase level set method

    NASA Astrophysics Data System (ADS)

    Broglia, Riccardo; Durante, Danilo

    2017-11-01

    This paper focuses on the analysis of a challenging free surface flow problem involving a surface vessel moving at high speeds, or planing. The investigation is performed using a general purpose high Reynolds free surface solver developed at CNR-INSEAN. The methodology is based on a second order finite volume discretization of the unsteady Reynolds-averaged Navier-Stokes equations (Di Mascio et al. in A second order Godunov-type scheme for naval hydrodynamics, Kluwer Academic/Plenum Publishers, Dordrecht, pp 253-261, 2001; Proceedings of 16th international offshore and polar engineering conference, San Francisco, CA, USA, 2006; J Mar Sci Technol 14:19-29, 2009); air/water interface dynamics is accurately modeled by a non-standard level set approach (Di Mascio et al. in Comput Fluids 36(5):868-886, 2007a), known as the single-phase level set method. In this algorithm the governing equations are solved only in the water phase, whereas the numerical domain in the air phase is used for a suitable extension of the fluid dynamic variables. The level set function is used to track the free surface evolution; dynamic boundary conditions are enforced directly on the interface. This approach allows accurate prediction of the evolution of the free surface even in the presence of violent breaking-wave phenomena, maintaining the interface sharp, without any need to smear out the fluid properties across the two phases. This paper is aimed at the prediction of the complex free-surface flow field generated by a deep-V planing boat at medium and high Froude numbers (from 0.6 up to 1.2). In the present work, the planing hull is treated as a two-degree-of-freedom rigid object. The flow field is characterized by the presence of thin water sheets, several energetic breaking waves and plungings. The computational results include convergence of the trim angle, sinkage and resistance under grid refinement; high-quality experimental data are used for the purposes of validation, allowing to

  18. Prediction of Recidivism in Juvenile Offenders Based on Discriminant Analysis.

    ERIC Educational Resources Information Center

    Proefrock, David W.

    The recent development of strong statistical techniques has made accurate predictions of recidivism possible. To investigate the utility of discriminant analysis methodology in making predictions of recidivism in juvenile offenders, the court records of 271 male and female juvenile offenders, aged 12-16, were reviewed. A cross validation group…

  19. Accurate prediction of pregnancy viability by means of a simple scoring system.

    PubMed

    Bottomley, Cecilia; Van Belle, Vanya; Kirk, Emma; Van Huffel, Sabine; Timmerman, Dirk; Bourne, Tom

    2013-01-01

    What is the performance of a simple scoring system to predict whether women will have an ongoing viable intrauterine pregnancy beyond the first trimester? A simple scoring system using demographic and initial ultrasound variables accurately predicts pregnancy viability beyond the first trimester with an area under the curve (AUC) in a receiver operating characteristic curve of 0.924 [95% confidence interval (CI) 0.900-0.947] on an independent test set. Individual demographic and ultrasound factors, such as maternal age, vaginal bleeding and gestational sac size, are strong predictors of miscarriage. Previous mathematical models have combined individual risk factors with reasonable performance. A simple scoring system derived from a mathematical model that can be easily implemented in clinical practice has not previously been described for the prediction of ongoing viability. This was a prospective observational study in a single early pregnancy assessment centre during a 9-month period. A cohort of 1881 consecutive women undergoing transvaginal ultrasound scan at a gestational age <84 days were included. Women were excluded if the first trimester outcome was not known. Demographic features, symptoms and ultrasound variables were tested for their influence on ongoing viability. Logistic regression was used to determine the influence on first trimester viability from demographics and symptoms alone, ultrasound findings alone and then from all the variables combined. Each model was developed on a training data set, and a simple scoring system was derived from this. This scoring system was tested on an independent test data set. The final outcome based on a total of 1435 participants was an ongoing viable pregnancy in 885 (61.7%) and early pregnancy loss in 550 (38.3%) women. The scoring system using significant demographic variables alone (maternal age and amount of bleeding) to predict ongoing viability gave an AUC of 0.724 (95% CI = 0.692-0.756) in the training set

  20. Intelligent Prediction of Fan Rotation Stall in Power Plants Based on Pressure Sensor Data Measured In-Situ

    PubMed Central

    Xu, Xiaogang; Wang, Songling; Liu, Jinlian; Liu, Xinyu

    2014-01-01

    Blower and exhaust fans consume over 30% of electricity in a thermal power plant, and faults of these fans due to rotation stalls are one of the most frequent reasons for power plant outage failures. To accurately predict the occurrence of fan rotation stalls, we propose a support vector regression machine (SVRM) model that predicts the fan internal pressures during operation, leaving ample time for rotation stall detection. We train the SVRM model using experimental data samples, and perform pressure data prediction using the trained SVRM model. To prove the feasibility of using the SVRM model for rotation stall prediction, we further process the predicted pressure data via wavelet-transform-based stall detection. By comparison of the detection results from the predicted and measured pressure data, we demonstrate that the SVRM model can accurately predict the fan pressure and guarantee reliable stall detection with a time advance of up to 0.0625 s. This superior pressure data prediction capability leaves significant time for effective control and prevention of fan rotation stall faults. This model has great potential for use in intelligent fan systems with stall prevention capability, which will ensure safe operation and improve the energy efficiency of power plants. PMID:24854057

  1. The Prediction of the Gas Utilization Ratio Based on TS Fuzzy Neural Network and Particle Swarm Optimization

    PubMed Central

    Jiang, Haihe; Yin, Yixin; Xiao, Wendong; Zhao, Baoyong

    2018-01-01

    Gas utilization ratio (GUR) is an important indicator used to evaluate the energy consumption of blast furnaces (BFs). Currently, existing methods cannot predict the GUR accurately. In this paper, we present a novel data-driven model for predicting the GUR. The proposed approach utilizes both a TS fuzzy neural network (TS-FNN) and particle swarm optimization (PSO) to predict the GUR. PSO is applied to optimize the parameters of the TS-FNN in order to decrease the error caused by inaccurate initial parameters. This paper also applies the box-plot method to eliminate abnormal values from the raw data during preprocessing; this method can handle data that do not follow a normal distribution, a consequence of complex industrial environments. The prediction results demonstrate that the optimization model based on PSO and the TS-FNN achieves higher prediction accuracy than the TS-FNN model and the SVM model, and the proposed approach can accurately predict the GUR of the blast furnace, providing an effective way to perform on-line blast furnace distribution control. PMID:29461469

  2. The Prediction of the Gas Utilization Ratio based on TS Fuzzy Neural Network and Particle Swarm Optimization.

    PubMed

    Zhang, Sen; Jiang, Haihe; Yin, Yixin; Xiao, Wendong; Zhao, Baoyong

    2018-02-20

    Gas utilization ratio (GUR) is an important indicator used to evaluate the energy consumption of blast furnaces (BFs). Currently, existing methods cannot predict the GUR accurately. In this paper, we present a novel data-driven model for predicting the GUR. The proposed approach utilizes both a TS fuzzy neural network (TS-FNN) and particle swarm optimization (PSO) to predict the GUR. PSO is applied to optimize the parameters of the TS-FNN in order to decrease the error caused by inaccurate initial parameters. This paper also applies the box-plot method to eliminate abnormal values from the raw data during preprocessing; this method can handle data that do not follow a normal distribution, a consequence of complex industrial environments. The prediction results demonstrate that the optimization model based on PSO and the TS-FNN achieves higher prediction accuracy than the TS-FNN model and the SVM model, and the proposed approach can accurately predict the GUR of the blast furnace, providing an effective way to perform on-line blast furnace distribution control.
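
    The box-plot (IQR) outlier-removal step used in preprocessing is simple to state; the sketch below applies the usual 1.5*IQR rule to a synthetic GUR series with two implausible readings.

```python
import numpy as np

def boxplot_filter(x, k=1.5):
    """Remove values outside [Q1 - k*IQR, Q3 + k*IQR], the usual box-plot rule,
    as a preprocessing step before training the predictor."""
    x = np.asarray(x)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return x[(x >= lo) & (x <= hi)]

# Hypothetical GUR readings (%) with two implausible values mixed in.
raw_gur = np.concatenate([np.random.default_rng(4).normal(47.0, 1.5, 500), [12.0, 88.0]])
print(len(boxplot_filter(raw_gur)))   # the two implausible readings are dropped
```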

  3. FastRNABindR: Fast and Accurate Prediction of Protein-RNA Interface Residues.

    PubMed

    El-Manzalawy, Yasser; Abbas, Mostafa; Malluhi, Qutaibah; Honavar, Vasant

    2016-01-01

    A wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses, are mediated by RNA-protein interactions. However, experimental determination of the structures of protein-RNA complexes is expensive and technically challenging. Hence, a number of computational tools have been developed for predicting protein-RNA interfaces. Some of the state-of-the-art protein-RNA interface predictors rely on position-specific scoring matrix (PSSM)-based encoding of the protein sequences. The computational efforts needed for generating PSSMs severely limit the practical utility of protein-RNA interface prediction servers. In this work, we experiment with two approaches, random sampling and sequence similarity reduction, for extracting a representative reference database of protein sequences from more than 50 million protein sequences in UniRef100. Our results suggest that randomly sampled databases produce better PSSM profiles (in terms of the number of hits used to generate the profile and the distance of the generated profile to the corresponding profile generated using the entire UniRef100 data as well as the accuracy of the machine learning classifier trained using these profiles). Based on our results, we developed FastRNABindR, an improved version of RNABindR for predicting protein-RNA interface residues using PSSM profiles generated using 1% of the UniRef100 sequences sampled uniformly at random. To the best of our knowledge, FastRNABindR is the only protein-RNA interface residue prediction online server that requires generation of PSSM profiles for query sequences and accepts hundreds of protein sequences per submission. Our approach for determining the optimal BLAST database for a protein-RNA interface residue classification task has the potential of substantially speeding up, and hence increasing the practical utility of, other amino acid sequence based predictors of protein-protein and protein

  4. Predictive local receptive fields based respiratory motion tracking for motion-adaptive radiotherapy.

    PubMed

    Yubo Wang; Tatinati, Sivanagaraja; Liyu Huang; Kim Jeong Hong; Shafiq, Ghufran; Veluvolu, Kalyana C; Khong, Andy W H

    2017-07-01

    Extracranial robotic radiotherapy employs external markers and a correlation model to trace the tumor motion caused by respiration. Real-time tracking of tumor motion, however, requires a prediction model to compensate for the latencies induced by the software (image data acquisition and processing) and hardware (mechanical and kinematic) limitations of the treatment system. A new prediction algorithm based on local receptive fields extreme learning machines (pLRF-ELM) is proposed for respiratory motion prediction. All the existing respiratory motion prediction methods model the non-stationary respiratory motion traces directly to predict the future values. Unlike these existing methods, the pLRF-ELM performs prediction by modeling the higher-level features obtained by mapping the raw respiratory motion into the random feature space of the ELM instead of directly modeling the raw respiratory motion. The developed method is evaluated using the dataset acquired from 31 patients for two horizons in line with the latencies of treatment systems like CyberKnife. Results showed that pLRF-ELM is superior to existing prediction methods. Results further highlight that the abstracted higher-level features are suitable for approximating the nonlinear and non-stationary characteristics of respiratory motion for accurate prediction.
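
    A minimal sketch of the extreme-learning-machine backbone: a fixed random hidden layer maps the raw trace into a feature space, and only a ridge-regression readout is trained. The dense random mapping, window length, prediction horizon and synthetic respiratory trace are assumptions; the local-receptive-field structure of pLRF-ELM is not reproduced.

```python
import numpy as np

def train_elm(X, y, n_hidden=100, ridge=1e-3, seed=0):
    """Basic extreme learning machine: random fixed hidden layer plus a
    ridge-regression readout trained in closed form."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                                   # random feature space
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Hypothetical respiratory trace: predict the sample 10 steps ahead from a window of 20.
t = np.arange(3000) * 0.02
sig = np.sin(2 * np.pi * 0.25 * t) + 0.05 * np.random.default_rng(5).normal(size=t.size)
win, horizon = 20, 10
X = np.array([sig[i:i + win] for i in range(len(sig) - win - horizon)])
y = sig[win + horizon:]
W, b, beta = train_elm(X[:2000], y[:2000])
print(np.sqrt(np.mean((elm_predict(X[2000:], W, b, beta) - y[2000:]) ** 2)))
```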

  5. Accurate Prediction of Drug-Induced Liver Injury Using Stem Cell-Derived Populations

    PubMed Central

    Szkolnicka, Dagmara; Farnworth, Sarah L.; Lucendo-Villarin, Baltasar; Storck, Christopher; Zhou, Wenli; Iredale, John P.; Flint, Oliver

    2014-01-01

    Despite major progress in the knowledge and management of human liver injury, there are millions of people suffering from chronic liver disease. Currently, the only cure for end-stage liver disease is orthotopic liver transplantation; however, this approach is severely limited by organ donation. Alternative approaches to restoring liver function have therefore been pursued, including the use of somatic and stem cell populations. Although such approaches are essential in developing scalable treatments, there is also an imperative to develop predictive human systems that more effectively study and/or prevent the onset of liver disease and decompensated organ function. We used a renewable human stem cell resource, from defined genetic backgrounds, and drove them through developmental intermediates to yield highly active, drug-inducible, and predictive human hepatocyte populations. Most importantly, stem cell-derived hepatocytes displayed equivalence to primary adult hepatocytes, following incubation with known hepatotoxins. In summary, we have developed a serum-free, scalable, and shippable cell-based model that faithfully predicts the potential for human liver injury. Such a resource has direct application in human modeling and, in the future, could play an important role in developing renewable cell-based therapies. PMID:24375539

  6. Lower NIH stroke scale scores are required to accurately predict a good prognosis in posterior circulation stroke.

    PubMed

    Inoa, Violiza; Aron, Abraham W; Staff, Ilene; Fortunato, Gilbert; Sansing, Lauren H

    2014-01-01

    The NIH stroke scale (NIHSS) is an indispensable tool that aids in the determination of acute stroke prognosis and decision making. Patients with posterior circulation (PC) strokes often present with lower NIHSS scores, which may result in the withholding of thrombolytic treatment from these patients. However, whether these lower initial NIHSS scores predict better long-term prognoses is uncertain. We aimed to assess the utility of the NIHSS at presentation for predicting the functional outcome at 3 months in anterior circulation (AC) versus PC strokes. This was a retrospective analysis of a large prospectively collected database of adults with acute ischemic stroke. Univariate and multivariate analyses were conducted to identify factors associated with outcome. Additional analyses were performed to determine the receiver operating characteristic (ROC) curves for NIHSS scores and outcomes in AC and PC infarctions. Both the optimal cutoffs for maximal diagnostic accuracy and the cutoffs to obtain >80% sensitivity for poor outcomes were determined in AC and PC strokes. The analysis included 1,197 patients with AC stroke and 372 with PC stroke. The median initial NIHSS score for patients with AC strokes was 7 and for PC strokes it was 2. The majority (71%) of PC stroke patients had baseline NIHSS scores ≤4, and 15% of these 'minor' stroke patients had a poor outcome at 3 months. ROC analysis identified that the optimal NIHSS cutoff for outcome prediction after infarction in the AC was 8 and for infarction in the PC it was 4. To achieve >80% sensitivity for detecting patients with a subsequent poor outcome, the NIHSS cutoff for infarctions in the AC was 4 and for infarctions in the PC it was 2. The NIHSS cutoff that most accurately predicts outcomes is 4 points higher in AC compared to PC infarctions. There is potential for poor outcomes in patients with PC strokes and low NIHSS scores, suggesting that thrombolytic treatment should not be withheld from these patients

  7. FINDSITE-metal: Integrating evolutionary information and machine learning for structure-based metal binding site prediction at the proteome level

    PubMed Central

    Brylinski, Michal; Skolnick, Jeffrey

    2010-01-01

    The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure-based approaches showing considerable promise. In this paper, we present FINDSITE-metal, a new threading-based method designed specifically to detect metal binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE-metal. Combining structure/evolutionary information with machine learning results in highly accurate metal binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal binding sites are detected with the best predicted binding site at rank 1 and within the top 2 ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium and magnesium ions, the binding metal can be predicted with high, typically 70-90%, accuracy. FINDSITE-metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome-wide application of FINDSITE-metal that quantifies the metal binding complement of the human proteome. FINDSITE-metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite-metal/. PMID:21287609

  8. A composite score combining waist circumference and body mass index more accurately predicts body fat percentage in 6- to 13-year-old children.

    PubMed

    Aeberli, I; Gut-Knabenhans, M; Kusche-Ammann, R S; Molinari, L; Zimmermann, M B

    2013-02-01

    Body mass index (BMI) and waist circumference (WC) are widely used to predict % body fat (BF) and classify degrees of pediatric adiposity. However, both measures have limitations. The aim of this study was to evaluate whether a combination of WC and BMI would more accurately predict %BF than either alone. In a nationally representative sample of 2,303 6- to 13-year-old Swiss children, weight, height, and WC were measured, and %BF was determined from multiple skinfold thicknesses. Regression and receiver operating characteristic (ROC) curves were used to evaluate the combination of WC and BMI in predicting %BF against WC or BMI alone. An optimized composite score (CS) was generated. A quadratic polynomial combination of WC and BMI led to a better prediction of %BF (r² = 0.68) compared with the two measures alone (r² = 0.58-0.62). The areas under the ROC curve for the CS [0.6 * WC-SDS + 0.4 * BMI-SDS] ranged from 0.962 ± 0.0053 (overweight girls) to 0.982 ± 0.0046 (obese boys) and were somewhat greater than the AUCs for either BMI or WC alone. At a given specificity, the sensitivity of the prediction of overweight and obesity based on the CS was higher than that based on either WC or BMI alone, although the improvement was small. Both BMI and WC are good predictors of %BF in primary school children. However, a composite score incorporating both measures increased sensitivity at a constant specificity as compared to the individual measures. It may therefore be a useful tool for clinical and epidemiological studies of pediatric adiposity.
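
    The composite score itself is a one-line calculation; the sketch below simply applies the reported weighting, CS = 0.6 × WC-SDS + 0.4 × BMI-SDS, to made-up standard-deviation scores:

```python
# Minimal worked example of the composite score reported above,
# CS = 0.6 * WC-SDS + 0.4 * BMI-SDS, using made-up SDS values.
def composite_score(wc_sds: float, bmi_sds: float) -> float:
    """Combine waist-circumference and BMI standard-deviation scores."""
    return 0.6 * wc_sds + 0.4 * bmi_sds

# Hypothetical child: WC 1.8 SD and BMI 1.2 SD above the reference median.
print(composite_score(wc_sds=1.8, bmi_sds=1.2))  # 0.6*1.8 + 0.4*1.2 = 1.56
```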

  9. Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance

    PubMed Central

    Hong, Ha; Solomon, Ethan A.; DiCarlo, James J.

    2015-01-01

    database of images for evaluating object recognition performance. We used multielectrode arrays to characterize hundreds of neurons in the visual ventral stream of nonhuman primates and measured the object recognition performance of >100 human observers. Remarkably, we found that simple learned weighted sums of firing rates of neurons in monkey inferior temporal (IT) cortex accurately predicted human performance. Although previous work led us to expect that IT would outperform V4, we were surprised by the quantitative precision with which simple IT-based linking hypotheses accounted for human behavior. PMID:26424887
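
    As a rough illustration of a "learned weighted sum" decoder, the sketch below fits a ridge regression from a synthetic population of firing rates to a behavioural score; it is not the authors' pipeline, and all data are simulated:

```python
# Sketch of a "learned weighted sum" decoder: ridge regression from a
# population of firing rates to a behavioural score, on synthetic data.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_images, n_neurons = 600, 168
rates = rng.poisson(lam=5.0, size=(n_images, n_neurons)).astype(float)
true_w = rng.normal(size=n_neurons)
behaviour = rates @ true_w + rng.normal(scale=5.0, size=n_images)  # hypothetical per-image score

X_tr, X_te, y_tr, y_te = train_test_split(rates, behaviour, random_state=0)
decoder = Ridge(alpha=1.0).fit(X_tr, y_tr)      # the learned weights, one per neuron
print("held-out R^2:", round(decoder.score(X_te, y_te), 3))
print("number of weights in the weighted sum:", decoder.coef_.shape[0])
```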

  10. Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance.

    PubMed

    Majaj, Najib J; Hong, Ha; Solomon, Ethan A; DiCarlo, James J

    2015-09-30

    database of images for evaluating object recognition performance. We used multielectrode arrays to characterize hundreds of neurons in the visual ventral stream of nonhuman primates and measured the object recognition performance of >100 human observers. Remarkably, we found that simple learned weighted sums of firing rates of neurons in monkey inferior temporal (IT) cortex accurately predicted human performance. Although previous work led us to expect that IT would outperform V4, we were surprised by the quantitative precision with which simple IT-based linking hypotheses accounted for human behavior. Copyright © 2015 the authors 0270-6474/15/3513402-17$15.00/0.

  11. Accurate Prediction of Inducible Transcription Factor Binding Intensities In Vivo

    PubMed Central

    Siepel, Adam; Lis, John T.

    2012-01-01

    DNA sequence and local chromatin landscape act jointly to determine transcription factor (TF) binding intensity profiles. To disentangle these influences, we developed an experimental approach, called protein/DNA binding followed by high-throughput sequencing (PB–seq), that allows the binding energy landscape to be characterized genome-wide in the absence of chromatin. We applied our methods to the Drosophila Heat Shock Factor (HSF), which inducibly binds a target DNA sequence element (HSE) following heat shock stress. PB–seq involves incubating sheared naked genomic DNA with recombinant HSF, partitioning the HSF–bound and HSF–free DNA, and then detecting HSF–bound DNA by high-throughput sequencing. We compared PB–seq binding profiles with ones observed in vivo by ChIP–seq and developed statistical models to predict the observed departures from idealized binding patterns based on covariates describing the local chromatin environment. We found that DNase I hypersensitivity and tetra-acetylation of H4 were the most influential covariates in predicting changes in HSF binding affinity. We also investigated the extent to which DNA accessibility, as measured by digital DNase I footprinting data, could be predicted from MNase–seq data and the ChIP–chip profiles for many histone modifications and TFs, and found GAGA element associated factor (GAF), tetra-acetylation of H4, and H4K16 acetylation to be the most predictive covariates. Lastly, we generated an unbiased model of HSF binding sequences, which revealed distinct biophysical properties of the HSF/HSE interaction and a previously unrecognized substructure within the HSE. These findings provide new insights into the interplay between the genomic sequence and the chromatin landscape in determining transcription factor binding intensity. PMID:22479205

  12. Prostatectomy-based validation of combined urine and plasma test for predicting high grade prostate cancer.

    PubMed

    Albitar, Maher; Ma, Wanlong; Lund, Lars; Shahbaba, Babak; Uchio, Edward; Feddersen, Søren; Moylan, Donald; Wojno, Kirk; Shore, Neal

    2018-03-01

    Distinguishing between low- and high-grade prostate cancers (PCa) is important, but biopsy may underestimate the actual grade of cancer. We have previously shown that urine/plasma-based prostate-specific biomarkers can predict high grade PCa. Our objective was to determine the accuracy of a test using cell-free RNA levels of biomarkers in predicting prostatectomy results. This multicenter community-based prospective study was conducted using urine/blood samples collected from 306 patients. All recruited patients were treatment-naïve, without metastases, and had been biopsied, designated a Gleason Score (GS) based on biopsy, and assigned to prostatectomy prior to participation in the study. The primary outcome measure was the urine/plasma test accuracy in predicting high grade PCa on prostatectomy compared with biopsy findings. Sensitivity and specificity were calculated using standard formulas, while comparisons between groups were performed using the Wilcoxon Rank Sum, Kruskal-Wallis, Chi-Square, and Fisher's exact test. GS as assigned by standard 10-12 core biopsies was 3 + 3 in 90 (29.4%), 3 + 4 in 122 (39.8%), 4 + 3 in 50 (16.3%), and > 4 + 3 in 44 (14.4%) patients. The urine/plasma assay confirmed a previous validation and was highly accurate in predicting the presence of high-grade PCa (Gleason ≥3 + 4) with sensitivity between 88% and 95% as verified by prostatectomy findings. GS was upgraded after prostatectomy in 27% of patients and downgraded in 12% of patients. This plasma/urine biomarker test accurately predicts high grade cancer as determined by prostatectomy with a sensitivity at 92-97%, while the sensitivity of core biopsies was 78%. © 2018 Wiley Periodicals, Inc.

  13. Can single empirical algorithms accurately predict inland shallow water quality status from high resolution, multi-sensor, multi-temporal satellite data?

    NASA Astrophysics Data System (ADS)

    Theologou, I.; Patelaki, M.; Karantzalos, K.

    2015-04-01

    Assessing and monitoring water quality status through timely, cost effective and accurate manner is of fundamental importance for numerous environmental management and policy making purposes. Therefore, there is a current need for validated methodologies which can effectively exploit, in an unsupervised way, the enormous amount of earth observation imaging datasets from various high-resolution satellite multispectral sensors. To this end, many research efforts are based on building concrete relationships and empirical algorithms from concurrent satellite and in-situ data collection campaigns. We have experimented with Landsat 7 and Landsat 8 multi-temporal satellite data, coupled with hyperspectral data from a field spectroradiometer and in-situ ground truth data with several physico-chemical and other key monitoring indicators. All available datasets, covering a 4 years period, in our case study Lake Karla in Greece, were processed and fused under a quantitative evaluation framework. The performed comprehensive analysis posed certain questions regarding the applicability of single empirical models across multi-temporal, multi-sensor datasets towards the accurate prediction of key water quality indicators for shallow inland systems. Single linear regression models didn't establish concrete relations across multi-temporal, multi-sensor observations. Moreover, the shallower parts of the inland system followed, in accordance with the literature, different regression patterns. Landsat 7 and 8 resulted in quite promising results indicating that from the recreation of the lake and onward consistent per-sensor, per-depth prediction models can be successfully established. The highest rates were for chl-a (r2=89.80%), dissolved oxygen (r2=88.53%), conductivity (r2=88.18%), ammonium (r2=87.2%) and pH (r2=86.35%), while the total phosphorus (r2=70.55%) and nitrates (r2=55.50%) resulted in lower correlation rates.

  14. Metabolite signal identification in accurate mass metabolomics data with MZedDB, an interactive m/z annotation tool utilising predicted ionisation behaviour 'rules'

    PubMed Central

    Draper, John; Enot, David P; Parker, David; Beckmann, Manfred; Snowdon, Stuart; Lin, Wanchang; Zubair, Hassan

    2009-01-01

    Background Metabolomics experiments using Mass Spectrometry (MS) technology measure the mass to charge ratio (m/z) and intensity of ionised molecules in crude extracts of complex biological samples to generate high dimensional metabolite 'fingerprint' or metabolite 'profile' data. High resolution MS instruments perform routinely with a mass accuracy of < 5 ppm (parts per million) thus providing potentially a direct method for signal putative annotation using databases containing metabolite mass information. Most database interfaces support only simple queries with the default assumption that molecules either gain or lose a single proton when ionised. In reality the annotation process is confounded by the fact that many ionisation products will be not only molecular isotopes but also salt/solvent adducts and neutral loss fragments of original metabolites. This report describes an annotation strategy that will allow searching based on all potential ionisation products predicted to form during electrospray ionisation (ESI). Results Metabolite 'structures' harvested from publicly accessible databases were converted into a common format to generate a comprehensive archive in MZedDB. 'Rules' were derived from chemical information that allowed MZedDB to generate a list of adducts and neutral loss fragments putatively able to form for each structure and calculate, on the fly, the exact molecular weight of every potential ionisation product to provide targets for annotation searches based on accurate mass. We demonstrate that data matrices representing populations of ionisation products generated from different biological matrices contain a large proportion (sometimes > 50%) of molecular isotopes, salt adducts and neutral loss fragments. Correlation analysis of ESI-MS data features confirmed the predicted relationships of m/z signals. An integrated isotope enumerator in MZedDB allowed verification of exact isotopic pattern distributions to corroborate experimental data
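
    The "rule"-based side of such annotation can be illustrated with a few common ESI adducts: given a neutral monoisotopic mass, candidate m/z values follow from fixed mass shifts. The shifts below are standard textbook values, not taken from MZedDB itself:

```python
# Sketch of rule-based m/z annotation: compute candidate m/z values for
# common ESI ionisation products of a neutral monoisotopic mass M.
ADDUCT_RULES = {             # name: (charge, mass shift in Da)
    "[M+H]+":   (+1, +1.007276),
    "[M+Na]+":  (+1, +22.989218),
    "[M+NH4]+": (+1, +18.033823),
    "[M-H]-":   (-1, -1.007276),
    "[M+Cl]-":  (-1, +34.969402),
}

def candidate_mz(neutral_mass: float) -> dict:
    """Return the predicted m/z for each ionisation rule."""
    return {name: (neutral_mass + shift) / abs(z)
            for name, (z, shift) in ADDUCT_RULES.items()}

# Example: glucose, monoisotopic mass 180.06339 Da
for ion, mz in candidate_mz(180.06339).items():
    print(f"{ion:9s} m/z = {mz:.5f}")
```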

  15. Accurate genomic predictions for BCWD resistance in rainbow trout are achieved using low-density SNP panels: Evidence that long-range LD is a major contributing factor.

    PubMed

    Vallejo, Roger L; Silva, Rafael M O; Evenhuis, Jason P; Gao, Guangtu; Liu, Sixin; Parsons, James E; Martin, Kyle E; Wiens, Gregory D; Lourenco, Daniela A L; Leeds, Timothy D; Palti, Yniv

    2018-06-05

    Previously, accurate genomic predictions for Bacterial cold water disease (BCWD) resistance in rainbow trout were obtained using a medium-density single nucleotide polymorphism (SNP) array. Here, the impact of lower-density SNP panels on the accuracy of genomic predictions was investigated in a commercial rainbow trout breeding population. Using progeny performance data, the accuracy of genomic breeding values (GEBV) using 35K, 10K, 3K, 1K, 500, 300 and 200 SNP panels as well as a panel with 70 quantitative trait loci (QTL)-flanking SNP was compared. The GEBVs were estimated using the Bayesian method BayesB, single-step GBLUP (ssGBLUP) and weighted ssGBLUP (wssGBLUP). The accuracy of GEBVs remained high despite the sharp reductions in SNP density, and even with 500 SNP, accuracy was higher than the pedigree-based prediction (0.50-0.56 versus 0.36). Furthermore, the prediction accuracy with the 70 QTL-flanking SNP (0.65-0.72) was similar to the panel with 35K SNP (0.65-0.71). Genomewide linkage disequilibrium (LD) analysis revealed strong LD (r² ≥ 0.25) spanning on average over 1 Mb across the rainbow trout genome. This long-range LD likely contributed to the accurate genomic predictions with the low-density SNP panels. Population structure analysis supported the hypothesis that long-range LD in this population may be caused by admixture. Results suggest that lower-cost, low-density SNP panels can be used for implementing genomic selection for BCWD resistance in rainbow trout breeding programs. © 2018 The Authors. This article is a U.S. Government work and is in the public domain in the USA. Journal of Animal Breeding and Genetics published by Blackwell Verlag GmbH.
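
    As a simplified stand-in for the BayesB and ssGBLUP models used in the study, the sketch below fits a ridge regression (SNP-BLUP-style) on simulated 0/1/2 genotypes from a hypothetical 500-SNP low-density panel:

```python
# Simplified SNP-BLUP-style sketch (ridge regression on SNP genotypes),
# a stand-in for the BayesB / ssGBLUP models used in the study, on
# simulated data. Genotypes are coded 0/1/2 copies of the minor allele.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_fish, n_snp = 1500, 500               # e.g. a 500-SNP low-density panel
geno = rng.binomial(2, 0.3, size=(n_fish, n_snp)).astype(float)
qtl_effects = np.zeros(n_snp)
qtl_effects[rng.choice(n_snp, 20, replace=False)] = rng.normal(scale=0.5, size=20)
phenotype = geno @ qtl_effects + rng.normal(scale=1.0, size=n_fish)   # hypothetical trait

X_tr, X_te, y_tr, y_te = train_test_split(geno, phenotype, random_state=0)
model = Ridge(alpha=50.0).fit(X_tr, y_tr)           # shrinkage over all markers
gebv = model.predict(X_te)                          # genomic breeding values
print("prediction accuracy (corr):", round(float(np.corrcoef(gebv, y_te)[0, 1]), 3))
```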

  16. Ensemble framework based real-time respiratory motion prediction for adaptive radiotherapy applications.

    PubMed

    Tatinati, Sivanagaraja; Nazarpour, Kianoush; Tech Ang, Wei; Veluvolu, Kalyana C

    2016-08-01

    Successful treatment of tumors with motion-adaptive radiotherapy requires accurate prediction of respiratory motion, ideally with a prediction horizon larger than the latency in radiotherapy system. Accurate prediction of respiratory motion is however a non-trivial task due to the presence of irregularities and intra-trace variabilities, such as baseline drift and temporal changes in fundamental frequency pattern. In this paper, to enhance the accuracy of the respiratory motion prediction, we propose a stacked regression ensemble framework that integrates heterogeneous respiratory motion prediction algorithms. We further address two crucial issues for developing a successful ensemble framework: (1) selection of appropriate prediction methods to ensemble (level-0 methods) among the best existing prediction methods; and (2) finding a suitable generalization approach that can successfully exploit the relative advantages of the chosen level-0 methods. The efficacy of the developed ensemble framework is assessed with real respiratory motion traces acquired from 31 patients undergoing treatment. Results show that the developed ensemble framework improves the prediction performance significantly compared to the best existing methods. Copyright © 2016 IPEM. Published by Elsevier Ltd. All rights reserved.
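
    A stacked regression ensemble of this kind can be sketched with off-the-shelf components: heterogeneous level-0 regressors combined by a linear level-1 meta-learner to predict the next sample of a synthetic breathing trace. The level-0 choices and lag length below are illustrative, not those of the paper:

```python
# Sketch of a stacked regression ensemble for one-step-ahead prediction of
# a synthetic respiratory-motion-like trace.
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
t = np.arange(3000) * 0.05
trace = np.sin(2 * np.pi * 0.25 * t) + 0.1 * t / t[-1] + 0.05 * rng.normal(size=t.size)

lag = 20                                   # use the previous 20 samples as features
X = np.stack([trace[i:i + lag] for i in range(len(trace) - lag)])
y = trace[lag:]                            # next sample (the prediction target)

split = int(0.8 * len(y))
ensemble = StackingRegressor(
    estimators=[("svr", SVR(C=10.0)),
                ("ridge", Ridge(alpha=1.0)),
                ("mlp", MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0))],
    final_estimator=LinearRegression(),    # level-1 generalizer
)
ensemble.fit(X[:split], y[:split])
print("held-out R^2:", round(ensemble.score(X[split:], y[split:]), 3))
```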

  17. Towards more accurate vegetation mortality predictions

    DOE PAGES

    Sevanto, Sanna Annika; Xu, Chonggang

    2016-09-26

    Predicting the fate of vegetation under changing climate is one of the major challenges of the climate modeling community. Terrestrial vegetation dominates the carbon and water cycles over land areas, and dramatic changes in vegetation cover resulting from stressful environmental conditions such as drought feed directly back to local and regional climate, potentially leading to a vicious cycle where vegetation recovery after a disturbance is delayed or impossible.

  18. Predictive equation of state method for heavy materials based on the Dirac equation and density functional theory

    NASA Astrophysics Data System (ADS)

    Wills, John M.; Mattsson, Ann E.

    2012-02-01

    Density functional theory (DFT) provides a formally predictive base for equation of state properties. Available approximations to the exchange/correlation functional provide accurate predictions for many materials in the periodic table. For heavy materials however, DFT calculations, using available functionals, fail to provide quantitative predictions, and often fail to be even qualitative. This deficiency is due both to the lack of the appropriate confinement physics in the exchange/correlation functional and to approximations used to evaluate the underlying equations. In order to assess and develop accurate functionals, it is essential to eliminate all other sources of error. In this talk we describe an efficient first-principles electronic structure method based on the Dirac equation and compare the results obtained with this method with other methods generally used. Implications for high-pressure equation of state of relativistic materials are demonstrated in application to Ce and the light actinides. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

  19. Accurate prediction of RNA-binding protein residues with two discriminative structural descriptors.

    PubMed

    Sun, Meijian; Wang, Xia; Zou, Chuanxin; He, Zenghui; Liu, Wei; Li, Honglin

    2016-06-07

    RNA-binding proteins participate in many important biological processes concerning RNA-mediated gene regulation, and several computational methods have been recently developed to predict the protein-RNA interactions of RNA-binding proteins. Newly developed discriminative descriptors will help to improve the prediction accuracy of these prediction methods and provide further meaningful information for researchers. In this work, we designed two structural features (residue electrostatic surface potential and triplet interface propensity) and according to the statistical and structural analysis of protein-RNA complexes, the two features were powerful for identifying RNA-binding protein residues. Using these two features and other excellent structure- and sequence-based features, a random forest classifier was constructed to predict RNA-binding residues. The area under the receiver operating characteristic curve (AUC) of five-fold cross-validation for our method on training set RBP195 was 0.900, and when applied to the test set RBP68, the prediction accuracy (ACC) was 0.868, and the F-score was 0.631. The good prediction performance of our method revealed that the two newly designed descriptors could be discriminative for inferring protein residues interacting with RNAs. To facilitate the use of our method, a web-server called RNAProSite, which implements the proposed method, was constructed and is freely available at http://lilab.ecust.edu.cn/NABind .
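
    The final classification step described above reduces to training a random forest on per-residue feature vectors. The sketch below does exactly that on random placeholder features standing in for the electrostatic, propensity, PSSM and sequence descriptors:

```python
# Sketch: random forest over per-residue feature vectors to label
# RNA-binding residues. Feature values are random placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n_residues, n_features = 5000, 40
X = rng.normal(size=(n_residues, n_features))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=n_residues) > 1.2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
rf = RandomForestClassifier(n_estimators=300, class_weight="balanced", random_state=0)
rf.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])
print("AUC on held-out residues:", round(float(auc), 3))
```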

  20. Remaining dischargeable time prediction for lithium-ion batteries using unscented Kalman filter

    NASA Astrophysics Data System (ADS)

    Dong, Guangzhong; Wei, Jingwen; Chen, Zonghai; Sun, Han; Yu, Xiaowei

    2017-10-01

    To overcome range anxiety, one important strategy is to accurately predict the range or dischargeable time of the battery system. To accurately predict the remaining dischargeable time (RDT) of a battery, an RDT prediction framework based on accurate battery modeling and state estimation is presented in this paper. Firstly, a simplified linearized equivalent-circuit model is developed to simulate the dynamic characteristics of a battery. Then, an online recursive least-squares method and an unscented Kalman filter are employed to estimate the system matrices and the state of charge (SOC) at every prediction point. In addition, a discrete wavelet transform technique is employed to capture the statistical information of the past dynamics of the input current, which is used to predict the future battery currents. Finally, the RDT can be predicted based on the battery model, the SOC estimation results and the predicted future battery currents. The performance of the proposed methodology has been verified on a lithium-ion battery cell. Experimental results indicate that the proposed method provides accurate SOC and parameter estimation and that the predicted RDT can alleviate range anxiety issues.
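
    Once the SOC and the future load current have been predicted, the last step reduces, in rough terms, to dividing the usable charge by the expected current. The sketch below shows that back-of-the-envelope form with illustrative numbers; it omits the model, RLS and UKF machinery described above:

```python
# Back-of-the-envelope RDT estimate from a predicted SOC and a predicted
# mean discharge current. All values are illustrative.
def remaining_dischargeable_time_s(soc: float, soc_cutoff: float,
                                   capacity_ah: float, predicted_current_a: float) -> float:
    """RDT in seconds from usable charge and the predicted mean discharge current."""
    usable_ah = max(soc - soc_cutoff, 0.0) * capacity_ah
    return usable_ah * 3600.0 / predicted_current_a

# e.g. a 2.6 Ah cell at 65% SOC, 10% cutoff, expected 1.3 A mean load
print(remaining_dischargeable_time_s(0.65, 0.10, 2.6, 1.3) / 3600, "hours")
```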

  1. A high order accurate finite element algorithm for high Reynolds number flow prediction

    NASA Technical Reports Server (NTRS)

    Baker, A. J.

    1978-01-01

    A Galerkin-weighted residuals formulation is employed to establish an implicit finite element solution algorithm for generally nonlinear initial-boundary value problems. Solution accuracy, and convergence rate with discretization refinement, are quantized in several error norms, by a systematic study of numerical solutions to several nonlinear parabolic and a hyperbolic partial differential equation characteristic of the equations governing fluid flows. Solutions are generated using selective linear, quadratic and cubic basis functions. Richardson extrapolation is employed to generate a higher-order accurate solution to facilitate isolation of truncation error in all norms. Extension of the mathematical theory underlying accuracy and convergence concepts for linear elliptic equations is predicted for equations characteristic of laminar and turbulent fluid flows at nonmodest Reynolds number. The nondiagonal initial-value matrix structure introduced by the finite element theory is determined intrinsic to improved solution accuracy and convergence. A factored Jacobian iteration algorithm is derived and evaluated to yield a consequential reduction in both computer storage and execution CPU requirements while retaining solution accuracy.

  2. Bankruptcy prediction based on financial ratios using Jordan Recurrent Neural Networks: a case study in Polish companies

    NASA Astrophysics Data System (ADS)

    Hardinata, Lingga; Warsito, Budi; Suparti

    2018-05-01

    The complexity of bankruptcy makes accurate bankruptcy prediction models difficult to achieve. Various prediction models have been developed to improve the accuracy of bankruptcy predictions. Machine learning has been widely used for prediction because of its adaptive capabilities. Artificial Neural Networks (ANNs) are a branch of machine learning that has proven able to perform inference tasks such as prediction and classification, especially in data mining. In this paper, we propose the implementation of Jordan Recurrent Neural Networks (JRNN) to classify and predict corporate bankruptcy based on financial ratios. The feedback interconnections in a JRNN help the network retain important information, allowing it to work more effectively. The result analysis shows that the JRNN performs very well in bankruptcy prediction, with an average success rate of 81.3785%.
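
    The defining feature of a Jordan network is that the context units feed the previous output (rather than the hidden state) back into the hidden layer. A minimal NumPy forward pass, with random untrained weights and hypothetical financial-ratio inputs, looks like this:

```python
# Minimal NumPy sketch of a Jordan recurrent network's forward pass: the
# context units feed the previous *output* back into the hidden layer.
# Weights are random; training (e.g. backpropagation through time) is omitted.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 5, 8, 1          # e.g. 5 financial ratios -> bankruptcy score

W_in = rng.normal(scale=0.3, size=(n_hidden, n_in))
W_ctx = rng.normal(scale=0.3, size=(n_hidden, n_out))   # feedback from the output
W_out = rng.normal(scale=0.3, size=(n_out, n_hidden))
b_h = np.zeros(n_hidden)
b_o = np.zeros(n_out)

def jordan_forward(sequence):
    """Run a sequence of input vectors through the Jordan network."""
    y_prev = np.zeros(n_out)              # context starts at zero
    outputs = []
    for x in sequence:
        h = np.tanh(W_in @ x + W_ctx @ y_prev + b_h)
        y_prev = 1.0 / (1.0 + np.exp(-(W_out @ h + b_o)))  # sigmoid "bankrupt" score
        outputs.append(y_prev.copy())
    return np.array(outputs)

ratios_over_years = rng.normal(size=(4, n_in))           # 4 yearly observations
print(jordan_forward(ratios_over_years).ravel())
```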

  3. Prediction using patient comparison vs. modeling: a case study for mortality prediction.

    PubMed

    Hoogendoorn, Mark; El Hassouni, Ali; Mok, Kwongyen; Ghassemi, Marzyeh; Szolovits, Peter

    2016-08-01

    Information in Electronic Medical Records (EMRs) can be used to generate accurate predictions for the occurrence of a variety of health states, which can contribute to more pro-active interventions. The very nature of EMRs does make the application of off-the-shelf machine learning techniques difficult. In this paper, we study two approaches to making predictions that have hardly been compared in the past: (1) extracting high-level (temporal) features from EMRs and building a predictive model, and (2) defining a patient similarity metric and predicting based on the outcome observed for similar patients. We analyze and compare both approaches on the MIMIC-II ICU dataset to predict patient mortality and find that the patient similarity approach does not scale well and results in a less accurate model (AUC of 0.68) compared to the modeling approach (0.84). We also show that mortality can be predicted within a median of 72 hours.

  4. A rapid and accurate approach for prediction of interactomes from co-elution data (PrInCE).

    PubMed

    Stacey, R Greg; Skinnider, Michael A; Scott, Nichollas E; Foster, Leonard J

    2017-10-23

    An organism's protein interactome, or complete network of protein-protein interactions, defines the protein complexes that drive cellular processes. Techniques for studying protein complexes have traditionally applied targeted strategies such as yeast two-hybrid or affinity purification-mass spectrometry to assess protein interactions. However, given the vast number of protein complexes, more scalable methods are necessary to accelerate interaction discovery and to construct whole interactomes. We recently developed a complementary technique based on the use of protein correlation profiling (PCP) and stable isotope labeling in amino acids in cell culture (SILAC) to assess chromatographic co-elution as evidence of interacting proteins. Importantly, PCP-SILAC is also capable of measuring protein interactions simultaneously under multiple biological conditions, allowing the detection of treatment-specific changes to an interactome. Given the uniqueness and high dimensionality of co-elution data, new tools are needed to compare protein elution profiles, control false discovery rates, and construct an accurate interactome. Here we describe a freely available bioinformatics pipeline, PrInCE, for the analysis of co-elution data. PrInCE is a modular, open-source library that is computationally inexpensive, able to use label and label-free data, and capable of detecting tens of thousands of protein-protein interactions. Using a machine learning approach, PrInCE offers greatly reduced run time, more predicted interactions at the same stringency, prediction of protein complexes, and greater ease of use over previous bioinformatics tools for co-elution data. PrInCE is implemented in Matlab (version R2017a). Source code and standalone executable programs for Windows and Mac OSX are available at https://github.com/fosterlab/PrInCE , where usage instructions can be found. An example dataset and output are also provided for testing purposes. PrInCE is the first fast and easy

  5. Perceived Physician-informed Weight Status Predicts Accurate Weight Self-Perception and Weight Self-Regulation in Low-income, African American Women.

    PubMed

    Harris, Charlie L; Strayhorn, Gregory; Moore, Sandra; Goldman, Brian; Martin, Michelle Y

    2016-01-01

    Obese African American women under-appraise their body mass index (BMI) classification and report fewer weight loss attempts than women who accurately appraise their weight status. This cross-sectional study examined whether physician-informed weight status could predict weight self-perception and weight self-regulation strategies in obese women. A convenience sample of 118 low-income women completed a survey assessing demographic characteristics, comorbidities, weight self-perception, and weight self-regulation strategies. BMI was calculated during nurse triage. Binary logistic regression models were performed to test hypotheses. The odds of obese accurate appraisers having been informed about their weight status were six times greater than those of under-appraisers. The odds of those using an "approach" self-regulation strategy having been physician-informed were four times greater compared with those using an "avoidance" strategy. Physicians are uniquely positioned to influence accurate weight self-perception and adaptive weight self-regulation strategies in underserved women, reducing their risk for obesity-related morbidity.

  6. Accurate positioning based on acoustic and optical sensors

    NASA Astrophysics Data System (ADS)

    Cai, Kerong; Deng, Jiahao; Guo, Hualing

    2009-11-01

    The unattended laser target designator (ULTD) was designed to partly take the place of conventional LTDs for accurate positioning and laser marking. After analyzing the precision, accuracy and error sources of the acoustic sensor array, the requirements on the laser generator, and the image analysis and tracking technology, the major system modules were determined. The target's class, velocity and position can be measured by the sensors, and a coded laser beam is then intelligently emitted to mark the optimal position at the optimal time. The conclusions show that the ULTD can not only avoid security threats, be deployed at scale, and support battle damage assessment (BDA), but is also well suited to information-based warfare.

  7. Bridge Structure Deformation Prediction Based on GNSS Data Using Kalman-ARIMA-GARCH Model

    PubMed Central

    Li, Xiaoqing; Wang, Yu

    2018-01-01

    Bridges are an essential part of the ground transportation system. Health monitoring is fundamentally important for the safety and service life of bridges. A large amount of structural information is obtained from various sensors using sensing technology, and the data processing has become a challenging issue. To improve the prediction accuracy of bridge structure deformation based on data mining and to accurately evaluate the time-varying characteristics of bridge structure performance evolution, this paper proposes a new method for bridge structure deformation prediction, which integrates the Kalman filter, autoregressive integrated moving average model (ARIMA), and generalized autoregressive conditional heteroskedasticity (GARCH). Firstly, the raw deformation data is directly pre-processed using the Kalman filter to reduce the noise. After that, the linear recursive ARIMA model is established to analyze and predict the structure deformation. Finally, the nonlinear recursive GARCH model is introduced to further improve the accuracy of the prediction. Simulation results based on measured sensor data from the Global Navigation Satellite System (GNSS) deformation monitoring system demonstrated that: (1) the Kalman filter is capable of denoising the bridge deformation monitoring data; (2) the prediction accuracy of the proposed Kalman-ARIMA-GARCH model is satisfactory, where the mean absolute error increases only from 3.402 mm to 5.847 mm with the increment of the prediction step; and (3) in comparison to the Kalman-ARIMA model, the Kalman-ARIMA-GARCH model results in superior prediction accuracy as it includes partial nonlinear characteristics (heteroscedasticity); the mean absolute error of five-step prediction using the proposed model is improved by 10.12%. This paper provides a new way for structural behavior prediction based on data processing, which can lay a foundation for the early warning of bridge health monitoring system based on sensor data using sensing
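
    A hedged sketch of the Kalman → ARIMA → GARCH pipeline is shown below on a synthetic deformation series: a scalar Kalman filter denoises the signal, ARIMA models the mean, and a GARCH(1,1) on the ARIMA residuals models the time-varying variance. It assumes the statsmodels and arch packages and is not the authors' implementation:

```python
# Kalman -> ARIMA -> GARCH sketch on a synthetic GNSS-like deformation series.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from arch import arch_model

rng = np.random.default_rng(0)
true_deform = np.cumsum(rng.normal(scale=0.2, size=500))        # slow drift, mm
observed = true_deform + rng.normal(scale=1.0, size=500)        # noisy readings

# 1) scalar Kalman filter (random-walk state; q = process var, r = measurement var)
q, r = 0.04, 1.0
x, p, denoised = 0.0, 1.0, []
for z in observed:
    p += q                                  # predict
    k = p / (p + r)                         # Kalman gain
    x += k * (z - x)                        # update
    p *= (1 - k)
    denoised.append(x)
denoised = np.asarray(denoised)

# 2) ARIMA on the denoised series for the mean prediction
arima = ARIMA(denoised, order=(1, 1, 1)).fit()
mean_forecast = arima.forecast(steps=5)

# 3) GARCH(1,1) on the ARIMA residuals for the conditional variance
garch = arch_model(arima.resid, vol="GARCH", p=1, q=1).fit(disp="off")
var_forecast = garch.forecast(horizon=5).variance.iloc[-1].values

print("5-step deformation forecast (mm):", np.round(mean_forecast, 3))
print("forecast residual variance:", np.round(var_forecast, 4))
```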

  8. Bridge Structure Deformation Prediction Based on GNSS Data Using Kalman-ARIMA-GARCH Model.

    PubMed

    Xin, Jingzhou; Zhou, Jianting; Yang, Simon X; Li, Xiaoqing; Wang, Yu

    2018-01-19

    Bridges are an essential part of the ground transportation system. Health monitoring is fundamentally important for the safety and service life of bridges. A large amount of structural information is obtained from various sensors using sensing technology, and the data processing has become a challenging issue. To improve the prediction accuracy of bridge structure deformation based on data mining and to accurately evaluate the time-varying characteristics of bridge structure performance evolution, this paper proposes a new method for bridge structure deformation prediction, which integrates the Kalman filter, autoregressive integrated moving average model (ARIMA), and generalized autoregressive conditional heteroskedasticity (GARCH). Firstly, the raw deformation data is directly pre-processed using the Kalman filter to reduce the noise. After that, the linear recursive ARIMA model is established to analyze and predict the structure deformation. Finally, the nonlinear recursive GARCH model is introduced to further improve the accuracy of the prediction. Simulation results based on measured sensor data from the Global Navigation Satellite System (GNSS) deformation monitoring system demonstrated that: (1) the Kalman filter is capable of denoising the bridge deformation monitoring data; (2) the prediction accuracy of the proposed Kalman-ARIMA-GARCH model is satisfactory, where the mean absolute error increases only from 3.402 mm to 5.847 mm with the increment of the prediction step; and (3) in comparison to the Kalman-ARIMA model, the Kalman-ARIMA-GARCH model results in superior prediction accuracy as it includes partial nonlinear characteristics (heteroscedasticity); the mean absolute error of five-step prediction using the proposed model is improved by 10.12%. This paper provides a new way for structural behavior prediction based on data processing, which can lay a foundation for the early warning of bridge health monitoring system based on sensor data using sensing

  9. Knotty: Efficient and Accurate Prediction of Complex RNA Pseudoknot Structures.

    PubMed

    Jabbari, Hosna; Wark, Ian; Montemagno, Carlo; Will, Sebastian

    2018-06-01

    The computational prediction of RNA secondary structure by free energy minimization has become an important tool in RNA research. However in practice, energy minimization is mostly limited to pseudoknot-free structures or rather simple pseudoknots, not covering many biologically important structures such as kissing hairpins. Algorithms capable of predicting sufficiently complex pseudoknots (for sequences of length n) used to have extreme complexities, e.g. Pknots (Rivas and Eddy, 1999) has O(n⁶) time and O(n⁴) space complexity. The algorithm CCJ (Chen et al., 2009) dramatically improves the asymptotic run time for predicting complex pseudoknots (handling almost all relevant pseudoknots, while being slightly less general than Pknots), but this came at the cost of large constant factors in space and time, which strongly limited its practical application (∼200 bases already require 256GB space). We present a CCJ-type algorithm, Knotty, that handles the same comprehensive pseudoknot class of structures as CCJ with improved space complexity of Θ(n³ + Z); due to the applied technique of sparsification, the number of "candidates", Z, appears to grow significantly slower than n⁴ on our benchmark set (which include pseudoknotted RNAs up to 400 nucleotides). In terms of run time over this benchmark, Knotty clearly outperforms Pknots and the original CCJ implementation, CCJ 1.0; Knotty's space consumption fundamentally improves over CCJ 1.0, being on a par with the space-economic Pknots. By comparing to CCJ 2.0, our unsparsified Knotty variant, we demonstrate the isolated effect of sparsification. Moreover, Knotty employs the state-of-the-art energy model of "HotKnots DP09", which results in superior prediction accuracy over Pknots. Our software is available at https://github.com/HosnaJabbari/Knotty. will@tbi.unvie.ac.at. Supplementary data are available at Bioinformatics online.

  10. Enhancing emotional-based target prediction

    NASA Astrophysics Data System (ADS)

    Gosnell, Michael; Woodley, Robert

    2008-04-01

    This work extends existing agent-based target movement prediction to include key ideas of behavioral inertia, steady states, and catastrophic change from existing psychological, sociological, and mathematical work. Existing target prediction work inherently assumes a single steady state for target behavior, and attempts to classify behavior based on a single emotional state set. The enhanced, emotional-based target prediction maintains up to three distinct steady states, or typical behaviors, based on a target's operating conditions and observed behaviors. Each steady state has an associated behavioral inertia, similar to the standard deviation of behaviors within that state. The enhanced prediction framework also allows steady state transitions through catastrophic change and individual steady states could be used in an offline analysis with additional modeling efforts to better predict anticipated target reactions.

  11. Variant effect prediction tools assessed using independent, functional assay-based datasets: implications for discovery and diagnostics.

    PubMed

    Mahmood, Khalid; Jung, Chol-Hee; Philip, Gayle; Georgeson, Peter; Chung, Jessica; Pope, Bernard J; Park, Daniel J

    2017-05-16

    Genetic variant effect prediction algorithms are used extensively in clinical genomics and research to determine the likely consequences of amino acid substitutions on protein function. It is vital that we better understand their accuracies and limitations because published performance metrics are confounded by serious problems of circularity and error propagation. Here, we derive three independent, functionally determined human mutation datasets, UniFun, BRCA1-DMS and TP53-TA, and employ them, alongside previously described datasets, to assess the pre-eminent variant effect prediction tools. Apparent accuracies of variant effect prediction tools were influenced significantly by the benchmarking dataset. Benchmarking with the assay-determined datasets UniFun and BRCA1-DMS yielded areas under the receiver operating characteristic curves in the modest ranges of 0.52 to 0.63 and 0.54 to 0.75, respectively, considerably lower than observed for other, potentially more conflicted datasets. These results raise concerns about how such algorithms should be employed, particularly in a clinical setting. Contemporary variant effect prediction tools are unlikely to be as accurate at the general prediction of functional impacts on proteins as reported prior. Use of functional assay-based datasets that avoid prior dependencies promises to be valuable for the ongoing development and accurate benchmarking of such tools.

  12. A knowledge-based potential with an accurate description of local interactions improves discrimination between native and near-native protein conformations.

    PubMed

    Ferrada, Evandro; Vergara, Ismael A; Melo, Francisco

    2007-01-01

    The correct discrimination between native and near-native protein conformations is essential for achieving accurate computer-based protein structure prediction. However, this has proven to be a difficult task, since currently available physical energy functions, empirical potentials and statistical scoring functions are still limited in achieving this goal consistently. In this work, we assess and compare the ability of different full atom knowledge-based potentials to discriminate between native protein structures and near-native protein conformations generated by comparative modeling. Using a benchmark of 152 near-native protein models and their corresponding native structures that encompass several different folds, we demonstrate that the incorporation of close non-bonded pairwise atom terms improves the discriminating power of the empirical potentials. Since the direct and unbiased derivation of close non-bonded terms from current experimental data is not possible, we obtained and used those terms from the corresponding pseudo-energy functions of a non-local knowledge-based potential. It is shown that this methodology significantly improves the discrimination between native and near-native protein conformations, suggesting that a proper description of close non-bonded terms is important to achieve a more complete and accurate description of native protein conformations. Some external knowledge-based energy functions that are widely used in model assessment performed poorly, indicating that the benchmark of models and the specific discrimination task tested in this work constitutes a difficult challenge.

  13. Sphinx: merging knowledge-based and ab initio approaches to improve protein loop prediction

    PubMed Central

    Marks, Claire; Nowak, Jaroslaw; Klostermann, Stefan; Georges, Guy; Dunbar, James; Shi, Jiye; Kelm, Sebastian

    2017-01-01

    Motivation: Loops are often vital for protein function, however, their irregular structures make them difficult to model accurately. Current loop modelling algorithms can mostly be divided into two categories: knowledge-based, where databases of fragments are searched to find suitable conformations and ab initio, where conformations are generated computationally. Existing knowledge-based methods only use fragments that are the same length as the target, even though loops of slightly different lengths may adopt similar conformations. Here, we present a novel method, Sphinx, which combines ab initio techniques with the potential extra structural information contained within loops of a different length to improve structure prediction. Results: We show that Sphinx is able to generate high-accuracy predictions and decoy sets enriched with near-native loop conformations, performing better than the ab initio algorithm on which it is based. In addition, it is able to provide predictions for every target, unlike some knowledge-based methods. Sphinx can be used successfully for the difficult problem of antibody H3 prediction, outperforming RosettaAntibody, one of the leading H3-specific ab initio methods, both in accuracy and speed. Availability and Implementation: Sphinx is available at http://opig.stats.ox.ac.uk/webapps/sphinx. Contact: deane@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28453681

  14. Sphinx: merging knowledge-based and ab initio approaches to improve protein loop prediction.

    PubMed

    Marks, Claire; Nowak, Jaroslaw; Klostermann, Stefan; Georges, Guy; Dunbar, James; Shi, Jiye; Kelm, Sebastian; Deane, Charlotte M

    2017-05-01

    Loops are often vital for protein function, however, their irregular structures make them difficult to model accurately. Current loop modelling algorithms can mostly be divided into two categories: knowledge-based, where databases of fragments are searched to find suitable conformations and ab initio, where conformations are generated computationally. Existing knowledge-based methods only use fragments that are the same length as the target, even though loops of slightly different lengths may adopt similar conformations. Here, we present a novel method, Sphinx, which combines ab initio techniques with the potential extra structural information contained within loops of a different length to improve structure prediction. We show that Sphinx is able to generate high-accuracy predictions and decoy sets enriched with near-native loop conformations, performing better than the ab initio algorithm on which it is based. In addition, it is able to provide predictions for every target, unlike some knowledge-based methods. Sphinx can be used successfully for the difficult problem of antibody H3 prediction, outperforming RosettaAntibody, one of the leading H3-specific ab initio methods, both in accuracy and speed. Sphinx is available at http://opig.stats.ox.ac.uk/webapps/sphinx. deane@stats.ox.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  15. Deformation, Failure, and Fatigue Life of SiC/Ti-15-3 Laminates Accurately Predicted by MAC/GMC

    NASA Technical Reports Server (NTRS)

    Bednarcyk, Brett A.; Arnold, Steven M.

    2002-01-01

    NASA Glenn Research Center's Micromechanics Analysis Code with Generalized Method of Cells (MAC/GMC) (ref.1) has been extended to enable fully coupled macro-micro deformation, failure, and fatigue life predictions for advanced metal matrix, ceramic matrix, and polymer matrix composites. Because of the multiaxial nature of the code's underlying micromechanics model, GMC--which allows the incorporation of complex local inelastic constitutive models--MAC/GMC finds its most important application in metal matrix composites, like the SiC/Ti-15-3 composite examined here. Furthermore, since GMC predicts the microscale fields within each constituent of the composite material, submodels for local effects such as fiber breakage, interfacial debonding, and matrix fatigue damage can and have been built into MAC/GMC. The present application of MAC/GMC highlights the combination of these features, which has enabled the accurate modeling of the deformation, failure, and life of titanium matrix composites.

  16. Next Day Building Load Predictions based on Limited Input Features Using an On-Line Laterally Primed Adaptive Resonance Theory Artificial Neural Network.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jones, Christian Birk; Robinson, Matt; Yasaei, Yasser

    Optimal integration of thermal energy storage within commercial building applications requires accurate load predictions. Several methods exist that provide an estimate of a building's future needs, including component-based models and data-driven algorithms. This work implemented a previously untested algorithm for this application, the Laterally Primed Adaptive Resonance Theory (LAPART) artificial neural network (ANN). The LAPART algorithm provided accurate results over a two-month period in which minimal historical data and a small number of input types were available. These results are significant, because common practice has often overlooked the implementation of ANNs, which have frequently been perceived as too complex and as requiring large amounts of data to provide accurate results. The LAPART neural network was implemented in an on-line learning manner, meaning the training data were continuously updated over time. For this experiment, training began with a single day and grew to two months of data. This approach provides a platform for immediate implementation that requires minimal time and effort. The results from the LAPART algorithm were compared with statistical regression and a component-based model. The comparison was based on the predictions' linear relationship with the measured data, the mean squared error, the mean bias error, and the cost savings achieved by the respective prediction techniques. The results show that the LAPART algorithm provided a reliable and cost-effective means to predict the building load for the next day.

  17. Spread prediction model of continuous steel tube based on BP neural network

    NASA Astrophysics Data System (ADS)

    Zhai, Jian-wei; Yu, Hui; Zou, Hai-bei; Wang, San-zhong; Liu, Li-gang

    2017-07-01

    Based on the roll pass geometry and the process parameters of a three-roller continuous mandrel rolling mill in a factory, a finite element model is established to simulate the continuous rolling of seamless steel tube, and the reliability of the finite element model is verified by comparing the simulated and actual rolling force, wall thickness and outer diameter of the tube. The effects of roller reduction, roller rotation speed and blooming temperature on the spread behavior are studied. Using BP (Back Propagation) neural network technology, a spread prediction model for the continuously rolled tube is established by training on the wall thickness coefficient and spread coefficient of the tube, enabling rapid and accurate prediction of the rolled tube dimensions.

  18. An Accurate Absorption-Based Net Primary Production Model for the Global Ocean

    NASA Astrophysics Data System (ADS)

    Silsbe, G.; Westberry, T. K.; Behrenfeld, M. J.; Halsey, K.; Milligan, A.

    2016-02-01

    Net primary production (NPP) is a vital living link in the global carbon cycle, and understanding how it varies through space, time, and across climatic oscillations (e.g. ENSO) is a key objective in oceanographic research. The continual improvement of ocean observing satellites and data analytics now presents greater opportunities for advanced understanding and characterization of the factors regulating NPP. In particular, the emergence of spectral inversion algorithms now permits accurate retrievals of the phytoplankton absorption coefficient (aΦ) from space. As NPP reflects the efficiency with which absorbed energy is converted into carbon biomass, aΦ measurements circumvent chlorophyll-based empirical approaches by permitting direct and accurate measurements of phytoplankton energy absorption. It has long been recognized, and perhaps underappreciated, that NPP and phytoplankton growth rates display muted variability when normalized to aΦ rather than chlorophyll. Here we present a novel absorption-based NPP model that parameterizes the underlying physiological mechanisms behind this muted variability, and apply this physiological model to the global ocean. Through a comparison against field data from the Hawaii and Bermuda Ocean Time Series, we demonstrate how this approach yields more accurate NPP measurements than other published NPP models. By normalizing NPP to satellite estimates of phytoplankton carbon biomass, this presentation also explores the seasonality of phytoplankton growth rates across several oceanic regions. Finally, we discuss how future advances in remote sensing (e.g. hyperspectral satellites, LIDAR, autonomous profilers) can be exploited to further improve absorption-based NPP models.

  19. Prediction of CO concentrations based on a hybrid Partial Least Square and Support Vector Machine model

    NASA Astrophysics Data System (ADS)

    Yeganeh, B.; Motlagh, M. Shafie Pour; Rashidi, Y.; Kamalan, H.

    2012-08-01

    Due to the health impacts caused by exposures to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become an important topic in atmospheric and environmental research today. The knowledge on the dynamics and complexity of air pollutant behavior has made artificial intelligence models a useful tool for more accurate pollutant concentration prediction. This paper focuses on an innovative method of daily air pollution prediction using a combination of a Support Vector Machine (SVM) as the predictor and Partial Least Squares (PLS) as a data selection tool, based on measured CO concentrations. The CO concentrations of the Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, have been used to test the effectiveness of this method. The hourly CO concentrations have been predicted using the SVM and the hybrid PLS-SVM models. Similarly, daily CO concentrations have been predicted based on the aforementioned four years of measured data. Results demonstrated that both models have good prediction ability; however, the hybrid PLS-SVM has better accuracy. In the analysis presented in this paper, statistical estimators including the relative mean error, root mean squared error and mean absolute relative error have been employed to compare the performances of the models. It was concluded that the errors decrease after size reduction and that the coefficients of determination increase from 56-81% for the SVM model to 65-85% for the hybrid PLS-SVM model. It was also found that the hybrid PLS-SVM model required less computational time than the SVM model, as expected, supporting the more accurate and faster prediction ability of the hybrid PLS-SVM model.
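
    The hybrid idea, PLS for dimensionality reduction followed by an SVM regressor on the latent components, can be sketched as follows on synthetic predictors standing in for the measured covariates:

```python
# Sketch of a hybrid PLS + SVM regressor: PLS compresses the predictor
# matrix and an SVR is trained on the latent components. Data are synthetic.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n_hours, n_vars = 2000, 25
X = rng.normal(size=(n_hours, n_vars))
co = X[:, :5].sum(axis=1) + 0.3 * rng.normal(size=n_hours)   # hypothetical CO (ppm)

X_tr, X_te, y_tr, y_te = train_test_split(X, co, random_state=0)

pls = PLSRegression(n_components=5).fit(X_tr, y_tr)          # data-selection / reduction step
Z_tr, Z_te = pls.transform(X_tr), pls.transform(X_te)

svr = SVR(C=10.0, epsilon=0.1).fit(Z_tr, y_tr)               # predictor on latent scores
print("hybrid PLS-SVM R^2:", round(svr.score(Z_te, y_te), 3))
```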

  20. Research of Coal Resources Reserves Prediction Based on GM (1, 1) Model

    NASA Astrophysics Data System (ADS)

    Xiao, Jiancheng

    2018-01-01

    To forecast China's coal reserves, this paper uses GM (1, 1) grey forecasting theory to establish a grey forecasting model based on data on China's coal reserves from 2002 to 2009, and obtains the trend of coal resource reserves under the current economic and social development situation; a residual test model is also established, making the prediction model more accurate. The results show that China's coal reserves can sustain production for at least 300 years. The results are similar to mainstream forecast results and are in line with objective reality.
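
    The GM(1,1) grey model is compact enough to write out in full: fit the development coefficient a and grey input b by least squares on the accumulated series, then extrapolate. The reserve figures below are made up for illustration:

```python
# Minimal NumPy implementation of the standard GM(1,1) grey model.
import numpy as np

def gm11_forecast(x0: np.ndarray, steps: int) -> np.ndarray:
    """Fit GM(1,1) to the series x0 and return `steps` future values."""
    x1 = np.cumsum(x0)                               # 1-AGO accumulated series
    z1 = 0.5 * (x1[1:] + x1[:-1])                    # background (mean) values
    B = np.column_stack([-z1, np.ones_like(z1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]  # x0(k) = -a*z1(k) + b
    k = np.arange(1, len(x0) + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a  # time response of the accumulated series
    x0_hat = np.diff(np.concatenate([[x0[0]], x1_hat]))  # back to the original series
    return x0_hat[-steps:]

reserves = np.array([114.5, 116.0, 118.9, 121.3, 123.0, 126.2, 128.8, 130.4])  # hypothetical units
print(gm11_forecast(reserves, steps=3))
```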

  1. Control surface hinge moment prediction using computational fluid dynamics

    NASA Astrophysics Data System (ADS)

    Simpson, Christopher David

    The following research determines the feasibility of predicting control surface hinge moments using various computational methods. A detailed analysis is conducted using a 2D GA(W)-1 airfoil with a 20% plain flap. Simple hinge moment prediction methods are tested, including empirical Datcom relations and XFOIL. Steady-state and time-accurate turbulent, viscous, Navier-Stokes solutions are computed using Fun3D. Hinge moment coefficients are computed. Mesh construction techniques are discussed. An adjoint-based mesh adaptation case is also evaluated. An NACA 0012 45-degree swept horizontal stabilizer with a 25% elevator is also evaluated using Fun3D. Results are compared with experimental wind-tunnel data obtained from references. Finally, the costs of various solution methods are estimated. Results indicate that while a steady-state Navier-Stokes solution can accurately predict control surface hinge moments for small angles of attack and deflection angles, a time-accurate solution is necessary to accurately predict hinge moments in the presence of flow separation. The ability to capture the unsteady vortex shedding behavior present in moderate to large control surface deflections is found to be critical to hinge moment prediction accuracy. Adjoint-based mesh adaptation is shown to give hinge moment predictions similar to a globally-refined mesh for a steady-state 2D simulation.

  2. Geometry-based pressure drop prediction in mildly diseased human coronary arteries.

    PubMed

    Schrauwen, J T C; Wentzel, J J; van der Steen, A F W; Gijsen, F J H

    2014-06-03

    Pressure drop (Δp) estimations in human coronary arteries have several important applications, including determination of appropriate boundary conditions for CFD and estimation of fractional flow reserve (FFR). In this study a Δp prediction was made based on geometrical features derived from patient-specific imaging data. Twenty-two mildly diseased human coronary arteries were imaged with computed tomography and intravascular ultrasound. Each artery was modelled in three consecutive steps: from straight to tapered, to stenosed, to curved model. CFD was performed to compute the additional Δp in each model under steady flow for a wide range of Reynolds numbers. The correlations between the added geometrical complexity and additional Δp were used to compute a predicted Δp. This predicted Δp based on geometry was compared to CFD results. The mean Δp calculated with CFD was 855±666 Pa. Tapering and curvature added significantly to the total Δp, accounting for 31.4±19.0% and 18.0±10.9% respectively at Re=250. Using tapering angle, maximum area stenosis and angularity of the centerline, we were able to generate a good estimate for the predicted Δp with a low mean but high standard deviation: average error of 41.1±287.8 Pa at Re=250. Furthermore, the predicted Δp was used to accurately estimate FFR (r=0.93). The effect of the geometric features was determined and the pressure drop in mildly diseased human coronary arteries was predicted quickly based solely on geometry. This pressure drop estimation could serve as a boundary condition in CFD to model the impact of distal epicardial vessels. Copyright © 2014 Elsevier Ltd. All rights reserved.
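
    Turning a predicted pressure drop into an FFR estimate is a one-line calculation under the usual assumption FFR ≈ (Pa − Δp)/Pa, with Pa the mean aortic pressure; the numbers below are hypothetical, not from the study:

```python
# Simple illustration of converting a predicted pressure drop into an FFR estimate.
def estimate_ffr(delta_p_pa: float, aortic_pressure_mmhg: float = 90.0) -> float:
    """FFR estimate from a predicted pressure drop (Pa) and mean aortic pressure (mmHg)."""
    pa_in_pascal = aortic_pressure_mmhg * 133.322          # 1 mmHg = 133.322 Pa
    return (pa_in_pascal - delta_p_pa) / pa_in_pascal

print(round(estimate_ffr(delta_p_pa=855.0), 3))            # mean CFD drop from the abstract
```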

  3. Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy

    PubMed Central

    Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing

    2016-01-01

    Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have potential therapies for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthew’s Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc. PMID:27662651
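
    An averaged ensemble in the spirit of the RF + SMO + NNA + J48 combination can be sketched with scikit-learn stand-ins (SVC for SMO, k-nearest neighbours for NNA, a decision tree for J48); the features below are random placeholders for the hybrid SSI/PSSM/RSA/CTD descriptors:

```python
# Soft-voting ensemble sketch: average the class probabilities of several
# heterogeneous base classifiers, as a stand-in for the RF/SMO/NNA/J48 mix.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
X = rng.normal(size=(600, 60))                       # hypothetical hybrid feature vectors
y = (X[:, :3].sum(axis=1) + rng.normal(scale=0.8, size=600) > 0).astype(int)

ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                ("svm", SVC(probability=True, random_state=0)),
                ("knn", KNeighborsClassifier(n_neighbors=5)),
                ("tree", DecisionTreeClassifier(random_state=0))],
    voting="soft",                                   # average the predicted probabilities
)
print("5-fold CV accuracy:", round(float(cross_val_score(ensemble, X, y, cv=5).mean()), 3))
```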

  4. Researches of fruit quality prediction model based on near infrared spectrum

    NASA Astrophysics Data System (ADS)

    Shen, Yulin; Li, Lian

    2018-04-01

    With rising standards for food quality and safety, more attention is being paid to the internal quality of fruit, so its measurement is increasingly important. Nondestructive analysis of soluble solid content (SSC) and total acid content (TAC) is a vital and effective means of quality measurement in global fresh produce markets, and in this paper we aim to establish a novel near-infrared-spectrum prediction model of fruit internal quality based on SSC and TAC. First, prediction models based on PCA + BP neural network, PCA + GRNN network, PCA + BP AdaBoost strong classifier, PCA + ELM and PCA + LS_SVM classifier are designed and implemented. Then, in the NSCT domain, the median filter and the Savitzky-Golay filter are used to preprocess the spectral signal, and the Kennard-Stone algorithm is used to automatically select the training and test samples. Third, we obtain the optimal models by comparing 15 kinds of prediction model under a multi-classifier competition mechanism; specifically, nonparametric estimation is introduced to measure the effectiveness of each model, with the reliability and variance of the nonparametric evaluation used to assess the prediction result and the estimated value and confidence interval serving as a reference. The experimental results demonstrate that this approach achieves a sound evaluation of the internal quality of fruit. Finally, we employ cat swarm optimization to optimize the two best models obtained from the nonparametric estimation; empirical testing indicates that the proposed method provides more accurate and effective results than other forecasting methods.
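
    As a concrete illustration of one of the model families compared above, the sketch below chains Savitzky-Golay smoothing, PCA and a small back-propagation-style network to regress SSC from NIR spectra. It is a minimal sketch under assumed data shapes; `spectra` and `ssc` are dummy placeholders rather than the paper's data, and the NSCT step and cat swarm optimization are omitted.

    ```python
    # Minimal PCA + neural-network pipeline for NIR-based SSC prediction (sketch).
    import numpy as np
    from scipy.signal import savgol_filter
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import train_test_split

    spectra = np.random.rand(120, 256)      # 120 fruit samples x 256 wavelengths (dummy)
    ssc = 8 + 4 * np.random.rand(120)       # dummy soluble solid content values

    smoothed = savgol_filter(spectra, window_length=11, polyorder=2, axis=1)

    model = make_pipeline(
        PCA(n_components=10),
        MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0),
    )

    X_tr, X_te, y_tr, y_te = train_test_split(smoothed, ssc, random_state=0)
    model.fit(X_tr, y_tr)
    rmse = np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2))
    print("RMSE on dummy data:", rmse)
    ```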

  5. DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing.

    PubMed

    Vidaki, Athina; Ballard, David; Aliferi, Anastasia; Miller, Thomas H; Barron, Leon P; Syndercombe Court, Denise

    2017-05-01

    The ability to estimate the age of the donor from recovered biological material at a crime scene can be of substantial value in forensic investigations. Aging can be complex and is associated with various molecular modifications in cells that accumulate over a person's lifetime, including epigenetic patterns. The aim of this study was to use age-specific DNA methylation patterns to generate an accurate model for the prediction of chronological age using data from whole blood. In total, 45 age-associated CpG sites were selected based on their reported age coefficients in a previous extensive study and investigated using publicly available methylation data obtained from 1156 whole blood samples (aged 2-90 years) analysed with Illumina's genome-wide methylation platforms (27K/450K). Applying stepwise regression for variable selection, 23 of these CpG sites were identified that could significantly contribute to age prediction modelling, and multiple regression analysis carried out with these markers provided an accurate prediction of age (R² = 0.92, mean absolute error (MAE) = 4.6 years). However, applying machine learning, and more specifically a generalised regression neural network model, the age prediction significantly improved (R² = 0.96), with a MAE of 3.3 years for the training set and 4.4 years for a blind test set of 231 cases. The machine learning approach used 16 CpG sites, located in 16 different genomic regions, with the top 3 predictors of age belonging to the genes NHLRC1, SCGN and CSNK1D. The proposed model was further tested using independent cohorts of 53 monozygotic twins (MAE = 7.1 years) and a cohort of 1011 disease state individuals (MAE = 7.2 years). Furthermore, we highlighted the age markers' potential applicability in samples other than blood by predicting age with similar accuracy in 265 saliva samples (R² = 0.96), with a MAE of 3.2 years (training set) and 4.0 years (blind test). In an attempt to create a sensitive and accurate age prediction test, a next
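
    The regression stage described above is straightforward to reproduce in outline. The sketch below fits a multiple linear regression of age on methylation beta values at a handful of CpG sites and reports the mean absolute error; the arrays are random placeholders, and the published generalised regression neural network is replaced here by ordinary least squares purely for illustration.

    ```python
    # Hedged sketch: multiple regression of chronological age on CpG beta values.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_absolute_error

    n_samples, n_cpgs = 1000, 23                # e.g. 23 age-associated CpG sites
    betas = np.random.rand(n_samples, n_cpgs)   # dummy beta values in [0, 1]
    age = np.random.uniform(2, 90, n_samples)   # dummy chronological ages

    X_tr, X_te, y_tr, y_te = train_test_split(betas, age, test_size=0.2, random_state=0)
    model = LinearRegression().fit(X_tr, y_tr)

    print("R^2:", model.score(X_te, y_te))
    print("MAE:", mean_absolute_error(y_te, model.predict(X_te)))
    ```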

  6. Variability in the Propagation Phase of CFD-Based Noise Prediction: Summary of Results From Category 8 of the BANC-III Workshop

    NASA Technical Reports Server (NTRS)

    Lopes, Leonard; Redonnet, Stephane; Imamura, Taro; Ikeda, Tomoaki; Zawodny, Nikolas; Cunha, Guilherme

    2015-01-01

    The usage of Computational Fluid Dynamics (CFD) in noise prediction has typically been a two-part process: accurately predicting the flow conditions in the near-field and then propagating the noise from the near-field to the observer. Due to the increase in computing power and the cost benefit when weighed against wind tunnel testing, the usage of CFD to estimate the local flow field of complex geometrical structures has become more routine. Recently, the Benchmark problems in Airframe Noise Computation (BANC) workshops have provided a community focus on accurately simulating the local flow field near the body with various CFD approaches. However, to date, little effort has been devoted to assessing the impact of the propagation phase of noise prediction. This paper includes results from the BANC-III workshop which explores variability in the propagation phase of CFD-based noise prediction. This includes two test cases: an analytical solution of a quadrupole source near a sphere and a computational solution around a nose landing gear. Agreement between three codes was very good for the analytic test case, but CFD-based noise predictions indicate that the propagation phase can introduce 3 dB or more of variability in noise predictions.

  7. A Stationary Wavelet Entropy-Based Clustering Approach Accurately Predicts Gene Expression

    PubMed Central

    Nguyen, Nha; Vo, An; Choi, Inchan

    2015-01-01

    Studying epigenetic landscapes is important to understand the conditions for gene regulation. Clustering is a useful approach to study epigenetic landscapes by grouping genes based on their epigenetic conditions. However, classical clustering approaches that often use a representative value of the signals in a fixed-sized window do not fully use the information written in the epigenetic landscapes. Clustering approaches that maximize the information of the epigenetic signals are necessary for better understanding gene regulatory environments. For effective clustering of multidimensional epigenetic signals, we developed a method called Dewer, which uses the entropy of the stationary wavelet transform of epigenetic signals inside enriched regions for gene clustering. Interestingly, the gene expression levels were highly correlated with the entropy levels of epigenetic signals. Dewer separates genes better than a window-based approach in an assessment using gene expression and achieved a correlation coefficient above 0.9 without using any training procedure. Our results show that the changes of the epigenetic signals are useful to study gene regulation. PMID:25383910
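
    The core quantity, a wavelet-domain entropy, can be sketched as follows. This is a rough Python illustration of the idea, not the published Dewer implementation: it computes the Shannon entropy of the detail coefficients of a stationary wavelet transform of a placeholder coverage-like signal (PyWavelets' `swt`, with an assumed wavelet and decomposition level).

    ```python
    # Entropy of stationary-wavelet-transform coefficients (illustrative sketch).
    import numpy as np
    import pywt

    def swt_entropy(signal, wavelet="db4", level=3):
        """Shannon entropy of the normalised energy of SWT detail coefficients."""
        coeffs = pywt.swt(signal, wavelet, level=level)   # [(cA, cD), ...] per level
        details = np.concatenate([cD for _, cD in coeffs])
        energy = details ** 2
        p = energy / energy.sum()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    signal = np.abs(np.random.randn(256))   # dummy epigenetic signal, length 2**k
    print("stationary wavelet entropy:", swt_entropy(signal))
    ```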

  8. Prediction-based dynamic load-sharing heuristics

    NASA Technical Reports Server (NTRS)

    Goswami, Kumar K.; Devarakonda, Murthy; Iyer, Ravishankar K.

    1993-01-01

    The authors present dynamic load-sharing heuristics that use predicted resource requirements of processes to manage workloads in a distributed system. A previously developed statistical pattern-recognition method is employed for resource prediction. While non-prediction-based heuristics depend on a rapidly changing system status, the new heuristics depend on slowly changing program resource usage patterns. Furthermore, prediction-based heuristics can be more effective since they use future requirements rather than just the current system state. Four prediction-based heuristics, two centralized and two distributed, are presented. Using trace-driven simulations, they are compared against random scheduling and two effective non-prediction-based heuristics. Results show that the prediction-based centralized heuristics achieve up to 30 percent better response times than the non-prediction centralized heuristic, and that the prediction-based distributed heuristics achieve up to 50 percent improvements relative to their non-prediction counterpart.
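
    The basic idea can be sketched in a few lines. The toy scheduler below is not the paper's algorithm: it predicts a program's demand from the mean of its past runs (a crude stand-in for the statistical pattern-recognition predictor) and assigns the process to the node whose load after placement would be lowest.

    ```python
    # Toy prediction-based load-sharing heuristic (illustrative sketch).
    from collections import defaultdict
    from statistics import mean

    history = defaultdict(list)                  # program name -> observed demands
    node_load = {"node-A": 0.0, "node-B": 0.0, "node-C": 0.0}

    def predict_demand(program, default=1.0):
        return mean(history[program]) if history[program] else default

    def schedule(program):
        demand = predict_demand(program)
        target = min(node_load, key=lambda n: node_load[n] + demand)
        node_load[target] += demand
        return target

    def finish(program, node, observed_demand):
        node_load[node] -= observed_demand
        history[program].append(observed_demand)  # refine future predictions

    node = schedule("image-render")
    finish("image-render", node, observed_demand=2.4)
    print(schedule("image-render"), node_load)
    ```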

  9. Do Skilled Elementary Teachers Hold Scientific Conceptions and Can They Accurately Predict the Type and Source of Students' Preconceptions of Electric Circuits?

    ERIC Educational Resources Information Center

    Lin, Jing-Wen

    2016-01-01

    Holding scientific conceptions and having the ability to accurately predict students' preconceptions are a prerequisite for science teachers to design appropriate constructivist-oriented learning experiences. This study explored the types and sources of students' preconceptions of electric circuits. First, 438 grade 3 (9 years old) students were…

  10. PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions

    PubMed Central

    Brezovský, Jan

    2016-01-01

    An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools’ predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations

  11. PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions.

    PubMed

    Bendl, Jaroslav; Musil, Miloš; Štourač, Jan; Zendulka, Jaroslav; Damborský, Jiří; Brezovský, Jan

    2016-05-01

    An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools' predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. To

  12. Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties

    NASA Astrophysics Data System (ADS)

    Xie, Tian; Grossman, Jeffrey C.

    2018-04-01

    The use of machine learning methods for accelerating the design of crystalline materials usually requires manually constructed feature vectors or complex transformation of atom coordinates to input the crystal structure, which either constrains the model to certain crystal types or makes it difficult to provide chemical insights. Here, we develop a crystal graph convolutional neural networks framework to directly learn material properties from the connection of atoms in the crystal, providing a universal and interpretable representation of crystalline materials. Our method provides a highly accurate prediction of density functional theory calculated properties for eight different properties of crystals with various structure types and compositions after being trained with 10⁴ data points. Further, our framework is interpretable because one can extract the contributions from local chemical environments to global properties. Using an example of perovskites, we show how this information can be utilized to discover empirical rules for materials design.

  13. Accurate phylogenetic classification of DNA fragments based on sequence composition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis

    2006-05-01

    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
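
    The composition features that drive such a classifier are simple to construct. The sketch below is in the spirit of the approach (not PhyloPythia itself): each DNA fragment is represented by its normalised k-mer frequencies and a linear SVM is trained on fragments of known origin; the sequences and clade labels are random placeholders.

    ```python
    # k-mer composition profiles + linear SVM for fragment classification (sketch).
    from itertools import product
    import numpy as np
    from sklearn.svm import LinearSVC

    K = 4
    KMERS = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}

    def kmer_profile(seq):
        counts = np.zeros(len(KMERS))
        for i in range(len(seq) - K + 1):
            idx = KMERS.get(seq[i:i + K])
            if idx is not None:                 # skip k-mers with ambiguous bases
                counts[idx] += 1
        total = counts.sum()
        return counts / total if total else counts

    rng = np.random.default_rng(0)
    fragments = ["".join(rng.choice(list("ACGT"), 1000)) for _ in range(40)]
    labels = ["CladeA"] * 20 + ["CladeB"] * 20   # dummy taxonomic labels

    X = np.array([kmer_profile(f) for f in fragments])
    clf = LinearSVC().fit(X, labels)
    print(clf.predict([kmer_profile(fragments[0])]))
    ```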

  14. A Predictive Model of Anesthesia Depth Based on SVM in the Primary Visual Cortex

    PubMed Central

    Shi, Li; Li, Xiaoyuan; Wan, Hong

    2013-01-01

    In this paper, a novel model for predicting anesthesia depth is put forward based on local field potentials (LFPs) in the primary visual cortex (V1 area) of rats. The model is constructed using a Support Vector Machine (SVM) to realize online prediction and classification of anesthesia depth. The raw LFP signal was first decomposed into special scaling components by wavelet transform; among these components, those containing higher-frequency information were well suited for a more precise analysis of anesthetic depth. Secondly, the characteristics of anesthetized states were extracted by complexity analysis. In addition, two frequency-domain parameters were selected. The above extracted features were used as the input vector of the predicting model. Finally, we collected the anesthesia samples from the LFP recordings under the visual stimulus experiments of Long Evans rats. Our results indicate that the predictive model is accurate and computationally fast, and that it is also well suited for online prediction. PMID:24044024

  15. Accurate X-Ray Spectral Predictions: An Advanced Self-Consistent-Field Approach Inspired by Many-Body Perturbation Theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liang, Yufeng; Vinson, John; Pemmaraju, Sri

    Constrained-occupancy delta-self-consistent-field (ΔSCF) methods and many-body perturbation theories (MBPT) are two strategies for obtaining electronic excitations from first principles. Using the two distinct approaches, we study the O 1s core excitations that have become increasingly important for characterizing transition-metal oxides and understanding strong electronic correlation. The ΔSCF approach, in its current single-particle form, systematically underestimates the pre-edge intensity for chosen oxides, despite its success in weakly correlated systems. By contrast, the Bethe-Salpeter equation within MBPT predicts much better line shapes. This motivates one to reexamine the many-electron dynamics of x-ray excitations. We find that the single-particle ΔSCF approach can be rectified by explicitly calculating many-electron transition amplitudes, producing x-ray spectra in excellent agreement with experiments. This study paves the way to accurately predict x-ray near-edge spectral fingerprints for physics and materials science beyond the Bethe-Salpeter equation.

  16. Accurate X-Ray Spectral Predictions: An Advanced Self-Consistent-Field Approach Inspired by Many-Body Perturbation Theory

    DOE PAGES

    Liang, Yufeng; Vinson, John; Pemmaraju, Sri; ...

    2017-03-03

    Constrained-occupancy delta-self-consistent-field (ΔSCF) methods and many-body perturbation theories (MBPT) are two strategies for obtaining electronic excitations from first principles. Using the two distinct approaches, we study the O 1s core excitations that have become increasingly important for characterizing transition-metal oxides and understanding strong electronic correlation. The ΔSCF approach, in its current single-particle form, systematically underestimates the pre-edge intensity for chosen oxides, despite its success in weakly correlated systems. By contrast, the Bethe-Salpeter equation within MBPT predicts much better line shapes. This motivates one to reexamine the many-electron dynamics of x-ray excitations. We find that the single-particle ΔSCF approach can be rectified by explicitly calculating many-electron transition amplitudes, producing x-ray spectra in excellent agreement with experiments. This study paves the way to accurately predict x-ray near-edge spectral fingerprints for physics and materials science beyond the Bethe-Salpeter equation.

  17. Accurate X-Ray Spectral Predictions: An Advanced Self-Consistent-Field Approach Inspired by Many-Body Perturbation Theory.

    PubMed

    Liang, Yufeng; Vinson, John; Pemmaraju, Sri; Drisdell, Walter S; Shirley, Eric L; Prendergast, David

    2017-03-03

    Constrained-occupancy delta-self-consistent-field (ΔSCF) methods and many-body perturbation theories (MBPT) are two strategies for obtaining electronic excitations from first principles. Using the two distinct approaches, we study the O 1s core excitations that have become increasingly important for characterizing transition-metal oxides and understanding strong electronic correlation. The ΔSCF approach, in its current single-particle form, systematically underestimates the pre-edge intensity for chosen oxides, despite its success in weakly correlated systems. By contrast, the Bethe-Salpeter equation within MBPT predicts much better line shapes. This motivates one to reexamine the many-electron dynamics of x-ray excitations. We find that the single-particle ΔSCF approach can be rectified by explicitly calculating many-electron transition amplitudes, producing x-ray spectra in excellent agreement with experiments. This study paves the way to accurately predict x-ray near-edge spectral fingerprints for physics and materials science beyond the Bethe-Salpeter equation.

  18. Accurate electrostatic and van der Waals pull-in prediction for fully clamped nano/micro-beams using linear universal graphs of pull-in instability

    NASA Astrophysics Data System (ADS)

    Tahani, Masoud; Askari, Amir R.

    2014-09-01

    In spite of the fact that pull-in instability of electrically actuated nano/micro-beams has been investigated by many researchers to date, no explicit formula has been presented yet which can predict pull-in voltage based on a geometrically non-linear and distributed parameter model. The objective of the present paper is to introduce a simple and accurate formula to predict this value for a fully clamped electrostatically actuated nano/micro-beam. To this end, a non-linear Euler-Bernoulli beam model is employed, which accounts for the axial residual stress, geometric non-linearity of mid-plane stretching, distributed electrostatic force and the van der Waals (vdW) attraction. The non-linear boundary value governing equation of equilibrium is non-dimensionalized and solved iteratively through a single-term Galerkin based reduced order model (ROM). The solutions are validated through direct comparison with experimental and other existing results reported in previous studies. Pull-in instability under electrical and vdW loads is also investigated using universal graphs. Based on the results of these graphs, the non-dimensional pull-in and vdW parameters, which are defined in the text, vary linearly versus the other dimensionless parameters of the problem. Using this fact, some linear equations are presented to predict pull-in voltage, the maximum allowable length, the so-called detachment length, and the minimum allowable gap for a nano/micro-system. These linear equations are also reduced to a couple of universal pull-in formulas for systems with small initial gap. The accuracy of the universal pull-in formulas is also validated by comparing their results with available experimental and some previous geometrically linear and closed-form findings published in the literature.

  19. New support vector machine-based method for microRNA target prediction.

    PubMed

    Li, L; Gao, Q; Mao, X; Cao, Y

    2014-06-09

    MicroRNA (miRNA) plays important roles in cell differentiation, proliferation, growth, mobility, and apoptosis. An accurate list of precise target genes is necessary in order to fully understand the importance of miRNAs in animal development and disease. Several computational methods have been proposed for miRNA target-gene identification. However, these methods still have limitations with respect to their sensitivity and accuracy. Thus, we developed a new miRNA target-prediction method based on the support vector machine (SVM) model. The model supplies information of two binding sites (primary and secondary) for a radial basis function kernel as a similarity measure for SVM features. The information is categorized based on structural, thermodynamic, and sequence conservation. Using high-confidence datasets selected from public miRNA target databases, we obtained a human miRNA target SVM classifier model with high performance and provided an efficient tool for human miRNA target gene identification. Experiments have shown that our method is a reliable tool for miRNA target-gene prediction, and a successful application of an SVM classifier. Compared with other methods, the method proposed here improves the sensitivity and accuracy of miRNA prediction. Its performance can be further improved by providing more training examples.

  20. Robust High-Resolution Cloth Using Parallelism, History-Based Collisions and Accurate Friction

    PubMed Central

    Selle, Andrew; Su, Jonathan; Irving, Geoffrey; Fedkiw, Ronald

    2015-01-01

    In this paper we simulate high resolution cloth consisting of up to 2 million triangles which allows us to achieve highly detailed folds and wrinkles. Since the level of detail is also influenced by object collision and self collision, we propose a more accurate model for cloth-object friction. We also propose a robust history-based repulsion/collision framework where repulsions are treated accurately and efficiently on a per time step basis. Distributed memory parallelism is used for both time evolution and collisions and we specifically address Gauss-Seidel ordering of repulsion/collision response. This algorithm is demonstrated by several high-resolution and high-fidelity simulations. PMID:19147895

  1. Does resident ranking during recruitment accurately predict subsequent performance as a surgical resident?

    PubMed

    Fryer, Jonathan P; Corcoran, Noreen; George, Brian; Wang, Ed; Darosa, Debra

    2012-01-01

    While the primary goal of ranking applicants for surgical residency training positions is to identify the candidates who will subsequently perform best as surgical residents, the effectiveness of the ranking process has not been adequately studied. We evaluated our general surgery resident recruitment process between 2001 and 2011 inclusive, to determine if our recruitment ranking parameters effectively predicted subsequent resident performance. We identified 3 candidate ranking parameters (United States Medical Licensing Examination [USMLE] Step 1 score, unadjusted ranking score [URS], and final adjusted ranking [FAR]), and 4 resident performance parameters (American Board of Surgery In-Training Examination [ABSITE] score, PGY1 resident evaluation grade [REG], overall REG, and independent faculty rating ranking [IFRR]), and assessed whether the former were predictive of the latter. Analyses utilized the Spearman correlation coefficient. We found that the URS, which is based on objective and criterion-based parameters, was a better predictor of subsequent performance than the FAR, which is a modification of the URS based on subsequent determinations of the resident selection committee. USMLE score was a reliable predictor of ABSITE scores only. However, when we compared our worst resident performances with the performances of the other residents in this evaluation, the data did not produce convincing evidence that poor resident performances could be reliably predicted by any of the recruitment ranking parameters. Finally, stratifying candidates based on their rank range did not effectively define a ranking cut-off beyond which resident performance would drop off. Based on these findings, we recommend that surgery programs may be better served by utilizing a more structured resident ranking process and that subsequent adjustments to the rank list generated by this process should be undertaken with caution. Copyright © 2012 Association of Program Directors in Surgery
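
    The statistical step is a standard rank correlation. As a minimal illustration (with made-up numbers, not the study's data), the sketch below computes the Spearman correlation between one ranking parameter and one performance parameter.

    ```python
    # Spearman rank correlation between a ranking and a performance parameter.
    from scipy.stats import spearmanr

    usmle_step1 = [238, 221, 245, 230, 252, 226, 241, 235, 219, 248]  # dummy scores
    absite_pct  = [72, 55, 81, 60, 88, 58, 70, 66, 49, 83]            # dummy percentiles

    rho, p_value = spearmanr(usmle_step1, absite_pct)
    print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
    ```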

  2. Prediction of redox-sensitive cysteines using sequential distance and other sequence-based features.

    PubMed

    Sun, Ming-An; Zhang, Qing; Wang, Yejun; Ge, Wei; Guo, Dianjing

    2016-08-24

    Reactive oxygen species can modify the structure and function of proteins and may also act as important signaling molecules in various cellular processes. Cysteine thiol groups of proteins are particularly susceptible to oxidation. Meanwhile, their reversible oxidation plays critical roles in redox regulation and signaling. Recently, several computational tools have been developed for predicting redox-sensitive cysteines; however, those methods either only focus on catalytic redox-sensitive cysteines in thiol oxidoreductases, or heavily depend on protein structural data, and thus cannot be widely used. In this study, we analyzed various sequence-based features potentially related to cysteine redox-sensitivity, and identified three types of features for efficient computational prediction of redox-sensitive cysteines. These features are: sequential distance to the nearby cysteines, PSSM profile and predicted secondary structure of flanking residues. After further feature selection using SVM-RFE, we developed the Redox-Sensitive Cysteine Predictor (RSCP), an SVM-based classifier for redox-sensitive cysteine prediction using primary sequence only. Using 10-fold cross-validation on the RSC758 dataset, the accuracy, sensitivity, specificity, MCC and AUC were estimated as 0.679, 0.602, 0.756, 0.362 and 0.727, respectively. When evaluated using 10-fold cross-validation with the BALOSCTdb dataset, which has structure information, the model achieved performance comparable to the current structure-based method. Further validation using an independent dataset indicates it is robust and relatively more accurate for predicting redox-sensitive cysteines from non-enzyme proteins. In this study, we developed a sequence-based classifier for predicting redox-sensitive cysteines. The major advantage of this method is that it does not rely on protein structure data, which ensures more extensive application compared to other current implementations. Accurate prediction of redox-sensitive cysteines not

  3. HIV-1 protease cleavage site prediction based on two-stage feature selection method.

    PubMed

    Niu, Bing; Yuan, Xiao-Cheng; Roeper, Preston; Su, Qiang; Peng, Chun-Rong; Yin, Jing-Yuan; Ding, Juan; Li, HaiPeng; Lu, Wen-Cong

    2013-03-01

    Knowledge of the mechanism of HIV protease cleavage specificity is critical to the design of specific and effective HIV inhibitors. Searching for an accurate, robust, and rapid method to correctly predict the cleavage sites in proteins is crucial when searching for possible HIV inhibitors. In this article, HIV-1 protease specificity was studied using the correlation-based feature subset (CfsSubset) selection method combined with a genetic algorithm. Thirty important biochemical features were found based on a jackknife test from the original data set containing 4,248 features. By using the AdaBoost method with the thirty selected features, the prediction model yields an accuracy of 96.7% for the jackknife test and 92.1% for an independent set test, an increase in accuracy over the original dataset of 6.7% and 77.4%, respectively. Our feature selection scheme could be a useful technique for finding effective competitive inhibitors of HIV protease.

  4. Aggregation Trade Offs in Family Based Recommendations

    NASA Astrophysics Data System (ADS)

    Berkovsky, Shlomo; Freyne, Jill; Coombe, Mac

    Personalized information access tools are frequently based on collaborative filtering recommendation algorithms. Collaborative filtering recommender systems typically suffer from a data sparsity problem, where systems do not have sufficient user data to generate accurate and reliable predictions. Prior research suggested using group-based user data in the collaborative filtering recommendation process to generate group-based predictions and partially resolve the sparsity problem. Although group recommendations are less accurate than personalized recommendations, they are more accurate than general non-personalized recommendations, which are the natural fall back when personalized recommendations cannot be generated. In this work we present initial results of a study that exploits the browsing logs of real families of users gathered in an eHealth portal. The browsing logs allowed us to experimentally compare the accuracy of two group-based recommendation strategies: aggregated group models and aggregated predictions. Our results showed that aggregating individual models into group models resulted in more accurate predictions than aggregating individual predictions into group predictions.
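
    The two aggregation strategies are easy to contrast in code. The toy example below (random small matrices, not the eHealth browsing logs) uses a tiny user-based collaborative filter and compares predicting once from a merged family profile against averaging the predictions made separately for each member.

    ```python
    # Aggregated group models vs aggregated predictions (illustrative sketch).
    import numpy as np

    def predict(profile, ratings, item):
        """Tiny user-based CF: weight community ratings of `item` by cosine
        similarity to `profile` (NaN marks unrated items)."""
        preds, weights = [], []
        for other in ratings:
            if np.isnan(other[item]):
                continue
            mask = ~np.isnan(profile) & ~np.isnan(other)
            if not mask.any():
                continue
            sim = np.dot(profile[mask], other[mask]) / (
                np.linalg.norm(profile[mask]) * np.linalg.norm(other[mask]) + 1e-9)
            preds.append(sim * other[item])
            weights.append(abs(sim))
        return sum(preds) / sum(weights) if weights else np.nan

    nan = np.nan
    community = np.array([[5, 3, 4, 2], [4, 2, 5, 1], [1, 5, 2, 4.0]])
    family = np.array([[5, nan, 4, nan], [4, 1, nan, nan]])   # two family members

    # Strategy 1: aggregate the members' models into one group profile, predict once.
    group_model = np.array([np.nanmean(c) if np.any(~np.isnan(c)) else nan
                            for c in family.T])
    p_model = predict(group_model, community, item=3)

    # Strategy 2: predict for each member separately, then aggregate the predictions.
    p_preds = np.mean([predict(m, community, item=3) for m in family])

    print(f"aggregated models: {p_model:.2f}   aggregated predictions: {p_preds:.2f}")
    ```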

  5. Prediction of Protein-Protein Interaction Sites with Machine-Learning-Based Data-Cleaning and Post-Filtering Procedures.

    PubMed

    Liu, Guang-Hui; Shen, Hong-Bin; Yu, Dong-Jun

    2016-04-01

    Accurately predicting protein-protein interaction sites (PPIs) is currently a hot topic because it has been demonstrated to be very useful for understanding disease mechanisms and designing drugs. Machine-learning-based computational approaches have been broadly utilized and demonstrated to be useful for PPI prediction. However, directly applying traditional machine learning algorithms, which often assume that samples in different classes are balanced, often leads to poor performance because of the severe class imbalance that exists in the PPI prediction problem. In this study, we propose a novel method for improving PPI prediction performance by relieving the severity of class imbalance using a data-cleaning procedure and reducing predicted false positives with a post-filtering procedure: First, a machine-learning-based data-cleaning procedure is applied to remove those marginal targets, which may potentially have a negative effect on training a model with a clear classification boundary, from the majority samples to relieve the severity of class imbalance in the original training dataset; then, a prediction model is trained on the cleaned dataset; finally, an effective post-filtering procedure is further used to reduce potential false positive predictions. Stringent cross-validation and independent validation tests on benchmark datasets demonstrated the efficacy of the proposed method, which exhibits highly competitive performance compared with existing state-of-the-art sequence-based PPIs predictors and should supplement existing PPI prediction methods.

  6. Prediction of gestational age based on genome-wide differentially methylated regions.

    PubMed

    Bohlin, J; Håberg, S E; Magnus, P; Reese, S E; Gjessing, H K; Magnus, M C; Parr, C L; Page, C M; London, S J; Nystad, W

    2016-10-07

    We explored the association between gestational age and cord blood DNA methylation at birth and whether DNA methylation could be effective in predicting gestational age due to limitations with the presently used methods. We used data from the Norwegian Mother and Child Birth Cohort study (MoBa) with Illumina HumanMethylation450 data measured for 1753 newborns in two batches: MoBa 1, n = 1068; and MoBa 2, n = 685. Gestational age was computed using both ultrasound and the last menstrual period. We evaluated associations between DNA methylation and gestational age and developed a statistical model for predicting gestational age using MoBa 1 for training and MoBa 2 for predictions. The prediction model was additionally used to compare ultrasound and last menstrual period-based gestational age predictions. Furthermore, both CpGs and associated genes detected in the training models were compared to those detected in a published prediction model for chronological age. There were 5474 CpGs associated with ultrasound gestational age after adjustment for a set of covariates, including estimated cell type proportions, and Bonferroni-correction for multiple testing. Our model predicted ultrasound gestational age more accurately than it predicted last menstrual period gestational age. DNA methylation at birth appears to be a good predictor of gestational age. Ultrasound gestational age is more strongly associated with methylation than last menstrual period gestational age. The CpGs linked with our gestational age prediction model, and their associated genes, differed substantially from the corresponding CpGs and genes associated with a chronological age prediction model.

  7. A cross-race effect in metamemory: Predictions of face recognition are more accurate for members of our own race

    PubMed Central

    Hourihan, Kathleen L.; Benjamin, Aaron S.; Liu, Xiping

    2012-01-01

    The Cross-Race Effect (CRE) in face recognition is the well-replicated finding that people are better at recognizing faces from their own race, relative to other races. The CRE reveals systematic limitations on eyewitness identification accuracy and suggests that some caution is warranted in evaluating cross-race identification. The CRE is a problem because jurors value eyewitness identification highly in verdict decisions. In the present paper, we explore how accurate people are in predicting their ability to recognize own-race and other-race faces. Caucasian and Asian participants viewed photographs of Caucasian and Asian faces, and made immediate judgments of learning during study. An old/new recognition test replicated the CRE: both groups displayed superior discriminability of own-race faces, relative to other-race faces. Importantly, relative metamnemonic accuracy was also greater for own-race faces, indicating that the accuracy of predictions about face recognition is influenced by race. This result indicates another source of concern when eliciting or evaluating eyewitness identification: people are less accurate in judging whether they will or will not recognize a face when that face is of a different race than they are. This new result suggests that a witness’s claim of being likely to recognize a suspect from a lineup should be interpreted with caution when the suspect is of a different race than the witness. PMID:23162788

  8. Improved patient size estimates for accurate dose calculations in abdomen computed tomography

    NASA Astrophysics Data System (ADS)

    Lee, Chang-Lae

    2017-07-01

    The radiation dose of CT (computed tomography) is generally represented by the CTDI (CT dose index). CTDI, however, does not accurately predict the actual patient doses for different human body sizes because it relies on a cylinder-shaped head (diameter: 16 cm) and body (diameter: 32 cm) phantom. The purpose of this study was to eliminate the drawbacks of the conventional CTDI and to provide more accurate radiation dose information. Projection radiographs were obtained from water cylinder phantoms of various sizes, and the sizes of the water cylinder phantoms were calculated and verified using attenuation profiles. The effective diameter was also calculated using the attenuation of the abdominal projection radiographs of 10 patients. When the results of the attenuation-based method and the geometry-based method were compared with the results of the reconstructed-axial-CT-image-based method, the effective diameter of the attenuation-based method was found to be similar to the effective diameter of the reconstructed-axial-CT-image-based method, with a difference of less than 3.8%, but the geometry-based method showed a difference of less than 11.4%. This paper proposes a new method of accurately computing the radiation dose of CT based on the patient sizes. This method computes and provides the exact patient dose before the CT scan, and can therefore be effectively used for imaging and dose control.
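
    For reference, the reconstructed-axial-CT-image-based calculation that the attenuation- and geometry-based estimates are compared against can be sketched as the standard water-equivalent diameter, D_w = 2 * sqrt[(mean HU/1000 + 1) * A_ROI / pi], computed from the mean HU inside the patient outline. The code below is an assumed, simplified illustration on a dummy water-cylinder slice, not the paper's implementation.

    ```python
    # Water-equivalent (effective) diameter from an axial CT slice (sketch).
    import numpy as np

    def water_equivalent_diameter(hu_image, pixel_area_mm2, threshold_hu=-300):
        """D_w = 2 * sqrt[(mean_HU/1000 + 1) * A_roi / pi]."""
        body = hu_image > threshold_hu               # crude patient-outline mask
        a_roi = body.sum() * pixel_area_mm2          # ROI area in mm^2
        mean_hu = hu_image[body].mean()
        return 2.0 * np.sqrt((mean_hu / 1000.0 + 1.0) * a_roi / np.pi)

    # Dummy 512x512 slice: a water cylinder (0 HU, radius 150 px) in air (-1000 HU).
    yy, xx = np.mgrid[:512, :512]
    image = np.full((512, 512), -1000.0)
    image[(xx - 256) ** 2 + (yy - 256) ** 2 <= 150 ** 2] = 0.0

    print("effective diameter [mm]:",
          water_equivalent_diameter(image, pixel_area_mm2=0.7 ** 2))
    ```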

  9. [Predictive model based multimetric index of macroinvertebrates for river health assessment].

    PubMed

    Chen, Kai; Yu, Hai Yan; Zhang, Ji Wei; Wang, Bei Xin; Chen, Qiu Wen

    2017-06-18

    Improving the stability of the index of biotic integrity (IBI, i.e., multi-metric indices, MMI) across temporal and spatial scales is one of the most important issues in bioassessment of water ecosystem integrity and water environment management. Using datasets of field-based macroinvertebrate and physicochemical variables and GIS-based natural predictors (e.g., geomorphology and climate) and land use variables collected at 227 river sites from 2004 to 2011 across Zhejiang Province, China, we used random forests (RF) to adjust for the effects of natural variation at temporal and spatial scales on macroinvertebrate metrics. We then developed natural-variation-adjusted (predictive) and unadjusted (null) MMIs and compared performance between them. The core metrics selected for the predictive and null MMIs differed from each other, and the natural variation within core metrics in the predictive MMI explained by RF models ranged between 11.4% and 61.2%. The predictive MMI was more precise and accurate, but less responsive and sensitive, than the null MMI. The multivariate nearest-neighbor test determined that 9 test sites and 1 most degraded site were flagged outside of the environmental space of the reference site network. We found that the combination of the predictive MMI developed using the predictive model and the nearest-neighbor test performed best and decreased the risks of inferring type I (designating a water body as being in poor biological condition when it was actually in good condition) and type II (designating a water body as being in good biological condition when it was actually in poor condition) errors. Our results provide an effective method to improve the stability and performance of the index of biotic integrity.

  10. Stringent DDI-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

    PubMed

    Zhou, Hufeng; Rezaei, Javad; Hugo, Willy; Gao, Shangzhi; Jin, Jingjing; Fan, Mengyuan; Yong, Chern-Han; Wozniak, Michal; Wong, Limsoon

    2013-01-01

    important properties of domains involved in host-pathogen PPIs. We find that both host and pathogen proteins involved in host-pathogen PPIs tend to have more domains than proteins involved in intra-species PPIs, and these domains have more interaction partners than domains on proteins involved in intra-species PPI. The stringent DDI-based prediction approach reported in this work provides a stringent strategy for predicting host-pathogen PPIs. It also performs better than a conventional DDI-based approach in predicting PPIs. We have predicted a small set of accurate H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies.

  11. Stringent DDI-based Prediction of H. sapiens-M. tuberculosis H37Rv Protein-Protein Interactions

    PubMed Central

    2013-01-01

    discovered some important properties of domains involved in host-pathogen PPIs. We find that both host and pathogen proteins involved in host-pathogen PPIs tend to have more domains than proteins involved in intra-species PPIs, and these domains have more interaction partners than domains on proteins involved in intra-species PPI. Conclusions The stringent DDI-based prediction approach reported in this work provides a stringent strategy for predicting host-pathogen PPIs. It also performs better than a conventional DDI-based approach in predicting PPIs. We have predicted a small set of accurate H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. PMID:24564941

  12. Field-scale prediction of enhanced DNAPL dissolution based on partitioning tracers.

    PubMed

    Wang, Fang; Annable, Michael D; Jawitz, James W

    2013-09-01

    The equilibrium streamtube model (EST) has demonstrated the ability to accurately predict dense nonaqueous phase liquid (DNAPL) dissolution in laboratory experiments and numerical simulations. Here the model is applied to predict DNAPL dissolution at a tetrachloroethylene (PCE)-contaminated dry cleaner site, located in Jacksonville, Florida. The EST model is an analytical solution with field-measurable input parameters. Measured data from a field-scale partitioning tracer test were used to parameterize the EST model and the predicted PCE dissolution was compared to measured data from an in-situ ethanol flood. In addition, a simulated partitioning tracer test from a calibrated, three-dimensional, spatially explicit multiphase flow model (UTCHEM) was also used to parameterize the EST analytical solution. The EST ethanol prediction based on both the field partitioning tracer test and the simulation closely matched the total recovery well field ethanol data with Nash-Sutcliffe efficiency E=0.96 and 0.90, respectively. The EST PCE predictions showed a peak shift to earlier arrival times for models based on either field-measured or simulated partitioning tracer tests, resulting in poorer matches to the field PCE data in both cases. The peak shifts were concluded to be caused by well screen interval differences between the field tracer test and ethanol flood. Both the EST model and UTCHEM were also used to predict PCE aqueous dissolution under natural gradient conditions, which has a much less complex flow pattern than the forced-gradient double five spot used for the ethanol flood. The natural gradient EST predictions based on parameters determined from tracer tests conducted with a complex flow pattern underestimated the UTCHEM-simulated natural gradient total mass removal by 12% after 170 pore volumes of water flushing indicating that some mass was not detected by the tracers likely due to stagnation zones in the flow field. These findings highlight the important
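
    The goodness-of-fit measure quoted above is simple to reproduce. The sketch below computes the Nash-Sutcliffe efficiency, E = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2), for placeholder measured and predicted recovery-well concentrations (the arrays are made up, not the site data).

    ```python
    # Nash-Sutcliffe efficiency between observed and simulated series (sketch).
    import numpy as np

    def nash_sutcliffe(observed, simulated):
        observed = np.asarray(observed, dtype=float)
        simulated = np.asarray(simulated, dtype=float)
        return 1.0 - (np.sum((observed - simulated) ** 2)
                      / np.sum((observed - observed.mean()) ** 2))

    measured_ethanol  = [0.0, 2.1, 7.8, 12.4, 9.6, 5.0, 2.2, 0.9]    # dummy data
    predicted_ethanol = [0.0, 2.5, 7.1, 12.0, 10.2, 4.6, 2.0, 1.1]

    print(f"E = {nash_sutcliffe(measured_ethanol, predicted_ethanol):.2f}")
    ```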

  13. Field-scale prediction of enhanced DNAPL dissolution based on partitioning tracers

    NASA Astrophysics Data System (ADS)

    Wang, Fang; Annable, Michael D.; Jawitz, James W.

    2013-09-01

    The equilibrium streamtube model (EST) has demonstrated the ability to accurately predict dense nonaqueous phase liquid (DNAPL) dissolution in laboratory experiments and numerical simulations. Here the model is applied to predict DNAPL dissolution at a tetrachloroethylene (PCE)-contaminated dry cleaner site, located in Jacksonville, Florida. The EST model is an analytical solution with field-measurable input parameters. Measured data from a field-scale partitioning tracer test were used to parameterize the EST model and the predicted PCE dissolution was compared to measured data from an in-situ ethanol flood. In addition, a simulated partitioning tracer test from a calibrated, three-dimensional, spatially explicit multiphase flow model (UTCHEM) was also used to parameterize the EST analytical solution. The EST ethanol prediction based on both the field partitioning tracer test and the simulation closely matched the total recovery well field ethanol data with Nash-Sutcliffe efficiency E = 0.96 and 0.90, respectively. The EST PCE predictions showed a peak shift to earlier arrival times for models based on either field-measured or simulated partitioning tracer tests, resulting in poorer matches to the field PCE data in both cases. The peak shifts were concluded to be caused by well screen interval differences between the field tracer test and ethanol flood. Both the EST model and UTCHEM were also used to predict PCE aqueous dissolution under natural gradient conditions, which has a much less complex flow pattern than the forced-gradient double five spot used for the ethanol flood. The natural gradient EST predictions based on parameters determined from tracer tests conducted with a complex flow pattern underestimated the UTCHEM-simulated natural gradient total mass removal by 12% after 170 pore volumes of water flushing indicating that some mass was not detected by the tracers likely due to stagnation zones in the flow field. These findings highlight the important

  14. Improvement of experimental testing and network training conditions with genome-wide microarrays for more accurate predictions of drug gene targets

    PubMed Central

    2014-01-01

    Background: Genome-wide microarrays have been useful for predicting chemical-genetic interactions at the gene level. However, interpreting genome-wide microarray results can be overwhelming due to the vast output of gene expression data combined with off-target transcriptional responses many times induced by a drug treatment. This study demonstrates how experimental and computational methods can interact with each other, to arrive at more accurate predictions of drug-induced perturbations. We present a two-stage strategy that links microarray experimental testing and network training conditions to predict gene perturbations for a drug with a known mechanism of action in a well-studied organism. Results: S. cerevisiae cells were treated with the antifungal, fluconazole, and expression profiling was conducted under different biological conditions using Affymetrix genome-wide microarrays. Transcripts were filtered with a formal network-based method, sparse simultaneous equation models and Lasso regression (SSEM-Lasso), under different network training conditions. Gene expression results were evaluated using both gene set and single gene target analyses, and the drug’s transcriptional effects were narrowed first by pathway and then by individual genes. Variables included: (i) Testing conditions – exposure time and concentration and (ii) Network training conditions – training compendium modifications. Two analyses of SSEM-Lasso output – gene set and single gene – were conducted to gain a better understanding of how SSEM-Lasso predicts perturbation targets. Conclusions: This study demonstrates that genome-wide microarrays can be optimized using a two-stage strategy for a more in-depth understanding of how a cell manifests biological reactions to a drug treatment at the transcription level. Additionally, a more detailed understanding of how the statistical model, SSEM-Lasso, propagates perturbations through a network of gene regulatory interactions is achieved

  15. Finite element based model predictive control for active vibration suppression of a one-link flexible manipulator.

    PubMed

    Dubay, Rickey; Hassan, Marwan; Li, Chunying; Charest, Meaghan

    2014-09-01

    This paper presents a unique approach for active vibration control of a one-link flexible manipulator. The method combines a finite element model of the manipulator and an advanced model predictive controller to suppress vibration at its tip. This hybrid methodology improves significantly over the standard application of a predictive controller for vibration control. The finite element model used in place of standard modelling in the control algorithm provides a more accurate prediction of dynamic behavior, resulting in enhanced control. Closed loop control experiments were performed using the flexible manipulator, instrumented with strain gauges and piezoelectric actuators. In all instances, experimental and simulation results demonstrate that the finite element based predictive controller provides improved active vibration suppression in comparison with using a standard predictive control strategy. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.

  16. The NAFLD Index: A Simple and Accurate Screening Tool for the Prediction of Non-Alcoholic Fatty Liver Disease.

    PubMed

    Ichino, Naohiro; Osakabe, Keisuke; Sugimoto, Keiko; Suzuki, Koji; Yamada, Hiroya; Takai, Hiroji; Sugiyama, Hiroko; Yukitake, Jun; Inoue, Takashi; Ohashi, Koji; Hata, Tadayoshi; Hamajima, Nobuyuki; Nishikawa, Toru; Hashimoto, Senju; Kawabe, Naoto; Yoshioka, Kentaro

    2015-01-01

    Non-alcoholic fatty liver disease (NAFLD) is a common debilitating condition in many industrialized countries that increases the risk of cardiovascular disease. The aim of this study was to derive a simple and accurate screening tool for the prediction of NAFLD in the Japanese population. A total of 945 participants, 279 men and 666 women living in Hokkaido, Japan, were enrolled among residents who attended a health check-up program from 2010 to 2014. Participants with an alcohol consumption > 20 g/day and/or a chronic liver disease, such as chronic hepatitis B, chronic hepatitis C or autoimmune hepatitis, were excluded from this study. Clinical and laboratory data were examined to identify predictive markers of NAFLD. A new predictive index for NAFLD, the NAFLD index, was constructed for men and for women. The NAFLD index for men = -15.5693 + 0.3264 [BMI] + 0.0134 [triglycerides (mg/dl)], and for women = -31.4686 + 0.3683 [BMI] + 2.5699 [albumin (g/dl)] + 4.6740 [ALT/AST] - 0.0379 [HDL cholesterol (mg/dl)]. The AUROC of the NAFLD index for men and for women was 0.87 (95% CI 0.88-1.60) and 0.90 (95% CI 0.66-1.02), respectively. The cut-off point of -5.28 for men predicted NAFLD with an accuracy of 82.8%. For women, the cut-off point of -7.65 predicted NAFLD with an accuracy of 87.7%. A new index for the non-invasive prediction of NAFLD, the NAFLD index, was constructed using available clinical and laboratory data. This index is a simple screening tool to predict the presence of NAFLD.
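
    Because the abstract states the index formulas and cut-off points explicitly, they can be transcribed directly into a small helper. The only assumption in the sketch below is the direction of the cut-off, i.e. that values at or above it indicate predicted NAFLD; the coefficients and thresholds are taken verbatim from the text above.

    ```python
    # NAFLD index, transcribed from the formulas quoted in the abstract.
    def nafld_index_men(bmi, triglycerides_mg_dl):
        return -15.5693 + 0.3264 * bmi + 0.0134 * triglycerides_mg_dl

    def nafld_index_women(bmi, albumin_g_dl, alt, ast, hdl_mg_dl):
        return (-31.4686 + 0.3683 * bmi + 2.5699 * albumin_g_dl
                + 4.6740 * (alt / ast) - 0.0379 * hdl_mg_dl)

    def predict_nafld(sex, **values):
        """Apply the sex-specific cut-off (-5.28 for men, -7.65 for women);
        index >= cut-off is assumed here to indicate predicted NAFLD."""
        if sex == "male":
            index, cutoff = nafld_index_men(**values), -5.28
        else:
            index, cutoff = nafld_index_women(**values), -7.65
        return index, index >= cutoff

    index, positive = predict_nafld("male", bmi=27.5, triglycerides_mg_dl=180)
    print(f"NAFLD index = {index:.2f}, predicted NAFLD: {positive}")
    ```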

  17. Predictions of adult Anopheles albimanus densities in villages based on distances to remotely sensed larval habitats

    NASA Technical Reports Server (NTRS)

    Rejmankova, E.; Roberts, D. R.; Pawley, A.; Manguin, S.; Polanco, J.

    1995-01-01

    Remote sensing is particularly helpful for assessing the location and extent of vegetation formations, such as herbaceous wetlands, that are difficult to examine on the ground. Marshes that are sparsely populated with emergent macrophytes and dense cyanobacterial mats have previously been identified as very productive Anopheles albimanus larval habitats. This type of habitat was detectable on a classified multispectral System Probatoire d'Observation de la Terre image of northern Belize as a mixture of two isoclasses. A similar spectral signature is characteristic for vegetation of river margins consisting of aquatic grasses and water hyacinth, which constitutes another productive larval habitat. Based on the distance between human settlements (sites) of various sizes and the nearest marsh/river exhibiting this particular class combination, we selected two groups of sites: those located closer than 500 m and those located more than 1,500 m from such habitats. Based on previous adult collections near larval habitats, we defined a landing rate of 0.5 mosquitoes/human/min from 6:30 PM to 8:00 PM as the threshold for high (> or = 0.5 mosquitoes/human/min) versus low (< 0.5 mosquitoes/human/min) densities of An. albimanus. Sites located less than 500 m from the habitat were predicted as having values higher than this threshold, while lower values were predicted for sites located greater than 1,500 m from the habitat. Predictions were verified by collections of mosquitoes landing on humans. The predictions were 100% accurate for sites in the > 1,500-m category and 89% accurate for sites in the < 500-m category.

  18. Modeling and prediction of extraction profile for microwave-assisted extraction based on absorbed microwave energy.

    PubMed

    Chan, Chung-Hung; Yusoff, Rozita; Ngoh, Gek-Cheng

    2013-09-01

    A modeling technique based on absorbed microwave energy was proposed to model microwave-assisted extraction (MAE) of antioxidant compounds from cocoa (Theobroma cacao L.) leaves. By adapting a suitable extraction model on the basis of the microwave energy absorbed during extraction, the model can be developed to predict the extraction profile of MAE at various microwave irradiation powers (100-600 W) and solvent loadings (100-300 ml). Verification with experimental data confirmed that the prediction was accurate in capturing the extraction profile of MAE (R-square value greater than 0.87). Besides, the predicted yields from the model showed good agreement with the experimental results, with less than 10% deviation observed. Furthermore, suitable extraction times to ensure high extraction yield at various MAE conditions can be estimated based on the absorbed microwave energy. The estimation is feasible as more than 85% of active compounds can be extracted when compared with the conventional extraction technique. Copyright © 2013 Elsevier Ltd. All rights reserved.

  19. A Deep Learning Framework for Robust and Accurate Prediction of ncRNA-Protein Interactions Using Evolutionary Information.

    PubMed

    Yi, Hai-Cheng; You, Zhu-Hong; Huang, De-Shuang; Li, Xiao; Jiang, Tong-Hai; Li, Li-Ping

    2018-06-01

    The interactions between non-coding RNAs (ncRNAs) and proteins play an important role in many biological processes, and their biological functions are primarily achieved by binding with a variety of proteins. High-throughput biological techniques are used to identify protein molecules bound with specific ncRNA, but they are usually expensive and time consuming. Deep learning provides a powerful solution to computationally predict RNA-protein interactions. In this work, we propose the RPI-SAN model, which uses a deep-learning stacked auto-encoder network to mine the hidden high-level features from RNA and protein sequences and feeds them into a random forest (RF) model to predict ncRNA binding proteins. Stacked assembling is further used to improve the accuracy of the proposed method. Four benchmark datasets, RPI2241, RPI488, RPI1807, and NPInter v2.0, were employed for the unbiased evaluation of five prediction methods: RPI-Pred, IPMiner, RPISeq-RF, lncPro, and the proposed RPI-SAN. The experimental results show that our RPI-SAN model achieves much better performance than the other methods, with accuracies of 90.77%, 89.7%, 96.1%, and 99.33% on the four datasets, respectively. It is anticipated that RPI-SAN can be used as an effective computational tool for future biomedical research, as it can accurately predict potential ncRNA-protein interaction pairs and thereby provide reliable guidance for biological research. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.

  20. A hybrid method for accurate star tracking using star sensor and gyros.

    PubMed

    Lu, Jiazhen; Yang, Lie; Zhang, Hao

    2017-10-01

    Star tracking is the primary operating mode of star sensors. To improve tracking accuracy and efficiency, a hybrid method using a star sensor and gyroscopes is proposed in this study. In this method, the dynamic conditions of the aircraft are first determined from the estimated angular acceleration. Under low dynamic conditions, the star sensor is used to measure the star vector, and the vector difference method is adopted to estimate the current angular velocity. Under high dynamic conditions, the angular velocity is obtained from the gyros, which are calibrated using the star vector measurements. The star position is then predicted based on the estimated angular velocity. The results of a semi-physical experiment show that this hybrid method is accurate and feasible. In contrast with the star vector difference and gyro-assisted methods, the star position prediction of the hybrid method is verified to be more accurate in two different cases under the given random noise of the star centroid.

  1. Performance of a process-based hydrodynamic model in predicting shoreline change

    NASA Astrophysics Data System (ADS)

    Safak, I.; Warner, J. C.; List, J. H.

    2012-12-01

    Shoreline change is controlled by a complex combination of processes that include waves, currents, sediment characteristics and availability, geologic framework, human interventions, and sea level rise. A comprehensive data set of shoreline positions (14 shorelines between 1978 and 2002) along the continuous and relatively uninterrupted North Carolina coast from Oregon Inlet to Cape Hatteras (65 km) reveals a spatial pattern of alternating erosion and accretion, with an erosional average shoreline change rate of -1.6 m/yr and up to -8 m/yr in some locations. This data set provides a unique opportunity to study long-term shoreline change in an area hit by frequent storm events while remaining relatively uninfluenced by human interventions and the effects of tidal inlets. Accurate predictions of long-term shoreline change may require a model that accurately resolves surf zone processes and sediment transport patterns. Conventional methods for predicting shoreline change, such as one-line models and regression of shoreline positions, have been designed for computational efficiency. These methods, however, not only rely on several restrictive assumptions (validity only for small angles of wave approach, bottom contours assumed parallel to the shoreline, a fixed depth of closure, etc.), but their empirical estimates of sediment transport rates in the surf zone have also been shown to differ greatly from the calculations of process-based hydrodynamic models. We focus on hind-casting long-term shoreline change using components of the process-based, three-dimensional coupled-ocean-atmosphere-wave-sediment transport modeling system (COAWST). COAWST is forced with historical predictions of atmospheric and oceanographic data from public-domain global models. Through a coupled, concurrent grid-refinement approach in COAWST, the finest grid, with a resolution of O(10 m) covering the surf zone along the section of interest, is forced at its spatial boundaries with waves and currents computed on the grids

  2. Generating highly accurate prediction hypotheses through collaborative ensemble learning

    NASA Astrophysics Data System (ADS)

    Arsov, Nino; Pavlovski, Martin; Basnarkov, Lasko; Kocarev, Ljupco

    2017-03-01

    Ensemble generation is a natural and convenient way of achieving better generalization performance of learning algorithms by gathering their predictive capabilities. Here, we nurture the idea of ensemble-based learning by combining bagging and boosting for the purpose of binary classification. Since the former improves stability through variance reduction, while the latter ameliorates overfitting, the outcome of a multi-model that combines both strives toward a comprehensive net-balancing of the bias-variance trade-off. To further improve this, we alter the bagged-boosting scheme by introducing collaboration between the multi-model’s constituent learners at various levels. This novel stability-guided classification scheme is delivered in two flavours: during or after the boosting process. Applied among a crowd of Gentle Boost ensembles, the ability of the two suggested algorithms to generalize is inspected by comparing them against Subbagging and Gentle Boost on various real-world datasets. In both cases, our models obtained a 40% generalization error decrease. But their true ability to capture details in data was revealed through their application for protein detection in texture analysis of gel electrophoresis images. They achieve improved performance of approximately 0.9773 AUROC when compared to the AUROC of 0.9574 obtained by an SVM based on recursive feature elimination.
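
    The record above combines bagging and boosting with collaboration between learners. The sketch below is only a plain bagged-boosting baseline, built with scikit-learn, to illustrate the bagging-plus-boosting idea; it does not implement the authors' collaborative scheme, and the dataset is synthetic.

```python
# Plain bagged-boosting baseline (no collaboration between learners),
# illustrating the bagging + boosting combination discussed above.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Each bag trains its own boosted ensemble on a 50% subsample (subbagging-style).
model = BaggingClassifier(
    AdaBoostClassifier(n_estimators=50, random_state=0),
    n_estimators=10,
    max_samples=0.5,
    random_state=0,
)
print(cross_val_score(model, X, y, cv=5).mean())
```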

  3. Fast integration-based prediction bands for ordinary differential equation models.

    PubMed

    Hass, Helge; Kreutz, Clemens; Timmer, Jens; Kaschek, Daniel

    2016-04-15

    To gain a deeper understanding of biological processes and their relevance in disease, mathematical models are built upon experimental data. Uncertainty in the data leads to uncertainties of the model's parameters and in turn to uncertainties of predictions. Mechanistic dynamic models of biochemical networks are frequently based on nonlinear differential equation systems and feature a large number of parameters, sparse observations of the model components and lack of information in the available data. Due to the curse of dimensionality, classical and sampling approaches propagating parameter uncertainties to predictions are hardly feasible and insufficient. However, for experimental design and to discriminate between competing models, prediction and confidence bands are essential. To circumvent the hurdles of the former methods, an approach to calculate a profile likelihood on arbitrary observations for a specific time point has been introduced, which provides accurate confidence and prediction intervals for nonlinear models and is computationally feasible for high-dimensional models. In this article, reliable and smooth point-wise prediction and confidence bands to assess the model's uncertainty on the whole time-course are achieved via explicit integration with elaborate correction mechanisms. The corresponding system of ordinary differential equations is derived and tested on three established models for cellular signalling. An efficiency analysis is performed to illustrate the computational benefit compared with repeated profile likelihood calculations at multiple time points. The integration framework and the examples used in this article are provided with the software package Data2Dynamics, which is based on MATLAB and freely available at http://www.data2dynamics.org. Contact: helge.hass@fdm.uni-freiburg.de. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved.

  4. An injury mortality prediction based on the anatomic injury scale

    PubMed Central

    Wang, Muding; Wu, Dan; Qiu, Wusi; Wang, Weimi; Zeng, Yunji; Shen, Yi

    2017-01-01

    To determine whether the injury mortality prediction (IMP) statistically outperforms the trauma mortality prediction model (TMPM) as a predictor of mortality. The TMPM is currently the best trauma scoring method based on anatomic injury. Its ability to predict mortality is superior to that of the injury severity score (ISS) and the new injury severity score (NISS). However, despite its statistical significance, the predictive power of TMPM needs to be further improved. This retrospective cohort study is based on data from 1,148,359 injured patients in the National Trauma Data Bank hospitalized from 2010 to 2011. Sixty percent of the data was used to derive an empiric measure of severity for the different Abbreviated Injury Scale predot codes by taking the weighted average death probabilities of trauma patients. Twenty percent of the data was used to construct the IMP model. The remaining 20% of the data was used to evaluate the statistical performance of IMP and compare it with the TMPM and the single worst injury by examining the area under the receiver operating characteristic curve (ROC), the Hosmer–Lemeshow (HL) statistic, and the Akaike information criterion. IMP exhibited significantly better discrimination (ROC-IMP, 0.903 [0.899–0.907] vs ROC-TMPM, 0.890 [0.886–0.895]) and calibration (HL-IMP, 9.9 [4.4–14.7] vs HL-TMPM, 197 [143–248]) than TMPM. All models showed slight changes after the addition of age, gender, and mechanism of injury, but the extended IMP still dominated TMPM on every performance measure. The IMP shows a slight improvement in discrimination and calibration compared with the TMPM and can accurately predict mortality. Therefore, we consider it a new, feasible scoring method for trauma research. PMID:28858124

  5. Predictive Structure-Based Toxicology Approaches To Assess the Androgenic Potential of Chemicals.

    PubMed

    Trisciuzzi, Daniela; Alberga, Domenico; Mansouri, Kamel; Judson, Richard; Novellino, Ettore; Mangiatordi, Giuseppe Felice; Nicolotti, Orazio

    2017-11-27

    We present a practical and easy-to-run in silico workflow exploiting a structure-based strategy making use of docking simulations to derive highly predictive classification models of the androgenic potential of chemicals. Models were trained on a high-quality chemical collection comprising 1689 curated compounds made available within the CoMPARA consortium from the US Environmental Protection Agency and were integrated with a two-step applicability domain whose implementation had the effect of improving both the confidence in prediction and statistics by reducing the number of false negatives. Among the nine androgen receptor X-ray solved structures, the crystal 2PNU (entry code from the Protein Data Bank) was associated with the best performing structure-based classification model. Three validation sets, each comprising 2590 compounds extracted from the DUD-E collection, were used to challenge model performance and the effectiveness of the applicability domain implementation. Next, the 2PNU model was applied to screen and prioritize two collections of chemicals. The first is a small pool of 12 representative androgenic compounds that were accurately classified based on an outstanding rationale at the molecular level. The second is a large external blind set of 55450 chemicals with potential for human exposure. We show how the use of molecular docking provides highly interpretable models and can represent a real-life option as an alternative nontesting method for predictive toxicology.

  6. A machine-learning approach for predicting palmitoylation sites from integrated sequence-based features.

    PubMed

    Li, Liqi; Luo, Qifa; Xiao, Weidong; Li, Jinhui; Zhou, Shiwen; Li, Yongsheng; Zheng, Xiaoqi; Yang, Hua

    2017-02-01

    Palmitoylation is the covalent attachment of lipids to amino acid residues in proteins. As an important form of protein posttranslational modification, it increases the hydrophobicity of proteins, which contributes to protein transport, organelle localization, and function, and it therefore plays an important role in a variety of cell biological processes. Identification of palmitoylation sites is necessary for understanding protein-protein interactions, protein stability, and activity. Since conventional experimental techniques to determine palmitoylation sites in proteins are both labor intensive and costly, a fast and accurate computational approach to predict palmitoylation sites from protein sequences is urgently needed. In this study, a support vector machine (SVM)-based method was proposed that integrates PSI-BLAST profiles, physicochemical properties, [Formula: see text]-mer amino acid compositions (AACs), and [Formula: see text]-mer pseudo AACs into the principal feature vector. A recursive feature selection scheme was subsequently implemented to single out the most discriminative features. Finally, an SVM was used to predict palmitoylation sites in proteins based on the optimal features. The proposed method achieved an accuracy of 99.41% and a Matthews correlation coefficient of 0.9773 on a benchmark dataset. The result indicates the efficiency and accuracy of our method for the prediction of palmitoylation sites based on protein sequences.
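
    A generic sketch of SVM classification on sequence windows, in the spirit of the record above. The feature encoding here is a simple one-hot window around the candidate residue, standing in for the PSI-BLAST/physicochemical/pseudo-AAC features described in the abstract; the sequences and labels are hypothetical.

```python
# Generic sketch: SVM classification of sequence windows (one-hot encoding
# stands in for the richer features described in the abstract).
import numpy as np
from sklearn.svm import SVC

AA = "ACDEFGHIKLMNPQRSTVWY"
IDX = {a: i for i, a in enumerate(AA)}

def encode_window(window):
    """One-hot encode a fixed-length peptide window."""
    vec = np.zeros(len(window) * len(AA))
    for pos, aa in enumerate(window):
        if aa in IDX:
            vec[pos * len(AA) + IDX[aa]] = 1.0
    return vec

# Hypothetical training data: 11-residue windows centred on a cysteine,
# labelled 1 if the site is palmitoylated, 0 otherwise.
windows = ["MKTAYCLLVGA", "GGSSRCKTPLA", "AAAAACAAAAA", "LLLLLCLLLLL"]
labels = [1, 0, 1, 0]

X = np.array([encode_window(w) for w in windows])
y = np.array(labels)

clf = SVC(kernel="rbf", C=1.0).fit(X, y)
print(clf.predict([encode_window("MKTAYCLLVGA")]))
```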

  7. Uncertainty analysis of neural network based flood forecasting models: An ensemble based approach for constructing prediction interval

    NASA Astrophysics Data System (ADS)

    Kasiviswanathan, K.; Sudheer, K.

    2013-05-01

    Artificial neural network (ANN) based hydrologic models have gained a lot of attention among water resources engineers and scientists, owing to their potential for accurate prediction of flood flows compared to conceptual or physics-based hydrologic models. The ANN approximates the non-linear functional relationship between complex hydrologic variables to arrive at river flow forecast values. Despite a large number of applications, there is still some criticism that ANN point predictions lack reliability, since the uncertainty of the predictions is not quantified, and this limits their use in practical applications. A major concern in applying traditional uncertainty analysis techniques to the neural network framework is its parallel computing architecture with large degrees of freedom, which makes uncertainty assessment a challenging task. Very few studies have considered assessment of the predictive uncertainty of ANN-based hydrologic models. In this study, a novel method is proposed that helps construct the prediction interval of an ANN flood forecasting model during calibration itself. The method is designed to have two stages of optimization during calibration: in stage 1, the ANN model is trained with a genetic algorithm (GA) to obtain the optimal set of weights and biases, and in stage 2, the optimal variability of the ANN parameters (obtained in stage 1) is identified so as to create an ensemble of predictions. During the second stage, the optimization is performed with multiple objectives: (i) minimum residual variance for the ensemble mean, (ii) the maximum number of measured data points falling within the estimated prediction interval, and (iii) minimum width of the prediction interval. The method is illustrated using a real-world case study of an Indian basin. The method was able to produce an ensemble with an average prediction interval width of 23.03 m3/s, with 97.17% of the total validation data points (measured) lying within the interval. The derived
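
    A short sketch of the two prediction-interval quality measures quoted above, coverage (the fraction of observations falling inside the interval) and average interval width. The flow values below are hypothetical.

```python
# Sketch: coverage and average width of a prediction interval,
# the two measures reported for the ensemble above. Data are hypothetical.
import numpy as np

def interval_metrics(observed, lower, upper):
    observed, lower, upper = map(np.asarray, (observed, lower, upper))
    inside = (observed >= lower) & (observed <= upper)
    coverage = 100.0 * inside.mean()           # percent of points inside
    avg_width = float(np.mean(upper - lower))  # e.g. in m^3/s for river flows
    return coverage, avg_width

obs = [120.0, 135.0, 150.0, 98.0]
lo_b = [110.0, 120.0, 140.0, 100.0]
up_b = [133.0, 142.0, 165.0, 118.0]
print(interval_metrics(obs, lo_b, up_b))
```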

  8. Prediction of protein-protein interactions based on PseAA composition and hybrid feature selection.

    PubMed

    Liu, Liang; Cai, Yudong; Lu, Wencong; Feng, Kaiyan; Peng, Chunrong; Niu, Bing

    2009-03-06

    Based on pseudo amino acid (PseAA) composition and a novel hybrid feature selection framework, this paper presents a computational system to predict PPIs (protein-protein interactions) using 8796 protein pairs. These pairs are coded by PseAA composition, resulting in 114 features. A hybrid feature selection system, mRMR-KNNs-wrapper, is applied to obtain an optimized feature set by excluding poorly performing and/or redundant features, leaving 103 features. Using the optimized 103-feature subset, a prediction model is trained and tested in the k-nearest neighbors (KNNs) learning system. This prediction model achieves an overall accurate prediction rate of 76.18%, evaluated by a 10-fold cross-validation test, which is 1.46% higher than that obtained using the initial 114 features and 6.51% higher than that obtained using the 20 features coded by amino acid composition. The PPI predictor developed for this research is available for public use at http://chemdata.shu.edu.cn/ppi.

  9. Towards more accurate and reliable predictions for nuclear applications

    NASA Astrophysics Data System (ADS)

    Goriely, Stephane; Hilaire, Stephane; Dubray, Noel; Lemaître, Jean-François

    2017-09-01

    The need for nuclear data far from the valley of stability, for applications such as nuclear astrophysics or future nuclear facilities, challenges the robustness as well as the predictive power of present nuclear models. Most of the nuclear data evaluation and prediction is still performed on the basis of phenomenological nuclear models. Over the last decades, important progress has been achieved in fundamental nuclear physics, making it now feasible to use more reliable, but also more complex, microscopic or semi-microscopic models in the evaluation and prediction of nuclear data for practical applications. Nowadays, mean-field models can be tuned to the same level of accuracy as the phenomenological models, renormalized on experimental data if needed, and therefore can replace the phenomenological inputs in the evaluation of nuclear data. The latest achievements in determining nuclear masses within the non-relativistic HFB approach, including the related uncertainties in the model predictions, are discussed. Similarly, recent efforts to determine fission observables within the mean-field approach are described and compared with more traditional existing models.

  10. Can blind persons accurately assess body size from the voice?

    PubMed

    Pisanski, Katarzyna; Oleszkiewicz, Anna; Sorokowska, Agnieszka

    2016-04-01

    Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20-65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences. © 2016 The Author(s).

  11. Can blind persons accurately assess body size from the voice?

    PubMed Central

    Oleszkiewicz, Anna; Sorokowska, Agnieszka

    2016-01-01

    Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20–65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences. PMID:27095264

  12. A Predictive Model for Medical Events Based on Contextual Embedding of Temporal Sequences

    PubMed Central

    Wang, Zhimu; Huang, Yingxiang; Wang, Shuang; Wang, Fei; Jiang, Xiaoqian

    2016-01-01

    Background: Medical concepts are inherently ambiguous and error-prone due to human fallibility, which makes it hard for them to be fully used by classical machine learning methods (eg, for tasks like early-stage disease prediction). Objective: Our aim was to create a new machine-friendly representation that resembles the semantics of medical concepts. We then developed a sequential predictive model for medical events based on this new representation. Methods: We developed novel contextual embedding techniques to combine different medical events (eg, diagnoses, prescriptions, and lab tests). Each medical event is converted into a numerical vector that resembles its “semantics,” via which the similarity between medical events can be easily measured. We developed simple and effective predictive models based on these vectors to predict novel diagnoses. Results: We evaluated our sequential prediction model (and standard learning methods) in estimating the risk of potential diseases based on our contextual embedding representation. Our model achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 0.79 on chronic systolic heart failure and an average AUC of 0.67 (over the 80 most common diagnoses) using the Medical Information Mart for Intensive Care III (MIMIC-III) dataset. Conclusions: We propose a general early prognosis predictor for 80 different diagnoses. Our method computes a numeric representation for each medical event to uncover the potential meaning of those events. Our results demonstrate the efficiency of the proposed method, which will benefit patients and physicians by offering more accurate diagnoses. PMID:27888170

  13. Accurate Segmentation of CT Male Pelvic Organs via Regression-based Deformable Models and Multi-task Random Forests

    PubMed Central

    Gao, Yaozong; Shao, Yeqin; Lian, Jun; Wang, Andrew Z.; Chen, Ronald C.

    2016-01-01

    Segmenting male pelvic organs from CT images is a prerequisite for prostate cancer radiotherapy. The efficacy of radiation treatment highly depends on segmentation accuracy. However, accurate segmentation of male pelvic organs is challenging due to low tissue contrast of CT images, as well as large variations of shape and appearance of the pelvic organs. Among existing segmentation methods, deformable models are the most popular, as shape priors can be easily incorporated to regularize the segmentation. Nonetheless, the sensitivity to initialization often limits their performance, especially for segmenting organs with large shape variations. In this paper, we propose a novel approach to guide deformable models, thus making them robust against arbitrary initializations. Specifically, we learn a displacement regressor, which predicts 3D displacement from any image voxel to the target organ boundary based on the local patch appearance. This regressor provides a nonlocal external force for each vertex of the deformable model, thus overcoming the initialization problem suffered by traditional deformable models. To learn a reliable displacement regressor, two strategies are proposed. 1) A multi-task random forest is proposed to learn the displacement regressor jointly with the organ classifier; 2) an auto-context model is used to iteratively enforce structural information during voxel-wise prediction. Extensive experiments on 313 planning CT scans of 313 patients show that our method achieves better results than alternative classification- or regression-based methods, as well as several other existing methods in CT pelvic organ segmentation. PMID:26800531

  14. Estimating energy expenditure in vascular surgery patients: Are predictive equations accurate enough?

    PubMed

    Suen, J; Thomas, J M; Delaney, C L; Spark, J I; Miller, M D

    2016-12-01

    Malnutrition is prevalent in vascular surgical patients, who commonly seek tertiary care at advanced stages of disease. Adjunct nutrition support is therefore pertinent to optimise patient outcomes. To negate consequences related to excessive or suboptimal dietary energy intake, it is essential to accurately determine energy expenditure and subsequent requirements. This study aims to compare resting energy expenditure (REE) measured by indirect calorimetry, a commonly used comparator, to REE estimated by predictive equations (the Schofield, Harris-Benedict and Miller equations) to determine the most suitable equation for vascular surgery patients. Data were collected from four studies that measured REE in 77 vascular surgery patients. Bland-Altman analyses were conducted to explore agreement. The presence of fixed or proportional bias was assessed by linear regression analyses. In comparison to measured REE, on average REE was overestimated when the Schofield (+857 kJ/day), Harris-Benedict (+801 kJ/day) and Miller (+71 kJ/day) equations were used. Wide limits of agreement led to an over- or underestimation of 1552 to 1755 kJ. Proportional bias was absent in the Schofield (R2 = 0.005, p = 0.54) and Harris-Benedict equations (R2 = 0.045, p = 0.06) but was present in the Miller equation (R2 = 0.210, p < 0.01), even after logarithmic transformation (R2 = 0.213, p < 0.01). Whilst the Miller equation tended to overestimate resting energy expenditure and was affected by proportional bias, the limits of agreement and mean bias were smaller compared to the Schofield and Harris-Benedict equations. This suggests that it is the preferred predictive equation for vascular surgery patients. Future research to refine the Miller equation to improve its overall accuracy will better inform the provision of nutritional support for vascular surgery patients and subsequently improve outcomes. Alternatively, an equation might be developed specifically for use with
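
    A minimal sketch of the Bland-Altman comparison used in the record above: mean bias of predicted versus measured REE and 95% limits of agreement. The REE values are hypothetical placeholders, not data from the study.

```python
# Sketch: Bland-Altman mean bias and 95% limits of agreement for
# predicted vs measured REE (hypothetical kJ/day values).
import numpy as np

def bland_altman(predicted, measured):
    predicted = np.asarray(predicted, float)
    measured = np.asarray(measured, float)
    diff = predicted - measured
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

measured_ree = [6800, 7400, 8100, 6900, 7600]
predicted_ree = [7500, 7900, 8300, 7700, 8200]   # e.g. from a predictive equation
bias, loa = bland_altman(predicted_ree, measured_ree)
print(f"bias = {bias:.0f} kJ/day, limits of agreement = {loa}")
```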

  15. Computer-Assisted Decision Support for Student Admissions Based on Their Predicted Academic Performance.

    PubMed

    Muratov, Eugene; Lewis, Margaret; Fourches, Denis; Tropsha, Alexander; Cox, Wendy C

    2017-04-01

    Objective. To develop predictive computational models forecasting the academic performance of students in the didactic-rich portion of a doctor of pharmacy (PharmD) curriculum as admission-assisting tools. Methods. All PharmD candidates over three admission cycles were divided into two groups: those who completed the PharmD program with a GPA ≥ 3, and the remaining candidates. The Random Forest machine learning technique was used to develop a binary classification model based on 11 pre-admission parameters. Results. Robust and externally predictive models were developed that had a particularly high overall accuracy of 77% for candidates with high or low academic performance. These multivariate models were more accurate in predicting these groups than models obtained using undergraduate GPA and composite PCAT scores only. Conclusion. The models developed in this study can be used to improve the admission process as preliminary filters and thus quickly identify candidates who are likely to be successful in the PharmD curriculum.
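
    A sketch of the binary set-up described above: a Random Forest trained on pre-admission parameters to flag candidates likely to finish with a GPA ≥ 3. The feature names, the synthetic data, and the toy outcome rule are hypothetical placeholders, not the study's variables.

```python
# Sketch: Random Forest binary classifier on pre-admission parameters.
# Features, data, and outcome rule are hypothetical placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 300
X = np.column_stack([
    rng.uniform(2.0, 4.0, n),   # undergraduate GPA (hypothetical feature)
    rng.uniform(30, 99, n),     # composite PCAT percentile (hypothetical)
    rng.integers(0, 2, n),      # prior degree yes/no (hypothetical)
])
# Toy outcome loosely tied to GPA and PCAT, for illustration only.
y = ((0.6 * X[:, 0] / 4 + 0.4 * X[:, 1] / 99 + rng.normal(0, 0.1, n)) > 0.7).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())
```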

  16. A novel method for structure-based prediction of ion channel conductance properties.

    PubMed Central

    Smart, O S; Breed, J; Smith, G R; Sansom, M S

    1997-01-01

    A rapid and easy-to-use method of predicting the conductance of an ion channel from its three-dimensional structure is presented. The method combines the pore dimensions of the channel, as measured with the HOLE program, with an Ohmic model of conductance. An empirically based correction factor is then applied. The method yielded good results for six experimental channel structures (none of which were included in the training set), with predictions accurate on average to within a factor of 1.62 of the true values. The predictive r2 was equal to 0.90, which is indicative of good predictive ability. The procedure is used to validate model structures of alamethicin and phospholamban. Two genuine predictions for the conductance of channels with known structure but without reported conductances are given. A modification of the procedure that calculates the expected effect of the addition of nonelectrolyte polymers on conductance is set out. Results for a cholera toxin B-subunit crystal structure agree well with the measured values. The difficulty in interpreting such studies is discussed, with the conclusion that measurements on channels of known structure are required. PMID:9138559
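
    A sketch of an Ohmic pore-conductance estimate of the kind the record above describes: the pore is treated as a stack of thin cylindrical slabs whose resistances add in series. The solution resistivity is an approximate value for roughly 1 M KCl, the radius profile is hypothetical, and the empirical correction factor derived in the paper is represented only by a placeholder argument.

```python
# Sketch: Ohmic conductance from a pore radius profile (series of slabs,
# R_i = rho * dz / (pi * r_i^2)). Correction factor is a placeholder; the
# paper derives its own empirically based correction.
import numpy as np

def ohmic_conductance(z, r, resistivity_ohm_m=0.09, correction=1.0):
    """z, r: pore-axis coordinates and radii in metres (e.g. from a HOLE profile)."""
    z, r = np.asarray(z, float), np.asarray(r, float)
    dz = np.diff(z)
    r_mid = 0.5 * (r[:-1] + r[1:])
    resistance = np.sum(resistivity_ohm_m * dz / (np.pi * r_mid**2))
    return correction / resistance   # conductance in siemens

# Hypothetical profile: 0.3 nm-radius pore, 4 nm long, sampled every 0.5 nm.
z = np.arange(0, 4.0e-9, 0.5e-9)
r = np.full_like(z, 0.3e-9)
print(f"{ohmic_conductance(z, r) * 1e12:.1f} pS (uncorrected)")
```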

  17. Comparing niche- and process-based models to reduce prediction uncertainty in species range shifts under climate change.

    PubMed

    Morin, Xavier; Thuiller, Wilfried

    2009-05-01

    Obtaining reliable predictions of species range shifts under climate change is a crucial challenge for ecologists and stakeholders. At the continental scale, niche-based models have been widely used in the last 10 years to predict the potential impacts of climate change on species distributions all over the world, although these models do not include any mechanistic relationships. In contrast, species-specific, process-based predictions remain scarce at the continental scale. This is regrettable because to secure relevant and accurate predictions it is always desirable to compare predictions derived from different kinds of models applied independently to the same set of species and using the same raw data. Here we compare predictions of range shifts under climate change scenarios for 2100 derived from niche-based models with those of a process-based model for 15 North American boreal and temperate tree species. A general pattern emerged from our comparisons: niche-based models tend to predict a stronger level of extinction and a greater proportion of colonization than the process-based model. This result likely arises because niche-based models do not take phenotypic plasticity and local adaptation into account. Nevertheless, as the two kinds of models rely on different assumptions, their complementarity is revealed by common findings. Both modeling approaches highlight a major potential limitation on species tracking their climatic niche because of migration constraints and identify similar zones where species extirpation is likely. Such convergent predictions from models built on very different principles provide a useful way to offset uncertainties at the continental scale. This study shows that the use in concert of both approaches with their own caveats and advantages is crucial to obtain more robust results and that comparisons among models are needed in the near future to gain accuracy regarding predictions of range shifts under climate change.

  18. Continuous Metabolic Monitoring Based on Multi-Analyte Biomarkers to Predict Exhaustion

    PubMed Central

    Kastellorizios, Michail; Burgess, Diane J.

    2015-01-01

    This work introduces the concept of multi-analyte biomarkers for continuous metabolic monitoring. The importance of using more than one marker lies in the ability to obtain a holistic understanding of the metabolism. This is showcased for the detection and prediction of exhaustion during intense physical exercise. The findings presented here indicate that when glucose and lactate changes over time are combined into multi-analyte biomarkers, their monitoring trends are more sensitive in the subcutaneous tissue, an implantation-friendly peripheral tissue, compared to the blood. This unexpected observation was confirmed in normal as well as type 1 diabetic rats. This study was designed to be of direct value to continuous monitoring biosensor research, where single analytes are typically monitored. These findings can be implemented in new multi-analyte continuous monitoring technologies for more accurate insulin dosing, as well as for exhaustion prediction studies based on objective data rather than the subject’s perception. PMID:26028477

  19. Continuous metabolic monitoring based on multi-analyte biomarkers to predict exhaustion.

    PubMed

    Kastellorizios, Michail; Burgess, Diane J

    2015-06-01

    This work introduces the concept of multi-analyte biomarkers for continuous metabolic monitoring. The importance of using more than one marker lies in the ability to obtain a holistic understanding of the metabolism. This is showcased for the detection and prediction of exhaustion during intense physical exercise. The findings presented here indicate that when glucose and lactate changes over time are combined into multi-analyte biomarkers, their monitoring trends are more sensitive in the subcutaneous tissue, an implantation-friendly peripheral tissue, compared to the blood. This unexpected observation was confirmed in normal as well as type 1 diabetic rats. This study was designed to be of direct value to continuous monitoring biosensor research, where single analytes are typically monitored. These findings can be implemented in new multi-analyte continuous monitoring technologies for more accurate insulin dosing, as well as for exhaustion prediction studies based on objective data rather than the subject's perception.

  20. A fast and accurate method to predict 2D and 3D aerodynamic boundary layer flows

    NASA Astrophysics Data System (ADS)

    Bijleveld, H. A.; Veldman, A. E. P.

    2014-12-01

    A quasi-simultaneous interaction method is applied to predict 2D and 3D aerodynamic flows. This method is suitable for offshore wind turbine design software as it is a very accurate and computationally reasonably cheap method. This study shows the results for a NACA 0012 airfoil. The two applied solvers converge to the experimental values when the grid is refined. We also show that in separation the eigenvalues remain positive, thus avoiding the Goldstein singularity at separation. In 3D we show a flow over a dent in which separation occurs. A rotating flat plate is used to show the applicability of the method for rotating flows. The demonstrated capabilities of the method indicate that the quasi-simultaneous interaction method is suitable for design methods for offshore wind turbine blades.

  1. Research on prediction of agricultural machinery total power based on grey model optimized by genetic algorithm

    NASA Astrophysics Data System (ADS)

    Xie, Yan; Li, Mu; Zhou, Jin; Zheng, Chang-zheng

    2009-07-01

    Agricultural machinery total power is an important index for reflecting and evaluating the level of agricultural mechanization. It is the power source of agricultural production and a main factor in enhancing comprehensive agricultural production capacity, expanding production scale, and increasing farmers' income. Its demand is affected by natural, economic, technological, social, and other "grey" factors. Therefore, grey system theory can be used to analyze the development of agricultural machinery total power. A method based on a genetic algorithm for optimizing the grey modeling process is introduced in this paper. This method combines the advantages of the grey prediction model with the global search capability of the genetic algorithm, so the prediction model is more accurate. Using data from a province, a GM(1,1) model for predicting agricultural machinery total power was built based on grey system theory and the genetic algorithm. The result indicates that the model can serve as an effective tool for predicting agricultural machinery total power.
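
    A minimal sketch of a standard GM(1,1) grey prediction, without the genetic-algorithm optimization of the modeling process described above. The annual total-power series is hypothetical.

```python
# Sketch: standard GM(1,1) grey forecast (no GA optimization).
import numpy as np

def gm11_forecast(x0, steps=3):
    """Fit GM(1,1) to a positive series x0 and forecast `steps` values ahead."""
    x0 = np.asarray(x0, float)
    x1 = np.cumsum(x0)                              # accumulated series (AGO)
    z1 = 0.5 * (x1[:-1] + x1[1:])                   # background values
    B = np.column_stack([-z1, np.ones(len(z1))])
    Y = x0[1:]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]     # develop and grey-input coefficients
    k = np.arange(len(x0) + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.diff(x1_hat, prepend=x1_hat[0])     # inverse AGO
    x0_hat[0] = x0[0]
    return x0_hat[len(x0):]                         # forecast values only

power = [4200, 4390, 4610, 4820, 5070]              # hypothetical annual data
print(gm11_forecast(power, steps=2))
```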

  2. Model-based predictions for dopamine.

    PubMed

    Langdon, Angela J; Sharpe, Melissa J; Schoenbaum, Geoffrey; Niv, Yael

    2018-04-01

    Phasic dopamine responses are thought to encode a prediction-error signal consistent with model-free reinforcement learning theories. However, a number of recent findings highlight the influence of model-based computations on dopamine responses, and suggest that dopamine prediction errors reflect more dimensions of an expected outcome than scalar reward value. Here, we review a selection of these recent results and discuss the implications and complications of model-based predictions for computational theories of dopamine and learning. Copyright © 2017. Published by Elsevier Ltd.

  3. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    PubMed Central

    2011-01-01

    Background: Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results: We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non-partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. The partner-specific classifier, PS-HomPPI, can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of both the query and the

  4. Integrating metabolic performance, thermal tolerance, and plasticity enables for more accurate predictions on species vulnerability to acute and chronic effects of global warming.

    PubMed

    Magozzi, Sarah; Calosi, Piero

    2015-01-01

    Predicting species vulnerability to global warming requires a comprehensive, mechanistic understanding of sublethal and lethal thermal tolerances. To date, however, most studies investigating species physiological responses to increasing temperature have focused on the underlying physiological traits of either acute or chronic tolerance in isolation. Here we propose an integrative, synthetic approach including the investigation of multiple physiological traits (metabolic performance and thermal tolerance), and their plasticity, to provide more accurate and balanced predictions on species and assemblage vulnerability to both acute and chronic effects of global warming. We applied this approach to more accurately elucidate relative species vulnerability to warming within an assemblage of six caridean prawns occurring in the same geographic, hence macroclimatic, region, but living in different thermal habitats. Prawns were exposed to four incubation temperatures (10, 15, 20 and 25 °C) for 7 days, their metabolic rates and upper thermal limits were measured, and plasticity was calculated according to the concept of Reaction Norms, as well as Q10 for metabolism. Compared to species occupying narrower/more stable thermal niches, species inhabiting broader/more variable thermal environments (including the invasive Palaemon macrodactylus) are likely to be less vulnerable to extreme acute thermal events as a result of their higher upper thermal limits. Nevertheless, they may be at greater risk from chronic exposure to warming due to the greater metabolic costs they incur. Indeed, a trade-off between acute and chronic tolerance was apparent in the assemblage investigated. However, the invasive species P. macrodactylus represents an exception to this pattern, showing elevated thermal limits and plasticity of these limits, as well as a high metabolic control. In general, integrating multiple proxies for species physiological acute and chronic responses to increasing

  5. Accurate FRET Measurements within Single Diffusing Biomolecules Using Alternating-Laser Excitation

    PubMed Central

    Lee, Nam Ki; Kapanidis, Achillefs N.; Wang, You; Michalet, Xavier; Mukhopadhyay, Jayanta; Ebright, Richard H.; Weiss, Shimon

    2005-01-01

    Fluorescence resonance energy transfer (FRET) between a donor (D) and an acceptor (A) at the single-molecule level currently provides qualitative information about distance, and quantitative information about kinetics of distance changes. Here, we used the sorting ability of confocal microscopy equipped with alternating-laser excitation (ALEX) to measure accurate FRET efficiencies and distances from single molecules, using corrections that account for cross-talk terms that contaminate the FRET-induced signal, and for differences in the detection efficiency and quantum yield of the probes. ALEX yields accurate FRET independent of instrumental factors, such as excitation intensity or detector alignment. Using DNA fragments, we showed that ALEX-based distances agree well with predictions from a cylindrical model of DNA; ALEX-based distances fit better to theory than distances obtained at the ensemble level. Distance measurements within transcription complexes agreed well with ensemble-FRET measurements, and with structural models based on ensemble-FRET and x-ray crystallography. ALEX can benefit structural analysis of biomolecules, especially when such molecules are inaccessible to conventional structural methods due to heterogeneity or transient nature. PMID:15653725
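
    A sketch of the kind of corrected FRET efficiency calculation the record above refers to. The coefficient names (leakage l, direct excitation d, detection/quantum-yield factor gamma) follow common accurate-FRET conventions rather than necessarily the paper's notation, and the values and photon counts are placeholders.

```python
# Sketch: cross-talk- and gamma-corrected FRET efficiency for one burst.
# Coefficients follow common accurate-FRET conventions with placeholder values,
# not the calibration reported in the paper.
def corrected_fret(i_dex_aem, i_dex_dem, i_aex_aem, l=0.08, d=0.06, gamma=1.1):
    """Return corrected FRET efficiency from per-burst photon counts."""
    f_acceptor = i_dex_aem - l * i_dex_dem - d * i_aex_aem  # remove leakage and direct excitation
    return f_acceptor / (f_acceptor + gamma * i_dex_dem)

# Hypothetical burst: 420 acceptor and 300 donor photons under donor excitation,
# 500 acceptor photons under acceptor excitation.
print(f"E = {corrected_fret(420, 300, 500):.2f}")
```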

  6. Robust and Accurate Modeling Approaches for Migraine Per-Patient Prediction from Ambulatory Data

    PubMed Central

    Pagán, Josué; Irene De Orbe, M.; Gago, Ana; Sobrado, Mónica; Risco-Martín, José L.; Vivancos Mora, J.; Moya, José M.; Ayala, José L.

    2015-01-01

    Migraine is one of the most widespread neurological disorders, and its medical treatment represents a high percentage of the costs of health systems. In some patients, characteristic symptoms that precede the headache appear. However, they are nonspecific, and their prediction horizon is unknown and highly variable; hence, these symptoms are of little use for prediction and cannot be relied on to advance the intake of drugs early enough to be effective and neutralize the pain. To solve this problem, this paper sets up a realistic monitoring scenario where hemodynamic variables from real patients are monitored in ambulatory conditions with a wireless body sensor network (WBSN). The acquired data are used to evaluate the predictive capabilities, and the robustness against noise and sensor failures, of several modeling approaches. The obtained results encourage the development of per-patient models based on state-space models (N4SID) that are capable of providing average forecast windows of 47 min and a low rate of false positives. PMID:26134103

  7. Prediction of Chemical Respiratory Sensitizers Using GARD, a Novel In Vitro Assay Based on a Genomic Biomarker Signature

    PubMed Central

    Albrekt, Ann-Sofie; Borrebaeck, Carl A. K.; Lindstedt, Malin

    2015-01-01

    Background: Repeated exposure to certain low molecular weight (LMW) chemical compounds may result in the development of allergic reactions in the skin or in the respiratory tract. In most cases, a given LMW compound selectively sensitizes the skin, giving rise to allergic contact dermatitis (ACD), or the respiratory tract, giving rise to occupational asthma (OA). To limit the occurrence of allergic diseases, efforts are currently being made to develop predictive assays that accurately identify chemicals capable of inducing such reactions. However, while a few promising methods for prediction of skin sensitization have been described, to date no validated method, in vitro or in vivo, exists that is able to accurately classify chemicals as respiratory sensitizers. Results: Recently, we presented the in vitro-based Genomic Allergen Rapid Detection (GARD) assay as a novel testing strategy for classification of skin sensitizing chemicals based on measurement of a genomic biomarker signature. We have expanded the applicability domain of the GARD assay to also classify respiratory sensitizers by identifying a separate biomarker signature containing 389 genes differentially regulated in respiratory sensitizers in comparison to non-respiratory sensitizers. Using an independent data set in combination with supervised machine learning, we validated the assay, showing that the identified genomic biomarker is able to accurately classify respiratory sensitizers. Conclusions: We have identified a genomic biomarker signature for classification of respiratory sensitizers. Combining this newly identified biomarker signature with our previously identified biomarker signature for classification of skin sensitizers, we have developed a novel in vitro testing strategy with a potent ability to predict both skin and respiratory sensitization in the same sample. PMID:25760038

  8. MOST: most-similar ligand based approach to target prediction.

    PubMed

    Huang, Tao; Mi, Hong; Lin, Cheng-Yuan; Zhao, Ling; Zhong, Linda L D; Liu, Feng-Bin; Zhang, Ge; Lu, Ai-Ping; Bian, Zhao-Xiang

    2017-03-11

    Many computational approaches have been used for target prediction, including machine learning, reverse docking, bioactivity spectra analysis, and chemical similarity searching. Recent studies have suggested that chemical similarity searching may be driven by the most-similar ligand. However, the extent of bioactivity of the most-similar ligands has been oversimplified or even neglected in these studies, and this has impaired the prediction power. Here we propose the MOst-Similar ligand-based Target inference approach, namely MOST, which uses fingerprint similarity and the explicit bioactivity of the most-similar ligands to predict targets of the query compound. The performance of MOST was evaluated using combinations of different fingerprint schemes, machine learning methods, and bioactivity representations. In sevenfold cross-validation with a benchmark Ki dataset from CHEMBL release 19, containing 61,937 bioactivity data points for 173 human targets, MOST achieved high average prediction accuracy (0.95 for pKi ≥ 5, and 0.87 for pKi ≥ 6). The Morgan fingerprint was shown to be slightly better than FP2. Logistic Regression and Random Forest methods performed better than Naïve Bayes. In a temporal validation, the Ki dataset from CHEMBL19 was used to train models and predict the bioactivity of ligands newly deposited in CHEMBL20. MOST also performed well, with high accuracy (0.90 for pKi ≥ 5, and 0.76 for pKi ≥ 6), when Logistic Regression and the Morgan fingerprint were employed. Furthermore, the p values associated with explicit bioactivity were found to be a robust index for removing false positive predictions; implicit bioactivity did not offer this capability. Finally, p values generated with Logistic Regression, the Morgan fingerprint and explicit activity were integrated with a false discovery rate (FDR) control procedure to reduce false positives in the multiple-target prediction scenario, and the success of this strategy was demonstrated with a case of fluanisone
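
    A simplified stand-in for the most-similar-ligand idea above, using RDKit: compute Morgan fingerprints, find the nearest training ligand by Tanimoto similarity, and carry over its (hypothetical) pKi annotation. This is not the full MOST pipeline; there are no logistic-regression p values and no FDR control, and the SMILES/pKi entries are illustrative placeholders.

```python
# Simplified most-similar-ligand lookup with RDKit Morgan fingerprints
# and Tanimoto similarity (not the full MOST pipeline).
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit import DataStructs

train = {  # SMILES -> hypothetical pKi against some target
    "CCO": 4.2,
    "c1ccccc1O": 5.8,
    "CC(=O)Oc1ccccc1C(=O)O": 6.4,
}

def fingerprint(smiles):
    mol = Chem.MolFromSmiles(smiles)
    return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

def most_similar(query_smiles):
    q = fingerprint(query_smiles)
    scored = [(DataStructs.TanimotoSimilarity(q, fingerprint(s)), s, pki)
              for s, pki in train.items()]
    return max(scored)   # (similarity, neighbour SMILES, neighbour pKi)

print(most_similar("CC(=O)Oc1ccccc1C(=O)C"))
```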

  9. Can computerized tomography accurately stage childhood renal tumors?

    PubMed

    Abdelhalim, Ahmed; Helmy, Tamer E; Harraz, Ahmed M; Abou-El-Ghar, Mohamed E; Dawaba, Mohamed E; Hafez, Ashraf T

    2014-07-01

    Staging of childhood renal tumors is crucial for treatment planning and outcome prediction. We sought to identify whether computerized tomography could accurately predict the local stage of childhood renal tumors. We retrospectively reviewed our database for patients diagnosed with childhood renal tumors and treated surgically between 1990 and 2013. Inability to retrieve preoperative computerized tomography, intraoperative tumor spillage and non-Wilms childhood renal tumors were exclusion criteria. The local computerized tomography stage was assigned by a single experienced pediatric radiologist blinded to the pathological stage, using a consensus similar to the Children's Oncology Group Wilms tumor staging system. Tumors were stratified into up-front surgery and preoperative chemotherapy groups. The radiological stage of each tumor was compared to the pathological stage. A total of 189 tumors in 179 patients met the inclusion criteria. Computerized tomography staging matched pathological staging in 68% of up-front surgery (70 of 103), 31.8% of pre-chemotherapy (21 of 66) and 48.8% of post-chemotherapy scans (42 of 86). Computerized tomography overstaged 21.4%, 65.2% and 46.5% of tumors in the up-front surgery, pre-chemotherapy and post-chemotherapy scans, respectively, and understaged 10.7%, 3% and 4.7%. Computerized tomography staging was more accurate for tumors managed by up-front surgery (p <0.001) and for those without extracapsular extension (p <0.001). The validity of computerized tomography staging of childhood renal tumors remains doubtful. Staging is more accurate for tumors treated with up-front surgery and those without extracapsular extension. Preoperative computerized tomography can help to exclude capsular breach. Treatment strategy should be based on surgical and pathological staging to avoid the hazards of inaccurate staging. Copyright © 2014 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.

  10. Analysis of energy-based algorithms for RNA secondary structure prediction

    PubMed Central

    2012-01-01

    Background: RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and synthesis of proteins. Since RNA function depends in large part on its folded structures, much effort has been invested in developing accurate methods for prediction of RNA secondary structure from the base sequence. Minimum free energy (MFE) predictions are widely used, based on nearest neighbor thermodynamic parameters of Mathews, Turner et al. or those of Andronescu et al. Some recently proposed alternatives that leverage partition function calculations find the structure with maximum expected accuracy (MEA) or pseudo-expected accuracy (pseudo-MEA) methods. Advances in prediction methods are typically benchmarked using sensitivity, positive predictive value and their harmonic mean, namely F-measure, on datasets of known reference structures. Since such benchmarks document progress in improving accuracy of computational prediction methods, it is important to understand how measures of accuracy vary as a function of the reference datasets and whether advances in algorithms or thermodynamic parameters yield statistically significant improvements. Our work advances such understanding for the MFE and (pseudo-)MEA-based methods, with respect to the latest datasets and energy parameters. Results: We present three main findings. First, using the bootstrap percentile method, we show that the average F-measure accuracy of the MFE and (pseudo-)MEA-based algorithms, as measured on our largest datasets with over 2000 RNAs from diverse families, is a reliable estimate (within a 2% range with high confidence) of the accuracy of a population of RNA molecules represented by this set. However, average accuracy on smaller classes of RNAs such as a class of 89 Group I introns used previously in benchmarking algorithm accuracy is not reliable enough to draw meaningful conclusions about the relative merits of the MFE and MEA-based algorithms. Second, on our large

  11. Analysis of energy-based algorithms for RNA secondary structure prediction.

    PubMed

    Hajiaghayi, Monir; Condon, Anne; Hoos, Holger H

    2012-02-01

    RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and synthesis of proteins. Since RNA function depends in large part on its folded structures, much effort has been invested in developing accurate methods for prediction of RNA secondary structure from the base sequence. Minimum free energy (MFE) predictions are widely used, based on nearest neighbor thermodynamic parameters of Mathews, Turner et al. or those of Andronescu et al. Some recently proposed alternatives that leverage partition function calculations find the structure with maximum expected accuracy (MEA) or pseudo-expected accuracy (pseudo-MEA) methods. Advances in prediction methods are typically benchmarked using sensitivity, positive predictive value and their harmonic mean, namely F-measure, on datasets of known reference structures. Since such benchmarks document progress in improving accuracy of computational prediction methods, it is important to understand how measures of accuracy vary as a function of the reference datasets and whether advances in algorithms or thermodynamic parameters yield statistically significant improvements. Our work advances such understanding for the MFE and (pseudo-)MEA-based methods, with respect to the latest datasets and energy parameters. We present three main findings. First, using the bootstrap percentile method, we show that the average F-measure accuracy of the MFE and (pseudo-)MEA-based algorithms, as measured on our largest datasets with over 2000 RNAs from diverse families, is a reliable estimate (within a 2% range with high confidence) of the accuracy of a population of RNA molecules represented by this set. However, average accuracy on smaller classes of RNAs such as a class of 89 Group I introns used previously in benchmarking algorithm accuracy is not reliable enough to draw meaningful conclusions about the relative merits of the MFE and MEA-based algorithms. Second, on our large datasets, the
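
    A short sketch of the bootstrap percentile method mentioned in the two records above, used to put a confidence interval on the average F-measure over a dataset of RNAs. The per-RNA F-measure values are hypothetical.

```python
# Sketch: bootstrap percentile confidence interval for average F-measure
# over a dataset of RNAs (per-RNA values below are hypothetical).
import numpy as np

def bootstrap_percentile_ci(values, n_boot=10000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    values = np.asarray(values, float)
    means = np.array([rng.choice(values, size=len(values), replace=True).mean()
                      for _ in range(n_boot)])
    return np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])

f_measures = np.clip(np.random.default_rng(1).normal(0.62, 0.15, 200), 0, 1)
print(bootstrap_percentile_ci(f_measures))
```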

  12. Predicting Patient-specific Dosimetric Benefits of Proton Therapy for Skull-base Tumors Using a Geometric Knowledge-based Method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hall, David C.; Trofimov, Alexei V.; Winey, Brian A.

    Purpose: To predict the organ at risk (OAR) dose levels achievable with proton beam therapy (PBT), solely based on the geometric arrangement of the target volume in relation to the OARs. A comparison with an alternative therapy yields a prediction of the patient-specific benefits offered by PBT. This could enable physicians at hospitals without proton capabilities to make a better-informed referral decision or aid patient selection in model-based clinical trials. Methods and Materials: Skull-base tumors were chosen to test the method, owing to their geometric complexity and multitude of nearby OARs. By exploiting the correlations between the dose and distance-to-target in existing PBT plans, the models were independently trained for 6 types of OARs: brainstem, cochlea, optic chiasm, optic nerve, parotid gland, and spinal cord. Once trained, the models could estimate the feasible dose–volume histogram and generalized equivalent uniform dose (gEUD) for OAR structures of new patients. The models were trained using 20 patients and validated using an additional 21 patients. Validation was achieved by comparing the predicted gEUD to that of the actual PBT plan. Results: The predicted and planned gEUD were in good agreement. Considering all OARs, the prediction error was +1.4 ± 5.1 Gy (mean ± standard deviation), and Pearson's correlation coefficient was 93%. By comparing with an intensity modulated photon treatment plan, the model could classify whether an OAR structure would experience a gain, with a sensitivity of 93% (95% confidence interval: 87%-97%) and specificity of 63% (95% confidence interval: 38%-84%). Conclusions: We trained and validated models that could quickly and accurately predict the patient-specific benefits of PBT for skull-base tumors. Similar models could be developed for other tumor sites. Such models will be useful when an estimation of the feasible benefits of PBT is desired but the experience and/or resources required for

  13. Molecular Dynamics in Mixed Solvents Reveals Protein-Ligand Interactions, Improves Docking, and Allows Accurate Binding Free Energy Predictions.

    PubMed

    Arcon, Juan Pablo; Defelipe, Lucas A; Modenutti, Carlos P; López, Elias D; Alvarez-Garcia, Daniel; Barril, Xavier; Turjanski, Adrián G; Martí, Marcelo A

    2017-04-24

    One of the most important biological processes at the molecular level is the formation of protein-ligand complexes. Therefore, determining their structure and underlying key interactions is of paramount relevance and has direct applications in drug development. Because of its low cost relative to its experimental sibling, molecular dynamics (MD) simulations in the presence of different solvent probes mimicking specific types of interactions have been increasingly used to analyze protein binding sites and reveal protein-ligand interaction hot spots. However, a systematic comparison of different probes and their real predictive power from a quantitative and thermodynamic point of view is still missing. In the present work, we have performed MD simulations of 18 different proteins in pure water as well as water mixtures of ethanol, acetamide, acetonitrile and methylammonium acetate, leading to a total of 5.4 μs simulation time. For each system, we determined the corresponding solvent sites, defined as space regions adjacent to the protein surface where the probability of finding a probe atom is higher than that in the bulk solvent. Finally, we compared the identified solvent sites with 121 different protein-ligand complexes and used them to perform molecular docking and ligand binding free energy estimates. Our results show that combining solely water and ethanol sites allows sampling over 70% of all possible protein-ligand interactions, especially those that coincide with ligand-based pharmacophoric points. Most important, we also show how the solvent sites can be used to significantly improve ligand docking in terms of both accuracy and precision, and that accurate predictions of ligand binding free energies, along with relative ranking of ligand affinity, can be performed.

  14. Predicting Individual Fuel Economy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Zhenhong; Greene, David L

    2011-01-01

    To make informed decisions about travel and vehicle purchase, consumers need unbiased and accurate information of the fuel economy they will actually obtain. In the past, the EPA fuel economy estimates based on its 1984 rules have been widely criticized for overestimating on-road fuel economy. In 2008, EPA adopted a new estimation rule. This study compares the usefulness of the EPA's 1984 and 2008 estimates based on their prediction bias and accuracy and attempts to improve the prediction of on-road fuel economies based on consumer and vehicle attributes. We examine the usefulness of the EPA fuel economy estimates using a large sample of self-reported on-road fuel economy data and develop an Individualized Model for more accurately predicting an individual driver's on-road fuel economy based on easily determined vehicle and driver attributes. Accuracy rather than bias appears to have limited the usefulness of the EPA 1984 estimates in predicting on-road MPG. The EPA 2008 estimates appear to be equally inaccurate and substantially more biased relative to the self-reported data. Furthermore, the 2008 estimates exhibit an underestimation bias that increases with increasing fuel economy, suggesting that the new numbers will tend to underestimate the real-world benefits of fuel economy and emissions standards. By including several simple driver and vehicle attributes, the Individualized Model reduces the unexplained variance by over 55% and the standard error by 33% based on an independent test sample. The additional explanatory variables can be easily provided by the individuals.
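
    A minimal sketch of the kind of individualized regression implied above, using synthetic placeholder data in place of the self-reported on-road fuel economy sample; the attribute names and coefficients are assumptions for illustration only.

      import numpy as np
      import pandas as pd
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import r2_score

      # Hypothetical sample: EPA label MPG plus easily determined driver/vehicle attributes.
      rng = np.random.default_rng(0)
      n = 2000
      df = pd.DataFrame({
          "epa_mpg": rng.uniform(15, 45, n),
          "highway_share": rng.uniform(0, 1, n),
          "vehicle_age": rng.integers(0, 15, n),
          "driver_age": rng.integers(18, 80, n),
      })
      df["onroad_mpg"] = (0.9 * df["epa_mpg"] + 4 * df["highway_share"]
                          - 0.2 * df["vehicle_age"] + rng.normal(0, 2, n))

      X = df.drop(columns="onroad_mpg")
      X_tr, X_te, y_tr, y_te = train_test_split(X, df["onroad_mpg"], random_state=0)
      model = LinearRegression().fit(X_tr, y_tr)
      print("R^2 on independent test sample:", round(r2_score(y_te, model.predict(X_te)), 2))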

  15. A storm-based CSLE incorporating the modified SCS-CN method for soil loss prediction on the Chinese Loess Plateau

    NASA Astrophysics Data System (ADS)

    Shi, Wenhai; Huang, Mingbin

    2017-04-01

    The Chinese Loess Plateau is one of the most erodible areas in the world. In order to reduce soil and water losses, suitable conservation practices need to be designed. For this purpose, there is an increasing demand for an appropriate model that can accurately predict storm-based surface runoff and soil losses on the Loess Plateau. The Chinese Soil Loss Equation (CSLE) has been widely used in this region to assess soil losses from different land use types. However, the CSLE was intended only to predict the mean annual gross soil loss. In this study, a CSLE was proposed that would be storm-based and that introduced a new rainfall-runoff erosivity factor. A dataset was compiled that comprised measurements of soil losses during individual storms from three runoff-erosion plots in each of three different watersheds in the gully region of the Plateau for 3-7 years in three different time periods (1956-1959; 1973-1980; 2010-13). The accuracy of the soil loss predictions made by the new storm-based CSLE was determined using the data for the six plots in two of the watersheds measured during 165 storm-runoff events. The performance of the storm-based CSLE was further compared with the performance of the storm-based Revised Universal Soil Loss Equation (RUSLE) for the same six plots. During the calibration (83 storms) and validation (82 storms) of the storm-based CSLE, the model efficiency, E, was 87.7% and 88.9%, respectively, while the root mean square error (RMSE) was 2.7 and 2.3 t ha-1, indicating a high degree of accuracy. Furthermore, the storm-based CSLE performed better than the storm-based RUSLE (E: 75.8% and 70.3%; RMSE: 3.8 and 3.7 t ha-1, for the calibration and validation storms, respectively). The storm-based CSLE was then used to predict the soil losses from the three experimental plots in the third watershed. For these predictions, the model parameter values, previously determined by the calibration based on the data from the initial six plots, were used in
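
    The goodness-of-fit statistics quoted above can be reproduced for any pair of observed and predicted storm soil losses; a generic sketch, assuming the model efficiency E is the Nash-Sutcliffe efficiency commonly used for such models (the per-storm values below are invented).

      import numpy as np

      def nash_sutcliffe_e(obs, pred):
          # Model efficiency: E = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
          obs, pred = np.asarray(obs, float), np.asarray(pred, float)
          return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

      def rmse(obs, pred):
          obs, pred = np.asarray(obs, float), np.asarray(pred, float)
          return float(np.sqrt(np.mean((obs - pred) ** 2)))

      # Hypothetical per-storm soil losses (t/ha):
      obs = [1.2, 8.5, 0.4, 15.3, 3.1]
      pred = [1.0, 9.1, 0.6, 13.8, 3.5]
      print(f"E = {100 * nash_sutcliffe_e(obs, pred):.1f}%, RMSE = {rmse(obs, pred):.2f} t/ha")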

  16. Does ultrasonography accurately diagnose acute cholecystitis? Improving diagnostic accuracy based on a review at a regional hospital

    PubMed Central

    Hwang, Hamish; Marsh, Ian; Doyle, Jason

    2014-01-01

    Background Acute cholecystitis is one of the most common diseases requiring emergency surgery. Ultrasonography is an accurate test for cholelithiasis but has a high false-negative rate for acute cholecystitis. The Murphy sign and laboratory tests performed independently are also not particularly accurate. This study was designed to review the accuracy of ultrasonography for diagnosing acute cholecystitis in a regional hospital. Methods We studied all emergency cholecystectomies performed over a 1-year period. All imaging studies were reviewed by a single radiologist, and all pathology was reviewed by a single pathologist. The reviewers were blinded to each other’s results. Results A total of 107 patients required an emergency cholecystectomy in the study period; 83 of them underwent ultrasonography. Interradiologist agreement was 92% for ultrasonography. For cholelithiasis, ultrasonography had 100% sensitivity, 18% specificity, 81% positive predictive value (PPV) and 100% negative predictive value (NPV). For acute cholecystitis, it had 54% sensitivity, 81% specificity, 85% PPV and 47% NPV. All patients had chronic cholecystitis and 67% had acute cholecystitis on histology. When combined with positive Murphy sign and elevated neutrophil count, an ultrasound showing cholelithiasis or acute cholecystitis yielded a sensitivity of 74%, specificity of 62%, PPV of 80% and NPV of 53% for the diagnosis of acute cholecystitis. Conclusion Ultrasonography alone has a high rate of false-negative studies for acute cholecystitis. However, a higher rate of accurate diagnosis can be achieved using a triad of positive Murphy sign, elevated neutrophil count and an ultrasound showing cholelithiasis or cholecystitis. PMID:24869607
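
    The sensitivity, specificity, PPV, and NPV figures above follow from a standard 2x2 table of test result versus histology. A small helper for such calculations; the counts are made up, chosen only to roughly reproduce the triad figures quoted in the abstract.

      def diagnostic_metrics(tp, fp, fn, tn):
          # Standard definitions for a binary diagnostic test.
          return {
              "sensitivity": tp / (tp + fn),
              "specificity": tn / (tn + fp),
              "ppv": tp / (tp + fp),
              "npv": tn / (tn + fn),
          }

      # Hypothetical counts: triad-positive vs. acute cholecystitis on histology.
      print(diagnostic_metrics(tp=41, fp=10, fn=14, tn=16))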

  17. Accurate prediction of retention in hydrophilic interaction chromatography by back calculation of high pressure liquid chromatography gradient profiles.

    PubMed

    Wang, Nu; Boswell, Paul G

    2017-10-20

    Gradient retention times are difficult to project from the underlying retention factor (k) vs. solvent composition (φ) relationships. A major reason for this difficulty is that gradients produced by HPLC pumps are imperfect - gradient delay, gradient dispersion, and solvent mis-proportioning are all difficult to account for in calculations. However, we recently showed that a gradient "back-calculation" methodology can measure these imperfections and take them into account. In RPLC, when the back-calculation methodology was used, error in projected gradient retention times was as low as could be expected based on repeatability in the k vs. φ relationships. HILIC, however, presents a new challenge: the selectivity of HILIC columns drifts strongly over time. Retention is repeatable over short times, but selectivity frequently drifts over the course of weeks. In this study, we set out to understand if the issue of selectivity drift can be avoided by doing our experiments quickly, and if there are any other factors that make it difficult to predict gradient retention times from isocratic k vs. φ relationships when gradient imperfections are taken into account with the back-calculation methodology. While in past reports the error in retention projections was >5%, the back-calculation methodology brought our error down to ∼1%. This result was 6-43 times more accurate than projections made using ideal gradients and 3-5 times more accurate than the same retention projections made using offset gradients (i.e., gradients that only took gradient delay into account). Still, the error remained higher in our HILIC projections than in RPLC. Based on the shape of the back-calculated gradients, we suspect the higher error is a result of prominent gradient distortion caused by strong, preferential water uptake from the mobile phase into the stationary phase during the gradient - a factor our model did not properly take into account. It appears that, at least with the stationary phase

  18. Development of a New Model for Accurate Prediction of Cloud Water Deposition on Vegetation

    NASA Astrophysics Data System (ADS)

    Katata, G.; Nagai, H.; Wrzesinsky, T.; Klemm, O.; Eugster, W.; Burkard, R.

    2006-12-01

    Scarcity of water resources in arid and semi-arid areas is of great concern in the light of population growth and food shortages. Several experiments focusing on cloud (fog) water deposition on the land surface suggest that cloud water plays an important role in the water resources of such regions. A one-dimensional vegetation model including the process of cloud water deposition on vegetation has been developed to better predict cloud water deposition on the vegetation. New schemes to calculate the capture efficiency of leaves, the cloud droplet size distribution, and the gravitational flux of cloud water were incorporated in the model. Model calculations were compared with the data acquired at the Norway spruce forest at the Waldstein site, Germany. High performance of the model was confirmed by comparisons of calculated net radiation, sensible and latent heat, and cloud water fluxes over the forest with measurements. The present model provided a better prediction of measured turbulent and gravitational fluxes of cloud water over the canopy than the Lovett model, which is a commonly used cloud water deposition model. Detailed calculations of evapotranspiration and of the turbulent exchange of heat and water vapor within the canopy, together with the corresponding model modifications, are necessary for accurate prediction of cloud water deposition. Numerical experiments to examine the dependence of cloud water deposition on the vegetation species (coniferous and broad-leaved trees, flat and cylindrical grasses) and structures (Leaf Area Index (LAI) and canopy height) are performed using the presented model. The results indicate that differences in leaf shape and size have a large impact on cloud water deposition. Cloud water deposition also varies with the growth of vegetation and seasonal change of LAI. We found that the coniferous trees whose height and LAI are 24 m and 2.0 m2 m-2, respectively, produce the largest amount of cloud water deposition in all combinations of vegetation species and structures in the

  19. Deep Learning Accurately Predicts Estrogen Receptor Status in Breast Cancer Metabolomics Data.

    PubMed

    Alakwaa, Fadhl M; Chaudhary, Kumardeep; Garmire, Lana X

    2018-01-05

    Metabolomics holds the promise as a new technology to diagnose highly heterogeneous diseases. Conventionally, metabolomics data analysis for diagnosis is done using various statistical and machine learning based classification methods. However, it remains unknown if deep neural network, a class of increasingly popular machine learning methods, is suitable to classify metabolomics data. Here we use a cohort of 271 breast cancer tissues, 204 positive estrogen receptor (ER+), and 67 negative estrogen receptor (ER-) to test the accuracies of feed-forward networks, a deep learning (DL) framework, as well as six widely used machine learning models, namely random forest (RF), support vector machines (SVM), recursive partitioning and regression trees (RPART), linear discriminant analysis (LDA), prediction analysis for microarrays (PAM), and generalized boosted models (GBM). DL framework has the highest area under the curve (AUC) of 0.93 in classifying ER+/ER- patients, compared to the other six machine learning algorithms. Furthermore, the biological interpretation of the first hidden layer reveals eight commonly enriched significant metabolomics pathways (adjusted P-value <0.05) that cannot be discovered by other machine learning methods. Among them, protein digestion and absorption and ATP-binding cassette (ABC) transporters pathways are also confirmed in integrated analysis between metabolomics and gene expression data in these samples. In summary, deep learning method shows advantages for metabolomics based breast cancer ER status classification, with both the highest prediction accuracy (AUC = 0.93) and better revelation of disease biology. We encourage the adoption of feed-forward networks based deep learning method in the metabolomics research community for classification.
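
    A schematic scikit-learn version of a feed-forward classifier evaluated by AUC, in the spirit of the comparison described above; the metabolite matrix here is random placeholder data (so the AUC will be near chance), and the layer sizes are assumptions rather than the authors' architecture.

      import numpy as np
      from sklearn.model_selection import train_test_split
      from sklearn.preprocessing import StandardScaler
      from sklearn.neural_network import MLPClassifier
      from sklearn.metrics import roc_auc_score

      # Hypothetical metabolite matrix X (samples x metabolites) and ER labels y (1 = ER+).
      rng = np.random.default_rng(0)
      X = rng.normal(size=(271, 160))
      y = (rng.random(271) < 0.75).astype(int)

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
      scaler = StandardScaler().fit(X_tr)
      clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)
      clf.fit(scaler.transform(X_tr), y_tr)
      auc = roc_auc_score(y_te, clf.predict_proba(scaler.transform(X_te))[:, 1])
      print("AUC:", round(auc, 3))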

  20. Accurate prediction of subcellular location of apoptosis proteins combining Chou's PseAAC and PsePSSM based on wavelet denoising.

    PubMed

    Yu, Bin; Li, Shan; Qiu, Wen-Ying; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Wang, Ming-Hui; Zhang, Yan

    2017-12-08

    Information on the subcellular localization of apoptosis proteins is very important for understanding the mechanism of programmed cell death and for drug development. The prediction of the subcellular localization of an apoptosis protein is still a challenging task, and accurate predictions can help to understand protein function and the role of metabolic processes. In this paper, we propose a novel method for protein subcellular localization prediction. Firstly, the features of the protein sequence are extracted by combining Chou's pseudo amino acid composition (PseAAC) and the pseudo-position specific scoring matrix (PsePSSM); the extracted feature information is then denoised by two-dimensional (2-D) wavelet denoising. Finally, the optimal feature vectors are input to an SVM classifier to predict the subcellular location of apoptosis proteins. Quite promising predictions are obtained using the jackknife test on three widely used datasets and compared with other state-of-the-art methods. The results indicate that the method proposed in this paper can remarkably improve the prediction accuracy of apoptosis protein subcellular localization, and it will be a supplementary tool for future proteomics research.
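
    A schematic version of the pipeline described above, assuming each protein has already been encoded as a 2-D feature matrix (e.g., PseAAC/PsePSSM-derived); PyWavelets supplies the 2-D wavelet denoising and scikit-learn the SVM. This is an illustrative sketch on random placeholder data, not the authors' implementation.

      import numpy as np
      import pywt
      from sklearn.svm import SVC
      from sklearn.model_selection import cross_val_score

      def wavelet_denoise_2d(mat, wavelet="db4", level=2, thresh=0.1):
          # Soft-threshold the detail coefficients of a 2-D wavelet decomposition.
          coeffs = pywt.wavedec2(mat, wavelet, level=level)
          denoised = [coeffs[0]] + [
              tuple(pywt.threshold(c, thresh, mode="soft") for c in detail)
              for detail in coeffs[1:]
          ]
          rec = pywt.waverec2(denoised, wavelet)
          return rec[:mat.shape[0], :mat.shape[1]]   # crop any padding from reconstruction

      # Hypothetical data: 200 proteins x (20 x 20) feature matrices, 4 location classes.
      rng = np.random.default_rng(0)
      feats = rng.normal(size=(200, 20, 20))
      labels = rng.integers(0, 4, size=200)

      X = np.array([wavelet_denoise_2d(f).ravel() for f in feats])
      print(cross_val_score(SVC(kernel="rbf", C=10), X, labels, cv=5).mean())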

  1. Accurate Predictions of Mean Geomagnetic Dipole Excursion and Reversal Frequencies, Mean Paleomagnetic Field Intensity, and the Radius of Earth's Core Using McLeod's Rule

    NASA Technical Reports Server (NTRS)

    Voorhies, Coerte V.; Conrad, Joy

    1996-01-01

    The geomagnetic spatial power spectrum R_n(r) is the mean square magnetic induction represented by degree n spherical harmonic coefficients of the internal scalar potential averaged over the geocentric sphere of radius r. McLeod's Rule for the magnetic field generated by Earth's core geodynamo says that the expected core surface power spectrum ⟨R_nc(c)⟩ is inversely proportional to (2n + 1) for 1 < n ≤ N_E. McLeod's Rule is verified by locating Earth's core with main field models of Magsat data; the estimated core radius of 3485 km is close to the seismologic value for c of 3480 km. McLeod's Rule and similar forms are then calibrated with the model values of R_n for 3 ≤ n ≤ 12. Extrapolation to the degree 1 dipole predicts the expectation value of Earth's dipole moment to be about 5.89 x 10^22 A m^2 rms (74.5% of the 1980 value) and the expected geomagnetic intensity to be about 35.6 μT rms at Earth's surface. Archeo- and paleomagnetic field intensity data show these and related predictions to be reasonably accurate. The probability distribution chi-squared with 2n + 1 degrees of freedom is assigned to (2n + 1)R_nc/⟨R_nc⟩. Extending this to the dipole implies that an exceptionally weak absolute dipole moment (≤ 20% of the 1980 value) will exist during 2.5% of geologic time. The mean duration for such major geomagnetic dipole power excursions, one quarter of which feature durable axial dipole reversal, is estimated from the modern dipole power time-scale and the statistical model of excursions. The resulting mean excursion duration of 2767 years forces us to predict an average of 9.04 excursions per million years, 2.26 axial dipole reversals per million years, and a mean reversal duration of 5533 years. Paleomagnetic data show these predictions to be quite accurate. McLeod's Rule led to accurate predictions of Earth's core radius, mean paleomagnetic field

  2. Prediction of Combustion Gas Deposit Compositions

    NASA Technical Reports Server (NTRS)

    Kohl, F. J.; Mcbride, B. J.; Zeleznik, F. J.; Gordon, S.

    1985-01-01

    Demonstrated procedure used to predict accurately chemical compositions of complicated deposit mixtures. NASA Lewis Research Center's Computer Program for Calculation of Complex Chemical Equilibrium Compositions (CEC) used in conjunction with Computer Program for Calculation of Ideal Gas Thermodynamic Data (PAC) and resulting Thermodynamic Data Base (THDATA) to predict deposit compositions from metal or mineral-seeded combustion processes.

  3. nuMap: A Web Platform for Accurate Prediction of Nucleosome Positioning

    PubMed Central

    Alharbi, Bader A.; Alshammari, Thamir H.; Felton, Nathan L.; Zhurkin, Victor B.; Cui, Feng

    2014-01-01

    Nucleosome positioning is critical for gene expression and of major biological interest. The high cost of experimentally mapping nucleosomal arrangement signifies the need for computational approaches to predict nucleosome positions at high resolution. Here, we present a web-based application to fulfill this need by implementing two models, YR and W/S schemes, for the translational and rotational positioning of nucleosomes, respectively. Our methods are based on sequence-dependent anisotropic bending that dictates how DNA is wrapped around a histone octamer. This application allows users to specify a number of options such as schemes and parameters for threading calculation and provides multiple layout formats. The nuMap is implemented in Java/Perl/MySQL and is freely available for public use at http://numap.rit.edu. The user manual, implementation notes, description of the methodology and examples are available at the site. PMID:25220945

  4. A Multiscale Red Blood Cell Model with Accurate Mechanics, Rheology, and Dynamics

    PubMed Central

    Fedosov, Dmitry A.; Caswell, Bruce; Karniadakis, George Em

    2010-01-01

    Abstract Red blood cells (RBCs) have highly deformable viscoelastic membranes exhibiting complex rheological response and rich hydrodynamic behavior governed by special elastic and bending properties and by the external/internal fluid and membrane viscosities. We present a multiscale RBC model that is able to predict RBC mechanics, rheology, and dynamics in agreement with experiments. Based on an analytic theory, the modeled membrane properties can be uniquely related to the experimentally established RBC macroscopic properties without any adjustment of parameters. The RBC linear and nonlinear elastic deformations match those obtained in optical-tweezers experiments. The rheological properties of the membrane are compared with those obtained in optical magnetic twisting cytometry, membrane thermal fluctuations, and creep followed by cell recovery. The dynamics of RBCs in shear and Poiseuille flows is tested against experiments and theoretical predictions, and the applicability of the latter is discussed. Our findings clearly indicate that a purely elastic model for the membrane cannot accurately represent the RBC's rheological properties and its dynamics, and therefore accurate modeling of a viscoelastic membrane is necessary. PMID:20483330

  5. Robust and accurate decoding of motoneuron behavior and prediction of the resulting force output.

    PubMed

    Thompson, Christopher K; Negro, Francesco; Johnson, Michael D; Holmes, Matthew R; McPherson, Laura Miller; Powers, Randall K; Farina, Dario; Heckman, Charles J

    2018-05-03

    The spinal alpha motoneuron is the only cell in the human CNS whose discharge can be routinely recorded in humans. We have reengineered motor unit collection and decomposition approaches, originally developed in humans, to measure the neural drive to muscle and estimate muscle force generation in the decerebrate cat model. Experimental, computational, and predictive approaches are used to demonstrate the validity of this approach across a wide range of modes to activate the motor pool. The utility of this approach is shown through the ability to track individual motor units across trials, allowing for better predictions of muscle force than the electromyography signal, and providing insights into the stereotypical discharge characteristics in response to synaptic activation of the motor pool. This approach now allows for a direct link between the intracellular data of single motoneurons, the discharge properties of motoneuron populations, and muscle force generation in the same preparation. The discharge of a spinal alpha motoneuron and the resulting contraction of its muscle fibers represents the functional quantum of the motor system. Recent advances in the recording and decomposition of the electromyographic signal allow for the identification of several tens of concurrently active motor units. These detailed population data provide the potential to achieve deep insights into the synaptic organization of motor commands. Yet most of our understanding of the synaptic input to motoneurons is derived from intracellular recordings in animal preparations. Thus, it is necessary to extend the new electrode and decomposition methods to recording of motor unit populations in these same preparations. To achieve this goal, we use high-density electrode arrays and decomposition techniques, analogous to those developed for humans, to record and decompose the activity of tens of concurrently active motor units in a hindlimb muscle in the decerebrate cat. Our results showed

  6. Accurate perception of negative emotions predicts functional capacity in schizophrenia.

    PubMed

    Abram, Samantha V; Karpouzian, Tatiana M; Reilly, James L; Derntl, Birgit; Habel, Ute; Smith, Matthew J

    2014-04-30

    Several studies suggest facial affect perception (FAP) deficits in schizophrenia are linked to poorer social functioning. However, whether reduced functioning is associated with inaccurate perception of specific emotional valence or a global FAP impairment remains unclear. The present study examined whether impairment in the perception of specific emotional valences (positive, negative) and neutrality was uniquely associated with social functioning, using a multimodal social functioning battery. A sample of 59 individuals with schizophrenia and 41 controls completed a computerized FAP task, and measures of functional capacity, social competence, and social attainment. Participants also underwent neuropsychological testing and symptom assessment. Regression analyses revealed that only accurately perceiving negative emotions explained significant variance (7.9%) in functional capacity after accounting for neurocognitive function and symptoms. Partial correlations indicated that accurately perceiving anger, in particular, was positively correlated with functional capacity. FAP for positive, negative, or neutral emotions was not related to social competence or social attainment. Our findings were consistent with prior literature suggesting negative emotions are related to functional capacity in schizophrenia. Furthermore, the observed relationship between perceiving anger and performance of everyday living skills is novel and warrants further exploration. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  7. Accurate SHAPE-directed RNA secondary structure modeling, including pseudoknots.

    PubMed

    Hajdin, Christine E; Bellaousov, Stanislav; Huggins, Wayne; Leonard, Christopher W; Mathews, David H; Weeks, Kevin M

    2013-04-02

    A pseudoknot forms in an RNA when nucleotides in a loop pair with a region outside the helices that close the loop. Pseudoknots occur relatively rarely in RNA but are highly overrepresented in functionally critical motifs in large catalytic RNAs, in riboswitches, and in regulatory elements of viruses. Pseudoknots are usually excluded from RNA structure prediction algorithms. When included, these pairings are difficult to model accurately, especially in large RNAs, because allowing this structure dramatically increases the number of possible incorrect folds and because it is difficult to search the fold space for an optimal structure. We have developed a concise secondary structure modeling approach that combines SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) experimental chemical probing information and a simple, but robust, energy model for the entropic cost of single pseudoknot formation. Structures are predicted with iterative refinement, using a dynamic programming algorithm. This melded experimental and thermodynamic energy function predicted the secondary structures and the pseudoknots for a set of 21 challenging RNAs of known structure ranging in size from 34 to 530 nt. On average, 93% of known base pairs were predicted, and all pseudoknots in well-folded RNAs were identified.

  8. Accurate visible speech synthesis based on concatenating variable length motion capture data.

    PubMed

    Ma, Jiyong; Cole, Ron; Pellom, Bryan; Ward, Wayne; Wise, Barbara

    2006-01-01

    We present a novel approach to synthesizing accurate visible speech based on searching and concatenating optimal variable-length units in a large corpus of motion capture data. Based on a set of visual prototypes selected on a source face and a corresponding set designated for a target face, we propose a machine learning technique to automatically map the facial motions observed on the source face to the target face. In order to model the long distance coarticulation effects in visible speech, a large-scale corpus that covers the most common syllables in English was collected, annotated and analyzed. For any input text, a search algorithm to locate the optimal sequences of concatenated units for synthesis is described. A new algorithm to adapt lip motions from a generic 3D face model to a specific 3D face model is also proposed. A complete, end-to-end visible speech animation system is implemented based on the approach. This system is currently used in more than 60 kindergarten through third grade classrooms to teach students to read using a lifelike conversational animated agent. To evaluate the quality of the visible speech produced by the animation system, both subjective evaluation and objective evaluation are conducted. The evaluation results show that the proposed approach is accurate and powerful for visible speech synthesis.

  9. BEST: Improved Prediction of B-Cell Epitopes from Antigen Sequences

    PubMed Central

    Gao, Jianzhao; Faraggi, Eshel; Zhou, Yaoqi; Ruan, Jishou; Kurgan, Lukasz

    2012-01-01

    Accurate identification of immunogenic regions in a given antigen chain is a difficult and actively pursued problem. Although accurate predictors for T-cell epitopes are already in place, the prediction of the B-cell epitopes requires further research. We overview the available approaches for the prediction of B-cell epitopes and propose a novel and accurate sequence-based solution. Our BEST (B-cell Epitope prediction using Support vector machine Tool) method predicts epitopes from antigen sequences, in contrast to some methods that predict only from short sequence fragments, using a new architecture based on averaging selected scores generated from sliding 20-mers by a Support Vector Machine (SVM). The SVM predictor utilizes a comprehensive and custom designed set of inputs generated by combining information derived from the chain, sequence conservation, similarity to known (training) epitopes, and predicted secondary structure and relative solvent accessibility. Empirical evaluation on benchmark datasets demonstrates that BEST outperforms several modern sequence-based B-cell epitope predictors including ABCPred, the method by Chen et al. (2007), BCPred, COBEpro, BayesB, and CBTOPE, when considering the predictions from antigen chains and from the chain fragments. Our method obtains a cross-validated area under the receiver operating characteristic curve (AUC) for the fragment-based prediction at 0.81 and 0.85, depending on the dataset. The AUCs of BEST on the benchmark sets of full antigen chains equal 0.57 and 0.6, which are significantly and slightly better, respectively, than the next best method we tested. We also present case studies to contrast the propensity profiles generated by BEST and several other methods. PMID:22761950
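
    The core idea of averaging sliding 20-mer scores into per-residue epitope propensities can be sketched as follows; window_score stands in for the trained SVM (a toy scoring function is used here, so the output is purely illustrative).

      import numpy as np

      def per_residue_propensity(sequence, window_score, w=20):
          # Score every w-mer, then average the scores of all windows covering a residue.
          n = len(sequence)
          sums, counts = np.zeros(n), np.zeros(n)
          for start in range(n - w + 1):
              s = window_score(sequence[start:start + w])
              sums[start:start + w] += s
              counts[start:start + w] += 1
          return sums / np.maximum(counts, 1)

      # Toy stand-in for the SVM: fraction of charged/polar residues in the 20-mer.
      toy_score = lambda pep: sum(aa in "DEKRHNQSTY" for aa in pep) / len(pep)
      print(per_residue_propensity("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", toy_score)[:5])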

  10. A Simple and Accurate Model to Predict Responses to Multi-electrode Stimulation in the Retina

    PubMed Central

    Maturana, Matias I.; Apollo, Nicholas V.; Hadjinicolaou, Alex E.; Garrett, David J.; Cloherty, Shaun L.; Kameneva, Tatiana; Grayden, David B.; Ibbotson, Michael R.; Meffin, Hamish

    2016-01-01

    Implantable electrode arrays are widely used in therapeutic stimulation of the nervous system (e.g. cochlear, retinal, and cortical implants). Currently, most neural prostheses use serial stimulation (i.e. one electrode at a time) despite this severely limiting the repertoire of stimuli that can be applied. Methods to reliably predict the outcome of multi-electrode stimulation have not been available. Here, we demonstrate that a linear-nonlinear model accurately predicts neural responses to arbitrary patterns of stimulation using in vitro recordings from single retinal ganglion cells (RGCs) stimulated with a subretinal multi-electrode array. In the model, the stimulus is projected onto a low-dimensional subspace and then undergoes a nonlinear transformation to produce an estimate of spiking probability. The low-dimensional subspace is estimated using principal components analysis, which gives the neuron’s electrical receptive field (ERF), i.e. the electrodes to which the neuron is most sensitive. Our model suggests that stimulation proportional to the ERF yields a higher efficacy given a fixed amount of power when compared to equal amplitude stimulation on up to three electrodes. We find that the model captures the responses of all the cells recorded in the study, suggesting that it will generalize to most cell types in the retina. The model is computationally efficient to evaluate and, therefore, appropriate for future real-time applications including stimulation strategies that make use of recorded neural activity to improve the stimulation strategy. PMID:27035143
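
    A compact sketch of such a linear-nonlinear pipeline: PCA estimates a low-dimensional stimulus subspace (here from spike-triggered stimuli, as one simple approximation of the electrical receptive field), and a logistic regression supplies the static nonlinearity mapping to spiking probability. The data, dimensions, and ground-truth weights are hypothetical.

      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(1)
      stim = rng.normal(size=(5000, 20))            # stimulation amplitudes on 20 electrodes
      erf_true = np.zeros(20)
      erf_true[:3] = [1.0, 0.6, 0.3]                # cell driven mainly by 3 electrodes
      p = 1 / (1 + np.exp(-(stim @ erf_true - 0.5)))
      spikes = rng.random(5000) < p

      # Estimate the receptive-field subspace from spike-triggered stimuli, then fit the nonlinearity.
      pca = PCA(n_components=2).fit(stim[spikes])
      z = pca.transform(stim)
      nonlin = LogisticRegression().fit(z, spikes)
      print("predicted spike probability, first 3 trials:",
            nonlin.predict_proba(z[:3])[:, 1].round(3))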

  11. Accurate Quantitative Sensing of Intracellular pH based on Self-ratiometric Upconversion Luminescent Nanoprobe.

    PubMed

    Li, Cuixia; Zuo, Jing; Zhang, Li; Chang, Yulei; Zhang, Youlin; Tu, Langping; Liu, Xiaomin; Xue, Bin; Li, Qiqing; Zhao, Huiying; Zhang, Hong; Kong, Xianggui

    2016-12-09

    Accurate quantitation of intracellular pH (pHi) is of great importance in revealing the cellular activities and early warning of diseases. A series of fluorescence-based nano-bioprobes composed of different nanoparticles and/or dye pairs have already been developed for pHi sensing. Till now, biological auto-fluorescence background upon UV-Vis excitation and severe photo-bleaching of dyes are the two main factors impeding the accurate quantitative detection of pHi. Herein, we have developed a self-ratiometric luminescence nanoprobe based on Förster resonant energy transfer (FRET) for probing pHi, in which pH-sensitive fluorescein isothiocyanate (FITC) and upconversion nanoparticles (UCNPs) served as energy acceptor and donor, respectively. Under 980 nm excitation, upconversion emission bands at 475 nm and 645 nm of NaYF4:Yb3+, Tm3+ UCNPs were used as pHi response and self-ratiometric reference signal, respectively. This direct quantitative sensing approach has circumvented the traditional software-based subsequent processing of images which may lead to relatively large uncertainty of the results. Owing to efficient FRET and a fluorescence-background-free readout, highly sensitive and accurate sensing has been achieved, with a response of 3.56 per unit change in pHi over the range 3.0-7.0 and a deviation of less than 0.43. This approach should facilitate research in pHi-related areas and the development of intracellular drug delivery systems.

  12. Accurate Quantitative Sensing of Intracellular pH based on Self-ratiometric Upconversion Luminescent Nanoprobe

    NASA Astrophysics Data System (ADS)

    Li, Cuixia; Zuo, Jing; Zhang, Li; Chang, Yulei; Zhang, Youlin; Tu, Langping; Liu, Xiaomin; Xue, Bin; Li, Qiqing; Zhao, Huiying; Zhang, Hong; Kong, Xianggui

    2016-12-01

    Accurate quantitation of intracellular pH (pHi) is of great importance in revealing the cellular activities and early warning of diseases. A series of fluorescence-based nano-bioprobes composed of different nanoparticles and/or dye pairs have already been developed for pHi sensing. Till now, biological auto-fluorescence background upon UV-Vis excitation and severe photo-bleaching of dyes are the two main factors impeding the accurate quantitative detection of pHi. Herein, we have developed a self-ratiometric luminescence nanoprobe based on Förster resonant energy transfer (FRET) for probing pHi, in which pH-sensitive fluorescein isothiocyanate (FITC) and upconversion nanoparticles (UCNPs) served as energy acceptor and donor, respectively. Under 980 nm excitation, upconversion emission bands at 475 nm and 645 nm of NaYF4:Yb3+, Tm3+ UCNPs were used as pHi response and self-ratiometric reference signal, respectively. This direct quantitative sensing approach has circumvented the traditional software-based subsequent processing of images which may lead to relatively large uncertainty of the results. Owing to efficient FRET and a fluorescence-background-free readout, highly sensitive and accurate sensing has been achieved, with a response of 3.56 per unit change in pHi over the range 3.0-7.0 and a deviation of less than 0.43. This approach should facilitate research in pHi-related areas and the development of intracellular drug delivery systems.

  13. Learning linear transformations between counting-based and prediction-based word embeddings

    PubMed Central

    Hayashi, Kohei; Kawarabayashi, Ken-ichi

    2017-01-01

    Despite the growing interest in prediction-based word embedding learning methods, it remains unclear as to how the vector spaces learnt by the prediction-based methods differ from that of the counting-based methods, or whether one can be transformed into the other. To study the relationship between counting-based and prediction-based embeddings, we propose a method for learning a linear transformation between two given sets of word embeddings. Our proposal contributes to the word embedding learning research in three ways: (a) we propose an efficient method to learn a linear transformation between two sets of word embeddings, (b) using the transformation learnt in (a), we empirically show that it is possible to predict distributed word embeddings for novel unseen words, and (c) empirically it is possible to linearly transform counting-based embeddings to prediction-based embeddings, for frequent words, different POS categories, and varying degrees of ambiguities. PMID:28926629
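
    Learning a linear map between two embedding spaces reduces to an ordinary least-squares problem over a shared vocabulary; a minimal sketch with random placeholder matrices standing in for the two sets of embeddings.

      import numpy as np

      # Hypothetical embeddings for the same 10,000 words in two spaces.
      rng = np.random.default_rng(0)
      counting = rng.normal(size=(10000, 300))    # counting-based embeddings
      prediction = rng.normal(size=(10000, 300))  # prediction-based embeddings

      # Solve for W minimizing ||counting @ W - prediction||_F.
      W, *_ = np.linalg.lstsq(counting, prediction, rcond=None)

      def map_embedding(vec, W=W):
          # Predict the prediction-based embedding of a (possibly unseen) word.
          return vec @ W

      print(map_embedding(counting[0]).shape)     # (300,)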

  14. Structural features based genome-wide characterization and prediction of nucleosome organization

    PubMed Central

    2012-01-01

    Background Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored nucleosome formation potential of genomic sequences and the effect on chromatin organization and gene expression in S. cerevisiae. Results We analyzed twelve structural features related to flexibility, curvature and energy of DNA sequences. The results showed that some structural features such as DNA denaturation, DNA-bending stiffness, Stacking energy, Z-DNA, Propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model to predict more accurately nucleosome occupancy in vivo than the existing methods that mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that located nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The result showed that the meta DLaNe and HMM-based method performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions. Conclusions Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence chromatin structure and gene

  15. nuMap: a web platform for accurate prediction of nucleosome positioning.

    PubMed

    Alharbi, Bader A; Alshammari, Thamir H; Felton, Nathan L; Zhurkin, Victor B; Cui, Feng

    2014-10-01

    Nucleosome positioning is critical for gene expression and of major biological interest. The high cost of experimentally mapping nucleosomal arrangement signifies the need for computational approaches to predict nucleosome positions at high resolution. Here, we present a web-based application to fulfill this need by implementing two models, YR and W/S schemes, for the translational and rotational positioning of nucleosomes, respectively. Our methods are based on sequence-dependent anisotropic bending that dictates how DNA is wrapped around a histone octamer. This application allows users to specify a number of options such as schemes and parameters for threading calculation and provides multiple layout formats. The nuMap is implemented in Java/Perl/MySQL and is freely available for public use at http://numap.rit.edu. The user manual, implementation notes, description of the methodology and examples are available at the site. Copyright © 2014 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.

  16. Prediction of microRNAs Associated with Human Diseases Based on Weighted k Most Similar Neighbors

    PubMed Central

    Guo, Maozu; Guo, Yahong; Li, Jinbao; Ding, Jian; Liu, Yong; Dai, Qiguo; Li, Jin; Teng, Zhixia; Huang, Yufei

    2013-01-01

    Background The identification of human disease-related microRNAs (disease miRNAs) is important for further investigating their involvement in the pathogenesis of diseases. More experimentally validated miRNA-disease associations have been accumulated recently. On the basis of these associations, it is essential to predict disease miRNAs for various human diseases. It is useful in providing reliable disease miRNA candidates for subsequent experimental studies. Methodology/Principal Findings It is known that miRNAs with similar functions are often associated with similar diseases and vice versa. Therefore, the functional similarity of two miRNAs has been successfully estimated by measuring the semantic similarity of their associated diseases. To effectively predict disease miRNAs, we calculated the functional similarity by incorporating the information content of disease terms and phenotype similarity between diseases. Furthermore, the members of miRNA family or cluster are assigned higher weight since they are more probably associated with similar diseases. A new prediction method, HDMP, based on weighted k most similar neighbors is presented for predicting disease miRNAs. Experiments validated that HDMP achieved significantly higher prediction performance than existing methods. In addition, the case studies examining prostatic neoplasms, breast neoplasms, and lung neoplasms, showed that HDMP can uncover potential disease miRNA candidates. Conclusions The superior performance of HDMP can be attributed to the accurate measurement of miRNA functional similarity, the weight assignment based on miRNA family or cluster, and the effective prediction based on weighted k most similar neighbors. The online prediction and analysis tool is freely available at http://nclab.hit.edu.cn/hdmpred. PMID:23950912
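
    The weighted k-most-similar-neighbor scoring at the heart of such a method can be sketched generically: a candidate miRNA's score for a disease is the similarity-weighted vote of its k most functionally similar miRNAs, with family or cluster members up-weighted. The data and weighting scheme below are illustrative assumptions, not the authors' code.

      import numpy as np

      def knn_disease_score(cand_idx, disease_labels, sim, weights, k=20):
          # disease_labels: 1 if a miRNA is already known to be associated with the disease.
          # sim: precomputed miRNA functional-similarity matrix.
          # weights: e.g., >1 for members of a miRNA family or cluster.
          neighbors = np.argsort(sim[cand_idx])[::-1]
          neighbors = neighbors[neighbors != cand_idx][:k]
          w = sim[cand_idx, neighbors] * weights[neighbors]
          return float(np.sum(w * disease_labels[neighbors]) / np.sum(w))

      # Toy data for 100 miRNAs:
      rng = np.random.default_rng(0)
      sim = (lambda m: (m + m.T) / 2)(rng.random((100, 100)))
      np.fill_diagonal(sim, 1.0)
      labels = rng.integers(0, 2, 100)
      weights = np.where(rng.random(100) < 0.1, 1.5, 1.0)   # family/cluster boost
      print(knn_disease_score(0, labels, sim, weights))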

  17. Predicting intensity ranks of peptide fragment ions.

    PubMed

    Frank, Ari M

    2009-05-01

    Accurate modeling of peptide fragmentation is necessary for the development of robust scoring functions for peptide-spectrum matches, which are the cornerstone of MS/MS-based identification algorithms. Unfortunately, peptide fragmentation is a complex process that can involve several competing chemical pathways, which makes it difficult to develop generative probabilistic models that describe it accurately. However, the vast amounts of MS/MS data being generated now make it possible to use data-driven machine learning methods to develop discriminative ranking-based models that predict the intensity ranks of a peptide's fragment ions. We use simple sequence-based features that get combined by a boosting algorithm into models that make peak rank predictions with high accuracy. In an accompanying manuscript, we demonstrate how these prediction models are used to significantly improve the performance of peptide identification algorithms. The models can also be useful in the design of optimal multiple reaction monitoring (MRM) transitions, in cases where there is insufficient experimental data to guide the peak selection process. The prediction algorithm can also be run independently through PepNovo+, which is available for download from http://bix.ucsd.edu/Software/PepNovo.html.

  18. Predicting Intensity Ranks of Peptide Fragment Ions

    PubMed Central

    Frank, Ari M.

    2009-01-01

    Accurate modeling of peptide fragmentation is necessary for the development of robust scoring functions for peptide-spectrum matches, which are the cornerstone of MS/MS-based identification algorithms. Unfortunately, peptide fragmentation is a complex process that can involve several competing chemical pathways, which makes it difficult to develop generative probabilistic models that describe it accurately. However, the vast amounts of MS/MS data being generated now make it possible to use data-driven machine learning methods to develop discriminative ranking-based models that predict the intensity ranks of a peptide's fragment ions. We use simple sequence-based features that get combined by a boosting algorithm into models that make peak rank predictions with high accuracy. In an accompanying manuscript, we demonstrate how these prediction models are used to significantly improve the performance of peptide identification algorithms. The models can also be useful in the design of optimal MRM transitions, in cases where there is insufficient experimental data to guide the peak selection process. The prediction algorithm can also be run independently through PepNovo+, which is available for download from http://bix.ucsd.edu/Software/PepNovo.html. PMID:19256476

  19. Predicting the stability of nanodevices

    NASA Astrophysics Data System (ADS)

    Lin, Z. Z.; Yu, W. F.; Wang, Y.; Ning, X. J.

    2011-05-01

    A simple model based on the statistics of single atoms is developed to predict the stability or lifetime of nanodevices without empirical parameters. Under certain conditions, the model produces the Arrhenius law and the Meyer-Neldel compensation rule. Compared with classical molecular-dynamics simulations for predicting the stability of a monatomic carbon chain at high temperature, the model proves to be much more accurate than the transition state theory. Based on the ab initio calculation of the static potential, the model gives a corrected lifetime of monatomic carbon and gold chains at higher temperatures, and predicts that the monatomic chains are very stable at room temperature.
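
    For reference, the Arrhenius-type lifetime that the model reproduces under certain conditions can be evaluated as below; the energy barrier and attempt frequency are hypothetical numbers, not values from the paper.

      import numpy as np

      K_B = 8.617333262e-5  # Boltzmann constant, eV/K

      def arrhenius_lifetime(barrier_ev, attempt_freq_hz, temperature_k):
          # tau = (1 / nu0) * exp(Ea / (kB * T))
          return np.exp(barrier_ev / (K_B * temperature_k)) / attempt_freq_hz

      # Hypothetical barrier of 1.0 eV and attempt frequency of 1e13 Hz:
      for T in (300, 600, 900):
          print(T, "K:", f"{arrhenius_lifetime(1.0, 1e13, T):.3e}", "s")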

  20. Accurate and robust brain image alignment using boundary-based registration.

    PubMed

    Greve, Douglas N; Fischl, Bruce

    2009-10-15

    The fine spatial scales of the structures in the human brain represent an enormous challenge to the successful integration of information from different images for both within- and between-subject analysis. While many algorithms to register image pairs from the same subject exist, visual inspection shows their accuracy and robustness to be suspect, particularly when there are strong intensity gradients and/or only part of the brain is imaged. This paper introduces a new algorithm called Boundary-Based Registration, or BBR. The novelty of BBR is that it treats the two images very differently. The reference image must be of sufficient resolution and quality to extract surfaces that separate tissue types. The input image is then aligned to the reference by maximizing the intensity gradient across tissue boundaries. Several lower quality images can be aligned through their alignment with the reference. Visual inspection and fMRI results show that BBR is more accurate than correlation ratio or normalized mutual information and is considerably more robust to even strong intensity inhomogeneities. BBR also excels at aligning partial-brain images to whole-brain images, a domain in which existing registration algorithms frequently fail. Even in the limit of registering a single slice, we show the BBR results to be robust and accurate.

  1. Predicting plant biomass accumulation from image-derived parameters

    PubMed Central

    Chen, Dijun; Shi, Rongli; Pape, Jean-Michel; Neumann, Kerstin; Graner, Andreas; Chen, Ming; Klukas, Christian

    2018-01-01

    Abstract Background Image-based high-throughput phenotyping technologies have been rapidly developed in plant science recently, and they provide a great potential to gain more valuable information than traditionally destructive methods. Predicting plant biomass is regarded as a key purpose for plant breeders and ecologists. However, it is a great challenge to find a predictive biomass model across experiments. Results In the present study, we constructed 4 predictive models to examine the quantitative relationship between image-based features and plant biomass accumulation. Our methodology has been applied to 3 consecutive barley (Hordeum vulgare) experiments with control and stress treatments. The results proved that plant biomass can be accurately predicted from image-based parameters using a random forest model. The high prediction accuracy based on this model will contribute to relieving the phenotyping bottleneck in biomass measurement in breeding applications. The prediction performance is still relatively high across experiments under similar conditions. The relative contribution of individual features for predicting biomass was further quantified, revealing new insights into the phenotypic determinants of the plant biomass outcome. Furthermore, methods could also be used to determine the most important image-based features related to plant biomass accumulation, which would be promising for subsequent genetic mapping to uncover the genetic basis of biomass. Conclusions We have developed quantitative models to accurately predict plant biomass accumulation from image data. We anticipate that the analysis results will be useful to advance our views of the phenotypic determinants of plant biomass outcome, and the statistical methods can be broadly used for other plant species. PMID:29346559
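
    A minimal random-forest regression of biomass on image-derived features, in the spirit of the model described above; the feature names, the synthetic data, and the toy biomass relation are placeholders for illustration.

      import numpy as np
      import pandas as pd
      from sklearn.ensemble import RandomForestRegressor
      from sklearn.model_selection import cross_val_score

      # Hypothetical image-derived features per plant and measured biomass (g).
      rng = np.random.default_rng(0)
      features = pd.DataFrame({
          "projected_area": rng.uniform(50, 500, 300),
          "plant_height": rng.uniform(5, 60, 300),
          "compactness": rng.uniform(0.2, 0.9, 300),
          "ndvi_mean": rng.uniform(0.3, 0.8, 300),
      })
      biomass = (0.02 * features["projected_area"] + 0.5 * features["plant_height"]
                 + rng.normal(0, 2, 300))

      rf = RandomForestRegressor(n_estimators=500, random_state=0)
      print("cross-validated R^2:", cross_val_score(rf, features, biomass, cv=5).mean().round(2))
      rf.fit(features, biomass)
      print(dict(zip(features.columns, rf.feature_importances_.round(2))))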

  2. Machine-Learning-Based Electronic Triage More Accurately Differentiates Patients With Respect to Clinical Outcomes Compared With the Emergency Severity Index.

    PubMed

    Levin, Scott; Toerper, Matthew; Hamrock, Eric; Hinson, Jeremiah S; Barnes, Sean; Gardner, Heather; Dugas, Andrea; Linton, Bob; Kirsch, Tom; Kelen, Gabor

    2018-05-01

    Standards for emergency department (ED) triage in the United States rely heavily on subjective assessment and are limited in their ability to risk-stratify patients. This study seeks to evaluate an electronic triage system (e-triage) based on machine learning that predicts likelihood of acute outcomes enabling improved patient differentiation. A multisite, retrospective, cross-sectional study of 172,726 ED visits from urban and community EDs was conducted. E-triage is composed of a random forest model applied to triage data (vital signs, chief complaint, and active medical history) that predicts the need for critical care, an emergency procedure, and inpatient hospitalization in parallel and translates risk to triage level designations. Predicted outcomes and secondary outcomes of elevated troponin and lactate levels were evaluated and compared with the Emergency Severity Index (ESI). E-triage predictions had an area under the curve ranging from 0.73 to 0.92 and demonstrated equivalent or improved identification of clinical patient outcomes compared with ESI at both EDs. E-triage provided rationale for risk-based differentiation of the more than 65% of ED visits triaged to ESI level 3. Matching the ESI patient distribution for comparisons, e-triage identified more than 10% (14,326 patients) of ESI level 3 patients requiring up triage who had substantially increased risk of critical care or emergency procedure (1.7% ESI level 3 versus 6.2% up triaged) and hospitalization (18.9% versus 45.4%) across EDs. E-triage more accurately classifies ESI level 3 patients and highlights opportunities to use predictive analytics to support triage decisionmaking. Further prospective validation is needed. Copyright © 2017 American College of Emergency Physicians. Published by Elsevier Inc. All rights reserved.

  3. A nonlinear viscoelastic approach to durability predictions for polymer based composite structures

    NASA Technical Reports Server (NTRS)

    Brinson, Hal F.

    1991-01-01

    Current industry approaches for the durability assessment of metallic structures are briefly reviewed. For polymer based composite structures, it is suggested that new approaches must be adopted to include memory or viscoelastic effects which could lead to delayed failures that might not be predicted using current techniques. A durability or accelerated life assessment plan for fiber reinforced plastics (FRP) developed and documented over the last decade or so is reviewed and discussed. Limitations to the plan are outlined and suggestions to remove the limitations are given. These include the development of a finite element code to replace the previously used lamination theory code and the development of new specimen geometries to evaluate delamination failures. The new DCB model is reviewed and results are presented. Finally, it is pointed out that new procedures are needed to determine interfacial properties and current efforts underway to determine such properties are reviewed. Suggestions for additional efforts to develop a consistent and accurate durability predictive approach for FRP structures are outlined.

  4. A nonlinear viscoelastic approach to durability predictions for polymer based composite structures

    NASA Technical Reports Server (NTRS)

    Brinson, Hal F.; Hiel, C. C.

    1990-01-01

    Current industry approaches for the durability assessment of metallic structures are briefly reviewed. For polymer based composite structures, it is suggested that new approaches must be adopted to include memory or viscoelastic effects which could lead to delayed failures that might not be predicted using current techniques. A durability or accelerated life assessment plan for fiber reinforced plastics (FRP) developed and documented over the last decade or so is reviewed and discussed. Limitations to the plan are outlined and suggestions to remove the limitations are given. These include the development of a finite element code to replace the previously used lamination theory code and the development of new specimen geometries to evaluate delamination failures. The new DCB model is reviewed and results are presented. Finally, it is pointed out that new procedures are needed to determine interfacial properties and current efforts underway to determine such properties are reviewed. Suggestions for additional efforts to develop a consistent and accurate durability predictive approach for FRP structures are outlined.

  5. Accurate fluid force measurement based on control surface integration

    NASA Astrophysics Data System (ADS)

    Lentink, David

    2018-01-01

    Nonintrusive 3D fluid force measurements are still challenging to conduct accurately for freely moving animals, vehicles, and deforming objects. Two techniques, 3D particle image velocimetry (PIV) and a new technique, the aerodynamic force platform (AFP), address this. Both rely on the control volume integral for momentum; whereas PIV requires numerical integration of flow fields, the AFP performs the integration mechanically based on rigid walls that form the control surface. The accuracy of both PIV and AFP measurements based on the control surface integration is thought to hinge on determining the unsteady body force associated with the acceleration of the volume of displaced fluid. Here, I introduce a set of non-dimensional error ratios to show which fluid and body parameters make the error negligible. The unsteady body force is insignificant in all conditions where the average density of the body is much greater than the density of the fluid, e.g., in gas. Whenever a strongly deforming body experiences significant buoyancy and acceleration, the error is significant. Remarkably, this error can be entirely corrected for with an exact factor provided that the body has a sufficiently homogeneous density or acceleration distribution, which is common in liquids. The correction factor for omitting the unsteady body force depends only on the fluid density, ρf, and the body density, ρb. Whereas these straightforward solutions work even at the liquid-gas interface in a significant number of cases, they do not work for generalized bodies undergoing buoyancy in combination with appreciable body density inhomogeneity, volume change (PIV), or volume rate-of-change (PIV and AFP). In these less common cases, the 3D body shape needs to be measured and resolved in time and space to estimate the unsteady body force. The analysis shows that accounting for the unsteady body force is straightforward to non

  6. Adaptive learning compressive tracking based on Markov location prediction

    NASA Astrophysics Data System (ADS)

    Zhou, Xingyu; Fu, Dongmei; Yang, Tao; Shi, Yanan

    2017-03-01

    Object tracking is an interdisciplinary research topic spanning image processing, pattern recognition, and computer vision, with theoretical and practical value in video surveillance, virtual reality, and automatic navigation. Compressive tracking (CT) offers advantages in efficiency and accuracy. However, under object occlusion, abrupt motion and blur, similar objects, or scale changes, CT suffers from tracking drift. We propose Markov location prediction to obtain an initial estimate of the object position. CT is then used to locate the object accurately, and an adaptive classifier-parameter updating strategy is derived from the confidence map. At the same time, scale features are extracted at the predicted location, which handles object scale variations effectively. Experimental results show that the proposed algorithm achieves better tracking accuracy and robustness than current state-of-the-art algorithms while running in real time.

  7. Artificial intelligence models for predicting iron deficiency anemia and iron serum level based on accessible laboratory data.

    PubMed

    Azarkhish, Iman; Raoufy, Mohammad Reza; Gharibzadeh, Shahriar

    2012-06-01

    Iron deficiency anemia (IDA) is the most common nutritional deficiency worldwide. Measuring serum iron is time consuming, expensive and not available in most hospitals. In this study, based on four accessible laboratory parameters (MCV, MCH, MCHC, Hb/RBC), we developed an artificial neural network (ANN) and an adaptive neuro-fuzzy inference system (ANFIS) to diagnose IDA and to predict serum iron level. Our results indicate that the neural network analysis is superior to ANFIS and logistic regression models in diagnosing IDA. Moreover, the results show that the ANN is likely to provide an accurate test for predicting serum iron levels with high accuracy and acceptable precision.
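    As a hedged illustration of the general idea (not the authors' network architecture), the sketch below trains a small feed-forward ANN to regress serum iron from the four accessible indices named above; the CSV file, column names, and layer sizes are hypothetical placeholders:

        # Hedged sketch: regress serum iron from four routine hematology indices with a
        # small feed-forward ANN. File name, column names, and network size are
        # hypothetical placeholders, not the study's actual data or architecture.
        import pandas as pd
        from sklearn.model_selection import train_test_split
        from sklearn.neural_network import MLPRegressor
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        df = pd.read_csv("cbc_with_serum_iron.csv")           # hypothetical dataset
        X = df[["MCV", "MCH", "MCHC", "Hb_over_RBC"]].values  # the four accessible indices
        y = df["serum_iron"].values                           # measured serum iron (target)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

        model = make_pipeline(StandardScaler(),
                              MLPRegressor(hidden_layer_sizes=(16, 8),
                                           max_iter=2000, random_state=0))
        model.fit(X_tr, y_tr)
        print("R^2 on held-out data:", model.score(X_te, y_te))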

  8. Selecting Optimal Random Forest Predictive Models: A Case Study on Predicting the Spatial Distribution of Seabed Hardness

    PubMed Central

    Li, Jin; Tran, Maggie; Siwabessy, Justy

    2016-01-01

    Spatially continuous predictions of seabed hardness are important baseline environmental information for sustainable management of Australia’s marine jurisdiction. Seabed hardness is often inferred from multibeam backscatter data with unknown accuracy and can be inferred from underwater video footage at limited locations. In this study, we classified the seabed into four classes based on two new seabed hardness classification schemes (i.e., hard90 and hard70). We developed optimal predictive models to predict seabed hardness using random forest (RF) based on the point data of hardness classes and spatially continuous multibeam data. Five feature selection (FS) methods that are variable importance (VI), averaged variable importance (AVI), knowledge informed AVI (KIAVI), Boruta and regularized RF (RRF) were tested based on predictive accuracy. Effects of highly correlated, important and unimportant predictors on the accuracy of RF predictive models were examined. Finally, spatial predictions generated using the most accurate models were visually examined and analysed. This study confirmed that: 1) hard90 and hard70 are effective seabed hardness classification schemes; 2) seabed hardness of four classes can be predicted with a high degree of accuracy; 3) the typical approach used to pre-select predictive variables by excluding highly correlated variables needs to be re-examined; 4) the identification of the important and unimportant predictors provides useful guidelines for further improving predictive models; 5) FS methods select the most accurate predictive model(s) instead of the most parsimonious ones, and AVI and Boruta are recommended for future studies; and 6) RF is an effective modelling method with high predictive accuracy for multi-level categorical data and can be applied to ‘small p and large n’ problems in environmental sciences. Additionally, automated computational programs for AVI need to be developed to increase its computational efficiency and
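    A hedged sketch of the general recipe (fit a random forest, rank predictors by importance, refit on the top-ranked subset) is shown below; it is a crude stand-in for the AVI/KIAVI/Boruta procedures studied in the paper, and the file and column names are hypothetical:

        # Hedged sketch: random-forest classification of seabed-hardness classes with a
        # simple importance-based feature selection step. File and column names are
        # hypothetical placeholders for the multibeam backscatter predictors.
        import numpy as np
        import pandas as pd
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        df = pd.read_csv("multibeam_predictors.csv")      # hypothetical grid samples
        X = df.drop(columns=["hardness_class"]).values
        y = df["hardness_class"].values                   # four hardness classes

        rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

        # Rank predictors by (single-run) importance and keep the top 10.
        order = np.argsort(rf.feature_importances_)[::-1]
        top_k = order[:10]
        rf_sel = RandomForestClassifier(n_estimators=500, random_state=0)
        acc = cross_val_score(rf_sel, X[:, top_k], y, cv=5).mean()
        print("CV accuracy with top-10 predictors:", round(acc, 3))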

  9. Selecting Optimal Random Forest Predictive Models: A Case Study on Predicting the Spatial Distribution of Seabed Hardness.

    PubMed

    Li, Jin; Tran, Maggie; Siwabessy, Justy

    2016-01-01

    Spatially continuous predictions of seabed hardness are important baseline environmental information for sustainable management of Australia's marine jurisdiction. Seabed hardness is often inferred from multibeam backscatter data with unknown accuracy and can be inferred from underwater video footage at limited locations. In this study, we classified the seabed into four classes based on two new seabed hardness classification schemes (i.e., hard90 and hard70). We developed optimal predictive models to predict seabed hardness using random forest (RF) based on the point data of hardness classes and spatially continuous multibeam data. Five feature selection (FS) methods that are variable importance (VI), averaged variable importance (AVI), knowledge informed AVI (KIAVI), Boruta and regularized RF (RRF) were tested based on predictive accuracy. Effects of highly correlated, important and unimportant predictors on the accuracy of RF predictive models were examined. Finally, spatial predictions generated using the most accurate models were visually examined and analysed. This study confirmed that: 1) hard90 and hard70 are effective seabed hardness classification schemes; 2) seabed hardness of four classes can be predicted with a high degree of accuracy; 3) the typical approach used to pre-select predictive variables by excluding highly correlated variables needs to be re-examined; 4) the identification of the important and unimportant predictors provides useful guidelines for further improving predictive models; 5) FS methods select the most accurate predictive model(s) instead of the most parsimonious ones, and AVI and Boruta are recommended for future studies; and 6) RF is an effective modelling method with high predictive accuracy for multi-level categorical data and can be applied to 'small p and large n' problems in environmental sciences. Additionally, automated computational programs for AVI need to be developed to increase its computational efficiency and

  10. Performance of protein-structure predictions with the physics-based UNRES force field in CASP11.

    PubMed

    Krupa, Paweł; Mozolewska, Magdalena A; Wiśniewska, Marta; Yin, Yanping; He, Yi; Sieradzan, Adam K; Ganzynkowicz, Robert; Lipska, Agnieszka G; Karczyńska, Agnieszka; Ślusarz, Magdalena; Ślusarz, Rafał; Giełdoń, Artur; Czaplewski, Cezary; Jagieła, Dawid; Zaborowski, Bartłomiej; Scheraga, Harold A; Liwo, Adam

    2016-11-01

    Participating as the Cornell-Gdansk group, we have used our physics-based coarse-grained UNited RESidue (UNRES) force field to predict protein structure in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11). Our methodology involved extensive multiplexed replica exchange simulations of the target proteins with a recently improved UNRES force field to provide better reproductions of the local structures of polypeptide chains. All simulations were started from fully extended polypeptide chains, and no external information was included in the simulation process except for weak restraints on secondary structure to enable us to finish each prediction within the allowed 3-week time window. Because of simplified UNRES representation of polypeptide chains, use of enhanced sampling methods, code optimization and parallelization and sufficient computational resources, we were able to treat, for the first time, all 55 human prediction targets with sizes from 44 to 595 amino acid residues, the average size being 251 residues. Complete structures of six single-domain proteins were predicted accurately, with the highest accuracy being attained for the T0769, for which the CαRMSD was 3.8 Å for 97 residues of the experimental structure. Correct structures were also predicted for 13 domains of multi-domain proteins with accuracy comparable to that of the best template-based modeling methods. With further improvements of the UNRES force field that are now underway, our physics-based coarse-grained approach to protein-structure prediction will eventually reach global prediction capacity and, consequently, reliability in simulating protein structure and dynamics that are important in biochemical processes. Freely available on the web at http://www.unres.pl/ CONTACT: has5@cornell.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  11. Risk and the physics of clinical prediction.

    PubMed

    McEvoy, John W; Diamond, George A; Detrano, Robert C; Kaul, Sanjay; Blaha, Michael J; Blumenthal, Roger S; Jones, Steven R

    2014-04-15

    The current paradigm of primary prevention in cardiology uses traditional risk factors to estimate future cardiovascular risk. These risk estimates are based on prediction models derived from prospective cohort studies and are incorporated into guideline-based initiation algorithms for commonly used preventive pharmacologic treatments, such as aspirin and statins. However, risk estimates are more accurate for populations of similar patients than they are for any individual patient. It may be hazardous to presume that the point estimate of risk derived from a population model represents the most accurate estimate for a given patient. In this review, we exploit principles derived from physics as a metaphor for the distinction between predictions regarding populations versus patients. We identify the following: (1) predictions of risk are accurate at the level of populations but do not translate directly to patients, (2) perfect accuracy of individual risk estimation is unobtainable even with the addition of multiple novel risk factors, and (3) direct measurement of subclinical disease (screening) affords far greater certainty regarding the personalized treatment of patients, whereas risk estimates often remain uncertain for patients. In conclusion, shifting our focus from prediction of events to detection of disease could improve personalized decision-making and outcomes. We also discuss innovative future strategies for risk estimation and treatment allocation in preventive cardiology. Copyright © 2014 Elsevier Inc. All rights reserved.

  12. A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine.

    PubMed

    Jain, Dharm Skandh; Gupte, Sanket Rajan; Aduri, Raviprasad

    2018-06-22

    RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the predictions, and our in-house experiments show that they fail to accurately predict RPI involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. Amino acids and nucleotides are classified based on the high-resolution structural data of RNA protein complexes. The minimum structural unit consisting of five residues is used as the descriptor. Comparative analysis of existing methods shows the consistently higher performance of our method irrespective of the length of RNA present in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA as well as TERRA RNA. The method is also shown to successfully predict RNA and protein hubs present in RPI networks of four different organisms. The robustness of this method will provide a way for predicting RPI networks of yet unknown interactions for both long noncoding RNA and microRNA.

  13. An evidential link prediction method and link predictability based on Shannon entropy

    NASA Astrophysics Data System (ADS)

    Yin, Likang; Zheng, Haoyang; Bian, Tian; Deng, Yong

    2017-09-01

    Predicting missing links is of both theoretical value and practical interest in network science. In this paper, we empirically investigate a new link prediction method based on similarity and compare nine well-known local similarity measures on nine real networks. Most previous studies focus on accuracy; however, it is crucial to consider link predictability as an intrinsic property of the network itself. Hence, this paper proposes a new link prediction approach called the evidential measure (EM), based on Dempster-Shafer theory. Moreover, this paper proposes a new method to measure link predictability via local information and Shannon entropy.
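    To make the local-similarity side of this concrete, the hedged sketch below scores candidate links with the resource-allocation index (one common local measure) and summarizes the degree distribution with Shannon entropy as a rough predictability proxy; it is a generic baseline, not the evidential measure (EM) proposed in the paper:

        # Hedged sketch: resource-allocation (RA) similarity for link prediction plus a
        # Shannon-entropy summary of the degree distribution. Generic baseline only.
        import math
        import networkx as nx

        G = nx.karate_club_graph()                     # stand-in network

        def ra_score(G, u, v):
            """Resource-allocation index: sum of 1/degree over common neighbours."""
            return sum(1.0 / G.degree(w) for w in nx.common_neighbors(G, u, v))

        candidates = [(u, v) for u in G for v in G
                      if u < v and not G.has_edge(u, v)]
        best = max(candidates, key=lambda e: ra_score(G, *e))
        print("most likely missing link:", best)

        # Shannon entropy of the normalised degree sequence (lower entropy often
        # indicates a more structured, more predictable network).
        deg = [d for _, d in G.degree()]
        p = [d / sum(deg) for d in deg]
        H = -sum(pi * math.log2(pi) for pi in p if pi > 0)
        print("degree-distribution entropy (bits):", round(H, 3))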

  14. Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions1

    PubMed Central

    Zuñiga, Cristal; Li, Chien-Ting; Zielinski, Daniel C.; Guarnieri, Michael T.; Antoniewicz, Maciek R.; Zengler, Karsten

    2016-01-01

    The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Furthermore, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine. PMID:27372244
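    As an illustration of how such a genome-scale reconstruction is typically interrogated, the hedged sketch below runs a flux-balance analysis with the COBRApy toolbox; the SBML file name and the exchange-reaction identifiers are hypothetical placeholders rather than the published iCZ843 identifiers:

        # Hedged sketch: flux-balance analysis of a genome-scale model such as iCZ843
        # using COBRApy. The SBML file name and exchange-reaction IDs below are
        # hypothetical placeholders, not the published model's actual identifiers.
        import cobra

        model = cobra.io.read_sbml_model("iCZ843.xml")                # hypothetical file name

        # Mimic a heterotrophic condition: close the photon exchange, allow glucose uptake.
        model.reactions.get_by_id("EX_photon_e").lower_bound = 0.0    # hypothetical ID
        model.reactions.get_by_id("EX_glc__D_e").lower_bound = -10.0  # hypothetical ID

        solution = model.optimize()                                   # maximize biomass objective
        print("predicted growth rate (1/h):", solution.objective_value)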

  15. Predictions of heading date in bread wheat (Triticum aestivum L.) using QTL-based parameters of an ecophysiological model

    PubMed Central

    Bogard, Matthieu; Ravel, Catherine; Paux, Etienne; Bordes, Jacques; Balfourier, François; Chapman, Scott C.; Le Gouis, Jacques; Allard, Vincent

    2014-01-01

    Prediction of wheat phenology facilitates the selection of cultivars with specific adaptations to a particular environment. However, while QTL analysis for heading date can identify major genes controlling phenology, the results are limited to the environments and genotypes tested. Moreover, while ecophysiological models allow accurate predictions in new environments, they may require substantial phenotypic data to parameterize each genotype. Also, the model parameters are rarely related to all underlying genes, and all the possible allelic combinations that could be obtained by breeding cannot be tested with models. In this study, a QTL-based model is proposed to predict heading date in bread wheat (Triticum aestivum L.). Two parameters of an ecophysiological model (V sat and P base, representing genotype vernalization requirements and photoperiod sensitivity, respectively) were optimized for 210 genotypes grown in 10 contrasting location × sowing date combinations. Multiple linear regression models predicting V sat and P base with 11 and 12 associated genetic markers accounted for 71 and 68% of the variance of these parameters, respectively. QTL-based V sat and P base estimates were able to predict heading date of an independent validation data set (88 genotypes in six location × sowing date combinations) with a root mean square error of prediction of 5 to 8.6 days, explaining 48 to 63% of the variation for heading date. The QTL-based model proposed in this study may be used for agronomic purposes and to assist breeders in suggesting locally adapted ideotypes for wheat phenology. PMID:25148833

  16. Accurate pan-specific prediction of peptide-MHC class II binding affinity with improved binding core identification.

    PubMed

    Andreatta, Massimo; Karosiene, Edita; Rasmussen, Michael; Stryhn, Anette; Buus, Søren; Nielsen, Morten

    2015-11-01

    A key event in the generation of a cellular response against malicious organisms through the endocytic pathway is binding of peptidic antigens by major histocompatibility complex class II (MHC class II) molecules. The bound peptide is then presented on the cell surface where it can be recognized by T helper lymphocytes. NetMHCIIpan is a state-of-the-art method for the quantitative prediction of peptide binding to any human or mouse MHC class II molecule of known sequence. In this paper, we describe an updated version of the method with improved peptide binding register identification. Binding register prediction is concerned with determining the minimal core region of nine residues directly in contact with the MHC binding cleft, a crucial piece of information both for the identification and design of CD4(+) T cell antigens. When applied to a set of 51 crystal structures of peptide-MHC complexes with known binding registers, the new method NetMHCIIpan-3.1 significantly outperformed the earlier 3.0 version. We illustrate the impact of accurate binding core identification for the interpretation of T cell cross-reactivity using tetramer double staining with a CMV epitope and its variants mapped to the epitope binding core. NetMHCIIpan is publicly available at http://www.cbs.dtu.dk/services/NetMHCIIpan-3.1 .

  17. Proton dissociation properties of arylphosphonates: Determination of accurate Hammett equation parameters.

    PubMed

    Dargó, Gergő; Bölcskei, Adrienn; Grün, Alajos; Béni, Szabolcs; Szántó, Zoltán; Lopata, Antal; Keglevich, György; Balogh, György T

    2017-09-05

    Determination of the proton dissociation constants of several arylphosphonic acid derivatives was carried out to investigate the accuracy of the Hammett equations available for this family of compounds. For the measurement of the pKa values, modern, accurate methods such as differential potentiometric titration and NMR-pH titration were used. We found our results significantly different from the pKa values reported before (pKa1: MAE = 0.16; pKa2: MAE = 0.59). Based on our recently measured pKa values, refined Hammett equations were determined that might be used for predicting highly accurate ionization constants of newly synthesized compounds (pKa1 = 1.70 − 0.894σ, pKa2 = 6.92 − 0.934σ). Copyright © 2017 Elsevier B.V. All rights reserved.
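    The refined Hammett equations quoted above translate directly into a one-line predictor; a minimal sketch is given below, where the example σ value is only an illustrative input and not data from the paper:

        # Hedged sketch: predict arylphosphonic acid pKa values from a Hammett
        # substituent constant using the refined equations quoted above
        # (pKa1 = 1.70 - 0.894*sigma, pKa2 = 6.92 - 0.934*sigma). The sigma value
        # below is an illustrative input only.
        def predict_pka(sigma):
            pka1 = 1.70 - 0.894 * sigma
            pka2 = 6.92 - 0.934 * sigma
            return pka1, pka2

        sigma_example = 0.23   # e.g. a moderately electron-withdrawing para substituent
        print(predict_pka(sigma_example))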

  18. MO-AB-BRA-10: Cancer Therapy Outcome Prediction Based On Dempster-Shafer Theory and PET Imaging

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lian, C; University of Rouen, QuantIF - EA 4108 LITIS, 76000 Rouen; Li, H

    2015-06-15

    Purpose: In cancer therapy, utilizing FDG-18 PET image-based features for accurate outcome prediction is challenging because of 1) limited discriminative information within a small number of PET image sets, and 2) fluctuant feature characteristics caused by the inferior spatial resolution and system noise of PET imaging. In this study, we proposed a new Dempster-Shafer theory (DST) based approach, evidential low-dimensional transformation with feature selection (ELT-FS), to accurately predict cancer therapy outcome with both PET imaging features and clinical characteristics. Methods: First, a specific loss function with sparse penalty was developed to learn an adaptive low-rank distance metric for representing the dissimilarity between different patients’ feature vectors. By minimizing this loss function, a linear low-dimensional transformation of input features was achieved. Also, imprecise features were excluded simultaneously by applying an l2,1-norm regularization of the learnt dissimilarity metric in the loss function. Finally, the learnt dissimilarity metric was applied in an evidential K-nearest-neighbor (EK-NN) classifier to predict treatment outcome. Results: Twenty-five patients with stage II–III non-small-cell lung cancer and thirty-six patients with esophageal squamous cell carcinomas treated with chemo-radiotherapy were collected. For the two groups of patients, 52 and 29 features, respectively, were utilized. The leave-one-out cross-validation (LOOCV) protocol was used for evaluation. Compared to three existing linear transformation methods (PCA, LDA, NCA), the proposed ELT-FS leads to higher prediction accuracy for the training and testing sets both for lung-cancer patients (100 ± 0.0, 88.0 ± 33.17) and for esophageal-cancer patients (97.46 ± 1.64, 83.33 ± 37.8). The ELT-FS also provides superior class separation in both test data sets. Conclusion: A novel DST-based approach has been proposed to predict cancer treatment outcome

  19. Machine learning and linear regression models to predict catchment-level base cation weathering rates across the southern Appalachian Mountain region, USA

    Treesearch

    Nicholas A. Povak; Paul F. Hessburg; Todd C. McDonnell; Keith M. Reynolds; Timothy J. Sullivan; R. Brion Salter; Bernard J. Crosby

    2014-01-01

    Accurate estimates of soil mineral weathering are required for regional critical load (CL) modeling to identify ecosystems at risk of the deleterious effects from acidification. Within a correlative modeling framework, we used modeled catchment-level base cation weathering (BCw) as the response variable to identify key environmental correlates and predict a continuous...

  20. Proof-test-based life prediction of high-toughness pressure vessels

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Panontin, T.L.; Hill, M.R.

    1996-02-01

    The paper examines the problems associated with applying proof-test-based life prediction to vessels made of high-toughness metals. Two A106 Gr B pipe specimens containing long, through-wall circumferential flaws were tested. One failed during hydrostatic testing and the other during tension-tension cycling following a hydrostatic test. Quantitative fractography was used to verify experimentally obtained fatigue crack growth rates and a variety of LEFM and EPFM techniques were used to analyze the experimental results. The results show that: plastic collapse analysis provides accurate predictions of screened (initial) crack size when the flow stress is determined experimentally; LEFM analysis underestimates the crack size screened by the proof test and overpredicts the subsequent fatigue life of the vessel when retardation effects are small (i.e., low proof levels); and, at a high proof-test level (2.4 × operating pressure), the large retardation effect on fatigue crack growth due to the overload overwhelmed the deleterious effect on fatigue life from stable tearing during the proof test and alleviated the problem of screening only long cracks due to the high toughness of the metal.

  1. Knowledge-based fragment binding prediction.

    PubMed

    Tang, Grace W; Altman, Russ B

    2014-04-01

    Target-based drug discovery must assess many drug-like compounds for potential activity. Focusing on low-molecular-weight compounds (fragments) can dramatically reduce the chemical search space. However, approaches for determining protein-fragment interactions have limitations. Experimental assays are time-consuming, expensive, and not always applicable. At the same time, computational approaches using physics-based methods have limited accuracy. With increasing high-resolution structural data for protein-ligand complexes, there is now an opportunity for data-driven approaches to fragment binding prediction. We present FragFEATURE, a machine learning approach to predict small molecule fragments preferred by a target protein structure. We first create a knowledge base of protein structural environments annotated with the small molecule substructures they bind. These substructures have low-molecular weight and serve as a proxy for fragments. FragFEATURE then compares the structural environments within a target protein to those in the knowledge base to retrieve statistically preferred fragments. It merges information across diverse ligands with shared substructures to generate predictions. Our results demonstrate FragFEATURE's ability to rediscover fragments corresponding to the ligand bound with 74% precision and 82% recall on average. For many protein targets, it identifies high scoring fragments that are substructures of known inhibitors. FragFEATURE thus predicts fragments that can serve as inputs to fragment-based drug design or serve as refinement criteria for creating target-specific compound libraries for experimental or computational screening.

  2. Knowledge-based Fragment Binding Prediction

    PubMed Central

    Tang, Grace W.; Altman, Russ B.

    2014-01-01

    Target-based drug discovery must assess many drug-like compounds for potential activity. Focusing on low-molecular-weight compounds (fragments) can dramatically reduce the chemical search space. However, approaches for determining protein-fragment interactions have limitations. Experimental assays are time-consuming, expensive, and not always applicable. At the same time, computational approaches using physics-based methods have limited accuracy. With increasing high-resolution structural data for protein-ligand complexes, there is now an opportunity for data-driven approaches to fragment binding prediction. We present FragFEATURE, a machine learning approach to predict small molecule fragments preferred by a target protein structure. We first create a knowledge base of protein structural environments annotated with the small molecule substructures they bind. These substructures have low-molecular weight and serve as a proxy for fragments. FragFEATURE then compares the structural environments within a target protein to those in the knowledge base to retrieve statistically preferred fragments. It merges information across diverse ligands with shared substructures to generate predictions. Our results demonstrate FragFEATURE's ability to rediscover fragments corresponding to the ligand bound with 74% precision and 82% recall on average. For many protein targets, it identifies high scoring fragments that are substructures of known inhibitors. FragFEATURE thus predicts fragments that can serve as inputs to fragment-based drug design or serve as refinement criteria for creating target-specific compound libraries for experimental or computational screening. PMID:24762971

  3. Determining Cutoff Point of Ensemble Trees Based on Sample Size in Predicting Clinical Dose with DNA Microarray Data.

    PubMed

    Yılmaz Isıkhan, Selen; Karabulut, Erdem; Alpar, Celal Reha

    2016-01-01

    Background/Aim. Evaluating the success of dose prediction based on genetic or clinical data has substantially advanced recently. The aim of this study is to predict various clinical dose values from DNA gene expression datasets using data mining techniques. Materials and Methods. Eleven real gene expression datasets containing dose values were included. First, important genes for dose prediction were selected using iterative sure independence screening. Then, the performances of regression trees (RTs), support vector regression (SVR), RT bagging, SVR bagging, and RT boosting were examined. Results. The results demonstrated that a regression-based feature selection method substantially reduced the number of irrelevant genes from raw datasets. Overall, the best prediction performance in nine of 11 datasets was achieved using SVR; the second most accurate performance was provided by a gradient-boosting machine (GBM). Conclusion. Analysis of various dose values based on microarray gene expression data identified genes common to our study and the referenced studies. According to our findings, SVR and GBM can be good predictors of dose-gene datasets. Another result of the study was the identification of a sample size of n = 25 as the cutoff point for RT bagging to outperform a single RT.
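    A hedged sketch of this kind of pipeline is shown below: a single-pass correlation screening step (a simplification of iterative sure independence screening) followed by support vector regression; the data are synthetic placeholders for microarray expression values and dose labels:

        # Hedged sketch: correlation-based screening of high-dimensional expression data
        # followed by SVR, loosely mirroring the screening + regression workflow above.
        # The data are synthetic placeholders, not the study's microarray datasets.
        import numpy as np
        from sklearn.svm import SVR
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(60, 5000))            # 60 arrays x 5000 probes (synthetic)
        y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=60)

        # Screen: rank probes by absolute Pearson correlation with the dose value.
        corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
        keep = np.argsort(corr)[::-1][:50]         # retain the 50 strongest probes

        score = cross_val_score(SVR(kernel="rbf", C=10.0), X[:, keep], y,
                                cv=5, scoring="r2").mean()
        print("5-fold CV R^2 after screening:", round(score, 3))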

  4. BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone.

    PubMed

    Yang, Bite; Liu, Feng; Ren, Chao; Ouyang, Zhangyi; Xie, Ziwei; Bo, Xiaochen; Shu, Wenjie

    2017-07-01

    Enhancer elements are noncoding stretches of DNA that play key roles in controlling gene expression programmes. Despite major efforts to develop accurate enhancer prediction methods, identifying enhancer sequences continues to be a challenge in the annotation of mammalian genomes. One of the major issues is the lack of large, sufficiently comprehensive and experimentally validated enhancers for humans or other species. Thus, the development of computational methods based on limited experimentally validated enhancers and deciphering the transcriptional regulatory code encoded in the enhancer sequences is urgent. We present a deep-learning-based hybrid architecture, BiRen, which predicts enhancers using the DNA sequence alone. Our results demonstrate that BiRen can learn common enhancer patterns directly from the DNA sequence and exhibits superior accuracy, robustness and generalizability in enhancer prediction relative to other state-of-the-art enhancer predictors based on sequence characteristics. Our BiRen will enable researchers to acquire a deeper understanding of the regulatory code of enhancer sequences. Our BiRen method can be freely accessed at https://github.com/wenjiegroup/BiRen . shuwj@bmi.ac.cn or boxc@bmi.ac.cn. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  5. Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zuniga, Cristal; Li, Chien -Ting; Huelsman, Tyler

    The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Moreover, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine.

  6. Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions

    DOE PAGES

    Zuniga, Cristal; Li, Chien -Ting; Huelsman, Tyler; ...

    2016-07-02

    The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Moreover, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine.

  7. Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions.

    PubMed

    Zuñiga, Cristal; Li, Chien-Ting; Huelsman, Tyler; Levering, Jennifer; Zielinski, Daniel C; McConnell, Brian O; Long, Christopher P; Knoshaug, Eric P; Guarnieri, Michael T; Antoniewicz, Maciek R; Betenbaugh, Michael J; Zengler, Karsten

    2016-09-01

    The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Furthermore, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine. © 2016 American Society of Plant Biologists. All rights reserved.

  8. Matrix factorization-based data fusion for the prediction of lncRNA-disease associations.

    PubMed

    Fu, Guangyuan; Wang, Jun; Domeniconi, Carlotta; Yu, Guoxian

    2018-05-01

    Long non-coding RNAs (lncRNAs) play crucial roles in complex disease diagnosis, prognosis, prevention and treatment, but only a small portion of lncRNA-disease associations have been experimentally verified. Various computational models have been proposed to identify lncRNA-disease associations by integrating heterogeneous data sources. However, existing models generally ignore the intrinsic structure of data sources or treat them as equally relevant, while they may not be. To accurately identify lncRNA-disease associations, we propose a Matrix Factorization based LncRNA-Disease Association prediction model (MFLDA in short). MFLDA decomposes data matrices of heterogeneous data sources into low-rank matrices via matrix tri-factorization to explore and exploit their intrinsic and shared structure. MFLDA can select and integrate the data sources by assigning different weights to them. An iterative solution is further introduced to simultaneously optimize the weights and low-rank matrices. Next, MFLDA uses the optimized low-rank matrices to reconstruct the lncRNA-disease association matrix and thus to identify potential associations. In 5-fold cross validation experiments to identify verified lncRNA-disease associations, MFLDA achieves an area under the receiver operating characteristic curve (AUC) of 0.7408, at least 3% higher than those given by state-of-the-art data fusion based computational models. An empirical study on identifying masked lncRNA-disease associations again shows that MFLDA can identify potential associations more accurately than competing models. A case study on identifying lncRNAs associated with breast, lung and stomach cancers shows that 38 out of 45 (84%) associations predicted by MFLDA are supported by recent biomedical literature and further proves the capability of MFLDA in identifying novel lncRNA-disease associations. MFLDA is a general data fusion framework, and as such it can be adopted to predict associations between other biological
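    As a hedged, minimal stand-in for the matrix-factorization idea (not MFLDA's weighted tri-factorization with data-source selection), the sketch below scores unobserved lncRNA-disease pairs from a low-rank truncated-SVD reconstruction of a synthetic 0/1 association matrix:

        # Hedged sketch: score unobserved lncRNA-disease pairs via a low-rank (truncated
        # SVD) reconstruction of a known 0/1 association matrix. The matrix here is
        # synthetic; this is not the MFLDA method itself.
        import numpy as np

        rng = np.random.default_rng(0)
        A = (rng.random((200, 80)) < 0.03).astype(float)   # synthetic lncRNA x disease matrix

        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        k = 10                                             # retained rank (illustrative)
        A_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]      # low-rank reconstruction

        # Candidate novel associations: highest reconstructed scores among zero entries.
        scores = np.where(A == 0, A_hat, -np.inf)
        top = np.unravel_index(np.argsort(scores, axis=None)[::-1][:5], scores.shape)
        print("top candidate (lncRNA, disease) index pairs:", list(zip(*top)))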

  9. Accurate thermoelastic tensor and acoustic velocities of NaCl

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marcondes, Michel L., E-mail: michel@if.usp.br; Chemical Engineering and Material Science, University of Minnesota, Minneapolis, 55455; Shukla, Gaurav, E-mail: shukla@physics.umn.edu

    Despite the importance of thermoelastic properties of minerals in geology and geophysics, their measurement at high pressures and temperatures is still challenging. Thus, ab initio calculations are an essential tool for predicting these properties at extreme conditions. Owing to the approximate description of the exchange-correlation energy, approximations used in calculations of vibrational effects, and numerical/methodological approximations, these methods produce systematic deviations. Hybrid schemes combining experimental data and theoretical results have emerged as a way to reconcile available information and offer more reliable predictions at experimentally inaccessible thermodynamic conditions. Here we introduce a method to improve the calculated thermoelastic tensor by using a highly accurate thermal equation of state (EoS). The corrective scheme is general, applicable to crystalline solids with any symmetry, and can produce accurate results at conditions where experimental data may not exist. We apply it to rock-salt-type NaCl, a material whose structural properties have been challenging to describe accurately by standard ab initio methods and whose acoustic/seismic properties are important for the gas and oil industry.

  10. Accurate predictions of population-level changes in sequence and structural properties of HIV-1 Env using a volatility-controlled diffusion model

    PubMed Central

    DeLeon, Orlando; Hodis, Hagit; O’Malley, Yunxia; Johnson, Jacklyn; Salimi, Hamid; Zhai, Yinjie; Winter, Elizabeth; Remec, Claire; Eichelberger, Noah; Van Cleave, Brandon; Puliadi, Ramya; Harrington, Robert D.; Stapleton, Jack T.; Haim, Hillel

    2017-01-01

    The envelope glycoproteins (Envs) of HIV-1 continuously evolve in the host by random mutations and recombination events. The resulting diversity of Env variants circulating in the population and their continuing diversification process limit the efficacy of AIDS vaccines. We examined the historic changes in Env sequence and structural features (measured by integrity of epitopes on the Env trimer) in a geographically defined population in the United States. As expected, many Env features were relatively conserved during the 1980s. From this state, some features diversified whereas others remained conserved across the years. We sought to identify “clues” to predict the observed historic diversification patterns. Comparison of viruses that cocirculate in patients at any given time revealed that each feature of Env (sequence or structural) exists at a defined level of variance. The in-host variance of each feature is highly conserved among individuals but can vary between different HIV-1 clades. We designate this property “volatility” and apply it to model evolution of features as a linear diffusion process that progresses with increasing genetic distance. Volatilities of different features are highly correlated with their divergence in longitudinally monitored patients. Volatilities of features also correlate highly with their population-level diversification. Using volatility indices measured from a small number of patient samples, we accurately predict the population diversity that developed for each feature over the course of 30 years. Amino acid variants that evolved at key antigenic sites are also predicted well. Therefore, small “fluctuations” in feature values measured in isolated patient samples accurately describe their potential for population-level diversification. These tools will likely contribute to the design of population-targeted AIDS vaccines by effectively capturing the diversity of currently circulating strains and addressing properties

  11. Tehran Air Pollutants Prediction Based on Random Forest Feature Selection Method

    NASA Astrophysics Data System (ADS)

    Shamsoddini, A.; Aboodi, M. R.; Karami, J.

    2017-09-01

    Air pollution, as one of the most serious forms of environmental pollution, poses a huge threat to human life. Air pollution leads to environmental instability and has harmful and undesirable effects on the environment. Modern methods for predicting pollutant concentrations are able to improve decision making and provide appropriate solutions. This study examines the performance of Random Forest feature selection in combination with multiple-linear regression and Multilayer Perceptron Artificial Neural Network methods, in order to achieve an efficient model for estimating carbon monoxide, nitrogen dioxide, sulfur dioxide, and PM2.5 concentrations in the air. The results indicated that the Artificial Neural Networks fed with the attributes selected by the Random Forest feature selection method performed more accurately than the other models for all pollutants. The estimation accuracy for sulfur dioxide was lower than for the other air contaminants, whereas nitrogen dioxide was predicted more accurately than the other pollutants.

  12. Theory of mind selectively predicts preschoolers’ knowledge-based selective word learning

    PubMed Central

    Brosseau-Liard, Patricia; Penney, Danielle; Poulin-Dubois, Diane

    2015-01-01

    Children can selectively attend to various attributes of a model, such as past accuracy or physical strength, to guide their social learning. There is a debate regarding whether a relation exists between theory-of-mind skills and selective learning. We hypothesized that high performance on theory-of-mind tasks would predict preference for learning new words from accurate informants (an epistemic attribute), but not from physically strong informants (a non-epistemic attribute). Three- and 4-year-olds (N = 65) completed two selective learning tasks, and their theory of mind abilities were assessed. As expected, performance on a theory-of-mind battery predicted children’s preference to learn from more accurate informants but not from physically stronger informants. Results thus suggest that preschoolers with more advanced theory of mind have a better understanding of knowledge and apply that understanding to guide their selection of informants. This work has important implications for research on children’s developing social cognition and early learning. PMID:26211504

  13. Theory of mind selectively predicts preschoolers' knowledge-based selective word learning.

    PubMed

    Brosseau-Liard, Patricia; Penney, Danielle; Poulin-Dubois, Diane

    2015-11-01

    Children can selectively attend to various attributes of a model, such as past accuracy or physical strength, to guide their social learning. There is a debate regarding whether a relation exists between theory-of-mind skills and selective learning. We hypothesized that high performance on theory-of-mind tasks would predict preference for learning new words from accurate informants (an epistemic attribute), but not from physically strong informants (a non-epistemic attribute). Three- and 4-year-olds (N = 65) completed two selective learning tasks, and their theory-of-mind abilities were assessed. As expected, performance on a theory-of-mind battery predicted children's preference to learn from more accurate informants but not from physically stronger informants. Results thus suggest that preschoolers with more advanced theory of mind have a better understanding of knowledge and apply that understanding to guide their selection of informants. This work has important implications for research on children's developing social cognition and early learning. © 2015 The British Psychological Society.

  14. Identification and Construction of Combinatory Cancer Hallmark-Based Gene Signature Sets to Predict Recurrence and Chemotherapy Benefit in Stage II Colorectal Cancer.

    PubMed

    Gao, Shanwu; Tibiche, Chabane; Zou, Jinfeng; Zaman, Naif; Trifiro, Mark; O'Connor-McCourt, Maureen; Wang, Edwin

    2016-01-01

    Decisions regarding adjuvant therapy in patients with stage II colorectal cancer (CRC) have been among the most challenging and controversial in oncology over the past 20 years. To develop robust combinatory cancer hallmark-based gene signature sets (CSS sets) that more accurately predict prognosis and identify a subset of patients with stage II CRC who could gain survival benefits from adjuvant chemotherapy. Thirteen retrospective studies of patients with stage II CRC who had clinical follow-up and adjuvant chemotherapy were analyzed. Respective totals of 162 and 843 patients from 2 and 11 independent cohorts were used as the discovery and validation cohorts, respectively. A total of 1005 patients with stage II CRC were included in the 13 cohorts. Among them, 84 of 416 patients in 3 independent cohorts received fluorouracil-based adjuvant chemotherapy. Identification of CSS sets to predict relapse-free survival and identify a subset of patients with stage II CRC who could gain substantial survival benefits from fluorouracil-based adjuvant chemotherapy. Eight cancer hallmark-based gene signatures (30 genes each) were identified and used to construct CSS sets for determining prognosis. The CSS sets were validated in 11 independent cohorts of 767 patients with stage II CRC who did not receive adjuvant chemotherapy. The CSS sets accurately stratified patients into low-, intermediate-, and high-risk groups. Five-year relapse-free survival rates were 94%, 78%, and 45%, respectively, representing 60%, 28%, and 12% of patients with stage II disease. The 416 patients with CSS set-defined high-risk stage II CRC who received fluorouracil-based adjuvant chemotherapy showed a substantial gain in survival benefits from the treatment (ie, recurrence reduced by 30%-40% in 5 years). The CSS sets substantially outperformed other prognostic predictors of stage 2 CRC. They are more accurate and robust for prognostic predictions and facilitate the identification of patients with stage

  15. Assessment of a remote sensing-based model for predicting malaria transmission risk in villages of Chiapas, Mexico

    NASA Technical Reports Server (NTRS)

    Beck, L. R.; Rodriguez, M. H.; Dister, S. W.; Rodriguez, A. D.; Washino, R. K.; Roberts, D. R.; Spanner, M. A.

    1997-01-01

    A blind test of two remote sensing-based models for predicting adult populations of Anopheles albimanus in villages, an indicator of malaria transmission risk, was conducted in southern Chiapas, Mexico. One model was developed using a discriminant analysis approach, while the other was based on regression analysis. The models were developed in 1992 for an area around Tapachula, Chiapas, using Landsat Thematic Mapper (TM) satellite data and geographic information system functions. Using two remotely sensed landscape elements, the discriminant model was able to successfully distinguish between villages with high and low An. albimanus abundance with an overall accuracy of 90%. To test the predictive capability of the models, multitemporal TM data were used to generate a landscape map of the Huixtla area, northwest of Tapachula, where the models were used to predict risk for 40 villages. The resulting predictions were not disclosed until the end of the test. Independently, An. albimanus abundance data were collected in the 40 randomly selected villages for which the predictions had been made. These data were subsequently used to assess the models' accuracies. The discriminant model accurately predicted 79% of the high-abundance villages and 50% of the low-abundance villages, for an overall accuracy of 70%. The regression model correctly identified seven of the 10 villages with the highest mosquito abundance. This test demonstrated that remote sensing-based models generated for one area can be used successfully in another, comparable area.

  16. NNLOPS accurate associated HW production

    NASA Astrophysics Data System (ADS)

    Astill, William; Bizon, Wojciech; Re, Emanuele; Zanderighi, Giulia

    2016-06-01

    We present a next-to-next-to-leading order accurate description of associated HW production consistently matched to a parton shower. The method is based on reweighting events obtained with the HW plus one jet NLO accurate calculation implemented in POWHEG, extended with the MiNLO procedure, to reproduce NNLO accurate Born distributions. Since the Born kinematics is more complex than the cases treated before, we use a parametrization of the Collins-Soper angles to reduce the number of variables required for the reweighting. We present phenomenological results at 13 TeV, with cuts suggested by the Higgs Cross section Working Group.

  17. Regression-based reduced-order models to predict transient thermal output for enhanced geothermal systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mudunuru, Maruti Kumar; Karra, Satish; Harp, Dylan Robert

    Reduced-order modeling is a promising approach, as many phenomena can be described by a few parameters/mechanisms. An advantage and attractive aspect of a reduced-order model is that it is computationally inexpensive to evaluate when compared to running a high-fidelity numerical simulation. A reduced-order model takes a couple of seconds to run on a laptop, while a high-fidelity simulation may take a couple of hours to run on a high-performance computing cluster. The goal of this paper is to assess the utility of regression-based reduced-order models (ROMs) developed from high-fidelity numerical simulations for predicting transient thermal power output for an enhanced geothermal reservoir while explicitly accounting for uncertainties in the subsurface system and site-specific details. Numerical simulations are performed based on equally spaced values in the specified range of model parameters. Key sensitive parameters are then identified from these simulations, which are fracture zone permeability, well/skin factor, bottom hole pressure, and injection flow rate. We found the fracture zone permeability to be the most sensitive parameter. The fracture zone permeability, along with time, is used to build regression-based ROMs for the thermal power output. The ROMs are trained and validated using detailed physics-based numerical simulations. Finally, predictions from the ROMs are then compared with field data. We propose three different ROMs with different levels of model parsimony, each describing key and essential features of the power production curves. The coefficients in the proposed regression-based ROMs are developed by minimizing a non-linear least-squares misfit function using the Levenberg–Marquardt algorithm. The misfit function is based on the difference between numerical simulation data and the reduced-order model. ROM-1 is constructed based on polynomials up to fourth order. ROM-1 is able to accurately reproduce the power output of numerical simulations
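    The ROM-1 construction described above lends itself to a compact sketch: fit a fourth-order polynomial in a few scaled inputs to simulation outputs by Levenberg-Marquardt least squares. In the hedged example below the inputs (scaled time and log-permeability) and the training data are synthetic placeholders for the high-fidelity simulation results:

        # Hedged sketch: fourth-order polynomial ROM for thermal power output fitted by
        # Levenberg-Marquardt least squares (scipy). Training data are synthetic
        # placeholders for high-fidelity simulation outputs.
        import numpy as np
        from scipy.optimize import curve_fit

        def rom(X, *c):
            """Fourth-order polynomial in scaled time t and scaled log-permeability k."""
            t, k = X
            terms = [t**i * k**j for i in range(5) for j in range(5) if i + j <= 4]
            return sum(ci * term for ci, term in zip(c, terms))

        rng = np.random.default_rng(1)
        t = rng.uniform(0, 1, 400)                 # scaled production time
        k = rng.uniform(-1, 1, 400)                # scaled log10 permeability
        power = 20 - 5 * t + 3 * k - 2 * t * k + rng.normal(scale=0.2, size=400)  # synthetic

        n_terms = len([1 for i in range(5) for j in range(5) if i + j <= 4])
        coef, _ = curve_fit(rom, (t, k), power, p0=np.zeros(n_terms), method="lm")
        print("fitted ROM coefficients:", np.round(coef, 3))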

  18. Regression-based reduced-order models to predict transient thermal output for enhanced geothermal systems

    DOE PAGES

    Mudunuru, Maruti Kumar; Karra, Satish; Harp, Dylan Robert; ...

    2017-07-10

    Reduced-order modeling is a promising approach, as many phenomena can be described by a few parameters/mechanisms. An advantage and attractive aspect of a reduced-order model is that it is computationally inexpensive to evaluate when compared to running a high-fidelity numerical simulation. A reduced-order model takes a couple of seconds to run on a laptop, while a high-fidelity simulation may take a couple of hours to run on a high-performance computing cluster. The goal of this paper is to assess the utility of regression-based reduced-order models (ROMs) developed from high-fidelity numerical simulations for predicting transient thermal power output for an enhanced geothermal reservoir while explicitly accounting for uncertainties in the subsurface system and site-specific details. Numerical simulations are performed based on equally spaced values in the specified range of model parameters. Key sensitive parameters are then identified from these simulations, which are fracture zone permeability, well/skin factor, bottom hole pressure, and injection flow rate. We found the fracture zone permeability to be the most sensitive parameter. The fracture zone permeability, along with time, is used to build regression-based ROMs for the thermal power output. The ROMs are trained and validated using detailed physics-based numerical simulations. Finally, predictions from the ROMs are then compared with field data. We propose three different ROMs with different levels of model parsimony, each describing key and essential features of the power production curves. The coefficients in the proposed regression-based ROMs are developed by minimizing a non-linear least-squares misfit function using the Levenberg–Marquardt algorithm. The misfit function is based on the difference between numerical simulation data and the reduced-order model. ROM-1 is constructed based on polynomials up to fourth order. ROM-1 is able to accurately reproduce the power output of numerical simulations

  19. Accurate Mobile Urban Mapping via Digital Map-Based SLAM †

    PubMed Central

    Roh, Hyunchul; Jeong, Jinyong; Cho, Younggun; Kim, Ayoung

    2016-01-01

    This paper presents accurate urban map generation using digital map-based Simultaneous Localization and Mapping (SLAM). Throughout this work, our main objective is generating a 3D and lane map aiming for sub-meter accuracy. In conventional mapping approaches, extremely high accuracy was achieved by either (i) exploiting costly airborne sensors or (ii) surveying with a static mapping system on a stationary platform. Mobile scanning systems have recently gained popularity but are mostly limited by the availability of the Global Positioning System (GPS). We focus on the fact that the availability of GPS and urban structures are both sporadic but complementary. By modeling both GPS and digital map data as measurements and integrating them with other sensor measurements, we leverage SLAM for an accurate mobile mapping system. Our proposed algorithm generates an efficient graph SLAM and achieves a framework running in real-time and targeting sub-meter accuracy with a mobile platform. Integrated with the SLAM framework, we implement a motion-adaptive model for the Inverse Perspective Mapping (IPM). Using motion estimation derived from SLAM, the experimental results show that the proposed approaches provide stable bird’s-eye view images, even with significant motion during the drive. Our real-time map generation framework is validated via a long-distance urban test and evaluated at randomly sampled points using Real-Time Kinematic (RTK)-GPS. PMID:27548175

  20. A risk factor-based predictive model of outcomes in carotid endarterectomy: the National Surgical Quality Improvement Program 2005-2010.

    PubMed

    Bekelis, Kimon; Bakhoum, Samuel F; Desai, Atman; Mackenzie, Todd A; Goodney, Philip; Labropoulos, Nicos

    2013-04-01

    Accurate knowledge of individualized risks and benefits is crucial to the surgical management of patients undergoing carotid endarterectomy (CEA). Although large randomized trials have determined specific cutoffs for the degree of stenosis, precise delineation of patient-level risks remains a topic of debate, especially in real world practice. We attempted to create a risk factor-based predictive model of outcomes in CEA. We performed a retrospective cohort study involving patients who underwent CEAs from 2005 to 2010 and were registered in the American College of Surgeons National Quality Improvement Project database. Of the 35 698 patients, 20 015 were asymptomatic (56.1%) and 15 683 were symptomatic (43.9%). These patients demonstrated a 1.64% risk of stroke, 0.69% risk of myocardial infarction, and 0.75% risk of death within 30 days after CEA. Multivariate analysis demonstrated that increasing age, male sex, history of chronic obstructive pulmonary disease, myocardial infarction, angina, congestive heart failure, peripheral vascular disease, previous stroke or transient ischemic attack, and dialysis were independent risk factors associated with an increased risk of the combined outcome of postoperative stroke, myocardial infarction, or death. A validated model for outcome prediction based on individual patient characteristics was developed. There was a steep effect of age on the risk of myocardial infarction and death. This national study confirms that the risks of CEA vary dramatically based on patient-level characteristics. Because of limited discrimination, it cannot be used for individual patient risk assessment. However, it can be used as a baseline for improvement and development of more accurate predictive models based on other databases or prospective studies.

  1. Spectral Neugebauer-based color halftone prediction model accounting for paper fluorescence.

    PubMed

    Hersch, Roger David

    2014-08-20

    We present a spectral model for predicting the fluorescent emission and the total reflectance of color halftones printed on optically brightened paper. By relying on extended Neugebauer models, the proposed model accounts for the attenuation by the ink halftones of both the incident exciting light in the UV wavelength range and the emerging fluorescent emission in the visible wavelength range. The total reflectance is predicted by adding the predicted fluorescent emission relative to the incident light and the pure reflectance predicted with an ink-spreading enhanced Yule-Nielsen modified Neugebauer reflectance prediction model. The predicted fluorescent emission spectrum as a function of the amounts of cyan, magenta, and yellow inks is very accurate. It can be useful to paper and ink manufacturers who would like to study in detail the contribution of the fluorescent brighteners and the attenuation of the fluorescent emission by ink halftones.
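
    For concreteness, the Yule-Nielsen modified spectral Neugebauer relation used for the pure-reflectance part can be written as R(λ) = (Σ_i a_i R_i(λ)^(1/n))^n, where a_i are the fractional area coverages of the Neugebauer primaries. The Python sketch below evaluates only this reflectance part with made-up spectra; the fluorescent emission term described in the abstract would be predicted separately and added on top.

        import numpy as np

        def ynsn_reflectance(area_coverages, patch_reflectances, n=2.0):
            """Yule-Nielsen modified spectral Neugebauer prediction (illustrative).

            area_coverages    : (K,) fractional areas a_i of the Neugebauer primaries
            patch_reflectances: (K, L) measured reflectance spectra R_i(lambda) of the primaries
            n                 : Yule-Nielsen value accounting for optical dot gain
            """
            a = np.asarray(area_coverages)[:, None]          # (K, 1)
            R = np.asarray(patch_reflectances)               # (K, L)
            return (a * R ** (1.0 / n)).sum(axis=0) ** n     # (L,) predicted spectrum

        # Example: two primaries (paper and solid ink) over four wavelengths, 50% dot area.
        paper = np.array([0.90, 0.88, 0.85, 0.80])
        ink   = np.array([0.10, 0.12, 0.30, 0.60])
        print(ynsn_reflectance([0.5, 0.5], np.vstack([paper, ink]), n=2.0))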

  2. Accurate mask-based spatially regularized correlation filter for visual tracking

    NASA Astrophysics Data System (ADS)

    Gu, Xiaodong; Xu, Xinping

    2017-01-01

    Recently, discriminative correlation filter (DCF)-based trackers have achieved extremely successful results in many competitions and benchmarks. These methods exploit a periodic assumption on the training samples to efficiently learn a classifier. However, this assumption produces unwanted boundary effects, which severely degrade tracking performance. Correlation filters with limited boundaries and spatially regularized DCFs were proposed to reduce boundary effects; however, they use a fixed mask or a predesigned weight function, respectively, which is unsuitable for large appearance variations. We propose an accurate mask-based spatially regularized correlation filter for visual tracking. Our augmented objective reduces the boundary effect even under large appearance variation. In our algorithm, the masking matrix is converted into a regularization function that acts on the correlation filter in the frequency domain, which gives the algorithm fast convergence. Our online tracking algorithm performs favorably against state-of-the-art trackers on the OTB-2015 benchmark in terms of efficiency, accuracy, and robustness.

  3. Superpixel-based graph cuts for accurate stereo matching

    NASA Astrophysics Data System (ADS)

    Feng, Liting; Qin, Kaihuai

    2017-06-01

    Estimating the surface normal vector and disparity of a pixel simultaneously, also known as the three-dimensional label method, has been widely used in recent continuous stereo matching problems to achieve sub-pixel accuracy. However, due to the infinite label space, it is extremely hard to assign each pixel an appropriate label. In this paper, we present an accurate and efficient algorithm, integrating PatchMatch with graph cuts, to approach this critical computational problem. In addition, to obtain a robust and precise matching cost, we use a convolutional neural network to learn a similarity measure on small image patches. Compared with other MRF-related methods, our method has several advantages: its submodular property ensures sub-problem optimality and is easy to perform in parallel; graph cuts can simultaneously update multiple pixels, avoiding local minima caused by sequential optimizers like belief propagation; it uses segmentation results for better local expansion moves; and local propagation and randomization can easily generate the initial solution without using external methods. Middlebury experiments show that our method achieves higher accuracy than other MRF-based algorithms.

  4. Fast and Accurate Prediction of Numerical Relativity Waveforms from Binary Black Hole Coalescences Using Surrogate Models

    NASA Astrophysics Data System (ADS)

    Blackman, Jonathan; Field, Scott E.; Galley, Chad R.; Szilágyi, Béla; Scheel, Mark A.; Tiglio, Manuel; Hemberger, Daniel A.

    2015-09-01

    Simulating a binary black hole coalescence by solving Einstein's equations is computationally expensive, requiring days to months of supercomputing time. Using reduced order modeling techniques, we construct an accurate surrogate model, which is evaluated in a millisecond to a second, for numerical relativity (NR) waveforms from nonspinning binary black hole coalescences with mass ratios in [1, 10] and durations corresponding to about 15 orbits before merger. We assess the model's uncertainty and show that our modeling strategy predicts NR waveforms not used for the surrogate's training with errors nearly as small as the numerical error of the NR code. Our model includes all spherical-harmonic _{-2}Y_{ℓm} waveform modes resolved by the NR code up to ℓ=8. We compare our surrogate model to effective one body waveforms from 50 M⊙ to 300 M⊙ for advanced LIGO detectors and find that the surrogate is always more faithful (by at least an order of magnitude in most cases).

  5. Fast and Accurate Prediction of Numerical Relativity Waveforms from Binary Black Hole Coalescences Using Surrogate Models.

    PubMed

    Blackman, Jonathan; Field, Scott E; Galley, Chad R; Szilágyi, Béla; Scheel, Mark A; Tiglio, Manuel; Hemberger, Daniel A

    2015-09-18

    Simulating a binary black hole coalescence by solving Einstein's equations is computationally expensive, requiring days to months of supercomputing time. Using reduced order modeling techniques, we construct an accurate surrogate model, which is evaluated in a millisecond to a second, for numerical relativity (NR) waveforms from nonspinning binary black hole coalescences with mass ratios in [1, 10] and durations corresponding to about 15 orbits before merger. We assess the model's uncertainty and show that our modeling strategy predicts NR waveforms not used for the surrogate's training with errors nearly as small as the numerical error of the NR code. Our model includes all spherical-harmonic _{-2}Y_{ℓm} waveform modes resolved by the NR code up to ℓ=8. We compare our surrogate model to effective one body waveforms from 50M_{⊙} to 300M_{⊙} for advanced LIGO detectors and find that the surrogate is always more faithful (by at least an order of magnitude in most cases).

  6. MEMS based shock pulse detection sensor for improved rotary Stirling cooler end of life prediction

    NASA Astrophysics Data System (ADS)

    Hübner, M.; Münzberg, M.

    2018-05-01

    The widespread use of rotary Stirling coolers in high-performance thermal imagers used for critical 24/7 surveillance tasks justifies any effort to significantly enhance the reliability and predictable uptime of those coolers. Typically, the lifetime of the whole imaging device is limited by continuous wear and eventual failure of the rotary compressor of the Stirling cooler, especially failure of the comprised bearings. MTTF-based lifetime predictions, even those based on refined MTTF models taking operational-scenario-dependent scaling factors into account, still lack the precision to accurately forecast the end of life (EOL) of individual coolers. Consequently, preventive maintenance of individual coolers to avoid failures of the main sensor in critical operational scenarios is very costly or even useless. We have developed an integrated test method based on `Micro Electromechanical Systems', so-called MEMS sensors, which significantly improves the cooler EOL prediction. Recently available commercial MEMS acceleration sensors have mechanical resonance frequencies up to 50 kHz. They are able to detect solid-borne shock pulses in the cooler structure, originating from, e.g., metal-on-metal impacts driven by periodic forces acting on moving inner parts of the rotary compressor within wear-dependent slack and play. The impact-driven transient shock pulse analysis uses only the high-frequency signal <10 kHz and therefore differs from the commonly used broadband low-frequency vibrational analysis of reciprocating machines. It offers a direct indicator of the individual state of wear. The predictive cooler lifetime model based on the shock pulse analysis is presented and results are discussed.
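
    To illustrate the kind of transient shock-pulse extraction described above (as opposed to broadband vibration analysis), the sketch below high-pass filters a synthetic acceleration trace and counts impact peaks; the cutoff frequency, threshold, and signal are placeholders and not the authors' processing chain.

        import numpy as np
        from scipy.signal import butter, sosfiltfilt, find_peaks

        def shock_pulse_indicator(accel, fs, f_cut=5_000.0, thresh_g=2.0):
            """Count high-frequency shock pulses in an acceleration trace (illustrative).

            accel : acceleration samples in g
            fs    : sampling rate in Hz (must satisfy fs > 2*f_cut)
            """
            # Separate the transient shock content from low-frequency machine vibration.
            sos = butter(4, f_cut, btype="highpass", fs=fs, output="sos")
            hf = sosfiltfilt(sos, accel)
            # Envelope of the high-frequency band; peaks mark impact events.
            envelope = np.abs(hf)
            peaks, _ = find_peaks(envelope, height=thresh_g, distance=int(0.001 * fs))
            return len(peaks), envelope[peaks].mean() if len(peaks) else 0.0

        # Example with a synthetic signal: slow vibration plus a few sharp impacts.
        fs = 50_000
        t = np.arange(0, 1.0, 1 / fs)
        signal = 0.5 * np.sin(2 * np.pi * 50 * t)
        signal[::10_000] += 5.0                      # injected impact spikes
        print(shock_pulse_indicator(signal, fs))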

  7. A multiscale red blood cell model with accurate mechanics, rheology, and dynamics.

    PubMed

    Fedosov, Dmitry A; Caswell, Bruce; Karniadakis, George Em

    2010-05-19

    Red blood cells (RBCs) have highly deformable viscoelastic membranes exhibiting complex rheological response and rich hydrodynamic behavior governed by special elastic and bending properties and by the external/internal fluid and membrane viscosities. We present a multiscale RBC model that is able to predict RBC mechanics, rheology, and dynamics in agreement with experiments. Based on an analytic theory, the modeled membrane properties can be uniquely related to the experimentally established RBC macroscopic properties without any adjustment of parameters. The RBC linear and nonlinear elastic deformations match those obtained in optical-tweezers experiments. The rheological properties of the membrane are compared with those obtained in optical magnetic twisting cytometry, membrane thermal fluctuations, and creep followed by cell recovery. The dynamics of RBCs in shear and Poiseuille flows is tested against experiments and theoretical predictions, and the applicability of the latter is discussed. Our findings clearly indicate that a purely elastic model for the membrane cannot accurately represent the RBC's rheological properties and its dynamics, and therefore accurate modeling of a viscoelastic membrane is necessary. Copyright 2010 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  8. A NEW CLINICAL PREDICTION CRITERION ACCURATELY DETERMINES A SUBSET OF PATIENTS WITH BILATERAL PRIMARY ALDOSTERONISM BEFORE ADRENAL VENOUS SAMPLING.

    PubMed

    Kocjan, Tomaz; Janez, Andrej; Stankovic, Milenko; Vidmar, Gaj; Jensterle, Mojca

    2016-05-01

    Adrenal venous sampling (AVS) is the only available method to distinguish bilateral from unilateral primary aldosteronism (PA). AVS has several drawbacks, so it is reasonable to avoid this procedure when the results would not affect clinical management. Our objective was to identify a clinical criterion that can reliably predict nonlateralized AVS as a surrogate for bilateral PA that is not treated surgically. A retrospective diagnostic cross-sectional study conducted at the Slovenian national endocrine referral center included 69 consecutive patients (mean age 56 ± 8 years, 21 females) with PA who underwent AVS. PA was confirmed with the saline infusion test (SIT). AVS was performed sequentially during continuous adrenocorticotrophic hormone (ACTH) infusion. The main outcome measures were variables associated with nonlateralized AVS used to derive a clinical prediction rule. Sixty-seven (97%) patients had successful AVS and were included in the statistical analysis. A total of 39 (58%) patients had nonlateralized AVS. The combined criterion of serum potassium ≥3.5 mmol/L, post-SIT aldosterone <18 ng/dL, and either no or bilateral tumor found on computed tomography (CT) imaging had perfect estimated specificity (and thus 100% positive predictive value) for bilateral PA, saving an estimated 16% of the patients (11/67) from unnecessary AVS. The best overall classification accuracy (50/67 = 75%) was achieved using the post-SIT aldosterone level <18 ng/dL alone, which yielded 74% sensitivity and 75% specificity for predicting nonlateralized AVS. Our clinical prediction criterion appears to accurately determine a subset of patients with bilateral PA who could avoid unnecessary AVS and immediately commence medical treatment.

  9. In silico toxicity prediction by support vector machine and SMILES representation-based string kernel.

    PubMed

    Cao, D-S; Zhao, J-C; Yang, Y-N; Zhao, C-X; Yan, J; Liu, S; Hu, Q-N; Xu, Q-S; Liang, Y-Z

    2012-01-01

    There is a great need to assess the harmful effects or toxicities of chemicals to which man is exposed. In the present paper, the simplified molecular input line entry specification (SMILES) representation-based string kernel, together with the state-of-the-art support vector machine (SVM) algorithm, was used to classify the toxicity of chemicals from the US Environmental Protection Agency Distributed Structure-Searchable Toxicity (DSSTox) database network. In this method, the molecular structure can be directly encoded by a series of SMILES substrings that represent the presence of chemical elements and different kinds of chemical bonds (double, triple and stereochemistry) in the molecules. Thus, the SMILES string kernel can accurately and directly measure the similarities of molecules through local information hidden in the strings. Two model validation approaches, five-fold cross-validation and an independent validation set, were used for assessing the predictive capability of our developed models. The results obtained indicate that an SVM based on the SMILES string kernel can be regarded as a very promising and alternative modelling approach for potential toxicity prediction of chemicals.
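
    As a rough illustration of classifying molecules with a precomputed string kernel over SMILES, the sketch below uses a toy Jaccard similarity over k-length substrings together with scikit-learn's SVC; the kernel, molecules, and labels are invented placeholders and are much simpler than the paper's SMILES string kernel and DSSTox data.

        import numpy as np
        from sklearn.svm import SVC

        def substring_kernel(a, b, k=3):
            """Toy SMILES similarity: Jaccard overlap of k-length substrings (not the paper's kernel)."""
            grams_a = {a[i:i + k] for i in range(len(a) - k + 1)}
            grams_b = {b[i:i + k] for i in range(len(b) - k + 1)}
            return len(grams_a & grams_b) / max(1, len(grams_a | grams_b))

        def gram_matrix(smiles_x, smiles_y):
            return np.array([[substring_kernel(x, y) for y in smiles_y] for x in smiles_x])

        # Hypothetical training set: SMILES strings with toxic (1) / non-toxic (0) labels.
        train = ["CCO", "CCCl", "c1ccccc1", "CC(=O)O", "CCBr", "c1ccccc1O"]
        labels = [0, 1, 0, 0, 1, 0]

        clf = SVC(kernel="precomputed", C=1.0)
        clf.fit(gram_matrix(train, train), labels)

        test = ["CCCBr"]
        print(clf.predict(gram_matrix(test, train)))   # kernel evaluated against the training set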

  10. Behavior-Based Budget Management Using Predictive Analytics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Troy Hiltbrand

    Historically, the mechanisms to perform forecasting have primarily used two common factors as a basis for future predictions: time and money. While time and money are very important aspects of determining future budgetary spend patterns, organizations represent a complex system of unique individuals with a myriad of associated behaviors, and all of these behaviors have bearing on how budget is utilized. When looking at forecasted budgets, it becomes a guessing game about how budget managers will behave under a given set of conditions. This becomes relatively messy when human nature is introduced, as different managers will react very differently under similar circumstances. While one manager becomes ultra-conservative during periods of financial austerity, another might be unfazed and continue to spend as they have in the past. Both might revert into a state of budgetary protectionism, masking what is truly happening at a budget-holder level, in order to keep as much budget and influence as possible while at the same time sacrificing the greater good of the organization. To more accurately predict future outcomes, the models should consider both time and money and other behavioral patterns that have been observed across the organization. The field of predictive analytics is poised to provide the tools and methodologies needed for organizations to do just this: capture and leverage behaviors of the past to predict the future.

  11. A novel logic-based approach for quantitative toxicology prediction.

    PubMed

    Amini, Ata; Muggleton, Stephen H; Lodhi, Huma; Sternberg, Michael J E

    2007-01-01

    There is a pressing need for accurate in silico methods to predict the toxicity of molecules that are being introduced into the environment or are being developed into new pharmaceuticals. Predictive toxicology is in the realm of structure activity relationships (SAR), and many approaches have been used to derive such SAR. Previous work has shown that inductive logic programming (ILP) is a powerful approach that circumvents several major difficulties, such as molecular superposition, faced by some other SAR methods. The ILP approach reasons with chemical substructures within a relational framework and yields chemically understandable rules. Here, we report a general new approach, support vector inductive logic programming (SVILP), which extends the essentially qualitative ILP-based SAR to quantitative modeling. First, ILP is used to learn rules, the predictions of which are then used within a novel kernel to derive a support-vector generalization model. For a highly heterogeneous dataset of 576 molecules with known fathead minnow fish toxicity, the cross-validated correlation coefficients (R2CV) from a chemical descriptor method (CHEM) and SVILP are 0.52 and 0.66, respectively. The ILP, CHEM, and SVILP approaches correctly predict 55, 58, and 73%, respectively, of toxic molecules. In a set of 165 unseen molecules, the R2 values from the commercial software TOPKAT and SVILP are 0.26 and 0.57, respectively. In all calculations, SVILP showed significant improvements in comparison with the other methods. The SVILP approach has a major advantage in that it uses ILP automatically and consistently to derive rules, mostly novel, describing fragments that are toxicity alerts. The SVILP is a general machine-learning approach and has the potential of tackling many problems relevant to chemoinformatics including in silico drug design.

  12. Prediction of TF target sites based on atomistic models of protein-DNA complexes

    PubMed Central

    Angarica, Vladimir Espinosa; Pérez, Abel González; Vasconcelos, Ana T; Collado-Vides, Julio; Contreras-Moreira, Bruno

    2008-01-01

    Background The specific recognition of genomic cis-regulatory elements by transcription factors (TFs) plays an essential role in the regulation of coordinated gene expression. Studying the mechanisms determining binding specificity in protein-DNA interactions is thus an important goal. Most current approaches for modeling TF specific recognition rely on the knowledge of large sets of cognate target sites and consider only the information contained in their primary sequence. Results Here we describe a structure-based methodology for predicting sequence motifs starting from the coordinates of a TF-DNA complex. Our algorithm combines information regarding the direct and indirect readout of DNA into an atomistic statistical model, which is used to estimate the interaction potential. We first measure the ability of our method to correctly estimate the binding specificities of eight prokaryotic and eukaryotic TFs that belong to different structural superfamilies. Secondly, the method is applied to two homology models, finding that sampling of interface side-chain rotamers remarkably improves the results. Thirdly, the algorithm is compared with a reference structural method based on contact counts, obtaining comparable predictions for the experimental complexes and more accurate sequence motifs for the homology models. Conclusion Our results demonstrate that atomic-detail structural information can be feasibly used to predict TF binding sites. The computational method presented here is universal and might be applied to other systems involving protein-DNA recognition. PMID:18922190

  13. Prediction of Spacecraft Vibration using Acceleration and Force Envelopes

    NASA Technical Reports Server (NTRS)

    Gordon, Scott; Kaufman, Daniel; Kern, Dennis; Scharton, Terry

    2009-01-01

    The base forces in the GLAST X- and Z-axis sine vibration tests were similar to those derived using generic inputs (from the user's guide and handbook), but the base forces in the sine test were generally greater than the flight data. Base-drive analyses using envelopes of flight acceleration data provided more accurate predictions of the base force than generic inputs, and as expected, using envelopes of both the flight acceleration and force provided even more accurate predictions. The GLAST spacecraft interface accelerations and forces measured during the MECO transient were relatively low in the 60 to 150 Hz regime. One may expect the flight forces measured at the base of various spacecraft to be more dependent on the mass, frequencies, etc. of the spacecraft than are the corresponding interface acceleration data, which may depend more on the launch vehicle configuration.

  14. Fast and accurate grid representations for atom-based docking with partner flexibility.

    PubMed

    de Vries, Sjoerd J; Zacharias, Martin

    2017-06-30

    Macromolecular docking methods can broadly be divided into geometric and atom-based methods. Geometric methods use fast algorithms that operate on simplified, grid-like molecular representations, while atom-based methods are more realistic and flexible, but far less efficient. Here, a hybrid approach of grid-based and atom-based docking is presented, combining precalculated grid potentials with neighbor lists for fast and accurate calculation of atom-based intermolecular energies and forces. The grid representation is compatible with simultaneous multibody docking and can tolerate considerable protein flexibility. When implemented in our docking method ATTRACT, grid-based docking was found to be ∼35x faster. With the OPLSX forcefield instead of the ATTRACT coarse-grained forcefield, the average speed improvement was >100x. Grid-based representations may allow atom-based docking methods to explore large conformational spaces with many degrees of freedom, such as multiple macromolecules including flexibility. This increases the domain of biological problems to which docking methods can be applied. © 2017 Wiley Periodicals, Inc.

  15. Accurate modeling of switched reluctance machine based on hybrid trained WNN

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Song, Shoujun, E-mail: sunnyway@nwpu.edu.cn; Ge, Lefei; Ma, Shaojie

    2014-04-15

    According to the strong nonlinear electromagnetic characteristics of the switched reluctance machine (SRM), a novel accurate modeling method is proposed based on a hybrid-trained wavelet neural network (WNN), which combines an improved genetic algorithm (GA) with the gradient descent (GD) method to train the network. In the novel method, the WNN is trained by the GD method starting from the initial weights obtained via the improved GA optimization, so that the global parallel searching capability of the stochastic algorithm and the local convergence speed of the deterministic algorithm are combined to enhance the training accuracy, stability and speed. Based on the measured electromagnetic characteristics of a 3-phase 12/8-pole SRM, the nonlinear simulation model is built by the hybrid-trained WNN in Matlab. The phase current and mechanical characteristics from simulation under different working conditions agree well with those from experiments, which indicates the accuracy of the model for dynamic and static performance evaluation of the SRM and verifies the effectiveness of the proposed modeling method.

  16. Accurate prediction of subcellular location of apoptosis proteins combining Chou’s PseAAC and PsePSSM based on wavelet denoising

    PubMed Central

    Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Wang, Ming-Hui; Zhang, Yan

    2017-01-01

    Apoptosis protein subcellular localization information is very important for understanding the mechanism of programmed cell death and for the development of drugs. Predicting the subcellular localization of an apoptosis protein is still a challenging task, and such predictions can help to understand protein function and the role of metabolic processes. In this paper, we propose a novel method for protein subcellular localization prediction. Firstly, the features of the protein sequence are extracted by combining Chou's pseudo amino acid composition (PseAAC) and the pseudo-position specific scoring matrix (PsePSSM); then the extracted feature information is denoised by two-dimensional (2-D) wavelet denoising. Finally, the optimal feature vectors are input to an SVM classifier to predict the subcellular location of apoptosis proteins. Quite promising predictions are obtained using the jackknife test on three widely used datasets and compared with other state-of-the-art methods. The results indicate that the method proposed in this paper can remarkably improve the prediction accuracy of apoptosis protein subcellular localization, which will be a supplementary tool for future proteomics research. PMID:29296195

  17. Accurate prediction of protein-protein interactions by integrating potential evolutionary information embedded in PSSM profile and discriminative vector machine classifier.

    PubMed

    Li, Zheng-Wei; You, Zhu-Hong; Chen, Xing; Li, Li-Ping; Huang, De-Shuang; Yan, Gui-Ying; Nie, Ru; Huang, Yu-An

    2017-04-04

    Identification of protein-protein interactions (PPIs) is of critical importance for deciphering the underlying mechanisms of almost all cellular biological processes and provides great insight into the study of human disease. Although much effort has been devoted to identifying PPIs from various organisms, existing high-throughput biological techniques are time-consuming, expensive, and produce high false positive and false negative rates. Thus it is highly urgent to develop in silico methods to predict PPIs efficiently and accurately in this post-genomic era. In this article, we report a novel computational model combining our newly developed discriminative vector machine classifier (DVM) and an improved Weber local descriptor (IWLD) for the prediction of PPIs. Two components, differential excitation and orientation, are exploited to build evolutionary features for each protein sequence. The main characteristic of the proposed method lies in introducing an effective feature descriptor, IWLD, which can capture highly discriminative evolutionary information from position-specific scoring matrices (PSSM) of protein data, and employing the powerful and robust DVM classifier. When applying the proposed method to the Yeast and H. pylori data sets, we obtained excellent prediction accuracies as high as 96.52% and 91.80%, respectively, which are significantly better than previous methods. Extensive experiments were then performed for predicting cross-species PPIs, and the predictive results were also quite promising. To further validate the performance of the proposed method, we compared it with the state-of-the-art support vector machine (SVM) classifier on a Human data set. The experimental results indicate that our method is highly effective for PPI prediction and can be taken as a supplementary tool for future proteomics research.

  18. A time series based sequence prediction algorithm to detect activities of daily living in smart home.

    PubMed

    Marufuzzaman, M; Reaz, M B I; Ali, M A M; Rahman, L F

    2015-01-01

    The goal of smart homes is to create an intelligent environment that adapts to the inhabitants' needs and assists people who need special care and safety in their daily life. This can be achieved by collecting ADL (activities of daily living) data and performing further analysis within existing computing elements. In this research, a very recent algorithm named sequence prediction via enhanced episode discovery (SPEED) is modified, and a time component is included to improve accuracy. The modified SPEED, or M-SPEED, is a sequence prediction algorithm that modifies the previous SPEED algorithm by using the time duration of appliances' ON-OFF states to decide the next state. M-SPEED discovered periodic episodes of inhabitant behavior, trained on the learned episodes, and made decisions based on the obtained knowledge. The results showed that M-SPEED achieves 96.8% prediction accuracy, which is better than other time prediction algorithms like PUBS, ALZ with temporal rules, and the previous SPEED. Since human behavior shows natural temporal patterns, duration times can be used to predict future events more accurately. This inhabitant activity prediction system will certainly improve smart homes by ensuring safety and better care for elderly and handicapped people.
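
    The toy predictor below illustrates the general idea of combining episode context with ON/OFF durations to choose the next state; it is a simplified sketch, not the published SPEED/M-SPEED algorithm, and all event data are invented.

        from collections import defaultdict

        class DurationAwarePredictor:
            """Toy next-state predictor over (appliance, state) events with duration weighting."""

            def __init__(self, context_len=2):
                self.context_len = context_len
                self.counts = defaultdict(lambda: defaultdict(float))

            def train(self, events):
                # events: list of (appliance, state, duration_seconds)
                symbols = [(a, s) for a, s, _ in events]
                for i in range(self.context_len, len(events)):
                    context = tuple(symbols[i - self.context_len:i])
                    _, _, duration = events[i]
                    # Longer-lasting states get a proportionally larger vote.
                    self.counts[context][symbols[i]] += duration

            def predict(self, recent_events):
                context = tuple((a, s) for a, s, _ in recent_events[-self.context_len:])
                candidates = self.counts.get(context)
                if not candidates:
                    return None
                return max(candidates, key=candidates.get)

        history = [("lamp", "ON", 3600), ("tv", "ON", 5400), ("tv", "OFF", 10),
                   ("lamp", "OFF", 20), ("lamp", "ON", 3600), ("tv", "ON", 5400),
                   ("tv", "OFF", 10)]
        model = DurationAwarePredictor()
        model.train(history)
        print(model.predict(history[-2:]))   # predicts ('lamp', 'OFF') for this toy history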

  19. Outcome Prediction of Consciousness Disorders in the Acute Stage Based on a Complementary Motor Behavioural Tool.

    PubMed

    Pignat, Jean-Michel; Mauron, Etienne; Jöhr, Jane; Gilart de Keranflec'h, Charlotte; Van De Ville, Dimitri; Preti, Maria Giulia; Meskaldji, Djalel E; Hömberg, Volker; Laureys, Steven; Draganski, Bogdan; Frackowiak, Richard; Diserens, Karin

    2016-01-01

    Attaining an accurate diagnosis in the acute phase for severely brain-damaged patients presenting Disorders of Consciousness (DOC) is crucial for prognostic validity; such a diagnosis determines further medical management, in terms of therapeutic choices and end-of-life decisions. However, DOC evaluation based on validated scales, such as the Revised Coma Recovery Scale (CRS-R), can lead to an underestimation of consciousness and to frequent misdiagnoses, particularly in cases of cognitive motor dissociation due to other aetiologies. The purpose of this study is to determine the clinical signs that lead to a more accurate consciousness assessment allowing more reliable outcome prediction. From the Unit of Acute Neurorehabilitation (University Hospital, Lausanne, Switzerland) between 2011 and 2014, we enrolled 33 DOC patients with a DOC diagnosis according to the CRS-R that had been established within 28 days of brain damage. The first CRS-R assessment established the initial diagnosis of Unresponsive Wakefulness Syndrome (UWS) in 20 patients and a Minimally Conscious State (MCS) in the remaining 13 patients. We clinically evaluated the patients over time using the CRS-R scale and concurrently, from the beginning, with complementary clinical items of a new observational Motor Behaviour Tool (MBT). The primary endpoint was outcome at unit discharge, distinguishing two main classes of patients (DOC patients having emerged from DOC and those remaining in DOC) and 6 subclasses detailing the outcome of UWS and MCS patients, respectively. Based on CRS-R and MBT scores assessed separately and jointly, statistical testing was performed in the acute phase using a non-parametric Mann-Whitney U test; longitudinal CRS-R data were modelled with a Generalized Linear Model. Fifty-five per cent of the UWS patients and 77% of the MCS patients had emerged from DOC. First, statistical prediction of the first CRS-R scores did not permit outcome differentiation between classes; longitudinal

  20. aPPRove: An HMM-Based Method for Accurate Prediction of RNA-Pentatricopeptide Repeat Protein Binding Events

    PubMed Central

    Harrison, Thomas; Ruiz, Jaime; Sloan, Daniel B.; Ben-Hur, Asa; Boucher, Christina

    2016-01-01

    Pentatricopeptide repeat containing proteins (PPRs) bind to RNA transcripts originating from mitochondria and plastids. There are two classes of PPR proteins. The P class contains tandem P-type motif sequences, and the PLS class contains alternating P, L and S type sequences. In this paper, we describe a novel tool that predicts PPR-RNA interaction; specifically, our method, which we call aPPRove, determines where and how a PLS-class PPR protein will bind to RNA when given a PPR and one or more RNA transcripts by using a combinatorial binding code for site specificity proposed by Barkan et al. Our results demonstrate that aPPRove successfully locates how and where a PPR protein belonging to the PLS class can bind to RNA. For each binding event it outputs the binding site, the amino-acid-nucleotide interaction, and its statistical significance. Furthermore, we show that our method can be used to predict binding events for PLS-class proteins using a known edit site and the statistical significance of aligning the PPR protein to that site. In particular, we use our method to make a conjecture regarding an interaction between CLB19 and the second intronic region of ycf3. The aPPRove web server can be found at www.cs.colostate.edu/~approve. PMID:27560805

  1. BiPPred: Combined sequence- and structure-based prediction of peptide binding to the Hsp70 chaperone BiP.

    PubMed

    Schneider, Markus; Rosam, Mathias; Glaser, Manuel; Patronov, Atanas; Shah, Harpreet; Back, Katrin Christiane; Daake, Marina Angelika; Buchner, Johannes; Antes, Iris

    2016-10-01

    Substrate binding to Hsp70 chaperones is involved in many biological processes, and the identification of potential substrates is important for a comprehensive understanding of these events. We present a multi-scale pipeline for an accurate, yet efficient prediction of peptides binding to the Hsp70 chaperone BiP by combining sequence-based prediction with molecular docking and MMPBSA calculations. First, we measured the binding of 15mer peptides from known substrate proteins of BiP by peptide array (PA) experiments and performed an accuracy assessment of the PA data by fluorescence anisotropy studies. Several sequence-based prediction models were fitted using this and other peptide binding data. A structure-based position-specific scoring matrix (SB-PSSM) derived solely from structural modeling data forms the core of all models. The matrix elements are based on a combination of binding energy estimations, molecular dynamics simulations, and analysis of the BiP binding site, which led to new insights into the peptide binding specificities of the chaperone. Using this SB-PSSM, peptide binders could be predicted with high selectivity even without training of the model on experimental data. Additional training further increased the prediction accuracies. Subsequent molecular docking (DynaDock) and MMGBSA/MMPBSA-based binding affinity estimations for predicted binders allowed the identification of the correct binding mode of the peptides as well as the calculation of nearly quantitative binding affinities. The general concept behind the developed multi-scale pipeline can readily be applied to other protein-peptide complexes with linearly bound peptides, for which sufficient experimental binding data for the training of classical sequence-based prediction models is not available. Proteins 2016; 84:1390-1407. © 2016 Wiley Periodicals, Inc.
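
    To make the PSSM-based scoring step concrete, the sketch below slides a position-specific scoring matrix along a protein and reports the best-scoring candidate peptides; the matrix is random, the 7-residue window length is an assumption for illustration, and this is not the BiPPred SB-PSSM itself.

        import numpy as np

        AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
        AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

        def score_peptide(peptide, pssm):
            """Score a peptide against a position-specific scoring matrix.

            pssm: array of shape (len(peptide), 20); higher scores mean more favorable.
            """
            return sum(pssm[pos, AA_INDEX[aa]] for pos, aa in enumerate(peptide))

        def scan_protein(sequence, pssm, top=3):
            """Slide the PSSM window along a protein and return the best-scoring peptides."""
            window = pssm.shape[0]
            hits = [(score_peptide(sequence[i:i + window], pssm), i, sequence[i:i + window])
                    for i in range(len(sequence) - window + 1)]
            return sorted(hits, reverse=True)[:top]

        rng = np.random.default_rng(1)
        toy_pssm = rng.normal(size=(7, 20))          # placeholder 7-mer scoring matrix
        protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
        print(scan_protein(protein, toy_pssm))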

  2. A genomic biomarker signature can predict skin sensitizers using a cell-based in vitro alternative to animal tests

    PubMed Central

    2011-01-01

    Background Allergic contact dermatitis is an inflammatory skin disease that affects a significant proportion of the population. This disease is caused by an adverse immune response towards chemical haptens and leads to a substantial economic burden for society. Current tests of sensitizing chemicals rely on animal experimentation. New legislation on the registration and use of chemicals within the pharmaceutical and cosmetic industries has stimulated significant research efforts to develop alternative, human cell-based assays for the prediction of sensitization. The aim is to replace animal experiments with in vitro tests displaying a higher predictive power. Results We have developed a novel cell-based assay for the prediction of sensitizing chemicals. By analyzing the transcriptome of the human cell line MUTZ-3 after 24 h of stimulation, using 20 different sensitizing chemicals, 20 non-sensitizing chemicals and vehicle controls, we have identified a biomarker signature of 200 genes with potent discriminatory ability. Using a Support Vector Machine for supervised classification, the prediction performance of the assay revealed an area under the ROC curve of 0.98. In addition, categorizing the chemicals according to the LLNA assay, this gene signature could also predict sensitizing potency. The identified markers are involved in biological pathways with immunologically relevant functions, which can shed light on the process of human sensitization. Conclusions A gene signature predicting sensitization, using a human cell line in vitro, has been identified. This simple and robust cell-based assay has the potential to completely replace or drastically reduce the utilization of test systems based on experimental animals. Being based on human biology, the assay is proposed to be more accurate for predicting sensitization in humans than traditional animal-based tests. PMID:21824406

  3. Secondary Structure Predictions for Long RNA Sequences Based on Inversion Excursions and MapReduce.

    PubMed

    Yehdego, Daniel T; Zhang, Boyu; Kodimala, Vikram K R; Johnson, Kyle L; Taufer, Michela; Leung, Ming-Ying

    2013-05-01

    Secondary structures of ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Experimental observations and computing limitations suggest that we can approach the secondary structure prediction problem for long RNA sequences by segmenting them into shorter chunks, predicting the secondary structures of each chunk individually using existing prediction programs, and then assembling the results to give the structure of the original sequence. The selection of cutting points is a crucial component of the segmenting step. Noting that stem-loops and pseudoknots always contain an inversion, i.e., a stretch of nucleotides followed closely by its inverse complementary sequence, we developed two cutting methods for segmenting long RNA sequences based on inversion excursions: the centered and the optimized method. Each step of searching for inversions, chunking, and prediction can be performed in parallel. In this paper we use a MapReduce framework, i.e., Hadoop, to extensively explore meaningful inversion stem lengths and gap sizes for the segmentation and identify correlations between chunking methods and prediction accuracy. We show that for a set of long RNA sequences in the RFAM database, whose secondary structures are known to contain pseudoknots, our approach predicts secondary structures more accurately than methods that do not segment the sequence, when the latter predictions are possible computationally. We also show that, as sequences exceed certain lengths, some programs cannot computationally predict pseudoknots while our chunking methods can. Overall, our predicted structures still retain the accuracy level of the original prediction programs when compared with known experimental secondary structures.
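
    A minimal sketch of the inversion search underlying the chunking step: find a stretch of nucleotides followed closely by its reverse complement, which marks a candidate stem region around which to cut. The stem length and gap limit below are placeholders, not the values explored in the paper.

        COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

        def reverse_complement(seq):
            return "".join(COMPLEMENT[b] for b in reversed(seq))

        def find_inversions(rna, stem_len=6, max_gap=30):
            """Locate stretches followed closely by their reverse complement (illustrative).

            stem_len : length of the inverted repeat (stem) to search for
            max_gap  : maximum number of bases allowed between the repeat and its inverse
            Returns (stem_start, partner_start) index pairs, i.e. candidate cut regions.
            """
            hits = []
            for i in range(len(rna) - stem_len + 1):
                stem = rna[i:i + stem_len]
                target = reverse_complement(stem)
                window = rna[i + stem_len:i + stem_len + max_gap + stem_len]
                j = window.find(target)
                if j != -1:
                    hits.append((i, i + stem_len + j))
            return hits

        rna = "GGGAAACUUUCCCAUAGGGCAUAAAGCCCUA"
        print(find_inversions(rna, stem_len=4, max_gap=10))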

  4. GASP: Gapped Ancestral Sequence Prediction for proteins

    PubMed Central

    Edwards, Richard J; Shields, Denis C

    2004-01-01

    Background The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199

  5. Improved prediction of peptide detectability for targeted proteomics using a rank-based algorithm and organism-specific data.

    PubMed

    Qeli, Ermir; Omasits, Ulrich; Goetze, Sandra; Stekhoven, Daniel J; Frey, Juerg E; Basler, Konrad; Wollscheid, Bernd; Brunner, Erich; Ahrens, Christian H

    2014-08-28

    The in silico prediction of the best-observable "proteotypic" peptides in mass spectrometry-based workflows is a challenging problem. Being able to accurately predict such peptides would enable the informed selection of proteotypic peptides for targeted quantification of previously observed and non-observed proteins for any organism, with a significant impact for clinical proteomics and systems biology studies. Current prediction algorithms rely on physicochemical parameters in combination with positive and negative training sets to identify those peptide properties that most profoundly affect their general detectability. Here we present PeptideRank, an approach that uses a learning-to-rank algorithm for peptide detectability prediction from shotgun proteomics data and that eliminates the need to select a negative dataset for the training step. A large number of different peptide properties are used to train ranking models in order to predict a ranking of the best-observable peptides within a protein. Empirical evaluation with rank accuracy metrics showed that PeptideRank complements existing prediction algorithms. Our results indicate that the best performance is achieved when it is trained on organism-specific shotgun proteomics data, and that PeptideRank is most accurate for short to medium-sized and abundant proteins, without any loss in prediction accuracy for the important class of membrane proteins. Targeted proteomics approaches have been gaining a lot of momentum and hold immense potential for systems biology studies and clinical proteomics. However, since only very few complete proteomes have been reported to date, for a considerable fraction of a proteome there is no experimental proteomics evidence that would allow to guide the selection of the best-suited proteotypic peptides (PTPs), i.e. peptides that are specific to a given proteoform and that are repeatedly observed in a mass spectrometer. We describe a novel, rank-based approach for the prediction

  6. Protein docking prediction using predicted protein-protein interface.

    PubMed

    Li, Bin; Kihara, Daisuke

    2012-01-10

    Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein docking prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within the top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on the case, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction that may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts by performing docking prediction using the provided protein-protein binding interface prediction as constraints, followed by a second round of docking with updated docking interface information to further improve the docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in a series of benchmark experiments, including docking using actual docking interface site predictions as well as unbound docking cases.

  7. Using physics-based pose predictions and free energy perturbation calculations to predict binding poses and relative binding affinities for FXR ligands in the D3R Grand Challenge 2

    NASA Astrophysics Data System (ADS)

    Athanasiou, Christina; Vasilakaki, Sofia; Dellis, Dimitris; Cournia, Zoe

    2018-01-01

    Computer-aided drug design has become an integral part of drug discovery and development in the pharmaceutical and biotechnology industry and is nowadays extensively used in the lead identification and lead optimization phases. The drug design data resource (D3R) organizes challenges against blinded experimental data to prospectively test computational methodologies as an opportunity for improved methods and algorithms to emerge. We participated in Grand Challenge 2 to predict the crystallographic poses of 36 Farnesoid X Receptor (FXR)-bound ligands and the relative binding affinities for two designated subsets of 18 and 15 FXR-bound ligands. Here, we present our methodology for pose and affinity predictions and its evaluation after the release of the experimental data. For predicting the crystallographic poses, we used docking and physics-based pose prediction methods guided by the binding poses of native ligands. For FXR ligands with known chemotypes in the PDB, we accurately predicted their binding modes, while for those with unknown chemotypes the predictions were more challenging. Our group ranked 1st (based on the median RMSD) out of the 46 groups that submitted complete entries for the binding pose prediction challenge. For the relative binding affinity prediction challenge, we performed free energy perturbation (FEP) calculations coupled with molecular dynamics (MD) simulations. FEP/MD calculations displayed a high success rate in identifying compounds with better or worse binding affinity than the reference (parent) compound. Our studies suggest that when ligands with chemical precedent are available in the literature, binding pose predictions using docking and physics-based methods are reliable; however, predictions are challenging for ligands with completely unknown chemotypes. We also show that FEP/MD calculations hold predictive value and can nowadays be used in a high throughput mode in a lead optimization project provided that crystal structures of
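
    For reference, the free-energy-perturbation relation underlying such calculations is the Zwanzig formula, ΔG = -k_B T ln⟨exp(-ΔU/(k_B T))⟩_A, where the average is taken over configurations sampled in state A. The sketch below evaluates this single-step textbook form on synthetic energy differences; production relative-binding FEP, as in the paper, uses many intermediate lambda windows and more robust estimators.

        import numpy as np

        def fep_delta_g(delta_u, temperature=300.0):
            """Zwanzig free-energy-perturbation estimator (illustrative).

            delta_u     : array of potential-energy differences U_B - U_A (kcal/mol),
                          evaluated on configurations sampled from state A
            temperature : kelvin
            Returns the estimated free energy difference G_B - G_A in kcal/mol.
            """
            k_b = 0.0019872041                      # Boltzmann constant, kcal/(mol*K)
            beta = 1.0 / (k_b * temperature)
            return -np.log(np.mean(np.exp(-beta * np.asarray(delta_u)))) / beta

        # Toy example: energy differences drawn from a narrow Gaussian.
        rng = np.random.default_rng(7)
        samples = rng.normal(loc=1.5, scale=0.5, size=10_000)
        print(f"estimated ddG = {fep_delta_g(samples):.2f} kcal/mol")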

  8. Critical Evaluation of Prediction Models for Phosphorus Partition between CaO-based Slags and Iron-based Melts during Dephosphorization Processes

    NASA Astrophysics Data System (ADS)

    Yang, Xue-Min; Li, Jin-Yan; Chai, Guo-Ming; Duan, Dong-Ping; Zhang, Jian

    2016-08-01

    According to the experimental results of hot metal dephosphorization by CaO-based slags at a commercial-scale hot metal pretreatment station, 16 models of the equilibrium quotient k_P or phosphorus partition L_P between CaO-based slags and iron-based melts collected from the literature have been evaluated. The collected 16 models for predicting the equilibrium quotient k_P can be transformed to predict the phosphorus partition L_P. The results predicted by the collected 16 models cannot be used directly as evaluation criteria for k_P or L_P because of the various forms or definitions of k_P or L_P. Thus, the measured phosphorus content [pct P] in the hot metal bath at the end point of the dephosphorization pretreatment process is applied as the fixed criterion for evaluating the collected 16 models. The collected 16 models can be described in the form of linear functions y = c0 + c1·x, in which the independent variable x represents the chemical composition of the slags, the intercept c0, including the constant term, describes the temperature effect and other unmentioned or acquiescent thermodynamic factors, and the slope c1 is regressed from the experimental results of k_P or L_P. Thus, a general approach to developing a thermodynamic model for predicting the equilibrium quotient k_P, the phosphorus partition L_P, or [pct P] in iron-based melts during the dephosphorization process is proposed by revising the constant term in the intercept c0 for the summarized 15 models, excepting Suito's model (M3). The better models with an ideal revising possibility or flexibility among the collected 16 models have been selected and recommended. Compared with the results predicted by the revised 15 models and Suito's model (M3), the developed IMCT-L_P model coupled with the dephosphorization mechanism proposed by the present authors can accurately predict the phosphorus partition L_P with the lowest mean deviation δ_{L_P} of log L_P of 2.33, as
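
    A minimal example of fitting the linear model form y = c0 + c1·x used by these correlations with ordinary least squares; the slag-composition values and log L_P data below are invented placeholders, not measurements from the paper.

        import numpy as np

        # Hypothetical measurements: a slag-composition term x and measured log L_P values y.
        x = np.array([2.1, 2.5, 2.8, 3.0, 3.4, 3.9])
        y = np.array([1.6, 1.9, 2.1, 2.2, 2.5, 2.8])

        # Fit y = c0 + c1*x by ordinary least squares.
        A = np.column_stack([np.ones_like(x), x])
        (c0, c1), *_ = np.linalg.lstsq(A, y, rcond=None)
        print(f"log L_P = {c0:.2f} + {c1:.2f} * x")

        # Mean absolute deviation of the fitted model, analogous to the delta_{L_P} criterion above.
        pred = c0 + c1 * x
        print("mean |deviation| of log L_P:", np.abs(pred - y).mean())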

  9. An automatic and accurate method of full heart segmentation from CT image based on linear gradient model

    NASA Astrophysics Data System (ADS)

    Yang, Zili

    2017-07-01

    Heart segmentation is an important auxiliary method in the diagnosis of many heart diseases, such as coronary heart disease and atrial fibrillation, and in the planning of tumor radiotherapy. Most of the existing methods for full heart segmentation treat the heart as a whole and cannot accurately extract the bottom of the heart. In this paper, we propose a new method based on a linear gradient model to segment the whole heart from CT images automatically and accurately. Twelve cases were used to test this method; accurate segmentation results were achieved and confirmed by clinical experts. The results can provide reliable clinical support.

  10. Predicting vapor-liquid phase equilibria with augmented ab initio interatomic potentials

    NASA Astrophysics Data System (ADS)

    Vlasiuk, Maryna; Sadus, Richard J.

    2017-06-01

    The ability of ab initio interatomic potentials to accurately predict vapor-liquid phase equilibria is investigated. Monte Carlo simulations are reported for the vapor-liquid equilibria of argon and krypton using recently developed accurate ab initio interatomic potentials. Seventeen interatomic potentials are studied, formulated from different combinations of two-body plus three-body terms. The simulation results are compared to either experimental or reference data for conditions ranging from the triple point to the critical point. It is demonstrated that the use of ab initio potentials enables systematic improvements to the accuracy of predictions via the addition of theoretically based terms. The contribution of three-body interactions is accounted for using the Axilrod-Teller-Muto plus other multipole contributions and the effective Marcelli-Wang-Sadus potentials. The results indicate that the predictive ability of recent interatomic potentials, obtained from quantum chemical calculations, is comparable to that of accurate empirical models. It is demonstrated that the Marcelli-Wang-Sadus potential can be used in combination with accurate two-body ab initio models for the computationally inexpensive and accurate estimation of vapor-liquid phase equilibria.

  11. Predicting vapor-liquid phase equilibria with augmented ab initio interatomic potentials.

    PubMed

    Vlasiuk, Maryna; Sadus, Richard J

    2017-06-28

    The ability of ab initio interatomic potentials to accurately predict vapor-liquid phase equilibria is investigated. Monte Carlo simulations are reported for the vapor-liquid equilibria of argon and krypton using recently developed accurate ab initio interatomic potentials. Seventeen interatomic potentials are studied, formulated from different combinations of two-body plus three-body terms. The simulation results are compared to either experimental or reference data for conditions ranging from the triple point to the critical point. It is demonstrated that the use of ab initio potentials enables systematic improvements to the accuracy of predictions via the addition of theoretically based terms. The contribution of three-body interactions is accounted for using the Axilrod-Teller-Muto plus other multipole contributions and the effective Marcelli-Wang-Sadus potentials. The results indicate that the predictive ability of recent interatomic potentials, obtained from quantum chemical calculations, is comparable to that of accurate empirical models. It is demonstrated that the Marcelli-Wang-Sadus potential can be used in combination with accurate two-body ab initio models for the computationally inexpensive and accurate estimation of vapor-liquid phase equilibria.
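
    For concreteness, the Axilrod-Teller-Muto triple-dipole term mentioned in the two records above can be evaluated per atom triplet as E = ν(1 + 3 cosγ1 cosγ2 cosγ3)/(r12·r13·r23)³, where the γi are the interior angles of the atom triangle. The sketch below implements this relation; the dispersion coefficient and geometry are placeholders, and the paper's potentials additionally include higher multipole and effective two-body contributions.

        import numpy as np

        def axilrod_teller_muto(r1, r2, r3, nu):
            """Triple-dipole (Axilrod-Teller-Muto) three-body dispersion energy (illustrative).

            r1, r2, r3 : 3D coordinates of the three atoms
            nu         : three-body dispersion coefficient (energy * length^9)
            """
            r12 = np.linalg.norm(r2 - r1)
            r13 = np.linalg.norm(r3 - r1)
            r23 = np.linalg.norm(r3 - r2)
            # Cosines of the interior angles of the atom triangle (law of cosines).
            cos1 = (r12**2 + r13**2 - r23**2) / (2 * r12 * r13)
            cos2 = (r12**2 + r23**2 - r13**2) / (2 * r12 * r23)
            cos3 = (r13**2 + r23**2 - r12**2) / (2 * r13 * r23)
            return nu * (1.0 + 3.0 * cos1 * cos2 * cos3) / (r12 * r13 * r23) ** 3

        # Example: a near-equilateral argon-like triplet with an arbitrary coefficient value.
        a = np.array([0.0, 0.0, 0.0])
        b = np.array([3.8, 0.0, 0.0])
        c = np.array([1.9, 3.29, 0.0])
        print(axilrod_teller_muto(a, b, c, nu=1.0))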

  12. Learning-based prediction of gestational age from ultrasound images of the fetal brain.

    PubMed

    Namburete, Ana I L; Stebbing, Richard V; Kemp, Bryn; Yaqub, Mohammad; Papageorghiou, Aris T; Alison Noble, J

    2015-04-01

    We propose an automated framework for predicting gestational age (GA) and neurodevelopmental maturation of a fetus based on 3D ultrasound (US) brain image appearance. Our method capitalizes on age-related sonographic image patterns in conjunction with clinical measurements to develop, for the first time, a predictive age model which improves on the GA-prediction potential of US images. The framework benefits from a manifold surface representation of the fetal head which delineates the inner skull boundary and serves as a common coordinate system based on cranial position. This allows for fast and efficient sampling of anatomically-corresponding brain regions to achieve like-for-like structural comparison of different developmental stages. We develop bespoke features which capture neurosonographic patterns in 3D images, and using a regression forest classifier, we characterize structural brain development both spatially and temporally to capture the natural variation existing in a healthy population (N=447) over an age range of active brain maturation (18-34 weeks). On a routine clinical dataset (N=187) our age prediction results strongly correlate with true GA (r = 0.98, accurate to within ±6.10 days), confirming the link between maturational progression and neurosonographic activity observable across gestation. Our model also outperforms current clinical methods by ±4.57 days in the third trimester, a period complicated by biological variations in the fetal population. Through feature selection, the model successfully identified the most age-discriminating anatomies over this age range as being the Sylvian fissure, cingulate, and callosal sulci. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
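
    As a generic illustration of the regression-forest step (not the authors' bespoke neurosonographic features or manifold sampling), the sketch below fits a random forest to placeholder image-derived features and reports the gestational-age error in days.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split

        # Placeholder data: rows are scans, columns are image-derived features
        # (standing in for sonographic descriptors sampled over brain regions);
        # the target is gestational age in weeks.
        rng = np.random.default_rng(0)
        n_scans, n_features = 400, 60
        X = rng.normal(size=(n_scans, n_features))
        ga_weeks = 18 + 16 * rng.random(n_scans)
        X[:, 0] = ga_weeks + rng.normal(0, 0.8, n_scans)      # one informative feature

        X_tr, X_te, y_tr, y_te = train_test_split(X, ga_weeks, test_size=0.25, random_state=0)
        forest = RandomForestRegressor(n_estimators=300, random_state=0)
        forest.fit(X_tr, y_tr)

        pred = forest.predict(X_te)
        print("mean absolute error (days):", 7 * np.abs(pred - y_te).mean())
        # Feature importances indicate which features (regions) are most age-discriminating.
        print("top feature index:", int(np.argmax(forest.feature_importances_)))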

  13. Prediction of Slot Shape and Slot Size for Improving the Performance of Microstrip Antennas Using Knowledge-Based Neural Networks.

    PubMed

    Khan, Taimoor; De, Asok

    2014-01-01

    In the last decade, artificial neural networks have become very popular techniques for computing different performance parameters of microstrip antennas. The proposed work illustrates a knowledge-based neural network model for predicting the appropriate shape and accurate size of the slot introduced on the radiating patch for achieving the desired level of resonance, gain, directivity, antenna efficiency, and radiation efficiency for dual-frequency operation. By incorporating prior knowledge in the neural model, the number of required training patterns is drastically reduced. Further, the neural model incorporating prior knowledge can be used for predicting the response in the extrapolation region beyond the training patterns region. For validation, a prototype is also fabricated and its performance parameters are measured. A very good agreement is attained between measured, simulated, and predicted results.

  14. An adaptive data-driven method for accurate prediction of remaining useful life of rolling bearings

    NASA Astrophysics Data System (ADS)

    Peng, Yanfeng; Cheng, Junsheng; Liu, Yanfei; Li, Xuejun; Peng, Zhihua

    2018-06-01

    A novel data-driven method based on a Gaussian mixture model (GMM) and the distance evaluation technique (DET) is proposed to predict the remaining useful life (RUL) of rolling bearings. The data sets are clustered by GMM to divide all data sets into several health states adaptively and reasonably. The number of clusters is determined by the minimum description length principle. Thus, both the health states of the data sets and the number of states are obtained automatically. Meanwhile, the abnormal data sets can be recognized during the clustering process and removed from the training data sets. After obtaining the health states, appropriate features are selected by DET for increasing the classification and prediction accuracy. In the prediction process, each vibration signal is decomposed into several components by empirical mode decomposition. Some common statistical parameters of the components are calculated first and then the features are clustered using GMM to divide the data sets into several health states and remove the abnormal data sets. Thereafter, appropriate statistical parameters of the generated components are selected using DET. Finally, a least squares support vector machine is utilized to predict the RUL of rolling bearings. Experimental results indicate that the proposed method reliably predicts the RUL of rolling bearings.
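
    As an illustration of the adaptive health-state step, the sketch below clusters a feature matrix with scikit-learn's GaussianMixture and selects the number of states with the Bayesian information criterion as a stand-in for the minimum description length principle; the feature matrix and the abnormal-segment cut-off are hypothetical, and the DET feature selection and least squares support vector machine stages are not reproduced.

    ```python
    # Sketch: adaptive health-state discovery for bearing features (assumed setup,
    # not the authors' exact implementation).
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 6))          # hypothetical feature matrix (segments x features)

    # Choose the number of health states by minimising an information criterion
    # (BIC here, as a proxy for the minimum description length principle).
    candidates = range(1, 8)
    models = [GaussianMixture(n_components=k, random_state=0).fit(X) for k in candidates]
    best = min(models, key=lambda m: m.bic(X))
    states = best.predict(X)               # health-state label for each segment

    # Flag potential abnormal segments as those with very low likelihood under the model.
    log_lik = best.score_samples(X)
    abnormal = log_lik < np.quantile(log_lik, 0.01)   # hypothetical 1% cut-off
    print(best.n_components, abnormal.sum())
    ```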

  15. A RSM-based predictive model to characterize heat treating parameters of D2 steel using combined Barkhausen noise and hysteresis loop methods

    NASA Astrophysics Data System (ADS)

    Kahrobaee, Saeed; Hejazi, Taha-Hossein

    2017-07-01

    Austenitizing and tempering temperatures are the effective characteristics in the heat treating process of AISI D2 tool steel. Therefore, controlling them enables the heat treatment process to be designed more accurately, which results in more balanced mechanical properties. The aim of this work is to develop a multiresponse predictive model that enables finding these characteristics based on nondestructive tests by a set of parameters of the magnetic Barkhausen noise technique and hysteresis loop method. To produce various microstructural changes, identical specimens from the AISI D2 steel sheet were austenitized in the range 1025-1130 °C, for 30 min, oil-quenched and finally tempered at various temperatures between 200 °C and 650 °C. A set of nondestructive data has been gathered based on a general factorial design of experiments and used for training and testing the multiple response surface model. Finally, an optimization model has been proposed to achieve minimal error prediction. Results revealed that applying the Barkhausen and hysteresis loop methods simultaneously, coupled with the multiresponse model, has the potential to be used as a reliable and accurate nondestructive tool for predicting austenitizing and tempering temperatures (which, in turn, led to characterizing the microstructural changes) of parts with unknown heat treating conditions.

  16. Accurate experimental and theoretical comparisons between superconductor-insulator-superconductor mixers showing weak and strong quantum effects

    NASA Technical Reports Server (NTRS)

    Mcgrath, W. R.; Richards, P. L.; Face, D. W.; Prober, D. E.; Lloyd, F. L.

    1988-01-01

    A systematic study of the gain and noise in superconductor-insulator-superconductor mixers employing Ta based, Nb based, and Pb-alloy based tunnel junctions was made. These junctions displayed both weak and strong quantum effects at a signal frequency of 33 GHz. The effects of energy gap sharpness and subgap current were investigated and are quantitatively related to mixer performance. Detailed comparisons are made of the mixing results with the predictions of a three-port model approximation to the Tucker theory. Mixer performance was measured with a novel test apparatus which is accurate enough to allow for the first quantitative tests of theoretical noise predictions. It is found that the three-port model of the Tucker theory underestimates the mixer noise temperature by a factor of about 2 for all of the mixers. In addition, predicted values of available mixer gain are in reasonable agreement with experiment when quantum effects are weak. However, as quantum effects become strong, the predicted available gain diverges to infinity, which is in sharp contrast to the experimental results. Predictions of coupled gain do not always show such divergences.

  17. Prediction of brain maturity based on cortical thickness at different spatial resolutions.

    PubMed

    Khundrakpam, Budhachandra S; Tohka, Jussi; Evans, Alan C

    2015-05-01

    Several studies using magnetic resonance imaging (MRI) scans have shown developmental trajectories of cortical thickness. Cognitive milestones happen concurrently with these structural changes, and a delay in such changes has been implicated in developmental disorders such as attention-deficit/hyperactivity disorder (ADHD). Accurate estimation of individuals' brain maturity, therefore, is critical in establishing a baseline for normal brain development against which neurodevelopmental disorders can be assessed. In this study, cortical thickness derived from structural MRI scans of a large longitudinal dataset of normally growing children and adolescents (n=308) was used to build a highly accurate predictive model for estimating chronological age (cross-validated correlation up to R=0.84). Unlike previous studies, which used a kernelized approach in building prediction models, we used an elastic net penalized linear regression model capable of producing a spatially sparse, yet accurate predictive model of chronological age. Upon investigating different scales of cortical parcellation from 78 to 10,240 brain parcels, we observed that the accuracy in estimated age improved with increased spatial scale of brain parcellation, with the best estimations obtained for spatial resolutions consisting of 2560 and 10,240 brain parcels. The top predictors of brain maturity were found in highly localized sensorimotor and association areas. The results of our study demonstrate that cortical thickness can be used to estimate individuals' brain maturity with high accuracy, and the estimated ages relate to functional and behavioural measures, underscoring the relevance and scope of the study in the understanding of biological maturity. Copyright © 2015 Elsevier Inc. All rights reserved.
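
    The sketch below shows the general shape of such an elastic net penalized regression of age on parcel-wise cortical thickness, using scikit-learn's ElasticNetCV; the thickness matrix and ages are synthetic placeholders, so the reported cross-validated correlation is not meaningful, and the study's preprocessing and parcellation pipeline are omitted.

    ```python
    # Sketch: sparse (elastic net) regression of chronological age on parcel-wise
    # cortical thickness. Data are synthetic placeholders, not the study's dataset.
    import numpy as np
    from sklearn.linear_model import ElasticNetCV
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(1)
    n_subjects, n_parcels = 308, 2560            # sizes mirror the abstract; values synthetic
    thickness = rng.normal(2.5, 0.3, size=(n_subjects, n_parcels))
    age = rng.uniform(5, 18, size=n_subjects)    # hypothetical ages in years

    model = ElasticNetCV(l1_ratio=0.5, n_alphas=20, cv=5, max_iter=5000)
    predicted = cross_val_predict(model, thickness, age, cv=5)
    r = np.corrcoef(age, predicted)[0, 1]        # cross-validated correlation
    print(f"cross-validated r = {r:.2f}")
    ```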

  18. Accurate recapture identification for genetic mark–recapture studies with error-tolerant likelihood-based match calling and sample clustering

    USGS Publications Warehouse

    Sethi, Suresh; Linden, Daniel; Wenburg, John; Lewis, Cara; Lemons, Patrick R.; Fuller, Angela K.; Hare, Matthew P.

    2016-01-01

    Error-tolerant likelihood-based match calling presents a promising technique to accurately identify recapture events in genetic mark–recapture studies by combining probabilities of latent genotypes and probabilities of observed genotypes, which may contain genotyping errors. Combined with clustering algorithms to group samples into sets of recaptures based upon pairwise match calls, these tools can be used to reconstruct accurate capture histories for mark–recapture modelling. Here, we assess the performance of a recently introduced error-tolerant likelihood-based match-calling model and sample clustering algorithm for genetic mark–recapture studies. We assessed both biallelic (i.e. single nucleotide polymorphisms; SNP) and multiallelic (i.e. microsatellite; MSAT) markers using a combination of simulation analyses and case study data on Pacific walrus (Odobenus rosmarus divergens) and fishers (Pekania pennanti). A novel two-stage clustering approach is demonstrated for genetic mark–recapture applications. First, repeat captures within a sampling occasion are identified. Subsequently, recaptures across sampling occasions are identified. The likelihood-based matching protocol performed well in simulation trials, demonstrating utility for use in a wide range of genetic mark–recapture studies. Moderately sized SNP (64+) and MSAT (10–15) panels produced accurate match calls for recaptures and accurate non-match calls for samples from closely related individuals in the face of low to moderate genotyping error. Furthermore, matching performance remained stable or increased as the number of genetic markers increased, genotyping error notwithstanding.

  19. Predicting Multicomponent Adsorption Isotherms in Open-Metal Site Materials Using Force Field Calculations Based on Energy Decomposed Density Functional Theory.

    PubMed

    Heinen, Jurn; Burtch, Nicholas C; Walton, Krista S; Fonseca Guerra, Célia; Dubbeldam, David

    2016-12-12

    For the design of adsorptive-separation units, knowledge of the multicomponent adsorption behavior is required. Ideal adsorbed solution theory (IAST) breaks down for olefin adsorption in open-metal site (OMS) materials due to non-ideal donor-acceptor interactions. Using a density-functional-theory-based energy decomposition scheme, we develop a physically justifiable classical force field that incorporates the missing orbital interactions using an appropriate functional form. Our first-principles derived force field shows greatly improved quantitative agreement with the inflection points, initial uptake, saturation capacity, and enthalpies of adsorption obtained from our in-house adsorption experiments. While IAST fails to make accurate predictions, our improved force field model is able to correctly predict the multicomponent behavior. Our approach is also transferable to other OMS structures, allowing the accurate study of their separation performances for olefins/paraffins and further mixtures involving complex donor-acceptor interactions. © 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Profitable capitation requires accurate costing.

    PubMed

    West, D A; Hicks, L L; Balas, E A; West, T D

    1996-01-01

    In the name of costing accuracy, nurses are asked to track inventory use on a per-treatment basis when more significant costs, such as general overhead and nursing salaries, are usually allocated to patients or treatments on an average cost basis. Accurate treatment costing and financial viability require analysis of all resources actually consumed in treatment delivery, including nursing services and inventory. More precise costing information enables more profitable decisions, as is demonstrated by comparing the ratio-of-cost-to-treatment method (aggregate costing) with alternative activity-based costing (ABC) methods. Nurses must participate in this costing process to assure that capitation bids are based upon accurate costs rather than simple averages.
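
    The contrast between aggregate (average) costing and activity-based costing can be made concrete with a small worked example; all volumes, times and dollar figures below are hypothetical.

    ```python
    # Sketch: aggregate (average) costing vs activity-based costing (ABC) for two
    # treatment types. All dollar figures, times and volumes are hypothetical.
    treatments = {
        # name: (volume, nursing_minutes_each, supplies_cost_each)
        "routine_visit": (800, 20, 15.0),
        "complex_wound": (200, 75, 120.0),
    }
    NURSING_COST_PER_MIN = 1.10       # hypothetical fully loaded nursing cost
    GENERAL_OVERHEAD = 60_000.0       # hypothetical overhead pool

    total_treatments = sum(v for v, _, _ in treatments.values())
    total_cost = GENERAL_OVERHEAD + sum(
        v * (m * NURSING_COST_PER_MIN + s) for v, m, s in treatments.values()
    )
    average_cost = total_cost / total_treatments   # aggregate costing: one number for all

    total_minutes = sum(v * m for v, m, _ in treatments.values())
    for name, (v, m, s) in treatments.items():
        # ABC: direct nursing time and supplies, plus overhead driven by nursing minutes.
        abc_cost = m * NURSING_COST_PER_MIN + s + GENERAL_OVERHEAD * (v * m / total_minutes) / v
        print(f"{name}: average = ${average_cost:.2f}, ABC = ${abc_cost:.2f}")
    ```

    Under these toy numbers the average cost is identical for both treatments, while the ABC figures show the routine visit costing far less and the complex wound far more, which is the distortion the abstract warns capitation bids against.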

  1. EOID Model Validation and Performance Prediction

    DTIC Science & Technology

    2002-09-30

    Our long-term goal is to accurately predict the capability of the current generation of laser-based underwater imaging sensors to perform Electro-Optic Identification (EOID) against relevant targets in a variety of realistic environmental conditions. The two most prominent technologies in this area

  2. lncRNATargets: A platform for lncRNA target prediction based on nucleic acid thermodynamics.

    PubMed

    Hu, Ruifeng; Sun, Xiaobo

    2016-08-01

    Many studies have shown that long noncoding RNAs (lncRNAs) perform various functions in critical biological processes. Advanced experimental and computational technologies allow access to more information on lncRNAs. Determining the functions and action mechanisms of these RNAs on a large scale is urgently needed. We provide lncRNATargets, a web-based platform for lncRNA target prediction based on nucleic acid thermodynamics. The nearest-neighbor (NN) model was used to calculate binding free energy. The main principle of the NN model for nucleic acids is that the identity and orientation of neighboring base pairs determine the stability of a given base pair. lncRNATargets features the following options: setting of a specific temperature, which allows use not only for humans but also for other animals or plants; high-throughput processing of all lncRNAs without RNA size limitation, which is superior to any other existing tool; and a web-based, user-friendly interface with colored result displays that allows easy access for nonskilled computer operators and provides a better understanding of results. This technique could provide an accurate calculation of the binding free energy of lncRNA-target dimers to predict whether these structures are well targeted together. lncRNATargets provides high-accuracy calculations, and this user-friendly program is available for free at http://www.herbbol.org:8001/lrt/ .
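
    A toy version of the nearest-neighbor free-energy sum is sketched below; the stack and initiation parameters are illustrative placeholders rather than published thermodynamic tables, and loops, mismatches and temperature dependence are ignored.

    ```python
    # Sketch: nearest-neighbour (NN) estimate of duplex free energy. The stack
    # parameters below are illustrative placeholders, NOT real thermodynamic tables,
    # and the scoring ignores loops, dangling ends and mismatches.
    ILLUSTRATIVE_STACK_DG = {            # kcal/mol at an assumed temperature
        "AA": -1.0, "AU": -0.9, "UA": -1.1, "AC": -1.5, "CA": -1.4,
        "AG": -1.3, "GA": -1.6, "CC": -2.1, "CG": -2.0, "GC": -2.3,
        "GG": -2.2, "CU": -1.2, "UC": -1.3, "GU": -1.4, "UG": -1.5, "UU": -0.8,
    }
    INITIATION_DG = 3.4                  # illustrative helix-initiation penalty

    def nn_free_energy(seq: str) -> float:
        """Sum nearest-neighbour stack contributions along one strand of a
        perfectly complementary duplex (simplified illustration)."""
        dg = INITIATION_DG
        for i in range(len(seq) - 1):
            dg += ILLUSTRATIVE_STACK_DG[seq[i:i + 2].upper()]
        return dg

    print(nn_free_energy("GCAUCGGC"))    # more negative = more stable (under these toy numbers)
    ```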

  3. Accurate, predictable, repeatable micro-assembly technology for polymer, microfluidic modules.

    PubMed

    Lee, Tae Yoon; Han, Kyudong; Barrett, Dwhyte O; Park, Sunggook; Soper, Steven A; Murphy, Michael C

    2018-01-01

    A method for the design, construction, and assembly of modular, polymer-based, microfluidic devices using simple micro-assembly technology was demonstrated to build an integrated fluidic system consisting of vertically stacked modules for carrying out multi-step molecular assays. As an example of the utility of the modular system, point mutation detection using the ligase detection reaction (LDR) following amplification by the polymerase chain reaction (PCR) was carried out. Fluid interconnects and standoffs ensured that temperatures in the vertically stacked reactors were within ±0.2 °C at the center of the temperature zones and ±1.1 °C overall. The vertical spacing between modules was confirmed using finite element models (ANSYS, Inc., Canonsburg, PA) to simulate the steady-state temperature distribution for the assembly. Passive alignment structures, including a hemispherical pin-in-hole, a hemispherical pin-in-slot, and a plate-plate lap joint, were developed using screw theory to enable accurate, exactly constrained assembly of the microfluidic reactors, cover sheets, and fluid interconnects to facilitate the modular approach. The mean mismatch between the centers of adjacent through holes was 64 ± 7.7 μm, significantly reducing the dead volume necessary to accommodate manufacturing variation. The microfluidic components were easily assembled by hand and the assembly of several different configurations of microfluidic modules for executing the assay was evaluated. Temperatures were measured in the desired range in each reactor. The biochemical performance was comparable to that obtained with benchtop instruments, but took less than 45 min to execute, half the time.

  4. Accurate Bit Error Rate Calculation for Asynchronous Chaos-Based DS-CDMA over Multipath Channel

    NASA Astrophysics Data System (ADS)

    Kaddoum, Georges; Roviras, Daniel; Chargé, Pascal; Fournier-Prunaret, Daniele

    2009-12-01

    An accurate approach to compute the bit error rate expression for a multiuser chaos-based DS-CDMA system is presented in this paper. For a more realistic communication system, a slow fading multipath channel is considered, together with a simple RAKE receiver structure. Based on the bit energy distribution, this approach gives accurate results with a low computational load compared with other computation methods in the literature. Perfect estimation of the channel coefficients with the associated delays and chaos synchronization is assumed. The bit error rate is derived in terms of the bit energy distribution, the number of paths, the noise variance, and the number of users. Results are illustrated by theoretical calculations and numerical simulations, which point out the accuracy of our approach.
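
    The sketch below illustrates the general idea of averaging a Gaussian-approximation bit error rate over an empirical bit-energy distribution; it is not the closed-form expression derived in the paper, and the energy distribution, noise level and system sizes are hypothetical.

    ```python
    # Sketch: Gaussian-approximation BER for a chaos-based DS-CDMA link, averaged
    # over an empirical bit-energy distribution. This is a simplified illustration,
    # not the paper's derivation; all parameters are hypothetical.
    import numpy as np
    from scipy.special import erfc

    def q_func(x):
        return 0.5 * erfc(x / np.sqrt(2.0))

    rng = np.random.default_rng(2)
    bit_energy = rng.gamma(shape=5.0, scale=0.2, size=100_000)  # hypothetical Eb samples (chaotic spreading)
    noise_psd = 0.2                                             # N0, hypothetical
    n_users, n_paths, spread_factor = 4, 3, 64                  # hypothetical system sizes

    # Treat multiple-access and inter-path interference as extra Gaussian noise.
    interference_var = (n_users * n_paths - 1) * bit_energy.mean() / (3 * spread_factor)
    ber = q_func(np.sqrt(2.0 * bit_energy / (noise_psd + interference_var))).mean()
    print(f"approximate BER = {ber:.3e}")
    ```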

  5. Improving Prediction Accuracy for WSN Data Reduction by Applying Multivariate Spatio-Temporal Correlation

    PubMed Central

    Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman

    2011-01-01

    This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
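
    A small sketch of the comparison between a time-only (simple) regression and a multivariate regression across correlated sensed quantities is given below, using scikit-learn on synthetic sensor traces; the variables, dependencies and noise levels are invented.

    ```python
    # Sketch: comparing time-only (simple) regression with multivariate regression
    # that exploits correlation between sensed quantities. Data are synthetic.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_absolute_error

    rng = np.random.default_rng(3)
    t = np.arange(500, dtype=float)
    temperature = 20 + 5 * np.sin(t / 50) + rng.normal(0, 0.3, t.size)
    light = 300 + 40 * np.sin(t / 50 + 0.2) + rng.normal(0, 5, t.size)
    humidity = 80 - 2.0 * temperature + 0.01 * light + rng.normal(0, 0.5, t.size)

    train, test = slice(0, 400), slice(400, 500)

    # Simple regression: humidity explained by time alone.
    simple = LinearRegression().fit(t[train, None], humidity[train])
    # Multiple regression: humidity explained by the other sensed variables.
    X = np.column_stack([temperature, light])
    multi = LinearRegression().fit(X[train], humidity[train])

    print("time-only MAE   :", mean_absolute_error(humidity[test], simple.predict(t[test, None])))
    print("multivariate MAE:", mean_absolute_error(humidity[test], multi.predict(X[test])))
    ```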

  6. Accurate density functional prediction of molecular electron affinity with the scaling corrected Kohn–Sham frontier orbital energies

    NASA Astrophysics Data System (ADS)

    Zhang, DaDi; Yang, Xiaolong; Zheng, Xiao; Yang, Weitao

    2018-04-01

    Electron affinity (EA) is the energy released when an additional electron is attached to an atom or a molecule. EA is a fundamental thermochemical property, and it is closely pertinent to other important properties such as electronegativity and hardness. However, accurate prediction of EA is difficult with density functional theory methods. The somewhat large error of the calculated EAs originates mainly from the intrinsic delocalisation error associated with the approximate exchange-correlation functional. In this work, we employ a previously developed non-empirical global scaling correction approach, which explicitly imposes the Perdew-Parr-Levy-Balduz condition to the approximate functional, and achieve a substantially improved accuracy for the calculated EAs. In our approach, the EA is given by the scaling corrected Kohn-Sham lowest unoccupied molecular orbital energy of the neutral molecule, without the need to carry out the self-consistent-field calculation for the anion.

  7. Parkinsonian rest tremor can be detected accurately based on neuronal oscillations recorded from the subthalamic nucleus.

    PubMed

    Hirschmann, J; Schoffelen, J M; Schnitzler, A; van Gerven, M A J

    2017-10-01

    To investigate the possibility of tremor detection based on deep brain activity. We re-analyzed recordings of local field potentials (LFPs) from the subthalamic nucleus in 10 PD patients (12 body sides) with spontaneously fluctuating rest tremor. Power in several frequency bands was estimated and used as input to Hidden Markov Models (HMMs), which classified short data segments as either tremor-free rest or rest tremor. HMMs were compared to direct threshold application to individual power features. Applying a threshold directly to band-limited power was insufficient for tremor detection (mean area under the curve [AUC] of receiver operating characteristic: 0.64, STD: 0.19). Multi-feature HMMs, in contrast, allowed for accurate detection (mean AUC: 0.82, STD: 0.15), using four power features obtained from a single contact pair. Within-patient training yielded better accuracy than across-patient training (0.84 vs. 0.78, p=0.03), yet tremor could often be detected accurately with either approach. High frequency oscillations (>200 Hz) were the best performing individual feature. LFP-based markers of tremor are robust enough to allow for accurate tremor detection in short data segments, provided that appropriate statistical models are used. LFP-based markers of tremor could be useful control signals for closed-loop deep brain stimulation. Copyright © 2017 International Federation of Clinical Neurophysiology. Published by Elsevier B.V. All rights reserved.
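
    The sketch below trains a two-state Gaussian hidden Markov model on band-power features with the hmmlearn package as a simplified stand-in for the tremor/no-tremor classifier; the features are synthetic, and the model is fitted unsupervised, unlike the supervised HMMs described above.

    ```python
    # Sketch: two-state Gaussian HMM over LFP band-power features as a simplified
    # stand-in for the tremor/no-tremor classifier. Feature values are synthetic and
    # the model here is trained unsupervised, unlike the study's HMMs.
    import numpy as np
    from hmmlearn.hmm import GaussianHMM

    rng = np.random.default_rng(4)
    # Hypothetical features per short segment: power in four frequency bands.
    rest = rng.normal([1.0, 2.0, 0.5, 0.2], 0.2, size=(300, 4))
    tremor = rng.normal([2.5, 1.0, 0.8, 0.9], 0.2, size=(300, 4))
    X = np.vstack([rest, tremor, rest])

    model = GaussianHMM(n_components=2, covariance_type="full", n_iter=100, random_state=0)
    model.fit(X)
    states = model.predict(X)     # decoded segment labels (0/1, meaning assigned post hoc)
    print(np.bincount(states))
    ```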

  8. Chemical structure-based predictive model for methanogenic anaerobic biodegradation potential.

    PubMed

    Meylan, William; Boethling, Robert; Aronson, Dallas; Howard, Philip; Tunkel, Jay

    2007-09-01

    Many screening-level models exist for predicting aerobic biodegradation potential from chemical structure, but anaerobic biodegradation generally has been ignored by modelers. We used a fragment contribution approach to develop a model for predicting biodegradation potential under methanogenic anaerobic conditions. The new model has 37 fragments (substructures) and classifies a substance as either fast or slow, relative to the potential to be biodegraded in the "serum bottle" anaerobic biodegradation screening test (Organization for Economic Cooperation and Development Guideline 311). The model correctly classified 90, 77, and 91% of the chemicals in the training set (n = 169) and two independent validation sets (n = 35 and 23), respectively. Accuracy of predictions of fast and slow degradation was equal for training-set chemicals, but fast-degradation predictions were less accurate than slow-degradation predictions for the validation sets. Analysis of the signs of the fragment coefficients for this and the other (aerobic) Biowin models suggests that in the context of simple group contribution models, the majority of positive and negative structural influences on ultimate degradation are the same for aerobic and methanogenic anaerobic biodegradation.
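
    A generic fragment-contribution scorer in the spirit of such models is sketched below; the fragment names, coefficients, intercept and decision threshold are hypothetical placeholders, not the 37 fragments of the published model.

    ```python
    # Sketch: a generic fragment-contribution classifier in the spirit of Biowin-style
    # models. Fragment names, coefficients and the intercept are hypothetical placeholders.
    import math

    HYPOTHETICAL_COEFFS = {
        "aromatic_ring": -0.45,   # negative = hinders methanogenic degradation (illustrative)
        "ester":          0.60,   # positive = promotes degradation (illustrative)
        "chlorine":      -0.80,
        "linear_alkyl":   0.25,
    }
    INTERCEPT = 0.70              # illustrative

    def anaerobic_score(fragment_counts: dict) -> float:
        z = INTERCEPT + sum(HYPOTHETICAL_COEFFS.get(f, 0.0) * n for f, n in fragment_counts.items())
        return 1.0 / (1.0 + math.exp(-z))          # probability-like score in (0, 1)

    def classify(fragment_counts: dict) -> str:
        return "fast" if anaerobic_score(fragment_counts) >= 0.5 else "slow"

    print(classify({"ester": 1, "linear_alkyl": 3}))        # -> fast (under these toy numbers)
    print(classify({"aromatic_ring": 2, "chlorine": 2}))    # -> slow
    ```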

  9. Microclimate Data Improve Predictions of Insect Abundance Models Based on Calibrated Spatiotemporal Temperatures.

    PubMed

    Rebaudo, François; Faye, Emile; Dangles, Olivier

    2016-01-01

    A large body of literature has recently recognized the role of microclimates in controlling the physiology and ecology of species, yet the relevance of fine-scale climatic data for modeling species performance and distribution remains a matter of debate. Using a 6-year monitoring of three potato moth species, major crop pests in the tropical Andes, we asked whether the spatiotemporal resolution of temperature data affects the predictions of models of moth performance and distribution. For this, we used three different climatic data sets: (i) the WorldClim dataset (global dataset), (ii) air temperature recorded using data loggers (weather station dataset), and (iii) air crop canopy temperature (microclimate dataset). We developed a statistical procedure to calibrate all datasets to monthly and yearly variation in temperatures, while keeping both spatial and temporal variances (air monthly temperature at 1 km² for the WorldClim dataset, air hourly temperature for the weather station, and air minute temperature over 250 m radius disks for the microclimate dataset). Then, we computed pest performances based on these three datasets. Results for temperature ranging from 9 to 11°C revealed discrepancies in the simulation outputs in both survival and development rates depending on the spatiotemporal resolution of the temperature dataset. Temperature and simulated pest performances were then combined into multiple linear regression models to compare predicted vs. field data. We used an additional set of study sites to test the ability of the results of our model to be extrapolated over larger scales. Results showed that the model implemented with microclimatic data best predicted observed pest abundances for our study sites, but was less accurate than the global dataset model when performed at larger scales. Our simulations therefore stress the importance of considering different temperature datasets depending on the issue to be solved in order to accurately predict species

  10. Microclimate Data Improve Predictions of Insect Abundance Models Based on Calibrated Spatiotemporal Temperatures

    PubMed Central

    Rebaudo, François; Faye, Emile; Dangles, Olivier

    2016-01-01

    A large body of literature has recently recognized the role of microclimates in controlling the physiology and ecology of species, yet the relevance of fine-scale climatic data for modeling species performance and distribution remains a matter of debate. Using a 6-year monitoring of three potato moth species, major crop pests in the tropical Andes, we asked whether the spatiotemporal resolution of temperature data affects the predictions of models of moth performance and distribution. For this, we used three different climatic data sets: (i) the WorldClim dataset (global dataset), (ii) air temperature recorded using data loggers (weather station dataset), and (iii) air crop canopy temperature (microclimate dataset). We developed a statistical procedure to calibrate all datasets to monthly and yearly variation in temperatures, while keeping both spatial and temporal variances (air monthly temperature at 1 km² for the WorldClim dataset, air hourly temperature for the weather station, and air minute temperature over 250 m radius disks for the microclimate dataset). Then, we computed pest performances based on these three datasets. Results for temperature ranging from 9 to 11°C revealed discrepancies in the simulation outputs in both survival and development rates depending on the spatiotemporal resolution of the temperature dataset. Temperature and simulated pest performances were then combined into multiple linear regression models to compare predicted vs. field data. We used an additional set of study sites to test the ability of the results of our model to be extrapolated over larger scales. Results showed that the model implemented with microclimatic data best predicted observed pest abundances for our study sites, but was less accurate than the global dataset model when performed at larger scales. Our simulations therefore stress the importance of considering different temperature datasets depending on the issue to be solved in order to accurately predict species

  11. New analytical model for the ozone electronic ground state potential surface and accurate ab initio vibrational predictions at high energy range.

    PubMed

    Tyuterev, Vladimir G; Kochanov, Roman V; Tashkun, Sergey A; Holka, Filip; Szalay, Péter G

    2013-10-07

    An accurate description of the complicated shape of the potential energy surface (PES) and that of the highly excited vibration states is of crucial importance for various unsolved issues in the spectroscopy and dynamics of ozone and remains a challenge for the theory. In this work a new analytical representation is proposed for the PES of the ground electronic state of the ozone molecule in the range covering the main potential well and the transition state towards the dissociation. This model accounts for particular features specific to the ozone PES for large variations of nuclear displacements along the minimum energy path. The impact of the shape of the PES near the transition state (existence of the "reef structure") on vibration energy levels was studied for the first time. The major purpose of this work was to provide accurate theoretical predictions for ozone vibrational band centres at the energy range near the dissociation threshold, which would be helpful for understanding the very complicated high-resolution spectra and its analyses currently in progress. Extended ab initio electronic structure calculations were carried out enabling the determination of the parameters of a minimum energy path PES model resulting in a new set of theoretical vibrational levels of ozone. A comparison with recent high-resolution spectroscopic data on the vibrational levels gives the root-mean-square deviations below 1 cm(-1) for ozone band centres up to 90% of the dissociation energy. New ab initio vibrational predictions represent a significant improvement with respect to all previously available calculations.

  12. Climate-based models for pulsed resources improve predictability of consumer population dynamics: outbreaks of house mice in forest ecosystems.

    PubMed

    Holland, E Penelope; James, Alex; Ruscoe, Wendy A; Pech, Roger P; Byrom, Andrea E

    2015-01-01

    Accurate predictions of the timing and magnitude of consumer responses to episodic seeding events (masts) are important for understanding ecosystem dynamics and for managing outbreaks of invasive species generated by masts. While models relating consumer populations to resource fluctuations have been developed successfully for a range of natural and modified ecosystems, a critical gap that needs addressing is better prediction of resource pulses. A recent model used change in summer temperature from one year to the next (ΔT) for predicting masts for forest and grassland plants in New Zealand. We extend this climate-based method in the framework of a model for consumer-resource dynamics to predict invasive house mouse (Mus musculus) outbreaks in forest ecosystems. Compared with previous mast models based on absolute temperature, the ΔT method for predicting masts resulted in an improved model for mouse population dynamics. There was also a threshold effect of ΔT on the likelihood of an outbreak occurring. The improved climate-based method for predicting resource pulses and consumer responses provides a straightforward rule of thumb for determining, with one year's advance warning, whether management intervention might be required in invaded ecosystems. The approach could be applied to consumer-resource systems worldwide where climatic variables are used to model the size and duration of resource pulses, and may have particular relevance for ecosystems where global change scenarios predict increased variability in climatic events.
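
    A rule-of-thumb version of the ΔT trigger is sketched below; the temperature series and the warning threshold are hypothetical, and the published consumer-resource model's coefficients are not reproduced.

    ```python
    # Sketch: the delta-T rule of thumb for flagging a likely mast (and hence possible
    # mouse outbreak) one year ahead. The threshold and temperature series are
    # hypothetical; the published model's coefficients are not reproduced here.
    summer_temp = {2010: 14.1, 2011: 13.2, 2012: 15.0, 2013: 13.8, 2014: 15.4}  # °C, hypothetical
    DELTA_T_THRESHOLD = 1.0   # °C, illustrative outbreak-warning threshold

    def outbreak_warning(temps: dict, threshold: float) -> dict:
        """Return, per year, whether the year-on-year rise in summer temperature
        exceeds the threshold (a proxy for a heavy seedfall the following year)."""
        years = sorted(temps)
        return {y: (temps[y] - temps[prev]) > threshold for prev, y in zip(years, years[1:])}

    print(outbreak_warning(summer_temp, DELTA_T_THRESHOLD))
    # {2011: False, 2012: True, 2013: False, 2014: True} under these toy numbers
    ```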

  13. Predicting age groups of Twitter users based on language and metadata features.

    PubMed

    Morgan-Lopez, Antonio A; Kim, Annice E; Chew, Robert F; Ruddle, Paul

    2017-01-01

    Health organizations are increasingly using social media, such as Twitter, to disseminate health messages to target audiences. Determining the extent to which the target audience (e.g., age groups) was reached is critical to evaluating the impact of social media education campaigns. The main objective of this study was to examine the separate and joint predictive validity of linguistic and metadata features in predicting the age of Twitter users. We created a labeled dataset of Twitter users across different age groups (youth, young adults, adults) by collecting publicly available birthday announcement tweets using the Twitter Search application programming interface. We manually reviewed results and, for each age-labeled handle, collected the 200 most recent publicly available tweets and user handles' metadata. The labeled data were split into training and test datasets. We created separate models to examine the predictive validity of language features only, metadata features only, language and metadata features, and words/phrases from another age-validated dataset. We estimated accuracy, precision, recall, and F1 metrics for each model. An L1-regularized logistic regression model was conducted for each age group, and predicted probabilities between the training and test sets were compared for each age group. Cohen's d effect sizes were calculated to examine the relative importance of significant features. Models containing both Tweet language features and metadata features performed the best (74% precision, 74% recall, 74% F1) while the model containing only Twitter metadata features were least accurate (58% precision, 60% recall, and 57% F1 score). Top predictive features included use of terms such as "school" for youth and "college" for young adults. Overall, it was more challenging to predict older adults accurately. These results suggest that examining linguistic and Twitter metadata features to predict youth and young adult Twitter users may be helpful for
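
    The sketch below fits an L1-regularized logistic regression over tweet text with scikit-learn, showing the language-features-only case; the example tweets and labels are invented, and the metadata features used in the study are omitted.

    ```python
    # Sketch: L1-regularised logistic regression over tweet text for age-group
    # classification. Only language features are shown (the study also used
    # metadata); the example tweets and labels are invented.
    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    tweets = [
        "so much homework before school tomorrow",
        "late night in the library, college finals week",
        "team meeting ran long, picking the kids up after work",
        "can't wait for prom with my friends",
        "my dorm roommate made pancakes again",
        "mortgage paperwork and a client call all afternoon",
    ]
    labels = ["youth", "young_adult", "adult", "youth", "young_adult", "adult"]

    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=1),
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    )
    model.fit(tweets, labels)
    print(model.predict(["studying for my college midterm tonight"]))
    ```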

  14. A computational approach to predicting ligand selectivity for the size-based separation of trivalent lanthanides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ivanov, Alexander S.; Bryantsev, Vyacheslav S.

    An accurate description of solvation effects for trivalent lanthanide ions is a main stumbling block to the qualitative prediction of selectivity trends along the lanthanide series. In this work, we propose a simple model to describe the differential effect of solvation in the competitive binding of a ligand by lanthanide ions by including weakly co-ordinated counterions in the complexes of more than a +1 charge. The success of the approach to quantitatively reproduce selectivities obtained from aqueous phase complexation studies demonstrates its potential for the design and screening of new ligands for efficient size-based separation.

  15. A computational approach to predicting ligand selectivity for the size-based separation of trivalent lanthanides

    DOE PAGES

    Ivanov, Alexander S.; Bryantsev, Vyacheslav S.

    2016-06-20

    An accurate description of solvation effects for trivalent lanthanide ions is a main stumbling block to the qualitative prediction of selectivity trends along the lanthanide series. In this work, we propose a simple model to describe the differential effect of solvation in the competitive binding of a ligand by lanthanide ions by including weakly co-ordinated counterions in the complexes of more than a +1 charge. The success of the approach to quantitatively reproduce selectivities obtained from aqueous phase complexation studies demonstrates its potential for the design and screening of new ligands for efficient size-based separation.

  16. Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations.

    PubMed

    Martínez-Romero, Marcos; O'Connor, Martin J; Shankar, Ravi D; Panahiazar, Maryam; Willrett, Debra; Egyedi, Attila L; Gevaert, Olivier; Graybeal, John; Musen, Mark A

    2017-01-01

    In biomedicine, high-quality metadata are crucial for finding experimental datasets, for understanding how experiments were performed, and for reproducing those experiments. Despite the recent focus on metadata, the quality of metadata available in public repositories continues to be extremely poor. A key difficulty is that the typical metadata acquisition process is time-consuming and error prone, with weak or nonexistent support for linking metadata to ontologies. There is a pressing need for methods and tools to speed up the metadata acquisition process and to increase the quality of metadata that are entered. In this paper, we describe a methodology and set of associated tools that we developed to address this challenge. A core component of this approach is a value recommendation framework that uses analysis of previously entered metadata and ontology-based metadata specifications to help users rapidly and accurately enter their metadata. We performed an initial evaluation of this approach using metadata from a public metadata repository.

  17. Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations

    PubMed Central

    Martínez-Romero, Marcos; O’Connor, Martin J.; Shankar, Ravi D.; Panahiazar, Maryam; Willrett, Debra; Egyedi, Attila L.; Gevaert, Olivier; Graybeal, John; Musen, Mark A.

    2017-01-01

    In biomedicine, high-quality metadata are crucial for finding experimental datasets, for understanding how experiments were performed, and for reproducing those experiments. Despite the recent focus on metadata, the quality of metadata available in public repositories continues to be extremely poor. A key difficulty is that the typical metadata acquisition process is time-consuming and error prone, with weak or nonexistent support for linking metadata to ontologies. There is a pressing need for methods and tools to speed up the metadata acquisition process and to increase the quality of metadata that are entered. In this paper, we describe a methodology and set of associated tools that we developed to address this challenge. A core component of this approach is a value recommendation framework that uses analysis of previously entered metadata and ontology-based metadata specifications to help users rapidly and accurately enter their metadata. We performed an initial evaluation of this approach using metadata from a public metadata repository. PMID:29854196

  18. Past, present and prospect of an Artificial Intelligence (AI) based model for sediment transport prediction

    NASA Astrophysics Data System (ADS)

    Afan, Haitham Abdulmohsin; El-shafie, Ahmed; Mohtar, Wan Hanna Melini Wan; Yaseen, Zaher Mundher

    2016-10-01

    An accurate model for sediment prediction is a priority for all hydrological researchers. Many conventional methods have shown an inability to achieve an accurate prediction of suspended sediment. These methods are unable to capture the behaviour of sediment transport in rivers due to the complexity, noise, non-stationarity, and dynamism of the sediment pattern. In the past two decades, Artificial Intelligence (AI) and computational approaches have become a remarkable tool for developing an accurate model. These approaches are considered a powerful tool for solving any non-linear model, as they can deal easily with large amounts of data and sophisticated models. This paper is a review of all AI approaches that have been applied in sediment modelling. The current research focuses on the development of AI applications in sediment transport. In addition, the review identifies major challenges and opportunities for prospective research. Throughout the literature, complementary models have been reported to be superior to classical modelling.

  19. High-accurate optical vector analysis based on optical single-sideband modulation

    NASA Astrophysics Data System (ADS)

    Xue, Min; Pan, Shilong

    2016-11-01

    Most of the effort devoted to the area of optical communications has been on improving optical spectral efficiency. Various innovative optical devices have thus been developed to finely manipulate the optical spectrum. Knowing the spectral responses of these devices, including the magnitude, phase and polarization responses, is of great importance for their fabrication and application. To achieve high-resolution characterization, optical vector analyzers (OVAs) based on optical single-sideband (OSSB) modulation have been proposed and developed. Benefiting from mature and high-resolution microwave technologies, the OSSB-based OVA can potentially achieve sub-Hz resolution. However, the accuracy is restricted by the measurement errors induced by the unwanted first-order sideband and the high-order sidebands in the OSSB signal, since electrical-to-optical conversion and optical-to-electrical conversion are essentially required to achieve high-resolution frequency sweeping and to extract the magnitude and phase information in the electrical domain. Recently, great efforts have been devoted to improving the accuracy of the OSSB-based OVA. In this paper, the influence of the unwanted-sideband-induced measurement errors and techniques for implementing highly accurate OSSB-based OVAs are discussed.

  20. Establishment and Validation of RNA-Based Predictive Models for Understanding Survival of Vibrio parahaemolyticus in Oysters Stored at Low Temperatures

    PubMed Central

    Liao, Chao; Zhao, Yong

    2017-01-01

    ABSTRACT This study developed RNA-based predictive models describing the survival of Vibrio parahaemolyticus in Eastern oysters (Crassostrea virginica) during storage at 0, 4, and 10°C. Postharvested oysters were inoculated with a cocktail of five V. parahaemolyticus strains and were then stored at 0, 4, and 10°C for 21 or 11 days. A real-time reverse transcription-PCR (RT-PCR) assay targeting expression of the tlh gene was used to evaluate the number of surviving V. parahaemolyticus cells, which was then used to establish primary molecular models (MMs). Before construction of the MMs, consistent expression levels of the tlh gene at 0, 4, and 10°C were confirmed, and this gene was used to monitor the survival of the total V. parahaemolyticus cells. In addition, the tdh and trh genes were used for monitoring the survival of virulent V. parahaemolyticus. Traditional models (TMs) were built based on data collected using a plate counting method. From the MMs, V. parahaemolyticus populations had decreased 0.493, 0.362, and 0.238 log10 CFU/g by the end of storage at 0, 4, and 10°C, respectively. Rates of reduction of V. parahaemolyticus shown in the TMs were 2.109, 1.579, and 0.894 log10 CFU/g for storage at 0, 4, and 10°C, respectively. Bacterial inactivation rates (IRs) estimated with the TMs (−0.245, −0.152, and −0.121 log10 CFU/day, respectively) were higher than those estimated with the MMs (−0.134, −0.0887, and −0.0732 log10 CFU/day, respectively) for storage at 0, 4, and 10°C. Higher viable V. parahaemolyticus numbers were predicted using the MMs than using the TMs. On the basis of this study, RNA-based predictive MMs are the more accurate and reliable models and can prevent false-negative results compared to TMs. IMPORTANCE One important method for validating postharvest techniques and for monitoring the behavior of V. parahaemolyticus is to establish predictive models. Unfortunately, previous predictive models established based on plate

  1. Establishment and Validation of RNA-Based Predictive Models for Understanding Survival of Vibrio parahaemolyticus in Oysters Stored at Low Temperatures.

    PubMed

    Liao, Chao; Zhao, Yong; Wang, Luxin

    2017-03-15

    This study developed RNA-based predictive models describing the survival of Vibrio parahaemolyticus in Eastern oysters (Crassostrea virginica) during storage at 0, 4, and 10°C. Postharvested oysters were inoculated with a cocktail of five V. parahaemolyticus strains and were then stored at 0, 4, and 10°C for 21 or 11 days. A real-time reverse transcription-PCR (RT-PCR) assay targeting expression of the tlh gene was used to evaluate the number of surviving V. parahaemolyticus cells, which was then used to establish primary molecular models (MMs). Before construction of the MMs, consistent expression levels of the tlh gene at 0, 4, and 10°C were confirmed, and this gene was used to monitor the survival of the total V. parahaemolyticus cells. In addition, the tdh and trh genes were used for monitoring the survival of virulent V. parahaemolyticus. Traditional models (TMs) were built based on data collected using a plate counting method. From the MMs, V. parahaemolyticus populations had decreased 0.493, 0.362, and 0.238 log10 CFU/g by the end of storage at 0, 4, and 10°C, respectively. Rates of reduction of V. parahaemolyticus shown in the TMs were 2.109, 1.579, and 0.894 log10 CFU/g for storage at 0, 4, and 10°C, respectively. Bacterial inactivation rates (IRs) estimated with the TMs (-0.245, -0.152, and -0.121 log10 CFU/day, respectively) were higher than those estimated with the MMs (-0.134, -0.0887, and -0.0732 log10 CFU/day, respectively) for storage at 0, 4, and 10°C. Higher viable V. parahaemolyticus numbers were predicted using the MMs than using the TMs. On the basis of this study, RNA-based predictive MMs are the more accurate and reliable models and can prevent false-negative results compared to TMs. IMPORTANCE One important method for validating postharvest techniques and for monitoring the behavior of V. parahaemolyticus is to establish predictive models. Unfortunately, previous predictive models established based on plate counting methods or on
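
    A primary log-linear survival model of the kind underlying such predictive models can be fitted as sketched below; the storage times and counts are invented placeholders, not the study's RT-PCR or plate-count data.

    ```python
    # Sketch: fitting a primary log-linear survival model, log10 N(t) = log10 N0 + k*t,
    # to storage data and reading off the inactivation rate k (log10 CFU/day).
    # The counts below are invented placeholders, not the study's measurements.
    import numpy as np

    days = np.array([0, 3, 7, 11, 15, 21], dtype=float)
    log10_counts = np.array([4.20, 3.90, 3.55, 3.10, 2.80, 2.35])   # hypothetical log10 CFU/g

    k, log10_n0 = np.polyfit(days, log10_counts, deg=1)   # slope = inactivation rate
    print(f"inactivation rate k = {k:.3f} log10 CFU/day, initial level = {log10_n0:.2f} log10 CFU/g")

    def predict_log10_count(t_days: float) -> float:
        """Predicted surviving population (log10 CFU/g) after t_days of storage."""
        return log10_n0 + k * t_days

    print(predict_log10_count(10.0))
    ```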

  2. Novel serologic biomarkers provide accurate estimates of recent Plasmodium falciparum exposure for individuals and communities

    PubMed Central

    Helb, Danica A.; Tetteh, Kevin K. A.; Felgner, Philip L.; Skinner, Jeff; Hubbard, Alan; Arinaitwe, Emmanuel; Mayanja-Kizza, Harriet; Ssewanyana, Isaac; Kamya, Moses R.; Beeson, James G.; Tappero, Jordan; Smith, David L.; Crompton, Peter D.; Rosenthal, Philip J.; Dorsey, Grant; Drakeley, Christopher J.; Greenhouse, Bryan

    2015-01-01

    Tools to reliably measure Plasmodium falciparum (Pf) exposure in individuals and communities are needed to guide and evaluate malaria control interventions. Serologic assays can potentially produce precise exposure estimates at low cost; however, current approaches based on responses to a few characterized antigens are not designed to estimate exposure in individuals. Pf-specific antibody responses differ by antigen, suggesting that selection of antigens with defined kinetic profiles will improve estimates of Pf exposure. To identify novel serologic biomarkers of malaria exposure, we evaluated responses to 856 Pf antigens by protein microarray in 186 Ugandan children, for whom detailed Pf exposure data were available. Using data-adaptive statistical methods, we identified combinations of antibody responses that maximized information on an individual’s recent exposure. Responses to three novel Pf antigens accurately classified whether an individual had been infected within the last 30, 90, or 365 d (cross-validated area under the curve = 0.86–0.93), whereas responses to six antigens accurately estimated an individual’s malaria incidence in the prior year. Cross-validated incidence predictions for individuals in different communities provided accurate stratification of exposure between populations and suggest that precise estimates of community exposure can be obtained from sampling a small subset of that community. In addition, serologic incidence predictions from cross-sectional samples characterized heterogeneity within a community similarly to 1 y of continuous passive surveillance. Development of simple ELISA-based assays derived from the successful selection strategy outlined here offers the potential to generate rich epidemiologic surveillance data that will be widely accessible to malaria control programs. PMID:26216993

  3. Novel serologic biomarkers provide accurate estimates of recent Plasmodium falciparum exposure for individuals and communities.

    PubMed

    Helb, Danica A; Tetteh, Kevin K A; Felgner, Philip L; Skinner, Jeff; Hubbard, Alan; Arinaitwe, Emmanuel; Mayanja-Kizza, Harriet; Ssewanyana, Isaac; Kamya, Moses R; Beeson, James G; Tappero, Jordan; Smith, David L; Crompton, Peter D; Rosenthal, Philip J; Dorsey, Grant; Drakeley, Christopher J; Greenhouse, Bryan

    2015-08-11

    Tools to reliably measure Plasmodium falciparum (Pf) exposure in individuals and communities are needed to guide and evaluate malaria control interventions. Serologic assays can potentially produce precise exposure estimates at low cost; however, current approaches based on responses to a few characterized antigens are not designed to estimate exposure in individuals. Pf-specific antibody responses differ by antigen, suggesting that selection of antigens with defined kinetic profiles will improve estimates of Pf exposure. To identify novel serologic biomarkers of malaria exposure, we evaluated responses to 856 Pf antigens by protein microarray in 186 Ugandan children, for whom detailed Pf exposure data were available. Using data-adaptive statistical methods, we identified combinations of antibody responses that maximized information on an individual's recent exposure. Responses to three novel Pf antigens accurately classified whether an individual had been infected within the last 30, 90, or 365 d (cross-validated area under the curve = 0.86-0.93), whereas responses to six antigens accurately estimated an individual's malaria incidence in the prior year. Cross-validated incidence predictions for individuals in different communities provided accurate stratification of exposure between populations and suggest that precise estimates of community exposure can be obtained from sampling a small subset of that community. In addition, serologic incidence predictions from cross-sectional samples characterized heterogeneity within a community similarly to 1 y of continuous passive surveillance. Development of simple ELISA-based assays derived from the successful selection strategy outlined here offers the potential to generate rich epidemiologic surveillance data that will be widely accessible to malaria control programs.

  4. When high working memory capacity is and is not beneficial for predicting nonlinear processes.

    PubMed

    Fischer, Helen; Holt, Daniel V

    2017-04-01

    Predicting the development of dynamic processes is vital in many areas of life. Previous findings are inconclusive as to whether higher working memory capacity (WMC) is always associated with using more accurate prediction strategies, or whether higher WMC can also be associated with using overly complex strategies that do not improve accuracy. In this study, participants predicted a range of systematically varied nonlinear processes based on exponential functions where prediction accuracy could or could not be enhanced using well-calibrated rules. Results indicate that higher WMC participants seem to rely more on well-calibrated strategies, leading to more accurate predictions for processes with highly nonlinear trajectories in the prediction region. Predictions of lower WMC participants, in contrast, point toward an increased use of simple exemplar-based prediction strategies, which perform just as well as more complex strategies when the prediction region is approximately linear. These results imply that with respect to predicting dynamic processes, working memory capacity limits are not generally a strength or a weakness, but that this depends on the process to be predicted.

  5. Shedding light on the variability of optical skin properties: finding a path towards more accurate prediction of light propagation in human cutaneous compartments

    PubMed Central

    Mignon, C.; Tobin, D. J.; Zeitouny, M.; Uzunbajakava, N. E.

    2018-01-01

    Finding a path towards a more accurate prediction of light propagation in human skin remains an aspiration of biomedical scientists working on cutaneous applications for both diagnostic and therapeutic reasons. The objective of this study was to investigate the variability of the optical properties of human skin compartments reported in the literature, to explore the underlying rationale of this variability, and to propose a dataset of values to better represent an in vivo case and recommend a solution towards a more accurate prediction of light propagation through cutaneous compartments. To achieve this, we undertook a novel, logical yet simple approach. We first reviewed scientific articles published between 1981 and 2013 that reported on skin optical properties, to reveal the spread in the reported quantitative values. We found variations of up to 100-fold. Then we extracted the most trustworthy datasets, guided by a rule that the spectral properties should reflect the specific biochemical composition of each of the skin layers. This resulted in the narrowing of the spread in the calculated photon densities to 6-fold. We conclude with a recommendation to use the identified most robust datasets when estimating light propagation in human skin using Monte Carlo simulations. Otherwise, follow our proposed strategy to screen any new datasets to determine their biological relevance. PMID:29552418

  6. A sparse autoencoder-based deep neural network for protein solvent accessibility and contact number prediction.

    PubMed

    Deng, Lei; Fan, Chao; Zeng, Zhiwen

    2017-12-28

    Direct prediction of the three-dimensional (3D) structures of proteins from one-dimensional (1D) sequences is a challenging problem. Significant structural characteristics such as solvent accessibility and contact number are essential for deriving restraints in modeling protein folding and protein 3D structure. Thus, accurately predicting these features is a critical step for 3D protein structure building. In this study, we present DeepSacon, a computational method that can effectively predict protein solvent accessibility and contact number by using a deep neural network built on a stacked autoencoder and a dropout method. The results demonstrate that our proposed DeepSacon achieves a significant improvement in prediction quality compared with the state-of-the-art methods. We obtain 0.70 three-state accuracy for solvent accessibility, 0.33 15-state accuracy and 0.74 Pearson Correlation Coefficient (PCC) for the contact number on the 5729 monomeric soluble globular protein dataset. We also evaluate the performance on the CASP11 benchmark dataset, where DeepSacon achieves 0.68 three-state accuracy and 0.69 PCC for solvent accessibility and contact number, respectively. We have shown that DeepSacon can reliably predict solvent accessibility and contact number with a stacked sparse autoencoder and a dropout approach.
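
    The sketch below shows a minimal dropout-regularized feed-forward classifier for three-state solvent accessibility using tensorflow.keras; it is not the DeepSacon architecture (no stacked sparse autoencoder pre-training), and the per-residue features and labels are synthetic.

    ```python
    # Sketch: a minimal dropout-regularised feed-forward classifier for three-state
    # solvent accessibility. This is NOT the DeepSacon architecture (which pre-trains
    # with stacked sparse autoencoders); inputs here are synthetic per-residue features.
    import numpy as np
    from tensorflow.keras import layers, models

    rng = np.random.default_rng(5)
    X = rng.normal(size=(2000, 60)).astype("float32")   # hypothetical window features per residue
    y = rng.integers(0, 3, size=2000)                   # 0=buried, 1=intermediate, 2=exposed

    model = models.Sequential([
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(3, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=0)
    print(model.evaluate(X, y, verbose=0))
    ```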

  7. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    PubMed

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function which results in low accuracy of sequence homology based methods. Therefore, there is a need for developing alternative functional prediction methods irrespective of sequence similarity. To identify LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network were compared in this study. Comprehensive protein features and various techniques were employed to create datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in identification of LBPs from non-LBPs and 92.06% (CV) and 92.90% (IE) (in average) for classification of different LBPs classes. Increasing the number and the range of extracted protein features as well as optimization of the SVM parameters significantly increased the efficiency of LBPs class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool in detection of LBPs classes. The proposed approach has the potential to integrate and improve the common sequence alignment based methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
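
    A compact sketch of SVM classification with nested cross-validation, in the spirit of the comparison above, is given below using scikit-learn; the protein feature vectors, labels and hyperparameter grid are placeholders.

    ```python
    # Sketch: SVM classification of lipid binding proteins (LBPs) vs non-LBPs from
    # sequence-derived features, with 5-fold cross-validation and a small grid search
    # over the RBF hyperparameters. Feature vectors and labels are synthetic.
    import numpy as np
    from sklearn.model_selection import GridSearchCV, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(6)
    X = rng.normal(size=(400, 30))              # hypothetical protein feature vectors
    y = rng.integers(0, 2, size=400)            # 1 = LBP, 0 = non-LBP (placeholder labels)

    pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    grid = GridSearchCV(
        pipeline,
        param_grid={"svc__C": [1, 10, 100], "svc__gamma": ["scale", 0.01, 0.1]},
        cv=5,
    )
    scores = cross_val_score(grid, X, y, cv=5)   # nested 5-fold estimate of accuracy
    print(f"mean CV accuracy = {scores.mean():.2%}")
    ```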

  8. Predicting moisture content: fuel moisture indicator sticks in the Pacific Northwest.

    Treesearch

    Owen P. Cramer

    1961-01-01

    Successful day-to-day planning of presuppression activities requires accurate prediction of burning index. In the Pacific Northwest, forecasts of burning index are prepared by the fire-control man and are based on predictions of windspeed and fuel moisture. Although fuel moisture is affected by a number of weather elements and is consequently difficult to predict, the...

  9. Risk prediction model: Statistical and artificial neural network approach

    NASA Astrophysics Data System (ADS)

    Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim

    2017-04-01

    Prediction models are gaining popularity and have been used in numerous areas of study to complement and support clinical reasoning and decision making. The adoption of such models assists physicians' decision making and individuals' behavior, and consequently improves individual outcomes and the cost-effectiveness of care. The objective of this paper is to review articles related to risk prediction models in order to understand suitable approaches and the development and validation process of such models. A qualitative review of the aims, methods, and main outcomes of nineteen published articles that developed risk prediction models in numerous fields was carried out. This paper also reviews how researchers develop and validate risk prediction models based on statistical and artificial neural network approaches. From the review, some methodological recommendations for developing and validating prediction models are highlighted. According to the studies reviewed, the artificial neural network approach to developing prediction models was more accurate than the statistical approach. However, only limited published literature currently discusses which approach is more accurate for risk prediction model development.
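
    The kind of head-to-head comparison discussed above can be set up as sketched below, contrasting a logistic regression with a small multi-layer perceptron under cross-validated AUC on synthetic data; neither model nor dataset corresponds to any specific study reviewed.

    ```python
    # Sketch: comparing a statistical model (logistic regression) with an artificial
    # neural network (multi-layer perceptron) for a binary risk outcome, using
    # cross-validated AUC on synthetic data.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=1000, n_features=15, n_informative=6, random_state=0)

    statistical = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    neural_net = make_pipeline(StandardScaler(), MLPClassifier(hidden_layer_sizes=(32, 16),
                                                               max_iter=1000, random_state=0))

    for name, model in [("logistic regression", statistical), ("neural network", neural_net)]:
        auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
        print(f"{name}: mean AUC = {auc.mean():.3f}")
    ```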

  10. PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes.

    PubMed

    Yu, Nancy Y; Wagner, James R; Laird, Matthew R; Melli, Gabor; Rey, Sébastien; Lo, Raymond; Dao, Phuong; Sahinalp, S Cenk; Ester, Martin; Foster, Leonard J; Brinkman, Fiona S L

    2010-07-01

    PSORTb has remained the most precise bacterial protein subcellular localization (SCL) predictor since it was first made available in 2003. However, the recall needs to be improved and no accurate SCL predictors yet make predictions for archaea, nor differentiate important localization subcategories, such as proteins targeted to a host cell or bacterial hyperstructures/organelles. Such improvements should preferably be encompassed in a freely available web-based predictor that can also be used as a standalone program. We developed PSORTb version 3.0 with improved recall, higher proteome-scale prediction coverage, and new refined localization subcategories. It is the first SCL predictor specifically geared for all prokaryotes, including archaea and bacteria with atypical membrane/cell wall topologies. It features an improved standalone program, with a new batch results delivery system complementing its web interface. We evaluated the most accurate SCL predictors using 5-fold cross-validation and an independent proteomics analysis, showing that PSORTb 3.0 is the most accurate but can benefit from being complemented by Proteome Analyst predictions. http://www.psort.org/psortb (download open source software or use the web interface). psort-mail@sfu.ca Supplementary data are available at Bioinformatics online.

  11. gCUP: rapid GPU-based HIV-1 co-receptor usage prediction for next-generation sequencing.

    PubMed

    Olejnik, Michael; Steuwer, Michel; Gorlatch, Sergei; Heider, Dominik

    2014-11-15

    Next-generation sequencing (NGS) has great potential in HIV diagnostics, and genotypic prediction models have been developed and successfully tested in recent years. However, although highly accurate, these computational models lack the computational efficiency needed to reach their full potential. In this study, we demonstrate the use of graphics processing units (GPUs) in combination with a computational prediction model for HIV tropism. Our new model, named gCUP, parallelized and optimized for the GPU, is highly accurate and can classify >175 000 sequences per second on an NVIDIA GeForce GTX 460. The computational efficiency of our new model is the next step in enabling NGS technologies to reach clinical significance in HIV diagnostics. Moreover, our approach is not limited to HIV tropism prediction but can also be easily adapted to other settings, e.g. drug resistance prediction. The source code can be downloaded at http://www.heiderlab.de. d.heider@wz-straubing.de. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
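
    The sketch below conveys the general idea of GPU-batched scoring of many encoded sequences with a fixed model; it is not the gCUP implementation, and the encoding, weights, and threshold are invented placeholders.

      # Hypothetical GPU-batched scoring of encoded sequences (requires PyTorch);
      # the feature encoding, weights, and 0.5 threshold are placeholders.
      import torch

      device = "cuda" if torch.cuda.is_available() else "cpu"

      n_seq, n_feat = 175_000, 105                 # e.g., 35 alignment positions x 3 hypothetical descriptors
      X = torch.rand(n_seq, n_feat, device=device) # placeholder encoded V3 sequences
      w = torch.rand(n_feat, device=device)        # placeholder model weights
      b = torch.tensor(-0.5, device=device)

      scores = torch.sigmoid(X @ w + b)            # one parallel pass over all sequences
      x4 = (scores > 0.5).sum().item()             # count sequences called CXCR4-using
      print(f"{x4} of {n_seq} sequences predicted X4 (illustrative threshold)")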

  12. Helicopter noise prediction - The current status and future direction

    NASA Technical Reports Server (NTRS)

    Brentner, Kenneth S.; Farassat, F.

    1992-01-01

    The paper takes stock of progress, assesses current prediction capabilities, and forecasts the direction of future helicopter noise prediction research. The acoustic analogy approach, specifically theories based on the Ffowcs Williams-Hawkings equation, is the most widely used for deterministic noise sources. Thickness and loading noise can be routinely predicted given good blade motion and blade loading inputs. Blade-vortex interaction (BVI) noise can also be predicted well with measured input data, but predicting airloads with the high spatial and temporal resolution required for BVI is still difficult. Current semiempirical broadband noise predictions are useful and reasonably accurate. New prediction methods based on a Kirchhoff formula and on direct computation appear very promising but are currently very demanding computationally.

  13. Prediction of Slot Shape and Slot Size for Improving the Performance of Microstrip Antennas Using Knowledge-Based Neural Networks

    PubMed Central

    De, Asok

    2014-01-01

    Over the last decade, artificial neural networks have become very popular techniques for computing various performance parameters of microstrip antennas. The proposed work presents a knowledge-based neural network model for predicting the appropriate shape and accurate size of the slot introduced on the radiating patch to achieve desired levels of resonance, gain, directivity, antenna efficiency, and radiation efficiency for dual-frequency operation. By incorporating prior knowledge into the neural model, the number of required training patterns is drastically reduced. Further, the neural model incorporating prior knowledge can be used to predict responses in the extrapolation region beyond the range of the training patterns. For validation, a prototype was also fabricated and its performance parameters measured. Very good agreement is attained between measured, simulated, and predicted results. PMID:27382616
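
    The sketch below illustrates one common way prior knowledge is embedded in such models: a small network learns only the residual between a coarse closed-form estimate and the target response, so far fewer training patterns are needed. The coarse model, data, and network size are invented placeholders, not the authors' antenna formulation.

      # Hypothetical "prior knowledge" network: learn only the residual between a
      # coarse analytical estimate and the target response (all values synthetic).
      import numpy as np
      from sklearn.neural_network import MLPRegressor

      def coarse_model(slot_length_mm):
          # invented closed-form estimate of resonant frequency (GHz)
          return 30.0 / (2.0 * slot_length_mm)

      rng = np.random.default_rng(0)
      L = rng.uniform(5.0, 15.0, size=40).reshape(-1, 1)                    # slot length (mm)
      f_meas = coarse_model(L.ravel()) * (1.0 + 0.05 * np.sin(L.ravel()))   # synthetic "measured" response

      residual = f_meas - coarse_model(L.ravel())
      net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0).fit(L, residual)

      L_new = np.array([[12.5]])
      f_pred = coarse_model(L_new.ravel()) + net.predict(L_new)             # coarse estimate + learned correction
      print(f"predicted resonant frequency at 12.5 mm: {f_pred[0]:.3f} GHz")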

  14. Hounsfield unit density accurately predicts ESWL success.

    PubMed

    Magnuson, William J; Tomera, Kevin M; Lance, Raymond S

    2005-01-01

    Extracorporeal shockwave lithotripsy (ESWL) is a commonly used non-invasive treatment for urolithiasis. Helical CT provides much better and more detailed imaging of the patient with urolithiasis, including the ability to measure the density of urinary stones. In this study we tested the hypothesis that the density of urinary calculi as measured by CT can predict successful ESWL treatment. A total of 198 patients were treated with ESWL at Alaska Urological Associates between January 2002 and April 2004. Of these, 101 met the study inclusion criteria, with accessible CT scans and stones ranging from 5 to 15 mm. Follow-up imaging demonstrated stone freedom in 74.2%. The mean Hounsfield density values for the stone-free and residual-stone groups were significantly different (93.61 vs. 122.80, p < 0.0001). By receiver operating characteristic (ROC) analysis, we determined that a Hounsfield density value of 93 or less carries a 90% or better chance of stone freedom following ESWL for upper tract calculi between 5 and 15 mm.
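
    The sketch below illustrates the kind of cutoff analysis described above: computing the stone-free rate among patients whose Hounsfield density value falls at or below candidate thresholds. The values are simulated placeholders, not the study data.

      # Hypothetical cutoff analysis: stone-free rate among patients at or below
      # candidate Hounsfield density values (simulated data).
      import numpy as np

      rng = np.random.default_rng(0)
      hu = np.concatenate([rng.normal(94, 20, 75),     # stone-free group (placeholder HU values)
                           rng.normal(123, 25, 26)])   # residual-stone group
      residual = np.concatenate([np.zeros(75), np.ones(26)])   # 1 = residual stone after ESWL

      for cutoff in (85, 93, 100, 110):
          below = hu <= cutoff
          stone_free_rate = 1.0 - residual[below].mean()
          print(f"HU <= {cutoff}: stone-free rate {stone_free_rate:.0%} (n={below.sum()})")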

  15. BPP: a sequence-based algorithm for branch point prediction.

    PubMed

    Zhang, Qing; Fan, Xiaodan; Wang, Yejun; Sun, Ming-An; Shao, Jianlin; Guo, Dianjing

    2017-10-15

    Although high-throughput sequencing methods have been proposed to identify splicing branch points in the human genome, these methods can detect only a small fraction of branch points, limited by sequencing depth, experimental cost and the expression level of the mRNA. An accurate computational model for branch point prediction therefore remains an ongoing objective in human genome research. We propose a novel branch point prediction algorithm that utilizes information on the branch point sequence and the polypyrimidine tract. Using experimentally validated data, we demonstrate that the proposed method outperforms existing methods. Availability and implementation: https://github.com/zhqingit/BPP. djguo@cuhk.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
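
    The sketch below illustrates the two signals the algorithm combines, a branch-point sequence motif and the downstream polypyrimidine tract; it is not the BPP scoring function (whose parameters are in the repository above), and the position weight matrix, window sizes, and weighting are invented for illustration.

      # Hypothetical scoring of candidate branch points: a position weight matrix
      # over a 7-mer window plus downstream pyrimidine content (not the BPP model).
      import numpy as np

      BASES = "ACGT"
      pwm = np.full((4, 7), 0.25)
      pwm[:, 4] = [0.97, 0.01, 0.01, 0.01]   # strong preference for A at the branch-point position

      def score_candidate(intron, pos):
          """Score a putative branch point at 0-based position `pos` in an intron."""
          window = intron[pos - 4: pos + 3]
          if len(window) != 7:
              return float("-inf")
          motif = sum(np.log2(pwm[BASES.index(b), i] / 0.25) for i, b in enumerate(window))
          tract = intron[pos + 3: pos + 23]                     # downstream polypyrimidine tract
          ppt = sum(b in "CT" for b in tract) / max(len(tract), 1)
          return motif + 2.0 * ppt                              # arbitrary weighting, for illustration only

      intron = "GTAAGTCTCTCTGACTAACTTTTTTTCTCTCTTTTCCCCAG"
      best = max(range(4, len(intron) - 3), key=lambda p: score_candidate(intron, p))
      print(f"best candidate branch point: position {best}, base {intron[best]}")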

  16. Assessment of MRI-Based Automated Fetal Cerebral Cortical Folding Measures in Prediction of Gestational Age in the Third Trimester.

    PubMed

    Wu, J; Awate, S P; Licht, D J; Clouchoux, C; du Plessis, A J; Avants, B B; Vossough, A; Gee, J C; Limperopoulos, C

    2015-07-01

    Traditional methods of dating a pregnancy based on history or sonographic assessment have a large variation in the third trimester. We aimed to assess the ability of various quantitative measures of brain cortical folding on MR imaging in determining fetal gestational age in the third trimester. We evaluated 8 different quantitative cortical folding measures to predict gestational age in 33 healthy fetuses by using T2-weighted fetal MR imaging. We compared the accuracy of the prediction of gestational age by these cortical folding measures with the accuracy of prediction by brain volume measurement and by a previously reported semiquantitative visual scale of brain maturity. Regression models were constructed, and measurement biases and variances were determined via a cross-validation procedure. The cortical folding measures are accurate in the estimation and prediction of gestational age (mean of the absolute error, 0.43 ± 0.45 weeks) and perform better (P = .024) than brain volume (mean of the absolute error, 0.72 ± 0.61 weeks) or sonography measures (SDs of approximately 1.5 weeks, as reported in the literature). Prediction accuracy is comparable with that of the semiquantitative visual assessment score (mean, 0.57 ± 0.41 weeks). Quantitative cortical folding measures such as global average curvedness can be an accurate and reliable estimator of gestational age and brain maturity for healthy fetuses in the third trimester and have the potential to be an indicator of brain-growth delays for at-risk fetuses and preterm neonates. © 2015 by American Journal of Neuroradiology.
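
    A schematic of the cross-validated regression used to judge how well a folding measure predicts gestational age is sketched below; the feature standing in for global average curvedness is simulated, not the study's MRI measurements.

      # Hypothetical leave-one-out regression of gestational age on a simulated
      # cortical folding measure, reporting mean absolute error in weeks.
      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import LeaveOneOut, cross_val_predict

      rng = np.random.default_rng(0)
      ga_weeks = rng.uniform(28, 38, 33)                       # gestational ages of 33 fetuses
      curvedness = 0.05 * ga_weeks + rng.normal(0, 0.03, 33)   # stand-in for global average curvedness

      pred = cross_val_predict(LinearRegression(), curvedness.reshape(-1, 1), ga_weeks, cv=LeaveOneOut())
      print(f"cross-validated mean absolute error: {np.abs(pred - ga_weeks).mean():.2f} weeks")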

  17. Ability of commercially available dairy ration programs to predict duodenal flows of protein and essential amino acids in dairy cows.

    PubMed

    Pacheco, D; Patton, R A; Parys, C; Lapierre, H

    2012-02-01

    The objective of this analysis was to compare the rumen submodel predictions of 4 commonly used dairy ration programs to observed values of duodenal flows of crude protein (CP), protein fractions, and essential AA (EAA). The literature was searched and 40 studies, including 154 diets, were used to compare observed values with those predicted by AminoCow (AC), Agricultural Modeling and Training Systems (AMTS), Cornell-Penn-Miner (CPM), and National Research Council 2001 (NRC) models. The models were evaluated based on their ability to predict the mean, their root mean square prediction error (RMSPE), error bias, and adequacy of regression equations for each protein fraction. The models predicted the mean duodenal CP flow within 5%, with more than 90% of the variation due to random disturbance. The models also predicted within 5% the mean microbial CP flow except CPM, which overestimated it by 27%. Only NRC, however, predicted mean rumen-undegraded protein (RUP) flows within 5%, whereas AC and AMTS underpredicted it by 8 to 9% and CPM by 24%. Regarding duodenal flows of individual AA, across all diets, CPM predicted substantially greater (>10%) mean flows of Arg, His, Ile, Met, and Lys; AMTS predicted greater flow for Arg and Met, whereas AC and NRC estimations were, on average, within 10% of observed values. Overpredictions by the CPM model were mainly related to mean bias, whereas the NRC model had the highest proportion of bias in random disturbance for flows of EAA. Models tended to predict mean flows of EAA more accurately on corn silage and alfalfa diets than on grass-based diets, more accurately on corn grain-based diets than on non-corn-based diets, and finally more accurately in the mid range of diet types. The 4 models were accurate at predicting mean dry matter intake. The AC, AMTS, and NRC models were all sufficiently accurate to be used for balancing EAA in dairy rations under field conditions. Copyright © 2012 American Dairy Science Association
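
    The sketch below shows the kind of evaluation described above: root mean square prediction error (RMSPE) together with its exact decomposition into mean bias, slope bias, and random disturbance; the observed and predicted flows are made-up numbers, not the study data.

      # Hypothetical RMSPE calculation with the usual decomposition into mean bias,
      # slope bias, and random disturbance (made-up observed/predicted flows).
      import numpy as np

      observed = np.array([2.8, 3.1, 2.5, 3.6, 3.0, 2.7, 3.3, 2.9])    # e.g., duodenal CP flow, kg/d
      predicted = np.array([2.9, 3.4, 2.4, 3.9, 3.2, 2.6, 3.6, 3.1])

      mspe = np.mean((predicted - observed) ** 2)
      s_p, s_o = predicted.var(), observed.var()            # population variances
      cov = np.cov(predicted, observed, bias=True)[0, 1]
      slope = cov / s_p                                     # slope of observed regressed on predicted
      r2 = cov ** 2 / (s_p * s_o)

      parts = {"mean bias": (predicted.mean() - observed.mean()) ** 2,
               "slope bias": s_p * (1 - slope) ** 2,
               "random disturbance": s_o * (1 - r2)}

      print(f"RMSPE = {np.sqrt(mspe):.3f}")
      for name, value in parts.items():
          print(f"  {name}: {100 * value / mspe:.1f}% of MSPE")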

  18. Comparison of integrated clustering methods for accurate and stable prediction of building energy consumption data

    DOE PAGES

    Hsu, David

    2015-09-27

    Clustering methods are often used to model energy consumption for two reasons. First, clustering is often used to process data and to improve the predictive accuracy of subsequent energy models. Second, stable clusters that are reproducible with respect to non-essential changes can be used to group, target, and interpret observed subjects. However, it is well known that clustering methods are highly sensitive to the choice of algorithms and variables. This can lead to misleading assessments of predictive accuracy and misinterpretation of clusters in policymaking. This paper therefore introduces two methods to the modeling of energy consumption in buildings: clusterwise regression, also known as latent class regression, which integrates clustering and regression simultaneously; and cluster validation methods to measure stability. Using a large dataset of multifamily buildings in New York City, clusterwise regression is compared to common two-stage algorithms that use K-means and model-based clustering with linear regression. Predictive accuracy is evaluated using 20-fold cross validation, and the stability of the perturbed clusters is measured using the Jaccard coefficient. These results show that there seems to be an inherent tradeoff between prediction accuracy and cluster stability. This paper concludes by discussing which clustering methods may be appropriate for different analytical purposes.
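
    The sketch below implements the two-stage baseline the paper compares against, K-means clustering followed by a separate linear regression within each cluster; the building features and energy use are simulated stand-ins, and clusterwise (latent class) regression itself would require an EM-style joint fit not shown here.

      # Hypothetical two-stage model: K-means on building features, then a separate
      # linear regression per cluster (simulated data, not the NYC dataset).
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      X = rng.normal(size=(500, 4))                                     # e.g., floor area, units, age, fuel mix
      group = (X[:, 0] > 0).astype(int)
      y = 3.0 * X[:, 1] * (group + 1) + rng.normal(0, 0.5, 500)         # group-specific response

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

      km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tr)
      models = {c: LinearRegression().fit(X_tr[km.labels_ == c], y_tr[km.labels_ == c])
                for c in range(2)}

      clusters_te = km.predict(X_te)
      pred = np.array([models[c].predict(x.reshape(1, -1))[0] for c, x in zip(clusters_te, X_te)])
      print(f"two-stage K-means + regression, test RMSE: {np.sqrt(np.mean((pred - y_te) ** 2)):.3f}")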

  19. Can We Predict Patient Wait Time?

    PubMed

    Pianykh, Oleg S; Rosenthal, Daniel I

    2015-10-01

    The importance of patient wait-time management and predictability can hardly be overestimated: For most hospitals, it is the patient queues that drive and define every bit of clinical workflow. The objective of this work was to study the predictability of patient wait time and identify its most influential predictors. To solve this problem, we developed a comprehensive list of 25 wait-related parameters, suggested in earlier work and observed in our own experiments. All parameters were chosen as derivable from a typical Hospital Information System dataset. The parameters were fed into several time-predicting models, and the best parameter subsets, discovered through exhaustive model search, were applied to a large sample of actual patient wait data. We were able to discover the most efficient wait-time prediction factors and models, such as the line-size models introduced in this work. Moreover, these models proved to be equally accurate and computationally efficient. Finally, the selected models were implemented in our patient waiting areas, displaying predicted wait times on the monitors located at the front desks. The limitations of these models are also discussed. Optimal regression models based on wait-line sizes can provide accurate and efficient predictions for patient wait time. Copyright © 2015 American College of Radiology. Published by Elsevier Inc. All rights reserved.
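
    A toy version of a line-size model is sketched below: predicting a patient's wait from the number of patients already in the queue at check-in. The queue data are simulated, and the real models also drew on other parameters derived from the Hospital Information System.

      # Hypothetical line-size model: wait time regressed on queue length at check-in
      # (simulated data).
      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(0)
      line_size = rng.integers(0, 15, 300)                        # patients ahead at check-in
      wait_min = 5 + 6.5 * line_size + rng.normal(0, 4, 300)      # synthetic observed waits (minutes)

      model = LinearRegression().fit(line_size.reshape(-1, 1), wait_min)
      print(f"predicted wait with 8 patients ahead: {model.predict([[8]])[0]:.0f} min")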

  20. FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately.

    PubMed

    Budowski-Tal, Inbal; Nov, Yuval; Kolodny, Rachel

    2010-02-23

    Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental to structure and function prediction. We present FragBag, an ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. We then succinctly represent the protein as a "bag of fragments", a vector that counts the number of occurrences of each fragment, and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index, for implementing a fast structural search engine over the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when only parts of the protein have reliable predictions. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state-of-the-art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: the SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted structural aligners STRUCTAL and CE.
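
    The sketch below is a bare-bones illustration of the bag-of-fragments idea: count how often each library fragment occurs along the backbone and compare two structures by the similarity of their count vectors. Fragment assignment is faked with random labels here; a real implementation would assign each overlapping backbone segment to its best-fitting library fragment by RMSD.

      # Hypothetical bag-of-fragments comparison: fragment-count vectors compared
      # by cosine similarity (fragment assignments are random placeholders).
      import numpy as np

      def bag_of_fragments(fragment_labels, library_size):
          """Count how often each library fragment occurs in one structure."""
          return np.bincount(fragment_labels, minlength=library_size)

      def cosine_similarity(u, v):
          return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

      library_size = 400                               # e.g., a 400-fragment backbone library
      rng = np.random.default_rng(0)
      labels_a = rng.integers(0, library_size, 250)    # one label per overlapping backbone segment
      labels_b = rng.integers(0, library_size, 260)

      vec_a = bag_of_fragments(labels_a, library_size)
      vec_b = bag_of_fragments(labels_b, library_size)
      print(f"FragBag-style similarity: {cosine_similarity(vec_a, vec_b):.3f}")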