Valls, Joan; Castellà, Gerard; Dyba, Tadeusz; Clèries, Ramon
2015-06-01
Predicting the future burden of cancer is a key issue for health services planning, in which selecting the predictive model and the prediction base is a challenge. A method, named here Goodness-of-Fit optimal (GoF-optimal), is presented to determine the minimum prediction base of historical data needed to perform 5-year predictions of the number of new cancer cases or deaths. An empirical ex-post evaluation exercise was performed for cancer mortality data in Spain and cancer incidence data in Finland using simple linear and log-linear Poisson models. Prediction bases were considered within the time periods 1951-2006 in Spain and 1975-2007 in Finland, and predictions were then made for 37 and 33 single years in these periods, respectively. The performance of three different fixed prediction bases (last 5, 10, and 20 years of historical data) was compared to that of the prediction base determined by the GoF-optimal method. The coverage (COV) of the 95% prediction interval and the discrepancy ratio (DR) were calculated to assess the success of the prediction. The results showed that (i) models using the prediction base selected through the GoF-optimal method reached the highest COV and the lowest DR, and (ii) the best alternative to the GoF-optimal strategy was the one using the 5-year prediction base. The GoF-optimal approach can be used as a selection criterion to find an adequate prediction base. Copyright © 2015 Elsevier Ltd. All rights reserved.
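A minimal sketch of the idea behind a GoF-optimal base selection (not the authors' implementation; the candidate base lengths, the least-squares fit of a log-linear trend, and R² as the goodness-of-fit score are illustrative assumptions):

```python
import numpy as np

def gof_optimal_base(years, counts, candidate_bases=(5, 10, 20)):
    """Pick the prediction base (number of trailing years of history)
    whose log-linear trend fit has the best goodness of fit (highest R^2)."""
    best = None
    for b in candidate_bases:
        y = np.log(counts[-b:])
        x = years[-b:]
        slope, intercept = np.polyfit(x, y, 1)
        resid = y - (slope * x + intercept)
        r2 = 1 - resid.var() / y.var()
        if best is None or r2 > best[1]:
            best = (b, r2, slope, intercept)
    return best  # (base_length, r2, slope, intercept)

def predict(years_ahead, last_year, slope, intercept):
    """Extrapolate the fitted log-linear trend for 5-year-style predictions."""
    future = last_year + np.arange(1, years_ahead + 1)
    return np.exp(slope * future + intercept)

# Synthetic annual case counts with a 2%/year exponential trend
years = np.arange(1990, 2007)
counts = np.exp(0.02 * (years - 1990)) * 1000
b, r2, slope, intercept = gof_optimal_base(years, counts)
pred = predict(5, years[-1], slope, intercept)
```

In the real method the competing fits would be simple linear as well as log-linear Poisson models, and the selected base would then feed the 5-year prediction whose COV and DR are evaluated.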
2013-10-01
Are the PBL Prophets Using Science or Alchemy to Create Life-Cycle Affordability? Using Theory to Predict the Efficacy of Performance Based Logistics. Defense ARJ, October 2013, Vol. 20 No. 3: 325–348
Accuracy of Predicted Genomic Breeding Values in Purebred and Crossbred Pigs.
Hidalgo, André M; Bastiaansen, John W M; Lopes, Marcos S; Harlizius, Barbara; Groenen, Martien A M; de Koning, Dirk-Jan
2015-05-26
Genomic selection has been widely implemented in dairy cattle breeding when the aim is to improve performance of purebred animals. In pigs, however, the final product is a crossbred animal. This may affect the efficiency of methods that are currently implemented for dairy cattle. Therefore, the objective of this study was to determine the accuracy of predicted breeding values in crossbred pigs using purebred genomic and phenotypic data. A second objective was to compare the predictive ability of SNPs when training is done in either single or multiple populations for four traits: age at first insemination (AFI); total number of piglets born (TNB); litter birth weight (LBW); and litter variation (LVR). We performed marker-based and pedigree-based predictions. Within-population predictions for the four traits ranged from 0.21 to 0.72. Multi-population prediction yielded accuracies ranging from 0.18 to 0.67. Predictions across purebred populations as well as predicting genetic merit of crossbreds from their purebred parental lines for AFI performed poorly (not significantly different from zero). In contrast, accuracies of across-population predictions and accuracies of purebred to crossbred predictions for LBW and LVR ranged from 0.08 to 0.31 and 0.11 to 0.31, respectively. Accuracy for TNB was zero for across-population prediction, whereas for purebred to crossbred prediction it ranged from 0.08 to 0.22. In general, marker-based outperformed pedigree-based prediction across populations and traits. However, in some cases pedigree-based prediction performed similarly or outperformed marker-based prediction. There was predictive ability when purebred populations were used to predict crossbred genetic merit using an additive model in the populations studied. AFI was the only exception, indicating that predictive ability depends largely on the genetic correlation between PB and CB performance, which was 0.31 for AFI. 
Multi-population prediction was no better than within-population prediction for the purebred validation set. Accuracy of prediction was very trait-dependent. Copyright © 2015 Hidalgo et al.
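A minimal sketch of marker-based genomic prediction of the kind evaluated above, using ridge regression (a SNP-BLUP-style shrinkage estimator) trained on one population and validated on held-out animals; all genotypes, effect sizes, and the shrinkage parameter are simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_valid, n_snps = 200, 50, 500

# Simulated SNP genotypes (0/1/2 allele counts) and additive marker effects
X = rng.integers(0, 3, size=(n_train + n_valid, n_snps)).astype(float)
true_effects = rng.normal(0, 0.1, n_snps)
y = X @ true_effects + rng.normal(0, 1.0, n_train + n_valid)

X_train, y_train = X[:n_train], y[:n_train]
X_valid, y_valid = X[n_train:], y[n_train:]

# Ridge regression: beta = (X'X + lambda*I)^-1 X'y on centered data
lam = 10.0
Xc = X_train - X_train.mean(axis=0)
yc = y_train - y_train.mean()
beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_snps), Xc.T @ yc)

# Accuracy = correlation between predicted and observed phenotypes
pred = (X_valid - X_train.mean(axis=0)) @ beta
accuracy = np.corrcoef(pred, y_valid)[0, 1]
```

In the study's across-population and purebred-to-crossbred settings, the validation set would come from a different line or from crossbred descendants, which is exactly where accuracy degrades when the genetic correlation between purebred and crossbred performance is low.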
Computational Model-Based Prediction of Human Episodic Memory Performance Based on Eye Movements
NASA Astrophysics Data System (ADS)
Sato, Naoyuki; Yamaguchi, Yoko
Subjects' episodic memory performance is not simply reflected by eye movements. We use a ‘theta phase coding’ model of the hippocampus to predict subjects' memory performance from their eye movements. Results demonstrate the ability of the model to predict subjects' memory performance. These studies provide a novel approach to computational modeling in the human-machine interface.
SVM and SVM Ensembles in Breast Cancer Prediction.
Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong
2017-01-01
Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.
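A rough illustration of the two ensemble configurations the abstract singles out, assuming scikit-learn is available; the dataset (the small-scale Wisconsin breast cancer data bundled with scikit-learn), the hyperparameters, and the ensemble sizes are illustrative choices, not those of the study:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Bagged linear-kernel SVMs (the configuration favoured for small datasets)
bagging = make_pipeline(
    StandardScaler(),
    BaggingClassifier(SVC(kernel="linear", C=1.0), n_estimators=10,
                      random_state=0))
bagging.fit(X_train, y_train)
acc_bag = bagging.score(X_test, y_test)

# Boosted RBF-kernel SVMs (the configuration favoured for large datasets)
boosting = make_pipeline(
    StandardScaler(),
    AdaBoostClassifier(SVC(kernel="rbf", probability=True, random_state=0),
                       n_estimators=5, random_state=0))
boosting.fit(X_train, y_train)
acc_boost = boosting.score(X_test, y_test)
```

Feature scaling matters for both kernels, and the abstract additionally recommends feature selection during pre-processing for small-scale data, which this sketch omits.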
Predicting subcontractor performance using web-based Evolutionary Fuzzy Neural Networks.
Ko, Chien-Ho
2013-01-01
Subcontractor performance directly affects project success. The use of inappropriate subcontractors may result in individual work delays, cost overruns, and quality defects throughout the project. This study develops web-based Evolutionary Fuzzy Neural Networks (EFNNs) to predict subcontractor performance. EFNNs are a fusion of Genetic Algorithms (GAs), Fuzzy Logic (FL), and Neural Networks (NNs). FL is primarily used to mimic high-level decision-making processes and deal with uncertainty in the construction industry. NNs are used to identify the association between previous performance and future status when predicting subcontractor performance. GAs optimize the parameters required by the FL and NN components. EFNNs encode FL and NNs using floating-point numbers to shorten the length of a string. A multi-cut-point crossover operator is used to explore the parameter space while retaining solution legality. Finally, the applicability of the proposed EFNNs is validated using real subcontractors. The EFNNs are evolved using 22 historical patterns and tested using 12 unseen cases. Application results show that the proposed EFNNs surpass FL and NNs in predicting subcontractor performance. The proposed approach improves prediction accuracy and reduces the effort required to predict subcontractor performance, providing field operators with web-based remote access to a reliable, scientific prediction mechanism.
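The abstract mentions two concrete GA design choices: floating-point chromosome encoding and a multi-cut-point crossover. A minimal sketch of such an operator (the number of cuts and segment-swapping scheme are assumptions for illustration, not the paper's exact operator):

```python
import random

def multi_cut_crossover(parent_a, parent_b, n_cuts=3, rng=None):
    """Swap alternating segments between two float-encoded chromosomes
    at n_cuts randomly chosen cut points."""
    rng = rng or random.Random()
    n = len(parent_a)
    cuts = sorted(rng.sample(range(1, n), n_cuts))
    child_a, child_b = [], []
    swap = False
    prev = 0
    for cut in cuts + [n]:
        seg_a, seg_b = parent_a[prev:cut], parent_b[prev:cut]
        if swap:  # alternate which parent contributes each segment
            seg_a, seg_b = seg_b, seg_a
        child_a.extend(seg_a)
        child_b.extend(seg_b)
        swap = not swap
        prev = cut
    return child_a, child_b

# Two toy float-encoded chromosomes (e.g. fuzzy membership / NN parameters)
a = [0.0] * 8
b = [1.0] * 8
ca, cb = multi_cut_crossover(a, b, n_cuts=3, rng=random.Random(42))
```

Because each gene is a float rather than a bit substring, the chromosome stays short, and cutting only at gene boundaries keeps every child a legal parameter vector.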
Sebok, Angelia; Wickens, Christopher D
2017-03-01
The objectives were to (a) implement theoretical perspectives regarding human-automation interaction (HAI) into model-based tools to assist designers in developing systems that support effective performance and (b) conduct validations to assess the ability of the models to predict operator performance. Two key concepts in HAI, the lumberjack analogy and black swan events, have been studied extensively. The lumberjack analogy describes the effects of imperfect automation on operator performance. In routine operations, an increased degree of automation supports performance, but in failure conditions, increased automation results in more significantly impaired performance. Black swans are the rare and unexpected failures of imperfect automation. The lumberjack analogy and black swan concepts have been implemented into three model-based tools that predict operator performance in different systems. These tools include a flight management system, a remotely controlled robotic arm, and an environmental process control system. Each modeling effort included a corresponding validation. In one validation, the software tool was used to compare three flight management system designs, which were ranked in the same order as predicted by subject matter experts. The second validation compared model-predicted operator complacency with empirical performance in the same conditions. The third validation compared model-predicted and empirically determined time to detect and repair faults in four automation conditions. The three model-based tools offer useful ways to predict operator performance in complex systems. The three tools offer ways to predict the effects of different automation designs on operator performance.
NASA Technical Reports Server (NTRS)
Daigle, Matthew John; Goebel, Kai Frank
2010-01-01
Model-based prognostics captures system knowledge in the form of physics-based models of components, and how they fail, in order to obtain accurate predictions of end of life (EOL). EOL is predicted based on the estimated current state distribution of a component and expected profiles of future usage. In general, this requires simulations of the component using the underlying models. In this paper, we develop a simulation-based prediction methodology that achieves computational efficiency by performing only the minimal number of simulations needed in order to accurately approximate the mean and variance of the complete EOL distribution. This is performed through the use of the unscented transform, which predicts the means and covariances of a distribution passed through a nonlinear transformation. In this case, the EOL simulation acts as that nonlinear transformation. In this paper, we review the unscented transform, and describe how this concept is applied to efficient EOL prediction. As a case study, we develop a physics-based model of a solenoid valve, and perform simulation experiments to demonstrate improved computational efficiency without sacrificing prediction accuracy.
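A minimal scalar sketch of the unscented-transform idea described above: propagate a few deterministically chosen sigma points through a nonlinear EOL simulation instead of many Monte Carlo samples. The degradation model (exponential decay to a failure threshold) and the tuning parameter kappa are illustrative assumptions:

```python
import numpy as np

def eol_simulation(x0):
    """Toy nonlinear EOL map: a state decaying as x0*exp(-rate*t)
    crosses the failure threshold at t = log(x0/threshold)/rate."""
    rate, threshold = 0.05, 1.0
    return np.log(x0 / threshold) / rate

# Current state estimate: mean and variance of the degradation state
mean, var = 5.0, 0.25

# Unscented transform (scalar case): 2n+1 = 3 sigma points
n, kappa = 1, 2.0
spread = np.sqrt((n + kappa) * var)
sigma_points = np.array([mean, mean + spread, mean - spread])
weights = np.array([kappa / (n + kappa),
                    0.5 / (n + kappa), 0.5 / (n + kappa)])

eol = eol_simulation(sigma_points)        # only 3 "simulations"
ut_mean = weights @ eol
ut_var = weights @ (eol - ut_mean) ** 2

# Monte Carlo reference needing far more simulations
samples = np.random.default_rng(0).normal(mean, np.sqrt(var), 100_000)
mc = eol_simulation(samples)
```

Three evaluations of the EOL simulation approximate the mean and variance that Monte Carlo needs thousands of runs to estimate, which is the computational saving the paper targets.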
Predicting Energy Performance of a Net-Zero Energy Building: A Statistical Approach
Kneifel, Joshua; Webb, David
2016-01-01
Performance-based building requirements have become more prevalent because they give freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid climate zone, and compares these estimates to the results from already existing EnergyPlus whole building energy simulations. This regression model exhibits agreement with EnergyPlus predictive trends in energy production and net consumption, but differs greatly in energy consumption.
The model can be used as a framework for alternative and more complex models based on the experimental data collected from the NZERTF. PMID:27956756
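A minimal sketch of a daily weather-driven regression of the kind described above. The two weather variables (outdoor temperature and solar irradiance), the quadratic temperature term, and all data are illustrative assumptions, not the NZERTF model or measurements:

```python
import numpy as np

rng = np.random.default_rng(1)
n_days = 365
day = np.arange(n_days)

# Hypothetical daily weather drivers: outdoor temperature (C) and
# global horizontal irradiance (kWh/m^2/day)
temp = 12 + 10 * np.sin(2 * np.pi * day / 365) + rng.normal(0, 2, n_days)
ghi = 4 + 2 * np.sin(2 * np.pi * day / 365) + rng.normal(0, 0.5, n_days)

# Synthetic daily net energy consumption (kWh): rises when it is cold
# (heating) or hot (cooling), falls with PV production driven by irradiance
net = 25 + 0.08 * (temp - 15) ** 2 - 3.0 * ghi + rng.normal(0, 1, n_days)

# Ordinary least squares on the weather variables
X = np.column_stack([np.ones(n_days), temp, temp ** 2, ghi])
beta, *_ = np.linalg.lstsq(X, net, rcond=None)

pred = X @ beta
r2 = 1 - ((net - pred) ** 2).sum() / ((net - net.mean()) ** 2).sum()
```

As the abstract notes, such a model is only trustworthy for days whose weather resembles the training period; extrapolating it to other climate zones is exactly where it can diverge from physics-based simulation.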
Estimating thermal performance curves from repeated field observations
Childress, Evan; Letcher, Benjamin H.
2017-01-01
Estimating thermal performance of organisms is critical for understanding population distributions and dynamics and predicting responses to climate change. Typically, performance curves are estimated using laboratory studies to isolate temperature effects, but other abiotic and biotic factors influence temperature-performance relationships in nature, reducing these models' predictive ability. We present a model for estimating thermal performance curves from repeated field observations that includes environmental and individual variation. We fit the model in a Bayesian framework using MCMC sampling, which allowed for estimation of unobserved latent growth while propagating uncertainty. Fitting the model to simulated data varying in sampling design and parameter values demonstrated that the parameter estimates were accurate, precise, and unbiased. Fitting the model to individual growth data from wild trout revealed high out-of-sample predictive ability relative to laboratory-derived models, which produced more biased predictions for field performance. The field-based estimates of thermal maxima were lower than those based on laboratory studies. Under warming temperature scenarios, field-derived performance models predicted stronger declines in body size than laboratory-derived models, suggesting that laboratory-based models may underestimate climate change effects. The presented model estimates true, realized field performance, avoiding assumptions required for applying laboratory-based models to field performance, which should improve estimates of performance under climate change and advance thermal ecology.
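A much-simplified sketch of fitting a thermal performance curve to noisy field-like observations. The Gaussian curve shape, the least-squares fit on the log scale, and all parameter values are assumptions for illustration; the paper's actual model is hierarchical Bayesian with latent growth and MCMC:

```python
import numpy as np

def tpc(temp, p_max, t_opt, width):
    """Simple Gaussian-shaped thermal performance curve."""
    return p_max * np.exp(-((temp - t_opt) / width) ** 2)

rng = np.random.default_rng(2)
temps = rng.uniform(5, 25, 300)                    # field temperatures
growth = tpc(temps, p_max=1.0, t_opt=16.0, width=5.0)
growth *= np.exp(rng.normal(0, 0.05, temps.size))  # multiplicative noise

# Log-transforming turns the Gaussian curve into a quadratic in
# temperature, so ordinary least squares recovers the curve parameters
c2, c1, c0 = np.polyfit(temps, np.log(growth), 2)
t_opt_hat = -c1 / (2 * c2)
width_hat = np.sqrt(-1 / c2)
p_max_hat = np.exp(c0 + c1 * t_opt_hat + c2 * t_opt_hat ** 2)
```

The Bayesian field model goes well beyond this by letting curve parameters vary among individuals and environments and by propagating the uncertainty of unobserved growth, which is what makes its field-derived thermal maxima differ from laboratory estimates.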
ERIC Educational Resources Information Center
Lee, Young-Jin
2015-01-01
This study investigates whether information saved in the log files of a computer-based tutor can be used to predict the problem solving performance of students. The log files of a computer-based physics tutoring environment called Andes Physics Tutor was analyzed to build a logistic regression model that predicted success and failure of students'…
Small RNA-based prediction of hybrid performance in maize.
Seifert, Felix; Thiemann, Alexander; Schrag, Tobias A; Rybka, Dominika; Melchinger, Albrecht E; Frisch, Matthias; Scholten, Stefan
2018-05-21
Small RNA (sRNA) sequences are known to have a broad impact on gene regulation by various mechanisms. Their performance for the prediction of hybrid traits has not yet been analyzed. Our objective was to analyze the relation of parental sRNA expression with the performance of their hybrids, to develop a sRNA-based prediction approach, and to compare it to more common SNP and mRNA transcript based predictions using a factorial mating scheme of a maize hybrid breeding program. Correlating genomic differences and messenger RNA (mRNA) or sRNA expression differences between parental lines with the performance of their hybrids revealed that sRNAs showed an inverse relationship in contrast to the other two data types. We associated differences for SNPs, mRNA and sRNA expression between parental inbred lines with the performance of their hybrid combinations and developed two prediction approaches using distance measures based on associated markers. Cross-validations revealed parental differences in sRNA expression to be strong predictors of hybrid performance for grain yield in maize, comparable to genomic and mRNA data. The integration of both positively and negatively associated markers in the prediction approaches enhanced the prediction accuracy. The associated sRNAs belong predominantly to the canonical size classes of 22- and 24-nt that show specific genomic mapping characteristics. Expression profiles of sRNA are a promising alternative to SNPs or mRNA expression profiles for hybrid prediction, especially for plant species without reference genome or transcriptome information. The characteristics of the sRNAs we identified suggest that association studies based on breeding populations facilitate the identification of sRNAs involved in hybrid performance.
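A hypothetical toy simulation of the core idea: score each hybrid by a distance measure over parental expression differences and check how well that distance tracks hybrid performance. The data, the mean-absolute-difference distance, and the inverse relationship built into the simulated trait are all illustrative assumptions, not the study's pipeline:

```python
import numpy as np

rng = np.random.default_rng(3)
n_parents, n_markers = 20, 100
expr = rng.normal(0, 1, (n_parents, n_markers))  # parental sRNA profiles

# All pairwise hybrids: the simulated trait depends inversely (as the
# abstract reports for sRNAs) on the parental expression distance over
# the associated markers, plus noise
pairs = [(i, j) for i in range(n_parents) for j in range(i + 1, n_parents)]
dist = np.array([np.abs(expr[i] - expr[j]).mean() for i, j in pairs])
trait = -2.0 * dist + rng.normal(0, 0.1, len(pairs))

# Predictive ability of the distance measure for hybrid performance
accuracy = np.corrcoef(dist, trait)[0, 1]
```

In the actual approach, the markers entering the distance are first selected by their association with hybrid performance, and positively and negatively associated markers are integrated separately, which the abstract reports improves accuracy.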
Physics-based model for predicting the performance of a miniature wind turbine
NASA Astrophysics Data System (ADS)
Xu, F. J.; Hu, J. Z.; Qiu, Y. P.; Yuan, F. G.
2011-04-01
A comprehensive physics-based model for predicting the performance of a miniature wind turbine (MWT) for powering wireless sensor systems was proposed in this paper. An approximation of the power coefficient of the turbine rotor was made after the turbine rotor performance was measured. By incorporating this approximation into an equivalent circuit model derived from the operating principles of the MWT, the overall system performance was predicted. To demonstrate the prediction, an MWT system comprising a 7.6 cm Thorgren plastic propeller as the turbine rotor and a DC motor as the generator was designed and its performance was tested experimentally. The predicted output voltage, power, and system efficiency matched the test results well, which implies that this study holds promise for estimating and optimizing the performance of the MWT.
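A minimal sketch of such a physics-plus-equivalent-circuit chain: rotor power from the standard wind power equation with a power coefficient, then a DC generator modeled as a back-EMF source behind a winding resistance feeding a resistive load. All parameter values (Cp, tip-speed ratio, generator constants, load) are illustrative assumptions, not the paper's measured values:

```python
import numpy as np

# Assumed parameters for a small 7.6 cm rotor MWT (illustrative values)
rho = 1.225          # air density, kg/m^3
radius = 0.038       # rotor radius, m
area = np.pi * radius ** 2
cp = 0.2             # approximate rotor power coefficient
k_e = 0.004          # generator back-EMF constant, V*s/rad
r_gen = 30.0         # generator winding resistance, ohm
r_load = 100.0       # load resistance, ohm
tsr = 2.5            # assumed tip-speed ratio at the operating point

def mwt_output(v_wind):
    """Predict rotor power, output voltage, and output power of the MWT."""
    p_rotor = 0.5 * rho * area * v_wind ** 3 * cp   # wind power equation
    omega = tsr * v_wind / radius                   # rotor speed, rad/s
    emf = k_e * omega                               # open-circuit voltage
    v_out = emf * r_load / (r_gen + r_load)         # divider across load
    p_out = v_out ** 2 / r_load
    return p_rotor, v_out, p_out

p_rotor, v_out, p_out = mwt_output(5.0)   # 5 m/s wind
efficiency = p_out / p_rotor
```

The real model replaces the fixed Cp and tip-speed ratio with the measured rotor characteristic, which is what lets the predicted voltage, power, and efficiency track the experiments.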
Gender differences in the content of cognitive distraction during sex.
Meana, Marta; Nunnink, Sarah E
2006-02-01
This study compared 220 college men and 237 college women on two types of self-reported cognitive distraction during sex, performance- and appearance-based. Affect, psychological distress, sexual knowledge, attitudes, fantasies, experiences, body image, satisfaction, and sexual function were assessed with the Derogatis Sexual Functioning Inventory and the Sexual History Form to determine associations with distraction. Between-gender analyses revealed that women reported higher levels of overall and appearance-based distraction than did men, but similar levels of performance-based distraction. Within-gender analyses revealed that women reported as much of one type of distraction as the other, while men reported more performance- than appearance-based distraction. In women, appearance-based distraction was predicted by negative body image, psychological distress, and not being in a relationship, while performance-based distraction was predicted by negative body image, psychological distress, and sexual dissatisfaction. In men, appearance-based distraction was predicted by negative body image, sexual dissatisfaction and not being in a relationship, while performance-based distraction was predicted by negative body image and sexual dissatisfaction. Investigating the content of cognitive distraction may be useful in understanding gender differences in sexual experience and in refining cognitive components of sex therapy.
Choosing the appropriate forecasting model for predictive parameter control.
Aleti, Aldeida; Moser, Irene; Meedeniya, Indika; Grunske, Lars
2014-01-01
All commonly used stochastic optimisation algorithms have to be parameterised to perform effectively. Adaptive parameter control (APC) is an effective method used for this purpose. APC repeatedly adjusts parameter values during the optimisation process for optimal algorithm performance. The assignment of parameter values for a given iteration is based on previously measured performance. In recent research, time series prediction has been proposed as a method of projecting the probabilities to use for parameter value selection. In this work, we examine the suitability of a variety of prediction methods for the projection of future parameter performance based on previous data. All considered prediction methods have assumptions the time series data has to conform to for the prediction method to provide accurate projections. Looking specifically at parameters of evolutionary algorithms (EAs), we find that all standard EA parameters with the exception of population size conform largely to the assumptions made by the considered prediction methods. Evaluating the performance of these prediction methods, we find that linear regression provides the best results by a very small and statistically insignificant margin. Regardless of the prediction method, predictive parameter control outperforms state of the art parameter control methods when the performance data adheres to the assumptions made by the prediction method. When a parameter's performance data does not adhere to the assumptions made by the forecasting method, the use of prediction does not have a notable adverse impact on the algorithm's performance.
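A minimal sketch of predictive parameter control with the linear-regression forecaster the study found best: fit a trend to each parameter value's recent performance history and favour the value with the highest forecast. The histories and window size are illustrative assumptions:

```python
import numpy as np

def predict_next(performance_history, window=10):
    """Forecast the next performance of a parameter setting by fitting
    a linear trend to its recent history."""
    y = np.asarray(performance_history[-window:], dtype=float)
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, y, 1)
    return slope * len(y) + intercept   # extrapolate one step ahead

# Two hypothetical parameter values with different performance trends
improving = [0.50, 0.53, 0.55, 0.58, 0.61, 0.63, 0.66, 0.68, 0.71, 0.73]
flat = [0.60, 0.61, 0.59, 0.60, 0.61, 0.60, 0.59, 0.61, 0.60, 0.60]

forecasts = [predict_next(improving), predict_next(flat)]
chosen = int(np.argmax(forecasts))  # parameter value to favour next
```

In a full APC loop the forecasts would set selection probabilities rather than a hard argmax, and the forecaster's assumptions (e.g. an approximately linear local trend) must hold for the prediction to help, as the study emphasizes.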
Effect of window length on performance of the elbow-joint angle prediction based on electromyography
NASA Astrophysics Data System (ADS)
Triwiyanto; Wahyunggoro, Oyas; Adi Nugroho, Hanung; Herianto
2017-05-01
High performance of elbow-joint angle prediction is essential for the development of devices based on electromyography (EMG) control. The performance of the prediction depends on feature extraction parameters such as the window length. In this paper, we evaluated the effect of the window length on the performance of the elbow-joint angle prediction. The prediction algorithm consists of zero-crossing feature extraction and a second-order Butterworth low-pass filter. The feature was extracted from the EMG signal using varying window lengths. The EMG signal was collected from the biceps muscle while the elbow was moved in flexion and extension. The subject performed the elbow motion while holding a 1-kg load and moved the elbow over different periods (12 seconds, 8 seconds, and 6 seconds). The results indicated that the window length affected the performance of the prediction. A window length of 250 yielded the best prediction performance, with a (mean±SD) root mean square error of 5.68%±1.53% and a Pearson's correlation of 0.99±0.0059.
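A minimal sketch of the zero-crossing feature over different window lengths, applied to a synthetic EMG-like signal (amplitude-modulated noise standing in for flexion/extension activation). The threshold, window lengths, and signal model are illustrative assumptions; the paper additionally smooths the feature with a second-order Butterworth low-pass filter, omitted here:

```python
import numpy as np

def zero_crossings(window, threshold=0.5):
    """Count sign changes whose amplitude step exceeds a threshold
    (a common EMG zero-crossing feature definition; the threshold
    suppresses crossings due to baseline noise)."""
    sign_change = np.sign(window[:-1]) != np.sign(window[1:])
    big_enough = np.abs(window[:-1] - window[1:]) > threshold
    return int(np.count_nonzero(sign_change & big_enough))

def extract(signal, window_length):
    """Zero-crossing feature over non-overlapping windows."""
    n = len(signal) // window_length
    return np.array([
        zero_crossings(signal[i * window_length:(i + 1) * window_length])
        for i in range(n)])

rng = np.random.default_rng(4)
t = np.arange(12_000)
# Slow flexion/extension "activation" modulating wideband noise
activation = 0.2 + 0.8 * (np.sin(2 * np.pi * t / 4000) * 0.5 + 0.5)
emg = activation * rng.normal(0, 1, t.size)

features = {w: extract(emg, w) for w in (50, 150, 250)}
```

Longer windows average over more samples, so the feature sequence tracks the underlying activation more smoothly at the cost of temporal resolution, which is the trade-off behind the 250-sample optimum reported above.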
NASA Technical Reports Server (NTRS)
Mercer, Joey S.; Bienert, Nancy; Gomez, Ashley; Hunt, Sarah; Kraut, Joshua; Martin, Lynne; Morey, Susan; Green, Steven M.; Prevot, Thomas; Wu, Minghong G.
2013-01-01
A Human-In-The-Loop air traffic control simulation investigated the impact of uncertainties in trajectory predictions on NextGen Trajectory-Based Operations concepts, seeking to understand when the automation would become unacceptable to controllers or when performance targets could no longer be met. Retired air traffic controllers staffed two en route transition sectors, delivering arrival traffic to the northwest corner-post of Atlanta approach control under time-based metering operations. Using trajectory-based decision-support tools, the participants worked the traffic under varying levels of wind forecast error and aircraft performance model error, impacting the ground automation's ability to make accurate predictions. Results suggest that the controllers were able to maintain high levels of performance, despite even the highest levels of trajectory prediction errors.
Tan, Alexander; Delgaty, Lauren; Steward, Kayla; Bunner, Melissa
2018-04-16
Deficits in real-world executive functioning (EF) are a frequent characteristic of attention-deficit/hyperactivity disorder (ADHD). However, the predictive value of using performance-based and behavioral rating measures of EF when diagnosing ADHD remains unclear. The current study investigates the use of performance-based EF measures and a parent-report questionnaire with established ecological validity and clinical utility when diagnosing ADHD. Participants included 21 healthy controls, 21 ADHD-primary inattentive, and 21 ADHD-combined type subjects aged 6-15 years. A brief neuropsychological battery was administered to each subject including common EF assessment measures. Significant differences were not found between groups on most performance-based EF measures, whereas significant differences (p < 0.05) were found on most parent-report behavioral rating scales. Furthermore, performance-based measures did not predict group membership above chance levels. Results further support differences in predictive value of EF performance-based measures compared to parent-report questionnaires when diagnosing ADHD. Further research must investigate the relationship between performance-based and behavioral rating measures when assessing EF in ADHD.
Comparison of RNA-seq and microarray-based models for clinical endpoint prediction.
Zhang, Wenqian; Yu, Ying; Hertwig, Falk; Thierry-Mieg, Jean; Zhang, Wenwei; Thierry-Mieg, Danielle; Wang, Jian; Furlanello, Cesare; Devanarayan, Viswanath; Cheng, Jie; Deng, Youping; Hero, Barbara; Hong, Huixiao; Jia, Meiwen; Li, Li; Lin, Simon M; Nikolsky, Yuri; Oberthuer, André; Qing, Tao; Su, Zhenqiang; Volland, Ruth; Wang, Charles; Wang, May D; Ai, Junmei; Albanese, Davide; Asgharzadeh, Shahab; Avigad, Smadar; Bao, Wenjun; Bessarabova, Marina; Brilliant, Murray H; Brors, Benedikt; Chierici, Marco; Chu, Tzu-Ming; Zhang, Jibin; Grundy, Richard G; He, Min Max; Hebbring, Scott; Kaufman, Howard L; Lababidi, Samir; Lancashire, Lee J; Li, Yan; Lu, Xin X; Luo, Heng; Ma, Xiwen; Ning, Baitang; Noguera, Rosa; Peifer, Martin; Phan, John H; Roels, Frederik; Rosswog, Carolina; Shao, Susan; Shen, Jie; Theissen, Jessica; Tonini, Gian Paolo; Vandesompele, Jo; Wu, Po-Yen; Xiao, Wenzhong; Xu, Joshua; Xu, Weihong; Xuan, Jiekun; Yang, Yong; Ye, Zhan; Dong, Zirui; Zhang, Ke K; Yin, Ye; Zhao, Chen; Zheng, Yuanting; Wolfinger, Russell D; Shi, Tieliu; Malkas, Linda H; Berthold, Frank; Wang, Jun; Tong, Weida; Shi, Leming; Peng, Zhiyu; Fischer, Matthias
2015-06-25
Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice.
A New Method for the Evaluation and Prediction of Base Stealing Performance.
Bricker, Joshua C; Bailey, Christopher A; Driggers, Austin R; McInnis, Timothy C; Alami, Arya
2016-11-01
Bricker, JC, Bailey, CA, Driggers, AR, McInnis, TC, and Alami, A. A new method for the evaluation and prediction of base stealing performance. J Strength Cond Res 30(11): 3044-3050, 2016-The purposes of this study were to evaluate a new method using electronic timing gates to monitor base stealing performance in terms of reliability, differences between it and traditional stopwatch-collected times, and its ability to predict base stealing performance. Twenty-five healthy collegiate baseball players performed maximal effort base stealing trials with a right and left-handed pitcher. An infrared electronic timing system was used to calculate the reaction time (RT) and total time (TT), whereas coaches' times (CT) were recorded with digital stopwatches. Reliability of the timing gate method (TGM) was evaluated with intraclass correlation coefficients (ICCs) and coefficient of variation (CV). Differences between the TGM and traditional CT were calculated with paired samples t tests and Cohen's d effect size estimates. Base stealing performance predictability of the TGM was evaluated with Pearson's bivariate correlations. Acceptable relative reliability was observed (ICCs 0.74-0.84). Absolute reliability measures were acceptable for TT (CVs = 4.4-4.8%), but measures were elevated for RT (CVs = 32.3-35.5%). Statistical and practical differences were found between TT and CT (right p = 0.00, d = 1.28 and left p = 0.00, d = 1.49). The TGM TT seems to be a decent predictor of base stealing performance (r = -0.49 to -0.61). The authors recommend the TGM used in this investigation for athlete monitoring because it was found to be reliable, seems to be more precise than traditional CT measured with a stopwatch, provides an additional variable of value (RT), and may predict future performance.
Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Baichuan; Choudhury, Sutanay; Al-Hasan, Mohammad
2016-02-01
Estimating the confidence for a link is a critical task for Knowledge Graph construction. Link prediction, or predicting the likelihood of a link in a knowledge graph based on prior state, is a key research direction within this area. We propose a Latent Feature Embedding based link recommendation model for the prediction task and utilize a Bayesian Personalized Ranking based optimization technique for learning models for each predicate. Experimental results on large-scale knowledge bases such as YAGO2 show that our approach achieves substantially higher performance than several state-of-the-art approaches. Furthermore, we also study the performance of the link prediction algorithm in terms of topological properties of the Knowledge Graph and present a linear regression model to reason about its expected level of accuracy.
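The BPR idea described in this record can be illustrated in a few lines. This is a toy sketch under stated assumptions: the entities and triples are invented, a single latent-feature model is shared across predicates (the paper learns one model per predicate), and a plain dot-product score stands in for the authors' embedding model:

```python
import math
import random

random.seed(0)

DIM = 8
# Toy knowledge graph; names are illustrative, not from YAGO2.
triples = {("alice", "knows", "bob"), ("bob", "knows", "carol"),
           ("alice", "livesIn", "paris"), ("carol", "livesIn", "rome")}
entities = sorted({e for s, _, o in triples for e in (s, o)})
emb = {e: [random.uniform(-0.1, 0.1) for _ in range(DIM)] for e in entities}

def score(s, o):
    # latent-feature score of a candidate link
    return sum(a * b for a, b in zip(emb[s], emb[o]))

def bpr_epoch(lr=0.05):
    """One SGD pass of the BPR objective: rank each observed object
    above a randomly corrupted one."""
    loss = 0.0
    for s, p, o in sorted(triples):
        o_neg = random.choice([e for e in entities if (s, p, e) not in triples])
        x = score(s, o) - score(s, o_neg)
        g = 1.0 / (1.0 + math.exp(x))       # magnitude of d(-log sigmoid(x))/dx
        gs = [emb[o][k] - emb[o_neg][k] for k in range(DIM)]
        go = list(emb[s])                   # snapshot before updating
        for k in range(DIM):
            emb[s][k] += lr * g * gs[k]
            emb[o][k] += lr * g * go[k]
            emb[o_neg][k] -= lr * g * go[k]
        loss += math.log(1.0 + math.exp(-x))
    return loss
```

Repeated epochs drive the pairwise ranking loss down, which is the sense in which BPR "learns" link confidence from the prior state of the graph.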
Noh, Wonjung; Seomun, Gyeongae
2015-06-01
This study was conducted to develop key performance indicators (KPIs) for home care nursing (HCN) based on a balanced scorecard, and to construct a performance prediction model of strategic objectives using the Bayesian Belief Network (BBN). This methodological study included four steps: establishment of KPIs, performance prediction modeling, development of a performance prediction model using BBN, and simulation of a suggested nursing management strategy. An HCN expert group and a staff group participated. The content validity index was analyzed using STATA 13.0, and BBN was analyzed using HUGIN 8.0. We generated a list of KPIs composed of 4 perspectives, 10 strategic objectives, and 31 KPIs. In the validity test of the performance prediction model, the factor with the greatest variance for increasing profit was maximum cost reduction of HCN services. The factor with the smallest variance for increasing profit was a minimum image improvement for HCN. During sensitivity analysis, the probability of the expert group did not affect the sensitivity. Furthermore, simulation of a 10% image improvement predicted the most effective way to increase profit. KPIs of HCN can estimate financial and non-financial performance. The performance prediction model for HCN will be useful to improve performance.
NASA Astrophysics Data System (ADS)
Duan, Rui; Xu, Xianjin; Zou, Xiaoqin
2018-01-01
D3R 2016 Grand Challenge 2 focused on predictions of binding modes and affinities for 102 compounds against the farnesoid X receptor (FXR). In this challenge, two distinct methods, a docking-based method and a template-based method, were employed by our team for the binding mode prediction. For the new template-based method, 3D ligand similarities were calculated for each query compound against the ligands in the co-crystal structures of FXR available in Protein Data Bank. The binding mode was predicted based on the co-crystal protein structure containing the ligand with the best ligand similarity score against the query compound. For the FXR dataset, the template-based method achieved a better performance than the docking-based method on the binding mode prediction. For the binding affinity prediction, an in-house knowledge-based scoring function ITScore2 and MM/PBSA approach were employed. Good performance was achieved for MM/PBSA, whereas the performance of ITScore2 was sensitive to ligand composition, e.g. the percentage of carbon atoms in the compounds. The sensitivity to ligand composition could be a clue for the further improvement of our knowledge-based scoring function.
ERIC Educational Resources Information Center
Zimmermann, Judith; Brodersen, Kay H.; Heinimann, Hans R.; Buhmann, Joachim M.
2015-01-01
The graduate admissions process is crucial for controlling the quality of higher education, yet, rules-of-thumb and domain-specific experiences often dominate evidence-based approaches. The goal of the present study is to dissect the predictive power of undergraduate performance indicators and their aggregates. We analyze 81 variables in 171…
Selection of optimal sensors for predicting performance of polymer electrolyte membrane fuel cell
NASA Astrophysics Data System (ADS)
Mao, Lei; Jackson, Lisa
2016-10-01
In this paper, sensor selection algorithms are investigated based on a sensitivity analysis, and the capability of the optimal sensors in predicting PEM fuel cell performance is also studied using test data. The fuel cell model is developed for generating the sensitivity matrix relating sensor measurements and fuel cell health parameters. From the sensitivity matrix, two sensor selection approaches, the largest-gap method and an exhaustive brute-force search, are applied to find the optimal sensors providing reliable predictions. Based on the results, a sensor selection approach considering both sensor sensitivity and noise resistance is proposed to find the optimal sensor set with minimum size. Furthermore, the performance of the optimal sensor set is studied to predict fuel cell performance using test data from a PEM fuel cell system. Results demonstrate that with the optimal sensors, the performance of the PEM fuel cell can be predicted with good accuracy.
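The brute-force branch of such a sensitivity-based selection can be sketched compactly. All sensor names, sensitivity values, and the D-optimality-style score below are invented for illustration; the paper's sensitivity matrix comes from its fuel cell model and its selection criteria may differ:

```python
from itertools import combinations

# Hypothetical sensitivity matrix: rows = candidate sensors, columns =
# sensitivities to two fuel-cell health parameters (all values invented).
S = {
    "stack_voltage":      [0.9, 0.2],
    "cathode_pressure":   [0.1, 0.05],
    "outlet_temperature": [0.4, 0.7],
    "current_density":    [0.6, 0.5],
}

def information(subset):
    """D-optimality-style score: det(S_sub' S_sub) for the chosen rows.
    A larger determinant means the subset separates the two health
    parameters more reliably."""
    rows = [S[name] for name in subset]
    g = [[sum(r[i] * r[j] for r in rows) for j in range(2)] for i in range(2)]
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def best_subset(k):
    # exhaustive brute-force search over all k-sensor subsets
    return max(combinations(sorted(S), k), key=information)
```

For this toy matrix, `best_subset(2)` picks the voltage and temperature sensors, whose sensitivity vectors point in the most different directions.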
NASA Technical Reports Server (NTRS)
Sebok, Angelia; Wickens, Christopher; Sargent, Robert
2015-01-01
One human factors challenge is predicting operator performance in novel situations. Approaches such as drawing on relevant previous experience, and developing computational models to predict operator performance in complex situations, offer potential methods to address this challenge. A few concerns with modeling operator performance are that models need to be realistic, and that they need to be empirically tested and validated. In addition, many existing human performance modeling tools are complex and require that an analyst gain significant experience to be able to develop models for meaningful data collection. This paper describes an effort to address these challenges by developing an easy-to-use model-based tool, using models that were developed from a review of existing human performance literature and targeted experimental studies, and by performing an empirical validation of key model predictions.
Ell, Shawn W; Cosley, Brandon; McCoy, Shannon K
2011-02-01
The way in which we respond to everyday stressors can have a profound impact on cognitive functioning. Maladaptive stress responses in particular are generally associated with impaired cognitive performance. We argue, however, that the cognitive system mediating task performance is also a critical determinant of the stress-cognition relationship. Consistent with this prediction, we observed that stress reactivity consistent with a maladaptive, threat response differentially predicted performance on two categorization tasks. Increased threat reactivity predicted enhanced performance on an information-integration task (i.e., learning is thought to depend upon a procedural-based memory system), and a (nonsignificant) trend for impaired performance on a rule-based task (i.e., learning is thought to depend upon a hypothesis-testing system). These data suggest that it is critical to consider both variability in the stress response and variability in the cognitive system mediating task performance in order to fully understand the stress-cognition relationship.
NASA Astrophysics Data System (ADS)
Rooper, Christopher N.; Zimmermann, Mark; Prescott, Megan M.
2017-08-01
Deep-sea coral and sponge ecosystems are widespread throughout most of Alaska's marine waters, and are associated with many different species of fishes and invertebrates. These ecosystems are vulnerable to the effects of commercial fishing activities and climate change. We compared four commonly used species distribution models (general linear models, generalized additive models, boosted regression trees and random forest models) and an ensemble model to predict the presence or absence and abundance of six groups of benthic invertebrate taxa in the Gulf of Alaska. All four model types performed adequately on training data for predicting presence and absence, with random forest models having the best overall performance measured by the area under the receiver operating characteristic curve (AUC). The models also performed well on the test data for presence and absence with average AUCs ranging from 0.66 to 0.82. For the test data, ensemble models performed the best. For abundance data, there was an obvious demarcation in performance between the two regression-based methods (general linear models and generalized additive models) and the tree-based models. The boosted regression tree and random forest models out-performed the other models by a wide margin on both the training and testing data. However, there was a significant drop-off in performance for all models of invertebrate abundance (approximately 50%) when moving from the training data to the testing data. Ensemble model performance was between the tree-based and regression-based methods. The maps of predictions from the models for both presence and abundance agreed very well across model types, with an increase in variability in predictions for the abundance data.
We conclude that where data conforms well to the modeled distribution (such as the presence-absence data and binomial distribution in this study), the four types of models will provide similar results, although the regression-type models may be more consistent with biological theory. For data with highly zero-inflated distributions and non-normal distributions such as the abundance data from this study, the tree-based methods performed better. Ensemble models that averaged predictions across the four model types, performed better than the GLM or GAM models but slightly poorer than the tree-based methods, suggesting ensemble models might be more robust to overfitting than tree methods, while mitigating some of the disadvantages in predictive performance of regression methods.
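The AUC statistic used throughout this comparison, and the prediction-averaging step behind the ensemble model, can both be sketched in a few lines. The inputs below are toy values; the study's fitted models are not reproduced, and its ensemble weighting may differ from the simple average shown:

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank-sum identity:
    the probability that a random positive outscores a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ensemble(per_model_predictions):
    """Unweighted ensemble: average the probabilities that the individual
    models assign to each site."""
    return [sum(col) / len(col) for col in zip(*per_model_predictions)]
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of presences from absences, which is why values of 0.66-0.82 on test data count as adequate-to-good.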
Minimalist ensemble algorithms for genome-wide protein localization prediction.
Lin, Jhih-Rong; Mondal, Ananda Mohan; Liu, Rong; Hu, Jianjun
2012-07-03
Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of the individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high-performance ensemble algorithms are usually composed of the predictors that together cover most of the available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from an AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from the inclusion of too many individual predictors.
We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi.
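The classifier-based ensemble idea can be sketched minimally: fit a logistic regression over the outputs of the individual predictors, then treat near-zero weights as evidence that a predictor is redundant. All data and predictor behaviors below are simulated; this is not the paper's LR implementation or its contribution-score definition:

```python
import math
import random

# Toy sketch: each "individual predictor" emits a probability that a protein
# belongs to a compartment; the ensemble is a logistic regression over those
# outputs. Everything here is simulated for illustration.
random.seed(1)

def make_example():
    y = 1 if random.random() < 0.5 else 0
    p1 = 0.7 * y + 0.3 * random.random()   # strong individual predictor
    p2 = 0.5 * y + 0.5 * random.random()   # weaker individual predictor
    p3 = random.random()                   # uninformative (redundant) predictor
    return [p1, p2, p3], y

data = [make_example() for _ in range(400)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = [0.0, 0.0, 0.0], 0.0
for _ in range(300):                       # plain batch gradient descent
    gw, gb = [0.0, 0.0, 0.0], 0.0
    for x, y in data:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
        for i in range(3):
            gw[i] += err * x[i]
        gb += err
    for i in range(3):
        w[i] -= 0.1 * gw[i] / len(data)
    b -= 0.1 * gb / len(data)

def ensemble_accuracy():
    """Accuracy of the fitted LR ensemble on the simulated data."""
    hits = sum((sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) > 0.5) == (y == 1)
               for x, y in data)
    return hits / len(data)
```

The fitted weights play a role loosely analogous to the paper's contribution scores: the noise predictor's weight stays near zero, marking it as a candidate for removal from a minimalist ensemble.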
Su, Jiandong; Barbera, Lisa; Sutradhar, Rinku
2015-06-01
Prior work has utilized longitudinal information on performance status to demonstrate its association with risk of death among cancer patients; however, no study has assessed whether such longitudinal information improves the predictions for risk of death. To examine whether the use of repeated performance status assessments improves predictions for risk of death compared to using only the performance status assessment at the time of cancer diagnosis. This was a population-based longitudinal study of adult outpatients who had a cancer diagnosis and had at least one assessment of performance status. To account for each patient's changing performance status over time, we implemented a Cox model with a time-varying covariate for performance status. This model was compared to a Cox model using only a time-fixed (baseline) covariate for performance status. The regression coefficients of each model were derived based on a randomly selected 60% of patients, and then the predictive ability of each model was assessed via concordance probabilities when applied to the remaining 40% of patients. Our study consisted of 15,487 cancer patients with over 53,000 performance status assessments. The utilization of repeated performance status assessments improved predictions for risk of death compared to using only the performance status assessment taken at diagnosis. When studying the hazard of death among patients with cancer, if available, researchers should incorporate changing information on performance status scores, instead of simply baseline information on performance status. © The Author(s) 2015.
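The concordance probability used to compare the two Cox models is the fraction of comparable patient pairs in which the patient with the higher model risk score dies first. A self-contained sketch with toy inputs (the handling of ties and censoring is simplified relative to standard survival-analysis implementations):

```python
from itertools import combinations

def concordance_index(times, events, risks):
    """Fraction of comparable pairs where the higher-risk patient dies first.

    times  - observed follow-up times
    events - 1 if death observed, 0 if censored
    risks  - model risk scores (higher = worse prognosis)
    """
    num = den = 0.0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplified: skip tied times
        first, later = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # pair is comparable only if the earlier time is a death
        den += 1
        if risks[first] > risks[later]:
            num += 1
        elif risks[first] == risks[later]:
            num += 0.5
    return num / den
```

A value of 0.5 means the risk scores rank patients no better than chance; 1.0 means perfect ranking, which is the scale on which the time-varying model's improvement would be read.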
Stringent DDI-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.
Zhou, Hufeng; Rezaei, Javad; Hugo, Willy; Gao, Shangzhi; Jin, Jingjing; Fan, Mengyuan; Yong, Chern-Han; Wozniak, Michal; Wong, Limsoon
2013-01-01
H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are important for illuminating the infection mechanism of M. tuberculosis H37Rv. But current H. sapiens-M. tuberculosis H37Rv PPI data are very scarce. This seriously limits the study of the interaction between this important pathogen and its host H. sapiens. Computational prediction of H. sapiens-M. tuberculosis H37Rv PPIs is an important strategy to fill in the gap. Domain-domain interaction (DDI) based prediction is one of the frequently used computational approaches in predicting both intra-species and inter-species PPIs. However, the performance of DDI-based host-pathogen PPI prediction has been rather limited. We develop a stringent DDI-based prediction approach with emphasis on (i) differences between the specific domain sequences on annotated regions of proteins under the same domain ID and (ii) calculation of the interaction strength of predicted PPIs based on the interacting residues in their interaction interfaces. We compare our stringent DDI-based approach to a conventional DDI-based approach for predicting PPIs based on gold standard intra-species PPIs and coherent informative Gene Ontology terms assessment. The assessment results show that our stringent DDI-based approach achieves much better performance in predicting PPIs than the conventional approach. Using our stringent DDI-based approach, we have predicted a small set of reliable H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. We also analyze the H. sapiens-M. tuberculosis H37Rv PPIs predicted by our stringent DDI-based approach using cellular compartment distribution analysis, functional category enrichment analysis and pathway enrichment analysis. The analyses support the validity of our prediction result. Also, based on an analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent DDI-based approach, we have discovered some important properties of domains involved in host-pathogen PPIs. We find that both host and pathogen proteins involved in host-pathogen PPIs tend to have more domains than proteins involved in intra-species PPIs, and these domains have more interaction partners than domains on proteins involved in intra-species PPIs. The stringent DDI-based prediction approach reported in this work provides a stringent strategy for predicting host-pathogen PPIs. It also performs better than a conventional DDI-based approach in predicting PPIs. We have predicted a small set of accurate H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies.
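The core inference step behind conventional DDI-based prediction: a host-pathogen PPI is predicted whenever the two proteins carry at least one known interacting domain pair. A toy sketch with invented domain and protein names; the paper's stringent variant additionally compares domain sequences and scores interface residues, which is omitted here:

```python
# Known interacting domain pairs and per-protein domain annotations
# (all names below are invented for illustration).
ddi = {("SH3", "PRM"), ("kinase", "substrate_dock")}
protein_domains = {
    "human_A": {"SH3", "WD40"},
    "mtb_B": {"PRM"},
    "mtb_C": {"ankyrin"},
}

def predicted_ppi(p1, p2):
    """True if any cross-protein domain pair is a known DDI."""
    return any((d1, d2) in ddi or (d2, d1) in ddi
               for d1 in protein_domains[p1]
               for d2 in protein_domains[p2])
```

This also makes the paper's observation concrete: the more domains a protein carries, the more cross-protein domain pairs get tested, and hence the more candidate PPIs it can participate in.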
NASA Astrophysics Data System (ADS)
Velarde, P.; Valverde, L.; Maestre, J. M.; Ocampo-Martinez, C.; Bordons, C.
2017-03-01
In this paper, a performance comparison among three well-known stochastic model predictive control approaches, namely multi-scenario, tree-based, and chance-constrained model predictive control, is presented. To this end, three predictive controllers have been designed and implemented in a real renewable-hydrogen-based microgrid. The experimental set-up includes a PEM electrolyzer, lead-acid batteries, and a PEM fuel cell as the main equipment. The experimental results reveal significant differences in the behavior of the plant components, mainly in terms of energy use, for each implemented technique. The effectiveness, performance, advantages, and disadvantages of these techniques are extensively discussed and analyzed to give some valid criteria for selecting an appropriate stochastic predictive controller.
Improved Fuzzy Modelling to Predict the Academic Performance of Distance Education Students
ERIC Educational Resources Information Center
Yildiz, Osman; Bal, Abdullah; Gulsecen, Sevinc
2013-01-01
It is essential to predict distance education students' year-end academic performance early during the course of the semester and to take precautions using such prediction-based information. This will, in particular, help enhance their academic performance and, therefore, improve the overall educational quality. The present study was on the…
Kazaura, Kamugisha; Omae, Kazunori; Suzuki, Toshiji; Matsumoto, Mitsuji; Mutafungwa, Edward; Korhonen, Timo O; Murakami, Tadaaki; Takahashi, Koichi; Matsumoto, Hideki; Wakamori, Kazuhiko; Arimoto, Yoshinori
2006-06-12
The deterioration and deformation of a free-space optical beam wave-front as it propagates through the atmosphere can reduce the link availability and may introduce burst errors, thus degrading the performance of the system. We investigate the suitability of utilizing soft-computing (SC) based tools for improving the performance of free-space optical (FSO) communications systems. The SC-based tools are used for the prediction of key parameters of an FSO communications system. Measured data collected from an experimental FSO communication system are used as training and testing data for a proposed multi-layer neural network predictor (MNNP) used to predict future parameter values. The predicted parameters are essential for reducing transmission errors by improving the antenna's accuracy in tracking data beams. This is particularly important during periods of strong atmospheric turbulence. The parameter values predicted using the proposed tool show acceptable conformity with the original measurements.
Liang, Yunyun; Liu, Sanyang; Zhang, Shengli
2015-01-01
Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant for prediction of protein structural classes, and it mainly uses the protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction based solely on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.
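One of the PSSM-derived features mentioned in this record, the autocovariance transformation, can be sketched as follows. This is the standard ACT form at a single lag on a toy PSSM; the paper's segmented variant splits the sequence into pieces first, which is omitted here:

```python
def autocovariance(pssm, lag):
    """Autocovariance transformation: one feature per PSSM column,
    capturing how scores `lag` residues apart co-vary along the sequence.

    pssm - list of rows (one per residue), each a list of column scores
    """
    L = len(pssm)
    feats = []
    for col in zip(*pssm):                 # iterate over PSSM columns
        mean = sum(col) / L
        feats.append(sum((col[i] - mean) * (col[i + lag] - mean)
                         for i in range(L - lag)) / (L - lag))
    return feats
```

Computing such features over several lags (and, in the paper, per segment) yields the long fixed-length vector that PCA then compresses to 224 dimensions.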
Prediction Of The Expected Safety Performance Of Rural Two-Lane Highways
DOT National Transportation Integrated Search
2000-12-01
This report presents an algorithm for predicting the safety performance of a rural two-lane highway. The accident prediction algorithm consists of base models and accident modification factors for both roadway segments and at-grade intersections on r...
Wan, Cen; Lees, Jonathan G; Minneci, Federico; Orengo, Christine A; Jones, David T
2017-10-01
Accurate gene or protein function prediction is a key challenge in the post-genome era. Most current methods perform well on molecular function prediction, but struggle to provide useful annotations relating to biological process functions due to the limited power of sequence-based features in that functional domain. In this work, we systematically evaluate the predictive power of temporal transcription expression profiles for protein function prediction in Drosophila melanogaster. Our results show significantly better performance on predicting protein function when transcription expression profile-based features are integrated with sequence-derived features, compared with the sequence-derived features alone. We also observe that the combination of expression-based and sequence-based features leads to further improvement of accuracy on predicting all three domains of gene function. Based on the optimal feature combinations, we then propose a novel multi-classifier-based function prediction method for Drosophila melanogaster proteins, FFPred-fly+. Interpreting our machine learning models also allows us to identify some of the underlying links between biological processes and developmental stages of Drosophila melanogaster.
Performance predictions affect attentional processes of event-based prospective memory.
Rummel, Jan; Kuhlmann, Beatrice G; Touron, Dayna R
2013-09-01
To investigate whether making performance predictions affects prospective memory (PM) processing, we asked one group of participants to predict their performance in a PM task embedded in an ongoing task and compared their performance with a control group that made no predictions. A third group gave not only PM predictions but also ongoing-task predictions. Exclusive PM predictions resulted in slower ongoing-task responding both in a nonfocal (Experiment 1) and in a focal (Experiment 2) PM task. Only in the nonfocal task was the additional slowing accompanied by improved PM performance. Even in the nonfocal task, however, the correlation between ongoing-task speed and PM performance was reduced after predictions, suggesting that the slowing was not completely functional for PM. Prediction-induced changes could be avoided by asking participants to additionally predict their performance in the ongoing task. In sum, the present findings substantiate a role of metamemory in the attention-allocation strategies of PM. Copyright © 2013 Elsevier Inc. All rights reserved.
Stata Modules for Calculating Novel Predictive Performance Indices for Logistic Models.
Barkhordari, Mahnaz; Padyab, Mojgan; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza
2016-01-01
Prediction is a fundamental part of the prevention of cardiovascular diseases (CVD). The development of prediction algorithms based on multivariate regression models loomed several decades ago. Parallel with predictive model development, biomarker research emerged on an impressively great scale. The key question is how best to assess and quantify the improvement in risk prediction offered by new biomarkers, or more basically, how to assess the performance of a risk prediction model. Discrimination, calibration, and added predictive value have recently been suggested for use when comparing the predictive performance of models with and without novel biomarkers. Lack of user-friendly statistical software has restricted the implementation of novel model assessment methods while examining novel biomarkers. We intended, thus, to develop user-friendly software that could be used by researchers with few programming skills. We have written a Stata command that is intended to help researchers obtain the cut point-free and cut point-based net reclassification improvement (NRI) indices and the relative and absolute integrated discrimination improvement (IDI) indices for logistic regression analyses. We applied the commands to real data on women participating in the Tehran Lipid and Glucose Study (TLGS) to examine whether information on a family history of premature CVD, waist circumference, and fasting plasma glucose can improve the predictive performance of the Framingham "general CVD risk" algorithm. The command is addpred, for logistic regression models. The Stata package provided herein can encourage the use of novel methods in examining the predictive capacity of the ever-emerging plethora of novel biomarkers.
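The cut point-free NRI and absolute IDI that such a command reports can be written compactly. This is a Python sketch of the standard definitions only (the addpred command itself is Stata; toy probabilities below are invented):

```python
def continuous_nri(p_old, p_new, y):
    """Cut point-free NRI: net proportion of events whose predicted risk
    moves up under the new model, plus net proportion of non-events whose
    predicted risk moves down."""
    ev = [(o, n) for o, n, yi in zip(p_old, p_new, y) if yi == 1]
    ne = [(o, n) for o, n, yi in zip(p_old, p_new, y) if yi == 0]
    up_ev = sum(n > o for o, n in ev) / len(ev)
    dn_ev = sum(n < o for o, n in ev) / len(ev)
    up_ne = sum(n > o for o, n in ne) / len(ne)
    dn_ne = sum(n < o for o, n in ne) / len(ne)
    return (up_ev - dn_ev) + (dn_ne - up_ne)

def idi(p_old, p_new, y):
    """Absolute IDI: gain in discrimination slope (mean predicted risk in
    events minus mean predicted risk in non-events) of new over old model."""
    def slope(p):
        ev = [pi for pi, yi in zip(p, y) if yi == 1]
        ne = [pi for pi, yi in zip(p, y) if yi == 0]
        return sum(ev) / len(ev) - sum(ne) / len(ne)
    return slope(p_new) - slope(p_old)
```

A new biomarker that raises predicted risk in cases and lowers it in controls pushes both indices above zero, which is the improvement signal the command is designed to surface.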
Genome-based prediction of test cross performance in two subsequent breeding cycles.
Hofheinz, Nina; Borchardt, Dietrich; Weissleder, Knuth; Frisch, Matthias
2012-12-01
Genome-based prediction of genetic values is expected to overcome shortcomings that limit the application of QTL mapping and marker-assisted selection in plant breeding. Our goal was to study the genome-based prediction of test cross performance with genetic effects that were estimated using genotypes from the preceding breeding cycle. In particular, our objectives were to employ a ridge regression approach that approximates best linear unbiased prediction of genetic effects, compare cross validation with validation using genetic material of the subsequent breeding cycle, and investigate the prospects of genome-based prediction in sugar beet breeding. We focused on the traits sugar content and standard molasses loss (ML) and used a set of 310 sugar beet lines to estimate genetic effects at 384 SNP markers. In cross validation, correlations >0.8 between observed and predicted test cross performance were observed for both traits. However, in validation with 56 lines from the next breeding cycle, a correlation of 0.8 could only be observed for sugar content; for standard ML the correlation dropped to 0.4. We found that ridge regression based on preliminary estimates of the heritability provided a very good approximation of best linear unbiased prediction and was not accompanied by a loss in prediction accuracy. We conclude that prediction accuracy assessed with cross validation within one cycle of a breeding program cannot be used as an indicator of the accuracy of predicting lines of the next cycle. Prediction of lines of the next cycle seems promising for traits with high heritabilities.
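The ridge regression approximation to BLUP described above can be sketched as follows. This is a generic illustration, assuming the simple parameterization in which the shrinkage parameter is derived from a preliminary heritability estimate as λ = (1 − h²)/h²; the exact scaling used in the study (often also involving the marker count) is not specified here:

```python
import numpy as np

def rr_blup(Z, y, h2):
    """Ridge-regression estimate of marker effects.
    Z: n x m marker matrix, y: phenotypes (e.g. test cross means),
    h2: preliminary heritability estimate driving the shrinkage."""
    lam = (1.0 - h2) / h2                      # assumed shrinkage form
    m = Z.shape[1]
    # Solve the ridge normal equations (Z'Z + lam*I) u = Z'y
    return np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ y)

def predict_testcross(Z_new, u):
    """Predict performance of new lines from their marker genotypes."""
    return Z_new @ u
```

Effects estimated on one cycle's lines (`rr_blup`) are then applied to the genotypes of the next cycle's lines (`predict_testcross`), which is exactly the across-cycle validation the abstract contrasts with within-cycle cross validation.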
A High Performance Cloud-Based Protein-Ligand Docking Prediction Algorithm
Chen, Jui-Le; Yang, Chu-Sing
2013-01-01
The potential of predicting druggability for a particular disease by integrating biological and computer science technologies has witnessed success in recent years. Although computational technologies can reduce the costs of pharmaceutical research, the computation time of structure-based protein-ligand docking prediction remains unsatisfactory. Hence, in this paper, a novel docking prediction algorithm, named fast cloud-based protein-ligand docking prediction algorithm (FCPLDPA), is presented to accelerate docking prediction. The proposed algorithm works by leveraging two high-performance operators: (1) a novel migration (information exchange) operator designed specially for cloud-based environments to reduce the computation time; and (2) an efficient operator aimed at filtering out the worst search directions. Our simulation results illustrate that the proposed method outperforms the other docking algorithms compared in this paper in terms of both computation time and the quality of the end result. PMID:23762864
Scoring annual earthquake predictions in China
NASA Astrophysics Data System (ADS)
Zhuang, Jiancang; Jiang, Changsheng
2012-02-01
The Annual Consultation Meeting on Earthquake Tendency in China is held by the China Earthquake Administration (CEA) in order to provide one-year earthquake predictions over most China. In these predictions, regions of concern are denoted together with the corresponding magnitude range of the largest earthquake expected during the next year. Evaluating the performance of these earthquake predictions is rather difficult, especially for regions that are of no concern, because they are made on arbitrary regions with flexible magnitude ranges. In the present study, the gambling score is used to evaluate the performance of these earthquake predictions. Based on a reference model, this scoring method rewards successful predictions and penalizes failures according to the risk (probability of being failure) that the predictors have taken. Using the Poisson model, which is spatially inhomogeneous and temporally stationary, with the Gutenberg-Richter law for earthquake magnitudes as the reference model, we evaluate the CEA predictions based on 1) a partial score for evaluating whether issuing the alarmed regions is based on information that differs from the reference model (knowledge of average seismicity level) and 2) a complete score that evaluates whether the overall performance of the prediction is better than the reference model. The predictions made by the Annual Consultation Meetings on Earthquake Tendency from 1990 to 2003 are found to include significant precursory information, but the overall performance is close to that of the reference model.
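The reward-and-penalty rule behind the gambling score can be sketched in a few lines. This is a simplified form assuming a unit stake per prediction: a successful alarm earns (1 − p)/p points, where p is the reference-model probability of the predicted event (so betting against a likely event pays more), and a failed alarm forfeits the stake:

```python
def gambling_score(bets):
    """bets: list of (p_ref, success) pairs, where p_ref is the
    reference-model probability that the predicted event occurs.
    Success is rewarded in inverse proportion to p_ref; failure
    loses the one-point stake."""
    total = 0.0
    for p_ref, success in bets:
        total += (1.0 - p_ref) / p_ref if success else -1.0
    return total
```

A predictor who merely reproduces the reference model's average seismicity earns, in expectation, a score near zero, which is the baseline the CEA predictions are compared against.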
Mass Properties for Space Systems Standards Development
NASA Technical Reports Server (NTRS)
Beech, Geoffrey
2013-01-01
Current verbiage in S-120 applies to dry mass. Mass margin is the difference between required mass and predicted mass; performance margin is the difference between predicted performance and required performance. Performance estimates and the corresponding margin should be based on predicted mass (and other inputs), with contractor mass margin reserved from the performance margin and the remaining performance margin allocated according to mass partials. Compliance can be evaluated effectively by comparing three areas, preferably on a single sheet: basic and predicted mass (including historical trend); aggregate potential changes (threats and opportunities), which give the mass forecast; and mass maturity by category (estimated/calculated/actual).
Improving real-time efficiency of case-based reasoning for medical diagnosis.
Park, Yoon-Joo
2014-01-01
Conventional case-based reasoning (CBR) does not perform efficiently on high-volume datasets because of case-retrieval time. Some previous studies overcome this problem by clustering a case base into several small groups and retrieving neighbors within the group corresponding to a target case. However, this approach generally produces less accurate predictive performance than conventional CBR. This paper suggests a new case-based reasoning method called Clustering-Merging CBR (CM-CBR), which produces a similar level of predictive performance to conventional CBR at a significantly lower computational cost.
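The cluster-based retrieval idea that CM-CBR builds on can be sketched as follows. This is a generic illustration of cluster-then-retrieve CBR, not the CM-CBR algorithm itself (which additionally merges clusters to recover accuracy):

```python
from math import dist  # Euclidean distance, Python 3.8+

def retrieve_within_cluster(target, centroids, clusters, k=3):
    """Cluster-based CBR retrieval: assign the target case to the
    nearest centroid, then find its k nearest neighbors inside that
    cluster only, cutting retrieval cost from O(N) to roughly
    O(N / n_clusters) per query."""
    ci = min(range(len(centroids)), key=lambda i: dist(target, centroids[i]))
    return sorted(clusters[ci], key=lambda case: dist(target, case))[:k]
```

The accuracy loss the abstract mentions comes from targets near a cluster boundary, whose true nearest neighbors may sit in an adjacent cluster that this retrieval never inspects.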
ERIC Educational Resources Information Center
Henry, Gary T.; Campbell, Shanyce L.; Thompson, Charles L.; Patriarca, Linda A.; Luterbach, Kenneth J.; Lys, Diana B.; Covington, Vivian Martin
2013-01-01
Calls for evidence-based reform of teacher preparation programs (TPPs) suggest the question: Do the current indicators of progress and performance used by TPPs predict effectiveness of their graduates when they become teachers? In this study, the indicators of progress and performance used by one program are examined for their ability to predict…
Comparison of four statistical and machine learning methods for crash severity prediction.
Iranitalab, Amirfarrokh; Khattak, Aemal
2017-11-01
Crash severity prediction models enable different agencies to predict the severity of a reported crash with unknown severity or the severity of crashes that may be expected to occur sometime in the future. This paper had three main objectives: comparison of the performance of four statistical and machine learning methods, including Multinomial Logit (MNL), Nearest Neighbor Classification (NNC), Support Vector Machines (SVM) and Random Forests (RF), in predicting traffic crash severity; developing a crash costs-based approach for comparison of crash severity prediction methods; and investigating the effects of data clustering methods, comprising K-means Clustering (KC) and Latent Class Clustering (LCC), on the performance of crash severity prediction models. The 2012-2015 reported crash data from Nebraska, United States was obtained and two-vehicle crashes were extracted as the analysis data. The dataset was split into training/estimation (2012-2014) and validation (2015) subsets. The four prediction methods were trained/estimated using the training/estimation dataset, and the correct prediction rates for each crash severity level, the overall correct prediction rate, and a proposed crash costs-based accuracy measure were obtained for the validation dataset. The correct prediction rates and the proposed approach showed that NNC had the best prediction performance overall and for more severe crashes. RF and SVM had the next best performance, and MNL was the weakest method. Data clustering did not affect the prediction results of SVM, but KC improved the prediction performance of MNL, NNC and RF, while LCC caused improvement in MNL and RF but weakened the performance of NNC. The overall correct prediction rate gave almost the exact opposite results compared to the proposed approach, showing that neglecting crash costs can lead to misjudgment in choosing the right prediction method. Copyright © 2017 Elsevier Ltd. All rights reserved.
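The crash-costs-based accuracy idea can be sketched as a cost-weighted correct-prediction rate, where each correctly predicted crash is credited with the economic cost of its severity level so that rare, expensive severe crashes count for more. This is an assumed form for illustration; the paper's exact measure may differ:

```python
def cost_based_accuracy(y_true, y_pred, costs):
    """Cost-weighted accuracy: fraction of total crash cost attached to
    crashes whose severity level was predicted correctly.
    costs maps severity label -> unit crash cost (hypothetical values)."""
    gained = sum(costs[t] for t, p in zip(y_true, y_pred) if t == p)
    total = sum(costs[t] for t in y_true)
    return gained / total
```

Under this measure a classifier that labels everything as the majority (low-cost) class scores poorly, which explains why the plain overall correct prediction rate and the cost-based measure can rank methods in nearly opposite order.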
Liu, Guang-Hui; Shen, Hong-Bin; Yu, Dong-Jun
2016-04-01
Accurately predicting protein-protein interaction sites (PPIs) is currently a hot topic because it has been demonstrated to be very useful for understanding disease mechanisms and designing drugs. Machine-learning-based computational approaches have been broadly utilized and demonstrated to be useful for PPI prediction. However, directly applying traditional machine learning algorithms, which often assume that samples in different classes are balanced, often leads to poor performance because of the severe class imbalance that exists in the PPI prediction problem. In this study, we propose a novel method for improving PPI prediction performance by relieving the severity of class imbalance using a data-cleaning procedure and reducing predicted false positives with a post-filtering procedure: First, a machine-learning-based data-cleaning procedure is applied to remove those marginal targets, which may potentially have a negative effect on training a model with a clear classification boundary, from the majority samples to relieve the severity of class imbalance in the original training dataset; then, a prediction model is trained on the cleaned dataset; finally, an effective post-filtering procedure is further used to reduce potential false positive predictions. Stringent cross-validation and independent validation tests on benchmark datasets demonstrated the efficacy of the proposed method, which exhibits highly competitive performance compared with existing state-of-the-art sequence-based PPIs predictors and should supplement existing PPI prediction methods.
Automated Clinical Assessment from Smart home-based Behavior Data
Dawadi, Prafulla Nath; Cook, Diane Joyce; Schmitter-Edgecombe, Maureen
2016-01-01
Smart home technologies offer potential benefits for assisting clinicians by automating health monitoring and well-being assessment. In this paper, we examine the actual benefits of smart home-based analysis by monitoring daily behaviour in the home and predicting standard clinical assessment scores of the residents. To accomplish this goal, we propose a Clinical Assessment using Activity Behavior (CAAB) approach to model a smart home resident’s daily behavior and predict the corresponding standard clinical assessment scores. CAAB uses statistical features that describe characteristics of a resident’s daily activity performance to train machine learning algorithms that predict the clinical assessment scores. We evaluate the performance of CAAB utilizing smart home sensor data collected from 18 smart homes over two years using prediction and classification-based experiments. In the prediction-based experiments, we obtain a statistically significant correlation (r = 0.72) between CAAB-predicted and clinician-provided cognitive assessment scores and a statistically significant correlation (r = 0.45) between CAAB-predicted and clinician-provided mobility scores. Similarly, for the classification-based experiments, we find CAAB has a classification accuracy of 72% while classifying cognitive assessment scores and 76% while classifying mobility scores. These prediction and classification results suggest that it is feasible to predict standard clinical scores using smart home sensor data and learning-based data analysis. PMID:26292348
Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong
2015-01-01
Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.
Stringent DDI-based Prediction of H. sapiens-M. tuberculosis H37Rv Protein-Protein Interactions
2013-01-01
Background H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are very important information to illuminate the infection mechanism of M. tuberculosis H37Rv. But current H. sapiens-M. tuberculosis H37Rv PPI data are very scarce. This seriously limits the study of the interaction between this important pathogen and its host H. sapiens. Computational prediction of H. sapiens-M. tuberculosis H37Rv PPIs is an important strategy to fill in the gap. Domain-domain interaction (DDI) based prediction is one of the frequently used computational approaches in predicting both intra-species and inter-species PPIs. However, the performance of DDI-based host-pathogen PPI prediction has been rather limited. Results We develop a stringent DDI-based prediction approach with emphasis on (i) differences between the specific domain sequences on annotated regions of proteins under the same domain ID and (ii) calculation of the interaction strength of predicted PPIs based on the interacting residues in their interaction interfaces. We compare our stringent DDI-based approach to a conventional DDI-based approach for predicting PPIs based on gold standard intra-species PPIs and coherent informative Gene Ontology terms assessment. The assessment results show that our stringent DDI-based approach achieves much better performance in predicting PPIs than the conventional approach. Using our stringent DDI-based approach, we have predicted a small set of reliable H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. We also analyze the H. sapiens-M. tuberculosis H37Rv PPIs predicted by our stringent DDI-based approach using cellular compartment distribution analysis, functional category enrichment analysis and pathway enrichment analysis. The analyses support the validity of our prediction result. Also, based on an analysis of the H. sapiens-M. 
tuberculosis H37Rv PPI network predicted by our stringent DDI-based approach, we have discovered some important properties of domains involved in host-pathogen PPIs. We find that both host and pathogen proteins involved in host-pathogen PPIs tend to have more domains than proteins involved in intra-species PPIs, and these domains have more interaction partners than domains on proteins involved in intra-species PPI. Conclusions The stringent DDI-based prediction approach reported in this work provides a stringent strategy for predicting host-pathogen PPIs. It also performs better than a conventional DDI-based approach in predicting PPIs. We have predicted a small set of accurate H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. PMID:24564941
Sharma, Ashok K; Srivastava, Gopal N; Roy, Ankita; Sharma, Vineet K
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, different chemical and structural features such as descriptors and fingerprints were exploited for feature selection, optimization, and development of machine-learning-based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based, and hybrid classification models showed similar accuracy (93%) and Matthews correlation coefficient (0.84). The performances of all three models were comparable (Matthews correlation coefficient = 0.84-0.87) on the blind dataset. In addition, the regression models using descriptors as input features were also compared and evaluated on the blind dataset. The random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least squares regression (PLSR) models, whereas the partial least squares based regression model for the prediction of permeability (Caco-2) performed better (R2 = 0.68) than the random forest and MLR based regression models. The performance of the final classification and regression models was evaluated using two validation datasets, including known toxins and commonly used constituents of health products, which attests to their accuracy.
The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yan, Shiju; Qian, Wei; Guan, Yubao
2016-06-15
Purpose: This study aims to investigate the potential to improve lung cancer recurrence risk prediction performance for stage I NSCLS patients by integrating oversampling, feature selection, and score fusion techniques and develop an optimal prediction model. Methods: A dataset involving 94 early stage lung cancer patients was retrospectively assembled, which includes CT images, nine clinical and biological (CB) markers, and outcome of 3-yr disease-free survival (DFS) after surgery. Among the 94 patients, 74 remained DFS and 20 had cancer recurrence. Applying a computer-aided detection scheme, tumors were segmented from the CT images and 35 quantitative image (QI) features were initially computed. Two normalized Gaussian radial basis function network (RBFN) based classifiers were built based on QI features and CB markers separately. To improve prediction performance, the authors applied a synthetic minority oversampling technique (SMOTE) and a BestFirst based feature selection method to optimize the classifiers and also tested fusion methods to combine QI and CB based prediction results. Results: Using a leave-one-case-out cross-validation (K-fold cross-validation) method, the computed areas under a receiver operating characteristic curve (AUCs) were 0.716 ± 0.071 and 0.642 ± 0.061, when using the QI and CB based classifiers, respectively. By fusion of the scores generated by the two classifiers, AUC significantly increased to 0.859 ± 0.052 (p < 0.05) with an overall prediction accuracy of 89.4%. Conclusions: This study demonstrated the feasibility of improving prediction performance by integrating SMOTE, feature selection, and score fusion techniques. Combining QI features and CB markers and performing SMOTE prior to feature selection in classifier training enabled the RBFN based classifier to yield improved prediction accuracy.
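The SMOTE step applied before feature selection generates synthetic minority-class cases (here, recurrence cases) by interpolating between a minority sample and one of its minority-class nearest neighbors. A minimal sketch of that interpolation rule:

```python
import random

def smote_sample(x, neighbor, rng=random):
    """SMOTE-style synthetic minority sample: place a new point at a
    random position on the line segment between a minority-class case
    x and one of its minority-class nearest neighbors."""
    g = rng.random()  # uniform gap in [0, 1)
    return [xi + g * (ni - xi) for xi, ni in zip(x, neighbor)]
```

Repeating this for each minority case (with neighbors found among minority cases only) rebalances the 74-vs-20 class ratio before the classifier is trained.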
Quantitative computed tomography-based predictions of vertebral strength in anterior bending.
Buckley, Jenni M; Cheng, Liu; Loo, Kenneth; Slyfield, Craig; Xu, Zheng
2007-04-20
This study examined the ability of QCT-based structural assessment techniques to predict vertebral strength in anterior bending. The purpose of this study was to compare the abilities of QCT-based bone mineral density (BMD), mechanics of solids models (MOS), e.g., bending rigidity, and finite element analyses (FE) to predict the strength of isolated vertebral bodies under anterior bending boundary conditions. Although the relative performance of QCT-based structural measures is well established for uniform compression, the ability of these techniques to predict vertebral strength under nonuniform loading conditions has not yet been established. Thirty human thoracic vertebrae from 30 donors (T9-T10, 20 female, 10 male; 87 +/- 5 years of age) were QCT scanned and destructively tested in anterior bending using an industrial robot arm. The QCT scans were processed to generate specimen-specific FE models as well as trabecular bone mineral density (tBMD), integral bone mineral density (iBMD), and MOS measures, such as axial and bending rigidities. Vertebral strength in anterior bending was poorly to moderately predicted by QCT-based BMD and MOS measures (R2 = 0.14-0.22). QCT-based FE models were better strength predictors (R2 = 0.34-0.40); however, their predictive performance was not statistically different from MOS bending rigidity (P > 0.05). Our results suggest that the poor clinical performance of noninvasive structural measures may be due to their inability to predict vertebral strength under bending loads. While their performance was not statistically better than MOS bending rigidities, QCT-based FE models were moderate predictors of both compressive and bending loads at failure, suggesting that this technique has the potential for strength prediction under nonuniform loads. The current FE modeling strategy is insufficient, however, and significant modifications must be made to better mimic whole bone elastic and inelastic material behavior.
NASA Astrophysics Data System (ADS)
Lu, Jianbo; Xi, Yugeng; Li, Dewei; Xu, Yuli; Gan, Zhongxue
2018-01-01
A common objective of model predictive control (MPC) design is a large initial feasible region, low online computational burden, and satisfactory control performance of the resulting algorithm. It is well known that interpolation-based MPC can achieve a favourable trade-off among these different aspects. However, the existing results are usually based on fixed prediction scenarios, which inevitably limits the performance of the obtained algorithms. Thus, by replacing the fixed prediction scenarios with time-varying multi-step prediction scenarios, this paper provides new insight into improving existing MPC designs. The adopted control law is a combination of predetermined multi-step feedback control laws, based on which two MPC algorithms with guaranteed recursive feasibility and asymptotic stability are presented. The efficacy of the proposed algorithms is illustrated by a numerical example.
Teklehaimanot, Hailay D; Schwartz, Joel; Teklehaimanot, Awash; Lipsitch, Marc
2004-11-19
Timely and accurate information about the onset of malaria epidemics is essential for effective control activities in epidemic-prone regions. Early warning methods that provide earlier alerts (usually by the use of weather variables) may permit control measures to interrupt transmission earlier in the epidemic, perhaps at the expense of some level of accuracy. Expected case numbers were modeled using a Poisson regression with lagged weather factors in a 4th-degree polynomial distributed lag model. For each week, the numbers of malaria cases were predicted using coefficients obtained using all years except that for which the prediction was being made. The effectiveness of alerts generated by the prediction system was compared against that of alerts based on observed cases. The usefulness of the prediction system was evaluated in cold and hot districts. The system predicts the overall pattern of cases well, yet underestimates the height of the largest peaks. Relative to alerts triggered by observed cases, the alerts triggered by the predicted number of cases performed slightly worse, within 5% of the detection system. The prediction-based alerts were able to prevent 10-25% more cases at a given sensitivity in cold districts than in hot ones. The prediction of malaria cases using lagged weather performed well in identifying periods of increased malaria cases. Weather-derived predictions identified epidemics with reasonable accuracy and better timeliness than early detection systems; therefore, the prediction of malarial epidemics using weather is a plausible alternative to early detection systems.
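The 4th-degree polynomial distributed lag structure used above constrains the coefficients of the lagged weather terms to lie on a polynomial in the lag index, so L+1 free lag weights collapse to d+1 basis weights (the Almon lag idea). A minimal, illustrative sketch of building such a basis for one weather series:

```python
def poly_lag_basis(x, max_lag, degree=4):
    """Polynomial (Almon) distributed-lag design matrix for series x.
    Each row t holds sum_{lag} lag**d * x[t-lag] for d = 0..degree, so a
    regression on these columns implies lag weights beta(lag) that
    follow a degree-`degree` polynomial in the lag index."""
    n = len(x)
    rows = []
    for t in range(max_lag, n):
        rows.append([sum((lag ** d) * x[t - lag] for lag in range(max_lag + 1))
                     for d in range(degree + 1)])
    return rows
```

These columns would then enter the Poisson regression in place of the raw lagged weather variables; the weekly leave-one-year-out scheme refits the coefficients with the prediction year excluded.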
Kawabata, Takeshi; Nakamura, Haruki
2014-07-28
A protein-bound conformation of a target molecule can be predicted by aligning the target molecule on a reference molecule obtained from the 3D structure of a compound-protein complex. This strategy is called "similarity-based docking". For this purpose, we develop the flexible alignment program fkcombu, which aligns the target molecule based on atomic correspondences with the reference molecule. The correspondences are obtained by the maximum common substructure (MCS) of 2D chemical structures, using our program kcombu. The prediction performance was evaluated using many target-reference pairs of superimposed ligand 3D structures on the same protein in the PDB, with different ranges of chemical similarity. The details of the atomic correspondence largely affected prediction success. We found that topologically constrained disconnected MCS (TD-MCS) with the simple element-based atomic classification provides the best prediction. Adding a potential energy term for crashes (steric clashes) with the receptor protein improved the performance. We also found that the RMSD between the predicted and correct target conformations significantly correlates with the chemical similarity between target and reference molecules. Generally speaking, if the reference and target compounds have more than 70% chemical similarity, then the average RMSD of 3D conformations is <2.0 Å. We compared the performance with a rigid-body molecular alignment program based on volume-overlap scores (ShaEP). Our MCS-based flexible alignment program performed better than the rigid-body alignment program, especially when the target and reference molecules were sufficiently similar.
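The RMSD criterion used above to score predicted conformations is the root-mean-square deviation over matched atom positions, which can be sketched directly:

```python
def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    matched (x, y, z) atom coordinates, in the same units as the input
    (angstroms for the evaluation described in the abstract)."""
    n = len(coords_a)
    s = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
            for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return (s / n) ** 0.5
```

Note that in the similarity-based docking setting no extra superposition is applied before measuring: both conformations are already in the receptor's frame, so RMSD directly reflects how well the predicted pose matches the crystallographic one.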
NASA Technical Reports Server (NTRS)
Trejo, Leonard J.; Shensa, Mark J.; Remington, Roger W. (Technical Monitor)
1998-01-01
This report describes the development and evaluation of mathematical models for predicting human performance from discrete wavelet transforms (DWT) of event-related potentials (ERP) elicited by task-relevant stimuli. The DWT was compared to principal components analysis (PCA) for representation of ERPs in linear regression and neural network models developed to predict a composite measure of human signal detection performance. Linear regression models based on coefficients of the decimated DWT predicted signal detection performance with half as many free parameters as comparable models based on PCA scores. In addition, the DWT-based models were more resistant to model degradation due to over-fitting than PCA-based models. Feed-forward neural networks were trained using the backpropagation algorithm to predict signal detection performance based on raw ERPs, PCA scores, or high-power coefficients of the DWT. Neural networks based on high-power DWT coefficients trained with fewer iterations, generalized to new data better, and were more resistant to overfitting than networks based on raw ERPs. Networks based on PCA scores did not generalize to new data as well as either the DWT network or the raw ERP network. The results show that wavelet expansions represent the ERP efficiently and extract behaviorally important features for use in linear regression or neural network models of human performance. The efficiency of the DWT is discussed in terms of its decorrelation and energy compaction properties. In addition, the DWT models provided evidence that a pattern of low-frequency activity (1 to 3.5 Hz) occurring at specific times and scalp locations is a reliable correlate of human signal detection performance.
Ligand Binding Site Detection by Local Structure Alignment and Its Performance Complementarity
Lee, Hui Sun; Im, Wonpil
2013-01-01
Accurate determination of potential ligand binding sites (BS) is a key step for protein function characterization and structure-based drug design. Despite promising results of template-based BS prediction methods using global structure alignment (GSA), there is room to improve the performance by properly incorporating local structure alignment (LSA) because BS are local structures and often similar for proteins with dissimilar global folds. We present a template-based ligand BS prediction method using G-LoSA, our LSA tool. A large benchmark set validation shows that G-LoSA predicts drug-like ligands’ positions in single-chain protein targets more precisely than TM-align, a GSA-based method, while the overall success rate of TM-align is better. G-LoSA is particularly efficient for accurate detection of local structures conserved across proteins with diverse global topologies. Recognizing the performance complementarity of G-LoSA to TM-align and a non-template geometry-based method, fpocket, a robust consensus scoring method, CMCS-BSP (Complementary Methods and Consensus Scoring for ligand Binding Site Prediction), is developed and shows improvement on prediction accuracy. The G-LoSA source code is freely available at http://im.bioinformatics.ku.edu/GLoSA. PMID:23957286
Using string invariants for prediction searching for optimal parameters
NASA Astrophysics Data System (ADS)
Bundzel, Marek; Kasanický, Tomáš; Pinčák, Richard
2016-02-01
We have developed a novel prediction method based on string invariants. The method does not require learning but a small set of parameters must be set to achieve optimal performance. We have implemented an evolutionary algorithm for the parametric optimization. We have tested the performance of the method on artificial and real world data and compared the performance to statistical methods and to a number of artificial intelligence methods. We have used data and the results of a prediction competition as a benchmark. The results show that the method performs well in single step prediction but the method's performance for multiple step prediction needs to be improved. The method works well for a wide range of parameters.
Predicting reading and mathematics from neural activity for feedback learning.
Peters, Sabine; Van der Meulen, Mara; Zanolie, Kiki; Crone, Eveline A
2017-01-01
Although many studies use feedback learning paradigms to study the process of learning in laboratory settings, little is known about their relevance for real-world learning settings such as school. In a large developmental sample (N = 228, 8-25 years), we investigated whether performance and neural activity during a feedback learning task predicted reading and mathematics performance 2 years later. The results indicated that feedback learning performance predicted both reading and mathematics performance. Activity during feedback learning in left superior dorsolateral prefrontal cortex (DLPFC) predicted reading performance, whereas activity in presupplementary motor area/anterior cingulate cortex (pre-SMA/ACC) predicted mathematical performance. Moreover, left superior DLPFC and pre-SMA/ACC activity predicted unique variance in reading and mathematics ability over behavioral testing of feedback learning performance alone. These results provide valuable insights into the relationship between laboratory-based learning tasks and learning in school settings, and the value of neural assessments for prediction of school performance over behavioral testing alone. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Salvaggio, C N; Forman, E J; Garnsey, H M; Treff, N R; Scott, R T
2014-09-01
Polar body biopsy represents one possible solution to performing comprehensive chromosome screening (CCS). This study adds to what is known about the predictive value of polar body-based testing for the genetic status of the resulting embryo, but more importantly, provides the first evaluation of the predictive value for actual clinical outcomes after embryo transfer. SNP array was performed on the first polar body, the second polar body, and either a blastomere or trophectoderm biopsy, or the entire arrested embryo. Concordance of the polar body-based prediction with the observed diagnoses in the embryos was assessed. In addition, the predictive value of the polar body-based diagnosis for the specific clinical outcome of transferred embryos was evaluated through the use of DNA fingerprinting to track individual embryos. There were 459 embryos analyzed from 96 patients with a mean maternal age of 35.3 years. The polar body-based predictive value for the embryo-based diagnosis was 70.3%. The blastocyst implantation predictive value of a euploid trophectoderm was higher than that of euploid polar bodies (51% versus 40%). The cleavage stage embryo implantation predictive value of a euploid blastomere was also higher than that of euploid polar bodies (31% versus 22%). Polar body-based aneuploidy screening results were less predictive of actual clinical outcomes than direct embryo assessment and may not be adequate to improve sustained implantation rates. In nearly one-third of cases the polar body-based analysis failed to predict the ploidy of the embryo. This imprecision may hinder efforts for polar body-based CCS to improve IVF clinical outcomes.
Auinger, Hans-Jürgen; Schönleben, Manfred; Lehermeier, Christina; Schmidt, Malthe; Korzun, Viktor; Geiger, Hartwig H; Piepho, Hans-Peter; Gordillo, Andres; Wilde, Peer; Bauer, Eva; Schön, Chris-Carolin
2016-11-01
Genomic prediction accuracy can be significantly increased by model calibration across multiple breeding cycles as long as selection cycles are connected by common ancestors. In hybrid rye breeding, application of genome-based prediction is expected to increase selection gain because of long selection cycles in population improvement and development of hybrid components. Essentially two prediction scenarios arise: (1) prediction of the genetic value of lines from the same breeding cycle in which model training is performed and (2) prediction of lines from subsequent cycles. It is the latter from which a reduction in cycle length and consequently the strongest impact on selection gain is expected. We empirically investigated genome-based prediction of grain yield, plant height and thousand kernel weight within and across four selection cycles of a hybrid rye breeding program. Prediction performance was assessed using genomic and pedigree-based best linear unbiased prediction (GBLUP and PBLUP). A total of 1040 S2 lines were genotyped with 16k SNPs and each year testcrosses of 260 S2 lines were phenotyped in seven or eight locations. The performance gap between GBLUP and PBLUP increased significantly for all traits when model calibration was performed on aggregated data from several cycles. Prediction accuracies obtained from cross-validation were in the order of 0.70 for all traits when data from all cycles (N_CS = 832) were used for model training and exceeded within-cycle accuracies in all cases. As long as selection cycles are connected by a sufficient number of common ancestors and prediction accuracy has not reached a plateau when increasing sample size, aggregating data from several preceding cycles is recommended for predicting genetic values in subsequent cycles despite decreasing relatedness over time.
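A minimal sketch of the GBLUP idea: build a VanRaden-style genomic relationship matrix from marker genotypes and shrink phenotypes through the mixed-model equations. The sample sizes, marker counts, and variance components are synthetic assumptions, and the variance ratio is taken as known here, whereas in practice it would be estimated (e.g., by REML):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 120, 400                                     # lines, SNP markers
M = rng.binomial(2, 0.3, size=(n, m)).astype(float) # 0/1/2 genotype codes
p = M.mean(axis=0) / 2                              # allele frequencies
Z = M - 2 * p                                       # centered genotypes
G = Z @ Z.T / (2 * (p * (1 - p)).sum())             # VanRaden relationship matrix

u_true = Z @ rng.normal(scale=0.05, size=m)         # additive genetic values
y = u_true + rng.normal(scale=0.5, size=n)          # testcross phenotypes

# GBLUP shrinkage: u_hat = G (G + lambda*I)^-1 (y - mean); the variance
# ratio lambda is treated as known for this sketch
lam = 0.25 / u_true.var()
u_hat = G @ np.linalg.solve(G + lam * np.eye(n), y - y.mean())
acc = np.corrcoef(u_hat, u_true)[0, 1]
print("within-sample prediction accuracy:", round(acc, 2))
```

PBLUP has the same form with a pedigree-derived relationship matrix in place of G; the abstract's point is that marker-based G keeps tracking relatedness across cycles where pedigree information saturates.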
On the use and the performance of software reliability growth models
NASA Technical Reports Server (NTRS)
Keiller, Peter A.; Miller, Douglas R.
1991-01-01
We address the problem of predicting future failures for a piece of software. The number of failures occurring during a finite future time interval is predicted from the number of failures observed during an initial period of usage by using software reliability growth models. Two different methods for using the models are considered: straightforward use of individual models, and dynamic selection among models based on goodness-of-fit and quality-of-prediction criteria. Performance is judged by the relative error of the predicted number of failures over future finite time intervals relative to the number of failures eventually observed during the intervals. Six of the former models and eight of the latter are evaluated, based on their performance on twenty data sets. Many open questions remain regarding the use and the performance of software reliability growth models.
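The evaluation criterion above, relative error of a predicted failure count over a future interval, can be sketched with a toy failure history. The homogeneous Poisson process and the naive rate-extrapolation "model" are illustrative stand-ins, not the growth models evaluated in the report:

```python
import numpy as np

def relative_error(predicted, observed):
    """Relative error of a predicted failure count for a future interval."""
    return (predicted - observed) / observed

# toy failure history: a homogeneous Poisson process, ~2 failures per day
rng = np.random.default_rng(3)
times = np.cumsum(rng.exponential(scale=0.5, size=200))

t_obs, t_future = 50.0, 60.0
n_obs = int((times <= t_obs).sum())
n_future = int(((times > t_obs) & (times <= t_future)).sum())

# naive "growth model": extrapolate the failure rate seen so far
predicted = n_obs / t_obs * (t_future - t_obs)
print("relative error:", round(relative_error(predicted, n_future), 2))
```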
Lyyra, Tiina-Mari; Heikkinen, Eino; Lyyra, Anna-Liisa; Jylhä, Marja
2006-01-01
It is well established that self-rated health (SRH) predicts mortality even when other indicators of health status are taken into account. It has been suggested that SRH measures a wide array of mortality-related physiological and pathological characteristics not captured by the covariates included in the analyses. Our aim was to test this hypothesis by examining the predictive value of SRH on mortality controlling for different measurements of body structure, performance-based functioning and diagnosed diseases with a population-based, prospective study over an 18-year follow-up. Subjects consisted of 257 male residents of the city of Jyväskylä, central Finland, aged 51-55 and 71-75 years. Among the 71-75-year-olds, the association between SRH and mortality was weaker over the longer compared to the shorter follow-up period. In the multivariate Cox regression models with an 18-year follow-up time for middle-aged men and a 10-year follow-up time for older men, SRH predicted mortality even when the anthropometrics, clinical chemistry and performance-based measures of functioning were controlled for, but not when the number of chronic diseases was included. Although our results confirm the hypothesis that the predictive value of SRH can be explained by diagnosed diseases, its predictive power remained when the clinical and performance-based measures of health and functioning were controlled for.
2005-09-01
[Table-of-contents and figure-caption residue from the report: B. Sleep Architecture; 1. Circadian Rhythm and Human Sleep Drive; Figure 1, body temperature (Van Dongen & Dinges, 2000); Figure 2, EEG of Human Brain Activity During Sleep.] The report concerns predicted levels of human performance based on circadian rhythms and the amount and quality of sleep, combined with cognitive performance predictions.
Then, Amy Y.; Hoenig, John M; Hall, Norman G.; Hewitt, David A.
2015-01-01
Many methods have been developed in the last 70 years to predict the natural mortality rate, M, of a stock based on empirical evidence from comparative life history studies. These indirect or empirical methods are used in most stock assessments to (i) obtain estimates of M in the absence of direct information, (ii) check on the reasonableness of a direct estimate of M, (iii) examine the range of plausible M estimates for the stock under consideration, and (iv) define prior distributions for Bayesian analyses. The two most cited empirical methods have appeared in the literature over 2500 times to date. Despite the importance of these methods, there is no consensus in the literature on how well these methods work in terms of prediction error or how their performance may be ranked. We evaluate estimators based on various combinations of maximum age (tmax), growth parameters, and water temperature by seeing how well they reproduce >200 independent, direct estimates of M. We use tenfold cross-validation to estimate the prediction error of the estimators and to rank their performance. With updated and carefully reviewed data, we conclude that a tmax-based estimator performs the best among all estimators evaluated. The tmax-based estimators in turn perform better than the Alverson–Carney method based on tmax and the von Bertalanffy K coefficient, Pauly’s method based on growth parameters and water temperature, and methods based just on K. It is possible to combine two independent methods by computing a weighted mean but the improvement over the tmax-based methods is slight. Based on cross-validation prediction error, model residual patterns, model parsimony, and biological considerations, we recommend the use of a tmax-based estimator (M = 4.899 tmax^(-0.916), prediction error = 0.32) when possible and a growth-based method (M = 4.118 K^(0.73) L∞^(-0.33), prediction error = 0.6, length in cm) otherwise.
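The two recommended estimators are simple enough to apply directly. The input values below (tmax = 20 years, K = 0.2/yr, L∞ = 60 cm) are hypothetical example stocks, not data from the study:

```python
def m_from_tmax(tmax):
    """tmax-based estimator: M = 4.899 * tmax**(-0.916)."""
    return 4.899 * tmax ** -0.916

def m_from_growth(K, L_inf):
    """Growth-based estimator: M = 4.118 * K**0.73 * L_inf**(-0.33), length in cm."""
    return 4.118 * K ** 0.73 * L_inf ** -0.33

print(round(m_from_tmax(20), 3))         # → 0.315
print(round(m_from_growth(0.2, 60), 3))  # → 0.329
```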
ERIC Educational Resources Information Center
Hsiao, Hsien-Sheng; Chen, Jyun-Chen; Hong, Jon-Chao; Chen, Po-Hsi; Lu, Chow-Chin; Chen, Sherry Y.
2017-01-01
A five-stage prediction-observation-explanation inquiry-based learning (FPOEIL) model was developed to improve students' scientific learning performance. In order to intensify the science learning effect, the repertory grid technology-assisted learning (RGTL) approach and the collaborative learning (CL) approach were utilized. A quasi-experimental…
Predicting preference-based SF-6D index scores from the SF-8 health survey.
Wang, P; Fu, A Z; Wee, H L; Lee, J; Tai, E S; Thumboo, J; Luo, N
2013-09-01
To develop and test functions for predicting the preference-based SF-6D index scores from the SF-8 health survey. This study was a secondary analysis of data collected in a population health survey in which respondents (n = 7,529) completed both the SF-36 and the SF-8 questionnaires. We examined seven ordinary least-square estimators for their performance in predicting SF-6D scores from the SF-8 at both the individual and the group levels. In general, all functions performed similarly well in predicting SF-6D scores, and the predictions at the group level were better than predictions at the individual level. At the individual level, 42.5-51.5% of prediction errors were smaller than the minimally important difference (MID) of the SF-6D scores, depending on the function specifications, while almost all prediction errors of the tested functions were smaller than the MID of SF-6D at the group level. At both individual and group levels, the tested functions predicted lower than actual scores at the higher end of the SF-6D scale. Our study developed functions to generate preference-based SF-6D index scores from the SF-8 health survey, the first of its kind. Further research is needed to evaluate the performance and validity of the prediction functions.
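The crosswalk approach described above amounts to ordinary least squares from SF-8 responses to SF-6D utilities, with prediction errors judged against the SF-6D minimally important difference. A sketch on synthetic data; the item scores, coefficients, and the MID value of 0.041 (a commonly cited figure) are assumptions, not this study's estimates:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
sf8 = rng.normal(size=(n, 8))                 # stand-in for SF-8 item scores
beta_true = rng.normal(scale=0.05, size=8)
sf6d = 0.6 + sf8 @ beta_true + rng.normal(scale=0.03, size=n)  # utility scores

X = np.column_stack([np.ones(n), sf8])
beta, *_ = np.linalg.lstsq(X, sf6d, rcond=None)   # OLS crosswalk function
pred = X @ beta

MID = 0.041  # assumed SF-6D minimally important difference
within_mid = np.abs(pred - sf6d) < MID
print("individual-level errors within MID:", within_mid.mean())
print("group-level absolute error:", abs(pred.mean() - sf6d.mean()))
```

With an intercept in the model, OLS residuals average to zero on the estimation sample, which is why group-level predictions look much better than individual-level ones, mirroring the study's finding.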
Predictive control and estimation algorithms for the NASA/JPL 70-meter antennas
NASA Technical Reports Server (NTRS)
Gawronski, W.
1991-01-01
A modified output prediction procedure and a new controller design are presented based on the predictive control law. Also, a new predictive estimator is developed to complement the controller and to enhance system performance. The predictive controller is designed and applied to the tracking control of the Deep Space Network 70 m antennas. Simulation results show significant improvement in tracking performance over the linear quadratic controller and estimator presently in use.
NASA Astrophysics Data System (ADS)
Qiu, Peng; D'Souza, Warren D.; McAvoy, Thomas J.; Liu, K. J. Ray
2007-09-01
Tumor motion induced by respiration presents a challenge to the reliable delivery of conformal radiation treatments. Real-time motion compensation represents the technologically most challenging clinical solution but has the potential to overcome the limitations of existing methods. The performance of a real-time couch-based motion compensation system is mainly dependent on two aspects: the ability to infer the internal anatomical position and the performance of the feedback control system. In this paper, we propose two novel methods for the two aspects respectively, and then combine the proposed methods into one system. To accurately estimate the internal tumor position, we present partial-least squares (PLS) regression to predict the position of the diaphragm using skin-based motion surrogates. Four radio-opaque markers were placed on the abdomen of patients who underwent fluoroscopic imaging of the diaphragm. The coordinates of the markers served as input variables and the position of the diaphragm served as the output variable. PLS resulted in lower prediction errors compared with standard multiple linear regression (MLR). The performance of the feedback control system depends on the system dynamics and dead time (delay between the initiation and execution of the control action). While the dynamics of the system can be inverted in a feedback control system, the dead time cannot be inverted. To overcome the dead time of the system, we propose a predictive feedback control system by incorporating forward prediction using least-mean-square (LMS) and recursive least square (RLS) filtering into the couch-based control system. Motion data were obtained using a skin-based marker. The proposed predictive feedback control system was benchmarked against pure feedback control (no forward prediction) and resulted in a significant performance gain. 
Finally, we combined the PLS inference model and the predictive feedback control to evaluate the overall performance of the feedback control system. Our results show that, with the tumor motion unknown but inferred by skin-based markers through the PLS model, the predictive feedback control system was able to effectively compensate intra-fraction motion.
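The forward-prediction component can be sketched with a least-mean-squares (LMS) filter predicting one step ahead on a sinusoidal breathing surrogate. The filter order, step size, and signal are illustrative choices, not the parameters used in the paper:

```python
import numpy as np

def lms_predict(signal, order=8, mu=0.05, horizon=1):
    """Forward prediction with a least-mean-squares adaptive filter."""
    w = np.zeros(order)
    preds = np.zeros_like(signal)
    for t in range(order, len(signal) - horizon):
        x = signal[t - order:t][::-1]        # most recent samples first
        preds[t + horizon] = w @ x           # predict ahead of the dead time
        err = signal[t + horizon] - w @ x    # error, known once the sample arrives
        w += mu * err * x                    # LMS weight update
    return preds

t = np.arange(0, 60, 0.1)                    # 0.1 s sampling
breathing = np.sin(2 * np.pi * t / 4.0)      # ~4 s respiratory period surrogate
p = lms_predict(breathing)
rmse = np.sqrt(np.mean((p[200:] - breathing[200:]) ** 2))
print("steady-state prediction RMSE:", round(rmse, 4))
```

The prediction horizon plays the role of the couch system's dead time: the controller acts on the predicted, rather than the currently measured, surrogate position.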
A Measurement and Simulation Based Methodology for Cache Performance Modeling and Tuning
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
We present a cache performance modeling methodology that facilitates the tuning of uniprocessor cache performance for applications executing on shared memory multiprocessors by accurately predicting the effects of source code level modifications. Measurements on a single processor are initially used for identifying parts of code where cache utilization improvements may significantly impact the overall performance. Cache simulation based on trace-driven techniques can be carried out without gathering detailed address traces. Minimal runtime information for modeling cache performance of a selected code block includes: base virtual addresses of arrays, virtual addresses of variables, and loop bounds for that code block. The rest of the information is obtained from the source code. We show that the cache performance predictions are as reliable as those obtained through trace-driven simulations. This technique is particularly helpful for exploring various "what-if" scenarios regarding the cache performance impact of alternative code structures. We explain and validate this methodology using a simple matrix-matrix multiplication program. We then apply this methodology to predict and tune the cache performance of two realistic scientific applications taken from the Computational Fluid Dynamics (CFD) domain.
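Trace-driven simulation of the matrix-multiply example can be sketched as follows. The cache geometry, base addresses, and direct-mapped organization are arbitrary assumptions for illustration, not the configuration studied in the paper:

```python
LINE = 64     # bytes per cache line
SETS = 256    # direct-mapped, 16 KB total
ELEM = 8      # 8-byte floating-point elements

def simulate(trace):
    """Count misses of a direct-mapped cache over an address trace."""
    tags = [None] * SETS
    misses = 0
    for addr in trace:
        line = addr // LINE
        s, tag = line % SETS, line // SETS
        if tags[s] != tag:
            tags[s] = tag
            misses += 1
    return misses

def mm_trace(n, a_base=0, b_base=1 << 20, c_base=2 << 20):
    """Address stream of naive C[i][j] += A[i][k] * B[k][j]."""
    for i in range(n):
        for j in range(n):
            for k in range(n):
                yield a_base + (i * n + k) * ELEM
                yield b_base + (k * n + j) * ELEM
                yield c_base + (i * n + j) * ELEM

n = 32
refs = 3 * n ** 3
misses = simulate(mm_trace(n))
print(f"{misses} misses / {refs} references = {misses / refs:.2%}")
```

Such a model answers "what-if" questions cheaply: swapping the loop order or blocking the loops changes only `mm_trace`, and the simulated miss count shows the effect before any code is rewritten.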
RRCRank: a fusion method using rank strategy for residue-residue contact prediction.
Jing, Xiaoyang; Dong, Qiwen; Lu, Ruqian
2017-09-02
In structural biology, protein residue-residue contacts play a crucial role in protein structure prediction. Researchers have found that predicted residue-residue contacts can effectively constrain the conformational search space, which is significant for de novo protein structure prediction. Over the last few decades, various methods have been developed to predict residue-residue contacts; in particular, strong performance has been achieved in recent years using fusion methods. In this work, a novel fusion method based on a ranking strategy is proposed to predict contacts. Unlike traditional regression or classification strategies, the contact prediction task is treated as a ranking task: two kinds of features are extracted from correlated-mutation methods and ensemble machine-learning classifiers, and the proposed method then uses a learning-to-rank algorithm to predict the contact probability of each residue pair. First, we perform two benchmark tests for the proposed fusion method (RRCRank) on the CASP11 and CASP12 datasets, respectively. The test results show that the RRCRank method outperforms other well-developed methods, especially for medium and short range contacts. Second, in order to verify the superiority of the ranking strategy, we predict contacts by using the traditional regression and classification strategies based on the same features as the ranking strategy. Compared with these two traditional strategies, the proposed ranking strategy shows better performance for three contact types, in particular for long range contacts. Third, the proposed RRCRank has been compared with several state-of-the-art methods in CASP11 and CASP12. The results show that the RRCRank could achieve comparable prediction precisions and is better than three methods in most assessment metrics.
The learning-to-rank algorithm is introduced to develop a novel rank-based method for the residue-residue contact prediction of proteins, which achieves state-of-the-art performance based on the extensive assessment.
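The ranking formulation can be sketched with a RankNet-style pairwise objective: a residue pair that is in contact should score above one that is not. The features, labels, and hyperparameters below are synthetic, not the RRCRank feature set or learner:

```python
import numpy as np

rng = np.random.default_rng(5)
n_pairs, n_feat = 400, 6
X = rng.normal(size=(n_pairs, n_feat))       # per-residue-pair feature vectors
w_true = rng.normal(size=n_feat)
labels = (X @ w_true + rng.normal(scale=0.5, size=n_pairs)) > 0.8  # "in contact"

# RankNet-style pairwise objective: a contact should score above a non-contact
pos, neg = np.where(labels)[0], np.where(~labels)[0]
w = np.zeros(n_feat)
for _ in range(300):
    i = rng.choice(pos, 64)                  # sampled contact indices
    j = rng.choice(neg, 64)                  # sampled non-contact indices
    margin = np.clip((X[i] - X[j]) @ w, -30, 30)
    g = -1 / (1 + np.exp(margin))            # gradient factor of -log-sigmoid loss
    w -= 0.05 * (g[:, None] * (X[i] - X[j])).mean(axis=0)

order = np.argsort(-(X @ w))                 # rank all pairs by learned score
top_decile = order[: n_pairs // 10]
print("precision in top decile:", labels[top_decile].mean())
```

This mirrors why ranking suits contact prediction: assessment metrics score the top-L predicted pairs, so only the ordering of scores matters, not calibrated probabilities.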
Can air temperature be used to project influences of climate change on stream temperature?
Arismendi, Ivan; Safeeq, Mohammad; Dunham, Jason B.; Johnson, Sherri L.
2014-01-01
Worldwide, lack of data on stream temperature has motivated the use of regression-based statistical models to predict stream temperatures based on more widely available data on air temperatures. Such models have been widely applied to project responses of stream temperatures under climate change, but the performance of these models has not been fully evaluated. To address this knowledge gap, we examined the performance of two widely used linear and nonlinear regression models that predict stream temperatures based on air temperatures. We evaluated model performance and temporal stability of model parameters in a suite of regulated and unregulated streams with 11–44 years of stream temperature data. Although such models may have validity when predicting stream temperatures within the span of time that corresponds to the data used to develop them, model predictions did not transfer well to other time periods. Validation of model predictions of the most recent stream temperatures, based on air temperature–stream temperature relationships from previous time periods, often showed poor performance when compared with observed stream temperatures. Overall, model predictions were less robust in regulated streams, and they frequently failed to detect the coldest and warmest temperatures within all sites. In many cases, the magnitude of errors in these predictions falls within a range that equals or exceeds the magnitude of future projections of climate-related changes in stream temperatures reported for the region we studied (between 0.5 and 3.0 °C by 2080). The limited ability of regression-based statistical models to accurately project stream temperatures over time likely stems from the fact that underlying processes at play, namely the heat budgets of air and water, are distinctive in each medium and vary among localities and through time.
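The failure mode described, calibrating an air-to-stream temperature regression in one period and predicting another, can be illustrated with synthetic data in which the air-to-stream sensitivity drifts over time. The drift rate and noise levels are invented for illustration, not fitted to the study's streams:

```python
import numpy as np

rng = np.random.default_rng(6)
days = np.arange(365 * 6)
air = 10 + 12 * np.sin(2 * np.pi * days / 365) + rng.normal(scale=3, size=days.size)

# synthetic stream temperature whose sensitivity to air temperature drifts,
# mimicking the nonstationarity the study reports (drift rate is invented)
slope = 0.55 + 0.00008 * days
stream = 3 + slope * air + rng.normal(scale=0.8, size=days.size)

train, test = days < 365 * 3, days >= 365 * 3   # calibrate early, predict late
A = np.column_stack([np.ones(train.sum()), air[train]])
b, *_ = np.linalg.lstsq(A, stream[train], rcond=None)

rmse_train = np.sqrt(np.mean((b[0] + b[1] * air[train] - stream[train]) ** 2))
rmse_test = np.sqrt(np.mean((b[0] + b[1] * air[test] - stream[test]) ** 2))
print("RMSE train:", round(rmse_train, 2), "RMSE transfer:", round(rmse_test, 2))
```

The transfer error exceeds the calibration error even though the fitted relationship looks excellent in-sample, which is the study's central caution about projecting such models forward under climate change.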
Proactive Supply Chain Performance Management with Predictive Analytics
Stefanovic, Nenad
2014-01-01
Today's business climate requires supply chains to be proactive rather than reactive, which demands a new approach that incorporates data mining predictive analytics. This paper introduces a predictive supply chain performance management model which combines process modelling, performance measurement, data mining models, and web portal technologies into a unique model. It presents the supply chain modelling approach based on the specialized metamodel which allows modelling of any supply chain configuration and at different level of details. The paper also presents the supply chain semantic business intelligence (BI) model which encapsulates data sources and business rules and includes the data warehouse model with specific supply chain dimensions, measures, and KPIs (key performance indicators). Next, the paper describes two generic approaches for designing the KPI predictive data mining models based on the BI semantic model. KPI predictive models were trained and tested with a real-world data set. Finally, a specialized analytical web portal which offers collaborative performance monitoring and decision making is presented. The results show that these models give very accurate KPI projections and provide valuable insights into newly emerging trends, opportunities, and problems. This should lead to more intelligent, predictive, and responsive supply chains capable of adapting to future business environment. PMID:25386605
Kranz, Michael B; Baniqued, Pauline L; Voss, Michelle W; Lee, Hyunkyu; Kramer, Arthur F
2017-01-01
The variety and availability of casual video games presents an exciting opportunity for applications such as cognitive training. Casual games have been associated with fluid abilities such as working memory (WM) and reasoning, but the importance of these cognitive constructs in predicting performance may change across extended gameplay and vary with game structure. The current investigation examined the relationship between cognitive abilities and casual game performance over time by analyzing first and final session performance over 4-5 weeks of game play. We focused on two groups of subjects who played different types of casual games previously shown to relate to WM and reasoning when played for a single session: (1) puzzle-based games played adaptively across sessions and (2) speeded switching games played non-adaptively across sessions. Reasoning uniquely predicted first session casual game scores for both groups and accounted for much of the relationship with WM. Furthermore, over time, WM became uniquely important for predicting casual game performance for the puzzle-based adaptive games but not for the speeded switching non-adaptive games. These results extend the burgeoning literature on cognitive abilities involved in video games by showing differential relationships of fluid abilities across different game types and extended play. More broadly, the current study illustrates the usefulness of using multiple cognitive measures in predicting performance, and provides potential directions for game-based cognitive training research.
Johansen, Kirsten L; Dalrymple, Lorien S; Delgado, Cynthia; Kaysen, George A; Kornak, John; Grimes, Barbara; Chertow, Glenn M
2014-10-01
A well-accepted definition of frailty includes measurements of physical performance, which may limit its clinical utility. In a cross-sectional study, we compared prevalence and patient characteristics based on a frailty definition that uses self-reported function to the classic performance-based definition and developed a modified self-report-based definition. Prevalent adult patients receiving hemodialysis in 14 centers around San Francisco and Atlanta in 2009-2011. Self-report-based frailty definition in which a score lower than 75 on the Physical Function scale of the 36-Item Short Form Health Survey (SF-36) was substituted for gait speed and grip strength in the classic definition; modified self-report definition with optimized Physical Function score cutoff points derived in a development (one-half) cohort and validated in the other half. Performance-based frailty defined as 3 of the following: weight loss, weakness, exhaustion, low physical activity, and slow gait speed. 387 (53%) patients were frail based on self-reported function, of whom 209 (29% of the cohort) met the performance-based definition. Only 23 (3%) met the performance-based definition of frailty only. The self-report definition had 90% sensitivity, 64% specificity, 54% positive predictive value, 93% negative predictive value, and 72.5% overall accuracy. Intracellular water per kilogram of body weight and serum albumin, prealbumin, and creatinine levels were highest among nonfrail individuals, intermediate among those who were frail by self-report, and lowest among those who also were frail by performance. Age, percentage of body fat, and C-reactive protein level followed an opposite pattern. The modified self-report definition had better accuracy (84%; 95% CI, 79%-89%) and superior specificity (88%) and positive predictive value (67%). Our study did not address prediction of outcomes. 
Patients who meet the self-report-based but not the performance-based definition of frailty may represent an intermediate phenotype. A modified self-report definition can improve the accuracy of a questionnaire-based method of defining frailty. Published by Elsevier Inc.
Driving and Low Vision: Validity of Assessments for Predicting Performance of Drivers
ERIC Educational Resources Information Center
Strong, J. Graham; Jutai, Jeffrey W.; Russell-Minda, Elizabeth; Evans, Mal
2008-01-01
The authors conducted a systematic review to examine whether vision-related assessments can predict the driving performance of individuals who have low vision. The results indicate that measures of visual field, contrast sensitivity, cognitive and attention-based tests, and driver screening tools have variable utility for predicting real-world…
Fryer, Jonathan P; Corcoran, Noreen; George, Brian; Wang, Ed; Darosa, Debra
2012-01-01
While the primary goal of ranking applicants for surgical residency training positions is to identify the candidates who will subsequently perform best as surgical residents, the effectiveness of the ranking process has not been adequately studied. We evaluated our general surgery resident recruitment process between 2001 and 2011 inclusive, to determine whether our recruitment ranking parameters effectively predicted subsequent resident performance. We identified 3 candidate ranking parameters (United States Medical Licensing Examination [USMLE] Step 1 score, unadjusted ranking score [URS], and final adjusted ranking [FAR]) and 4 resident performance parameters (American Board of Surgery In-Training Examination [ABSITE] score, PGY1 resident evaluation grade [REG], overall REG, and independent faculty rating ranking [IFRR]), and assessed whether the former were predictive of the latter. Analyses utilized the Spearman correlation coefficient. We found that the URS, which is based on objective and criterion-based parameters, was a better predictor of subsequent performance than the FAR, which is a modification of the URS based on subsequent determinations of the resident selection committee. USMLE score was a reliable predictor of ABSITE scores only. However, when we compared our worst resident performances with the performances of the other residents in this evaluation, the data did not produce convincing evidence that poor resident performances could be reliably predicted by any of the recruitment ranking parameters. Finally, stratifying candidates based on their rank range did not effectively define a ranking cutoff beyond which resident performance would drop off. Based on these findings, we suggest that surgery programs may be better served by utilizing a more structured resident ranking process and that subsequent adjustments to the rank list generated by this process should be undertaken with caution. Copyright © 2012 Association of Program Directors in Surgery.
Published by Elsevier Inc. All rights reserved.
Automated Cognitive Health Assessment From Smart Home-Based Behavior Data.
Dawadi, Prafulla Nath; Cook, Diane Joyce; Schmitter-Edgecombe, Maureen
2016-07-01
Smart home technologies offer potential benefits for assisting clinicians by automating health monitoring and well-being assessment. In this paper, we examine the actual benefits of smart home-based analysis by monitoring daily behavior in the home and predicting clinical scores of the residents. To accomplish this goal, we propose a clinical assessment using activity behavior (CAAB) approach to model a smart home resident's daily behavior and predict the corresponding clinical scores. CAAB uses statistical features that describe characteristics of a resident's daily activity performance to train machine learning algorithms that predict the clinical scores. We evaluate the performance of CAAB utilizing smart home sensor data collected from 18 smart homes over two years. We obtain a statistically significant correlation (r = 0.72) between CAAB-predicted and clinician-provided cognitive scores and a statistically significant correlation (r = 0.45) between CAAB-predicted and clinician-provided mobility scores. These prediction results suggest that it is feasible to predict clinical scores using smart home sensor data and learning-based data analysis.
Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.
2011-01-01
We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for the capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs was validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies that nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875
Drug-target interaction prediction from PSSM based evolutionary information.
Mousavian, Zaynab; Khakabimamaghani, Sahand; Kavousi, Kaveh; Masoudi-Nejad, Ali
2016-01-01
The labor-intensive and expensive experimental process of drug-target interaction prediction has motivated many researchers to focus on in silico prediction, which provides helpful information to support experimental interaction data. They have therefore proposed several computational approaches for discovering new drug-target interactions. Several learning-based methods have been developed, and these can be categorized into two main groups: similarity-based and feature-based. In this paper, we first use the bi-gram features extracted from the Position Specific Scoring Matrix (PSSM) of proteins in predicting drug-target interactions. Our results demonstrate the high-confidence prediction ability of the Bigram-PSSM model in terms of several performance indicators, specifically for enzymes and ion channels. Moreover, we investigate the impact of the negative selection strategy on the performance of the prediction, which is not widely taken into account in other relevant studies. This is important, as the number of non-interacting drug-target pairs is usually extremely large in comparison with the number of interacting ones in existing drug-target interaction data. An interesting observation is that different levels of performance reduction were attained for four datasets when we changed the sampling method from random sampling to balanced sampling. Copyright © 2015 Elsevier Inc. All rights reserved.
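The bi-gram PSSM transformation summarized in this abstract can be sketched as follows. This is an illustrative sketch only, following the common formulation of bi-gram features over a PSSM (products of scores at consecutive residue positions, accumulated into a 20 x 20 matrix); the function name and the length normalization are assumptions, not code from the paper.

```python
import numpy as np

def pssm_bigram_features(pssm):
    """Bi-gram features from an L x 20 PSSM (hypothetical helper).

    B[i, j] accumulates score(pos, i) * score(pos + 1, j) over all
    consecutive position pairs, yielding a 400-dimensional vector.
    """
    L, A = pssm.shape          # sequence length x 20 amino-acid columns
    assert A == 20
    B = pssm[:-1].T @ pssm[1:]  # 20 x 20 bi-gram matrix
    return (B / (L - 1)).ravel()  # normalize by number of pairs, flatten

# toy example: a random PSSM for a 50-residue protein
feats = pssm_bigram_features(np.random.rand(50, 20))
```

The resulting fixed-length vector can then feed any standard classifier, which is what makes the transformation useful for variable-length protein sequences.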
Yılmaz Isıkhan, Selen; Karabulut, Erdem; Alpar, Celal Reha
2016-01-01
Background/Aim. Evaluating the success of dose prediction based on genetic or clinical data has substantially advanced recently. The aim of this study is to predict various clinical dose values from DNA gene expression datasets using data mining techniques. Materials and Methods. Eleven real gene expression datasets containing dose values were included. First, important genes for dose prediction were selected using iterative sure independence screening. Then, the performances of regression trees (RTs), support vector regression (SVR), RT bagging, SVR bagging, and RT boosting were examined. Results. The results demonstrated that a regression-based feature selection method substantially reduced the number of irrelevant genes from raw datasets. Overall, the best prediction performance in nine of 11 datasets was achieved using SVR; the second most accurate performance was provided using a gradient-boosting machine (GBM). Conclusion. Analysis of various dose values based on microarray gene expression data identified common genes found in our study and the referenced studies. According to our findings, SVR and GBM can be good predictors of dose-gene datasets. Another result of the study was to identify the sample size of n = 25 as a cutoff point for RT bagging to outperform a single RT.
NASA Astrophysics Data System (ADS)
Tsao, Sinchai; Gajawelli, Niharika; Zhou, Jiayu; Shi, Jie; Ye, Jieping; Wang, Yalin; Lepore, Natasha
2014-03-01
Prediction of Alzheimer's disease (AD) progression based on baseline measures allows us to understand disease progression and has implications for decisions concerning treatment strategy. To this end we combine a predictive multi-task machine learning method [1] with a novel MR-based multivariate morphometric surface map of the hippocampus [2] to predict future cognitive scores of patients. Previous work by Zhou et al. [1] has shown that a multi-task learning framework that performs prediction of all future time points (or tasks) simultaneously can be used to encode both sparsity and temporal smoothness. They showed that this can be used in predicting cognitive outcomes of Alzheimer's Disease Neuroimaging Initiative (ADNI) subjects based on FreeSurfer-based baseline MRI features, MMSE score, demographic information, and ApoE status. While volumetric information may hold generalized information on brain status, we hypothesized that hippocampus-specific information may be more useful in predictive modeling of AD. To this end, we applied the multivariate tensor-based morphometry (mTBM) parametric surface analysis method recently developed by Shi et al. [2] to extract features from the hippocampal surface. We show that by combining the power of the multi-task framework with the sensitivity of mTBM features of the hippocampal surface, we are able to significantly improve the prediction of ADAS cognitive scores 6, 12, 24, 36, and 48 months from baseline.
Dissimilarity based Partial Least Squares (DPLS) for genomic prediction from SNPs.
Singh, Priyanka; Engel, Jasper; Jansen, Jeroen; de Haan, Jorn; Buydens, Lutgarde Maria Celina
2016-05-04
Genomic prediction (GP) allows breeders to select plants and animals based on their breeding potential for desirable traits, without lengthy and expensive field trials or progeny testing. We propose the use of Dissimilarity-based Partial Least Squares (DPLS) for GP. As a case study, we use the DPLS approach to predict bacterial wilt (BW) in tomatoes using SNPs as predictors. The DPLS approach was compared with Genomic Best Linear Unbiased Prediction (GBLUP) and single-SNP regression with the SNP as a fixed effect to assess the performance of DPLS. Eight genomic distance measures were used to quantify relationships between the tomato accessions from the SNPs. Subsequently, each of these distance measures was used to predict BW using the DPLS prediction model. The DPLS model was found to be robust to the choice of distance measure; similar prediction performances were obtained for each distance measure. DPLS greatly outperformed the single-SNP regression approach, showing that BW is a comprehensive trait dependent on several loci. Next, the performance of the DPLS model was compared to that of GBLUP. Although GBLUP and DPLS are conceptually very different, the prediction quality (PQ) measured by the DPLS models was similar to the prediction statistics obtained from GBLUP. A considerable advantage of DPLS is that the genotype-phenotype relationship can easily be visualized in a 2-D scatter plot. This so-called score plot provides breeders an insight to select candidates for their future breeding program. DPLS is a highly appropriate method for GP. The model prediction performance was similar to GBLUP and far better than the single-SNP approach. The proposed method can be used in combination with a wide range of genomic dissimilarity measures and genotype representations such as allele-count, haplotypes or allele-intensity values.
Additionally, the data can be insightfully visualized by the DPLS model, allowing for selection of desirable candidates from the breeding experiments. In this study, we have assessed the DPLS performance on a single trait.
Predictive modeling and reducing cyclic variability in autoignition engines
Hellstrom, Erik; Stefanopoulou, Anna; Jiang, Li; Larimore, Jacob
2016-08-30
Methods and systems are provided for controlling a vehicle engine to reduce cycle-to-cycle combustion variation. A predictive model is applied to predict cycle-to-cycle combustion behavior of an engine based on observed engine performance variables. Conditions are identified, based on the predicted cycle-to-cycle combustion behavior, that indicate high cycle-to-cycle combustion variation. Corrective measures are then applied to prevent the predicted high cycle-to-cycle combustion variation.
Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy
Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing
2016-01-01
Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have potential as therapies for some diseases. Accurate identification of antioxidant proteins could contribute to revealing the physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of the prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from the hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc. PMID:27662651
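The averaging rule described in the abstract (ensemble prediction as the mean of base-classifier outputs) can be sketched in a few lines. The function name, the 0.5 threshold, and the label strings are illustrative assumptions; the paper's actual base classifiers (RF, SMO, NNA, J48) would each supply a probability here.

```python
def ensemble_predict(probas):
    """Average the positive-class probabilities reported by the base
    classifiers and threshold the mean at 0.5 (illustrative sketch)."""
    avg = sum(probas) / len(probas)
    label = "antioxidant" if avg >= 0.5 else "non-antioxidant"
    return label, avg

# hypothetical outputs from four base classifiers for one protein
label, avg = ensemble_predict([0.9, 0.8, 0.4, 0.6])
```

Averaging calibrated probabilities, rather than taking a majority vote of hard labels, lets a confident minority classifier sway a borderline decision.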
Evaluation of procedures for prediction of unconventional gas in the presence of geologic trends
Attanasi, E.D.; Coburn, T.C.
2009-01-01
This study extends the application of local spatial nonparametric prediction models to the estimation of recoverable gas volumes in continuous-type gas plays to regimes where there is a single geologic trend. A transformation is presented, originally proposed by Tomczak, that offsets the distortions caused by the trend. This article reports on numerical experiments that compare predictive and classification performance of the local nonparametric prediction models based on the transformation with models based on Euclidean distance. The transformation offers improvement in average root mean square error when the trend is not severely misspecified. Because of the local nature of the models, even those based on Euclidean distance in the presence of trends are reasonably robust. The tests based on other model performance metrics such as prediction error associated with the high-grade tracts and the ability of the models to identify sites with the largest gas volumes also demonstrate the robustness of both local modeling approaches. ?? International Association for Mathematical Geology 2009.
A dynamic multi-scale Markov model based methodology for remaining life prediction
NASA Astrophysics Data System (ADS)
Yan, Jihong; Guo, Chaozhong; Wang, Xing
2011-05-01
The ability to accurately predict the remaining life of partially degraded components is crucial in prognostics. In this paper, a performance degradation index is designed using multi-feature fusion techniques to represent the deterioration severity of facilities. Based on this indicator, an improved Markov model is proposed for remaining life prediction. The Fuzzy C-Means (FCM) algorithm is employed to perform state division for the Markov model in order to avoid the uncertainty of state division caused by hard division approaches. Considering the influence of both historical and real-time data, a dynamic prediction method is introduced into the Markov model through a weighted coefficient. Multi-scale theory is employed to solve the state-division problem of multi-sample prediction. Consequently, a dynamic multi-scale Markov model is constructed. An experiment based on a Bently-RK4 rotor testbed was designed to validate the dynamic multi-scale Markov model; the experimental results illustrate the effectiveness of the methodology.
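The Markov remaining-life idea behind this abstract can be sketched as follows: estimate a transition matrix from an observed degradation-state path, treat the worst state as absorbing "failure", and read expected remaining life from the fundamental matrix. This is a minimal sketch under stated assumptions; the paper's FCM-based state division, weighted dynamic updating, and multi-scale machinery are omitted, and all function names are illustrative.

```python
import numpy as np

def transition_matrix(states, n_states):
    # Maximum-likelihood transition estimates from one observed state path
    P = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        P[a, b] += 1
    rows = P.sum(axis=1, keepdims=True)
    return np.divide(P, rows, out=np.zeros_like(P), where=rows > 0)

def expected_steps_to_failure(P):
    # Last state is absorbing failure. The fundamental matrix
    # N = (I - Q)^-1 gives expected visits to each transient state;
    # its row sums are the expected steps (remaining life) per state.
    Q = P[:-1, :-1]
    N = np.linalg.inv(np.eye(Q.shape[0]) - Q)
    return N.sum(axis=1)

# toy degradation path through states 0 (healthy) .. 2 (failed)
P = transition_matrix([0, 0, 1, 2], n_states=3)
remaining = expected_steps_to_failure(P)
```

For the toy path above, state 0 self-loops half the time, so its expected remaining life (3 steps) exceeds the naive 2-step path length.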
Fukunishi, Yoshifumi
2010-01-01
For fragment-based drug development (FBDD), both hit (active) compound prediction and docking-pose (protein-ligand complex structure) prediction of the hit compound are important, since chemical modification (fragment linking, fragment evolution) subsequent to the hit discovery must be performed based on the protein-ligand complex structure. However, the naïve protein-compound docking calculation shows poor accuracy in terms of docking-pose prediction. Thus, post-processing of the protein-compound docking is necessary. Recently, several methods for the post-processing of protein-compound docking have been proposed. In FBDD, the compounds are smaller than those for conventional drug screening, which makes the protein-compound docking calculation difficult; a method to avoid this problem has been reported. Protein-ligand binding free energy estimation is useful to reduce the procedures involved in the chemical modification of the hit fragment, and several prediction methods have been proposed for high-accuracy estimation of protein-ligand binding free energy. This paper summarizes the various computational methods proposed for docking-pose prediction and their usefulness in FBDD.
Kuo, Pao-Jen; Wu, Shao-Chun; Chien, Peng-Chen; Chang, Shu-Shya; Rau, Cheng-Shyuan; Tai, Hsueh-Ling; Peng, Shu-Hui; Lin, Yi-Chun; Chen, Yi-Chun; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua
2018-03-02
The aim of this study was to develop an effective surgical site infection (SSI) prediction model in patients receiving free-flap reconstruction after surgery for head and neck cancer using an artificial neural network (ANN), and to compare its predictive power with that of conventional logistic regression (LR). There were 1,836 patients with 1,854 free-flap reconstructions and 438 postoperative SSIs in the dataset for analysis. They were randomly assigned in a 7:3 ratio to a training set and a test set. Based on comprehensive characteristics of patients and diseases in the absence or presence of operative data, prediction of SSI was performed at two time points (pre-operatively and post-operatively) with a feed-forward ANN and the LR models. In addition to the calculated accuracy, sensitivity, and specificity, the predictive performance of the ANN and LR was assessed based on area under the curve (AUC) measures of receiver operating characteristic curves and the Brier score. The ANN had a significantly higher AUC (0.892) for post-operative prediction and AUC (0.808) for pre-operative prediction than LR (both P < 0.0001). In addition, the post-operative prediction by ANN had a significantly higher AUC than the pre-operative prediction (P < 0.0001). With the highest AUC and the lowest Brier score (0.090), the post-operative prediction by ANN had the highest overall predictive performance. The post-operative prediction by ANN had the highest overall performance in predicting SSI after free-flap reconstruction in patients receiving surgery for head and neck cancer.
Research on Improved Depth Belief Network-Based Prediction of Cardiovascular Diseases
Zhang, Hongpo
2018-01-01
Quantitative analysis and prediction can help to reduce the risk of cardiovascular disease. Quantitative prediction based on traditional models has low accuracy, and the variance of predictions based on shallow neural networks is large. In this paper, a cardiovascular disease prediction model based on an improved deep belief network (DBN) is proposed. Using the reconstruction error, the network depth is determined independently, and unsupervised training and supervised optimization are combined. This ensures the accuracy of model prediction while guaranteeing stability. Thirty experiments were performed independently on the Statlog (Heart) and Heart Disease Database data sets in the UCI database. Experimental results showed that the mean prediction accuracy was 91.26% and 89.78%, respectively, and the variance of prediction accuracy was 5.78 and 4.46, respectively. PMID:29854369
A vertical handoff decision algorithm based on ARMA prediction model
NASA Astrophysics Data System (ADS)
Li, Ru; Shen, Jiao; Chen, Jun; Liu, Qiuhuan
2012-01-01
With the development of computer technology and the increasing demand for mobile communications, next generation wireless networks will be composed of various wireless networks (e.g., WiMAX and WiFi). Vertical handoff is a key technology of next generation wireless networks. During the vertical handoff procedure, the handoff decision is a crucial issue for efficient mobility. Based on an auto-regressive moving average (ARMA) prediction model, we propose a vertical handoff decision algorithm that aims to improve the performance of vertical handoff and avoid unnecessary handoffs. Based on the current received signal strength (RSS) and the previous RSS, the proposed approach adopts the ARMA model to predict the next RSS. Then, according to the predicted RSS, it determines whether to trigger the link-layer triggering event and complete the vertical handoff. The simulation results indicate that the proposed algorithm outperforms the threshold-based RSS scheme in handoff performance and the number of handoffs.
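The predict-then-decide step described above can be sketched with a least-squares autoregressive predictor. This is an AR-only simplification of the abstract's ARMA model (no moving-average term), and the function names, the model order, and the simple threshold rule are illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np

def fit_ar(rss, p=3):
    """Least-squares fit of an AR(p) model to an RSS history."""
    rss = np.asarray(rss, dtype=float)
    # Row t of X holds rss[t : t+p]; the target is rss[t+p].
    X = np.column_stack([rss[i:len(rss) - p + i] for i in range(p)])
    y = rss[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def predict_next(rss, coeffs):
    """One-step-ahead RSS prediction from the last p samples."""
    p = len(coeffs)
    return float(np.dot(np.asarray(rss, dtype=float)[-p:], coeffs))

def should_handoff(rss_history, threshold):
    """Trigger the link-layer event when the predicted RSS falls
    below the threshold (illustrative decision rule)."""
    coeffs = fit_ar(rss_history)
    return predict_next(rss_history, coeffs) < threshold

# steadily weakening signal: 30, 29, ..., 21 dB; next value would be 20
rss_history = [30.0 - t for t in range(10)]
trigger = should_handoff(rss_history, threshold=21.0)
```

Predicting the next RSS before comparing against the threshold is what lets the handoff fire one step early, before the link actually degrades past the limit.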
Development and verification of NRC's single-rod fuel performance codes FRAPCON-3 and FRAPTRAN
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beyer, C.E.; Cunningham, M.E.; Lanning, D.D.
1998-03-01
The FRAPCON and FRAP-T code series, developed in the 1970s and early 1980s, are used by the US Nuclear Regulatory Commission (NRC) to predict fuel performance during steady-state and transient power conditions, respectively. Both code series are now being updated by Pacific Northwest National Laboratory to improve their predictive capabilities at high burnup levels. The newest versions of the codes are called FRAPCON-3 and FRAPTRAN. The updates to fuel property and behavior models are focusing on providing best-estimate predictions under steady-state and fast transient power conditions up to extended fuel burnups (> 55 GWd/MTU). Both codes will be assessed against a database independent of the database used for code benchmarking, and an estimate of code predictive uncertainties will be made based on comparisons to the benchmark and independent databases.
Does teacher evaluation based on student performance predict motivation, well-being, and ill-being?
Cuevas, Ricardo; Ntoumanis, Nikos; Fernandez-Bustos, Juan G; Bartholomew, Kimberley
2018-06-01
This study tests an explanatory model based on self-determination theory, which posits that pressure experienced by teachers when they are evaluated based on their students' academic performance will differentially predict teacher adaptive and maladaptive motivation, well-being, and ill-being. A total of 360 Spanish physical education teachers completed a multi-scale inventory. We found support for a structural equation model that showed that perceived pressure predicted teacher autonomous motivation negatively, predicted amotivation positively, and was unrelated to controlled motivation. In addition, autonomous motivation predicted vitality positively and exhaustion negatively, whereas controlled motivation and amotivation predicted vitality negatively and exhaustion positively. Amotivation significantly mediated the relation between pressure and vitality and between pressure and exhaustion. The results underline the potential negative impact of pressure felt by teachers due to this type of evaluation on teacher motivation and psychological health. Copyright © 2018 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Dynamic Bus Travel Time Prediction Models on Road with Multiple Bus Routes
Bai, Cong; Peng, Zhong-Ren; Lu, Qing-Chang; Sun, Jian
2015-01-01
Accurate and real-time travel time information for buses can help passengers better plan their trips and minimize waiting times. A dynamic travel time prediction model for buses addressing the cases on road with multiple bus routes is proposed in this paper, based on support vector machines (SVMs) and Kalman filtering-based algorithm. In the proposed model, the well-trained SVM model predicts the baseline bus travel times from the historical bus trip data; the Kalman filtering-based dynamic algorithm can adjust bus travel times with the latest bus operation information and the estimated baseline travel times. The performance of the proposed dynamic model is validated with the real-world data on road with multiple bus routes in Shenzhen, China. The results show that the proposed dynamic model is feasible and applicable for bus travel time prediction and has the best prediction performance among all the five models proposed in the study in terms of prediction accuracy on road with multiple bus routes. PMID:26294903
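The hybrid scheme in this abstract (a trained SVM supplies a baseline travel time; a Kalman filter corrects it with the latest observations) can be sketched with a one-dimensional filter. This is a minimal sketch: the function name and the noise variances q and r are assumptions, and the real model's SVM training step is replaced here by a given baseline value.

```python
def kalman_adjust(baseline, observations, q=1.0, r=4.0):
    """Adjust a baseline travel-time estimate (e.g. from a trained SVM)
    toward streaming observed travel times with a 1-D Kalman filter.
    q and r are assumed process/measurement noise variances."""
    x, p = float(baseline), 1.0   # state estimate and its variance
    for z in observations:
        p += q                    # predict: uncertainty grows each step
        k = p / (p + r)           # Kalman gain
        x += k * (z - x)          # correct toward the new observation
        p *= (1.0 - k)            # shrink uncertainty after the update
    return x

# baseline of 600 s, but the last 20 buses took about 630 s on this segment
adjusted = kalman_adjust(600.0, [630.0] * 20)
```

With a steady stream of 630 s observations the estimate converges from the 600 s baseline toward 630 s, which is the behavior the dynamic adjustment is meant to provide.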
Remembered or Forgotten?—An EEG-Based Computational Prediction Approach
Sun, Xuyun; Qian, Cunle; Chen, Zhongqin; Wu, Zhaohui; Luo, Benyan; Pan, Gang
2016-01-01
Prediction of memory performance (remembered or forgotten) has various potential applications not only for knowledge learning but also for disease diagnosis. Recently, subsequent memory effects (SMEs)—the statistical differences in electroencephalography (EEG) signals before or during learning between subsequently remembered and forgotten events—have been found. This finding indicates that EEG signals convey the information relevant to memory performance. In this paper, based on SMEs we propose a computational approach to predict memory performance of an event from EEG signals. We devise a convolutional neural network for EEG, called ConvEEGNN, to predict subsequently remembered and forgotten events from EEG recorded during memory process. With the ConvEEGNN, prediction of memory performance can be achieved by integrating two main stages: feature extraction and classification. To verify the proposed approach, we employ an auditory memory task to collect EEG signals from scalp electrodes. For ConvEEGNN, the average prediction accuracy was 72.07% by using EEG data from pre-stimulus and during-stimulus periods, outperforming other approaches. It was observed that signals from pre-stimulus period and those from during-stimulus period had comparable contributions to memory performance. Furthermore, the connection weights of ConvEEGNN network can reveal prominent channels, which are consistent with the distribution of SME studied previously. PMID:27973531
Li, Han; Liu, Yashu; Gong, Pinghua; Zhang, Changshui; Ye, Jieping
2014-01-01
Identifying patients with Mild Cognitive Impairment (MCI) who are likely to convert to dementia has recently attracted increasing attention in Alzheimer's disease (AD) research. An accurate prediction of conversion from MCI to AD can help clinicians initiate treatments at an early stage and monitor their effectiveness. However, existing prediction systems based on the original biosignatures are not satisfactory. In this paper, we propose to fit the prediction models using pairwise biosignature interactions, thus capturing higher-order relationships among biosignatures. Specifically, we employ hierarchical constraints and sparsity regularization to prune the high-dimensional input features. Based on the significant biosignatures and underlying interactions identified, we build classifiers to predict the conversion probability based on the selected features. We further analyze the underlying interaction effects of different biosignatures based on the so-called stable expectation scores. We used 293 MCI subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database that have MRI measurements at baseline to evaluate the effectiveness of the proposed method. Our proposed method achieves better classification performance than state-of-the-art methods. Moreover, we discover several significant interactions predictive of MCI-to-AD conversion. These results shed light on improving prediction performance using interaction features. PMID:24416143
Sampath, Sivananthan; Tkachenko, Pavlo; Renard, Eric; Pereverzev, Sergei V
2016-11-01
Despite the risk associated with nocturnal hypoglycemia (NH), there are only a few methods aiming at the prediction of such events based on intermittent blood glucose monitoring data. One of the first methods that can potentially be used for NH prediction is based on the low blood glucose index (LBGI) and is suggested, for example, in Accu-Chek® Connect as a hypoglycemia risk indicator. On the other hand, there are nowadays other glucose control indices (GCI) that could be used for NH prediction in the same spirit as the LBGI. In the present study we propose a general approach for combining NH predictors constructed from different GCI. The approach is based on a recently developed strategy for aggregating ranking algorithms in machine learning. The NH predictors were calibrated and tested on data extracted from clinical trials performed in the EU FP7-funded project DIAdvisor. Then, to show the portability of the method, we tested it on another dataset received from the EU Horizon 2020-funded project AMMODIT. We exemplify the proposed approach by aggregating NH predictors constructed from 4 GCI associated with hypoglycemia. Even though these predictors had been preliminarily optimized to exhibit better performance on the considered dataset, our aggregation approach allows a further performance improvement. On the dataset where the portability of the proposed approach was demonstrated, the aggregating predictor exhibited the following performance: sensitivity 77%, specificity 83.4%, positive predictive value 80.2%, negative predictive value 80.6%, which is higher than what is conventionally considered acceptable. The proposed approach shows potential for use in telemedicine systems for NH prediction. © 2016 Diabetes Technology Society.
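The LBGI mentioned in this abstract can be computed from intermittent glucose readings as below. The constants come from the published symmetrization of the blood-glucose scale in the risk-analysis literature (Kovatchev et al.); this is an illustrative sketch of the index itself, not the Accu-Chek Connect implementation or the paper's aggregation strategy, and the function name is an assumption.

```python
import math

def lbgi(bg_readings_mgdl):
    """Low blood glucose index from glucose readings in mg/dL.

    Each reading is mapped to a symmetrized scale f(BG); only readings
    on the low (negative-f) side contribute risk, so the index grows
    with the frequency and depth of hypoglycemia.
    """
    risks = []
    for bg in bg_readings_mgdl:
        f = 1.509 * (math.log(bg) ** 1.084 - 5.381)
        risks.append(10.0 * f * f if f < 0 else 0.0)
    return sum(risks) / len(risks)
```

Readings near the symmetry point (~112.5 mg/dL) contribute essentially nothing, high readings contribute exactly zero, and hypoglycemic readings dominate the index, which is what makes it usable as a low-side risk indicator.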
Development and evaluation of a predictive algorithm for telerobotic task complexity
NASA Technical Reports Server (NTRS)
Gernhardt, M. L.; Hunter, R. C.; Hedgecock, J. C.; Stephenson, A. G.
1993-01-01
There is a wide range of complexity in the various telerobotic servicing tasks performed in subsea, space, and hazardous material handling environments. Experience with telerobotic servicing has evolved into a knowledge base used to design tasks to be 'telerobot friendly.' This knowledge base generally resides in a small group of people. Written documentation and requirements are limited in conveying this knowledge base to serviceable equipment designers and are subject to misinterpretation. A mathematical model of task complexity based on measurable task parameters and telerobot performance characteristics would be a valuable tool to designers and operational planners. Oceaneering Space Systems and TRW have performed an independent research and development project to develop such a tool for telerobotic orbital replacement unit (ORU) exchange. This algorithm was developed to predict an ORU exchange degree of difficulty rating (based on the Cooper-Harper rating used to assess piloted operations). It is based on measurable parameters of the ORU, attachment receptacle and quantifiable telerobotic performance characteristics (e.g., link length, joint ranges, positional accuracy, tool lengths, number of cameras, and locations). The resulting algorithm can be used to predict task complexity as the ORU parameters, receptacle parameters, and telerobotic characteristics are varied.
Cardador, Laura; De Cáceres, Miquel; Bota, Gerard; Giralt, David; Casas, Fabián; Arroyo, Beatriz; Mougeot, François; Cantero-Martínez, Carlos; Moncunill, Judit; Butler, Simon J.; Brotons, Lluís
2014-01-01
European agriculture is undergoing widespread changes that are likely to have profound impacts on farmland biodiversity. The development of tools that allow an assessment of the potential biodiversity effects of different land-use alternatives before changes occur is fundamental to guiding management decisions. In this study, we develop a resource-based model framework to estimate habitat suitability for target species, according to simple information on species’ key resource requirements (diet, foraging habitat and nesting site), and examine whether it can be used to link land-use and local species’ distribution. We take as a study case four steppe bird species in a lowland area of the north-eastern Iberian Peninsula. We also compare the performance of our resource-based approach to that obtained through habitat-based models relating species’ occurrence and land-cover variables. Further, we use our resource-based approach to predict the effects that changes in farming systems can have on farmland bird habitat suitability and compare these predictions with those obtained using the habitat-based models. Habitat suitability estimates generated by our resource-based models performed similarly to habitat-based models (and better for one study species) when predicting current species distribution. Moderate prediction success was achieved for three out of four species considered by resource-based models and for two of four by habitat-based models. Although there is potential for improving the performance of resource-based models, they provide a structure for using available knowledge of the functional links between agricultural practices, provision of key resources and the response of organisms to predict the potential effects of changing land-uses in a variety of contexts, or the impacts of changes, such as altered management practices, that are not easily incorporated into habitat-based models. PMID:24667825
An Examination of Diameter Density Prediction with k-NN and Airborne Lidar
Strunk, Jacob L.; Gould, Peter J.; Packalen, Petteri; ...
2017-11-16
While lidar-based forest inventory methods have been widely demonstrated, the performance of methods to predict tree diameters with airborne lidar (lidar) is not well understood. One cause for this is that the performance metrics typically used in studies of diameter prediction can be difficult to interpret, and may not support comparative inferences between sampling designs and study areas. To help with this problem we propose two indices and use them to evaluate a variety of lidar and k nearest neighbor (k-NN) strategies for prediction of tree diameter distributions. The indices are based on the coefficient of determination (R²) and root mean square deviation (RMSD). Both of the indices are highly interpretable, and the RMSD-based index facilitates comparisons with alternative (non-lidar) inventory strategies, and with projects in other regions. K-NN diameter distribution prediction strategies were examined using auxiliary lidar for 190 training plots distributed across the 800 km² Savannah River Site in South Carolina, USA. Finally, we evaluate the performance of k-NN with respect to distance metrics, number of neighbors, predictor sets, and response sets. K-NN and lidar explained 80% of variability in diameters, and Mahalanobis distance with k = 3 neighbors performed best according to a number of criteria.
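The best-performing configuration reported above (Mahalanobis distance, k = 3) can be sketched in miniature. The plot metrics and diameters below are hypothetical stand-ins for lidar predictors, not the study's data, and the 2x2 covariance inverse is hard-coded for the toy's two features:

```python
import math

def mahalanobis_knn_predict(X_train, y_train, x_query, k=3):
    """Predict a response as the mean over the k nearest training plots,
    with nearness measured by Mahalanobis distance."""
    n, d = len(X_train), len(X_train[0])
    # Sample covariance matrix of the training predictors.
    means = [sum(row[j] for row in X_train) / n for j in range(d)]
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in X_train) / (n - 1)
            for j in range(d)] for i in range(d)]
    # Invert the (here 2x2) covariance; real lidar metrics need a general inverse.
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    def dist(a, b):
        diff = [a[i] - b[i] for i in range(d)]
        return math.sqrt(sum(diff[i] * inv[i][j] * diff[j]
                             for i in range(d) for j in range(d)))
    nearest = sorted(range(n), key=lambda i: dist(X_train[i], x_query))[:k]
    return sum(y_train[i] for i in nearest) / k

# Hypothetical (plot metric 1, plot metric 2) -> mean diameter pairs.
X_plots = [[10.0, 0.2], [12.0, 0.3], [30.0, 0.6], [33.0, 0.5], [50.0, 0.9]]
y_diam = [15.0, 18.0, 35.0, 37.0, 60.0]
pred_small = mahalanobis_knn_predict(X_plots, y_diam, [11.0, 0.25])
pred_large = mahalanobis_knn_predict(X_plots, y_diam, [48.0, 0.85])
```

Mahalanobis distance rescales by the predictor covariance, so correlated lidar metrics with very different units are weighted comparably, which is one plausible reason it outperformed unscaled metrics here.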
Singh, Kunwar P; Gupta, Shikha; Ojha, Priyanka; Rai, Premanjali
2013-04-01
The research aims to develop an artificial intelligence (AI)-based model to predict the adsorptive removal of 2-chlorophenol (CP) in aqueous solution by coconut shell carbon (CSC) using four operational variables (pH of solution, adsorbate concentration, temperature, and contact time), and to investigate their effects on the adsorption process. Accordingly, based on a factorial design, 640 batch experiments were conducted. Nonlinearities in the experimental data were checked using Brock-Dechert-Scheinkman (BDS) statistics. Five nonlinear models were constructed to predict the adsorptive removal of CP in aqueous solution by CSC using the four variables as input. Performances of the constructed models were evaluated and compared using statistical criteria. BDS statistics revealed strong nonlinearity in the experimental data. The performance of all the models constructed here was satisfactory. Radial basis function network (RBFN) and multilayer perceptron network (MLPN) models performed better than the generalized regression neural network, support vector machine, and gene expression programming models. Sensitivity analysis revealed that contact time had the highest effect on adsorption, followed by solution pH, temperature, and CP concentration. The study concluded that all the models constructed here were capable of capturing the nonlinearity in the data. The better generalization and predictive performance of the RBFN and MLPN models suggests that these can be used to predict the adsorption of CP in aqueous solution using CSC.
Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing.
Zhao, Yingwen; Fu, Guangyuan; Wang, Jun; Guo, Maozu; Yu, Guoxian
2018-02-23
Gene Ontology (GO) uses structured vocabularies (or terms) to describe the molecular functions, biological roles, and cellular locations of gene products in a hierarchical ontology. GO annotations associate genes with GO terms and indicate that the given gene products carry out the biological functions described by the relevant terms. However, predicting correct GO annotations for genes from a massive set of GO terms as defined by GO is a difficult challenge. To address this challenge, we introduce a Gene Ontology Hierarchy Preserving Hashing (HPHash) based semantic method for gene function prediction. HPHash first measures the taxonomic similarity between GO terms. It then uses a hierarchy preserving hashing technique to keep the hierarchical order between GO terms, and to optimize a series of hashing functions to encode massive GO terms via compact binary codes. After that, HPHash utilizes these hashing functions to project the gene-term association matrix into a low-dimensional one and performs semantic similarity based gene function prediction in the low-dimensional space. Experimental results on three model species (Homo sapiens, Mus musculus and Rattus norvegicus) for interspecies gene function prediction show that HPHash performs better than other related approaches and that it is robust to the number of hash functions. In addition, we also take HPHash as a plugin for BLAST based gene function prediction. From the experimental results, HPHash again significantly improves the prediction performance. The code for HPHash is available at: http://mlda.swu.edu.cn/codes.php?name=HPHash. Copyright © 2018 Elsevier Inc. All rights reserved.
Khan, Taimoor; De, Asok
2014-01-01
In the last decade, artificial neural networks have become very popular techniques for computing different performance parameters of microstrip antennas. The proposed work illustrates a knowledge-based neural network model for predicting the appropriate shape and accurate size of the slot introduced on the radiating patch for achieving the desired level of resonance, gain, directivity, antenna efficiency, and radiation efficiency for dual-frequency operation. By incorporating prior knowledge into the neural model, the number of required training patterns is drastically reduced. Further, the neural model incorporating prior knowledge can be used for predicting the response in the extrapolation region beyond the training patterns region. For validation, a prototype is also fabricated and its performance parameters are measured. A very good agreement is attained between measured, simulated, and predicted results.
Radiomics-based Prognosis Analysis for Non-Small Cell Lung Cancer
NASA Astrophysics Data System (ADS)
Zhang, Yucheng; Oikonomou, Anastasia; Wong, Alexander; Haider, Masoom A.; Khalvati, Farzad
2017-04-01
Radiomics characterizes tumor phenotypes by extracting large numbers of quantitative features from radiological images. Radiomic features have been shown to provide prognostic value in predicting clinical outcomes in several studies. However, several challenges, including feature redundancy, unbalanced data, and small sample sizes, have led to relatively low predictive accuracy. In this study, we explore different strategies for overcoming these challenges and improving the predictive performance of radiomics-based prognosis for non-small cell lung cancer (NSCLC). CT images of 112 patients (mean age 75 years) with NSCLC who underwent stereotactic body radiotherapy were used to predict recurrence, death, and recurrence-free survival using a comprehensive radiomics analysis. Different feature selection and predictive modeling techniques were used to determine the optimal configuration of the prognosis analysis. To address feature redundancy, comprehensive analysis indicated that Random Forest models and Principal Component Analysis were the optimal predictive modeling and feature selection methods, respectively, for achieving high prognosis performance. To address unbalanced data, the Synthetic Minority Over-sampling technique was found to significantly increase predictive accuracy. A full analysis of variance showed that data endpoints, feature selection techniques, and classifiers were significant factors affecting predictive accuracy, suggesting that these factors must be investigated when building radiomics-based predictive models for cancer prognosis.
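As a sketch of the over-sampling step the study found helpful, SMOTE generates synthetic minority-class samples by interpolating between a minority sample and one of its nearby minority neighbors. The feature vectors below are illustrative toy values, not radiomic features:

```python
import random

def smote(minority, n_new, k=2, rng=random.Random(0)):
    """Synthetic Minority Over-sampling: create n_new synthetic samples
    by interpolating from a minority sample toward one of its k nearest
    minority-class neighbors."""
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbors of x (excluding x itself).
        neigh = sorted((m for m in minority if m is not x),
                       key=lambda m: d2(x, m))[:k]
        nb = rng.choice(neigh)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic

minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.2]]
new_samples = smote(minority, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, the technique densifies the minority region rather than duplicating records, which is why it tends to help classifiers on unbalanced endpoints.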
NASA Astrophysics Data System (ADS)
Choudhury, Anustup; Farrell, Suzanne; Atkins, Robin; Daly, Scott
2017-09-01
We present an approach to predict overall HDR display quality as a function of key HDR display parameters. We first performed subjective experiments on a high quality HDR display that explored five key HDR display parameters: maximum luminance, minimum luminance, color gamut, bit-depth and local contrast. Subjects rated overall quality for different combinations of these display parameters. We explored two models: a physical model based solely on physically measured display characteristics, and a perceptual model that transforms physical parameters using human visual system models. For the perceptual model, we use a family of metrics based on a recently published color volume model (ICtCp), which consists of the PQ luminance non-linearity (ST2084) and LMS-based opponent color, as well as an estimate of the display point spread function. To predict overall visual quality, we apply linear regression and machine learning techniques such as Multilayer Perceptron, RBF and SVM networks. We use RMSE and Pearson/Spearman correlation coefficients to quantify performance. We found that the perceptual model is better at predicting subjective quality than the physical model and that SVM is better at prediction than linear regression. The significance and contribution of each display parameter were investigated. In addition, we found that combined parameters such as contrast do not improve prediction. Traditional perceptual models were also evaluated and we found that models based on the PQ non-linearity performed better.
Evaluation of in silico tools to predict the skin sensitization potential of chemicals.
Verheyen, G R; Braeken, E; Van Deun, K; Van Miert, S
2017-01-01
Public domain and commercial in silico tools were compared for their performance in predicting the skin sensitization potential of chemicals. The packages were either statistically based (Vega, CASE Ultra) or rule-based (OECD Toolbox, Toxtree, Derek Nexus). In practice, several of these in silico tools are used in gap filling and read-across, but here their use was limited to making predictions based on the presence/absence of structural features associated with sensitization. The top 400 ranking substances of the ATSDR 2011 Priority List of Hazardous Substances were selected as a starting point. Experimental information was identified for 160 chemically diverse substances (82 positive and 78 negative). The prediction of skin sensitization potential was compared with the experimental data. Rule-based tools performed slightly better, with accuracies ranging from 0.6 (OECD Toolbox) to 0.78 (Derek Nexus), compared with the statistical tools, whose accuracies ranged from 0.48 (Vega) to 0.73 (CASE Ultra - LLNA weak model). Combining models increased the performance, with positive and negative predictive values up to 80% and 84%, respectively. However, the number of substances that were predicted positive or negative for skin sensitization in both models was low. Adding more substances to the dataset will increase the confidence in the conclusions reached. The insights obtained in this evaluation are incorporated in a web database www.asopus.weebly.com that provides a potential end user context for the scope and performance of different in silico tools with respect to a common dataset of curated skin sensitization data.
Green, Dorota; Li, Qi; Lockman, Jeffrey J; Gredebäck, Gustaf
2016-05-01
The cultural specificity of action prediction was assessed in 8-month-old Chinese and Swedish infants. Infants were presented with an actor eating with a spoon or chopsticks. Predictive goal-directed gaze shifts were examined using eye tracking. The results demonstrate that Chinese infants only predict the goal of eating actions performed with chopsticks, whereas Swedish infants exclusively predict the goal of eating actions performed with a spoon. Infants in neither culture predicted the goal of object manipulation actions (e.g., picking up food) performed with a spoon or chopsticks. The results support the view that multiple processes (both visual/cultural learning and motor-based direct matching processes) facilitate goal prediction during observation of other peoples' actions early in infancy. © 2016 The Authors. Child Development © 2016 Society for Research in Child Development, Inc.
Improving stability of prediction models based on correlated omics data by using network approaches.
Tissier, Renaud; Houwing-Duistermaat, Jeanine; Rodríguez-Girondo, Mar
2018-01-01
Building prediction models based on complex omics datasets such as transcriptomics, proteomics, metabolomics remains a challenge in bioinformatics and biostatistics. Regularized regression techniques are typically used to deal with the high dimensionality of these datasets. However, due to the presence of correlation in the datasets, it is difficult to select the best model and application of these methods yields unstable results. We propose a novel strategy for model selection where the obtained models also perform well in terms of overall predictability. Several three step approaches are considered, where the steps are 1) network construction, 2) clustering to empirically derive modules or pathways, and 3) building a prediction model incorporating the information on the modules. For the first step, we use weighted correlation networks and Gaussian graphical modelling. Identification of groups of features is performed by hierarchical clustering. The grouping information is included in the prediction model by using group-based variable selection or group-specific penalization. We compare the performance of our new approaches with standard regularized regression via simulations. Based on these results we provide recommendations for selecting a strategy for building a prediction model given the specific goal of the analysis and the sizes of the datasets. Finally we illustrate the advantages of our approach by application of the methodology to two problems, namely prediction of body mass index in the DIetary, Lifestyle, and Genetic determinants of Obesity and Metabolic syndrome study (DILGOM) and prediction of response of each breast cancer cell line to treatment with specific drugs using a breast cancer cell lines pharmacogenomics dataset.
EOID Model Validation and Performance Prediction
2002-09-30
Our long-term goal is to accurately predict the capability of the current generation of laser-based underwater imaging sensors to perform Electro-Optic Identification (EOID) against relevant targets in a variety of realistic environmental conditions. The two most prominent technologies in this area
A Sensor Dynamic Measurement Error Prediction Model Based on NAPSO-SVM.
Jiang, Minlan; Jiang, Lan; Jiang, Dingde; Li, Fei; Song, Houbing
2018-01-15
Dynamic measurement error correction is an effective way to improve sensor precision. Dynamic measurement error prediction is an important part of error correction, and support vector machines (SVM) are often used for predicting the dynamic measurement errors of sensors. Traditionally, the SVM parameters were always set manually, which cannot ensure the model's performance. In this paper, an SVM method based on an improved particle swarm optimization (NAPSO) is proposed to predict the dynamic measurement errors of sensors. Natural selection and simulated annealing are added to the PSO to improve its ability to avoid local optima. To verify the performance of NAPSO-SVM, three types of algorithms are selected to optimize the SVM's parameters: the particle swarm optimization algorithm (PSO), the improved PSO optimization algorithm (NAPSO), and the glowworm swarm optimization (GSO). The dynamic measurement error data of two sensors are applied as the test data. The root mean squared error and mean absolute percentage error are employed to evaluate the prediction models' performances. The experimental results show that among the three tested algorithms the NAPSO-SVM method has better prediction precision and smaller prediction errors, and is an effective method for predicting the dynamic measurement errors of sensors.
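A minimal sketch of the plain PSO core that NAPSO extends (with natural selection and simulated annealing) is shown below, minimizing a toy quadratic as a stand-in for the SVM parameter-tuning objective; all parameter values here are illustrative assumptions, not the paper's settings:

```python
import random

def pso(objective, bounds, n_particles=20, iters=60,
        w=0.7, c1=1.5, c2=1.5, rng=random.Random(1)):
    """Plain particle swarm optimization over a box-bounded search space."""
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d],
                                    bounds[d][0]), bounds[d][1])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy stand-in for an SVM cross-validation error surface over two parameters.
best, best_val = pso(lambda p: (p[0] - 2) ** 2 + (p[1] + 1) ** 2,
                     bounds=[(-5, 5), (-5, 5)])
```

In the paper's setting, the objective would be the SVM's cross-validation prediction error as a function of its penalty and kernel parameters; NAPSO additionally resamples poor particles (natural selection) and accepts some worse moves early on (simulated annealing) to escape local optima.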
An improved predictive functional control method with application to PMSM systems
NASA Astrophysics Data System (ADS)
Li, Shihua; Liu, Huixian; Fu, Wenshu
2017-01-01
In the common design of prediction model-based control methods, disturbances are usually considered neither in the prediction model nor in the control design. For control systems with large-amplitude or strong disturbances, it is difficult to precisely predict the future outputs from the conventional prediction model, and thus the desired optimal closed-loop performance will be degraded to some extent. To this end, an improved predictive functional control (PFC) method is developed in this paper by embedding disturbance information into the system model. Here, a composite prediction model is obtained by embedding the estimated value of the disturbances, where a disturbance observer (DOB) is employed to estimate the lumped disturbances. The influence of disturbances on the system is thus taken into account in the optimisation procedure. Finally, considering the speed control problem for a permanent magnet synchronous motor (PMSM) servo system, a control scheme based on the improved PFC method is designed to ensure an optimal closed-loop performance even in the presence of disturbances. Simulation and experimental results based on a hardware platform are provided to confirm the effectiveness of the proposed algorithm.
Hu, Jing; Zhang, Xiaolong; Liu, Xiaoming; Tang, Jinshan
2015-06-01
Discovering hot regions in protein-protein interaction is important for drug and protein design, while experimental identification of hot regions is a time-consuming and labor-intensive effort; thus, the development of predictive models can be very helpful. In hot region prediction research, some models are based on structure information, and others are based on a protein interaction network. However, the prediction accuracy of these methods can still be improved. In this paper, a new method is proposed for hot region prediction, which combines density-based incremental clustering with feature-based classification. The method uses density-based incremental clustering to obtain rough hot regions, and uses feature-based classification to remove the non-hot spot residues from the rough hot regions. Experimental results show that the proposed method significantly improves the prediction performance of hot regions. Copyright © 2015 Elsevier Ltd. All rights reserved.
Lai, Fu-Jou; Chang, Hong-Tsun; Huang, Yueh-Min; Wu, Wei-Sheng
2014-01-01
Eukaryotic transcriptional regulation is known to be highly connected through networks of cooperative transcription factors (TFs). Measuring the cooperativity of TFs is helpful for understanding the biological relevance of these TFs in regulating genes. Recent advances in computational techniques have led to various predictions of cooperative TF pairs in yeast. As each algorithm integrated different data resources and was developed based on a different rationale, each possessed its own merit and claimed to outperform the others. However, such claims are prone to subjectivity because each algorithm was compared with only a few other algorithms, using only a small set of performance indices. This motivated us to propose a series of indices to objectively evaluate the prediction performance of existing algorithms. Based on the proposed performance indices, we then conducted a comprehensive performance evaluation. We collected 14 sets of predicted cooperative TF pairs (PCTFPs) in yeast from 14 existing algorithms in the literature. Using the eight performance indices we adopted/proposed, the cooperativity of each PCTFP was measured, and each set of PCTFPs under evaluation was given a ranking score according to its mean cooperativity for each performance index. The ranking scores of a set of PCTFPs vary with different performance indices, implying that an algorithm for predicting cooperative TF pairs may be strong by one measure but weak by another. We finally made a comprehensive ranking of these 14 sets. The results showed that Wang J's study obtained the best performance evaluation on the prediction of cooperative TF pairs in yeast. In this study, we adopted/proposed eight performance indices to make a comprehensive performance evaluation of the prediction results of 14 existing cooperative TF identification algorithms.
Most importantly, these proposed indices can easily be applied to measure the performance of new algorithms developed in the future, thus expediting progress in this research field.
Multi-model comparison highlights consistency in predicted effect of warming on a semi-arid shrub
Renwick, Katherine M.; Curtis, Caroline; Kleinhesselink, Andrew R.; Schlaepfer, Daniel R.; Bradley, Bethany A.; Aldridge, Cameron L.; Poulter, Benjamin; Adler, Peter B.
2018-01-01
A number of modeling approaches have been developed to predict the impacts of climate change on species distributions, performance, and abundance. The stronger the agreement from models that represent different processes and are based on distinct and independent sources of information, the greater the confidence we can have in their predictions. Evaluating the level of confidence is particularly important when predictions are used to guide conservation or restoration decisions. We used a multi-model approach to predict climate change impacts on big sagebrush (Artemisia tridentata), the dominant plant species on roughly 43 million hectares in the western United States and a key resource for many endemic wildlife species. To evaluate the climate sensitivity of A. tridentata, we developed four predictive models, two based on empirically derived spatial and temporal relationships, and two that applied mechanistic approaches to simulate sagebrush recruitment and growth. This approach enabled us to produce an aggregate index of climate change vulnerability and uncertainty based on the level of agreement between models. Despite large differences in model structure, predictions of sagebrush response to climate change were largely consistent. Performance, as measured by change in cover, growth, or recruitment, was predicted to decrease at the warmest sites, but increase throughout the cooler portions of sagebrush's range. A sensitivity analysis indicated that sagebrush performance responds more strongly to changes in temperature than precipitation. Most of the uncertainty in model predictions reflected variation among the ecological models, raising questions about the reliability of forecasts based on a single modeling approach. Our results highlight the value of a multi-model approach in forecasting climate change impacts and uncertainties and should help land managers to maximize the value of conservation investments.
de Oliveira, Glaucia Martins; Cachioni, Meire; Falcão, Deusivania; Batistoni, Samila; Lopes, Andrea; Guimarães, Vanessa; Lima-Silva, Thais Bento; Neri, Anita Liberalesso; Yassuda, Mônica Sanches
2015-01-01
Previous studies have suggested that performance prediction, an aspect of metamemory, may be associated with objective performance on memory tasks. The objective of the study was to describe memory prediction before performance of an episodic memory task in community-dwelling older adults, stratified by sex, age group and educational level. Additionally, the association between predicted and objective performance on a memory task was investigated. The study was based on data from 359 participants in the FIBRA study carried out at Ermelino Matarazzo, São Paulo. Memory prediction was assessed by posing the question: "If someone showed you a sheet with drawings of 10 pictures to observe for 30 seconds, how many pictures do you think you could remember without seeing the sheet?". Memory performance was assessed by the memorization of 10 black and white pictures from the Brief Cognitive Screening Battery (BCSB). No differences were found between men and women, nor across age groups or educational levels, in memory performance prediction before carrying out the memory task. There was a modest association (rho=0.11, p=0.041) between memory prediction and performance in immediate memory. On multivariate linear regression analyses, memory performance prediction showed a marginally significant association with immediate memory (p=0.061). In this study, sociodemographic variables did not influence memory prediction, which was only modestly associated with immediate memory on the Brief Cognitive Screening Battery (BCSB).
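The modest association reported above (rho = 0.11) is a Spearman rank correlation, which correlates the ranks of the two variables rather than their raw values; a self-contained implementation, using average ranks for ties, is:

```python
def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of average ranks."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        i = 0
        while i < len(order):
            j = i
            # Extend j over a run of tied values.
            while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
                j += 1
            avg = (i + j) / 2 + 1  # average 1-based rank for the tie group
            for k in range(i, j + 1):
                r[order[k]] = avg
            i = j + 1
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den
```

Because it works on ranks, the coefficient captures any monotone association between predicted and actual recall, not just a linear one, which suits ordinal data such as "number of pictures remembered".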
Schilirò, Luca; Montrasio, Lorella; Scarascia Mugnozza, Gabriele
2016-11-01
In recent years, physically-based numerical models have frequently been used in the framework of early-warning systems devoted to rainfall-induced landslide hazard monitoring and mitigation. For this reason, in this work we describe the potential of SLIP (Shallow Landslides Instability Prediction), a simplified physically-based model for the analysis of shallow landslide occurrence. In order to test the reliability of this model, a back analysis of recent landslide events that occurred in the study area (located SW of Messina, northeastern Sicily, Italy) on October 1st, 2009 was performed. The simulation results were compared with those obtained for the same event using TRIGRS, another well-established model for shallow landslide prediction. Afterwards, a simulation over a 2-year period was performed for the same area, with the aim of evaluating the performance of SLIP as an early warning tool. The results confirm the good predictive capability of the model, in terms of both spatial and temporal prediction of the instability phenomena. For this reason, we recommend an operating procedure for the real-time definition of shallow landslide triggering scenarios at the catchment scale, based on the use of SLIP calibrated through a specific multi-methodological approach. Copyright © 2016 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Ivancevich, John M.
1976-01-01
This empirically based study of 324 technicians investigated the moderating impact of job satisfaction in the prediction of job performance criteria from ability test scores. The findings suggest that the type of job satisfaction facet and the performance criterion used are important considerations when examining satisfaction as a moderator.…
Jeffrey J. Barry; John M. Buffington; Peter Goodwin; John .G. King; William W. Emmett
2008-01-01
Previous studies assessing the accuracy of bed-load transport equations have considered equation performance statistically based on paired observations of measured and predicted bed-load transport rates. However, transport measurements were typically taken during low flows, biasing the assessment of equation performance toward low discharges, and because equation...
Nair, Sankaran N; Czaja, Sara J; Sharit, Joseph
2007-06-01
This article explores the role of age, cognitive abilities, prior experience, and knowledge in skill acquisition for a computer-based simulated customer service task. Fifty-two participants aged 50-80 performed the task over 4 consecutive days following training. They also completed a battery that assessed prior computer experience and cognitive abilities. The data indicated that overall quality and efficiency of performance improved with practice. The predictors of initial level of performance and rate of change in performance varied according to the performance parameter assessed. Age and fluid intelligence predicted initial level and rate of improvement in overall quality, whereas crystallized intelligence and age predicted initial e-mail processing time, and crystallized intelligence predicted rate of change in e-mail processing time over days. We discuss the implications of these findings for the design of intervention strategies.
A community resource benchmarking predictions of peptide binding to MHC-I molecules.
Peters, Bjoern; Bui, Huynh-Hoa; Frankild, Sune; Nielson, Morten; Lundegaard, Claus; Kostem, Emrah; Basch, Derek; Lamberth, Kasper; Harndahl, Mikkel; Fleri, Ward; Wilson, Stephen S; Sidney, John; Lund, Ole; Buus, Soren; Sette, Alessandro
2006-06-09
Recognition of peptides bound to major histocompatibility complex (MHC) class I molecules by T lymphocytes is an essential part of immune surveillance. Each MHC allele has a characteristic peptide binding preference, which can be captured in prediction algorithms, allowing for the rapid scan of entire pathogen proteomes for peptides likely to bind MHC. Here we make public a large set of 48,828 quantitative peptide-binding affinity measurements relating to 48 different mouse, human, macaque, and chimpanzee MHC class I alleles. We use these data to establish a set of benchmark predictions with one neural network method and two matrix-based prediction methods extensively utilized in our groups. In general, the neural network outperforms the matrix-based predictions, mainly due to its ability to generalize even on a small amount of data. We also retrieved predictions from tools publicly available on the internet. While differences in the data used to generate these predictions hamper direct comparisons, we do conclude that tools based on combinatorial peptide libraries perform remarkably well. The transparent prediction evaluation on this dataset provides tool developers with a benchmark for comparison of newly developed prediction methods. In addition, to generate and evaluate our own prediction methods, we have established an easily extensible web-based prediction framework that allows automated side-by-side comparisons of prediction methods implemented by experts. This is an advance over the current practice of tool developers having to generate reference predictions themselves, which can lead to underestimating the performance of prediction methods they are not as familiar with as their own. The overall goal of this effort is to provide a transparent prediction evaluation allowing bioinformaticians to identify promising features of prediction methods and providing guidance to immunologists regarding the reliability of prediction tools.
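A matrix-based predictor of the kind benchmarked above can be sketched in a few lines: each peptide position contributes an independent score from a position-specific scoring matrix (PSSM), and peptides are ranked by their summed score. The 3-position matrix and residue scores below are invented placeholders; real matrices are trained on affinity data such as the benchmark set described in the abstract.

```python
# Minimal matrix-based peptide scoring sketch (hypothetical PSSM values).

PSSM = [  # one dict of residue -> score per peptide position
    {"A": 1.2, "K": -0.5, "L": 0.3},
    {"A": -0.2, "K": 0.9, "L": 0.1},
    {"A": 0.4, "K": -1.0, "L": 1.5},
]

def score(peptide):
    """Sum the per-position scores; unseen residues contribute 0."""
    return sum(PSSM[i].get(aa, 0.0) for i, aa in enumerate(peptide))

peptides = ["AKL", "KAL", "ALA"]
ranked = sorted(peptides, key=score, reverse=True)
print(ranked[0], round(score(ranked[0]), 2))
```

The independence assumption per position is exactly what limits matrix methods relative to the neural network in the abstract, which can model inter-position dependencies.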
A unified frame of predicting side effects of drugs by using linear neighborhood similarity.
Zhang, Wen; Yue, Xiang; Liu, Feng; Chen, Yanlin; Tu, Shikui; Zhang, Xining
2017-12-14
Drug side effects are one of the main concerns in drug discovery and have gained wide attention. Investigating drug side effects is of great importance, and computational prediction can help to guide wet experiments. To the best of our knowledge, a great number of computational methods have been proposed for side effect prediction. The assumption that similar drugs may induce the same side effects is usually employed for modeling, so how to calculate drug-drug similarity is critical in side effect prediction. In this paper, we present a novel measure of drug-drug similarity named "linear neighborhood similarity", which is calculated in a drug feature space by exploring linear neighborhood relationships. Then, we transfer the similarity from the feature space into the side effect space, and predict drug side effects by propagating known side effect information through a similarity-based graph. Under a unified frame based on the linear neighborhood similarity, we propose the method "LNSM" and its extension "LNSM-SMI" to predict side effects of new drugs, and the method "LNSM-MSE" to predict unobserved side effects of approved drugs. We evaluate the performance of LNSM and LNSM-SMI in predicting side effects of new drugs, and the performance of LNSM-MSE in predicting missing side effects of approved drugs. The results demonstrate that the linear neighborhood similarity can improve the performance of side effect prediction, and that the linear neighborhood similarity-based methods outperform existing side effect prediction methods. More importantly, the proposed methods can predict side effects of new drugs as well as unobserved side effects of approved drugs under a unified frame.
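The propagation step described above can be sketched generically: known drug-side-effect labels are spread through a row-normalized drug-drug similarity graph until a fixed point. The 3-drug similarity values below are made up for illustration; LNSM itself derives them from linear neighborhood reconstruction weights in a feature space, which this sketch does not reproduce.

```python
# Label propagation over a similarity graph (illustrative similarities).

ALPHA = 0.5  # trade-off between graph smoothing and the known labels

S = [  # row-normalized drug-drug similarity (self-similarity excluded)
    [0.0, 0.8, 0.2],
    [0.8, 0.0, 0.2],
    [0.5, 0.5, 0.0],
]
Y = [  # rows: drugs, cols: side effects (1 = observed); drug 2 is "new"
    [1, 0],
    [1, 1],
    [0, 0],
]

def propagate(S, Y, alpha, iters=50):
    """Iterate F = alpha * S @ F + (1 - alpha) * Y to convergence."""
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [
            [
                alpha * sum(S[i][k] * F[k][j] for k in range(len(S)))
                + (1 - alpha) * Y[i][j]
                for j in range(len(Y[0]))
            ]
            for i in range(len(S))
        ]
    return F

F = propagate(S, Y, ALPHA)
# the "new" drug (row 2) picks up nonzero scores from its neighbors
print([round(v, 3) for v in F[2]])
```

The new drug scores higher for the side effect shared by both neighbors than for the one observed in only a single neighbor, which is the intended ranking behavior.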
Comparison of in silico models for prediction of mutagenicity.
Bakhtyari, Nazanin G; Raitano, Giuseppa; Benfenati, Emilio; Martin, Todd; Young, Douglas
2013-01-01
Using a dataset of more than 6000 compounds, the performance of eight quantitative structure-activity relationship (QSAR) models was evaluated: ACD/Tox Suite; ADMET (Absorption, Distribution, Metabolism, Elimination, and Toxicity of chemical substances) Predictor; Derek; Toxicity Estimation Software Tool (T.E.S.T.); TOxicity Prediction by Komputer Assisted Technology (TOPKAT); Toxtree; CAESAR; and SARpy (SAR in python). In general, the results showed a high level of performance. To obtain a realistic estimate of the predictive ability, the results for chemicals inside and outside the training set of each model were considered. The effect of applicability domain tools (when available) on prediction accuracy was also evaluated. The predictive tools included QSAR models, knowledge-based systems, and a combination of both methods. Models based on statistical QSAR methods gave better results.
Sjögren, Erik; Westergren, Jan; Grant, Iain; Hanisch, Gunilla; Lindfors, Lennart; Lennernäs, Hans; Abrahamsson, Bertil; Tannergren, Christer
2013-07-16
Oral drug delivery is the predominant administration route for a major part of the pharmaceutical products used worldwide. Further understanding and improvement of gastrointestinal drug absorption predictions is currently a highly prioritized area of research within the pharmaceutical industry. The fraction absorbed (fabs) of an oral dose after administration of a solid dosage form is a key parameter in estimating the in vivo performance of an orally administered drug formulation. This study presents an evaluation of the predictive performance of the mechanistic physiologically based absorption model GI-Sim. GI-Sim deploys a compartmental gastrointestinal absorption and transit model as well as algorithms describing permeability, dissolution rate, salt effects, partitioning into micelles, particle and micelle drifting in the aqueous boundary layer, particle growth, and amorphous or crystalline precipitation. Twelve APIs with reported or expected absorption limitations in humans, due to permeability, dissolution and/or solubility, were investigated. Predictions of the intestinal absorption for different doses and formulations were performed based on physicochemical and biopharmaceutical properties, such as solubility in buffer and simulated intestinal fluid, molecular weight, pKa, diffusivity and molecule density, measured or estimated human effective permeability, and particle size distribution. The performance of GI-Sim was evaluated by comparing predicted plasma concentration-time profiles and oral pharmacokinetic parameters with those originating from clinical studies in healthy individuals. The capability of GI-Sim to correctly predict the impact of dose and particle size, as well as the in vivo performance of nanoformulations, was also investigated.
The overall predictive performance of GI-Sim was good, as >95% of the predicted pharmacokinetic parameters (Cmax and AUC) were within a 2-fold deviation from the clinical observations, and the predicted plasma AUC was within one standard deviation of the observed mean plasma AUC in 74% of the simulations. GI-Sim was also able to correctly capture the trends in dose- and particle size-dependent absorption for the study drugs with solubility- and dissolution-limited absorption, respectively. In addition, GI-Sim was shown to be able to predict the increase in absorption and plasma exposure achieved with nanoformulations. Based on the results, the performance of GI-Sim was shown to be suitable for early risk assessment as well as to guide decision making in pharmaceutical formulation development. Copyright © 2013 Elsevier B.V. All rights reserved.
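The "within 2-fold" acceptance criterion used above to summarize GI-Sim's predictions can be stated compactly: a predicted PK parameter passes if the larger of pred/obs and obs/pred is at most 2. The Cmax/AUC pairs below are invented for illustration, not study values.

```python
# Fold-error acceptance check, as commonly used for PK model evaluation.

def fold_error(predicted, observed):
    """Symmetric fold deviation: max(pred/obs, obs/pred)."""
    return max(predicted / observed, observed / predicted)

cases = [  # hypothetical (predicted, observed) pairs, e.g. Cmax or AUC
    (12.0, 10.0),  # 1.2-fold: pass
    (4.0, 10.0),   # 2.5-fold: fail
    (10.0, 18.0),  # 1.8-fold: pass
]
within = [fold_error(p, o) <= 2.0 for p, o in cases]
print(within)
```

The fraction of True values over all parameters gives the headline ">95% within 2-fold" style statistic.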
Sensor image prediction techniques
NASA Astrophysics Data System (ADS)
Stenger, A. J.; Stone, W. R.; Berry, L.; Murray, T. J.
1981-02-01
The preparation of prediction imagery is a complex, costly, and time-consuming process. Image prediction systems which produce a detailed replica of the image area require the extensive Defense Mapping Agency data base. The purpose of this study was to analyze the use of image predictions in order to determine whether a reduced set of more compact image features contains enough information to produce acceptable navigator performance. A job analysis of the navigator's mission tasks was performed. It showed that the cognitive and perceptual tasks he performs during navigation are identical to those performed for the targeting mission function. In addition, the results of the analysis of his performance when using a particular sensor can be extended to the analysis of his mission tasks using any sensor. An experimental approach was used to determine the relationship between navigator performance and the type and amount of information in the prediction image. A number of subjects were given image predictions containing varying levels of scene detail and different image features, and were then asked to identify the predicted targets in corresponding dynamic flight sequences over scenes of cultural, terrain, and mixed (both cultural and terrain) content.
A statistical model for predicting muscle performance
NASA Astrophysics Data System (ADS)
Byerly, Diane Leslie De Caix
The objective of these studies was to develop a capability for predicting muscle performance and fatigue to be utilized for both space- and ground-based applications. To develop this predictive model, healthy test subjects performed a defined, repetitive dynamic exercise to failure using a Lordex spinal machine. Throughout the exercise, surface electromyography (SEMG) data were collected from the erector spinae using a Mega Electronics ME3000 muscle tester and surface electrodes placed on both sides of the back muscle. These data were analyzed using a 5th order Autoregressive (AR) model and statistical regression analysis. It was determined that an AR derived parameter, the mean average magnitude of AR poles, significantly correlated with the maximum number of repetitions (designated Rmax) that a test subject was able to perform. Using the mean average magnitude of AR poles, a test subject's performance to failure could be predicted as early as the sixth repetition of the exercise. This predictive model has the potential to provide a basis for improving post-space flight recovery, monitoring muscle atrophy in astronauts and assessing the effectiveness of countermeasures, monitoring astronaut performance and fatigue during Extravehicular Activity (EVA) operations, providing pre-flight assessment of the ability of an EVA crewmember to perform a given task, improving the design of training protocols and simulations for strenuous International Space Station assembly EVA, and enabling EVA work task sequences to be planned enhancing astronaut performance and safety. 
Potential ground-based, medical applications of the predictive model include monitoring muscle deterioration and performance resulting from illness, establishing safety guidelines in the industry for repetitive tasks, monitoring the stages of rehabilitation for muscle-related injuries sustained in sports and accidents, and enhancing athletic performance through improved training protocols while reducing injury.
NASA Technical Reports Server (NTRS)
Abrahamson, Matthew J.; Oaida, Bogdan; Erkmen, Baris
2013-01-01
This paper will discuss the OPALS pointing strategy, focusing on incorporation of ISS trajectory and attitude models to build pointing predictions. Methods to extrapolate an ISS prediction based on past data will be discussed and will be compared to periodically published ISS predictions and Two-Line Element (TLE) predictions. The prediction performance will also be measured against GPS states available in telemetry. The performance of the pointing products will be compared to the allocated values in the OPALS pointing budget to assess compliance with requirements.
Kuo, Pao-Jen; Wu, Shao-Chun; Chien, Peng-Chen; Chang, Shu-Shya; Rau, Cheng-Shyuan; Tai, Hsueh-Ling; Peng, Shu-Hui; Lin, Yi-Chun; Chen, Yi-Chun; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua
2018-01-01
Background The aim of this study was to develop an effective surgical site infection (SSI) prediction model in patients receiving free-flap reconstruction after surgery for head and neck cancer using an artificial neural network (ANN), and to compare its predictive power with that of conventional logistic regression (LR). Materials and methods There were 1,836 patients with 1,854 free-flap reconstructions and 438 postoperative SSIs in the dataset for analysis. They were randomly assigned in a 7:3 ratio to a training set and a test set. Based on comprehensive characteristics of patients and diseases in the absence or presence of operative data, prediction of SSI was performed at two time points (pre-operatively and post-operatively) with a feed-forward ANN and the LR models. In addition to the calculated accuracy, sensitivity, and specificity, the predictive performance of ANN and LR was assessed based on area under the curve (AUC) measures of receiver operating characteristic curves and the Brier score. Results ANN had a significantly higher AUC (0.892) for post-operative prediction and AUC (0.808) for pre-operative prediction than LR (both P<0.0001). In addition, the AUC of post-operative prediction by ANN was significantly higher than that of pre-operative prediction (p<0.0001). With the highest AUC and the lowest Brier score (0.090), the post-operative prediction by ANN had the highest overall predictive performance. Conclusion The post-operative prediction by ANN had the highest overall performance in predicting SSI after free-flap reconstruction in patients receiving surgery for head and neck cancer. PMID:29568393
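The two headline metrics above, ROC AUC and the Brier score, are easy to compute from scratch. The sketch below (stdlib only, with toy predictions rather than study data) uses the Mann-Whitney formulation of AUC: the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counting one half.

```python
# ROC AUC (Mann-Whitney form) and Brier score on toy predictions.

def roc_auc(labels, scores):
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)

y = [1, 1, 0, 0, 1, 0]               # 1 = SSI occurred (toy outcomes)
p = [0.9, 0.8, 0.3, 0.6, 0.55, 0.5]  # toy predicted probabilities
print(round(roc_auc(y, p), 3), round(brier(y, p), 3))
```

AUC rewards correct ranking only, while the Brier score also penalizes poorly calibrated probabilities, which is why the abstract reports both.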
A new method for enhancer prediction based on deep belief network.
Bu, Hongda; Gan, Yanglan; Wang, Yang; Zhou, Shuigeng; Guan, Jihong
2017-10-16
Studies have shown that enhancers are significant regulatory elements that play crucial roles in gene expression regulation. Since enhancers are unrelated to the orientation of, and distance to, their target genes, accurately predicting distal enhancers remains a challenge for researchers. In the past years, with the development of high-throughput ChIP-seq technologies, several computational techniques have emerged to predict enhancers using epigenetic or genomic features. Nevertheless, the inconsistency of computational models across different cell lines and the unsatisfactory prediction performance call for further research in this area. Here, we propose a new Deep Belief Network (DBN) based computational method for enhancer prediction, called EnhancerDBN. This method combines diverse features, comprising DNA sequence compositional features, DNA methylation, and histone modifications. Our computational results indicate that (1) EnhancerDBN outperforms 13 existing methods in prediction, and (2) GC content and DNA methylation can serve as relevant features for enhancer prediction. Deep learning is effective in boosting the performance of enhancer prediction.
Evaluation and integration of existing methods for computational prediction of allergens
2013-01-01
Background Allergy involves a series of complex reactions and factors that contribute to the development of the disease and the triggering of symptoms, including rhinitis, asthma, atopic eczema, skin sensitivity, and even acute and fatal anaphylactic shock. Prediction and evaluation of potential allergenicity is important for the safety evaluation of foods and other environmental factors. Although several computational approaches for assessing the potential allergenicity of proteins have been developed, their performance and relative merits and shortcomings have not been compared systematically. Results To evaluate and improve the existing methods for allergen prediction, we collected an up-to-date definitive dataset consisting of 989 known allergens and massive putative non-allergens. The three most widely used computational allergen prediction approaches, namely sequence-, motif- and SVM-based (Support Vector Machine) methods, were systematically compared using the defined parameters, and we found that the SVM-based method outperformed the other two with higher accuracy and specificity. The sequence-based method with the criteria defined by FAO/WHO (FAO: Food and Agriculture Organization of the United Nations; WHO: World Health Organization) has a high sensitivity of over 98%, but a low specificity. The advantage of the motif-based method is the ability to visualize the key motif within the allergen. Notably, the performance of the sequence-based method defined by FAO/WHO and the motif-eliciting strategy could be improved by optimization of parameters. To facilitate allergen prediction, we integrated these three methods in a web-based application, proAP, which provides a global search of the known allergens and a powerful tool for allergen prediction. Flexible parameter setting and batch prediction were also implemented. The proAP can be accessed at http://gmobl.sjtu.edu.cn/proAP/main.html.
Conclusions This study comprehensively evaluated sequence-, motif- and SVM-based computational prediction approaches for allergens and optimized their parameters to obtain better performance. These findings may provide helpful guidance for researchers in allergen prediction. Furthermore, we integrated these methods into a web application, proAP, greatly facilitating customizable allergen search and prediction. PMID:23514097
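One half of the sequence-based FAO/WHO criterion mentioned above is simple enough to sketch directly: a query protein is flagged as potentially allergenic if it shares any run of 6 identical contiguous amino acids with a known allergen (the other rule, >35% identity over an 80-residue window, is omitted here). Sequences below are toy examples, not real allergens.

```python
# Exact 6-mer match rule from the FAO/WHO sequence-based criterion (sketch).

def six_mer_hit(query, allergen, k=6):
    """True if query and allergen share any identical k-length substring."""
    kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return any(query[i:i + k] in kmers for i in range(len(query) - k + 1))

known_allergen = "MKVLATALLCLAVSA"  # toy sequence
print(six_mer_hit("GGLATALLCGG", known_allergen))  # shares "LATALL"
```

Because short exact matches occur frequently by chance across large proteomes, this rule yields the high sensitivity but low specificity the abstracts report.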
Evaluation and integration of existing methods for computational prediction of allergens.
Wang, Jing; Yu, Yabin; Zhao, Yunan; Zhang, Dabing; Li, Jing
2013-01-01
Priming of Spatial Distance Enhances Children's Creative Performance
ERIC Educational Resources Information Center
Liberman, Nira; Polack, Orli; Hameiri, Boaz; Blumenfeld, Maayan
2012-01-01
According to construal level theory, psychological distance promotes more abstract thought. Theories of creativity, in turn, suggest that abstract thought promotes creativity. Based on these lines of theorizing, we predicted that spatial distancing would enhance creative performance in elementary school children. To test this prediction, we primed…
A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.
Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei
2017-10-01
The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.
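The objective that HE-L-PSO optimizes in the HP model can be sketched directly: the energy of a conformation is -1 for every pair of H residues that are adjacent on the lattice but not consecutive in the chain, and the search seeks the minimum-energy self-avoiding walk. The 4-residue 2D conformation below is a toy example, not one of the 20 benchmark structures.

```python
# HP-model energy of a lattice conformation (2D square lattice sketch).

def hp_energy(sequence, coords):
    """Count H-H topological contacts: lattice-adjacent, not chain-adjacent."""
    e = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # skip chain neighbors
            if sequence[i] == "H" and sequence[j] == "H":
                (xi, yi), (xj, yj) = coords[i], coords[j]
                if abs(xi - xj) + abs(yi - yj) == 1:  # unit lattice step
                    e -= 1
    return e

seq = "HPHH"
walk = [(0, 0), (1, 0), (1, 1), (0, 1)]  # chain folds back into a square
print(hp_energy(seq, walk))
```

A PSO-based searcher such as the one described above would evaluate this energy for each candidate conformation (particle) and drive the swarm toward lower values.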
Bondi, Robert W; Igne, Benoît; Drennen, James K; Anderson, Carl A
2012-12-01
Near-infrared spectroscopy (NIRS) is a valuable tool in the pharmaceutical industry, presenting opportunities for online analyses to achieve real-time assessment of intermediates and finished dosage forms. The purpose of this work was to investigate the effect of experimental designs on prediction performance of quantitative models based on NIRS using a five-component formulation as a model system. The following experimental designs were evaluated: five-level, full factorial (5-L FF); three-level, full factorial (3-L FF); central composite; I-optimal; and D-optimal. The factors for all designs were acetaminophen content and the ratio of microcrystalline cellulose to lactose monohydrate. Other constituents included croscarmellose sodium and magnesium stearate (content remained constant). Partial least squares-based models were generated using data from individual experimental designs that related acetaminophen content to spectral data. The effect of each experimental design was evaluated by determining the statistical significance of the difference in bias and standard error of the prediction for that model's prediction performance. The calibration model derived from the I-optimal design had similar prediction performance as did the model derived from the 5-L FF design, despite containing 16 fewer design points. It also outperformed all other models estimated from designs with similar or fewer numbers of samples. This suggested that experimental-design selection for calibration-model development is critical, and optimum performance can be achieved with efficient experimental designs (i.e., optimal designs).
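The designs compared above differ mainly in how many calibration samples they require. A full factorial design over the two factors (acetaminophen content and the MCC:lactose ratio) is just the Cartesian product of the factor levels; the levels below are illustrative placeholders, not the study's actual settings.

```python
# Enumerating full factorial designs over two factors with itertools.
from itertools import product

def full_factorial(levels_a, levels_b):
    return list(product(levels_a, levels_b))

api_5 = [5, 10, 15, 20, 25]            # hypothetical API content levels (%)
ratio_5 = [0.5, 0.75, 1.0, 1.25, 1.5]  # hypothetical MCC:lactose ratios
api_3 = [5, 15, 25]
ratio_3 = [0.5, 1.0, 1.5]

print(len(full_factorial(api_5, ratio_5)),  # 25 design points (5-level)
      len(full_factorial(api_3, ratio_3)))  # 9 design points (3-level)
```

Optimal designs (I- or D-optimal) instead select a small subset of such a candidate grid that minimizes a prediction- or parameter-variance criterion, which is how a design with far fewer points can match the full factorial's calibration performance.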
Soil-pipe interaction modeling for pipe behavior prediction with super learning based methods
NASA Astrophysics Data System (ADS)
Shi, Fang; Peng, Xiang; Liu, Huan; Hu, Yafei; Liu, Zheng; Li, Eric
2018-03-01
Underground pipelines are subject to severe distress from the surrounding expansive soil. To investigate the structural response of water mains to varying soil movements, field data, including pipe wall strains, in situ soil water content, soil pressure, and temperature, were collected. Research on monitoring data analysis has been reported, but the relationship between soil properties and pipe deformation has not been well interpreted. To characterize this relationship, this paper presents a super learning based approach combined with feature selection algorithms to predict the structural behavior of water mains in different soil environments. Furthermore, an automatic variable selection method, i.e. the recursive feature elimination algorithm, was used to identify the critical predictors contributing to pipe deformations. To investigate the adaptability of super learning to different predictive models, this research applied super learning based methods to three different datasets. The predictive performance was evaluated by R-squared, root-mean-square error, and mean absolute error. Based on the prediction performance evaluation, the superiority of super learning was validated and demonstrated by accurately predicting three types of pipe deformations. In addition, a comprehensive understanding of the working environments of water mains becomes possible.
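The core idea of super learning described above is to combine base-learner predictions with weights chosen to minimize validation error. A minimal sketch with two base models and a convex blending weight found by grid search is below; the base predictions and targets are invented numbers, and a real super learner would use cross-validated predictions and more learners.

```python
# Convex blending of two base learners' predictions (super learning sketch).

def mse(pred, truth):
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)

def best_weight(pred_a, pred_b, truth, steps=100):
    """Grid-search w in [0, 1] for the blend w*a + (1-w)*b."""
    return min(
        (w / steps for w in range(steps + 1)),
        key=lambda w: mse([w * a + (1 - w) * b
                           for a, b in zip(pred_a, pred_b)], truth),
    )

truth = [1.0, 2.0, 3.0, 4.0]
pred_a = [1.2, 1.8, 3.3, 3.9]  # e.g. a tree-based learner (toy output)
pred_b = [0.8, 2.4, 2.7, 4.3]  # e.g. a linear learner (toy output)
w = best_weight(pred_a, pred_b, truth)
blend = [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]
print(round(w, 2), round(mse(blend, truth), 4))
```

Because the grid includes w = 0 and w = 1, the blend can never do worse on the validation set than the better of the two base learners, which is the practical appeal of the approach.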
Combining in silico and in cerebro approaches for virtual screening and pose prediction in SAMPL4.
Voet, Arnout R D; Kumar, Ashutosh; Berenger, Francois; Zhang, Kam Y J
2014-04-01
The SAMPL challenges provide an ideal opportunity for unbiased evaluation and comparison of different approaches used in computational drug design. During the fourth round of this SAMPL challenge, we participated in the virtual screening and binding pose prediction on inhibitors targeting the HIV-1 integrase enzyme. For virtual screening, we used well known and widely used in silico methods combined with personal in cerebro insights and experience. Regular docking only performed slightly better than random selection, but the performance was significantly improved upon incorporation of additional filters based on pharmacophore queries and electrostatic similarities. The best performance was achieved when logical selection was added. For the pose prediction, we utilized a similar consensus approach that amalgamated the results of the Glide-XP docking with structural knowledge and rescoring. The pose prediction results revealed that docking displayed reasonable performance in predicting the binding poses. However, prediction performance can be improved utilizing scientific experience and rescoring approaches. In both the virtual screening and pose prediction challenges, the top performance was achieved by our approaches. Here we describe the methods and strategies used in our approaches and discuss the rationale of their performances.
Combining in silico and in cerebro approaches for virtual screening and pose prediction in SAMPL4
NASA Astrophysics Data System (ADS)
Voet, Arnout R. D.; Kumar, Ashutosh; Berenger, Francois; Zhang, Kam Y. J.
2014-04-01
ERIC Educational Resources Information Center
Schoer, Volker; Ntuli, Miracle; Rankin, Neil; Sebastiao, Claire; Hunt, Karen
2010-01-01
Internationally, performance in school Mathematics has been found to be a reliable predictor of performance in commerce courses at university level. Based on the predictive power of school-leaving marks, universities use results from school-leaving Mathematics examinations to rank student applicants according to their predicted abilities. However,…
NASA Technical Reports Server (NTRS)
Johnston, John D.; Parrish, Keith; Howard, Joseph M.; Mosier, Gary E.; McGinnis, Mark; Bluth, Marcel; Kim, Kevin; Ha, Hong Q.
2004-01-01
This is a continuation of a series of papers on modeling activities for JWST. The structural-thermal-optical ("STOP") analysis process is used to predict the effect of thermal distortion on optical performance. The benchmark STOP analysis for JWST assesses the effect of an observatory slew on wavefront error. The paper begins with an overview of multi-disciplinary engineering analysis, or integrated modeling, which is a critical element of the JWST mission. The STOP analysis process is then described. This process consists of the following steps: thermal analysis, structural analysis, and optical analysis. Temperatures predicted using geometric and thermal math models are mapped to the structural finite element model in order to predict thermally-induced deformations. Motions and deformations at optical surfaces are input to optical models, and optical performance is predicted using either an optical ray trace or WFE estimation techniques based on prior ray traces or first-order optics. Following the discussion of the analysis process, results are presented based on models representing the design at the time of the System Requirements Review. In addition to baseline performance predictions, sensitivity studies are performed to assess modeling uncertainties. Of particular interest is the sensitivity of optical performance to uncertainties in temperature predictions and variations in metal properties. The paper concludes with a discussion of modeling uncertainty as it pertains to STOP analysis.
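One link in the STOP chain above, the coupling from temperature change to structural deformation, reduces in its simplest form to linear thermal expansion: a temperature swing produces a strain of alpha * dT and hence a length change the optical model sees as surface motion. The material values below are rough illustrative numbers, not JWST model parameters.

```python
# First-order thermal distortion estimate (illustrative values only).

def thermal_expansion(length_m, alpha_per_k, delta_t_k):
    """Length change dL = L * alpha * dT for a uniform temperature swing."""
    return length_m * alpha_per_k * delta_t_k

# e.g. a 1 m composite strut with CTE ~ 1e-7 / K and a 5 K swing after a slew
dl = thermal_expansion(1.0, 1e-7, 5.0)
print(dl)
```

Even this sub-micrometer motion matters at JWST wavefront tolerances, which is why the full STOP analysis propagates mapped temperature fields through a finite element model rather than a hand estimate like this one.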
NASA Technical Reports Server (NTRS)
Foyle, David C.
1993-01-01
Based on existing integration models in the psychological literature, an evaluation framework is developed to assess sensor fusion displays as might be implemented in an enhanced/synthetic vision system. The proposed evaluation framework for evaluating the operator's ability to use such systems is a normative approach: The pilot's performance with the sensor fusion image is compared to models' predictions based on the pilot's performance when viewing the original component sensor images prior to fusion. This allows for the determination as to when a sensor fusion system leads to: poorer performance than one of the original sensor displays, clearly an undesirable system in which the fused sensor system causes some distortion or interference; better performance than with either single sensor system alone, but at a sub-optimal level compared to model predictions; optimal performance compared to model predictions; or, super-optimal performance, which may occur if the operator were able to use some highly diagnostic 'emergent features' in the sensor fusion display, which were unavailable in the original sensor displays.
Lindor, Noralane M; Lindor, Rachel A; Apicella, Carmel; Dowty, James G; Ashley, Amanda; Hunt, Katherine; Mincey, Betty A; Wilson, Marcia; Smith, M Cathie; Hopper, John L
2007-01-01
Models have been developed to predict the probability that a person carries a detectable germline mutation in the BRCA1 or BRCA2 genes. Their relative performance in a clinical setting is unclear. To compare the performance characteristics of four BRCA1/BRCA2 gene mutation prediction models: LAMBDA, based on a checklist and scores developed from data on Ashkenazi Jewish (AJ) women; BRCAPRO, a Bayesian computer program; modified Couch tables based on regression analyses; and Myriad II tables collated by Myriad Genetics Laboratories. Family cancer history data were analyzed from 200 probands from the Mayo Clinic Familial Cancer Program, in a multispecialty tertiary care group practice. All probands had clinical testing for BRCA1 and BRCA2 mutations conducted in a single laboratory. For each model, performance was assessed by the area under the receiver operating characteristic (ROC) curve and by tests of accuracy and dispersion. Cases "missed" by one or more models (model predicted less than 10% probability of mutation when a mutation was actually found) were compared across models. All models gave similar areas under the ROC curve of 0.71 to 0.76. All models except LAMBDA substantially under-predicted the numbers of carriers. All models were too dispersed. In terms of ranking, all prediction models performed reasonably well with similar performance characteristics. Model predictions were widely discrepant for some families. Review of cancer family histories by an experienced clinician continues to be vital to ensure that critical elements are not missed and that the most appropriate risk prediction figures are provided.
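The area under the ROC curve used to compare these models can be computed directly from its rank (Mann-Whitney) formulation. A minimal sketch with toy data — the predicted probabilities and carrier labels below are illustrative, not taken from the study:

```python
def roc_auc(labels, scores):
    """AUC via the rank (Mann-Whitney) form: the probability that a random
    mutation carrier is scored above a random non-carrier (ties count 1/2)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative predicted mutation probabilities vs. actual carrier status.
auc = roc_auc([1, 0, 1, 0, 0], [0.80, 0.30, 0.35, 0.40, 0.10])
```

An AUC of 0.71 to 0.76, as reported for all four models, means a randomly chosen carrier is ranked above a randomly chosen non-carrier roughly three times out of four.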
Zhang, Hua; Kurgan, Lukasz
2014-12-01
Knowledge of protein flexibility is vital for deciphering the corresponding functional mechanisms. This knowledge would help, for instance, in improving computational drug design and refinement in homology-based modeling. We propose a new predictor of residue flexibility, which is expressed by B-factors, from protein chains that uses local (in the chain) predicted (or native) relative solvent accessibility (RSA) and custom-derived amino acid (AA) alphabets. Our predictor is implemented as a two-stage linear regression model that uses RSA-based space in a local sequence window in the first stage and a reduced AA pair-based space in the second stage as the inputs. This method has an easy-to-comprehend explicit linear form in both stages. Particle swarm optimization was used to find an optimal reduced AA alphabet to simplify the input space and improve the prediction performance. The average correlation coefficients between the native and predicted B-factors measured on a large benchmark dataset are improved from 0.65 to 0.67 when using the native RSA values and from 0.55 to 0.57 when using the predicted RSA values. Blind tests that were performed on two independent datasets show consistent improvements in the average correlation coefficients by a modest value of 0.02 for both native and predicted RSA-based predictions.
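The first stage of such a model reduces to ordinary least squares over RSA values in a local sequence window. A minimal single-stage sketch under assumed toy data — the window width, RSA values, and B-factors below are illustrative, and the paper's reduced-AA-alphabet second stage is omitted:

```python
import numpy as np

def window_features(values, half_width):
    """Per-residue features: values in a local sequence window,
    edge-padded at the chain termini."""
    padded = np.pad(np.asarray(values, dtype=float), half_width, mode='edge')
    width = 2 * half_width + 1
    return np.array([padded[i:i + width] for i in range(len(values))])

def fit_linear_stage(X, y):
    """Least-squares fit of one explicit linear stage (with intercept)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear_stage(X, coef):
    return np.hstack([X, np.ones((len(X), 1))]) @ coef

# Toy chain: RSA values and synthetic B-factors (illustrative only).
rsa = [0.10, 0.40, 0.80, 0.70, 0.20, 0.05]
b_factors = np.array([10.0, 14.0, 22.0, 20.0, 12.0, 9.0])
X = window_features(rsa, half_width=1)
b_hat = predict_linear_stage(X, fit_linear_stage(X, b_factors))
corr = float(np.corrcoef(b_factors, b_hat)[0, 1])
```

The correlation coefficient between native and predicted B-factors, as reported in the abstract, is the natural figure of merit for such a fit.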
In silico platform for predicting and initiating β-turns in a protein at desired locations.
Singh, Harinder; Singh, Sandeep; Raghava, Gajendra P S
2015-05-01
Numerous studies have been performed for the analysis and prediction of β-turns in proteins. This study focuses on analyzing, predicting, and designing β-turns to understand the preference of amino acids in β-turn formation. We analyzed around 20,000 PDB chains to understand the preference of residues or pairs of residues at different positions in β-turns. Based on the results, a propensity-based method has been developed for predicting β-turns with an accuracy of 82%. We introduced a new approach, entitled "Turn level prediction method," which predicts the complete β-turn rather than focusing on the residues in a β-turn. Finally, we developed BetaTPred3, a Random Forest based method for predicting β-turns by utilizing various features of four residues present in β-turns. BetaTPred3 achieved an accuracy of 79% with 0.51 MCC, which is comparable to or better than existing methods on the BT426 dataset. Additionally, models were developed to predict β-turn types with better performance than other methods available in the literature. In order to improve the quality of turn prediction, we developed prediction models on a large and up-to-date dataset of 6376 nonredundant protein chains. Based on this study, a web server has been developed for the prediction of β-turns and their types in proteins. This web server also predicts the minimum number of mutations required to initiate or break a β-turn at a specified location in a protein. © 2015 Wiley Periodicals, Inc.
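A propensity-based predictor of the kind described starts from residue propensities: the frequency of a residue type within turns relative to its overall frequency. A minimal sketch on toy chains — the sequences and 0/1 turn annotations below are illustrative, not PDB-derived:

```python
from collections import Counter

def turn_propensities(sequences, turn_flags):
    """Propensity of residue type r: P(r | turn) / P(r overall).
    Values above 1 mean the residue is enriched in turns."""
    total, in_turn = Counter(), Counter()
    n_total = n_turn = 0
    for seq, flags in zip(sequences, turn_flags):
        for aa, flag in zip(seq, flags):
            total[aa] += 1
            n_total += 1
            if flag:
                in_turn[aa] += 1
                n_turn += 1
    return {aa: (in_turn[aa] / n_turn) / (count / n_total)
            for aa, count in total.items() if in_turn[aa]}

# Toy chains with parallel 0/1 turn annotations (illustrative only).
props = turn_propensities(["GPNGA", "AAGP"], [[1, 1, 1, 1, 0], [0, 0, 1, 1]])
```

In practice such propensities are tabulated per position in the four-residue turn, and a candidate window is predicted as a turn when its combined propensity exceeds a tuned threshold.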
Olexa, Edward M.; Lawrence, Rick L
2014-01-01
Federal land management agencies provide stewardship over much of the rangelands in the arid and semi-arid western United States, but they often lack data of the proper spatiotemporal resolution and extent needed to assess range conditions and monitor trends. Recent advances in the blending of complementary, remotely sensed data could provide public lands managers with the needed information. We applied the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) to five Landsat TM and concurrent Terra MODIS scenes, and used pixel-based regression and difference image analyses to evaluate the quality of synthetic reflectance and NDVI products associated with semi-arid rangeland. Predicted red reflectance data consistently demonstrated higher accuracy, less bias, and stronger correlation with observed data than did analogous near-infrared (NIR) data. The accuracy of both bands tended to decline as the lag between base and prediction dates increased; however, mean absolute errors (MAE) were typically ≤10%. The quality of area-wide NDVI estimates was less consistent than either spectral band, although the MAE of estimates predicted using early season base pairs were ≤10% throughout the growing season. Correlation between known and predicted NDVI values and agreement with the 1:1 regression line tended to decline as the prediction lag increased. Further analyses of NDVI predictions, based on a 22 June base pair and stratified by land cover/land use (LCLU), revealed accurate estimates through the growing season; however, inter-class performance varied. This work demonstrates the successful application of the STARFM algorithm to semi-arid rangeland; however, we encourage evaluation of STARFM's performance on a per product basis, stratified by LCLU, with attention given to the influence of base pair selection and the impact of the time lag.
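The evaluation described — NDVI computed from red and NIR reflectance, with synthetic products scored by mean absolute error against observed data — can be sketched as follows; the reflectance values below are illustrative, not STARFM output:

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for a single pixel."""
    return (nir - red) / (nir + red)

def mean_absolute_error(observed, predicted):
    """MAE between observed (e.g., Landsat) and synthetic values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Observed vs. predicted reflectance pairs for two pixels (illustrative).
observed = [ndvi(0.40, 0.10), ndvi(0.35, 0.12)]
predicted = [ndvi(0.38, 0.11), ndvi(0.36, 0.12)]
mae = mean_absolute_error(observed, predicted)
```

Because NDVI is a ratio of band differences, small per-band errors can compound, which is consistent with the abstract's finding that NDVI estimates were less consistent than either spectral band alone.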
Fuzzy regression modeling for tool performance prediction and degradation detection.
Li, X; Er, M J; Lim, B S; Zhou, J H; Gan, O P; Rutkowski, L
2010-10-01
In this paper, the viability of using Fuzzy-Rule-Based Regression Modeling (FRM) algorithm for tool performance and degradation detection is investigated. The FRM is developed based on a multi-layered fuzzy-rule-based hybrid system with Multiple Regression Models (MRM) embedded into a fuzzy logic inference engine that employs Self Organizing Maps (SOM) for clustering. The FRM converts a complex nonlinear problem to a simplified linear format in order to further increase the accuracy in prediction and rate of convergence. The efficacy of the proposed FRM is tested through a case study - namely to predict the remaining useful life of a ball nose milling cutter during a dry machining process of hardened tool steel with a hardness of 52-54 HRc. A comparative study is further made between four predictive models using the same set of experimental data. It is shown that the FRM is superior as compared with conventional MRM, Back Propagation Neural Networks (BPNN) and Radial Basis Function Networks (RBFN) in terms of prediction accuracy and learning speed.
Matrix factorization-based data fusion for gene function prediction in baker's yeast and slime mold.
Zitnik, Marinka; Zupan, Blaž
2014-01-01
The development of effective methods for the characterization of gene functions that are able to combine diverse data sources in a sound and easily-extendible way is an important goal in computational biology. We have previously developed a general matrix factorization-based data fusion approach for gene function prediction. In this manuscript, we show that this data fusion approach can be applied to gene function prediction and that it can fuse various heterogeneous data sources, such as gene expression profiles, known protein annotations, interaction and literature data. The fusion is achieved by simultaneous matrix tri-factorization that shares matrix factors between sources. We demonstrate the effectiveness of the approach by evaluating its performance on predicting ontological annotations in slime mold D. discoideum and on recognizing proteins of baker's yeast S. cerevisiae that participate in the ribosome or are located in the cell membrane. Our approach achieves predictive performance comparable to that of the state-of-the-art kernel-based data fusion, but requires fewer data preprocessing steps.
Network-based ranking methods for prediction of novel disease associated microRNAs.
Le, Duc-Hau
2015-10-01
Many studies have shown roles of microRNAs on human disease and a number of computational methods have been proposed to predict such associations by ranking candidate microRNAs according to their relevance to a disease. Among them, machine learning-based methods usually have a limitation in specifying non-disease microRNAs as negative training samples. Meanwhile, network-based methods are becoming dominant since they well exploit a "disease module" principle in microRNA functional similarity networks. Of which, random walk with restart (RWR) algorithm-based method is currently state-of-the-art. The use of this algorithm was inspired from its success in predicting disease gene because the "disease module" principle also exists in protein interaction networks. Besides, many algorithms designed for webpage ranking have been successfully applied in ranking disease candidate genes because web networks share topological properties with protein interaction networks. However, these algorithms have not yet been utilized for disease microRNA prediction. We constructed microRNA functional similarity networks based on shared targets of microRNAs, and then we integrated them with a microRNA functional synergistic network, which was recently identified. After analyzing topological properties of these networks, in addition to RWR, we assessed the performance of (i) PRINCE (PRIoritizatioN and Complex Elucidation), which was proposed for disease gene prediction; (ii) PageRank with Priors (PRP) and K-Step Markov (KSM), which were used for studying web networks; and (iii) a neighborhood-based algorithm. Analyses on topological properties showed that all microRNA functional similarity networks are small-worldness and scale-free. The performance of each algorithm was assessed based on average AUC values on 35 disease phenotypes and average rankings of newly discovered disease microRNAs. As a result, the performance on the integrated network was better than that on individual ones. 
In addition, the performance of PRINCE, PRP and KSM was comparable with that of RWR, whereas it was worst for the neighborhood-based algorithm. Moreover, all the algorithms were stable with respect to parameter changes. Finally, using the integrated network, we predicted six novel miRNAs (i.e., hsa-miR-101, hsa-miR-181d, hsa-miR-192, hsa-miR-423-3p, hsa-miR-484 and hsa-miR-98) associated with breast cancer. Network-based ranking algorithms, which were successfully applied for either disease gene prediction or for studying social/web networks, can also be used effectively for disease microRNA prediction. Copyright © 2015 Elsevier Ltd. All rights reserved.
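The random walk with restart (RWR) algorithm that serves as the state of the art here can be sketched in a few lines; the 4-node similarity matrix and restart probability below are illustrative toy values:

```python
import numpy as np

def random_walk_with_restart(W, seeds, restart_prob=0.7, tol=1e-10):
    """Steady-state visit probabilities of a walker on similarity matrix W
    that jumps back to the seed nodes with probability restart_prob."""
    col_sums = W.sum(axis=0)
    M = W / np.where(col_sums == 0, 1.0, col_sums)  # column-normalize
    p0 = np.zeros(W.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)
    p = p0.copy()
    while True:
        p_next = (1.0 - restart_prob) * M @ p + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next

# Toy 4-node miRNA similarity network; node 0 is the known disease miRNA.
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
scores = random_walk_with_restart(W, seeds=[0])
ranking = np.argsort(-scores)  # candidate miRNAs ranked by proximity to seed
```

Candidates are ranked by their steady-state probability; the "disease module" principle means true disease miRNAs tend to sit close to the seeds in the similarity network and therefore accumulate probability mass.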
Sayegh, Philip; Arentoft, Alyssa; Thaler, Nicholas S.; Dean, Andy C.; Thames, April D.
2014-01-01
The current study examined whether self-rated education quality predicts Wide Range Achievement Test-4th Edition (WRAT-4) Word Reading subtest and neurocognitive performance, and aimed to establish this subtest's construct validity as an educational quality measure. In a community-based adult sample (N = 106), we tested whether education quality both increased the prediction of Word Reading scores beyond demographic variables and predicted global neurocognitive functioning after adjusting for WRAT-4. As expected, race/ethnicity and education predicted WRAT-4 reading performance. Hierarchical regression revealed that when including education quality, the amount of WRAT-4's explained variance increased significantly, with race/ethnicity and both education quality and years as significant predictors. Finally, WRAT-4 scores, but not education quality, predicted neurocognitive performance. Results support WRAT-4 Word Reading as a valid proxy measure for education quality and a key predictor of neurocognitive performance. Future research should examine these findings in larger, more diverse samples to determine their robust nature. PMID:25404004
Yield performance and stability of CMS-based triticale hybrids.
Mühleisen, Jonathan; Piepho, Hans-Peter; Maurer, Hans Peter; Reif, Jochen Christoph
2015-02-01
CMS-based triticale hybrids showed only marginal midparent heterosis for grain yield and lower dynamic yield stability compared to inbred lines. Hybrids of triticale (×Triticosecale Wittmack) are expected to possess outstanding yield performance and increased dynamic yield stability. The objectives of the present study were to (1) examine the optimum choice of the biometrical model to compare yield stability of hybrids versus lines, (2) investigate whether hybrids exhibit a more pronounced grain yield performance and yield stability, and (3) study optimal strategies to predict yield stability of hybrids. Thirteen female and seven male parental lines and their 91 factorial hybrids as well as 30 commercial lines were evaluated for grain yield in up to 20 environments. Hybrids were produced using a cytoplasmic male sterility (CMS)-inducing cytoplasm that originated from Triticum timopheevii Zhuk. We found that the choice of the biometrical model can cause contrasting results and concluded that a group-by-environment interaction term should be added to the model when estimating stability variance of hybrids and lines. Midparent heterosis for grain yield was on average 3%, with a range from -15.0 to 11.5%. No hybrid outperformed the best inbred line. Hybrids had, on average, lower dynamic yield stability compared to the inbred lines. Grain yield performance of hybrids could be predicted based on midparent values and general combining ability (GCA)-predicted values. In contrast, stability variance of hybrids could be predicted only based on GCA-predicted values. We speculated that negative effects of the CMS cytoplasm used might be the reason for the low performance and yield stability of the hybrids. A detailed study of the drawbacks of the currently existing CMS system in triticale is therefore urgently required, including the search for potential alternative hybridization systems.
An integrated physiology model to study regional lung damage effects and the physiologic response
2014-01-01
Background This work expands upon a previously developed exercise dynamic physiology model (DPM) with the addition of an anatomic pulmonary system in order to quantify the impact of lung damage on oxygen transport and physical performance decrement. Methods A pulmonary model is derived with an anatomic structure based on morphometric measurements, accounting for heterogeneous ventilation and perfusion observed experimentally. The model is incorporated into an existing exercise physiology model; the combined system is validated using human exercise data. Pulmonary damage from blast, blunt trauma, and chemical injury is quantified in the model based on lung fluid infiltration (edema) which reduces oxygen delivery to the blood. The pulmonary damage component is derived and calibrated based on published animal experiments; scaling laws are used to predict the human response to lung injury in terms of physical performance decrement. Results The augmented dynamic physiology model (DPM) accurately predicted the human response to hypoxia, altitude, and exercise observed experimentally. The pulmonary damage parameters (shunt and diffusing capacity reduction) were fit to experimental animal data obtained in blast, blunt trauma, and chemical damage studies which link lung damage to lung weight change; the model is able to predict the reduced oxygen delivery in damage conditions. The model accurately estimates physical performance reduction with pulmonary damage. Conclusions We have developed a physiologically-based mathematical model to predict performance decrement endpoints in the presence of thoracic damage; simulations can be extended to estimate human performance and escape in extreme situations. PMID:25044032
Allen, D D; Bond, C A
2001-07-01
Good admissions decisions are essential for identifying successful students and good practitioners. Various parameters have been shown to have predictive power for academic success. Previous academic performance, the Pharmacy College Admissions Test (PCAT), and specific prepharmacy courses have been suggested as academic performance indicators. However, critical thinking abilities have not been evaluated. We evaluated the connection between academic success and each of the following predictive parameters: the California Critical Thinking Skills Test (CCTST) score, PCAT score, interview score, overall academic performance prior to admission at a pharmacy school, and performance in specific prepharmacy courses. We confirmed previous reports but demonstrated intriguing results in predicting practice-based skills. Critical thinking skills predict practice-based course success. Also, the CCTST and PCAT scores (Pearson correlation [pc] = 0.448, p < 0.001) were closely related in our students. The strongest predictors of practice-related courses and clerkship success were PCAT (pc=0.237, p<0.001) and CCTST (pc = 0.201, p < 0.001). These findings and other analyses suggest that PCAT may predict critical thinking skills in pharmacy practice courses and clerkships. Further study is needed to confirm this finding and determine which PCAT components predict critical thinking abilities.
Chang, Hsien-Yen; Weiner, Jonathan P
2010-01-18
Diagnosis-based risk adjustment is becoming an important issue globally as a result of its implications for payment, high-risk predictive modelling and provider performance assessment. The Taiwanese National Health Insurance (NHI) programme provides universal coverage and maintains a single national computerized claims database, which enables the application of diagnosis-based risk adjustment. However, research regarding risk adjustment is limited. This study aims to examine the performance of the Adjusted Clinical Group (ACG) case-mix system using claims-based diagnosis information from the Taiwanese NHI programme. A random sample of NHI enrollees was selected. Those continuously enrolled in 2002 were included for concurrent analyses (n = 173,234), while those in both 2002 and 2003 were included for prospective analyses (n = 164,562). Health status measures derived from 2002 diagnoses were used to explain the 2002 and 2003 health expenditure. A multivariate linear regression model was adopted after comparing the performance of seven different statistical models. Split-validation was performed in order to avoid overfitting. The performance measures were adjusted R2 and mean absolute prediction error of five types of expenditure at individual level, and predictive ratio of total expenditure at group level. The more comprehensive models performed better when used for explaining resource utilization. Adjusted R2 of total expenditure in concurrent/prospective analyses were 4.2%/4.4% in the demographic model, 15%/10% in the ACGs or ADGs (Aggregated Diagnosis Group) model, and 40%/22% in the models containing EDCs (Expanded Diagnosis Cluster). When predicting expenditure for groups based on expenditure quintiles, all models underpredicted the highest expenditure group and overpredicted the four other groups. For groups based on morbidity burden, the ACGs model had the best performance overall. 
Given the widespread availability of claims data and the superior explanatory power of claims-based risk adjustment models over demographics-only models, Taiwan's government should consider using claims-based models for policy-relevant applications. The performance of the ACG case-mix system in Taiwan was comparable to that found in other countries. This suggested that the ACG system could be applied to Taiwan's NHI even though it was originally developed in the USA. Many of the findings in this paper are likely to be relevant to other diagnosis-based risk adjustment methodologies.
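The performance measures reported above — adjusted R² at the individual level and the predictive ratio at the group level — can be sketched as follows; the expenditure values are toy numbers for illustration:

```python
def adjusted_r2(y_true, y_pred, n_predictors):
    """R^2 penalized for the number of predictors in the model."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

def predictive_ratio(y_true, y_pred):
    """Group-level calibration: predicted over actual total expenditure.
    Values below 1 indicate underprediction for the group."""
    return sum(y_pred) / sum(y_true)

adj = adjusted_r2([10.0, 12.0, 15.0, 20.0], [11.0, 12.0, 14.0, 19.0],
                  n_predictors=2)
ratio = predictive_ratio([10.0, 10.0], [8.0, 8.0])  # underprediction
```

A predictive ratio below 1 for the highest expenditure quintile is exactly the underprediction pattern the study reports for all of its models.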
Prediction on carbon dioxide emissions based on fuzzy rules
NASA Astrophysics Data System (ADS)
Pauzi, Herrini; Abdullah, Lazim
2014-06-01
There are several ways to predict air quality, varying from simple regression to models based on artificial intelligence. Most of the conventional methods are not sufficiently able to provide good forecasting performance due to the problems of non-linearity, uncertainty, and complexity of the data. Artificial intelligence techniques are successfully used in modeling air quality in order to cope with these problems. This paper describes a fuzzy inference system (FIS) to predict CO2 emissions in Malaysia. Furthermore, an adaptive neuro-fuzzy inference system (ANFIS) is used to compare the prediction performance. Data on five variables: energy use, gross domestic product per capita, population density, combustible renewables and waste, and CO2 intensity are employed in this comparative study. The results from the two proposed models are compared, and it is clearly shown that ANFIS outperforms FIS in CO2 prediction.
NASA Astrophysics Data System (ADS)
Febrian Umbara, Rian; Tarwidi, Dede; Budi Setiawan, Erwin
2018-03-01
The paper discusses the prediction of the Jakarta Composite Index (JCI) in the Indonesia Stock Exchange. The study is based on JCI historical data for 1286 days to predict the value of JCI one day ahead. This paper proposes a prediction done in two stages: the first stage uses Fuzzy Time Series (FTS) to predict the values of ten technical indicators, and the second stage uses Support Vector Regression (SVR) to predict the value of JCI one day ahead, resulting in a hybrid prediction model, FTS-SVR. The performance of this combined prediction model is compared with the performance of the single-stage prediction model using SVR only. Ten technical indicators are used as input for each model.
Life Extending Control. [mechanical fatigue in reusable rocket engines
NASA Technical Reports Server (NTRS)
Lorenzo, Carl F.; Merrill, Walter C.
1991-01-01
The concept of Life Extending Control is defined. Life is defined in terms of mechanical fatigue life. A brief description is given of the current approach to life prediction using a local, cyclic, stress-strain approach for a critical system component. An alternative approach to life prediction based on a continuous functional relationship to component performance is proposed. Based on cyclic life prediction, an approach to life extending control, called the Life Management Approach, is proposed. A second approach, also based on cyclic life prediction, called the implicit approach, is presented. Assuming the existence of the alternative functional life prediction approach, two additional concepts for Life Extending Control are presented.
Life extending control: A concept paper
NASA Technical Reports Server (NTRS)
Lorenzo, Carl F.; Merrill, Walter C.
1991-01-01
The concept of Life Extending Control is defined. Life is defined in terms of mechanical fatigue life. A brief description is given of the current approach to life prediction using a local, cyclic, stress-strain approach for a critical system component. An alternative approach to life prediction based on a continuous functional relationship to component performance is proposed. Based on cyclic life prediction, an approach to Life Extending Control, called the Life Management Approach, is proposed. A second approach, also based on cyclic life prediction, called the Implicit Approach, is presented. Assuming the existence of the alternative functional life prediction approach, two additional concepts for Life Extending Control are presented.
A Sensor Dynamic Measurement Error Prediction Model Based on NAPSO-SVM
Jiang, Minlan; Jiang, Lan; Jiang, Dingde; Li, Fei
2018-01-01
Dynamic measurement error correction is an effective way to improve sensor precision. Dynamic measurement error prediction is an important part of error correction, and support vector machine (SVM) is often used for predicting the dynamic measurement errors of sensors. Traditionally, the SVM parameters were set manually, which cannot guarantee the model's performance. In this paper, an SVM method based on an improved particle swarm optimization (NAPSO) is proposed to predict the dynamic measurement errors of sensors. Natural selection and simulated annealing are added to the PSO to raise the ability to avoid local optima. To verify the performance of NAPSO-SVM, three types of algorithms are selected to optimize the SVM's parameters: the particle swarm optimization algorithm (PSO), the improved PSO optimization algorithm (NAPSO), and the glowworm swarm optimization (GSO). The dynamic measurement error data of two sensors are applied as the test data. The root mean squared error and mean absolute percentage error are employed to evaluate the prediction models' performances. The experimental results show that among the three tested algorithms the NAPSO-SVM method has better prediction precision and smaller prediction errors, and it is an effective method for predicting the dynamic measurement errors of sensors. PMID:29342942
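The evaluation metrics (RMSE, MAPE) and a plain PSO inner loop can be sketched as follows. This is a minimal sketch: NAPSO additionally layers natural selection and simulated annealing on top of PSO, which is omitted here, and the one-dimensional toy objective stands in for the actual SVM parameter-tuning objective:

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2
                         for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error (in percent)."""
    return 100.0 * sum(abs((t - p) / t)
                       for t, p in zip(y_true, y_pred)) / len(y_true)

def pso_minimize(f, bounds, n_particles=20, n_iter=60, seed=0):
    """Minimal 1-D PSO: velocities are pulled toward each particle's
    personal best and the swarm's global best."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest, pbest_val = pos[:], [f(x) for x in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            vel[i] = (0.7 * vel[i]
                      + 1.5 * rng.random() * (pbest[i] - pos[i])
                      + 1.5 * rng.random() * (gbest - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i], val
    return gbest, gbest_val

# Toy objective: choose a model parameter c minimizing (c - 3)^2.
best_c, best_val = pso_minimize(lambda c: (c - 3.0) ** 2, bounds=(-10.0, 10.0))
```

In the paper's setup, the objective evaluated at each particle would be the cross-validated prediction error of an SVM trained with that particle's parameter values.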
Teyhen, Deydre S; Shaffer, Scott W; Butler, Robert J; Goffar, Stephen L; Kiesel, Kyle B; Rhon, Daniel I; Boyles, Robert E; McMillian, Daniel J; Williamson, Jared N; Plisky, Phillip J
2016-10-01
Performance on movement tests helps to predict injury risk in a variety of physically active populations. Understanding baseline measures for normal performance is an important first step. The objectives were to determine differences in physical performance assessments and to describe normative values for these tests based on military unit type. Assessments of power, balance, mobility, motor control, and performance on the Army Physical Fitness Test were conducted in a cohort of 1,466 soldiers. Analysis of variance was performed to compare the results based on military unit type (Rangers, Combat, Combat Service, and Combat Service Support), and analysis of covariance was performed to determine the influence of age and gender. Rangers performed the best on all performance and fitness measures (p < 0.05). Combat soldiers performed better than Combat Service and Service Support soldiers on several physical performance tests and the Army Physical Fitness Test (p < 0.05). Performance in Combat Service and Service Support soldiers was equivalent on most measures (p < 0.05). Functional performance and level of fitness varied significantly by military unit type. Understanding these differences will provide a foundation for future injury prediction and prevention strategies. Reprint & Copyright © 2016 Association of Military Surgeons of the U.S.
Machine learning-based methods for prediction of linear B-cell epitopes.
Wang, Hsin-Wei; Pai, Tun-Wen
2014-01-01
B-cell epitope prediction helps immunologists in designing peptide-based vaccines, diagnostic tests, disease prevention and treatment strategies, and antibody production. In comparison with T-cell epitope prediction, the performance of variable-length B-cell epitope prediction is still unsatisfactory. Fortunately, thanks to increasingly available verified epitope databases, bioinformaticians can apply machine learning-based algorithms to all curated data to design improved prediction tools for biomedical researchers. Here, we have reviewed related epitope prediction papers, especially those for linear B-cell epitope prediction. It should be noted that a combination of selected propensity scales and statistics of epitope residues with machine learning-based tools has become a general way of constructing linear B-cell epitope prediction systems. It is also observed from most of the comparison results that the kernel method of the support vector machine (SVM) classifier outperformed other machine learning-based approaches. Hence, in this chapter, in addition to reviewing recently published papers, we introduce the fundamentals of B-cell epitopes and SVM techniques. An example of a linear B-cell epitope prediction system based on physicochemical features and amino acid combinations is also illustrated in detail.
Performance Trends During Sleep Deprivation on a Tilt-Based Control Task.
Bolkhovsky, Jeffrey B; Ritter, Frank E; Chon, Ki H; Qin, Michael
2018-07-01
Understanding human behavior under the effects of sleep deprivation allows for the mitigation of risk due to reduced performance. To further this goal, this study investigated the effects of short-term sleep deprivation using a tilt-based control device and examined whether existing user models accurately predict targeting performance. A task in which the user tilts a surface to roll a ball into a target was developed to examine motor performance. A model was built to predict human performance on this task under various levels of sleep deprivation. Every 2 h, 10 subjects completed the task until they reached 24 h of wakefulness. Performance measurements for this task, which were based on Fitts' law, included movement time, task throughput, and time intercept. The model predicted significant performance decrements over the 24-h period, with an increase in movement time (R2 = 0.61), a decrease in throughput (R2 = 0.57), and an increase in time intercept (R2 = 0.60). However, it was found that in experimental trials there was no significant change in movement time (R2 = 0.11), throughput (R2 = 0.15), or time intercept (R2 = 0.27). These results were unexpected, as performance decrements are frequently reported during sleep deprivation. These findings suggest a reexamination of the initial assumption that sleep loss leads to a decrement in all aspects of performance. Bolkhovsky JB, Ritter FE, Chon KH, Qin M. Performance trends during sleep deprivation on a tilt-based control task. Aerosp Med Hum Perform. 2018; 89(7):626-633.
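The Fitts' law measures used here (movement time, index of difficulty, throughput) follow directly from the law's standard form MT = a + b·log2(2D/W). A sketch with hypothetical intercept, slope, and target geometry — none of these values come from the study:

```python
import math

def fitts_movement_time(a, b, distance, width):
    """Fitts' law: MT = a + b * ID, with ID = log2(2D/W) in bits.
    a is the time intercept, b the slope (s per bit)."""
    return a + b * math.log2(2.0 * distance / width)

def throughput(distance, width, movement_time):
    """Throughput (bits/s): index of difficulty divided by movement time."""
    return math.log2(2.0 * distance / width) / movement_time

# Hypothetical intercept/slope and target geometry for a tilt-control task.
mt = fitts_movement_time(a=0.2, b=0.15, distance=8.0, width=1.0)
tp = throughput(distance=8.0, width=1.0, movement_time=mt)
```

A sleep-deprivation decrement would show up as a rising intercept a or slope b over successive sessions, which is precisely what the model predicted and the experimental trials failed to show.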
Yajima, Airi; Uesawa, Yoshihiro; Ogawa, Chiaki; Yatabe, Megumi; Kondo, Naoki; Saito, Shinichiro; Suzuki, Yoshihiko; Atsuda, Kouichiro; Kagaya, Hajime
2015-05-01
There exist various useful predictive models, such as the Cockcroft-Gault model, for estimating creatinine clearance (CLcr). However, the prediction of renal function is difficult in patients with cancer treated with cisplatin. Therefore, we attempted to construct a new model for predicting CLcr in such patients. Japanese patients with head and neck cancer who had received cisplatin-based chemotherapy were used as subjects. A multiple regression equation was constructed as a model for predicting CLcr values based on background and laboratory data. A model for predicting CLcr, which included body surface area, serum creatinine and albumin, was constructed. The model exhibited good performance prior to cisplatin therapy. In addition, it performed better than previously reported models after cisplatin therapy. The predictive model constructed in the present study displayed excellent potential and was useful for estimating the renal function of patients treated with cisplatin therapy. Copyright© 2015 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
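The study's own regression coefficients are not reproduced here, but the Cockcroft-Gault model it is compared against is standard and easy to state in code:

```python
def cockcroft_gault(age_years, weight_kg, serum_creatinine_mg_dl, female=False):
    """Classical Cockcroft-Gault creatinine clearance estimate (mL/min)."""
    clcr = (140 - age_years) * weight_kg / (72.0 * serum_creatinine_mg_dl)
    return clcr * 0.85 if female else clcr
```

The paper's point is that such general-population formulas degrade after cisplatin therapy, motivating a cancer-specific multiple regression on body surface area, serum creatinine and albumin.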
Deep learning architecture for air quality predictions.
Li, Xiang; Peng, Ling; Hu, Yuan; Shao, Jing; Chi, Tianhe
2016-11-01
With the rapid development of urbanization and industrialization, many developing countries are suffering from heavy air pollution. Governments and citizens have expressed increasing concern regarding air pollution because it affects human health and sustainable development worldwide. Current air quality prediction methods mainly use shallow models and often produce unsatisfactory results, which inspired us to investigate methods of predicting air quality based on deep architecture models. In this paper, a novel spatiotemporal deep learning (STDL)-based air quality prediction method that inherently considers spatial and temporal correlations is proposed. A stacked autoencoder (SAE) model is used to extract inherent air quality features, and it is trained in a greedy layer-wise manner. Compared with traditional time series prediction models, our model can predict the air quality of all stations simultaneously and shows temporal stability across all seasons. Moreover, a comparison with the spatiotemporal artificial neural network (STANN), auto regression moving average (ARMA), and support vector regression (SVR) models demonstrates that the proposed method achieves superior performance in air quality prediction.
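Greedy layer-wise pretraining, the training scheme named for the SAE, can be sketched in miniature: each autoencoder layer is trained on the codes produced by the previous one. This is an illustrative tied-weight implementation with invented sizes and hyperparameters, not the paper's architecture.

```python
import numpy as np

def train_ae_layer(X, hidden, steps=300, lr=0.05, seed=0):
    """Train one sigmoid-encoder / linear-decoder autoencoder layer with tied
    weights by plain gradient descent; returns (encode_fn, mse_before, mse_after)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(0.0, 0.1, (d, hidden))
    b = np.zeros(hidden)
    c = np.zeros(d)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    mse = lambda: float(((sig(X @ W + b) @ W.T + c - X) ** 2).mean())
    before = mse()
    for _ in range(steps):
        H = sig(X @ W + b)
        err = (H @ W.T + c) - X               # decoder residual
        dZ = (err @ W) * H * (1.0 - H)        # backprop through the encoder
        W -= lr * (X.T @ dZ + err.T @ H) / n  # encoder + decoder grads (tied W)
        b -= lr * dZ.sum(axis=0) / n
        c -= lr * err.sum(axis=0) / n
    return (lambda A: sig(A @ W + b)), before, mse()

# Greedy layer-wise stacking: layer 2 trains on layer 1's codes.
X = np.random.default_rng(1).normal(size=(30, 4))
enc1, before1, after1 = train_ae_layer(X, hidden=3)
enc2, before2, after2 = train_ae_layer(enc1(X), hidden=2)
```

A real STDL model would train on station-by-time pollution matrices and add a supervised regression layer on top of the stacked codes.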
A Case Study on a Combination NDVI Forecasting Model Based on the Entropy Weight Method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, Shengzhi; Ming, Bo; Huang, Qiang
It is critically meaningful to accurately predict NDVI (Normalized Difference Vegetation Index), which helps guide regional ecological remediation and environmental management. In this study, a combination forecasting model (CFM) was proposed to improve the performance of NDVI predictions in the Yellow River Basin (YRB) based on three individual forecasting models, i.e., the Multiple Linear Regression (MLR), Artificial Neural Network (ANN), and Support Vector Machine (SVM) models. The entropy weight method was employed to determine the weight coefficient for each individual model depending on its predictive performance. Results showed that: (1) ANN exhibits the highest fitting capability among the four forecasting models in the calibration period, whilst its generalization ability becomes weak in the validation period; MLR has a poor performance in both calibration and validation periods; the predicted results of CFM in the calibration period have the highest stability; (2) CFM generally outperforms all individual models in the validation period, and can improve the reliability and stability of predicted results through combining the strengths while reducing the weaknesses of individual models; (3) the performances of all forecasting models are better in dense vegetation areas than in sparse vegetation areas.
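The entropy weight step is fully specified by its formula: models whose performance scores vary more across samples carry more information (lower entropy) and receive larger weights. A minimal sketch, assuming higher scores are better and all scores are positive (real applications normalize the score matrix first):

```python
import math

def entropy_weights(perf):
    """Entropy-weight coefficients for combining forecasting models.
    perf[i][j] = positive performance score of model j at sample i."""
    n, m = len(perf), len(perf[0])
    ents = []
    for j in range(m):
        col = [perf[i][j] for i in range(n)]
        total = sum(col)
        p = [v / total for v in col]
        # Normalized Shannon entropy of model j's score distribution.
        ents.append(-sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n))
    dof = [1.0 - e for e in ents]          # degree of differentiation
    return [d / sum(dof) for d in dof]
```

The combined forecast is then the weighted sum of the individual models' predictions, sum(w_j * pred_j).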
Predicting Protein-Protein Interaction Sites with a Novel Membership Based Fuzzy SVM Classifier.
Sriwastava, Brijesh K; Basu, Subhadip; Maulik, Ujjwal
2015-01-01
Predicting residues that participate in protein-protein interactions (PPI) helps to identify which amino acids are located at the interface. In this paper, we show that the performance of the classical support vector machine (SVM) algorithm can be further improved with the use of a custom-designed fuzzy membership function for the partner-specific PPI interface prediction problem. We evaluated the performance of both the classical SVM and the fuzzy SVM (F-SVM) on the PPI databases of three different model proteomes, Homo sapiens, Escherichia coli and Saccharomyces cerevisiae, and calculated the statistical significance of the developed F-SVM over the classical SVM algorithm. We also compared our performance with the available state-of-the-art fuzzy methods in this domain and observed significant performance improvements. To predict interaction sites in protein complexes, the local composition of amino acids together with their physico-chemical characteristics is used, and the F-SVM based prediction method exploits the membership function for each pair of sequence fragments. The average F-SVM performance (area under the ROC curve) on the test samples in the 10-fold cross-validation experiment is measured as 77.07, 78.39, and 74.91 percent for the aforementioned organisms, respectively. Performances on independent test sets are 72.09, 73.24 and 82.74 percent, respectively. The software is available for free download from http://code.google.com/p/cmater-bioinfo.
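A common way to build the membership function in class-centre-based fuzzy SVMs (in the style of Lin and Wang; the authors' custom membership function is partner-specific and more elaborate) is to downweight samples far from their class centre:

```python
def fuzzy_memberships(points, delta=1e-6):
    """Class-centre fuzzy memberships: samples far from their class centre get
    smaller weights. `delta` keeps the farthest sample's weight above zero."""
    centre = [sum(xs) / len(points) for xs in zip(*points)]
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, centre)) ** 0.5
    radius = max(dist(p) for p in points)
    return [1.0 - dist(p) / (radius + delta) for p in points]
```

The memberships s_i then scale the SVM penalty term to C*s_i per sample, so probable outliers contribute less to the margin optimization.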
Cunha-Cruz, Joana; Milgrom, Peter; Shirtcliff, R Michael; Bailit, Howard L; Huebner, Colleen E; Conrad, Douglas; Ludwig, Sharity; Mitchell, Melissa; Dysert, Jeanne; Allen, Gary; Scott, JoAnna; Mancl, Lloyd
2015-06-20
To improve the oral health of low-income children, innovations in dental delivery systems are needed, including community-based care, the use of expanded duty auxiliary dental personnel, capitation payments, and global budgets. This paper describes the protocol for PREDICT (Population-centered Risk- and Evidence-based Dental Interprofessional Care Team), an evaluation project to test the effectiveness of new delivery and payment systems for improving dental care and oral health. This is a parallel-group cluster randomized controlled trial. Fourteen rural Oregon counties with a publicly insured (Medicaid) population of 82,000 children (0 to 21 years old) and pregnant women served by a managed dental care organization are randomized into test and control counties. In the test intervention (PREDICT), allied dental personnel provide screening and preventive services in community settings and case managers serve as patient navigators to arrange referrals of children who need dentist services. The delivery system intervention is paired with a compensation system for high performance (pay-for-performance) with efficient performance monitoring. PREDICT focuses on the following: 1) identifying eligible children and gaining caregiver consent for services in community settings (for example, schools); 2) providing risk-based preventive and caries stabilization services efficiently at these settings; 3) providing curative care in dental clinics; and 4) incentivizing local delivery teams to meet performance benchmarks. In the control intervention, care is delivered in dental offices without performance incentives. The primary outcome is the prevalence of untreated dental caries. Other outcomes are related to process, structure and cost. Data are collected through patient and staff surveys, clinical examinations, and the review of health and administrative records. If effective, PREDICT is expected to substantially reduce disparities in dental care and oral health. 
PREDICT can be disseminated to other care organizations as publicly insured clients are increasingly served by large practice organizations. ClinicalTrials.gov NCT02312921 6 December 2014. The Robert Wood Johnson Foundation and Advantage Dental Services, LLC, are supporting the evaluation.
NASA Astrophysics Data System (ADS)
Li, Hui; Yu, Jun-Ling; Yu, Le-An; Sun, Jie
2014-05-01
Case-based reasoning (CBR) is one of the main forecasting methods in business forecasting; it performs well in prediction and can give explanations for its results. In business failure prediction (BFP), the number of failed enterprises is relatively small compared with the number of non-failed ones. However, the loss is huge when an enterprise fails. Therefore, it is necessary to develop methods, trained on imbalanced samples, that forecast well for this small proportion of failed enterprises while remaining accurate overall. Commonly used methods constructed on the assumption of balanced samples do not predict the minority class well on imbalanced samples consisting of the minority/failed enterprises and the majority/non-failed ones. This article develops a new method called clustering-based CBR (CBCBR), which integrates clustering analysis, an unsupervised process, with CBR, a supervised process, to enhance the efficiency of retrieving information from both minority and majority classes in CBR. In CBCBR, various case classes are first generated through hierarchical clustering of the stored experienced cases, and class centres are calculated by integrating the information of cases in the same clustered class. When predicting the label of a target case, its nearest clustered case class is first retrieved by ranking similarities between the target case and each clustered case class centre. Then, nearest neighbours of the target case in the determined clustered case class are retrieved. Finally, labels of the nearest experienced cases are used in prediction. In the empirical experiment with two imbalanced samples from China, the performance of CBCBR was compared with the classical CBR, a support vector machine, a logistic regression and a multivariate discriminant analysis. 
The results show that, compared with the other four methods, CBCBR performed significantly better in terms of sensitivity for identifying the minority samples while maintaining high total accuracy. The proposed approach makes CBR useful in imbalanced forecasting.
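The CBCBR retrieval cascade described above (nearest clustered case class by centre, then a vote among the k nearest cases inside it) can be sketched as follows; cluster formation via hierarchical clustering is assumed already done, and the data in the test are invented:

```python
from collections import Counter

def centroid(cases):
    """Mean feature vector of a clustered case class of (features, label) pairs."""
    feats = [f for f, _ in cases]
    return [sum(col) / len(col) for col in zip(*feats)]

def d2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cbcbr_predict(target, clusters, k=3):
    """Pick the clustered case class whose centre is nearest to the target,
    then majority-vote among its k nearest stored cases."""
    cls = min(clusters, key=lambda c: d2(target, centroid(c)))
    near = sorted(cls, key=lambda case: d2(target, case[0]))[:k]
    return Counter(label for _, label in near).most_common(1)[0][0]
```

Restricting the neighbour search to one clustered case class is what lets minority cases dominate their own neighbourhood instead of being swamped by the majority class.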
Template‐based field map prediction for rapid whole brain B0 shimming
Shi, Yuhang; Vannesjo, S. Johanna; Miller, Karla L.
2017-01-01
Purpose In typical MRI protocols, time is spent acquiring a field map to calculate the shim settings for best image quality. We propose a fast template‐based field map prediction method that yields near‐optimal shims without measuring the field. Methods The template‐based prediction method uses prior knowledge of the B0 distribution in the human brain, based on a large database of field maps acquired from different subjects, together with subject‐specific structural information from a quick localizer scan. The shimming performance of using the template‐based prediction is evaluated in comparison to a range of potential fast shimming methods. Results Static B0 shimming based on predicted field maps performed almost as well as shimming based on individually measured field maps. In experimental evaluations at 7 T, the proposed approach yielded a residual field standard deviation in the brain of on average 59 Hz, compared with 50 Hz using measured field maps and 176 Hz using no subject‐specific shim. Conclusions This work demonstrates that shimming based on predicted field maps is feasible. The field map prediction accuracy could potentially be further improved by generating the template from a subset of subjects, based on parameters such as head rotation and body mass index. Magn Reson Med 80:171–180, 2018. © 2017 The Authors Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. on behalf of International Society for Magnetic Resonance in Medicine. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. PMID:29193340
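Whether the field map is measured or predicted from a template, the shim calculation itself is a least-squares fit of shim-term fields to the negated field map. A one-term sketch (illustrative, not the authors' pipeline, which fits many shim terms over the whole brain):

```python
def one_term_shim(field, shim_basis):
    """Best single-term static shim: the coefficient c minimizing
    ||field + c * shim_basis||^2, plus the residual field per voxel."""
    num = sum(b * f for b, f in zip(shim_basis, field))
    den = sum(b * b for b in shim_basis)
    c = -num / den                     # closed-form least-squares solution
    residual = [f + c * b for f, b in zip(field, shim_basis)]
    return c, residual
```

With several shim terms this becomes a small linear least-squares system; the residual field's standard deviation is the figure of merit the abstract reports (59 Hz vs 50 Hz vs 176 Hz).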
Using the surface panel method to predict the steady performance of ducted propellers
NASA Astrophysics Data System (ADS)
Cai, Hao-Peng; Su, Yu-Min; Li, Xin; Shen, Hai-Long
2009-12-01
A new numerical method was developed for predicting the steady hydrodynamic performance of ducted propellers. A potential-based surface panel method was applied to both the duct and the propeller, and the interaction between them was solved by an induced velocity potential iterative method. Compared with the induced velocity iterative method, the presented method saves programming and computation time. Numerical results for a JD simplified ducted propeller series showed that the presented method is effective for predicting the steady hydrodynamic performance of ducted propellers.
Moghram, Basem Ameen; Nabil, Emad; Badr, Amr
2018-01-01
T-cell epitope structure identification is a significant and challenging immunoinformatic problem within epitope-based vaccine design. Epitopes, or antigenic peptides, are sets of amino acids that bind with Major Histocompatibility Complex (MHC) molecules. These peptide-MHC complexes are presented by Antigen-Presenting Cells to be inspected by T-cells. MHC-molecule-binding epitopes are responsible for triggering the immune response to antigens. The epitope's three-dimensional (3D) molecular structure (i.e., tertiary structure) reflects its proper function. Therefore, identifying the structure of MHC class-II epitopes is a significant step towards epitope-based vaccine design and understanding of the immune system. In this paper, we propose a new technique using a Genetic Algorithm for Predicting the Epitope Structure (GAPES), to predict the structure of MHC class-II epitopes based on their sequence. The proposed Elitist-based genetic algorithm for predicting the epitope's tertiary structure is based on the Ab-Initio Empirical Conformational Energy Program for Peptides (ECEPP) Force Field Model. The developed secondary structure prediction technique relies on the Ramachandran Plot. We used two alignment algorithms: ROSS alignment and TM-Score alignment. We applied four different alignment approaches to calculate the similarity scores of the dataset under test. We utilized the support vector machine (SVM) classifier as an evaluation of the prediction performance. The prediction accuracy and the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) were calculated as measures of performance. The calculations were performed on twelve similarity-reduced datasets of the Immune Epitope Database (IEDB) and a large dataset of peptide-binding affinities to HLA-DRB1*0101. The results showed that GAPES was reliable and very accurate. We achieved an average prediction accuracy of 93.50% and an average AUC of 0.974 on the IEDB dataset. 
Also, we achieved an accuracy of 95.125% and an AUC of 0.987 on the HLA-DRB1*0101 allele of the Wang benchmark dataset. The results indicate that the proposed prediction technique "GAPES" is a promising technique that will help researchers and scientists to predict the protein structure and it will assist them in the intelligent design of new epitope-based vaccines. Copyright © 2017 Elsevier B.V. All rights reserved.
The wind power prediction research based on mind evolutionary algorithm
NASA Astrophysics Data System (ADS)
Zhuang, Ling; Zhao, Xinjian; Ji, Tianming; Miao, Jingwen; Cui, Haina
2018-04-01
When wind power is connected to the power grid, its fluctuating, intermittent and random characteristics will affect the stability of the power system. Wind power prediction can guarantee power quality and reduce the operating cost of the power system. Several traditional wind power prediction methods have limitations. On this basis, a wind power prediction method based on the Mind Evolutionary Algorithm (MEA) is put forward and a prediction model is provided. The experimental results demonstrate that MEA performs efficiently in terms of wind power prediction. The MEA method has broad prospects for engineering application.
The impact of subgroup type and subgroup configurational properties on work team performance.
Carton, Andrew M; Cummings, Jonathon N
2013-09-01
Scholars have invoked subgroups in a number of theories related to teams, yet certain tensions in the literature remain unresolved. In this article, we address 2 of these tensions, both relating to how subgroups are configured in work teams: (a) whether teams perform better with a greater number of subgroups and (b) whether teams perform better when they have imbalanced subgroups (majorities and minorities are present) or balanced subgroups (subgroups are of equal size). We predict that the impact of the number and balance of subgroups depends on the type of subgroup-whether subgroups are formed according to social identity (i.e., identity-based subgroups) or information processing (i.e., knowledge-based subgroups). We first propose that teams are more adversely affected by 2 identity-based subgroups than by any other number, yet the uniquely negative impact of a 2-subgroup configuration is not apparent for knowledge-based subgroups. Instead, a larger number of knowledge-based subgroups is beneficial for performance, such that 2 subgroups is worse for performance when compared with 3 or more subgroups but better for performance when compared with no subgroups or 1 subgroup. Second, we argue that teams perform better when identity-based subgroups are imbalanced yet knowledge-based subgroups are balanced. We also suggest that there are interactive effects between the number and balance of subgroups-however, the nature of this interaction depends on the type of subgroup. To test these predictions, we developed and validated an algorithm that measures the configurational properties of subgroups in organizational work teams. Results of a field study of 326 work teams from a multinational organization support our predictions. PsycINFO Database Record (c) 2013 APA, all rights reserved
ShinyGPAS: interactive genomic prediction accuracy simulator based on deterministic formulas.
Morota, Gota
2017-12-20
Deterministic formulas for the accuracy of genomic predictions highlight the relationships among prediction accuracy and potential factors influencing prediction accuracy prior to performing computationally intensive cross-validation. Visualizing such deterministic formulas in an interactive manner may lead to a better understanding of how genetic factors control prediction accuracy. The software to simulate deterministic formulas for genomic prediction accuracy was implemented in R and encapsulated as a web-based Shiny application. Shiny genomic prediction accuracy simulator (ShinyGPAS) simulates various deterministic formulas and delivers dynamic scatter plots of prediction accuracy versus genetic factors impacting prediction accuracy, while requiring only mouse navigation in a web browser. ShinyGPAS is available at: https://chikudaisei.shinyapps.io/shinygpas/ . ShinyGPAS is a shiny-based interactive genomic prediction accuracy simulator using deterministic formulas. It can be used for interactively exploring potential factors that influence prediction accuracy in genome-enabled prediction, simulating achievable prediction accuracy prior to genotyping individuals, or supporting in-class teaching. ShinyGPAS is open source software and it is hosted online as a freely available web-based resource with an intuitive graphical user interface.
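One of the deterministic formulas such a simulator typically exposes is the Daetwyler et al. expectation of genomic prediction accuracy, which makes the roles of training-population size N, heritability h2, and the number of independent chromosome segments Me explicit:

```python
import math

def daetwyler_accuracy(n, h2, me):
    """Daetwyler et al. deterministic expectation of genomic prediction
    accuracy: r = sqrt(N*h2 / (N*h2 + Me))."""
    return math.sqrt(n * h2 / (n * h2 + me))
```

Increasing N or h2 raises the expected accuracy, while more independent segments Me (larger effective genome) lowers it, which is exactly the kind of trade-off an interactive simulator lets users explore before genotyping.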
Seok, Junhee; Kaushal, Amit; Davis, Ronald W; Xiao, Wenzhong
2010-01-18
The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data.
A Comparison of Three Strategies for Scale Construction to Predict a Specific Behavioral Outcome
ERIC Educational Resources Information Center
Garb, Howard N.; Wood, James M.; Fiedler, Edna R.
2011-01-01
Using 65 items from a mental health screening questionnaire, the History Opinion Inventory-Revised (HOI-R), the present study compared three strategies of scale construction--(1) internal (based on factor analysis), (2) external (based on empirical performance) and (3) intuitive (based on clinicians' opinion)--to predict whether 203,595 U.S. Air…
Maximizing lipocalin prediction through balanced and diversified training set and decision fusion.
Nath, Abhigyan; Subbiah, Karthikeyan
2015-12-01
Lipocalins are short in sequence length and perform several important biological functions. These proteins have less than 20% sequence similarity among paralogs. Identifying them experimentally is an expensive and time-consuming process. Computational methods based on sequence similarity for assigning putative members to this family are also largely ineffective because of the low sequence similarity among family members. Consequently, machine learning methods become a viable alternative for their prediction, using underlying sequence- or structure-derived features as the input. Ideally, any machine learning based prediction method must be trained with all possible variations in the input feature vector (all the sub-class input patterns) to achieve perfect learning. Near-perfect learning can be achieved by training the model with diverse types of input instances belonging to different regions of the entire input space. Furthermore, prediction performance can be improved by balancing the training set, as imbalanced data sets tend to bias the prediction towards the majority class and its sub-classes. This paper aims to achieve (i) high generalization ability without any classification bias, through diversified and balanced training sets, and (ii) enhanced prediction accuracy, by combining the results of individual classifiers with an appropriate fusion scheme. Instead of creating the training set randomly, we first used the unsupervised K-means clustering algorithm to create diversified clusters of input patterns and created the diversified and balanced training set by selecting an equal number of patterns from each of these clusters. 
Finally, a probability-based classifier fusion scheme was applied to a boosted random forest algorithm (which produced greater sensitivity) and a K-nearest-neighbour algorithm (which produced greater specificity) to achieve better predictive performance than that of the individual base classifiers. The performance of models trained on the K-means-preprocessed training set is far better than that of models trained on randomly generated training sets. The proposed method achieved a sensitivity of 90.6%, specificity of 91.4% and accuracy of 91.0% on the first test set, and a sensitivity of 92.9%, specificity of 96.2% and accuracy of 94.7% on the second blind test set. These results establish that diversifying the training set improves the performance of predictive models through superior generalization ability, and that balancing the training set improves prediction accuracy. For smaller data sets, unsupervised K-means-based sampling can be more effective at increasing generalization than the usual random splitting method. Copyright © 2015 Elsevier Ltd. All rights reserved.
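The balanced-sampling idea, drawing equally from each K-means cluster instead of splitting at random, reduces to a few lines; this sketch assumes the cluster assignments already exist:

```python
import random

def balanced_training_set(clusters, per_cluster, seed=7):
    """Draw an equal number of cases from each cluster (here the clusters are
    assumed to come from a prior unsupervised K-means step, as in the paper)."""
    rng = random.Random(seed)
    sample = []
    for cluster in clusters:
        k = min(per_cluster, len(cluster))   # small clusters contribute all cases
        sample.extend(rng.sample(cluster, k))
    return sample
```

Because every region of the input space contributes the same number of patterns, the resulting training set is both diversified and balanced, which is the paper's stated goal.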
Hoffmann, Janina A; von Helversen, Bettina; Rieskamp, Jörg
2014-12-01
Making accurate judgments is an essential skill in everyday life. Although how different memory abilities relate to categorization and judgment processes has been hotly debated, the question is far from resolved. We contribute to the solution by investigating how individual differences in memory abilities affect judgment performance in 2 tasks that induced rule-based or exemplar-based judgment strategies. In a study with 279 participants, we investigated how working memory and episodic memory affect judgment accuracy and strategy use. As predicted, participants switched strategies between tasks. Furthermore, structural equation modeling showed that the ability to solve rule-based tasks was predicted by working memory, whereas episodic memory predicted judgment accuracy in the exemplar-based task. Last, the probability of choosing an exemplar-based strategy was related to better episodic memory, but strategy selection was unrelated to working memory capacity. In sum, our results suggest that different memory abilities are essential for successfully adopting different judgment strategies. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Uncertainty aggregation and reduction in structure-material performance prediction
NASA Astrophysics Data System (ADS)
Hu, Zhen; Mahadevan, Sankaran; Ao, Dan
2018-02-01
An uncertainty aggregation and reduction framework is presented for structure-material performance prediction. Different types of uncertainty sources, structural analysis model, and material performance prediction model are connected through a Bayesian network for systematic uncertainty aggregation analysis. To reduce the uncertainty in the computational structure-material performance prediction model, Bayesian updating using experimental observation data is investigated based on the Bayesian network. It is observed that the Bayesian updating results will have large error if the model cannot accurately represent the actual physics, and that this error will be propagated to the predicted performance distribution. To address this issue, this paper proposes a novel uncertainty reduction method by integrating Bayesian calibration with model validation adaptively. The observation domain of the quantity of interest is first discretized into multiple segments. An adaptive algorithm is then developed to perform model validation and Bayesian updating over these observation segments sequentially. Only information from observation segments where the model prediction is highly reliable is used for Bayesian updating; this is found to increase the effectiveness and efficiency of uncertainty reduction. A composite rotorcraft hub component fatigue life prediction model, which combines a finite element structural analysis model and a material damage model, is used to demonstrate the proposed method.
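The adaptive element, folding in only observation segments where the model is judged reliable, can be illustrated with a discrete Bayesian update; the two-hypothesis numbers in the test are invented, and the real method works on a full Bayesian network:

```python
def segmented_update(prior, segment_likelihoods, segment_reliable):
    """Sequentially update a discrete posterior, but only over observation
    segments flagged as reliable by model validation (the adaptive idea)."""
    post = prior[:]
    for lik, ok in zip(segment_likelihoods, segment_reliable):
        if not ok:
            continue  # skip segments where the model failed validation
        post = [p * l for p, l in zip(post, lik)]
        z = sum(post)
        post = [v / z for v in post]
    return post
```

Skipping unvalidated segments prevents model-form error in those regions from being absorbed into the calibrated parameters and then propagated into the fatigue-life prediction.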
NASA Astrophysics Data System (ADS)
Baumgartner, Matthew P.; Evans, David A.
2018-01-01
Two of the major ongoing challenges in computational drug discovery are predicting the binding pose and affinity of a compound to a protein. The Drug Design Data Resource Grand Challenge 2 was developed to address these problems and to drive development of new methods. The challenge provided the 2D structures of compounds for which the organizers held blinded data in the form of 35 X-ray crystal structures and 102 binding affinity measurements and challenged participants to predict the binding pose and affinity of the compounds. We tested a number of pose prediction methods as part of the challenge; we found that docking methods that incorporate protein flexibility (Induced Fit Docking) outperformed methods that treated the protein as rigid. We also found that using binding pose metadynamics, a molecular dynamics based method, to score docked poses provided the best predictions of our methods with an average RMSD of 2.01 Å. We tested both structure-based (e.g. docking) and ligand-based methods (e.g. QSAR) in the affinity prediction portion of the competition. We found that our structure-based methods based on docking with Smina (Spearman ρ = 0.614) performed slightly better than our ligand-based methods (ρ = 0.543), and had equivalent performance with the other top methods in the competition. Despite the overall good performance of our methods in comparison to other participants in the challenge, there exists significant room for improvement, especially in cases such as these where protein flexibility plays such a large role.
Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu
2013-01-01
The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability. PMID:23861920
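The SSNR criterion itself is specific to the paper, but the generic decision it supports, picking the smallest training size whose performance has reached the plateau of the learning curve, can be sketched as:

```python
def minimum_training_size(curve, tolerance=0.02):
    """Given (n_train, performance) pairs from a learning-curve experiment,
    return the smallest n whose performance is within `tolerance` of the best
    observed value. This is the generic plateau rule; the paper's SSNR
    protocol derives its threshold from the signal-to-noise ratio instead."""
    best = max(score for _, score in curve)
    return min(n for n, score in curve if score >= best - tolerance)
```

In practice the curve would come from repeatedly training the classifier on nested subsets of the MAQC-II training data and scoring it on held-out samples.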
García-Jacas, César R; Contreras-Torres, Ernesto; Marrero-Ponce, Yovani; Pupo-Meriño, Mario; Barigye, Stephen J; Cabrera-Leyva, Lisset
2016-01-01
Recently, novel 3D alignment-free molecular descriptors (also known as QuBiLS-MIDAS) based on two-linear, three-linear and four-linear algebraic forms have been introduced. These descriptors codify chemical information for relations between two, three and four atoms by using several (dis-)similarity metrics and multi-metrics. Several studies aimed at assessing the quality of these novel descriptors have been performed. However, a deeper analysis of their performance is necessary. Therefore, in the present manuscript an assessment and statistical validation of the performance of these novel descriptors in QSAR studies is performed. To this end, eight molecular datasets (angiotensin converting enzyme, acetylcholinesterase inhibitors, benzodiazepine receptor, cyclooxygenase-2 inhibitors, dihydrofolate reductase inhibitors, glycogen phosphorylase b, thermolysin inhibitors, thrombin inhibitors) widely used as benchmarks in the evaluation of several procedures are utilized. QSAR models with three to nine variables, based on Multiple Linear Regression, are built for each chemical dataset according to the original division into training/test sets. Comparisons with respect to leave-one-out cross-validation correlation coefficients [Formula: see text] reveal that the models based on QuBiLS-MIDAS indices possess superior predictive ability in 7 of the 8 datasets analyzed, outperforming methodologies based on similar or more complex techniques such as Partial Least Squares, Neural Networks, Support Vector Machines and others. On the other hand, superior external correlation coefficients [Formula: see text] are attained in 6 of the 8 test sets considered, confirming the good predictive power of the obtained models. For the [Formula: see text] values, non-parametric statistical tests were performed, which demonstrated that the models based on QuBiLS-MIDAS indices have the best global performance and yield significantly better predictions in 11 of the 12 QSAR procedures used in the comparison.
Lastly, a study concerning the performance of the indices according to several conformer generation methods was performed. It demonstrated that the quality of the predictions of QSAR models based on QuBiLS-MIDAS indices depends on the 3D structure generation method considered, although in this preliminary study the results achieved do not present significant statistical differences among them. In conclusion, the QuBiLS-MIDAS indices are suitable for extracting structural information from molecules and thus constitute a promising alternative for building models that contribute to the prediction of pharmacokinetic, pharmacodynamic, and toxicological properties of novel compounds. Graphical abstract: Comparative graphical representation of the performance of the novel QuBiLS-MIDAS 3D-MDs with respect to other methodologies in QSAR modeling of eight chemical datasets.
Testing the Predictive Power of Coulomb Stress on Aftershock Sequences
NASA Astrophysics Data System (ADS)
Woessner, J.; Lombardi, A.; Werner, M. J.; Marzocchi, W.
2009-12-01
Empirical and statistical models of clustered seismicity are usually strongly stochastic and perceived to be uninformative in their forecasts, since only marginal distributions are used, such as the Omori-Utsu and Gutenberg-Richter laws. In contrast, so-called physics-based aftershock models, based on seismic rate changes calculated from Coulomb stress changes and rate-and-state friction, make more specific predictions: anisotropic stress shadows and multiplicative rate changes. We test the predictive power of models based on Coulomb stress changes against statistical models, including the popular Short Term Earthquake Probabilities and Epidemic-Type Aftershock Sequences models: We score and compare retrospective forecasts on the aftershock sequences of the 1992 Landers, USA, the 1997 Colfiorito, Italy, and the 2008 Selfoss, Iceland, earthquakes. To quantify predictability, we use likelihood-based metrics that test the consistency of the forecasts with the data, including modified and existing tests used in prospective forecast experiments within the Collaboratory for the Study of Earthquake Predictability (CSEP). Our results indicate that a statistical model performs best. Moreover, two Coulomb model classes seem unable to compete: Models based on deterministic Coulomb stress changes calculated from a given fault-slip model, and those based on fixed receiver faults. One model of Coulomb stress changes does perform well and sometimes outperforms the statistical models, but its predictive information is diluted, because of uncertainties included in the fault-slip model. Our results suggest that models based on Coulomb stress changes need to incorporate stochastic features that represent model and data uncertainty.
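The likelihood-based scoring used above to rank forecasts can be illustrated with a minimal sketch. The rates and counts below are hypothetical, not the study's data; the point is only that each model issues an expected number of aftershocks per space bin, and the joint Poisson log-likelihood of the observed counts compares the forecasts.

```python
import math

def poisson_loglik(rates, counts):
    """Joint Poisson log-likelihood of observed bin counts under forecast rates."""
    return sum(-r + k * math.log(r) - math.lgamma(k + 1)
               for r, k in zip(rates, counts))

# Hypothetical aftershock counts in 4 space bins, plus two competing rate forecasts
observed    = [12, 3, 0, 7]
statistical = [10.0, 4.0, 1.0, 6.0]   # e.g. an ETAS-like forecast (illustrative numbers)
coulomb     = [15.0, 1.0, 0.5, 9.0]   # e.g. a stress-change-based forecast (illustrative)

ll_stat = poisson_loglik(statistical, observed)
ll_coul = poisson_loglik(coulomb, observed)
```

In this toy configuration the statistical forecast scores higher, mirroring the abstract's qualitative conclusion; real CSEP tests add spatial, magnitude, and number consistency checks on top of this likelihood.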
Predicting outcome of status epilepticus.
Leitinger, M; Kalss, G; Rohracher, A; Pilz, G; Novak, H; Höfler, J; Deak, I; Kuchukhidze, G; Dobesberger, J; Wakonig, A; Trinka, E
2015-08-01
Status epilepticus (SE) is a frequent neurological emergency complicated by high mortality and often poor functional outcome in survivors. The aim of this study was to review the available clinical scores for predicting outcome. Literature review: PubMed search terms were "score", "outcome", and "status epilepticus" (April 9th, 2015). Publications with abstracts available in English were included, with no other language restrictions or restrictions concerning the investigated patients. Two scores were identified: the "Status Epilepticus Severity Score (STESS)" and the "Epidemiology-based Mortality score in SE (EMSE)". A comprehensive comparison of test parameters concerning performance, options, and limitations was performed. EMSE allows detailed individualization of risk factors and was significantly superior to STESS in a retrospective explorative study. In particular, EMSE is very good at detecting both good and bad outcomes, whereas STESS's detection of bad outcome is limited by a ceiling effect and uncertainty about the correct cutoff value. EMSE can be adapted to different regions of the world and to advances in medicine as new data emerge. In addition, we designed a reporting standard for status epilepticus to enhance the acquisition and communication of outcome-relevant data. A data acquisition sheet, used from patient admission in the emergency room and the EEG lab through to the intensive care unit, is provided for optimized data collection. STESS is easy to perform and predicts bad outcome, but has a low predictive value for good outcomes. EMSE is superior to STESS in predicting good or bad outcome, but needs marginally more time to perform. EMSE may prove very useful for risk stratification in interventional studies and is recommended for individual outcome prediction.
Prospective validation in different cohorts is needed for EMSE, whereas STESS needs further validation in cohorts with a wider range of etiologies. This article is part of a Special Issue entitled "Status Epilepticus". Copyright © 2015. Published by Elsevier Inc.
Development of 1RM Prediction Equations for Bench Press in Moderately Trained Men.
Macht, Jordan W; Abel, Mark G; Mullineaux, David R; Yates, James W
2016-10-01
Macht, JW, Abel, MG, Mullineaux, DR, and Yates, JW. Development of 1RM prediction equations for bench press in moderately trained men. J Strength Cond Res 30(10): 2901-2906, 2016-There are a variety of established 1 repetition maximum (1RM) prediction equations; however, very few prediction equations use anthropometric characteristics, exclusively or in part, to estimate 1RM strength. Therefore, the purpose of this study was to develop an original 1RM prediction equation for bench press using anthropometric and performance characteristics in moderately trained male subjects. Sixty male subjects (21.2 ± 2.4 years) completed a 1RM bench press and were randomly assigned a load to complete as many repetitions as possible. In addition, body composition, upper-body anthropometric characteristics, and handgrip strength were assessed. Regression analysis was used to develop a performance-based 1RM prediction equation: 1RM = 1.20 repetition weight + 2.19 repetitions to fatigue - 0.56 biacromial width (cm) + 9.6 (R = 0.99, standard error of estimate [SEE] = 3.5 kg). Regression analysis to develop a nonperformance-based 1RM prediction equation yielded: 1RM (kg) = 0.997 cross-sectional area (CSA) (cm) + 0.401 chest circumference (cm) - 0.385 %fat - 0.185 arm length (cm) + 36.7 (R = 0.81, SEE = 13.0 kg). The performance prediction equations developed in this study had high validity coefficients, minimal mean bias, and small limits of agreement. The anthropometric equations had moderately high validity coefficients but larger limits of agreement. The practical applications of this study indicate that the inclusion of anthropometric characteristics and performance variables produces a valid prediction equation for 1RM strength. In addition, the CSA of the arm provides a simple nonperformance-based method of estimating the lifter's 1RM. This information may be used to predict the starting load for a lifter performing a 1RM prediction protocol or a 1RM testing protocol.
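The two regression equations reported in the abstract can be written directly as functions. The coefficients are taken from the abstract; the example inputs (an 80 kg load lifted for 5 repetitions by a lifter with a 41 cm biacromial width) are hypothetical.

```python
def predict_1rm_performance(rep_weight_kg, reps_to_fatigue, biacromial_cm):
    """Performance-based equation from the abstract (R = 0.99, SEE = 3.5 kg)."""
    return 1.20 * rep_weight_kg + 2.19 * reps_to_fatigue - 0.56 * biacromial_cm + 9.6

def predict_1rm_anthropometric(csa_cm, chest_circ_cm, pct_fat, arm_length_cm):
    """Anthropometric equation from the abstract (R = 0.81, SEE = 13.0 kg)."""
    return (0.997 * csa_cm + 0.401 * chest_circ_cm
            - 0.385 * pct_fat - 0.185 * arm_length_cm + 36.7)

# Hypothetical lifter: 80 kg for 5 reps, biacromial width 41 cm
est = predict_1rm_performance(80, 5, 41)   # about 93.6 kg
```

Given the reported SEE of 3.5 kg, such an estimate would typically be used to choose a conservative starting load for a formal 1RM test.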
Cui, Zaixu; Gong, Gaolang
2018-06-02
Individualized behavioral/cognitive prediction using machine learning (ML) regression approaches is becoming increasingly applied. The specific ML regression algorithm and sample size are two key factors that non-trivially influence prediction accuracies. However, the effects of the ML regression algorithm and sample size on individualized behavioral/cognitive prediction performance have not been comprehensively assessed. To address this issue, the present study included six commonly used ML regression algorithms: ordinary least squares (OLS) regression, least absolute shrinkage and selection operator (LASSO) regression, ridge regression, elastic-net regression, linear support vector regression (LSVR), and relevance vector regression (RVR), to perform specific behavioral/cognitive predictions based on different sample sizes. Specifically, the publicly available resting-state functional MRI (rs-fMRI) dataset from the Human Connectome Project (HCP) was used, and whole-brain resting-state functional connectivity (rsFC) or rsFC strength (rsFCS) were extracted as prediction features. Twenty-five sample sizes (ranging from 20 to 700) were studied by sub-sampling from the entire HCP cohort. The analyses showed that rsFC-based LASSO regression performed remarkably worse than the other algorithms, and rsFCS-based OLS regression performed markedly worse than the other algorithms. Regardless of the algorithm and feature type, both the prediction accuracy and its stability exponentially increased with increasing sample size. The specific patterns of the observed algorithm and sample size effects were well replicated in the prediction using re-test fMRI data, data processed by different imaging preprocessing schemes, and different behavioral/cognitive scores, thus indicating excellent robustness/generalization of the effects.
The current findings provide critical insight into how the selected ML regression algorithm and sample size influence individualized predictions of behavior/cognition and offer important guidance for choosing the ML regression algorithm or sample size in relevant investigations. Copyright © 2018 Elsevier Inc. All rights reserved.
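The sub-sampling design described above can be sketched in a few lines. This is a synthetic illustration, not the HCP analysis: Gaussian features stand in for rsFC, a closed-form ridge regression stands in for the six algorithms, and held-out prediction accuracy is measured as the correlation between predicted and observed scores at increasing training sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

n_total, n_feat = 900, 50          # synthetic stand-in for rsFC-like features
w_true = rng.normal(size=n_feat)
X = rng.normal(size=(n_total, n_feat))
y = X @ w_true + rng.normal(scale=2.0, size=n_total)

X_test, y_test = X[700:], y[700:]  # held-out evaluation set

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X'X + alpha*I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

scores = {}
for n in (20, 100, 400):           # sub-sampled training sizes, as in the study's design
    w = ridge_fit(X[:n], y[:n])
    pred = X_test @ w
    scores[n] = np.corrcoef(pred, y_test)[0, 1]
```

Under this setup the held-out correlation rises steeply with training size, qualitatively reproducing the abstract's sample-size effect.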
Aris-Brosou, Stephane; Kim, James; Li, Li; Liu, Hui
2018-05-15
Vendors in the health care industry produce diagnostic systems that, through a secured connection, allow them to monitor performance almost in real time. However, challenges exist in analyzing and interpreting large volumes of noisy quality control (QC) data. As a result, some QC shifts may not be detected early enough by the vendor, but instead lead a customer to complain. We hypothesized that a more proactive response could be designed by utilizing the collected QC data more efficiently. Our aim was therefore to help prevent customer complaints by predicting them based on the QC data collected by in vitro diagnostic systems. QC data from five select in vitro diagnostic assays were combined with the corresponding database of customer complaints over a period of 90 days. A subset of these data over the last 45 days was also analyzed to assess how the length of the training period affects predictions. We defined a set of features used to train two classifiers, one based on decision trees and the other based on adaptive boosting, and assessed model performance by cross-validation. The cross-validations showed classification error rates close to zero for some assays with adaptive boosting when predicting the potential cause of customer complaints. Performance was improved by shortening the training period when the volume of complaints increased. Denoising filters that reduced the number of categories to predict further improved performance, as their application simplified the prediction problem. This novel approach to predicting customer complaints based on QC data may allow the diagnostic industry, the expected end user of our approach, to proactively identify potential product quality issues and fix them before receiving customer complaints. This represents a new step in the direction of using big data toward product quality improvement. ©Stephane Aris-Brosou, James Kim, Li Li, Hui Liu.
Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 15.05.2018.
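The adaptive-boosting-with-cross-validation workflow described above can be sketched from scratch. This is not the study's feature set or model: the data are synthetic (a level shift in one "QC channel" stands in for a complaint-causing signal), the weak learner is a threshold decision stump, and the ensemble is discrete AdaBoost evaluated on a held-out fold.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for QC features; a shift in channels 0/1 signals a complaint
n = 400
X = rng.normal(size=(n, 5))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0.3, 1, -1)   # labels in {-1, +1}

def stump_train(X, y, w):
    """Best threshold stump (feature, threshold, polarity) under sample weights w."""
    best = (1.0, 0, 0.0, 1)                           # (error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], np.linspace(0.1, 0.9, 9)):
            for pol in (1, -1):
                pred = pol * np.where(X[:, j] > t, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, t, pol)
    return best

def adaboost(X, y, rounds=20):
    """Discrete AdaBoost over threshold stumps."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(rounds):
        err, j, t, pol = stump_train(X, y, w)
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)         # weak-learner weight
        pred = pol * np.where(X[:, j] > t, 1, -1)
        w *= np.exp(-alpha * y * pred)                # up-weight misclassified samples
        w /= w.sum()
        ensemble.append((alpha, j, t, pol))
    return ensemble

def predict(ensemble, X):
    agg = sum(a * p * np.where(X[:, j] > t, 1, -1) for a, j, t, p in ensemble)
    return np.where(agg > 0, 1, -1)

# Simple 2-fold split standing in for the study's cross-validation
half = n // 2
model = adaboost(X[:half], y[:half])
acc = (predict(model, X[half:]) == y[half:]).mean()
```

In practice one would use a production library (e.g. scikit-learn's AdaBoostClassifier) rather than this sketch, but the reweighting loop is the mechanism the abstract's "adaptive boosting" refers to.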
Panagou, Efstathios Z; Nychas, George-John E
2008-09-01
A product-specific model was developed and validated under dynamic temperature conditions for predicting the growth of Listeria monocytogenes in pasteurized vanilla cream, a traditional milk-based product. Model performance was also compared with the Growth Predictor and Sym'Previus predictive microbiology software packages. Commercially prepared vanilla cream samples were artificially inoculated with a five-strain cocktail of L. monocytogenes, with an initial concentration of 10² CFU g⁻¹, and stored at 3, 5, 10, and 15 °C for 36 days. The growth kinetic parameters at each temperature were determined by the primary model of Baranyi and Roberts. The maximum specific growth rate (μmax) was further modeled as a function of temperature by means of a square root-type model. The performance of the model in predicting the growth of the pathogen under dynamic temperature conditions was based on two different temperature scenarios with periodic changes from 4 to 15 °C. Growth prediction for dynamic temperature profiles was based on the square root model and the differential equations of the Baranyi and Roberts model, which were numerically integrated with respect to time. Model performance was assessed using the bias factor (Bf), the accuracy factor (Af), the goodness-of-fit index (GoF), and the percent relative errors between observed and predicted growth. The product-specific model developed in the present study accurately predicted the growth of L. monocytogenes under dynamic temperature conditions. The average values for the performance indices were 1.038, 1.068, and 0.397 for Bf, Af, and GoF, respectively, for both temperature scenarios assayed. Predictions from Growth Predictor and Sym'Previus overestimated pathogen growth. The average values of Bf, Af, and GoF were 1.173, 1.174, 1.162 and 0.956, 1.115, 0.713 for Growth Predictor and Sym'Previus, respectively.
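The bias and accuracy factors used above to validate the model are commonly defined (following Ross) as Bf = 10^(mean log10(predicted/observed)) and Af = 10^(mean |log10(predicted/observed)|). A minimal sketch, with hypothetical predicted and observed counts:

```python
import math

def bias_accuracy_factors(predicted, observed):
    """Bf = 10**mean(log10(P/O)); Af = 10**mean(|log10(P/O)|) (Ross-type indices)."""
    ratios = [math.log10(p / o) for p, o in zip(predicted, observed)]
    bf = 10 ** (sum(ratios) / len(ratios))
    af = 10 ** (sum(abs(r) for r in ratios) / len(ratios))
    return bf, af

# Hypothetical predicted vs. observed values at four sampling times
pred = [3.2, 4.1, 5.0, 6.3]
obs  = [3.0, 4.0, 5.2, 6.0]
bf, af = bias_accuracy_factors(pred, obs)
```

Bf > 1 indicates systematic overprediction (the fail-safe direction for pathogen growth), which is why Growth Predictor's Bf of 1.173 reads as an overestimate; Af is always at least as far from 1 as Bf.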
Wang, Xun-Heng; Jiao, Yun; Li, Lihua
2017-10-24
Attention deficit hyperactivity disorder (ADHD) is a common brain disorder with high prevalence in school-age children. Previously developed machine learning-based methods have discriminated patients with ADHD from normal controls by providing label information of the disease for individuals. Inattention and impulsivity are the two most significant clinical symptoms of ADHD. However, predicting clinical symptoms (i.e., inattention and impulsivity) is a challenging task based on neuroimaging data. The goal of this study is twofold: to build predictive models for clinical symptoms of ADHD based on resting-state fMRI and to mine brain networks for predictive patterns of inattention and impulsivity. To achieve this goal, a cohort of 74 boys with ADHD and a cohort of 69 age-matched normal controls were recruited from the ADHD-200 Consortium. Both structural and resting-state fMRI images were obtained for each participant. Temporal patterns between and within intrinsic connectivity networks (ICNs) were applied as raw features in the predictive models. Specifically, sample entropy was taken as an intra-ICN feature, and phase synchronization (PS) was used as an inter-ICN feature. The predictive models were based on the least absolute shrinkage and selection operator (LASSO) algorithm. The performance of the predictive model for inattention is r = 0.79 (p < 10⁻⁸), and the performance of the predictive model for impulsivity is r = 0.48 (p < 10⁻⁸). The ICN-related predictive patterns may provide valuable information for investigating the brain network mechanisms of ADHD. In summary, the predictive models for clinical symptoms could be beneficial for personalizing ADHD medications. Copyright © 2017 IBRO. Published by Elsevier Ltd. All rights reserved.
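A LASSO symptom-score regression of the kind used above can be sketched with a proximal-gradient (ISTA) solver. This is a synthetic illustration, not the ADHD-200 analysis: Gaussian features stand in for the ICN features, a sparse ground-truth weight vector generates the "symptom score", and performance is the correlation r between predicted and observed scores on held-out subjects.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for intra/inter-ICN features predicting a symptom score
n, p = 140, 60
beta = np.zeros(p)
beta[:5] = [2, -1.5, 1, 1, -2]                 # sparse ground truth
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(scale=1.0, size=n)

def lasso_ista(X, y, lam=0.1, steps=500):
    """LASSO via proximal gradient (ISTA): gradient step + soft-thresholding."""
    n = len(y)
    L = np.linalg.norm(X, 2) ** 2 / n          # Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        g = X.T @ (X @ w - y) / n              # gradient of 0.5/n * ||Xw - y||^2
        w = w - g / L
        w = np.sign(w) * np.maximum(np.abs(w) - lam / L, 0.0)
    return w

w = lasso_ista(X[:100], y[:100])               # fit on 100 "subjects"
r = np.corrcoef(X[100:] @ w, y[100:])[0, 1]    # predictive r on held-out subjects
```

The soft-thresholding step is what drives uninformative feature weights to exactly zero, which is why LASSO doubles as a feature-mining tool for "predictive patterns" in studies like this one.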
Wagner, Christian; Pan, Yuzhuo; Hsu, Vicky; Grillo, Joseph A; Zhang, Lei; Reynolds, Kellie S; Sinha, Vikram; Zhao, Ping
2015-01-01
The US Food and Drug Administration (FDA) has seen a recent increase in the application of physiologically based pharmacokinetic (PBPK) modeling towards assessing the potential of drug-drug interactions (DDI) in clinically relevant scenarios. To continue our assessment of such approaches, we evaluated the predictive performance of PBPK modeling in predicting cytochrome P450 (CYP)-mediated DDI. This evaluation was based on 15 substrate PBPK models submitted by nine sponsors between 2009 and 2013. For these 15 models, a total of 26 DDI studies (cases) with various CYP inhibitors were available. Sponsors developed the PBPK models, reportedly without considering clinical DDI data. Inhibitor models were either developed by sponsors or provided by PBPK software developers and applied with minimal or no modification. The metric for assessing predictive performance of the sponsors' PBPK approach was the R(predicted/observed) value, where R(predicted/observed) = [predicted mean exposure ratio]/[observed mean exposure ratio], with the exposure ratio defined as [Cmax (maximum plasma concentration) or AUC (area under the plasma concentration-time curve) in the presence of CYP inhibition]/[Cmax or AUC in the absence of CYP inhibition]. In 81 % (21/26) and 77 % (20/26) of cases, respectively, the R(predicted/observed) values for AUC and Cmax ratios were within a pre-defined threshold of 1.25-fold of the observed data. For all cases, the R(predicted/observed) values for AUC and Cmax were within a 2-fold range. These results suggest that, based on the submissions to the FDA to date, there is a high degree of concordance between PBPK-predicted and observed effects of CYP inhibition, especially CYP3A-based, on the exposure of drug substrates.
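The R(predicted/observed) metric and its fold thresholds reduce to a short computation. The example numbers below are hypothetical, not from any of the 26 FDA cases:

```python
def r_ratio(pred_ratio, obs_ratio):
    """R(predicted/observed) = predicted exposure ratio / observed exposure ratio."""
    return pred_ratio / obs_ratio

def within_fold(r, fold):
    """True if r lies within `fold`-fold of unity, i.e. 1/fold <= r <= fold."""
    return 1.0 / fold <= r <= fold

# Hypothetical case: model predicts a 3.0-fold AUC increase; the study observed 2.5-fold
r = r_ratio(3.0, 2.5)            # 1.2
ok_125 = within_fold(r, 1.25)    # inside the pre-defined 1.25-fold threshold
ok_2 = within_fold(r, 2.0)       # inside the 2-fold range
```

Note that the two-sided form (1/fold to fold) is what makes a 0.8-fold underprediction and a 1.25-fold overprediction equally acceptable.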
Hoogendoorn, Mark; Szolovits, Peter; Moons, Leon M G; Numans, Mattijs E
2016-05-01
Machine learning techniques can be used to extract predictive models for diseases from electronic medical records (EMRs). However, the nature of EMRs makes it difficult to apply off-the-shelf machine learning techniques while still exploiting the rich content of the EMRs. In this paper, we explore the usage of a range of natural language processing (NLP) techniques to extract valuable predictors from uncoded consultation notes and study whether they can help to improve predictive performance. We study a number of existing techniques for the extraction of predictors from the consultation notes, namely a bag-of-words-based approach and topic modeling. In addition, we develop a dedicated technique to match the uncoded consultation notes with a medical ontology. We apply these techniques as an extension to an existing pipeline to extract predictors from EMRs. We evaluate them in the context of predictive modeling for colorectal cancer (CRC), a disease known to be difficult to diagnose before performing an endoscopy. Our results show that we are able to extract useful information from the consultation notes. The predictive performance of the ontology-based extraction method moves significantly beyond the benchmark of age and gender alone (area under the receiver operating characteristic curve (AUC) of 0.870 versus 0.831). We also observe more accurate predictive models by adding features derived from processing the consultation notes compared to solely using coded data (AUC of 0.896 versus 0.882), although the difference is not significant. The extracted features from the notes are shown to be equally predictive (i.e., there is no significant difference in performance) compared to the coded data of the consultations. It is possible to extract useful predictors from uncoded consultation notes that improve predictive performance. Techniques linking text to concepts in medical ontologies to derive these predictors are shown to perform best for predicting CRC in our EMR dataset.
Copyright © 2016 Elsevier B.V. All rights reserved.
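The bag-of-words extraction and AUC evaluation described above can be sketched end to end. The notes, the choice of predictor terms, and the scoring rule below are all hypothetical; the AUC is computed with the Mann-Whitney rank formulation rather than a library call.

```python
from collections import Counter

def bag_of_words(note):
    """Token counts for one free-text consultation note (lowercased, whitespace split)."""
    return Counter(note.lower().split())

def auc(scores_pos, scores_neg):
    """Mann-Whitney estimate of the area under the ROC curve."""
    wins = sum((sp > sn) + 0.5 * (sp == sn)
               for sp in scores_pos for sn in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical notes; "rectal" and "bleeding" treated as CRC-related predictor terms
notes_crc = ["rectal bleeding and weight loss", "bleeding per rectum noted"]
notes_ctrl = ["routine blood pressure check", "mild headache no red flags"]

score = lambda note: bag_of_words(note)["bleeding"] + bag_of_words(note)["rectal"]
a = auc([score(n) for n in notes_crc], [score(n) for n in notes_ctrl])
```

A real pipeline would feed the full term-count matrix into a classifier (and, for the ontology-based variant, map tokens to medical concepts first); the toy example only shows how note text becomes a scored feature that an AUC can evaluate.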
Tactile communication, cooperation, and performance: an ethological study of the NBA.
Kraus, Michael W; Huang, Cassey; Keltner, Dacher
2010-10-01
Tactile communication, or physical touch, promotes cooperation between people, communicates distinct emotions, soothes in times of stress, and is used to make inferences of warmth and trust. Based on this conceptual analysis, we predicted that in group competition, physical touch would predict increases in both individual and group performance. In an ethological study, we coded the touch behavior of players from the National Basketball Association (NBA) during the 2008-2009 regular season. Consistent with hypotheses, early season touch predicted greater performance for individuals as well as teams later in the season. Additional analyses confirmed that touch predicted improved performance even after accounting for player status, preseason expectations, and early season performance. Moreover, coded cooperative behaviors between teammates explained the association between touch and team performance. Discussion focused on the contributions touch makes to cooperative groups and the potential implications for other group settings. (PsycINFO Database Record (c) 2010 APA, all rights reserved).
Characterizing Decision-Analysis Performances of Risk Prediction Models Using ADAPT Curves.
Lee, Wen-Chung; Wu, Yun-Chun
2016-01-01
The area under the receiver operating characteristic curve is a widely used index to characterize the performance of diagnostic tests and prediction models. However, the index does not explicitly acknowledge the utilities of risk predictions. Moreover, for most clinical settings, what counts is whether a prediction model can guide therapeutic decisions in a way that improves patient outcomes, rather than simply update probabilities. Based on decision theory, the authors propose an alternative index, the "average deviation about the probability threshold" (ADAPT). An ADAPT curve (a plot of the ADAPT value against the probability threshold) neatly characterizes the decision-analysis performance of a risk prediction model. Several prediction models can be compared for their ADAPT values at a chosen probability threshold, for a range of plausible threshold values, or for the whole ADAPT curves. This should greatly facilitate the selection of diagnostic tests and prediction models.
Structural reliability analysis under evidence theory using the active learning kriging model
NASA Astrophysics Data System (ADS)
Yang, Xufeng; Liu, Yongshou; Ma, Panke
2017-11-01
Structural reliability analysis under evidence theory is investigated. It is rigorously proved that a surrogate model providing only correct sign prediction of the performance function can meet the accuracy requirement of evidence-theory-based reliability analysis. Accordingly, a method based on the active learning kriging model which only correctly predicts the sign of the performance function is proposed. Interval Monte Carlo simulation and a modified optimization method based on Karush-Kuhn-Tucker conditions are introduced to make the method more efficient in estimating the bounds of failure probability based on the kriging model. Four examples are investigated to demonstrate the efficiency and accuracy of the proposed method.
The Development of MST Test Information for the Prediction of Test Performances
ERIC Educational Resources Information Center
Park, Ryoungsun; Kim, Jiseon; Chung, Hyewon; Dodd, Barbara G.
2017-01-01
The current study proposes novel methods to predict multistage testing (MST) performance without conducting simulations. This method, called MST test information, is based on analytic derivation of standard errors of ability estimates across theta levels. We compared standard errors derived analytically to the simulation results to demonstrate the…
Gagné, Mathieu; Moore, Lynne; Beaudoin, Claudia; Batomen Kuimi, Brice Lionel; Sirois, Marie-Josée
2016-03-01
The International Classification of Diseases (ICD) is the main classification system used for population-based injury surveillance activities but does not contain information on injury severity. ICD-based injury severity measures can be empirically derived or mapped, but no single approach has been formally recommended. This study aimed to compare the performance of ICD-based injury severity measures in predicting in-hospital mortality among injury-related admissions. A systematic review and a meta-analysis were conducted. MEDLINE, EMBASE, and Global Health databases were searched from their inception through September 2014. Observational studies that assessed the performance of ICD-based injury severity measures in predicting in-hospital mortality and reported discriminative ability using the area under a receiver operating characteristic curve (AUC) were included. Metrics of model performance were extracted. Pooled AUCs were estimated under random-effects models. Twenty-two eligible studies reported 72 assessments of discrimination for ICD-based injury severity measures. Reported AUCs ranged from 0.681 to 0.958. Of the 72 assessments, 46 showed excellent (0.80 ≤ AUC < 0.90) and 6 outstanding (AUC ≥ 0.90) discriminative ability. The pooled AUC for the ICD-based Injury Severity Score (ICISS) based on the product of traditional survival proportions was significantly higher than for measures based on ICD mapped to Abbreviated Injury Scale (AIS) scores (0.863 vs. 0.825 for ICDMAP-ISS [p = 0.005] and ICDMAP-NISS [p = 0.016]). Similar results were observed when studies were stratified by the type of data used (trauma registry or hospital discharge) or the provenance of survival proportions (internally or externally derived). However, among studies published after 2003, the Trauma Mortality Prediction Model based on ICD-9 codes (TMPM-9) demonstrated superior discriminative ability compared with ICISS using the product of traditional survival proportions (0.850 vs. 0.802, p = 0.002).
Models generally showed poor calibration. ICISS using the product of traditional survival proportions and TMPM-9 predict mortality more accurately than those mapped to AIS codes and should be preferred for describing injury severity when ICD is used to record injury diagnoses. Systematic review and meta-analysis, level III.
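The ICISS construction discussed above is simple to state in code: each ICD injury code carries an empirically derived survival proportion, and the score is the product of the proportions for all of a patient's recorded codes. The codes and survival proportions below are hypothetical placeholders, not registry-derived values.

```python
# Hypothetical survival proportions per ICD injury code (learned from registry data)
srr = {"S06.5": 0.90, "S27.0": 0.95, "S32.1": 0.99}

def iciss(codes, srr):
    """ICISS: product of the survival proportions of all recorded injury codes."""
    p = 1.0
    for c in codes:
        p *= srr[c]
    return p  # predicted survival probability; mortality risk = 1 - ICISS

risk = 1.0 - iciss(["S06.5", "S27.0"], srr)
```

The multiplicative form assumes independent contributions of the injuries to survival, which is one reason regression-based alternatives such as TMPM-9 can outperform it.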
Chiu, Peter K F; Roobol, Monique J; Teoh, Jeremy Y; Lee, Wai-Man; Yip, Siu-Ying; Hou, See-Ming; Bangma, Chris H; Ng, Chi-Fai
2016-10-01
To investigate PSA- and PHI (Prostate Health Index)-based models for the prediction of prostate cancer (PCa) and the feasibility of using DRE-estimated prostate volume (DRE-PV) in the models. This study included 569 Chinese men with PSA 4-10 ng/mL and non-suspicious DRE who underwent transrectal ultrasound (TRUS) 10-core prostate biopsies between April 2008 and July 2015. DRE-PV was estimated using 3 pre-defined classes: 25, 40, or 60 mL. The performance of PSA-based and PHI-based predictive models including age, DRE-PV, and TRUS prostate volume (TRUS-PV) was analyzed using logistic regression and the area under the receiver operating characteristic curve (AUC), in both the whole cohort and the screening age group of 55-75. PCa and high-grade PCa (HGPCa) were diagnosed in 10.9 % (62/569) and 2.8 % (16/569) of men, respectively. The performance of DRE-PV-based models was similar to that of TRUS-PV-based models. In the age group 55-75, the AUCs for PCa of PSA alone, PSA with DRE-PV and age, PHI alone, PHI with DRE-PV and age, and PHI with TRUS-PV and age were 0.54, 0.71, 0.76, 0.78, and 0.78, respectively. The corresponding AUCs for HGPCa were higher (0.60, 0.70, 0.85, 0.83, and 0.83). At 10 and 20 % risk thresholds for PCa, 38.4 and 55.4 % of biopsies could be avoided in the PHI-based model, respectively. PHI had better performance than PSA-based models and could reduce unnecessary biopsies. A DRE-assessed PV can replace TRUS-assessed PV in multivariate prediction models to facilitate clinical use.
Jeunet, Camille; N'Kaoua, Bernard; Subramanian, Sriram; Hachet, Martin; Lotte, Fabien
2015-01-01
Mental-Imagery based Brain-Computer Interfaces (MI-BCIs) allow their users to send commands to a computer using their brain-activity alone (typically measured by ElectroEncephaloGraphy-EEG), which is processed while they perform specific mental tasks. While very promising, MI-BCIs remain barely used outside laboratories because of the difficulty encountered by users to control them. Indeed, although some users obtain good control performances after training, a substantial proportion remains unable to reliably control an MI-BCI. This huge variability in user-performance led the community to look for predictors of MI-BCI control ability. However, these predictors were only explored for motor-imagery based BCIs, and mostly for a single training session per subject. In this study, 18 participants were instructed to learn to control an EEG-based MI-BCI by performing 3 MI-tasks, 2 of which were non-motor tasks, across 6 training sessions, on 6 different days. Relationships between the participants' BCI control performances and their personality, cognitive profile and neurophysiological markers were explored. While no relevant relationships with neurophysiological markers were found, strong correlations between MI-BCI performances and mental-rotation scores (reflecting spatial abilities) were revealed. Also, a predictive model of MI-BCI performance based on psychometric questionnaire scores was proposed. A leave-one-subject-out cross validation process revealed the stability and reliability of this model: it enabled to predict participants' performance with a mean error of less than 3 points. This study determined how users' profiles impact their MI-BCI control ability and thus clears the way for designing novel MI-BCI training protocols, adapted to the profile of each user.
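The leave-one-subject-out validation used above can be sketched with a plain least-squares model. The data are synthetic (18 "participants" with 4 questionnaire-like scores generating a performance value), not the study's questionnaire data; the loop holds out one subject at a time, fits on the rest, and accumulates the absolute prediction error.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in: 18 participants, 4 questionnaire scores, performance in %
n, p = 18, 4
X = rng.normal(size=(n, p))
perf = 70 + 5 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=1.0, size=n)

errors = []
for i in range(n):                              # leave-one-subject-out
    mask = np.arange(n) != i
    A = np.column_stack([np.ones(mask.sum()), X[mask]])   # intercept + scores
    coef, *_ = np.linalg.lstsq(A, perf[mask], rcond=None)
    pred = np.concatenate([[1.0], X[i]]) @ coef
    errors.append(abs(pred - perf[i]))

mae = float(np.mean(errors))
```

Because each prediction is made for a subject the model never saw, the mean absolute error is an honest analogue of the "mean error of less than 3 points" the study reports for its psychometric model.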
Genomewide predictions from maize single-cross data.
Massman, Jon M; Gordillo, Andres; Lorenzana, Robenzon E; Bernardo, Rex
2013-01-01
Maize (Zea mays L.) breeders evaluate many single-cross hybrids each year in multiple environments. Our objective was to determine the usefulness of genomewide predictions, based on marker effects from maize single-cross data, for identifying the best untested single crosses and the best inbreds within a biparental cross. We considered 479 experimental maize single crosses between 59 Iowa Stiff Stalk Synthetic (BSSS) inbreds and 44 non-BSSS inbreds. The single crosses were evaluated in multilocation experiments from 2001 to 2009 and the BSSS and non-BSSS inbreds had genotypic data for 669 single nucleotide polymorphism (SNP) markers. Single-cross performance was predicted by a previous best linear unbiased prediction (BLUP) approach that utilized marker-based relatedness and information on relatives, and from genomewide marker effects calculated by ridge-regression BLUP (RR-BLUP). With BLUP, the mean prediction accuracy (r(MG)) of single-cross performance was 0.87 for grain yield, 0.90 for grain moisture, 0.69 for stalk lodging, and 0.84 for root lodging. The BLUP and RR-BLUP models did not lead to r(MG) values that differed significantly. We then used the RR-BLUP model, developed from single-cross data, to predict the performance of testcrosses within 14 biparental populations. The r(MG) values within each testcross population were generally low and were often negative. These results were obtained despite the above-average level of linkage disequilibrium, i.e., r(2) between adjacent markers of 0.35 in the BSSS inbreds and 0.26 in the non-BSSS inbreds. Overall, our results suggested that genomewide marker effects estimated from maize single crosses are not advantageous (compared with BLUP) for predicting single-cross performance and have erratic usefulness for predicting testcross performance within a biparental cross.
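As a hedged illustration of the RR-BLUP idea (not the authors' pipeline): marker effects u solve the ridge system (X'X + lam*I)u = X'y, where X holds marker genotypes and y the observed hybrid performance, and an untested cross is then predicted by summing its marker effects. A minimal pure-Python sketch with a toy two-marker example follows.

```python
def ridge_effects(X, y, lam):
    """Solve the ridge system (X'X + lam*I) u = X'y for marker effects u,
    via Gaussian elimination with partial pivoting (no external libraries)."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    u = [0.0] * p
    for i in reversed(range(p)):
        u[i] = (b[i] - sum(A[i][j] * u[j] for j in range(i + 1, p))) / A[i][i]
    return u

def predict_cross(genotype, u):
    """Predicted performance of an untested cross: sum of its marker effects."""
    return sum(g * e for g, e in zip(genotype, u))

# Toy example with two markers; lam=0 reduces to ordinary least squares,
# and a larger lam shrinks the estimated effects toward zero.
u = ridge_effects([[1, 0], [0, 1], [1, 1]], [2.0, -1.0, 1.0], lam=0.0)
```

Shrinkage (lam > 0) is what lets RR-BLUP handle many more markers than phenotyped crosses, at the cost of biased individual effect estimates.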
A Flight Prediction for Performance of the SWAS Solar Array Deployment Mechanism
NASA Technical Reports Server (NTRS)
Seniderman, Gary; Daniel, Walter K.
1999-01-01
The focus of this paper is a comparison of ground-based solar array deployment tests with the on-orbit deployment. The discussion includes a summary of the mechanisms involved and the correlation of a dynamics model with ground-based test results. Some of the unique characteristics of the mechanisms are explained through the analysis of force and angle data acquired from the test deployments. The correlated dynamics model is then used to predict the performance of the system in its flight application.
Temporal Prediction Errors Affect Short-Term Memory Scanning Response Time.
Limongi, Roberto; Silva, Angélica M
2016-11-01
The Sternberg short-term memory scanning task has been used to unveil cognitive operations involved in time perception. Participants produce time intervals during the task, and the researcher explores how task performance affects interval production, where time estimation error is the dependent variable of interest. The perspective of predictive behavior regards time estimation error as a temporal prediction error (PE), an independent variable that controls cognition, behavior, and learning. Based on this perspective, we investigated whether temporal PEs affect short-term memory scanning. Participants performed temporal predictions while they maintained information in memory. Model inference revealed that PEs affected memory scanning response time independently of the memory-set size effect. We discuss the results within the context of formal and mechanistic models of short-term memory scanning and predictive coding, a Bayes-based theory of brain function. We state the hypothesis that our finding could be associated with weak frontostriatal connections and weak striatal activity.
Li, Guanghui; Luo, Jiawei; Xiao, Qiu; Liang, Cheng; Ding, Pingjian
2018-05-12
Interactions between microRNAs (miRNAs) and diseases can yield important information for uncovering novel prognostic markers. Since experimental determination of disease-miRNA associations is time-consuming and costly, attention has been given to designing efficient and robust computational techniques for identifying undiscovered interactions. In this study, we present a label propagation model with linear neighborhood similarity, called LPLNS, to predict unobserved miRNA-disease associations. Additionally, a preprocessing step is performed to derive new interaction likelihood profiles that will contribute to the prediction since new miRNAs and diseases lack known associations. Our results demonstrate that the LPLNS model based on the known disease-miRNA associations could achieve impressive performance with an AUC of 0.9034. Furthermore, we observed that the LPLNS model based on new interaction likelihood profiles could improve the performance to an AUC of 0.9127. This was better than other comparable methods. In addition, case studies also demonstrated our method's outstanding performance for inferring undiscovered interactions between miRNAs and diseases, especially for novel diseases. Copyright © 2018. Published by Elsevier Inc.
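For intuition, the core of a label-propagation predictor of this kind iterates f <- alpha * W f + (1 - alpha) * y over a normalized similarity graph until the scores stabilize; nodes near known associations accumulate high scores. The sketch below is generic label propagation, not the LPLNS model itself (its linear-neighborhood similarity construction and interaction-likelihood preprocessing are omitted).

```python
def label_propagation(W, y0, alpha=0.8, iters=200):
    """Iterate f <- alpha * Wn f + (1 - alpha) * y0, where Wn is the
    row-normalised similarity matrix and y0 holds the known associations."""
    n = len(W)
    Wn = [[w / (sum(row) or 1.0) for w in row] for row in W]
    f = list(y0)
    for _ in range(iters):
        f = [alpha * sum(Wn[i][j] * f[j] for j in range(n))
             + (1 - alpha) * y0[i] for i in range(n)]
    return f

# Toy graph: node 0 has a known association; node 1 is its close
# neighbour, node 2 is disconnected, so node 1 should rank higher.
scores = label_propagation([[0, 1, 0], [1, 0, 0], [0, 0, 0]], [1.0, 0.0, 0.0])
```

Ranking candidate miRNA-disease pairs by these converged scores is what an AUC such as the 0.9127 above would then evaluate.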
An evaluation of NASA's program in human factors research: Aircrew-vehicle system interaction
NASA Technical Reports Server (NTRS)
1982-01-01
Research in human factors in the aircraft cockpit and a proposed program augmentation were reviewed. The dramatic growth of microprocessor technology makes it entirely feasible to automate increasingly more functions in the aircraft cockpit; the promise of improved vehicle performance, efficiency, and safety through automation makes highly automated flight inevitable. However, an organized data base and a validated methodology for predicting the effects of automation on human performance, and thus on safety, are lacking; without them, increased automation may introduce new risks. Efforts should be concentrated on developing methods and techniques for analyzing man-machine interactions, including human workload and the prediction of performance.
Sun Series program for the REEDA System. [Predicting orbital lifetime using sunspot values]
NASA Technical Reports Server (NTRS)
Shankle, R. W.
1980-01-01
Modifications made to data bases and to four programs in a series of computer programs (Sun Series) which run on the REEDA HP minicomputer system to aid NASA's solar activity predictions used in orbital lifetime predictions are described. These programs utilize various mathematical smoothing techniques and perform statistical and graphical analyses of various solar activity data bases residing on the REEDA System.
Laboratory. The purpose of this technique is to predict specific impulse in large solid rocket motors based on data obtained in micromotors. As little as 2...concerning performance of a propellant in a large solid motor. Predictions, based on data obtained in micromotors, were within 0.6% of the delivered impulse in 6-pound motors and 70-pound BATES motors. (Author)
Toward a Model-Based Predictive Controller Design in Brain–Computer Interfaces
Kamrunnahar, M.; Dias, N. S.; Schiff, S. J.
2013-01-01
A first step in designing a robust and optimal model-based predictive controller (MPC) for brain–computer interface (BCI) applications is presented in this article. An MPC has the potential to achieve improved BCI performance compared to the performance achieved by current ad hoc, non-model-based filter applications. The parameters in designing the controller were extracted as model-based features from motor imagery task-related human scalp electroencephalography. Although the parameters can be generated from any model, linear or non-linear, we here adopted a simple autoregressive model that has well-established applications in BCI task discriminations. It was shown that the parameters generated for the controller design can as well be used for motor imagery task discriminations with performance (8–23% task discrimination errors) comparable to the discrimination performance of the commonly used features such as frequency-specific band powers and the AR model parameters directly used. An optimal MPC has significant implications for high performance BCI applications. PMID:21267657
NASA Technical Reports Server (NTRS)
Findlay, J. T.; Kelly, G. M.; Mcconnell, J. G.; Compton, H. R.
1983-01-01
Longitudinal performance comparisons between flight derived and predicted values are presented for the first five NASA Space Shuttle Columbia flights. Though subsonic comparisons are emphasized, comparisons during the transonic and low supersonic regions of flight are included. Computed air data information based on the remotely sensed atmospheric measurements as well as in situ Orbiter Air Data System (ADS) measurements was incorporated. Each air data source provides for comparisons versus the predicted values from the LaRC data base. Principally, L/D, C_L, and C_D comparisons are presented, though some pitching moment results are included. Similarities in flight conditions and spacecraft configuration during the first five flights are discussed. Contributions from the various elements of the data base are presented and the overall differences observed between the flight and predicted values are discussed in terms of expected variations. A discussion on potential data base updates is presented based on the results from the five flights to date.
Green, Jasmine; Liem, Gregory Arief D; Martin, Andrew J; Colmar, Susan; Marsh, Herbert W; McInerney, Dennis
2012-10-01
The study tested three theoretically/conceptually hypothesized longitudinal models of academic processes leading to academic performance. Based on a longitudinal sample of 1866 high-school students across two consecutive years of high school (Time 1 and Time 2), the model with the most superior heuristic value demonstrated: (a) academic motivation and self-concept positively predicted attitudes toward school; (b) attitudes toward school positively predicted class participation and homework completion and negatively predicted absenteeism; and (c) class participation and homework completion positively predicted test performance whilst absenteeism negatively predicted test performance. Taken together, these findings provide support for the relevance of the self-system model and, particularly, the importance of examining the dynamic relationships amongst engagement factors of the model. The study highlights implications for educational and psychological theory, measurement, and intervention. Copyright © 2012 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
Does the MCAT predict medical school and PGY-1 performance?
Saguil, Aaron; Dong, Ting; Gingerich, Robert J; Swygert, Kimberly; LaRochelle, Jeffrey S; Artino, Anthony R; Cruess, David F; Durning, Steven J
2015-04-01
The Medical College Admissions Test (MCAT) is a high-stakes test required for entry to most U.S. medical schools; admissions committees use this test to predict future accomplishment. Although there is evidence that the MCAT predicts success on multiple choice-based assessments, there is little information on whether the MCAT predicts clinical-based assessments of undergraduate and graduate medical education performance. This study looked at associations between the MCAT and medical school grade point average (GPA), Medical Licensing Examination (USMLE) scores, observed patient care encounters, and residency performance assessments. This study used data collected as part of the Long-Term Career Outcome Study to determine associations between MCAT scores, USMLE Step 1, Step 2 clinical knowledge and clinical skill, and Step 3 scores, Objective Structured Clinical Examination performance, medical school GPA, and PGY-1 program director (PD) assessment of physician performance for students graduating 2010 and 2011. MCAT data were available for all students, and the PGY PD evaluation response rate was 86.2% (N = 340). All permutations of MCAT scores (first, last, highest, average) were weakly associated with GPA, Step 2 clinical knowledge scores, and Step 3 scores. MCAT scores were weakly to moderately associated with Step 1 scores. MCAT scores were not significantly associated with Step 2 clinical skills Integrated Clinical Encounter and Communication and Interpersonal Skills subscores, Objective Structured Clinical Examination performance or PGY-1 PD evaluations. MCAT scores were weakly to moderately associated with assessments that rely on multiple choice testing. The association is somewhat stronger for assessments occurring earlier in medical school, such as USMLE Step 1. The MCAT was not able to predict assessments relying on direct clinical observation, nor was it able to predict PD assessment of PGY-1 performance.
Reprint & Copyright © 2015 Association of Military Surgeons of the U.S.
Silitonga, Arridina Susan; Hassan, Masjuki Haji; Ong, Hwai Chyuan; Kusumo, Fitranto
2017-11-01
The purpose of this study is to investigate the performance, emission and combustion characteristics of a four-cylinder common-rail turbocharged diesel engine fuelled with Jatropha curcas biodiesel-diesel blends. A kernel-based extreme learning machine (KELM) model is developed in this study using MATLAB software in order to predict the performance, combustion and emission characteristics of the engine. To acquire the data for training and testing the KELM model, the engine speed was selected as the input parameter, whereas the performance, exhaust emissions and combustion characteristics were chosen as the output parameters of the KELM model. The performance, emissions and combustion characteristics predicted by the KELM model were validated by comparing the predicted data with the experimental data. The results show that the coefficient of determination of the parameters is within a range of 0.9805-0.9991 for both the KELM model and the experimental data. The mean absolute percentage error is within a range of 0.1259-2.3838. This study shows that KELM modelling is a useful technique in biodiesel production since it enables scientists and researchers to predict the performance, exhaust emissions and combustion characteristics of internal combustion engines with high accuracy.
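The validation statistics quoted above, the coefficient of determination and the mean absolute percentage error, are standard and easy to reproduce. A minimal sketch, independent of the KELM model itself:

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)
```

An R-squared near 1 with a MAPE of a few percent, as reported above, indicates the surrogate model tracks the engine measurements closely.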
Predicting residue-wise contact orders in proteins by support vector regression.
Song, Jiangning; Burrage, Kevin
2006-10-03
The residue-wise contact order (RWCO) describes the sequence separations between the residues of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable important information to reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. 
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
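The two evaluation metrics used throughout this abstract, the Pearson correlation coefficient (CC) and the root mean square error (RMSE), can be computed as follows (an illustrative sketch, not the authors' code):

```python
from math import sqrt

def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x)
                      * sum((b - my) ** 2 for b in y))

def rmse(x, y):
    """Root mean square error between predicted and observed values."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))
```

Unlike classification accuracy, these regression metrics reward getting the raw RWCO value profile right, which is why the abstract contrasts SVR with support vector classification.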
The Application of FIA-based Data to Wildlife Habitat Modeling: A Comparative Study
Thomas C., Jr. Edwards; Gretchen G. Moisen; Tracey S. Frescino; Randall J. Schultz
2005-01-01
We evaluated the capability of two types of models, one based on spatially explicit variables derived from FIA data and one using so-called traditional habitat evaluation methods, for predicting the presence of cavity-nesting bird habitat in Fishlake National Forest, Utah. Both models performed equally well, in measures of predictive accuracy, with the FIA-based model...
NASA Astrophysics Data System (ADS)
Müller, M. F.; Thompson, S. E.
2016-02-01
The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75% of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.
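For readers unfamiliar with the objects being predicted: an empirical flow duration curve pairs each observed discharge with its exceedance probability, and the Nash-Sutcliffe coefficient quoted above measures fit relative to a mean-flow baseline. A minimal sketch (Weibull plotting positions assumed, which is one common convention):

```python
def flow_duration_curve(flows):
    """Empirical FDC: discharges sorted in decreasing order, each paired
    with its Weibull exceedance probability rank / (n + 1)."""
    ordered = sorted(flows, reverse=True)
    n = len(ordered)
    return [(rank / (n + 1), q) for rank, q in enumerate(ordered, start=1)]

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect match; 0 means the
    prediction is no better than the mean of the observations."""
    mean = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean) ** 2 for o in observed)
    return 1.0 - num / den
```

Scores above 0.80, as reported for both models in the abstract, mean the predicted curves explain most of the variance around the observed mean flow.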
A Hybrid FPGA-Based System for EEG- and EMG-Based Online Movement Prediction.
Wöhrle, Hendrik; Tabie, Marc; Kim, Su Kyoung; Kirchner, Frank; Kirchner, Elsa Andrea
2017-07-03
A current trend in the development of assistive devices for rehabilitation, for example exoskeletons or active orthoses, is to utilize physiological data to enhance their functionality and usability, for example by predicting the patient's upcoming movements using electroencephalography (EEG) or electromyography (EMG). However, these modalities have different temporal properties and classification accuracies, which results in specific advantages and disadvantages. To use physiological data analysis in rehabilitation devices, the processing should be performed in real-time, guarantee support close to the natural movement onset, provide high mobility, and should be performed by miniaturized systems that can be embedded into the rehabilitation device. We present a novel Field Programmable Gate Array (FPGA)-based system for real-time movement prediction using physiological data. Its parallel processing capabilities allow the combination of movement predictions based on EEG and EMG and additionally a P300 detection, which is likely evoked by instructions of the therapist. The system is evaluated in an offline and an online study with twelve healthy subjects in total. We show that it provides a high computational performance and significantly lower power consumption in comparison to a standard PC. Furthermore, despite the usage of fixed-point computations, the proposed system achieves a classification accuracy similar to systems with double precision floating-point precision.
Yu, Xianyu; Wang, Yi; Niu, Ruiqing; Hu, Youjian
2016-01-01
In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR) technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM) classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO) algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%–19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides. PMID:27187430
Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke
2008-05-01
Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose the SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with the sequences used for the prediction. SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble-of-classifiers predictors. SCPRED can accurately find similar structures for sequences that share low identity with the sequences used for the prediction. 
The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are capable of separating the structural classes in spite of their low dimensionality. We also demonstrate that the SCPRED's predictions can be successfully used as a post-processing filter to improve performance of modern fold classification methods.
Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke
2008-01-01
Background Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. Results SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. Conclusion The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. 
The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are capable of separating the structural classes in spite of their low dimensionality. We also demonstrate that SCPRED's predictions can be successfully used as a post-processing filter to improve the performance of modern fold classification methods. PMID:18452616
Modeling to predict pilot performance during CDTI-based in-trail following experiments
NASA Technical Reports Server (NTRS)
Sorensen, J. A.; Goka, T.
1984-01-01
A mathematical model was developed of the flight system with the pilot using a cockpit display of traffic information (CDTI) to establish and maintain in-trail spacing behind a lead aircraft during approach. Both in-trail and vertical dynamics were included. The nominal spacing was based on one of three criteria (Constant Time Predictor; Constant Time Delay; or Acceleration Cue). This model was used to simulate digitally the dynamics of a string of multiple following aircraft, including response to initial position errors. The simulation was used to predict the outcome of a series of in-trail following experiments, including pilot performance in maintaining correct longitudinal spacing and vertical position. The experiments were run in the NASA Ames Research Center multi-cab cockpit simulator facility. The experimental results were then used to evaluate the model and its prediction accuracy. Model parameters were adjusted, so that modeled performance matched experimental results. Lessons learned in this modeling and prediction study are summarized.
Muratov, Eugene; Lewis, Margaret; Fourches, Denis; Tropsha, Alexander; Cox, Wendy C
2017-04-01
Objective. To develop predictive computational models forecasting the academic performance of students in the didactic-rich portion of a doctor of pharmacy (PharmD) curriculum as admission-assisting tools. Methods. All PharmD candidates over three admission cycles were divided into two groups: those who completed the PharmD program with a GPA ≥ 3, and the remaining candidates. The Random Forest machine learning technique was used to develop a binary classification model based on 11 pre-admission parameters. Results. Robust and externally predictive models were developed that had a particularly high overall accuracy of 77% for candidates with high or low academic performance. These multivariate models predicted these groups more accurately than models based on undergraduate GPA and composite PCAT scores alone. Conclusion. The models developed in this study can be used to improve the admission process as preliminary filters and thus quickly identify candidates who are likely to be successful in the PharmD curriculum.
A learning-based autonomous driver: emulate human driver's intelligence in low-speed car following
NASA Astrophysics Data System (ADS)
Wei, Junqing; Dolan, John M.; Litkouhi, Bakhtiar
2010-04-01
In this paper, an offline learning mechanism based on the genetic algorithm is proposed for autonomous vehicles to emulate human driver behaviors. The autonomous driving ability is implemented based on a Prediction- and Cost function-Based algorithm (PCB). PCB is designed to emulate a human driver's decision process, which is modeled as traffic scenario prediction and evaluation. This paper focuses on using a learning algorithm to optimize PCB with very limited training data, so that PCB can have the ability to predict and evaluate traffic scenarios similarly to human drivers. 80 seconds of human driving data were collected in low-speed (< 30 miles/h) car-following scenarios. In the low-speed car-following tests, PCB was able to perform more human-like car-following after learning. A more general 120-kilometer-long simulation showed that PCB performs robustly even in scenarios that are not part of the training set.
Model predictive control based on reduced order models applied to belt conveyor system.
Chen, Wei; Li, Xin
2016-11-01
In this paper, a model predictive controller based on a reduced-order model is proposed to control a belt conveyor system, an electro-mechanical complex system with a long visco-elastic body. Firstly, in order to design a low-degree controller, the balanced truncation method is used for belt conveyor model reduction. Secondly, an MPC algorithm based on the reduced-order model of the belt conveyor system is presented. Because of the error bound between the full-order model and the reduced-order model, two Kalman state estimators are applied in the control scheme to achieve better system performance. Finally, simulation experiments show that the balanced truncation method can significantly reduce the model order with high accuracy, and that model predictive control based on the reduced-order model performs well in controlling the belt conveyor system. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
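The balanced truncation step described above can be sketched with standard linear-algebra tools. The following is an illustrative square-root implementation for a generic stable LTI system, not the paper's belt-conveyor model; the system matrices here are random placeholders.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, r):
    """Reduce a stable LTI system (A, B, C) to order r by balanced truncation."""
    # Controllability/observability Gramians: A Wc + Wc A^T + B B^T = 0, etc.
    Wc = solve_continuous_lyapunov(A, -B @ B.T)
    Wo = solve_continuous_lyapunov(A.T, -C.T @ C)
    # Square-root method: SVD of the product of Cholesky factors.
    Lc = cholesky(Wc, lower=True)
    Lo = cholesky(Wo, lower=True)
    U, s, Vt = svd(Lo.T @ Lc)          # s are the Hankel singular values
    # Balancing transformation, truncated to the r largest singular values.
    T = Lc @ Vt.T[:, :r] / np.sqrt(s[:r])
    Tinv = (U[:, :r].T @ Lo.T) / np.sqrt(s[:r])[:, None]
    return Tinv @ A @ T, Tinv @ B, C @ T, s

# Example: reduce a random stable 6th-order system to order 2.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6)) - 7.0 * np.eye(6)   # shift to ensure stability
B = rng.normal(size=(6, 1))
C = rng.normal(size=(1, 6))
Ar, Br, Cr, hsv = balanced_truncation(A, B, C, 2)
```

Balanced truncation preserves stability and gives an H-infinity error bound of twice the sum of the discarded Hankel singular values, which is why it suits controller design on a low-order surrogate.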
A Complete Procedure for Predicting and Improving the Performance of HAWT's
NASA Astrophysics Data System (ADS)
Al-Abadi, Ali; Ertunç, Özgür; Sittig, Florian; Delgado, Antonio
2014-06-01
A complete procedure for predicting and improving the performance of the horizontal axis wind turbine (HAWT) has been developed. The first process is predicting the power extracted by the turbine and the derived rotor torque, which should be identical to that of the drive unit. The BEM method and a developed post-stall treatment for resolving stall-regulated HAWTs are incorporated in the prediction. For that, a modified stall-regulated prediction model, which can predict the HAWT performance over the operating range of oncoming wind velocity, is derived from existing models. The model involves radius and chord, which makes it more general in applications for predicting the performance of different scales and rotor shapes of HAWTs. The second process is modifying the rotor shape by an optimization process, which can be applied to any existing HAWT, to improve its performance. A gradient-based optimization is used for adjusting the chord and twist angle distribution of the rotor blade to increase the extraction of power while keeping the drive torque constant, so that the same drive unit can be kept. The final process is testing the modified turbine to predict its enhanced performance. The procedure is applied to the NREL phase-VI 10 kW turbine as a baseline. The study has proven the applicability of the developed model in predicting the performance of the baseline as well as the optimized turbine. In addition, the optimization method has shown that the power coefficient can be increased while keeping the same design rotational speed.
Fingeret, Abbey L; Martinez, Rebecca H; Hsieh, Christine; Downey, Peter; Nowygrod, Roman
2016-02-01
We aim to determine whether observed operations or internet-based video review predict improved performance in the surgery clerkship. A retrospective review of students' usage of surgical videos, observed operations, evaluations, and examination scores was used to construct an exploratory principal component analysis. Multivariate regression was used to determine factors predictive of clerkship performance. Case log data for 231 students revealed a median of 25 observed cases. Students accessed the web-based video platform a median of 15 times. Principal component analysis yielded 4 factors contributing 74% of the variability, with a Kaiser-Meyer-Olkin coefficient of .83. Multivariate regression identified shelf score (P < .0001), internal clinical skills examination score (P < .0001), subjective evaluations (P < .001), and video website utilization (P < .001), but not observed cases, as significantly associated with overall performance. Utilization of a web-based operative video platform during a surgical clerkship is independently associated with improved clinical reasoning, fund of knowledge, and overall evaluation. Thus, this modality can serve as a useful adjunct to live observation. Copyright © 2016 Elsevier Inc. All rights reserved.
Tsiouris, Κostas Μ; Pezoulas, Vasileios C; Zervakis, Michalis; Konitsiotis, Spiros; Koutsouris, Dimitrios D; Fotiadis, Dimitrios I
2018-05-17
The electroencephalogram (EEG) is the most prominent means to study epilepsy and capture changes in electrical brain activity that could declare an imminent seizure. In this work, Long Short-Term Memory (LSTM) networks are introduced in epileptic seizure prediction using EEG signals, expanding the use of deep learning algorithms beyond convolutional neural networks (CNN). A pre-analysis is initially performed to find the optimal architecture of the LSTM network by testing several modules and layers of memory units. Based on these results, a two-layer LSTM network is selected to evaluate seizure prediction performance using four different lengths of preictal windows, ranging from 15 min to 2 h. The LSTM model exploits a wide range of features extracted prior to classification, including time and frequency domain features, between-channel EEG cross-correlation, and graph-theoretic features. The evaluation, performed using long-term EEG recordings from the open CHB-MIT Scalp EEG database, suggests that the proposed methodology is able to predict all 185 seizures, providing high rates of seizure prediction sensitivity and low false prediction rates (FPR) of 0.11-0.02 false alarms per hour, depending on the duration of the preictal window. The proposed LSTM-based methodology delivers a significant increase in seizure prediction performance compared to both traditional machine learning techniques and convolutional neural networks that have been previously evaluated in the literature. Copyright © 2018 Elsevier Ltd. All rights reserved.
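As background for readers unfamiliar with LSTM memory units, a single LSTM time step can be sketched in a few lines of numpy. This is a generic textbook cell with random placeholder weights, not the two-layer network or EEG features used in the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4H, D), U: (4H, H), b: (4H,).
    Gate order in the stacked weights: input, forget, cell, output."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2*H])        # forget gate
    g = np.tanh(z[2*H:3*H])      # candidate cell update
    o = sigmoid(z[3*H:4*H])      # output gate
    c = f * c_prev + i * g       # new cell state
    h = o * np.tanh(c)           # new hidden state
    return h, c

# Run a toy 5-step sequence through a single cell with D=3 inputs, H=4 units.
rng = np.random.default_rng(1)
D, H = 3, 4
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):
    h, c = lstm_step(x, h, c, W, U, b)
```

The forget gate multiplying the previous cell state is what lets the network carry information across long preictal windows, which motivates its use for seizure prediction.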
Størset, Elisabet; Holford, Nick; Hennig, Stefanie; Bergmann, Troels K; Bergan, Stein; Bremer, Sara; Åsberg, Anders; Midtvedt, Karsten; Staatz, Christine E
2014-09-01
The aim was to develop a theory-based population pharmacokinetic model of tacrolimus in adult kidney transplant recipients and to externally evaluate this model and two previous empirical models. Data were obtained from 242 patients with 3100 tacrolimus whole blood concentrations. External evaluation was performed by examining model predictive performance using Bayesian forecasting. Pharmacokinetic disposition parameters were estimated based on tacrolimus plasma concentrations, predicted from whole blood concentrations, haematocrit and literature values for tacrolimus binding to red blood cells. Disposition parameters were allometrically scaled to fat free mass. Tacrolimus whole blood clearance/bioavailability standardized to haematocrit of 45% and fat free mass of 60 kg was estimated to be 16.1 l h−1 [95% CI 12.6, 18.0 l h−1]. Tacrolimus clearance was 30% higher (95% CI 13, 46%) and bioavailability 18% lower (95% CI 2, 29%) in CYP3A5 expressers compared with non-expressers. An Emax model described decreasing tacrolimus bioavailability with increasing prednisolone dose. The theory-based model was superior to the empirical models during external evaluation displaying a median prediction error of −1.2% (95% CI −3.0, 0.1%). Based on simulation, Bayesian forecasting led to 65% (95% CI 62, 68%) of patients achieving a tacrolimus average steady-state concentration within a suggested acceptable range. A theory-based population pharmacokinetic model was superior to two empirical models for prediction of tacrolimus concentrations and seemed suitable for Bayesian prediction of tacrolimus doses early after kidney transplantation.
Pre-operative prediction of surgical morbidity in children: comparison of five statistical models.
Cooper, Jennifer N; Wei, Lai; Fernandez, Soledad A; Minneci, Peter C; Deans, Katherine J
2015-02-01
The accurate prediction of surgical risk is important to patients and physicians. Logistic regression (LR) models are typically used to estimate these risks. However, in the fields of data mining and machine-learning, many alternative classification and prediction algorithms have been developed. This study aimed to compare the performance of LR to several data mining algorithms for predicting 30-day surgical morbidity in children. We used the 2012 National Surgical Quality Improvement Program-Pediatric dataset to compare the performance of (1) a LR model that assumed linearity and additivity (simple LR model) (2) a LR model incorporating restricted cubic splines and interactions (flexible LR model) (3) a support vector machine, (4) a random forest and (5) boosted classification trees for predicting surgical morbidity. The ensemble-based methods showed significantly higher accuracy, sensitivity, specificity, PPV, and NPV than the simple LR model. However, none of the models performed better than the flexible LR model in terms of the aforementioned measures or in model calibration or discrimination. Support vector machines, random forests, and boosted classification trees do not show better performance than LR for predicting pediatric surgical morbidity. After further validation, the flexible LR model derived in this study could be used to assist with clinical decision-making based on patient-specific surgical risks. Copyright © 2014 Elsevier Ltd. All rights reserved.
Bibby, Chris; Hodgson, Murray
2017-01-01
The work reported here, part of a study on the performance and optimal design of interior natural-ventilation openings and silencers ("ventilators"), discusses the prediction of the acoustical performance of such ventilators and the factors that affect it. A wave-based numerical approach, the finite-element method (FEM), is applied. The development of a FEM technique for the prediction of ventilator diffuse-field transmission loss is presented. Model convergence is studied with respect to mesh, frequency sampling, and diffuse-field convergence. The modeling technique is validated by comparing predictions to analytical and experimental results. The transmission-loss performance of crosstalk silencers of four shapes, and the factors that affect it, are predicted and discussed. Performance increases with flow-path length for all silencer types. Adding elbows significantly increases high-frequency transmission loss, but does not increase overall silencer performance, which is controlled by low-to-mid-frequency transmission loss.
Song, Jiangning; Burrage, Kevin; Yuan, Zheng; Huber, Thomas
2006-03-09
The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling, and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function. In this paper, we propose a new approach to predict proline cis/trans isomerization in proteins using a support vector machine (SVM). The preliminary results indicated that using Radial Basis Function (RBF) kernels could lead to better prediction performance than that of polynomial and linear kernel functions. We used single sequence information with different local window sizes, amino acid compositions of different local sequences, multiple sequence alignments obtained from PSI-BLAST, and the secondary structure information predicted by PSIPRED. We explored these different sequence encoding schemes in order to investigate their effects on the prediction performance. The training and testing of this approach were performed on a newly enlarged dataset of 2424 non-homologous proteins determined by the X-ray diffraction method, using 5-fold cross-validation. Selecting a window size of 11 provided the best performance for determining proline cis/trans isomerization based on the single amino acid sequence. It was found that using multiple sequence alignments in the form of PSI-BLAST profiles could significantly improve the prediction performance: the prediction accuracy increased from 62.8% with the single sequence to 69.8%, and the Matthews Correlation Coefficient (MCC) improved from 0.26 with the single local sequence to 0.40.
Furthermore, when coupled with the secondary structure information predicted by PSIPRED, our method yielded a prediction accuracy of 71.5% and an MCC of 0.43, 9% and 0.17 higher, respectively, than the accuracy achieved based on the single sequence information. A new method has been developed to predict proline cis/trans isomerization in proteins based on a support vector machine, which used the single amino acid sequence with different local window sizes, the amino acid compositions of local sequences flanking centered proline residues, the position-specific scoring matrices (PSSMs) extracted by PSI-BLAST, and the predicted secondary structures generated by PSIPRED. The successful application of the SVM approach in this study reinforced that SVM is a powerful tool for predicting proline cis/trans isomerization in proteins and for biological sequence analysis.
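The local-window encoding underlying such sequence-based predictors can be illustrated with a short sketch. The one-hot scheme and the toy sequence below are illustrative assumptions, not the paper's exact feature set (which also uses compositions, PSSMs, and predicted secondary structure).

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def proline_windows(sequence, window=11):
    """One-hot encode the local window around every proline ('P') residue.
    Positions falling outside the sequence are left as all-zero rows."""
    half = window // 2
    features = []
    for pos, aa in enumerate(sequence):
        if aa != "P":
            continue
        x = np.zeros((window, len(AMINO_ACIDS)))
        for k in range(-half, half + 1):
            j = pos + k
            if 0 <= j < len(sequence):
                x[k + half, AA_INDEX[sequence[j]]] = 1.0
        features.append(x.ravel())
    return np.array(features)

# Toy sequence with two prolines (positions 4 and 6); each proline yields
# one feature vector of length window * 20 for a downstream SVM.
X = proline_windows("MKTAPWPLERG")
```

Each row of X would then be fed to the classifier; window size 11 matches the best-performing setting reported above.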
NASA Astrophysics Data System (ADS)
Efthimiou, G. C.; Andronopoulos, S.; Bartzis, J. G.
2018-02-01
One of the key issues of recent research on dispersion inside complex urban environments is the ability to predict dosage-based parameters from the puff release of an airborne material from a point source in the atmospheric boundary layer inside the built-up area. The present work addresses the question of whether the computational fluid dynamics (CFD)-Reynolds-averaged Navier-Stokes (RANS) methodology can be used to predict ensemble-average dosage-based parameters that are related to puff dispersion. RANS simulations with the ADREA-HF code were, therefore, performed, where a single puff was released in each case. The present method is validated against the data sets from two wind-tunnel experiments. In each experiment, more than 200 puffs were released, from which ensemble-averaged dosage-based parameters were calculated and compared to the model's predictions. The performance of the model was evaluated using scatter plots and three validation metrics: fractional bias, normalized mean square error, and factor of two. The model presented a better performance for the temporal parameters (i.e., ensemble-average times of puff arrival, peak, leaving, duration, ascent, and descent) than for the ensemble-average dosage and peak concentration. The majority of the obtained values of the validation metrics were inside established acceptance limits. Based on the obtained model performance indices, the CFD-RANS methodology as implemented in the code ADREA-HF is able to predict the ensemble-average temporal quantities related to transient emissions of airborne material in urban areas within the range of the model performance acceptance criteria established in the literature. The CFD-RANS methodology as implemented in the code ADREA-HF is also able to predict the ensemble-average dosage, but the dosage results should be treated with some caution, as in one case the observed ensemble-average dosage was under-estimated by slightly more than the acceptance criteria allow.
Ensemble-average peak concentration was systematically underpredicted by the model, to a degree higher than allowed by the acceptance criteria, in one of the two wind-tunnel experiments. The model performance depended on the positions of the examined sensors in relation to the emission source and the building configuration. The work presented in this paper was carried out (partly) within the scope of COST Action ES1006 "Evaluation, improvement, and guidance for the use of local-scale emergency prediction and response tools for airborne hazards in built environments".
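The three validation metrics named above have standard definitions in the dispersion-modeling literature. A minimal sketch, assuming the conventional Chang-and-Hanna-style formulas; the toy observation/prediction arrays are placeholders.

```python
import numpy as np

def validation_metrics(obs, pred):
    """Fractional bias (FB), normalized mean square error (NMSE), and the
    fraction of predictions within a factor of two of observations (FAC2)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    fb = 2.0 * (obs.mean() - pred.mean()) / (obs.mean() + pred.mean())
    nmse = np.mean((obs - pred) ** 2) / (obs.mean() * pred.mean())
    fac2 = np.mean((pred >= 0.5 * obs) & (pred <= 2.0 * obs))
    return fb, nmse, fac2

# Toy example: a slight systematic under-prediction.
obs = np.array([1.0, 2.0, 4.0, 8.0])
pred = 0.8 * obs
fb, nmse, fac2 = validation_metrics(obs, pred)
```

Commonly cited acceptance limits for urban dispersion are roughly |FB| < 0.67, NMSE < 6, and FAC2 >= 0.3, though the exact thresholds vary between studies.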
NASA Astrophysics Data System (ADS)
Winder, Anthony J.; Siemonsen, Susanne; Flottmann, Fabian; Fiehler, Jens; Forkert, Nils D.
2017-03-01
Voxel-based tissue outcome prediction in acute ischemic stroke patients is highly relevant for both clinical routine and research. Previous research has shown that features extracted from baseline multi-parametric MRI datasets have a high predictive value and can be used for the training of classifiers, which can generate tissue outcome predictions for both intravenous and conservative treatments. However, with the recent advent and popularization of intra-arterial thrombectomy treatment, novel research specifically addressing the utility of predictive classifiers for thrombectomy intervention is necessary for a holistic understanding of current stroke treatment options. The aim of this work was to develop three clinically viable tissue outcome prediction models using approximate nearest-neighbor, generalized linear model, and random decision forest approaches, and to evaluate the accuracy of predicting tissue outcome after intra-arterial treatment. Therefore, the three machine learning models were trained, evaluated, and compared using datasets of 42 acute ischemic stroke patients treated with intra-arterial thrombectomy. Classifier training utilized eight voxel-based features extracted from baseline MRI datasets and five global features. Evaluation of classifier-based predictions was performed via comparison to the known tissue outcome, which was determined in follow-up imaging, using the Dice coefficient and leave-one-patient-out cross validation. The random decision forest prediction model led to the best tissue outcome predictions with a mean Dice coefficient of 0.37. The approximate nearest-neighbor and generalized linear model performed equally suboptimally, with average Dice coefficients of 0.28 and 0.27 respectively, suggesting that both non-linearity and machine learning are desirable properties of a classifier well-suited to the intra-arterial tissue outcome prediction problem.
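The Dice coefficient used for evaluation above is simple to compute from binary lesion masks. A minimal sketch; the convention of returning 1.0 for two empty masks is an assumption, handled differently by some toolkits.

```python
import numpy as np

def dice(pred_mask, true_mask):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    total = pred_mask.sum() + true_mask.sum()
    if total == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred_mask, true_mask).sum() / total

# Toy 1D example: 2 overlapping voxels out of 3 predicted and 3 true.
d = dice([1, 1, 1, 0, 0], [0, 1, 1, 1, 0])
```

The same formula applies voxel-wise to 3D lesion volumes; values near 0.3-0.4, as reported above, reflect how hard final-infarct prediction is rather than an implementation issue.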
A Novel Local Learning based Approach With Application to Breast Cancer Diagnosis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Songhua; Tourassi, Georgia
2012-01-01
The purpose of this study is to develop and evaluate a novel local learning-based approach for computer-assisted diagnosis of breast cancer. Our new local learning-based algorithm, using the linear logistic regression method as its base learner, is described. Overall, our algorithm performs its stochastic searching process until the total allowed computing time is used up by our random walk process in identifying the most suitable population subdivision scheme and the corresponding individual base learners. The proposed local learning-based approach was applied for the prediction of breast cancer given 11 mammographic and clinical findings reported by physicians using the BI-RADS lexicon. Our database consisted of 850 patients with biopsy-confirmed diagnosis (290 malignant and 560 benign). We also compared the performance of our method with a collection of publicly available state-of-the-art machine learning methods. Predictive performance for all classifiers was evaluated using 10-fold cross validation and Receiver Operating Characteristic (ROC) analysis. Figure 1 reports the performance of 54 machine learning methods implemented in the machine learning toolkit Weka (version 3.0). We introduced a novel local learning-based classifier and compared it with an extensive list of other classifiers for the problem of breast cancer diagnosis. Our experiments show that our algorithm achieves superior prediction performance, outperforming a wide range of other well-established machine learning techniques. Our conclusion complements the existing understanding in the machine learning field that local learning may capture complicated, non-linear relationships exhibited by real-world datasets.
Pai, Priyadarshini P; Mondal, Sukanta
2016-10-01
Proteins interact with carbohydrates to perform various cellular interactions. Of the many carbohydrate ligands that proteins bind, mannose constitutes an important class, playing important roles in host defense mechanisms. Accurate identification of mannose-interacting residues (MIR) may provide important clues to decipher the underlying mechanisms of protein-mannose interactions during infections. This study proposes an approach using an ensemble of base classifiers for prediction of MIR using their evolutionary information in the form of a position-specific scoring matrix. The base classifiers are random forests trained on different subsets of the training data set Dset128 using 10-fold cross-validation. The optimized ensemble of base classifiers, MOWGLI, is then used to predict MIR on protein chains of the test data set Dtestset29, where it showed a promising performance with 92.0% accurate prediction. An overall improvement of 26.6% in precision was observed upon comparison with the state of the art. It is hoped that this approach, yielding enhanced predictions, could eventually be used for applications in drug design and vaccine development.
MATRIX FACTORIZATION-BASED DATA FUSION FOR GENE FUNCTION PREDICTION IN BAKER’S YEAST AND SLIME MOLD
ŽITNIK, MARINKA; ZUPAN, BLAŽ
2014-01-01
The development of effective methods for the characterization of gene functions that are able to combine diverse data sources in a sound and easily-extendible way is an important goal in computational biology. We have previously developed a general matrix factorization-based data fusion approach for gene function prediction. In this manuscript, we show that this data fusion approach can be applied to gene function prediction and that it can fuse various heterogeneous data sources, such as gene expression profiles, known protein annotations, interaction and literature data. The fusion is achieved by simultaneous matrix tri-factorization that shares matrix factors between sources. We demonstrate the effectiveness of the approach by evaluating its performance on predicting ontological annotations in slime mold D. discoideum and on recognizing proteins of baker’s yeast S. cerevisiae that participate in the ribosome or are located in the cell membrane. Our approach achieves predictive performance comparable to that of the state-of-the-art kernel-based data fusion, but requires fewer data preprocessing steps. PMID:24297565
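The core tri-factorization R ≈ G1 S G2ᵀ can be sketched with multiplicative updates in the style of Ding et al. This single-relation toy is only a sketch of the shared-factor idea, not the full multi-source fusion algorithm of the manuscript.

```python
import numpy as np

def nmtf(R, k1, k2, iters=200, eps=1e-9, seed=0):
    """Nonnegative matrix tri-factorization R ≈ G1 @ S @ G2.T via
    multiplicative updates. In the fusion setting, factors like G1 would be
    shared across several relation matrices; here only one R is used."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    G1 = rng.random((n, k1))
    S = rng.random((k1, k2))
    G2 = rng.random((m, k2))
    for _ in range(iters):
        G1 *= (R @ G2 @ S.T) / (G1 @ S @ G2.T @ G2 @ S.T + eps)
        G2 *= (R.T @ G1 @ S) / (G2 @ S.T @ G1.T @ G1 @ S + eps)
        S *= (G1.T @ R @ G2) / (G1.T @ G1 @ S @ G2.T @ G2 + eps)
    return G1, S, G2

# Toy gene-by-annotation matrix with planted rank-4 nonnegative structure.
rng = np.random.default_rng(1)
R = rng.random((30, 4)) @ rng.random((4, 20))
G1, S, G2 = nmtf(R, 4, 4)
err = np.linalg.norm(R - G1 @ S @ G2.T) / np.linalg.norm(R)
```

The multiplicative form keeps all factors nonnegative throughout, which is what makes the recovered factors interpretable as soft cluster memberships of genes and annotations.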
Yu, Haiying; Kühne, Ralph; Ebert, Ralf-Uwe; Schüürmann, Gerrit
2010-11-22
For 1143 organic compounds comprising 580 oxygen acids and 563 nitrogen bases that cover more than 17 orders of experimental pK(a) (from -5.00 to 12.23), the pK(a) prediction performances of ACD, SPARC, and two calibrations of a semiempirical quantum chemical (QC) AM1 approach have been analyzed. The overall root-mean-square errors (rms) for the acids are 0.41, 0.58 (0.42 without ortho-substituted phenols with intramolecular H-bonding), and 0.55 and for the bases are 0.65, 0.70, 1.17, and 1.27 for ACD, SPARC, and both QC methods, respectively. Method-specific performances are discussed in detail for six acid subsets (phenols and aromatic and aliphatic carboxylic acids with different substitution patterns) and nine base subsets (anilines, primary, secondary and tertiary amines, meta/para-substituted and ortho-substituted pyridines, pyrimidines, imidazoles, and quinolines). The results demonstrate an overall better performance for acids than for bases but also a substantial variation across subsets. For the overall best-performing ACD, rms ranges from 0.12 to 1.11 and 0.40 to 1.21 pK(a) units for the acid and base subsets, respectively. With regard to the squared correlation coefficient r², the results are 0.86 to 0.96 (acids) and 0.79 to 0.95 (bases) for ACD, 0.77 to 0.95 (acids) and 0.85 to 0.97 (bases) for SPARC, and 0.64 to 0.87 (acids) and 0.43 to 0.83 (bases) for the QC methods, respectively. Attention is paid to structural and method-specific causes for observed pitfalls. The significant subset dependence of the prediction performances suggests a consensus modeling approach.
Predicting Cavitation on Marine and Hydrokinetic Turbine Blades with AeroDyn V15.04
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murray, Robynne
Cavitation is an important consideration in the design of marine and hydrokinetic (MHK) turbines. The National Renewable Energy Laboratory's AeroDyn performance code was originally developed for horizontal-axis wind turbines and did not have the capability to predict cavitation inception. Therefore, AeroDyn has been updated to include the ability to predict cavitation on MHK turbines based on user-specified vapor pressure and submerged depth. This report outlines a verification of the AeroDyn V15.04 performance code for MHK turbines through a comparison to publicly available performance data.
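The inception check described (vapor pressure plus submerged depth) is commonly expressed through the cavitation number. The sketch below assumes the classical textbook criterion, not AeroDyn's exact implementation, and the numeric inputs are placeholders.

```python
RHO_SEA = 1025.0    # seawater density, kg/m^3
G = 9.81            # gravitational acceleration, m/s^2
P_ATM = 101325.0    # atmospheric pressure, Pa

def cavitates(cp_min, v_rel, depth, p_vapor=2300.0):
    """Classical inception check: cavitation is predicted when the minimum
    local pressure on the blade section drops below vapor pressure,
    i.e. when -Cp_min exceeds the cavitation number sigma."""
    sigma = (P_ATM + RHO_SEA * G * depth - p_vapor) / (0.5 * RHO_SEA * v_rel ** 2)
    return -cp_min > sigma

# A fast blade section near the surface is at risk; a deep, slow one is not.
shallow_fast = cavitates(cp_min=-3.0, v_rel=12.0, depth=1.0)
deep_slow = cavitates(cp_min=-1.0, v_rel=6.0, depth=20.0)
```

This is why submerged depth matters for MHK turbines: the hydrostatic head raises the cavitation number, so deeper blade sections tolerate stronger suction peaks.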
An Application to the Prediction of LOD Change Based on General Regression Neural Network
NASA Astrophysics Data System (ADS)
Zhang, X. H.; Wang, Q. J.; Zhu, J. J.; Zhang, H.
2011-07-01
Traditional prediction of the LOD (length of day) change was based on linear models, such as the least-squares model and the autoregressive technique. Due to the complex non-linear features of the LOD variation, the performance of the linear model predictors is not fully satisfactory. This paper applies a non-linear neural network, the general regression neural network (GRNN), to forecast the LOD change, and the results are analyzed and compared with those obtained with the back-propagation neural network and other models. The comparison shows that the performance of the GRNN model in the prediction of the LOD change is efficient and feasible.
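A GRNN is essentially Nadaraya-Watson kernel regression over the stored training samples. A minimal sketch on a synthetic non-linear series standing in for the LOD data; the series and bandwidth are illustrative assumptions.

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, sigma=0.1):
    """General regression neural network (Specht, 1991): each prediction is
    a Gaussian-kernel weighted average of the training targets."""
    d2 = (x_query[:, None] - x_train[None, :]) ** 2
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ y_train) / w.sum(axis=1)

# Toy non-linear series standing in for the LOD variation.
t = np.linspace(0.0, 10.0, 200)
y = np.sin(t) + 0.3 * np.sin(3.0 * t)
rng = np.random.default_rng(0)
idx = rng.permutation(200)
train, test = idx[:150], idx[150:]
y_hat = grnn_predict(t[train], y[train], t[test], sigma=0.1)
rmse = np.sqrt(np.mean((y_hat - y[test]) ** 2))
```

The only free parameter is the kernel bandwidth sigma, which is one reason GRNNs are attractive for geodetic series: there is no iterative training, just a smoothing width to tune.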
Evaluating Prospective Teachers: Testing the Predictive Validity of the EdTPA
ERIC Educational Resources Information Center
Goldhaber, Dan; Cowan, James; Theobald, Roddy
2017-01-01
We use longitudinal data from Washington State to provide estimates of the extent to which performance on the edTPA, a performance-based, subject-specific assessment of teacher candidates, is predictive of the likelihood of employment in the teacher workforce and value-added measures of teacher effectiveness. While edTPA scores are highly…
Beyond the "High-Tech" Suits: Predicting 2012 Olympic Swim Performances
ERIC Educational Resources Information Center
Brammer, Chris L.; Stager, Joel M.; Tanner, Dave A.
2012-01-01
The purpose of the authors in this study was to predict the mean swim time of the top eight swimmers in swim events at the 2012 Olympic Games based upon prior Olympic performances from 1972 through 2008. Using the mean top eight time across all years, a best-fit power curve (time = a × year^b)…
Evaluating Prospective Teachers: Testing the Predictive Validity of the edTPA. Working Paper 157
ERIC Educational Resources Information Center
Goldhaber, Dan; Cowan, James; Theobald, Roddy
2016-01-01
We use longitudinal data from Washington State to provide estimates of the extent to which performance on the edTPA, a performance-based, subject-specific assessment of teacher candidates, is predictive of the likelihood of employment in the teacher workforce and value-added measures of teacher effectiveness. While edTPA scores are highly…
ERIC Educational Resources Information Center
Gallant, Dorinda J.
2013-01-01
Early childhood professional organizations support teachers as the best assessors of students' academic, social, emotional, and physical development. This study investigates the predictive nature of teacher ratings of first-grade students' performance on a standards-based curriculum-embedded performance assessment within the context of a state…
Developing Local Oral Reading Fluency Cut Scores for Predicting High-Stakes Test Performance
ERIC Educational Resources Information Center
Grapin, Sally L.; Kranzler, John H.; Waldron, Nancy; Joyce-Beaulieu, Diana; Algina, James
2017-01-01
This study evaluated the classification accuracy of a second grade oral reading fluency curriculum-based measure (R-CBM) in predicting third grade state test performance. It also compared the long-term classification accuracy of local and publisher-recommended R-CBM cut scores. Participants were 266 students who were divided into a calibration…
NASA Astrophysics Data System (ADS)
Ko, P.; Kurosawa, S.
2014-03-01
Understanding and accurately predicting the flow behaviour related to cavitation and pressure fluctuation in a Kaplan turbine are important to design work that enhances turbine performance, including extending the operational life span and improving turbine efficiency. In this paper, a high-accuracy turbine and cavitation performance prediction method based on the entire flow passage of a Kaplan turbine is presented and evaluated. The two-phase flow field is predicted by solving the Reynolds-Averaged Navier-Stokes equations, expressed with a volume-of-fluid method to track the free surface and combined with a Reynolds stress model. The growth and collapse of cavitation bubbles are modelled by the modified Rayleigh-Plesset equation. The prediction accuracy is evaluated by comparison with model test results for an Ns 400 Kaplan model turbine. As a result, the experimentally measured data, including turbine efficiency, cavitation performance, and pressure fluctuation, are accurately predicted. Furthermore, cavitation occurrence on the runner blade surface and its influence on the hydraulic loss of the flow passage are discussed. The evaluated prediction method for turbine flow and performance is introduced to facilitate future design and research work on Kaplan-type turbines.
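The Rayleigh-Plesset equation named above governs cavitation-bubble radius dynamics. As an illustrative sketch only, not the paper's coupled RANS/VOF solver, a stripped-down form without viscosity and surface-tension terms can be integrated with forward Euler; all parameter values here are invented:

```python
# Minimal forward-Euler sketch of simplified Rayleigh-Plesset bubble dynamics:
#   R * d2R/dt2 + (3/2) * (dR/dt)**2 = (p_B - p_inf) / rho
# Viscous and surface-tension terms are omitted; parameters are illustrative.

def simulate_bubble(r0=1e-3, rdot0=0.0, dp=1.0e3, rho=1000.0,
                    dt=1e-7, steps=1000):
    """Integrate bubble radius R(t) under a constant pressure difference dp (Pa)."""
    r, rdot = r0, rdot0
    for _ in range(steps):
        rddot = (dp / rho - 1.5 * rdot ** 2) / r  # acceleration from the R-P equation
        rdot += dt * rddot
        r += dt * rdot
    return r

# With p_B > p_inf the bubble grows; with dp < 0 it shrinks toward collapse.
final_radius = simulate_bubble()
```

A production cavitation model would couple this bubble dynamics to the local pressure field of the turbine flow solution rather than a constant dp.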
Hahn, Andreas; Lang, Michael; Stuckart, Claudia
2016-01-01
The objective of this work is to evaluate whether clinically important factors may predict an individual's capability to utilize the functional benefits provided by an advanced hydraulic, microprocessor-controlled exo-prosthetic knee component. This retrospective cross-sectional cohort analysis investigated the data of above-knee amputees captured during routine trial fittings. Prosthetists rated the performance indicators showing the functional benefits of the advanced maneuvering capabilities of the device. Subjects were asked to rate their perception. Simple and multiple linear and logistic regression were applied. Data from 899 subjects with demographics typical for the population were evaluated. Ability to vary gait speed, perform toileting, and ascend stairs were identified as the most sensitive performance predictors. Prior C-Leg users showed benefits during advanced maneuvering. Variables showed plausible and meaningful effects but could not claim predictive power. Mobility grade showed the largest effect but also failed to be predictive. Clinical parameters such as etiology, age, mobility grade, and others analyzed here do not suffice to predict individual potential. Daily walking distance may pose a threshold value and be part of a predictive instrument. Decisions based solely on single parameters such as mobility grade rating or walking distance seem to be questionable. PMID:27828871
Bao, Wei; Hu, Frank B.; Rong, Shuang; Rong, Ying; Bowers, Katherine; Schisterman, Enrique F.; Liu, Liegang; Zhang, Cuilin
2013-01-01
This study aimed to evaluate the predictive performance of genetic risk models based on risk loci identified and/or confirmed in genome-wide association studies for type 2 diabetes mellitus. A systematic literature search was conducted in the PubMed/MEDLINE and EMBASE databases through April 13, 2012, and published data relevant to the prediction of type 2 diabetes based on genome-wide association marker–based risk models (GRMs) were included. Of the 1,234 potentially relevant articles, 21 articles representing 23 studies were eligible for inclusion. The median area under the receiver operating characteristic curve (AUC) among eligible studies was 0.60 (range, 0.55–0.68), which did not differ appreciably by study design, sample size, participants’ race/ethnicity, or the number of genetic markers included in the GRMs. In addition, the AUCs for type 2 diabetes did not improve appreciably with the addition of genetic markers into conventional risk factor–based models (median AUC, 0.79 (range, 0.63–0.91) vs. median AUC, 0.78 (range, 0.63–0.90), respectively). A limited number of included studies used reclassification measures and yielded inconsistent results. In conclusion, GRMs showed a low predictive performance for risk of type 2 diabetes, irrespective of study design, participants’ race/ethnicity, and the number of genetic markers included. Moreover, the addition of genome-wide association markers into conventional risk models produced little improvement in predictive performance. PMID:24008910
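The modest discrimination described above (median AUC ≈ 0.60) can be made concrete with a toy simulation: an unweighted allele-count risk score whose AUC is computed with the rank-based (Mann-Whitney) formula. The allele frequencies, locus count, and sample sizes below are invented for illustration:

```python
import random

def rank_auc(cases, controls):
    """Rank-based AUC: P(case score > control score) + 0.5 * P(tie)."""
    wins = ties = 0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1
            elif c == k:
                ties += 1
    return (wins + 0.5 * ties) / (len(cases) * len(controls))

random.seed(42)
n_loci, n_subjects = 20, 200
# Hypothetical risk-allele carriage, slightly enriched in cases, as is
# typical for GWAS loci with small per-allele effects.
case_scores = [sum(random.random() < 0.52 for _ in range(n_loci))
               for _ in range(n_subjects)]
ctrl_scores = [sum(random.random() < 0.48 for _ in range(n_loci))
               for _ in range(n_subjects)]
grm_auc = rank_auc(case_scores, ctrl_scores)  # many weak loci yield an AUC near 0.6
```

The small per-locus frequency difference is the point: stacking many weak markers still leaves the score's discrimination modest, mirroring the review's conclusion.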
ERIC Educational Resources Information Center
Lee, Young-Jin
2017-01-01
Purpose: The purpose of this paper is to develop a quantitative model of problem solving performance of students in the computer-based mathematics learning environment. Design/methodology/approach: Regularized logistic regression was used to create a quantitative model of problem solving performance of students that predicts whether students can…
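The abstract names regularized logistic regression without further detail; a minimal sketch of that model family, fit by batch gradient descent with an L2 penalty on a single invented "practice score" feature, might look like:

```python
import math

def fit_regularized_logistic(xs, ys, lam=0.01, lr=0.1, iters=2000):
    """L2-regularized logistic regression on one feature via gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(iters):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted success probability
            gw += (p - y) * x
            gb += (p - y)
        w -= lr * (gw / n + lam * w)  # the L2 penalty shrinks w toward 0
        b -= lr * (gb / n)
    return w, b

# Invented data: one feature vs. solved (1) / not solved (0).
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_regularized_logistic(xs, ys)
preds = [1 if w * x + b > 0 else 0 for x in xs]
```

In practice one would use a library implementation (e.g. scikit-learn's `LogisticRegression`, whose `C` parameter is the inverse of `lam` here) and many features rather than one.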
NASA Astrophysics Data System (ADS)
Murrill, Steven R.; Jacobs, Eddie L.; Franck, Charmaine C.; Petkie, Douglas T.; De Lucia, Frank C.
2015-10-01
The U.S. Army Research Laboratory (ARL) has continued to develop and enhance a millimeter-wave (MMW) and submillimeter-wave (SMMW)/terahertz (THz)-band imaging system performance prediction and analysis tool for both the detection and identification of concealed weaponry, and for pilotage obstacle avoidance. The details of the MATLAB-based model which accounts for the effects of all critical sensor and display components, for the effects of atmospheric attenuation, concealment material attenuation, and active illumination, were reported on at the 2005 SPIE Europe Security and Defence Symposium (Brugge). An advanced version of the base model that accounts for both the dramatic impact that target and background orientation can have on target observability as related to specular and Lambertian reflections captured by an active-illumination-based imaging system, and for the impact of target and background thermal emission, was reported on at the 2007 SPIE Defense and Security Symposium (Orlando). Further development of this tool that includes a MODTRAN-based atmospheric attenuation calculator and advanced system architecture configuration inputs that allow for straightforward performance analysis of active or passive systems based on scanning (single- or line-array detector element(s)) or staring (focal-plane-array detector elements) imaging architectures was reported on at the 2011 SPIE Europe Security and Defence Symposium (Prague). This paper provides a comprehensive review of a newly enhanced MMW and SMMW/THz imaging system analysis and design tool that now includes an improved noise sub-model for more accurate and reliable performance predictions, the capability to account for post-capture image contrast enhancement, and the capability to account for concealment material backscatter with active-illumination-based systems. Present plans for additional expansion of the model's predictive capabilities are also outlined.
In silico prediction of pharmaceutical degradation pathways: a benchmarking study.
Kleinman, Mark H; Baertschi, Steven W; Alsante, Karen M; Reid, Darren L; Mowery, Mark D; Shimanovich, Roman; Foti, Chris; Smith, William K; Reynolds, Dan W; Nefliu, Marcela; Ott, Martin A
2014-11-03
Zeneth is a new software application capable of predicting degradation products derived from small molecule active pharmaceutical ingredients. This study was aimed at understanding the current status of Zeneth's predictive capabilities and assessing gaps in predictivity. Using data from 27 small molecule drug substances from five pharmaceutical companies, the evolution of Zeneth predictions through knowledge base development since 2009 was evaluated. The experimentally observed degradation products from forced degradation, accelerated, and long-term stability studies were compared to Zeneth predictions. Steady progress in predictive performance was observed as the knowledge bases grew and were refined. Over the course of the development covered within this evaluation, the ability of Zeneth to predict experimentally observed degradants increased from 31% to 54%. In particular, gaps in predictivity were noted in the areas of epimerizations, N-dealkylation of N-alkylheteroaromatic compounds, photochemical decarboxylations, and electrocyclic reactions. The results of this study show that knowledge base development efforts have increased the ability of Zeneth to predict relevant degradation products and aid pharmaceutical research. This study has also provided valuable information to help guide further improvements to Zeneth and its knowledge base.
Predicting Performance in Higher Education Using Proximal Predictors.
Niessen, A Susan M; Meijer, Rob R; Tendeiro, Jorge N
2016-01-01
We studied the validity of two methods for predicting academic performance and student-program fit that were proximal to important study criteria. Applicants to an undergraduate psychology program participated in a selection procedure containing a trial-studying test based on a work sample approach, and specific skills tests in English and math. Test scores were used to predict academic achievement and progress after the first year, achievement in specific course types, enrollment, and dropout after the first year. All tests showed positive significant correlations with the criteria. The trial-studying test was consistently the best predictor in the admission procedure. We found no significant differences between the predictive validity of the trial-studying test and prior educational performance, and substantial shared explained variance between the two predictors. Only applicants with lower trial-studying scores were significantly less likely to enroll in the program. In conclusion, the trial-studying test yielded predictive validities similar to that of prior educational performance and possibly enabled self-selection. In admissions aimed at student-program fit, or in admissions in which past educational performance is difficult to use, a trial-studying test is a good instrument to predict academic performance.
Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V
2012-01-01
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
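As a hedged sketch of the bagging idea compared above, the toy below bootstrap-aggregates one-split "stumps" over a single invented feature; the study itself used full regression trees, random forests, and boosting against spline-augmented logistic regression:

```python
import random

def fit_stump(xs, ys):
    """Best single-threshold rule on one feature (a one-split tree)."""
    uniq = sorted(set(xs))
    cands = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])] or uniq
    best = None
    for t in cands:
        for sign in (1, -1):
            acc = sum((1 if sign * (x - t) > 0 else 0) == y
                      for x, y in zip(xs, ys)) / len(ys)
            if best is None or acc > best[0]:
                best = (acc, t, sign)
    _, t, sign = best
    return lambda x: 1 if sign * (x - t) > 0 else 0

def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Bootstrap aggregation: majority vote over stumps fit to resamples."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]  # bootstrap resample
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: 1 if 2 * sum(s(x) for s in stumps) > len(stumps) else 0

# Invented one-feature mortality-style outcome for illustration.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [0, 0, 0, 1, 1, 1]
model = bagged_stumps(xs, ys)
preds = [model(x) for x in xs]
```

Averaging over resamples reduces the variance of the unstable base learner, which is the mechanism the paper evaluates against (and finds roughly matched by) a well-specified logistic model.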
A community effort to assess and improve drug sensitivity prediction algorithms
Costello, James C; Heiser, Laura M; Georgii, Elisabeth; Gönen, Mehmet; Menden, Michael P; Wang, Nicholas J; Bansal, Mukesh; Ammad-ud-din, Muhammad; Hintsanen, Petteri; Khan, Suleiman A; Mpindi, John-Patrick; Kallioniemi, Olli; Honkela, Antti; Aittokallio, Tero; Wennerberg, Krister; Collins, James J; Gallahan, Dan; Singer, Dinah; Saez-Rodriguez, Julio; Kaski, Samuel; Gray, Joe W; Stolovitzky, Gustavo
2015-01-01
Predicting the best treatment strategy from genomic information is a core goal of precision medicine. Here we focus on predicting drug response based on a cohort of genomic, epigenomic and proteomic profiling data sets measured in human breast cancer cell lines. Through a collaborative effort between the National Cancer Institute (NCI) and the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we analyzed a total of 44 drug sensitivity prediction algorithms. The top-performing approaches modeled nonlinear relationships and incorporated biological pathway information. We found that gene expression microarrays consistently provided the best predictive power of the individual profiling data sets; however, performance was increased by including multiple, independent data sets. We discuss the innovations underlying the top-performing methodology, Bayesian multitask MKL, and we provide detailed descriptions of all methods. This study establishes benchmarks for drug sensitivity prediction and identifies approaches that can be leveraged for the development of new methods. PMID:24880487
Predicting the Resiliency in Parents with Exceptional Children Based on Their Mindfulness
ERIC Educational Resources Information Center
Jabbari, Sosan; Firoozabadi, Somayeh Sadati; Rostami, Sedighe
2016-01-01
The purpose of the present study was to predict resiliency in parents of exceptional children based on their mindfulness. This descriptive correlational study was performed on 260 parents of students (105 male and 159 female) who were selected by the cluster sampling method. The family resiliency questionnaire (Sickby, 2005) and five aspect…
NASA Astrophysics Data System (ADS)
Feng, Shou; Fu, Ping; Zheng, Wenbin
2018-03-01
Predicting gene function based on biological instrumental data is a complicated and challenging hierarchical multi-label classification (HMC) problem. When using local approach methods to solve this problem, a preliminary results processing method is usually needed. This paper proposed a novel preliminary results processing method called the nodes interaction method. The nodes interaction method revises the preliminary results and guarantees that the predictions are consistent with the hierarchy constraint. This method exploits the label dependency and considers the hierarchical interaction between nodes when making decisions based on the Bayesian network in its first phase. In the second phase, this method further adjusts the results according to the hierarchy constraint. Implementing the nodes interaction method in the HMC framework also enhances the HMC performance for solving the gene function prediction problem based on the Gene Ontology (GO), the hierarchy of which is a directed acyclic graph that is more difficult to tackle. The experimental results validate the promising performance of the proposed method compared to state-of-the-art methods on eight benchmark yeast data sets annotated by the GO.
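The "second phase" adjustment described above can be sketched in a few lines: force predicted probabilities to respect the hierarchy constraint, so that no child term scores higher than its parents. The node names and scores below are invented, and the GO DAG is represented as a child-to-parents map:

```python
def enforce_hierarchy(probs, parents):
    """Clip each node's probability to the minimum over its parents,
    processing nodes in topological order so caps propagate down the DAG."""
    done, order = set(), []
    while len(order) < len(probs):           # simple topological sort
        for n in probs:
            if n not in done and all(p in done for p in parents.get(n, [])):
                done.add(n)
                order.append(n)
    adjusted = {}
    for n in order:
        cap = min((adjusted[p] for p in parents.get(n, [])), default=1.0)
        adjusted[n] = min(probs[n], cap)
    return adjusted

# Toy DAG: "leaf" has two parents, as GO terms may.
probs = {"root": 0.9, "a": 0.95, "b": 0.4, "leaf": 0.6}
parents = {"a": ["root"], "b": ["root"], "leaf": ["a", "b"]}
out = enforce_hierarchy(probs, parents)
# "a" is capped by "root" (0.9); "leaf" is capped by "b" (0.4)
```

The paper's nodes interaction method is richer (it also revises scores upward using a Bayesian network over neighbouring nodes); this sketch shows only the final consistency-enforcement step.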
Similarity-based Regularized Latent Feature Model for Link Prediction in Bipartite Networks.
Wang, Wenjun; Chen, Xue; Jiao, Pengfei; Jin, Di
2017-12-05
Link prediction is an attractive research topic in the field of data mining and has significant applications in improving the performance of recommendation systems and exploring the evolving mechanisms of complex networks. A variety of real-world complex systems can be abstractly represented as bipartite networks, in which there are two types of nodes and no links connect nodes of the same type. In this paper, we propose a framework for link prediction in bipartite networks that combines similarity-based structure with the latent feature model from a new perspective. The framework is called Similarity Regularized Nonnegative Matrix Factorization (SRNMF); it explicitly takes local characteristics into consideration and encodes the geometrical information of the networks by constructing a similarity-based matrix. We also develop an iterative scheme to solve the objective function based on gradient descent. Extensive experiments on a variety of real-world bipartite networks show that the proposed link prediction framework achieves more competitive, preferable and stable performance in comparison with state-of-the-art methods.
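SRNMF's exact objective and updates are not reproduced here; as a loose, hedged analogue, the sketch below runs graph-regularized NMF (GNMF-style multiplicative updates) on a toy biadjacency matrix, with a made-up shared-neighbour similarity among one node type:

```python
import numpy as np

rng = np.random.default_rng(0)
A = (rng.random((12, 8)) < 0.3).astype(float)  # toy bipartite biadjacency matrix
S = A @ A.T                                    # row similarity: shared neighbours
np.fill_diagonal(S, 0.0)
D = np.diag(S.sum(axis=1))                     # degree matrix; graph Laplacian L = D - S
k, lam, eps = 4, 0.1, 1e-9

W = rng.random((12, k))
H = rng.random((k, 8))

def objective():
    """Frobenius reconstruction error plus the similarity (graph) penalty."""
    return np.linalg.norm(A - W @ H) ** 2 + lam * np.trace(W.T @ (D - S) @ W)

start = objective()
for _ in range(200):
    # Multiplicative updates keep W, H nonnegative and do not increase the objective.
    W *= (A @ H.T + lam * S @ W) / (W @ H @ H.T + lam * D @ W + eps)
    H *= (W.T @ A) / (W.T @ W @ H + eps)
end = objective()
# (W @ H)[i, j] then scores unobserved node pairs as link candidates.
```

The penalty term pulls the latent rows of similar nodes together, which is the "encodes the geometrical information" idea; the paper's actual formulation and solver may differ.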
Sequence-based predictive modeling to identify cancerlectins
Lai, Hong-Yan; Chen, Xin-Xin; Chen, Wei; Tang, Hua; Lin, Hao
2017-01-01
Lectins are a diverse class of glycoproteins, or carbohydrate-binding proteins, that are widely distributed across species. They can specifically identify and exclusively bind to certain kinds of saccharide groups. Cancerlectins are a group of lectins that are closely related to cancer and play a major role in tumor initiation, survival, growth, metastasis and spread. Several computational methods have emerged to discriminate cancerlectins from non-cancerlectins, which promotes the study of pathogenic mechanisms and the clinical treatment of cancer. However, the predictive accuracies of most of these techniques are very limited. In this work, by constructing a benchmark dataset based on the CancerLectinDB database, a new amino acid sequence-based strategy for feature description was developed, and the binomial distribution was then applied to screen the optimal feature set. Ultimately, an SVM-based predictor was built to distinguish cancerlectins from non-cancerlectins, achieving an accuracy of 77.48% with an AUC of 85.52% in jackknife cross-validation. The results revealed that our prediction model performs better compared with published predictive tools. PMID:28423655
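The jackknife (leave-one-out) validation protocol used above is easy to sketch. Here a 1-nearest-neighbour rule stands in for the paper's SVM, and the two-feature "sequence descriptors" are invented for illustration:

```python
def one_nn(train_x, train_y, q):
    """Predict the label of the training point closest to query q."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, q)) for x in train_x]
    return train_y[dists.index(min(dists))]

def jackknife_accuracy(X, y, classify):
    """Leave each sample out in turn, train on the rest, test on it."""
    hits = 0
    for i in range(len(X)):
        tx = X[:i] + X[i + 1:]
        ty = y[:i] + y[i + 1:]
        hits += classify(tx, ty, X[i]) == y[i]
    return hits / len(X)

# Invented feature vectors, e.g. two composition features per sequence.
X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.8, 0.9), (0.9, 0.8), (0.85, 0.95)]
y = [0, 0, 0, 1, 1, 1]
acc = jackknife_accuracy(X, y, one_nn)
```

With n samples the jackknife fits the model n times, which is why it is favoured for the small benchmark datasets typical of this literature.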
[Predictive model based multimetric index of macroinvertebrates for river health assessment].
Chen, Kai; Yu, Hai Yan; Zhang, Ji Wei; Wang, Bei Xin; Chen, Qiu Wen
2017-06-18
Improving the stability of the index of biotic integrity (IBI; i.e., multi-metric indices, MMI) across temporal and spatial scales is one of the most important issues in water ecosystem integrity bioassessment and water environment management. Using datasets of field-based macroinvertebrate and physicochemical variables and GIS-based natural predictors (e.g., geomorphology and climate) and land use variables collected at 227 river sites from 2004 to 2011 across Zhejiang Province, China, we used random forests (RF) to adjust for the effects of natural variation at temporal and spatial scales on macroinvertebrate metrics. We then developed natural-variation-adjusted (predictive) and unadjusted (null) MMIs and compared performance between them. The core metrics selected for the predictive and null MMIs differed from each other, and the natural variation within the core metrics of the predictive MMI explained by RF models ranged between 11.4% and 61.2%. The predictive MMI was more precise and accurate, but less responsive and sensitive, than the null MMI. The multivariate nearest-neighbor test determined that 9 test sites and 1 most degraded site were flagged outside the environmental space of the reference site network. We found that the combination of the predictive MMI developed using the predictive model and the nearest-neighbor test performed best and decreased the risks of inferring type I (designating a water body as being in poor biological condition when it was actually in good condition) and type II (designating a water body as being in good biological condition when it was actually in poor condition) errors. Our results provide an effective method to improve the stability and performance of the index of biotic integrity.
Schummers, Laura; Himes, Katherine P; Bodnar, Lisa M; Hutcheon, Jennifer A
2016-09-21
Compelled by the intuitive appeal of predicting each individual patient's risk of an outcome, there is a growing interest in risk prediction models. While the statistical methods used to build prediction models are increasingly well understood, the literature offers little insight to researchers seeking to gauge a priori whether a prediction model is likely to perform well for their particular research question. The objective of this study was to inform the development of new risk prediction models by evaluating model performance under a wide range of predictor characteristics. Data from all births to overweight or obese women in British Columbia, Canada from 2004 to 2012 (n = 75,225) were used to build a risk prediction model for preeclampsia. The data were then augmented with simulated predictors of the outcome with pre-set prevalence values and univariable odds ratios. We built 120 risk prediction models that included known demographic and clinical predictors, and one, three, or five of the simulated variables. Finally, we evaluated standard model performance criteria (discrimination, risk stratification capacity, calibration, and Nagelkerke's r²) for each model. Findings from our models built with simulated predictors demonstrated the predictor characteristics required for a risk prediction model to adequately discriminate cases from non-cases and to adequately classify patients into clinically distinct risk groups. Several predictor characteristics can yield well performing risk prediction models; however, these characteristics are not typical of predictor-outcome relationships in many population-based or clinical data sets. Novel predictors must be both strongly associated with the outcome and prevalent in the population to be useful for clinical prediction modeling (e.g., one predictor with prevalence ≥20 % and odds ratio ≥8, or 3 predictors with prevalence ≥10 % and odds ratios ≥4).
Area under the receiver operating characteristic curve values of >0.8 were necessary to achieve reasonable risk stratification capacity. Our findings provide a guide for researchers to estimate the expected performance of a prediction model before a model has been built based on the characteristics of available predictors.
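The prevalence/odds-ratio thresholds quoted above can be checked with a small simulation: generate one binary predictor with 20 % prevalence and a univariable odds ratio of 8 against a 5 % baseline risk (the baseline value is invented here), then compute that predictor's stand-alone AUC:

```python
import math
import random

random.seed(1)
n = 20000
prevalence, odds_ratio, baseline_risk = 0.20, 8.0, 0.05
b0 = math.log(baseline_risk / (1 - baseline_risk))  # log-odds when x = 0
b1 = math.log(odds_ratio)                           # per-unit log odds ratio

cases, controls = [], []
for _ in range(n):
    x = 1 if random.random() < prevalence else 0
    p = 1 / (1 + math.exp(-(b0 + b1 * x)))          # outcome risk for this subject
    (cases if random.random() < p else controls).append(x)

# Rank-based AUC of the single binary predictor, from the 2x2 counts.
c1, k1 = sum(cases), sum(controls)
c0, k0 = len(cases) - c1, len(controls) - k1
auc = (c1 * k0 + 0.5 * (c1 * k1 + c0 * k0)) / (len(cases) * len(controls))
# Even this strong, fairly prevalent predictor yields only a moderate AUC alone.
```

Varying `prevalence` and `odds_ratio` in this sketch reproduces the paper's broader point: predictors must be both strong and common before a model built on them discriminates well.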
NASA Astrophysics Data System (ADS)
Branger, E.; Grape, S.; Jansson, P.; Jacobsson Svärd, S.
2018-02-01
The Digital Cherenkov Viewing Device (DCVD) is a tool used by nuclear safeguards inspectors to verify irradiated nuclear fuel assemblies in wet storage based on the recording of Cherenkov light produced by the assemblies. One type of verification involves comparing the measured light intensity from an assembly with a predicted intensity, based on assembly declarations. Crucial for such analyses is the performance of the prediction model used, and recently new modelling methods have been introduced to allow for enhanced prediction capabilities by taking the irradiation history into account, and by including the cross-talk radiation from neighbouring assemblies in the predictions. In this work, the performance of three models for Cherenkov-light intensity prediction is evaluated by applying them to a set of short-cooled PWR 17x17 assemblies for which experimental DCVD measurements and operator-declared irradiation data were available: (1) a two-parameter model, based on total burnup and cooling time, previously used by the safeguards inspectors, (2) a newly introduced gamma-spectrum-based model, which incorporates cycle-wise burnup histories, and (3) the latter gamma-spectrum-based model with an addition that accounts for contributions from neighbouring assemblies. The results show that the two gamma-spectrum-based models provide significantly higher precision for the measured inventory compared to the two-parameter model, lowering the standard deviation between relative measured and predicted intensities from 15.2 % to 8.1 % and 7.8 %, respectively. The results show some systematic differences between assemblies of different designs (produced by different manufacturers) in spite of their similar PWR 17x17 geometries, and possible ways to address such differences are discussed, which may allow for even higher prediction capabilities.
Still, it is concluded that the gamma-spectrum-based models enable confident verification of the fuel assembly inventory at the currently used detection limit for partial defects, being a 30 % discrepancy between measured and predicted intensities, while some false detection occurs with the two-parameter model. The results also indicate that the gamma-spectrum-based prediction methods are accurate enough that the 30 % discrepancy limit could potentially be lowered.
Pavement Performance : Approaches Using Predictive Analytics
DOT National Transportation Integrated Search
2018-03-23
Acceptable pavement condition is paramount to road safety. Using predictive analytics techniques, this project attempted to develop models that provide an assessment of pavement condition based on an array of indicators that include pavement distress,...
Colquitt, Jason A; Lepine, Jeffery A; Piccolo, Ronald F; Zapata, Cindy P; Rich, Bruce L
2012-01-01
Past research has revealed significant relationships between organizational justice dimensions and job performance, and trust is thought to be one mediator of those relationships. However, trust has been positioned in justice theorizing in 2 different ways, either as an indicator of the depth of an exchange relationship or as a variable that reflects levels of work-related uncertainty. Moreover, trust scholars distinguish between multiple forms of trust, including affect- and cognition-based trust, and it remains unclear which form is most relevant to justice effects. To explore these issues, we built and tested a more comprehensive model of trust mediation in which procedural, interpersonal, and distributive justice predicted affect- and cognition-based trust, with those trust forms predicting both exchange- and uncertainty-based mechanisms. The results of a field study in a hospital system revealed that the trust variables did indeed mediate the relationships between the organizational justice dimensions and job performance, with affect-based trust driving exchange-based mediation and cognition-based trust driving uncertainty-based mediation.
Application of LogitBoost Classifier for Traceability Using SNP Chip Data
Kim, Kwondo; Seo, Minseok; Kang, Hyunsung; Cho, Seoae; Kim, Heebal; Seo, Kang-Seok
2015-01-01
Consumer attention to food safety has increased rapidly due to animal-related diseases; therefore, it is important to identify their places of origin (POO) for safety purposes. However, only a few studies have addressed this issue and focused on machine learning-based approaches. In the present study, classification analyses were performed using a customized SNP chip for POO prediction. To accomplish this, 4,122 pigs originating from 104 farms were genotyped using the SNP chip. Several factors were considered to establish the best prediction model based on these data. We also assessed the applicability of the suggested model using a kinship coefficient-filtering approach. Our results showed that the LogitBoost-based prediction model outperformed other classifiers in terms of classification performance under most conditions. Specifically, a greater level of accuracy was observed when a higher kinship-based cutoff was employed. These results demonstrated the applicability of a machine learning-based approach using SNP chip data for practical traceability. PMID:26436917
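LogitBoost itself is rarely spelled out; the sketch below follows the two-class recipe of Friedman, Hastie & Tibshirani (weighted regression stumps fit to the working response), on an invented one-dimensional feature rather than SNP-chip genotypes:

```python
import math

def wstump(xs, z, w):
    """Weighted least-squares regression stump (threshold + two means)."""
    uniq = sorted(set(xs))
    cands = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])] or uniq
    best = None
    for t in cands:
        left = [i for i in range(len(xs)) if xs[i] <= t]
        right = [i for i in range(len(xs)) if xs[i] > t]
        def wmean(idx):
            sw = sum(w[i] for i in idx)
            return sum(w[i] * z[i] for i in idx) / sw if sw else 0.0
        ml, mr = wmean(left), wmean(right)
        sse = sum(w[i] * (z[i] - (ml if xs[i] <= t else mr)) ** 2
                  for i in range(len(xs)))
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def logitboost(xs, ys, rounds=10):
    F = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        p = [1 / (1 + math.exp(-2 * f)) for f in F]          # current probabilities
        w = [max(pi * (1 - pi), 1e-6) for pi in p]           # clamped Newton weights
        z = [(yi - pi) / wi for yi, pi, wi in zip(ys, p, w)]  # working response
        f = wstump(xs, z, w)
        stumps.append(f)
        F = [fi + 0.5 * f(x) for fi, x in zip(F, xs)]
    return lambda x: 1 if sum(0.5 * s(x) for s in stumps) > 0 else 0

xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]   # invented 1-D feature
ys = [0, 0, 0, 1, 1, 1]
clf = logitboost(xs, ys)
preds = [clf(x) for x in xs]
```

A real POO classifier would use a mature implementation (the paper used Weka's LogitBoost) over thousands of SNP features; this sketch only shows the boosting mechanics.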
NASA Astrophysics Data System (ADS)
Shi, Wenhai; Huang, Mingbin
2017-04-01
The Chinese Loess Plateau is one of the most erodible areas in the world. In order to reduce soil and water losses, suitable conservation practices need to be designed. For this purpose, there is an increasing demand for an appropriate model that can accurately predict storm-based surface runoff and soil losses on the Loess Plateau. The Chinese Soil Loss Equation (CSLE) has been widely used in this region to assess soil losses from different land use types. However, the CSLE was intended only to predict the mean annual gross soil loss. In this study, a storm-based CSLE was proposed that introduced a new rainfall-runoff erosivity factor. A dataset was compiled that comprised measurements of soil losses during individual storms from three runoff-erosion plots in each of three different watersheds in the gully region of the Plateau for 3-7 years in three different time periods (1956-1959; 1973-1980; 2010-13). The accuracy of the soil loss predictions made by the new storm-based CSLE was determined using the data for the six plots in two of the watersheds measured during 165 storm-runoff events. The performance of the storm-based CSLE was further compared with the performance of the storm-based Revised Universal Soil Loss Equation (RUSLE) for the same six plots. During the calibration (83 storms) and validation (82 storms) of the storm-based CSLE, the model efficiency, E, was 87.7% and 88.9%, respectively, while the root mean square error (RMSE) was 2.7 and 2.3 t ha-1, indicating a high degree of accuracy. Furthermore, the storm-based CSLE performed better than the storm-based RUSLE (E: 75.8% and 70.3%; RMSE: 3.8 and 3.7 t ha-1, for the calibration and validation storms, respectively). The storm-based CSLE was then used to predict the soil losses from the three experimental plots in the third watershed.
For these predictions, the model parameter values, previously determined by the calibration based on the data from the initial six plots, were used in the storm-based CSLE. In addition, the surface runoff used by the storm-based CSLE was either obtained from measurements or from the values predicted by the modified Soil Conservation Service Curve Number (SCS-CN) method. When using the measured runoff, the storm-based CSLE had an E of 76.6%, whereas the use of the predicted runoff gave an E of 76.4%. The high E values indicated that the storm-based CSLE incorporating the modified SCS-CN method could accurately predict storm-event-based soil losses resulting from both sheet and rill erosion at the field scale on the Chinese Loess Plateau. This approach could be applicable to other areas of the world once the model parameters have been suitably calibrated.
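The model efficiency E reported above is the Nash-Sutcliffe coefficient. A minimal implementation of E and RMSE, applied to illustrative (not measured) storm soil-loss values, might look like:

```python
# Nash-Sutcliffe model efficiency and RMSE for storm soil-loss predictions.
import numpy as np

def nash_sutcliffe(obs, pred):
    """E = 1 - sum of squared errors / variance of observations around their mean."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

obs = [1.2, 4.5, 0.8, 9.3, 2.1]   # observed storm soil loss, t/ha (made up)
pred = [1.0, 5.0, 1.1, 8.6, 2.5]  # model predictions (made up)
print(f"E = {nash_sutcliffe(obs, pred):.3f}, RMSE = {rmse(obs, pred):.3f} t/ha")
```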
DOT National Transportation Integrated Search
2016-04-01
Advancements in pavement management practice require evaluating the performance of pavement preservation treatments using performance-related characteristics. However, state highway agencies face the challenge of developing performance-based relation...
Can traits predict individual growth performance? A test in a hyperdiverse tropical forest.
Poorter, Lourens; Castilho, Carolina V; Schietti, Juliana; Oliveira, Rafael S; Costa, Flávia R C
2018-07-01
The functional trait approach has, as a central tenet, that plant traits are functional and shape individual performance, but this has rarely been tested in the field. Here, we tested the individual-based trait approach in a hyperdiverse Amazonian tropical rainforest and evaluated intraspecific variation in trait values, plant strategies at the individual level, and whether traits are functional and predict individual performance. We evaluated > 1300 tree saplings belonging to > 383 species, measured 25 traits related to growth and defense, and evaluated the effects of environmental conditions, plant size, and traits on stem growth. A total of 44% of the trait variation was observed within species, indicating a strong potential for acclimation. Individuals showed two strategy spectra, related to tissue toughness and organ size vs leaf display. In this nutrient- and light-limited forest, traits measured at the individual level were surprisingly poor predictors of individual growth performance because of convergence of traits and growth rates. Functional trait approaches based on individuals or species are conceptually fundamentally different: the species-based approach focuses on the potential and the individual-based approach on the realized traits and growth rates. Counterintuitively, the individual approach leads to a poor prediction of individual performance, although it provides a more realistic view on community dynamics. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.
Huang, Yu-An; You, Zhu-Hong; Chen, Xing; Huang, Zhi-An; Zhang, Shanwen; Yan, Gui-Ying
2017-10-16
Accumulating clinical research has shown that specific microbes with abnormal levels are closely associated with the development of various human diseases. Knowledge of microbe-disease associations can provide valuable insights for understanding complex disease mechanisms as well as for the prevention, diagnosis and treatment of various diseases. However, little effort has been made to predict microbial candidates for human complex diseases on a large scale. In this work, we developed a new computational model for predicting microbe-disease associations by combining two single recommendation methods. Based on the assumption that functionally similar microbes tend to get involved in the mechanisms of similar diseases, we adopted neighbor-based collaborative filtering and a graph-based scoring method to compute the association possibility of microbe-disease pairs. The promising prediction performance could be attributed to the use of a hybrid approach based on two single recommendation methods as well as the introduction of Gaussian kernel-based similarity and symptom-based disease similarity. To evaluate the performance of the proposed model, we implemented leave-one-out and fivefold cross validations on the HMDAD database, which was recently built as the first database collecting experimentally-confirmed microbe-disease associations. As a result, NGRHMDA achieved reliable results with AUCs of 0.9023 ± 0.0031 and 0.9111 in the validation frameworks of fivefold CV and LOOCV. In addition, 78.2% of microbe samples and 66.7% of disease samples were found to be consistent with the basic assumption of our work that microbes tend to get involved in similar disease clusters, and vice versa. Compared with other methods, the prediction results yielded by NGRHMDA demonstrate its effective prediction performance for microbe-disease associations.
It is anticipated that NGRHMDA can be used as a useful tool to identify the most promising microbial candidates for various diseases, and therefore advance medical knowledge and drug development. The codes and dataset of our work can be downloaded from https://github.com/yahuang1991/NGRHMDA .
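One ingredient named above, the Gaussian kernel-based similarity, can be sketched as a Gaussian interaction-profile kernel over a binary microbe-disease association matrix. The matrix below is hypothetical, and the full NGRHMDA pipeline (collaborative filtering plus graph-based scoring) is not reproduced:

```python
# Gaussian interaction-profile kernel similarity between microbes,
# computed from a hypothetical 0/1 association matrix (rows = microbes).
import numpy as np

def gip_kernel(A):
    """K[i, j] = exp(-gamma * ||row_i - row_j||^2), bandwidth set from mean profile norm."""
    norms = np.sum(A ** 2, axis=1)
    gamma = 1.0 / norms.mean()
    sq_dist = norms[:, None] + norms[None, :] - 2 * A @ A.T
    return np.exp(-gamma * sq_dist)

A = np.array([[1, 0, 1, 0],   # microbe-disease associations (made up)
              [1, 0, 1, 1],
              [0, 1, 0, 0]], float)
K = gip_kernel(A)
print(np.round(K, 3))
```

Microbes with overlapping disease profiles (rows 0 and 1) come out more similar than those with disjoint profiles.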
Application of Support Vector Machine to Forex Monitoring
NASA Astrophysics Data System (ADS)
Kamruzzaman, Joarder; Sarker, Ruhul A.
Previous studies have demonstrated superior performance of artificial neural network (ANN)-based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from historical data using six simple technical indicators and presents a comparison with an ANN-based model trained by the scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results for six different currencies against the Australian dollar reveal superior performance of the SVM model using a simple linear kernel over the ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.
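A hedged sketch of such a model: a linear-kernel support vector regressor fed simple moving-average indicators computed from a synthetic exchange-rate series. The paper's six indicators, currency data, and five evaluation metrics are not reproduced:

```python
# Linear-kernel SVR forecasting one step ahead from moving-average indicators
# of a synthetic exchange-rate series (illustrative only).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
rate = 1.5 + np.cumsum(rng.normal(0, 0.01, 400))   # synthetic exchange-rate walk

def moving_avg(x, w):
    return np.convolve(x, np.ones(w) / w, mode="valid")

# Indicators at time t: 5- and 10-period moving averages ending at t;
# the target is the rate one step ahead.
ma5 = moving_avg(rate, 5)[5:]   # trimmed so both averages end at the same t
ma10 = moving_avg(rate, 10)
X = np.column_stack([ma5[:-1], ma10[:-1]])
y = rate[10:]

split = 300
model = SVR(kernel="linear", C=100.0, epsilon=0.001).fit(X[:split], y[:split])
pred = model.predict(X[split:])
mse = float(np.mean((pred - y[split:]) ** 2))
print(f"held-out MSE: {mse:.6f}")
```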
Incorporating groundwater flow into the WEPP model
William Elliot; Erin Brooks; Tim Link; Sue Miller
2010-01-01
The water erosion prediction project (WEPP) model is a physically-based hydrology and erosion model. In recent years, the hydrology prediction within the model has been improved for forest watershed modeling by incorporating shallow lateral flow into watershed runoff prediction. This has greatly improved WEPP's hydrologic performance on small watersheds with...
Single-pass memory system evaluation for multiprogramming workloads
NASA Technical Reports Server (NTRS)
Conte, Thomas M.; Hwu, Wen-Mei W.
1990-01-01
Modern memory systems are composed of levels of cache memories, a virtual memory system, and a backing store. Varying more than a few design parameters and measuring the performance of such systems has traditionally been constrained by the high cost of simulation. Recently introduced models of cache performance reduce the cost of simulation, but at the expense of accuracy of performance prediction. Stack-based methods predict performance accurately using one pass over the trace for all cache sizes, but these techniques have been limited to fully-associative organizations. This paper presents a stack-based method of evaluating the performance of cache memories using a recurrence/conflict model for the miss ratio. Unlike previous work, the performance of realistic cache designs, such as direct-mapped caches, is predicted by the method. The method also includes a new approach to the problem of the effects of multiprogramming. This new technique separates the characteristics of the individual program from those of the workload. The recurrence/conflict method is shown to be practical, general, and powerful by comparing its performance to that of a popular traditional cache simulator. The authors expect that the availability of such a tool will have a large impact on future architectural studies of memory systems.
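The classical one-pass stack technique that this paper generalizes can be sketched as follows: a single pass over the reference trace yields LRU stack distances, from which the miss ratio of every fully-associative LRU cache size follows (toy trace, not the paper's recurrence/conflict model):

```python
# One-pass LRU stack-distance simulation: one traversal of the trace gives
# miss ratios for all fully-associative cache sizes at once.
def stack_distances(trace):
    """Return the LRU stack distance of each reference (None for a cold miss)."""
    stack, dists = [], []
    for block in trace:
        if block in stack:
            d = stack.index(block)   # 0 = most recently used
            stack.pop(d)
            dists.append(d)
        else:
            dists.append(None)       # first touch: cold miss at every size
        stack.insert(0, block)
    return dists

def miss_ratio(dists, size):
    misses = sum(1 for d in dists if d is None or d >= size)
    return misses / len(dists)

trace = ["a", "b", "c", "a", "b", "d", "a", "c"]
d = stack_distances(trace)
for size in (1, 2, 3, 4):
    print(size, round(miss_ratio(d, size), 3))
```

Miss ratio is non-increasing in cache size, the inclusion property that makes the single pass sufficient.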
Proposed evaluation framework for assessing operator performance with multisensor displays
NASA Technical Reports Server (NTRS)
Foyle, David C.
1992-01-01
Despite aggressive work on the development of sensor fusion algorithms and techniques, no formal evaluation procedures have been proposed. Based on existing integration models in the literature, an evaluation framework is developed to assess an operator's ability to use multisensor, or sensor fusion, displays. The proposed evaluation framework for evaluating the operator's ability to use such systems is a normative approach: The operator's performance with the sensor fusion display can be compared to the models' predictions based on the operator's performance when viewing the original sensor displays prior to fusion. This allows for the determination as to when a sensor fusion system leads to: 1) poorer performance than one of the original sensor displays (clearly an undesirable system in which the fused sensor system causes some distortion or interference); 2) better performance than with either single sensor system alone, but at a sub-optimal (compared to the model predictions) level; 3) optimal performance (compared to model predictions); or, 4) super-optimal performance, which may occur if the operator were able to use some highly diagnostic 'emergent features' in the sensor fusion display, which were unavailable in the original sensor displays. An experiment demonstrating the usefulness of the proposed evaluation framework is discussed.
Predicting falls in older adults using the four square step test.
Cleary, Kimberly; Skornyakov, Elena
2017-10-01
The Four Square Step Test (FSST) is a performance-based balance tool involving stepping over four single-point canes placed on the floor in a cross configuration. The purpose of this study was to evaluate properties of the FSST in older adults who lived independently. Forty-five community dwelling older adults provided fall history and completed the FSST, Berg Balance Scale (BBS), Timed Up and Go (TUG), and Tinetti in random order. Future falls were recorded for 12 months following testing. The FSST accurately distinguished between non-fallers and multiple fallers, and the 15-second threshold score accurately distinguished multiple fallers from non-multiple fallers based on fall history. The FSST predicted future falls, and performance on the FSST was significantly correlated with performance on the BBS, TUG, and Tinetti. However, the test is not appropriate for older adults who use walkers. Overall, the FSST is a valid yet underutilized measure of balance performance and fall prediction tool that physical therapists should consider using in ambulatory community dwelling older adults.
Calibration of limited-area ensemble precipitation forecasts for hydrological predictions
NASA Astrophysics Data System (ADS)
Diomede, Tommaso; Marsigli, Chiara; Montani, Andrea; Nerozzi, Fabrizio; Paccagnella, Tiziana
2015-04-01
The main objective of this study is to investigate the impact of calibration for limited-area ensemble precipitation forecasts, to be used for driving discharge predictions up to 5 days in advance. A reforecast dataset, which spans 30 years, based on the Consortium for Small Scale Modeling Limited-Area Ensemble Prediction System (COSMO-LEPS) was used for testing the calibration strategy. Three calibration techniques were applied: quantile-to-quantile mapping, linear regression, and analogs. The performance of these methodologies was evaluated in terms of statistical scores for the precipitation forecasts operationally provided by COSMO-LEPS in the years 2003-2007 over Germany, Switzerland, and the Emilia-Romagna region (northern Italy). The analog-based method seemed preferable because of its capability to correct position errors and spread deficiencies. A suitable spatial domain for the analog search can help to handle model spatial errors as systematic errors. However, the performance of the analog-based method may degrade in cases where only a limited training dataset is available. A sensitivity test on the length of the training dataset over which to perform the analog search was carried out. The quantile-to-quantile mapping and linear regression methods were less effective, mainly because the forecast-analysis relation was not so strong for the available training dataset. A comparison between the calibration based on the deterministic reforecast and the calibration based on the full operational ensemble used as training dataset was considered, with the aim of evaluating whether reforecasts are really worthwhile for calibration, given their considerable computational cost. The verification of the calibration process was then performed by coupling ensemble precipitation forecasts with a distributed rainfall-runoff model.
This test was carried out for a medium-sized catchment located in Emilia-Romagna, showing a beneficial impact of the analog-based method on the reduction of missed events for discharge predictions.
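Of the three calibration techniques compared, quantile-to-quantile mapping is the simplest to sketch: each forecast value is mapped to the observed value at the same empirical quantile of the training period. The gamma-distributed precipitation samples below are toy data, not COSMO-LEPS reforecasts:

```python
# Quantile-to-quantile mapping calibration of precipitation forecasts
# against an observed climatology (toy samples).
import numpy as np

def q2q_map(forecast, train_fcst, train_obs):
    """Map each forecast to the observed value at the same empirical quantile."""
    fc_sorted = np.sort(train_fcst)
    ob_sorted = np.sort(train_obs)
    q = np.searchsorted(fc_sorted, forecast, side="right") / len(fc_sorted)
    q = np.clip(q, 0.0, 1.0)
    return np.quantile(ob_sorted, q)

rng = np.random.default_rng(2)
train_fcst = rng.gamma(2.0, 5.0, 1000)   # model systematically wetter...
train_obs = rng.gamma(2.0, 3.0, 1000)    # ...than the observations (made up)
new_fcst = np.array([5.0, 15.0, 40.0])
cal = q2q_map(new_fcst, train_fcst, train_obs)
print(np.round(cal, 2))
```

Because the synthetic model is biased wet, every calibrated value is pulled below the raw forecast while the ranking of events is preserved.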
Transforming RNA-Seq data to improve the performance of prognostic gene signatures.
Zwiener, Isabella; Frisch, Barbara; Binder, Harald
2014-01-01
Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency, or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques. PMID:24416353
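A rank-based transformation of the kind found to perform well can be sketched as mapping each gene's counts to normal scores via their within-gene ranks. The count matrix below is a toy example, and this is one plausible variant of a rank-based transformation, not necessarily the exact one used in the study:

```python
# Per-gene rank transformation of RNA-Seq counts to normal scores,
# removing the mean-variance dependency before penalized regression.
import numpy as np
from scipy.stats import rankdata, norm

def rank_transform(counts):
    """Map each column (gene) to normal quantiles of its within-gene ranks."""
    n = counts.shape[0]
    out = np.empty(counts.shape, dtype=float)
    for j in range(counts.shape[1]):
        r = rankdata(counts[:, j])         # ranks within gene j, ties averaged
        out[:, j] = norm.ppf(r / (n + 1))  # rank -> standard-normal quantile
    return out

counts = np.array([[0, 10, 500],   # rows = samples, columns = genes (made up)
                   [3, 12, 800],
                   [1, 50, 100],
                   [7,  8, 900]])
Z = rank_transform(counts)
print(np.round(Z, 2))
```

After the transform every gene has the same bounded, symmetric distribution, so no gene is preferentially selected merely for having a large variance.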
Predicting the Performance of Chain Saw Machines Based on Shore Scleroscope Hardness
NASA Astrophysics Data System (ADS)
Tumac, Deniz
2014-03-01
Shore hardness has been used to estimate several physical and mechanical properties of rocks over the last few decades. However, the number of studies correlating Shore hardness with rock cutting performance is quite limited. Likewise, relatively little research has been carried out on predicting the performance of chain saw machines. This study differs from previous investigations in that Shore hardness values (SH1, SH2, and the deformation coefficient) are used to determine the field performance of chain saw machines. The measured Shore hardness values are correlated with the physical and mechanical properties of natural stone samples, with the cutting parameters (normal force, cutting force, and specific energy) obtained from linear cutting tests in unrelieved cutting mode, and with the areal net cutting rate of chain saw machines. Two empirical models developed previously are improved for the prediction of the areal net cutting rate of chain saw machines. The first model is based on a revised chain saw penetration index, which uses SH1, machine weight, and useful arm cutting depth as predictors. The second model is based on the power consumed for only cutting the stone, arm thickness, and specific energy as a function of the deformation coefficient. While cutting force has a strong relationship with Shore hardness values, normal force has a weak or moderate correlation. Uniaxial compressive strength, Cerchar abrasivity index, and density can also be predicted from Shore hardness values.
Assessment of Geometry and In-Flow Effects on Contra-Rotating Open Rotor Broadband Noise Predictions
NASA Technical Reports Server (NTRS)
Zawodny, Nikolas S.; Nark, Douglas M.; Boyd, D. Douglas, Jr.
2015-01-01
Application of previously formulated semi-analytical models for the prediction of broadband noise due to turbulent rotor wake interactions and rotor blade trailing edges is performed on the historical baseline F31/A31 contra-rotating open rotor configuration. Simplified two-dimensional blade element analysis is performed on cambered NACA 4-digit airfoil profiles, which are meant to serve as substitutes for the actual rotor blade sectional geometries. Rotor in-flow effects such as induced axial and tangential velocities are incorporated into the noise prediction models based on supporting computational fluid dynamics (CFD) results and simplified in-flow velocity models. Emphasis is placed on the development of simplified rotor in-flow models for the purpose of performing accurate noise predictions independent of CFD information. The broadband predictions are found to compare favorably with experimental acoustic results.
Predicting links based on knowledge dissemination in complex network
NASA Astrophysics Data System (ADS)
Zhou, Wen; Jia, Yifan
2017-04-01
Link prediction is the task of mining the missing links in networks or predicting the next vertex pair to be connected by a link. Many link prediction methods have been inspired by the evolutionary processes of networks. In this paper, a new mechanism for the formation of complex networks called knowledge dissemination (KD) is proposed, under the assumption that knowledge disseminates through the paths of a network. Accordingly, a new link prediction method, knowledge dissemination based link prediction (KDLP), is proposed to test KD. KDLP characterizes vertex similarity based on knowledge quantity (KQ), which measures the importance of a vertex through its H-index. Extensive numerical simulations on six real-world networks demonstrate that KDLP is a strong link prediction method which achieves higher prediction accuracy than four well-known similarity measures: common neighbors, the local path index, average commute time, and the matrix forest index. Furthermore, based on the common conclusion that an excellent link prediction method reveals a good evolving mechanism, the experimental results suggest that KD is a plausible network evolving mechanism for the formation of complex networks.
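The H-index ingredient of KQ can be sketched on a toy graph: a vertex's H-index here is the largest h such that it has at least h neighbours of degree at least h (the full KDLP path-based scoring is not shown, and the graph is made up):

```python
# H-index of each vertex computed from its neighbours' degrees (toy network).
def h_index(values):
    """Largest h such that at least h of the values are >= h."""
    vals = sorted(values, reverse=True)
    h = 0
    for i, v in enumerate(vals, start=1):
        if v >= i:
            h = i
        else:
            break
    return h

adj = {                       # small undirected toy network
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
    "e": {"a"},
}
deg = {v: len(nb) for v, nb in adj.items()}
kq = {v: h_index(deg[u] for u in nb) for v, nb in adj.items()}
print(kq)
```

The peripheral vertex "e" gets the lowest score, matching the intuition that little knowledge flows through it.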
NASA Astrophysics Data System (ADS)
Müller, M. F.; Thompson, S. E.
2015-09-01
The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.
Predicting Visual Distraction Using Driving Performance Data
Kircher, Katja; Ahlstrom, Christer
2010-01-01
Behavioral variables are often used as performance indicators (PIs) of visual or internal distraction induced by secondary tasks. The objective of this study is to investigate whether visual distraction can be predicted by driving performance PIs in a naturalistic setting. Visual distraction is here defined by a gaze-based real-time distraction detection algorithm called AttenD. Seven drivers used an instrumented vehicle for one month each in a small-scale field operational test. For each of the visual distraction events detected by AttenD, seven PIs such as steering wheel reversal rate and throttle hold were calculated. Corresponding data were also calculated for time periods during which the drivers were classified as attentive. For each PI, means between distracted and attentive states were compared using t-tests for different time-window sizes (2-40 s), and the window width with the smallest resulting p-value was selected as optimal. Based on the optimized PIs, logistic regression was used to predict whether the drivers were attentive or distracted. The logistic regression resulted in predictions which were 76% correct (sensitivity = 77% and specificity = 76%). The conclusion is that there is a relationship between behavioral variables and visual distraction, but the relationship is not strong enough to accurately predict visual driver distraction. Instead, behavioral PIs are probably best suited as complements to eye-tracking-based algorithms in order to make them more accurate and robust. PMID:21050615
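The final classification step described above can be sketched with synthetic data: logistic regression separating attentive from distracted windows using two made-up behavioural indicators whose means shift under distraction (the AttenD labels and the seven real PIs are not reproduced):

```python
# Logistic regression predicting a distraction label from two synthetic
# behavioural performance indicators.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 400
distracted = rng.integers(0, 2, n)   # 1 = distracted (stand-in for AttenD label)
# Two made-up PIs whose means shift under distraction.
reversal_rate = rng.normal(4.0 + 1.5 * distracted, 1.0, n)
throttle_hold = rng.normal(0.3 + 0.2 * distracted, 0.1, n)
X = np.column_stack([reversal_rate, throttle_hold])

X_tr, X_te, y_tr, y_te = train_test_split(X, distracted, test_size=0.25,
                                          random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
pred = clf.predict(X_te)
sens = np.mean(pred[y_te == 1] == 1)   # sensitivity: distracted detected
spec = np.mean(pred[y_te == 0] == 0)   # specificity: attentive detected
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}")
```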
Performance of genomic prediction within and across generations in maritime pine.
Bartholomé, Jérôme; Van Heerwaarden, Joost; Isik, Fikret; Boury, Christophe; Vidal, Marjorie; Plomion, Christophe; Bouffier, Laurent
2016-08-11
Genomic selection (GS) is a promising approach for decreasing breeding cycle length in forest trees. Assessment of progeny performance and of the prediction accuracy of GS models over generations is therefore a key issue. A reference population of maritime pine (Pinus pinaster) with an estimated effective inbreeding population size (status number) of 25 was first selected with simulated data. This reference population (n = 818) covered three generations (G0, G1 and G2) and was genotyped with 4436 single-nucleotide polymorphism (SNP) markers. We evaluated the effects on prediction accuracy of both the relatedness between the calibration and validation sets and validation on the basis of progeny performance. Pedigree-based (best linear unbiased prediction, ABLUP) and marker-based (genomic BLUP and Bayesian LASSO) models were used to predict breeding values for three different traits: circumference, height and stem straightness. On average, the ABLUP model outperformed genomic prediction models, with a maximum difference in prediction accuracies of 0.12, depending on the trait and the validation method. A mean difference in prediction accuracy of 0.17 was found between validation methods differing in terms of relatedness. Including the progenitors in the calibration set reduced this difference in prediction accuracy to 0.03. When only genotypes from the G0 and G1 generations were used in the calibration set and genotypes from G2 were used in the validation set (progeny validation), prediction accuracies ranged from 0.70 to 0.85. This study suggests that the training of prediction models on parental populations can predict the genetic merit of the progeny with high accuracy: an encouraging result for the implementation of GS in the maritime pine breeding program.
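Marker-based genomic prediction in the spirit of GBLUP can be sketched as ridge regression on SNP codes, with a training/validation split standing in for the calibration and progeny sets. The genotypes and the additive trait below are simulated, not the maritime pine data:

```python
# Ridge regression on SNP genotypes as a stand-in for marker-based (GBLUP-like)
# genomic prediction; accuracy is the correlation of predicted and true values.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(4)
n, p = 200, 500
X = rng.binomial(2, 0.5, (n, p)).astype(float)       # SNP genotypes coded 0/1/2
beta = rng.normal(0, 1, p) * (rng.random(p) < 0.05)  # ~5% of SNPs affect the trait
y = X @ beta + rng.normal(0, 1.0, n)                 # phenotype = genetics + noise

train, test = slice(0, 150), slice(150, None)        # calibration vs validation set
model = Ridge(alpha=10.0).fit(X[train], y[train])
acc = np.corrcoef(model.predict(X[test]), y[test])[0, 1]
print(f"prediction accuracy (r): {acc:.2f}")
```

In the study the split follows the pedigree (G0/G1 calibrate, G2 validate); the random split here is only a structural stand-in.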
PrePhyloPro: phylogenetic profile-based prediction of whole proteome linkages
Niu, Yulong; Liu, Chengcheng; Moghimyfiroozabad, Shayan; Yang, Yi
2017-01-01
Direct and indirect functional links between proteins as well as their interactions as part of larger protein complexes or common signaling pathways may be predicted by analyzing the correlation of their evolutionary patterns. Based on phylogenetic profiling, here we present a highly scalable and time-efficient computational framework for predicting linkages within the whole human proteome. We have validated this method through analysis of 3,697 human pathways and molecular complexes and a comparison of our results with the prediction outcomes of previously published co-occurrency model-based and normalization methods. Here we also introduce PrePhyloPro, a web-based software that uses our method for accurately predicting proteome-wide linkages. We present data on interactions of human mitochondrial proteins, verifying the performance of this software. PrePhyloPro is freely available at http://prephylopro.org/phyloprofile/. PMID:28875072
Weighted hybrid technique for recommender system
NASA Astrophysics Data System (ADS)
Suriati, S.; Dwiastuti, Meisyarah; Tulus, T.
2017-12-01
Recommender systems have become very popular and play an important role in information systems and web pages nowadays. A recommender system tries to predict which items a user may like based on his or her activity on the system. There are some familiar techniques for building a recommender system, such as content-based filtering and collaborative filtering. Content-based filtering does not involve opinions from other users to make the prediction, while collaborative filtering does, so collaborative filtering can predict more accurately. However, collaborative filtering cannot give predictions for items which have never been rated by any user. In order to cover the drawbacks of each approach with the advantages of the other, both approaches can be combined in what is known as a hybrid technique. The hybrid technique used in this work is the weighted technique, in which the prediction score is a linear combination of the scores produced by the combined techniques. The purpose of this work is to show how a weighted hybrid technique combining content-based filtering and item-based collaborative filtering can work in a movie recommender system, and to compare the performance when both approaches are combined with the performance of each approach working alone. Three experiments were done in this work, combining both techniques with different parameters. The results show that the weighted hybrid technique does not substantially boost performance, but it does provide prediction scores for unrated movies that are impossible to recommend using collaborative filtering alone.
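The weighted combination described above reduces to a few lines; the weights and the cold-start fallback below are illustrative, not the values tuned in the paper:

```python
# Weighted hybrid score: a linear combination of a content-based score and an
# item-based collaborative score, falling back to content for never-rated items.
def hybrid_score(content_score, collab_score, w_content=0.4, w_collab=0.6):
    if collab_score is None:      # cold-start item: no ratings to collaborate on
        return content_score
    return w_content * content_score + w_collab * collab_score

print(hybrid_score(0.8, 0.5))     # both scores available: weighted combination
print(hybrid_score(0.8, None))    # unrated movie: content-based score only
```

This is exactly how the hybrid covers collaborative filtering's cold-start gap: the content-based score still ranks items no user has rated.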
ERIC Educational Resources Information Center
Goldhaber, Dan; Cowan, James; Theobald, Roddy
2016-01-01
We use longitudinal data from Washington State to provide estimates of the extent to which performance on the edTPA, a performance-based, subject-specific assessment of teacher candidates, is predictive of the likelihood of employment in the teacher workforce and value-added measures of teacher effectiveness. While edTPA scores are highly…
USDA-ARS?s Scientific Manuscript database
The calculation of a thermal based Crop Water Stress Index (CWSI) requires an estimate of canopy temperature under non-water stressed conditions. The objective of this study was to assess the influence of different wine grape cultivars on the performance of models that predict canopy temperature non...
Gaillard, Jean-Michel; Lemaître, Jean-François
2017-12-01
Williams' evolutionary theory of senescence based on antagonistic pleiotropy has become a landmark in evolutionary biology, and more recently in biogerontology and evolutionary medicine. In his original article, Williams launched a set of nine "testable deductions" from his theory. Although some of these predictions have been repeatedly discussed, most have been overlooked and no systematic evaluation of the whole set of Williams' original predictions has been performed. For the sixtieth anniversary of the publication of the Williams article, we provide an updated evaluation of all these predictions. We present the pros and cons of each prediction based on the recent accumulation of both theoretical and empirical studies performed in the laboratory and in the wild. From our viewpoint, six predictions are mostly supported by our current knowledge at least under some conditions (although Williams' theory cannot thoroughly explain why for some of them). Three predictions, all involving the timing of senescence, are not supported. Our critical review of Williams' predictions highlights the importance of Williams' contribution and clearly demonstrates that, 60 years after its publication, his article does not show any sign of senescence. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.
Turbine blade forced response prediction using FREPS
NASA Technical Reports Server (NTRS)
Murthy, Durbha, V.; Morel, Michael R.
1993-01-01
This paper describes a software system called FREPS (Forced REsponse Prediction System) that integrates structural dynamic, steady and unsteady aerodynamic analyses to efficiently predict the forced response dynamic stresses in axial-flow turbomachinery blades due to aerodynamic and mechanical excitations. A flutter analysis capability is also incorporated into the system. The FREPS system performs aeroelastic analysis by modeling the motion of the blade in terms of its normal modes. The structural dynamic analysis is performed by a finite element code such as MSC/NASTRAN. The steady aerodynamic analysis is based on nonlinear potential theory, and the unsteady aerodynamic analysis is based on a linearization about the nonuniform potential mean flow. The program description and a presentation of its capabilities are reported herein. The effectiveness of the FREPS package is demonstrated on the High Pressure Oxygen Turbopump turbine of the Space Shuttle Main Engine. Both flutter and forced response analyses are performed and typical results are illustrated.
Weighted bi-prediction for light field image coding
NASA Astrophysics Data System (ADS)
Conti, Caroline; Nunes, Paulo; Ducla Soares, Luís.
2017-09-01
Light field imaging based on a single-tier camera equipped with a microlens array - also known as integral, holoscopic, and plenoptic imaging - has recently emerged as a practical and promising approach for future visual applications and services. However, successfully deploying actual light field imaging applications and services will require developing adequate coding solutions to efficiently handle the massive amount of data involved in these systems. In this context, self-similarity compensated prediction is a non-local spatial prediction scheme based on block matching that has been shown to achieve high efficiency for light field image coding based on the High Efficiency Video Coding (HEVC) standard. As previously shown by the authors, this is possible by simply averaging two predictor blocks that are jointly estimated from a causal search window in the current frame itself, referred to as self-similarity bi-prediction. However, theoretical analyses of motion-compensated bi-prediction have suggested that it is still possible to achieve further rate-distortion performance improvements by adaptively estimating the weighting coefficients of the two predictor blocks. Therefore, this paper presents a comprehensive study of the rate-distortion performance of HEVC-based light field image coding when using different sets of weighting coefficients for self-similarity bi-prediction. Experimental results demonstrate that it is possible to extend the previous theoretical conclusions to light field image coding and show that the proposed adaptive weighting coefficient selection leads to bit savings of up to 5% compared to the previous self-similarity bi-prediction scheme.
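The weighted combination of two predictor blocks can be sketched as follows. This is an illustrative sketch only: the candidate weight pairs and the SSD selection criterion are assumptions for demonstration, not the paper's actual HEVC integration.

```python
import numpy as np

def weighted_biprediction(p0, p1, w0, w1):
    """Combine two predictor blocks with explicit weights (w0 + w1 = 1)."""
    return w0 * p0 + w1 * p1

def best_weights(target, p0, p1,
                 candidates=((0.5, 0.5), (0.25, 0.75), (0.75, 0.25))):
    """Pick the weight pair minimizing SSD against the block being coded.

    Plain averaging corresponds to the fixed pair (0.5, 0.5); adaptive
    selection searches a small candidate set per block.
    """
    return min(candidates,
               key=lambda w: np.sum(
                   (target - weighted_biprediction(p0, p1, *w)) ** 2))
```

For a block closer to the second predictor, the search picks an asymmetric pair such as (0.25, 0.75) instead of the plain average, which is the rate-distortion gain the abstract describes.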
Caffo, Brian; Diener-West, Marie; Punjabi, Naresh M.; Samet, Jonathan
2010-01-01
This manuscript considers a data-mining approach for the prediction of mild obstructive sleep disordered breathing, defined as an elevated respiratory disturbance index (RDI), in 5,530 participants in a community-based study, the Sleep Heart Health Study. The prediction algorithm was built using modern ensemble learning algorithms, specifically boosting, which allowed for assessing potential high-dimensional interactions between predictor variables or classifiers. To evaluate the performance of the algorithm, the data were split into training and validation sets for varying thresholds for predicting the probability of a high RDI (≥ 7 events per hour in the given results). Based on a moderate classification threshold from the boosting algorithm, the estimated post-test odds of a high RDI were 2.20 times higher than the pre-test odds given a positive test, while the corresponding post-test odds were decreased by 52% given a negative test (sensitivity and specificity of 0.66 and 0.70, respectively). In rank order, the following variables had the largest impact on prediction performance: neck circumference, body mass index, age, snoring frequency, waist circumference, and snoring loudness. Citation: Caffo B; Diener-West M; Punjabi NM; Samet J. A novel approach to prediction of mild obstructive sleep disordered breathing in a population-based sample: the Sleep Heart Health Study. SLEEP 2010;33(12):1641-1648. PMID:21120126
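The reported post-test odds follow from the standard likelihood-ratio identities; a quick check using the sensitivity and specificity quoted in the abstract:

```python
# Post-test odds = pre-test odds x likelihood ratio.
sens, spec = 0.66, 0.70

lr_pos = sens / (1 - spec)   # positive test: odds multiplied by ~2.2
lr_neg = (1 - sens) / spec   # negative test: odds multiplied by ~0.49,
                             # i.e. roughly the ~52% decrease reported
```

The quoted 2.20-fold increase is exactly 0.66 / (1 - 0.70); the ~52% decrease corresponds (up to rounding) to 1 - 0.34/0.70.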
An exponential filter model predicts lightness illusions
Zeman, Astrid; Brooks, Kevin R.; Ghebreab, Sennay
2015-01-01
Lightness, or perceived reflectance of a surface, is influenced by surrounding context. This is demonstrated by the Simultaneous Contrast Illusion (SCI), where a gray patch is perceived lighter against a black background and vice versa. Conversely, assimilation is where the lightness of the target patch moves toward that of the bounding areas and can be demonstrated in White's effect. Blakeslee and McCourt (1999) introduced an oriented difference-of-Gaussian (ODOG) model that is able to account for both contrast and assimilation in a number of lightness illusions and that has been subsequently improved using localized normalization techniques. We introduce a model inspired by image statistics that is based on a family of exponential filters, with kernels spanning across multiple sizes and shapes. We include an optional second stage of normalization based on contrast gain control. Our model was tested on a well-known set of lightness illusions that have previously been used to evaluate ODOG and its variants, and model lightness values were compared with typical human data. We investigate whether predictive success depends on filters of a particular size or shape and whether pooling information across filters can improve performance. The best single filter correctly predicted the direction of lightness effects for 21 out of 27 illusions. Combining two filters together increased the best performance to 23, with asymptotic performance at 24 for an arbitrarily large combination of filter outputs. While normalization improved prediction magnitudes, it only slightly improved overall scores in direction predictions. The prediction performance of 24 out of 27 illusions equals that of the best performing ODOG variant, with greater parsimony. Our model shows that V1-style orientation-selectivity is not necessary to account for lightness illusions and that a low-level model based on image statistics is able to account for a wide range of both contrast and assimilation effects. 
PMID:26157381
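A minimal 1-D sketch of this filtering approach reproduces simultaneous contrast. The kernel size, scale, and patch-minus-filtered-surround readout below are illustrative assumptions, not the authors' exact multi-filter model.

```python
import numpy as np

def exp_kernel(size, scale):
    """Normalized 1-D exponential kernel exp(-|x|/scale)."""
    x = np.arange(size) - size // 2
    k = np.exp(-np.abs(x) / scale)
    return k / k.sum()

def lightness_shift(image, patch_slice, size=21, scale=4.0):
    """Read out the patch's shift relative to its filtered surround."""
    local_mean = np.convolve(image, exp_kernel(size, scale), mode="same")
    return np.mean(image[patch_slice] - local_mean[patch_slice])

# Same gray patch on a black vs. a white background (SCI stimulus, 1-D).
gray = 0.5
on_black = np.zeros(101); on_black[45:56] = gray
on_white = np.ones(101);  on_white[45:56] = gray

shift_black = lightness_shift(on_black, slice(45, 56))  # positive: lighter
shift_white = lightness_shift(on_white, slice(45, 56))  # negative: darker
```

The sign of the shift matches the illusion's direction: the patch on black reads lighter, the patch on white darker, with no orientation-selective filtering involved.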
NASA Technical Reports Server (NTRS)
Hoppa, Mary Ann; Wilson, Larry W.
1994-01-01
There are many software reliability models which try to predict future performance of software based on data generated by the debugging process. Our research has shown that by improving the quality of the data one can greatly improve the predictions. We are working on methodologies which control some of the randomness inherent in the standard data generation processes in order to improve the accuracy of predictions. Our contribution is twofold in that we describe an experimental methodology using a data structure called the debugging graph and apply this methodology to assess the robustness of existing models. The debugging graph is used to analyze the effects of various fault recovery orders on the predictive accuracy of several well-known software reliability algorithms. We found that, along a particular debugging path in the graph, the predictive performance of different models can vary greatly. Similarly, just because a model 'fits' a given path's data well does not guarantee that the model would perform well on a different path. Further we observed bug interactions and noted their potential effects on the predictive process. We saw that not only do different faults fail at different rates, but that those rates can be affected by the particular debugging stage at which the rates are evaluated. Based on our experiment, we conjecture that the accuracy of a reliability prediction is affected by the fault recovery order as well as by fault interaction.
Progressive Dictionary Learning with Hierarchical Predictive Structure for Scalable Video Coding.
Dai, Wenrui; Shen, Yangmei; Xiong, Hongkai; Jiang, Xiaoqian; Zou, Junni; Taubman, David
2017-04-12
Dictionary learning has emerged as a promising alternative to the conventional hybrid coding framework. However, the rigid structure of sequential training and prediction degrades its performance in scalable video coding. This paper proposes a progressive dictionary learning framework with a hierarchical predictive structure for scalable video coding, especially in the low-bitrate region. For pyramidal layers, sparse representation based on a spatio-temporal dictionary is adopted to improve the coding efficiency of enhancement layers (ELs) with a guarantee of reconstruction performance. The overcomplete dictionary is trained to adaptively capture local structures along motion trajectories as well as exploit the correlations between neighboring layers of resolutions. Furthermore, progressive dictionary learning is developed to enable scalability in the temporal domain and restrict error propagation in a closed-loop predictor. Under the hierarchical predictive structure, online learning is leveraged to guarantee the training and prediction performance with an improved convergence rate. For compatibility with the state-of-the-art scalable extension of H.264/AVC and the latest HEVC, standardized codec cores are utilized to encode the base and enhancement layers. Experimental results show that the proposed method outperforms the latest SHVC and HEVC simulcast over extensive test sequences with various resolutions.
Survival Regression Modeling Strategies in CVD Prediction.
Barkhordari, Mahnaz; Padyab, Mojgan; Sardarinia, Mahsa; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza
2016-04-01
A fundamental part of prevention is prediction. Potential predictors are the sine qua non of prediction models. However, whether incorporating novel predictors into prediction models can be directly translated into added predictive value remains an area of dispute. The difference between the predictive power of a predictive model with (enhanced model) and without (baseline model) a certain predictor is generally regarded as an indicator of the predictive value added by that predictor. Indices such as discrimination and calibration have long been used in this regard. Recently, the use of added predictive value has been suggested when comparing the predictive performances of models with and without novel biomarkers. User-friendly statistical software capable of implementing novel statistical procedures is conspicuously lacking. This shortcoming has restricted implementation of such novel model assessment methods. We aimed to construct Stata commands to help researchers obtain the aforementioned statistical indices: (1) the Nam-D'Agostino χ² goodness-of-fit test; (2) cut-point-free and cut-point-based net reclassification improvement index (NRI), relative and absolute integrated discrimination improvement index (IDI), and survival-based regression analyses. We applied the commands to real data on women participating in the Tehran lipid and glucose study (TLGS) to examine whether information on a family history of premature cardiovascular disease (CVD), waist circumference, and fasting plasma glucose can improve the predictive performance of Framingham's general CVD risk algorithm. The command is adpredsurv for survival models. Herein we have described the Stata package "adpredsurv" for calculation of the Nam-D'Agostino χ² goodness-of-fit test as well as cut-point-free and cut-point-based NRI, relative and absolute IDI, and survival-based regression analyses.
We hope this work encourages the use of novel methods in examining predictive capacity of the emerging plethora of novel biomarkers.
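For reference, the cut-point-free (continuous) NRI that such commands compute can be sketched in a few lines. This is a generic implementation of the standard definition, not the adpredsurv source.

```python
def continuous_nri(base_risk, new_risk, event):
    """Cut-point-free NRI: net upward reclassification among events plus
    net downward reclassification among non-events (range -2 to +2)."""
    ups_e = downs_e = ups_ne = downs_ne = 0
    n_e = sum(event)
    n_ne = len(event) - n_e
    for b, n, e in zip(base_risk, new_risk, event):
        if n > b:        # enhanced model raised the predicted risk
            if e: ups_e += 1
            else: ups_ne += 1
        elif n < b:      # enhanced model lowered the predicted risk
            if e: downs_e += 1
            else: downs_ne += 1
    return (ups_e - downs_e) / n_e + (downs_ne - ups_ne) / n_ne
```

A positive value means the enhanced model tends to move events up and non-events down in predicted risk, which is the "added predictive value" the abstract discusses.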
Rehan, Waqas; Fischer, Stefan; Rehan, Maaz
2016-09-12
Wireless sensor networks (WSNs) have become more and more diversified and are today able to also support high data rate applications, such as multimedia. In this case, per-packet channel handshaking/switching may result in inducing additional overheads, such as energy consumption, delays and, therefore, data loss. One of the solutions is to perform stream-based channel allocation where channel handshaking is performed once before transmitting the whole data stream. Deciding stream-based channel allocation is more critical in case of multichannel WSNs where channels of different quality/stability are available and the wish for high performance requires sensor nodes to switch to the best among the available channels. In this work, we will focus on devising mechanisms that perform channel quality/stability estimation in order to improve the accommodation of stream-based communication in multichannel wireless sensor networks. For performing channel quality assessment, we have formulated a composite metric, which we call channel rank measurement (CRM), that can demarcate channels into good, intermediate and bad quality on the basis of the standard deviation of the received signal strength indicator (RSSI) and the average of the link quality indicator (LQI) of the received packets. CRM is then used to generate a data set for training a supervised machine learning-based algorithm (which we call Normal Equation based Channel quality prediction (NEC) algorithm) in such a way that it may perform instantaneous channel rank estimation of any channel. Subsequently, two robust extensions of the NEC algorithm are proposed (which we call Normal Equation based Weighted Moving Average Channel quality prediction (NEWMAC) algorithm and Normal Equation based Aggregate Maturity Criteria with Beta Tracking based Channel weight prediction (NEAMCBTC) algorithm), that can perform channel quality estimation on the basis of both current and past values of channel rank estimation. 
In the end, simulations are made using MATLAB, and the results show that the Extended version of NEAMCBTC algorithm (Ext-NEAMCBTC) outperforms the compared techniques in terms of channel quality and stability assessment. It also minimizes channel switching overheads (in terms of switching delays and energy consumption) for accommodating stream-based communication in multichannel WSNs.
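A rough sketch of the two ingredients described above: a composite channel rank built from RSSI variability and mean LQI, and a normal-equation fit predicting that rank from raw features. The 0.5 weighting and the 0-255 LQI scale are assumptions for illustration, not the paper's calibrated values.

```python
import numpy as np

def channel_rank(rssi_samples, lqi_samples, w_stability=0.5):
    """Higher is better: high mean LQI, low RSSI standard deviation."""
    stability = 1.0 / (1.0 + np.std(rssi_samples))
    quality = np.mean(lqi_samples) / 255.0  # LQI commonly reported in 0..255
    return w_stability * stability + (1 - w_stability) * quality

def normal_equation_fit(X, y):
    """Closed-form least squares: theta = (X^T X)^-1 X^T y, with a bias
    column prepended (solved via np.linalg.solve for stability)."""
    Xb = np.column_stack([np.ones(len(X)), np.asarray(X)])
    return np.linalg.solve(Xb.T @ Xb, Xb.T @ y)
```

Once fitted on (features, rank) pairs generated from CRM, the regression can rank a previously unseen channel instantly, which is the role the NEC algorithm plays before its moving-average and beta-tracking extensions.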
Wind Prediction Accuracy for Air Traffic Management Decision Support Tools
NASA Technical Reports Server (NTRS)
Cole, Rod; Green, Steve; Jardin, Matt; Schwartz, Barry; Benjamin, Stan
2000-01-01
The performance of Air Traffic Management and flight deck decision support tools depends in large part on the accuracy of the supporting 4D trajectory predictions. This is particularly relevant to conflict prediction and active advisories for the resolution of conflicts and conformance with traffic-flow management flow-rate constraints (e.g., arrival metering / required time of arrival). Flight test results have indicated that wind prediction errors may represent the largest source of trajectory prediction error. The tests also discovered that relatively large errors (e.g., greater than 20 knots), existing in pockets of space and time critical to ATM DST performance (one or more sectors, greater than 20 minutes), are inadequately represented by the classic RMS aggregate prediction-accuracy studies of the past. To facilitate the identification and reduction of DST-critical wind-prediction errors, NASA has led a collaborative research and development activity with MIT Lincoln Laboratories and the Forecast Systems Lab of the National Oceanographic and Atmospheric Administration (NOAA). This activity, begun in 1996, has focused on the development of key metrics for ATM DST performance, assessment of wind-prediction skill for state-of-the-art systems, and development/validation of system enhancements to improve skill. A 13-month study was conducted for the Denver Center airspace in 1997. Two complementary wind-prediction systems were analyzed and compared to the forecast performance of the then-standard 60 km Rapid Update Cycle - version 1 (RUC-1). One system, developed by NOAA, was the prototype 40-km RUC-2 that became operational at NCEP in 1999. RUC-2 introduced a faster cycle (1 hr vs. 3 hr) and improved mesoscale physics. The second system, Augmented Winds (AW), is a prototype en route wind application developed by MITLL based on the Integrated Terminal Wind System (ITWS).
AW is run at a local facility (Center) level, and updates RUC predictions based on an optimal interpolation of the latest ACARS reports since the RUC run. This paper presents an overview of the study's results, including the identification and use of new large-error wind-prediction accuracy metrics that are key to ATM DST performance.
Big Data Toolsets to Pharmacometrics: Application of Machine Learning for Time‐to‐Event Analysis
Gong, Xiajing; Hu, Meng
2018-01-01
Additional value can potentially be created by applying big data tools to address pharmacometric problems. The performances of machine learning (ML) methods and the Cox regression model were evaluated based on simulated time-to-event data synthesized under various preset scenarios, i.e., with linear vs. nonlinear and dependent vs. independent predictors in the proportional hazard function, or with high-dimensional data featured by a large number of predictor variables. Our results showed that ML-based methods outperformed the Cox model in prediction performance as assessed by concordance index and in identifying the preset influential variables for high-dimensional data. The prediction performances of ML-based methods are also less sensitive to data size and censoring rates than the Cox regression model. In conclusion, ML-based methods provide a powerful tool for time-to-event analysis, with a built-in capacity for high-dimensional data and better performance when the predictor variables assume nonlinear relationships in the hazard function. PMID:29536640
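The concordance index used to compare the models can be sketched directly from its definition (Harrell's C): over comparable pairs, count how often the model assigns the higher risk to the subject who fails first. This generic O(n²) sketch is for illustration, not the evaluation code used in the study.

```python
def concordance_index(times, events, risk):
    """Harrell's C for right-censored data: a pair (i, j) is comparable
    when subject i has an observed event strictly before time j."""
    concordant = comparable = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1       # higher risk failed first
                elif risk[i] == risk[j]:
                    concordant += 0.5     # tie counts half
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect risk ordering, which is the scale on which the ML methods outperformed the Cox model.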
Meertens, Linda J E; van Montfort, Pim; Scheepers, Hubertina C J; van Kuijk, Sander M J; Aardenburg, Robert; Langenveld, Josje; van Dooren, Ivo M A; Zwaan, Iris M; Spaanderman, Marc E A; Smits, Luc J M
2018-04-17
Prediction models may contribute to personalized risk-based management of women at high risk of spontaneous preterm delivery. Although prediction models are published frequently, often with promising results, external validation is generally lacking. We performed a systematic review of prediction models for the risk of spontaneous preterm birth based on routine clinical parameters. Additionally, we externally validated and evaluated the clinical potential of the models. Prediction models based on routinely collected maternal parameters obtainable during the first 16 weeks of gestation were eligible for selection. Risk of bias was assessed according to the CHARMS guidelines. We validated the selected models in a Dutch multicenter prospective cohort study comprising 2614 unselected pregnant women. Information on predictors was obtained by a web-based questionnaire. Predictive performance of the models was quantified by the area under the receiver operating characteristic curve (AUC) and calibration plots for the outcomes spontaneous preterm birth <37 weeks and <34 weeks of gestation. Clinical value was evaluated by means of decision curve analysis and by calculating classification accuracy for different risk thresholds. Four studies describing five prediction models fulfilled the eligibility criteria. Risk of bias assessment revealed a moderate to high risk of bias in three studies. The AUC of the models ranged from 0.54 to 0.67 and from 0.56 to 0.70 for the outcomes spontaneous preterm birth <37 weeks and <34 weeks of gestation, respectively. A subanalysis showed that the models discriminated poorly (AUC 0.51-0.56) for nulliparous women. Although we recalibrated the models, two models retained evidence of overfitting. The decision curve analysis showed low clinical benefit for the best performing models. This review revealed several reporting and methodological shortcomings of published prediction models for spontaneous preterm birth.
Our external validation study indicated that none of the models had the ability to predict spontaneous preterm birth adequately in our population. Further improvement of prediction models, using recent knowledge about both model development and potential risk factors, is necessary to provide an added value in personalized risk assessment of spontaneous preterm birth. © 2018 The Authors Acta Obstetricia et Gynecologica Scandinavica published by John Wiley & Sons Ltd on behalf of Nordic Federation of Societies of Obstetrics and Gynecology (NFOG).
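The net-benefit calculation behind a decision curve can be sketched from its standard definition: at threshold probability pt, a true positive is worth 1 and a false positive costs pt/(1-pt). The risk values and outcomes below are illustrative.

```python
def net_benefit(risk, outcome, pt):
    """Net benefit of treating everyone with predicted risk >= pt."""
    n = len(outcome)
    treated = [r >= pt for r in risk]
    tp = sum(1 for t, o in zip(treated, outcome) if t and o)
    fp = sum(1 for t, o in zip(treated, outcome) if t and not o)
    return tp / n - (fp / n) * pt / (1 - pt)
```

Plotting net benefit against a range of thresholds, for each model and for the treat-all and treat-none strategies, gives the decision curve; "low clinical benefit" means the model curves barely rise above those default strategies.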
Peterson, Lenna X.; Kim, Hyungrae; Esquivel-Rodriguez, Juan; Roy, Amitava; Han, Xusi; Shin, Woong-Hee; Zhang, Jian; Terashi, Genki; Lee, Matt; Kihara, Daisuke
2016-01-01
We report the performance of protein-protein docking predictions by our group for recent rounds of the Critical Assessment of Prediction of Interactions (CAPRI), a community-wide assessment of state-of-the-art docking methods. Our prediction procedure uses a protein-protein docking program named LZerD developed in our group. LZerD represents a protein surface with 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. The appropriate soft representation of protein surface with 3DZD makes the method more tolerant to conformational change of proteins upon docking, which adds an advantage for unbound docking. Docking was guided by interface residue prediction performed with BindML and cons-PPISP as well as literature information when available. The generated docking models were ranked by a combination of scoring functions, including PRESCO, which evaluates the native-likeness of residues’ spatial environments in structure models. First, we discuss the overall performance of our group in the CAPRI prediction rounds and investigate the reasons for unsuccessful cases. Then, we examine the performance of several knowledge-based scoring functions and their combinations for ranking docking models. It was found that the quality of a pool of docking models generated by LZerD, i.e. whether or not the pool includes near-native models, can be predicted by the correlation of multiple scores. Although the current analysis used docking models generated by LZerD, findings on scoring functions are expected to be universally applicable to other docking methods. PMID:27654025
Ehret, A; Hochstuhl, D; Krattenmacher, N; Tetens, J; Klein, M S; Gronwald, W; Thaller, G
2015-01-01
Subclinical ketosis is one of the most prevalent metabolic disorders in high-producing dairy cows during early lactation. This renders its early detection and prevention important for both economical and animal-welfare reasons. Construction of reliable predictive models is challenging, because traits like ketosis are commonly affected by multiple factors. In this context, machine learning methods offer great advantages because of their universal learning ability and flexibility in integrating various sorts of data. Here, an artificial-neural-network approach was applied to investigate the utility of metabolic, genetic, and milk performance data for the prediction of milk levels of β-hydroxybutyrate within and across consecutive weeks postpartum. Data were collected from 218 dairy cows during their first 5 wk in milk. All animals were genotyped with a 50,000 SNP panel, and weekly information on the concentrations of the milk metabolites glycerophosphocholine and phosphocholine as well as milk composition data (milk yield, fat and protein percentage) was available. The concentration of β-hydroxybutyric acid in milk was used as the target variable in all prediction models. Average correlations between observed and predicted target values up to 0.643 could be obtained if milk metabolite and routine milk recording data were combined for prediction at the same day within weeks. Predictive performance of metabolic as well as milk performance-based models was higher than that of models based on genetic information. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Zhao, Ping; Pan, Yuzhuo; Wagner, Christian
2017-01-01
A comprehensive search of the literature and published US Food and Drug Administration reviews was conducted to assess whether physiologically based pharmacokinetic (PBPK) modeling could be prospectively used to predict the clinical food effect on oral drug absorption. Among the 48 resulting food effect predictions, ~50% were predicted within 1.25-fold of observed, and 75% within 2-fold. Dissolution rate and precipitation time were commonly optimized parameters when PBPK modeling was not able to capture the food effect. The current work presents a knowledge base for documenting PBPK experience in predicting food effect. PMID:29168611
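The fold-error summary used here can be sketched in a few lines: a prediction is "within k-fold" when max(pred/obs, obs/pred) ≤ k. The predicted/observed pairs below are made up purely to illustrate the bookkeeping.

```python
def fold_error(pred, obs):
    """Symmetric fold error: 1.0 is a perfect prediction."""
    return max(pred / obs, obs / pred)

def fraction_within(pairs, k):
    """Fraction of (pred, obs) pairs within k-fold of observed."""
    return sum(fold_error(p, o) <= k for p, o in pairs) / len(pairs)

pairs = [(1.1, 1.0), (2.4, 1.0), (0.9, 1.0), (1.0, 1.3)]  # illustrative only
```

With these illustrative pairs, fraction_within(pairs, 1.25) gives 0.5 and fraction_within(pairs, 2.0) gives 0.75, mirroring the 50%/75% style of summary reported in the abstract.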
Bouktif, Salah; Hanna, Eileen Marie; Zaki, Nazar; Abu Khousa, Eman
2014-01-01
Prediction and classification techniques have been well studied by machine learning researchers and developed for several real-world problems. However, the level of acceptance and success of prediction models is still below expectation due to difficulties such as the low performance of prediction models when they are applied in different environments. Such a problem has been addressed by many researchers, mainly from the machine learning community. A second problem, principally raised by model users in different communities, such as managers, economists, engineers, biologists, and medical practitioners, is the prediction models' interpretability. The latter is the ability of a model to explain its predictions and exhibit the causality relationships between the inputs and the outputs. In the case of classification, a successful way to alleviate the low performance is to use ensemble classifiers. It is an intuitive strategy that promotes collaboration among different classifiers to achieve better performance than any individual classifier. Unfortunately, ensemble classifier methods do not take into account the interpretability of the final classification outcome. They even worsen the original interpretability of the individual classifiers. In this paper we propose a novel implementation of the classifier combination approach that not only promotes overall performance but also preserves the interpretability of the resulting model. We propose a solution based on Ant Colony Optimization and tailored for the case of Bayesian classifiers. We validate our proposed solution with case studies from the medical domain, namely heart disease and cardiotocography-based predictions, problems where interpretability is critical to making appropriate clinical decisions. The datasets, prediction models, and software tool, together with supplementary materials, are available at http://faculty.uaeu.ac.ae/salahb/ACO4BC.htm.
Changes in Predictive Task Switching with Age and with Cognitive Load.
Levy-Tzedek, Shelly
2017-01-01
Predictive control of movement is more efficient than feedback-based control, and is an important skill in everyday life. We tested whether the ability to predictively control movements of the upper arm is affected by age and by cognitive load. A total of 63 participants were tested in two experiments. In both experiments participants were seated, and controlled a cursor on a computer screen by flexing and extending their dominant arm. In Experiment 1, 20 young adults and 20 older adults were asked to continuously change the frequency of their horizontal arm movements, with the goal of inducing an abrupt switch between discrete movements (at low frequencies) and rhythmic movements (at high frequencies). We tested whether that change was performed based on a feed-forward (predictive) or on a feedback (reactive) control. In Experiment 2, 23 young adults performed the same task, while being exposed to a cognitive load half of the time via a serial subtraction task. We found that both aging and cognitive load diminished, on average, the ability of participants to predictively control their movements. Five older adults and one young adult under a cognitive load were not able to perform the switch between rhythmic and discrete movement (or vice versa). In Experiment 1, 40% of the older participants were able to predictively control their movements, compared with 70% in the young group. In Experiment 2, 48% of the participants were able to predictively control their movements with a cognitively loading task, compared with 70% in the no-load condition. The ability to predictively change a motor plan in anticipation of upcoming changes may be an important component in performing everyday functions, such as safe driving and avoiding falls.
Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework
2014-01-01
Motivation: Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. Results: We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set. PMID:24646119
Hu, Meng; Müller, Erik; Schymanski, Emma L; Ruttkies, Christoph; Schulze, Tobias; Brack, Werner; Krauss, Martin
2018-03-01
In nontarget screening, structure elucidation of small molecules from high resolution mass spectrometry (HRMS) data is challenging, particularly the selection of the most likely candidate structure among the many retrieved from compound databases. Several fragmentation and retention prediction methods have been developed to improve this candidate selection. In order to evaluate their performance, we compared two in silico fragmenters (MetFrag and CFM-ID) and two retention time prediction models (based on the chromatographic hydrophobicity index (CHI) and on log D). A set of 78 known organic micropollutants was analyzed by liquid chromatography coupled to a LTQ Orbitrap HRMS with electrospray ionization (ESI) in positive and negative mode using two fragmentation techniques with different collision energies. Both fragmenters (MetFrag and CFM-ID) performed well for most compounds, ranking the correct candidate structure on average within the top 25% for ESI+ mode and within the top 22 to 37% for ESI- mode. The rank of the correct candidate structure slightly improved when MetFrag and CFM-ID were combined. For unknown compounds detected in both ESI+ and ESI-, positive mode mass spectra were generally better for further structure elucidation. Both retention prediction models performed reasonably well for more hydrophobic compounds but not for early eluting hydrophilic substances. The log D prediction showed better accuracy than the CHI model. Although the two fragmentation prediction methods are more diagnostic and sensitive for candidate selection, the inclusion of retention prediction by calculating a consensus score with optimized weighting can improve the ranking of correct candidates as compared to the individual methods. Graphical abstract: Consensus workflow for combining fragmentation and retention prediction in LC-HRMS-based micropollutant identification.
Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework.
Simha, Ramanuja; Shatkay, Hagit
2014-03-19
Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Xuejun; Tang, Qiuhong; Liu, Xingcai
Real-time monitoring and predicting drought development several months in advance is of critical importance for drought risk adaptation and mitigation. In this paper, we present a drought monitoring and seasonal forecasting framework based on the Variable Infiltration Capacity (VIC) hydrologic model over Southwest China (SW). The satellite precipitation data are used to force the VIC model for near real-time estimates of land surface hydrologic conditions. As initialized with satellite-aided monitoring, the climate model-based forecast (CFSv2_VIC) and ensemble streamflow prediction (ESP)-based forecast (ESP_VIC) are both performed and evaluated through their ability in reproducing the evolution of the 2009/2010 severe drought over SW. The results show that the satellite-aided monitoring is able to provide a reasonable estimate of forecast initial conditions (ICs) in a real-time manner. Both CFSv2_VIC and ESP_VIC exhibit comparable performance against the observation-based estimates for the first month, whereas the predictive skill largely drops beyond 1 month. Compared to ESP_VIC, CFSv2_VIC shows better performance as indicated by the smaller ensemble range. This study highlights the value of this operational framework in generating near real-time ICs and giving a reliable prediction 1 month ahead, which has great implications for drought risk assessment, preparation and relief.
The Predictive Value of Ultrasound Learning Curves Across Simulated and Clinical Settings.
Madsen, Mette E; Nørgaard, Lone N; Tabor, Ann; Konge, Lars; Ringsted, Charlotte; Tolsgaard, Martin G
2017-01-01
The aim of the study was to explore whether learning curves on a virtual-reality (VR) sonographic simulator can be used to predict subsequent learning curves on a physical mannequin and learning curves during clinical training. Twenty midwives completed a simulation-based training program in transvaginal sonography. The training was conducted on a VR simulator as well as on a physical mannequin. A subgroup of 6 participants underwent subsequent clinical training. During each of the 3 steps, the participants' performance was assessed using instruments with established validity evidence, and they advanced to the next level only after attaining predefined levels of performance. The number of repetitions and time needed to achieve predefined performance levels were recorded along with the performance scores in each setting. Finally, the outcomes were correlated across settings. A good correlation was found between time needed to achieve predefined performance levels on the VR simulator and the physical mannequin (Pearson correlation coefficient .78; P < .001). Performance scores on the VR simulator correlated well to the clinical performance scores (Pearson correlation coefficient .81; P = .049). No significant correlations were found between numbers of attempts needed to reach proficiency across the 3 different settings. A post hoc analysis found that the 50% fastest trainees at reaching proficiency during simulation-based training received higher clinical performance scores compared to trainees with scores placing them among the 50% slowest (P = .025). Performances during simulation-based sonography training may predict performance in related tasks and subsequent clinical learning curves. © 2016 by the American Institute of Ultrasound in Medicine.
Further Investigation of Receding Horizion-Based Controllers and Neural Network-Based Systems
NASA Technical Reports Server (NTRS)
Kelkar, Atul G.; Haley, Pamela J. (Technical Monitor)
2000-01-01
This report provides a comprehensive summary of the research work performed over the entire duration of the co-operative research agreement between NASA Langley Research Center and Kansas State University. This summary briefly lists the findings and also suggests possible future directions for the continuation of the subject research in the area of Generalized Predictive Control (GPC) and Network Based Generalized Predictive Control (NGPC).
An Adaptive Handover Prediction Scheme for Seamless Mobility Based Wireless Networks
Safa Sadiq, Ali; Fisal, Norsheila Binti; Ghafoor, Kayhan Zrar; Lloret, Jaime
2014-01-01
We propose an adaptive handover prediction (AHP) scheme for seamless mobility based wireless networks. That is, the AHP scheme incorporates fuzzy logic with AP prediction process in order to lend cognitive capability to handover decision making. Selection metrics, including received signal strength, mobile node relative direction towards the access points in the vicinity, and access point load, are collected and considered inputs of the fuzzy decision making system in order to select the best preferable AP around WLANs. The obtained handover decision which is based on the calculated quality cost using fuzzy inference system is also based on adaptable coefficients instead of fixed coefficients. In other words, the mean and the standard deviation of the normalized network prediction metrics of fuzzy inference system, which are collected from available WLANs are obtained adaptively. Accordingly, they are applied as statistical information to adjust or adapt the coefficients of membership functions. In addition, we propose an adjustable weight vector concept for input metrics in order to cope with the continuous, unpredictable variation in their membership degrees. Furthermore, handover decisions are performed in each MN independently after knowing RSS, direction toward APs, and AP load. Finally, performance evaluation of the proposed scheme shows its superiority compared with representatives of the prediction approaches. PMID:25574490
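The adaptive-coefficient idea above (deriving memberships from the mean and standard deviation of the currently observed metrics rather than from fixed bounds) can be sketched as follows. This is a minimal illustration with assumed logistic memberships, weights, and AP values, not the paper's fuzzy inference system:

```python
import math

def adaptive_memberships(values):
    """Map raw metric readings to [0, 1] memberships using the mean and
    standard deviation of the currently observed APs (adaptive, not fixed)."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5 or 1.0
    # Logistic squashing of the z-score: coefficients track metric drift.
    return [1.0 / (1.0 + math.exp(-(v - mean) / std)) for v in values]

def select_ap(aps, weights):
    """aps: {name: (rss_dbm, direction_alignment, load)}. Higher RSS and
    alignment are better; higher load is penalized. Returns the best AP."""
    names = list(aps)
    rss = adaptive_memberships([aps[n][0] for n in names])
    direction = adaptive_memberships([aps[n][1] for n in names])
    load = adaptive_memberships([aps[n][2] for n in names])
    w_rss, w_dir, w_load = weights
    scores = {n: w_rss * rss[i] + w_dir * direction[i] - w_load * load[i]
              for i, n in enumerate(names)}
    return max(scores, key=scores.get)

# Hypothetical snapshot of three candidate APs.
aps = {"AP1": (-40, 0.9, 0.8), "AP2": (-60, 0.5, 0.2), "AP3": (-75, 0.1, 0.9)}
best = select_ap(aps, (0.5, 0.3, 0.2))
```

Here the weight vector is fixed for simplicity; the paper additionally adapts the weights online as the membership degrees vary.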
Nielsen, Anne; Hansen, Mikkel Bo; Tietze, Anna; Mouridsen, Kim
2018-06-01
Treatment options for patients with acute ischemic stroke depend on the volume of salvageable tissue. This volume assessment is currently based on fixed thresholds and single imaging modalities, limiting accuracy. We wish to develop and validate a predictive model capable of automatically identifying and combining acute imaging features to accurately predict final lesion volume. Using acute magnetic resonance imaging, we developed and trained a deep convolutional neural network (CNN_deep) to predict final imaging outcome. A total of 222 patients were included, of which 187 were treated with rtPA (recombinant tissue-type plasminogen activator). The performance of CNN_deep was compared with a shallow CNN based on the perfusion-weighted imaging biomarker Tmax (CNN_Tmax), a shallow CNN based on a combination of 9 different biomarkers (CNN_shallow), a generalized linear model, and thresholding of the diffusion-weighted imaging biomarker apparent diffusion coefficient (ADC) at 600×10⁻⁶ mm²/s (ADC_thres). To assess whether CNN_deep is capable of differentiating outcomes with and without intravenous rtPA, patients not receiving intravenous rtPA were included to train CNN_deep,-rtPA to assess a treatment effect. The networks' performances were evaluated using visual inspection, area under the receiver operating characteristic curve (AUC), and contrast. CNN_deep yields significantly better performance in predicting final outcome (AUC=0.88±0.12) than the generalized linear model (AUC=0.78±0.12; P=0.005), CNN_Tmax (AUC=0.72±0.14; P<0.003), and ADC_thres (AUC=0.66±0.13; P<0.0001), and a substantially better performance than CNN_shallow (AUC=0.85±0.11; P=0.063). Measured by contrast, CNN_deep improves the predictions significantly, showing superiority to all other methods (P≤0.003). CNN_deep also seems to be able to differentiate outcomes based on treatment strategy, with the volume of final infarct being significantly different (P=0.048).
The considerable improvement in prediction accuracy over the current state of the art increases the potential for automated decision support in providing recommendations for personalized treatment plans. © 2018 American Heart Association, Inc.
Li, Yanpeng; Li, Xiang; Wang, Hongqiang; Chen, Yiping; Zhuang, Zhaowen; Cheng, Yongqiang; Deng, Bin; Wang, Liandong; Zeng, Yonghu; Gao, Lei
2014-01-01
This paper offers a compacted mechanism to carry out the performance evaluation work for an automatic target recognition (ATR) system: (a) a standard description of the ATR system's output is suggested, a quantity to indicate the operating condition is presented based on the principle of feature extraction in pattern recognition, and a series of indexes to assess the output in different aspects are developed with the application of statistics; (b) performance of the ATR system is interpreted by a quality factor based on knowledge of engineering mathematics; (c) through a novel utility called “context-probability” estimation proposed based on probability, performance prediction for an ATR system is realized. The simulation result shows that the performance of an ATR system can be accounted for and forecasted by the above-mentioned measures. Compared to existing technologies, the novel method can offer more objective performance conclusions for an ATR system. These conclusions may be helpful in knowing the practical capability of the tested ATR system. At the same time, the generalization performance of the proposed method is good. PMID:24967605
Chen, Fu; Sun, Huiyong; Wang, Junmei; Zhu, Feng; Liu, Hui; Wang, Zhe; Lei, Tailong; Li, Youyong; Hou, Tingjun
2018-06-21
Molecular docking provides a computationally efficient way to predict the atomic structural details of protein-RNA interactions (PRI), but accurate prediction of the three-dimensional structures and binding affinities for PRI is still notoriously difficult, partly due to the unreliability of the existing scoring functions for PRI. MM/PBSA and MM/GBSA are more theoretically rigorous than most scoring functions for protein-RNA docking, but their prediction performance for protein-RNA systems remains unclear. Here, we systematically evaluated the capability of MM/PBSA and MM/GBSA to predict the binding affinities and recognize the near-native binding structures for protein-RNA systems with different solvent models and interior dielectric constants (ϵ_in). For predicting the binding affinities, the predictions given by MM/GBSA based on the minimized structures in explicit solvent and the GB^GBn1 model with ϵ_in = 2 yielded the highest correlation with the experimental data. Moreover, the MM/GBSA calculations based on the minimized structures in implicit solvent and the GB^GBn1 model distinguished the near-native binding structures within the top 10 decoys for 118 out of the 149 protein-RNA systems (79.2%). This performance is better than all docking scoring functions studied here. Therefore, the MM/GBSA rescoring is an efficient way to improve the prediction capability of scoring functions for protein-RNA systems. Published by Cold Spring Harbor Laboratory Press for the RNA Society.
The Development of a Handbook for Astrobee F Performance and Stability Analysis
NASA Technical Reports Server (NTRS)
Wolf, R. S.
1982-01-01
An Astrobee F performance and stability analysis is presented, for use by the NASA Sounding Rocket Division. The performance analysis provides information regarding altitude, Mach number, dynamic pressure, and velocity as functions of time since launch. It is found that payload weight has the greatest effect on performance, and performance prediction accuracy was calculated to remain within 1%. In addition, to assure sufficient flight stability, a predicted rigid-body static margin of at least 8% of the total vehicle length is required. Finally, fin cant angle predictions are given in order to achieve a 2.5 cycle per second burnout roll rate, based on obtaining 75% of the steady roll rate. It is noted that this method can be used by flight performance engineers to create a similar handbook for any sounding rocket series.
Janssen, Daniël M C; van Kuijk, Sander M J; d'Aumerie, Boudewijn B; Willems, Paul C
2018-05-16
A prediction model for surgical site infection (SSI) after spine surgery was developed in 2014 by Lee et al. This model was developed to compute an individual estimate of the probability of SSI after spine surgery based on the patient's comorbidity profile and invasiveness of surgery. Before any prediction model can be validly implemented in daily medical practice, it should be externally validated to assess how the prediction model performs in patients sampled independently from the derivation cohort. We included 898 consecutive patients who underwent instrumented thoracolumbar spine surgery. Overall performance was quantified using Nagelkerke's R² statistic, and discriminative ability was quantified as the area under the receiver operating characteristic curve (AUC). We computed the slope of the calibration plot to judge prediction accuracy. Sixty patients developed an SSI. The overall performance of the prediction model in our population was poor: Nagelkerke's R² was 0.01. The AUC was 0.61 (95% confidence interval (CI) 0.54-0.68). The estimated slope of the calibration plot was 0.52. The previously published prediction model showed poor performance in our academic external validation cohort. To predict SSI after instrumented thoracolumbar spine surgery for the present population, a better fitting prediction model should be developed.
Enhanced thermoelectric performance of graphene nanoribbon-based devices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hossain, Md Sharafat, E-mail: hossain@student.unimelb.edu.au; Huynh, Duc Hau; Nguyen, Phuong Duc
There have been numerous theoretical studies on exciting thermoelectric properties of graphene nano-ribbons (GNRs); however, most of these studies are mainly based on simulations. In this work, we measure and characterize the thermoelectric properties of GNRs and compare the results with theoretical predictions. Our experimental results verify that nano-structuring and patterning graphene into nano-ribbons significantly enhance its thermoelectric power, confirming previous predictions. Although patterning results in lower conductance (G), the overall power factor (S²G) increases for nanoribbons. We demonstrate that edge roughness plays an important role in achieving such an enhanced performance and support it through first principles simulations. We show that uncontrolled edge roughness, which is considered detrimental in GNR-based electronic devices, leads to enhanced thermoelectric performance of GNR-based thermoelectric devices. The result validates previously reported theoretical studies of GNRs and demonstrates the potential of GNRs for the realization of highly efficient thermoelectric devices.
Probability-based collaborative filtering model for predicting gene-disease associations.
Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan
2017-12-28
Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene-disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expenses. We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data of humans and data of other nonhuman species, are integrated in our model. Firstly, on the basis of a typical latent factorization model, we propose model I with an average heterogeneous regularization. Secondly, we develop modified model II with personal heterogeneous regularization to enhance the accuracy of the aforementioned models. In this model, vector space similarity or Pearson correlation coefficient metrics and data on related species are also used. We compared the results of PCFM with the results of four state-of-the-art approaches. The results show that PCFM performs better than other advanced approaches. The PCFM can be leveraged for predictions of disease genes, especially for new human genes or diseases with no known relationships.
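A latent-factorization model of the kind PCFM builds on can be sketched as below. This is a generic sketch under simplifying assumptions (a uniform regularization term standing in for the paper's heterogeneous regularizations, and toy data), not the authors' implementation:

```python
import math
import random

def train_latent_factors(pairs, n_genes, n_diseases, k=4, lr=0.05,
                         reg=0.02, epochs=500, seed=0):
    """Fit gene/disease latent vectors so that sigmoid(g . d) approximates
    the observed association label. `reg` is a uniform stand-in for the
    paper's heterogeneous regularization terms."""
    rnd = random.Random(seed)
    sig = lambda x: 1.0 / (1.0 + math.exp(-x))
    G = [[rnd.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_genes)]
    D = [[rnd.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_diseases)]
    for _ in range(epochs):
        for g, d, y in pairs:
            p = sig(sum(a * b for a, b in zip(G[g], D[d])))
            err = y - p                      # gradient of the log-loss
            for f in range(k):
                gf, df = G[g][f], D[d][f]
                G[g][f] += lr * (err * df - reg * gf)
                D[d][f] += lr * (err * gf - reg * df)
    # Return a scoring function: predicted association probability.
    return lambda g, d: sig(sum(a * b for a, b in zip(G[g], D[d])))

# Toy data: gene 0 is associated with disease 0, gene 1 with disease 1.
pairs = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
score = train_latent_factors(pairs, n_genes=2, n_diseases=2)
```

After training, `score(g, d)` ranks candidate gene-disease pairs; known associated pairs should score higher than non-associated ones.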
A novel artificial neural network method for biomedical prediction based on matrix pseudo-inversion.
Cai, Binghuang; Jiang, Xia
2014-04-01
Biomedical prediction based on clinical and genome-wide data has become increasingly important in disease diagnosis and classification. To solve the prediction problem in an effective manner for the improvement of clinical care, we develop a novel Artificial Neural Network (ANN) method based on Matrix Pseudo-Inversion (MPI) for use in biomedical applications. The MPI-ANN is constructed as a three-layer (i.e., input, hidden, and output layers) feed-forward neural network, and the weights connecting the hidden and output layers are directly determined based on MPI without a lengthy learning iteration. The LASSO (Least Absolute Shrinkage and Selection Operator) method is also presented for comparative purposes. Single Nucleotide Polymorphism (SNP) simulated data and real breast cancer data are employed to validate the performance of the MPI-ANN method via 5-fold cross validation. Experimental results demonstrate the efficacy of the developed MPI-ANN for disease classification and prediction, in view of the significantly superior accuracy (i.e., the rate of correct predictions), as compared with LASSO. The results based on the real breast cancer data also show that the MPI-ANN has better performance than other machine learning methods (including support vector machine (SVM), logistic regression (LR), and an iterative ANN). In addition, experiments demonstrate that our MPI-ANN could be used for bio-marker selection as well. Copyright © 2013 Elsevier Inc. All rights reserved.
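The core trick, solving the hidden-to-output weights in closed form with the Moore-Penrose pseudo-inverse instead of a lengthy learning iteration, can be sketched as follows; the random input-layer weights, tanh activation, and toy XOR data are illustrative assumptions:

```python
import numpy as np

def fit_mpi_ann(X, Y, n_hidden=10, seed=0):
    """Three-layer feed-forward net: random input->hidden weights, then the
    hidden->output weights are computed in one step via the Moore-Penrose
    pseudo-inverse (np.linalg.pinv), i.e. a least-squares fit."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W_in + b)          # hidden-layer activations
    W_out = np.linalg.pinv(H) @ Y      # direct least-squares output weights
    def predict(Xnew):
        return np.tanh(Xnew @ W_in + b) @ W_out
    return predict

# Toy binary classification: label is the XOR of two binary features.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0.0], [1.0], [1.0], [0.0]])
predict = fit_mpi_ann(X, Y)
labels = (predict(X) > 0.5).astype(int).ravel()
```

Because the output weights come from a single pseudo-inverse solve, training cost is one linear-algebra operation rather than many gradient iterations.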
Bias-adjusted satellite-based rainfall estimates for predicting floods: Narayani Basin
Shrestha, M.S.; Artan, G.A.; Bajracharya, S.R.; Gautam, D.K.; Tokar, S.A.
2011-01-01
In Nepal, as the spatial distribution of rain gauges is not sufficient to provide a detailed perspective on the highly varied spatial nature of rainfall, satellite-based rainfall estimates provide the opportunity for timely estimation. This paper presents the flood prediction of Narayani Basin at the Devghat hydrometric station (32,000 km²) using bias-adjusted satellite rainfall estimates and the Geospatial Stream Flow Model (GeoSFM), a spatially distributed, physically based hydrologic model. The GeoSFM with gridded gauge observed rainfall inputs using kriging interpolation from 2003 was used for calibration and 2004 for validation to simulate stream flow, with both having a Nash-Sutcliffe Efficiency of above 0.7. With the National Oceanic and Atmospheric Administration Climate Prediction Centre's rainfall estimates (CPC-RFE2.0), using the same calibrated parameters, for 2003 the model performance deteriorated but improved after recalibration with CPC-RFE2.0, indicating the need to recalibrate the model with satellite-based rainfall estimates. Adjusting the CPC-RFE2.0 by a seasonal, monthly and 7-day moving average ratio, improvement in model performance was achieved. Furthermore, a new gauge-satellite merged rainfall estimate obtained from ingestion of local rain gauge data resulted in significant improvement in flood predictability. The results indicate the applicability of satellite-based rainfall estimates in flood prediction with appropriate bias correction. © 2011 The Authors. Journal of Flood Risk Management © 2011 The Chartered Institution of Water and Environmental Management.
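A moving-average ratio adjustment of the kind described can be sketched as follows; the trailing 7-day window and toy rainfall series are assumptions for illustration, not the paper's exact scheme:

```python
def moving_ratio_bias_adjust(gauge, satellite, window=7):
    """Scale each satellite estimate by the ratio of gauge to satellite
    totals over a trailing window (a moving-average ratio correction)."""
    adjusted = []
    for t in range(len(satellite)):
        lo = max(0, t - window + 1)
        g_sum = sum(gauge[lo:t + 1])
        s_sum = sum(satellite[lo:t + 1])
        # If the satellite saw no rain in the window, leave the value as-is.
        ratio = g_sum / s_sum if s_sum > 0 else 1.0
        adjusted.append(satellite[t] * ratio)
    return adjusted

# Toy series: satellite consistently underestimates gauge rainfall by half.
gauge = [10.0, 12.0, 8.0, 0.0, 6.0, 14.0, 10.0, 9.0]
sat = [5.0, 6.0, 4.0, 0.0, 3.0, 7.0, 5.0, 4.5]
adj = moving_ratio_bias_adjust(gauge, sat)
```

With a constant multiplicative bias, the trailing ratio recovers the gauge values exactly; in practice the bias varies, which is why the paper also applies seasonal and monthly ratios.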
Predicting missing links in complex networks based on common neighbors and distance
Yang, Jinxuan; Zhang, Xiao-Dong
2016-01-01
Algorithms based on the common neighbors metric to predict missing links in complex networks are very popular, but most of these algorithms do not account for missing links between nodes with no common neighbors. In some cases these methods are not accurate enough to reconstruct networks, especially when node pairs have few common neighbors. We propose in this paper a new algorithm based on common neighbors and distance to improve the accuracy of link prediction. Our proposed algorithm is remarkably effective in predicting missing links between nodes with no common neighbors and performs better than most existing currently used methods for a variety of real-world networks, without increasing complexity. PMID:27905526
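One plausible way to combine the two signals (not necessarily the authors' exact index) is a common-neighbour count plus an inverse shortest-path term, so that node pairs with no common neighbours still receive a nonzero, distance-ranked score:

```python
from collections import deque

def shortest_path_len(adj, u, v):
    """BFS shortest-path length between u and v (None if disconnected)."""
    seen, queue = {u}, deque([(u, 0)])
    while queue:
        node, d = queue.popleft()
        if node == v:
            return d
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, d + 1))
    return None

def cn_distance_score(adj, u, v):
    """Hypothetical combined index: common-neighbour count plus an
    inverse-distance term for pairs with no common neighbours."""
    cn = len(adj[u] & adj[v])
    d = shortest_path_len(adj, u, v)
    return cn + (1.0 / d if d else 0.0)

# Path graph 0-1-2-3-4, adjacency as sets.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
score_02 = cn_distance_score(adj, 0, 2)  # one common neighbour, distance 2
score_04 = cn_distance_score(adj, 0, 4)  # no common neighbours, distance 4
```

A pure common-neighbours index would score both (0, 3) and (0, 4) as zero; the distance term breaks that tie in favour of the closer pair.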
Laser-Based Trespassing Prediction in Restrictive Environments: A Linear Approach
Cheein, Fernando Auat; Scaglia, Gustavo
2012-01-01
Stationary range laser sensors for intruder monitoring, restricted space violation detection and workspace determination are extensively used in risky environments. In this work we present a linear-based approach for predicting the presence of moving agents before they trespass a laser-based restricted space. Our approach is based on the Taylor series expansion of the detected objects' movements. The latter makes our proposal suitable for embedded applications. In the experimental results (carried out in different scenarios) presented herein, our proposal shows 100% effectiveness in predicting trespassing situations. Several implementation results and a statistical analysis showing the performance of our proposal are included in this work.
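A second-order Taylor extrapolation of an agent's next position from recent range observations can be sketched as follows; the finite-difference derivatives, unit time step, and boundary test are illustrative assumptions, not the paper's exact formulation:

```python
def predict_position(p0, p1, p2, dt=1.0):
    """Second-order Taylor extrapolation of the next position from the
    three most recent observations p0 (oldest) .. p2 (newest)."""
    v = (p2 - p1) / dt                   # finite-difference velocity
    a = (p2 - 2 * p1 + p0) / dt ** 2     # finite-difference acceleration
    return p2 + v * dt + 0.5 * a * dt ** 2

def will_trespass(track, boundary, dt=1.0):
    """True if the extrapolated next position crosses the restricted line
    (here modeled as positions at or below `boundary`)."""
    p0, p1, p2 = track[-3:]
    return predict_position(p0, p1, p2, dt) <= boundary

# Agent approaching a restricted boundary at x = 0 at constant speed,
# versus an agent retreating from it.
alarm_in = will_trespass([5.0, 3.0, 1.0], 0.0)    # predicted x = -1
alarm_out = will_trespass([1.0, 3.0, 5.0], 0.0)   # predicted x = 7
```

Because the predictor is linear in the observations, it maps naturally onto the fixed-point arithmetic of embedded targets, which is the suitability the abstract claims.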
NASA Astrophysics Data System (ADS)
Zhao, Yan; Yang, Zijiang; Gao, Song; Liu, Jinbiao
2018-02-01
Automatic generation control (AGC) is a key technology to maintain the real-time balance between power generation and load, and to ensure the quality of power supply. Power grids require each power generation unit to have a satisfactory AGC performance, as specified in two detailed rules. The two rules provide a set of indices to measure the AGC performance of a power generation unit. However, the commonly-used method to calculate these indices is based on particular data samples from AGC responses and can lead to incorrect results in practice. This paper proposes a new method to estimate the AGC performance indices via system identification techniques. In addition, a nonlinear regression model between performance indices and load command is built in order to predict the AGC performance indices. The effectiveness of the proposed method is validated through industrial case studies.
Carr, Sandra E; Celenza, Antonio; Mercer, Annette M; Lake, Fiona; Puddey, Ian B
2018-01-21
Predicting workplace performance of junior doctors from before entry or during medical school is difficult, and limited evidence is available. This study explored the association between selected predictor variables and workplace based performance in junior doctors during their first postgraduate year. Two cohorts of medical students (n = 200) from one university in Western Australia participated in the longitudinal study. Pearson correlation coefficients and multivariate analyses utilizing linear regression were used to assess the relationships between performance on the Junior Doctor Assessment Tool (JDAT) and its sub-components with demographic characteristics, selection scores for medical school entry, emotional intelligence, and undergraduate academic performance. Grade Point Average (GPA) at the completion of undergraduate studies had the most significant association with better performance on the overall JDAT and each subscale. Increased age was a negative predictor for junior doctor performance on the Clinical management subscale, and understanding emotion was a predictor for the JDAT Communication subscale. Secondary school performance, measured by the Tertiary Entry Rank score on entry to medical school, predicted GPA but not junior doctor performance. The GPA, as a composite measure of ability and performance in medical school, is associated with junior doctor assessment scores. Using this variable to identify students at risk of difficulty could assist planning for appropriate supervision, support, and training for medical graduates transitioning to the workplace.
Driver's mental workload prediction model based on physiological indices.
Yan, Shengyuan; Tran, Cong Chi; Wei, Yingying; Habiyaremye, Jean Luc
2017-09-15
Developing an early warning model to predict the driver's mental workload (MWL) is critical and helpful, especially for new or less experienced drivers. The present study aims to investigate the correlation between new drivers' MWL and their work performance, regarding the number of errors. Additionally, the group method of data handling is used to establish the driver's MWL predictive model based on subjective rating (NASA task load index [NASA-TLX]) and six physiological indices. The results indicate that the NASA-TLX and the number of errors are positively correlated, and the predictive model shows the validity of the proposed model with an R² value of 0.745. The proposed model is expected to provide a reference value for the new drivers' MWL from the measured physiological indices, and driving lesson plans can be proposed to sustain an appropriate MWL as well as improve the driver's work performance.
NASA Astrophysics Data System (ADS)
Yao, Bing; Yang, Hui
2016-12-01
This paper presents a novel physics-driven spatiotemporal regularization (STRE) method for high-dimensional predictive modeling in complex healthcare systems. This model not only captures the physics-based interrelationship between time-varying explanatory and response variables that are distributed in the space, but also addresses the spatial and temporal regularizations to improve the prediction performance. The STRE model is implemented to predict the time-varying distribution of electric potentials on the heart surface based on the electrocardiogram (ECG) data from the distributed sensor network placed on the body surface. The model performance is evaluated and validated in both a simulated two-sphere geometry and a realistic torso-heart geometry. Experimental results show that the STRE model significantly outperforms other regularization models that are widely used in current practice such as Tikhonov zero-order, Tikhonov first-order and L1 first-order regularization methods.
ERIC Educational Resources Information Center
Espin, Christine A.; Busch, Todd W.; Lembke, Erica S.; Hampton, David D.; Seo, Kyounghee; Zukowski, Beth A.
2013-01-01
The technical adequacy of curriculum-based measures in the form of short and simple vocabulary-matching probes to predict students' performance and progress in science at the secondary level was investigated. Participants were 198 seventh-grade students from 10 science classrooms. Curriculum-based measurements (CBM) were 5-min vocabulary-matching…
Manikandan, Narayanan; Subha, Srinivasan
2016-01-01
Software development life cycle has been characterized by destructive disconnects between activities like planning, analysis, design, and programming. In particular, software developed around prediction-based results is always a big challenge for designers. Time series data forecasting, such as currency exchange, stock prices, and weather reports, is an area where extensive research has been going on for the last three decades. In the early days, problems of financial analysis and prediction were solved by statistical models and methods. Over the last two decades, a large number of Artificial Neural Network based learning models have been proposed to solve the problems of financial data and obtain accurate results in predicting future trends and prices. This paper addresses some architectural design related issues for performance improvement through vectorising the strengths of multivariate econometric time series models and Artificial Neural Networks. It provides an adaptive approach for predicting exchange rates, which can be called a hybrid methodology for exchange rate prediction. This framework is tested for the accuracy and performance of the parallel algorithms used.
Link prediction in multiplex online social networks
NASA Astrophysics Data System (ADS)
Jalili, Mahdi; Orouskhani, Yasin; Asgari, Milad; Alipourfard, Nazanin; Perc, Matjaž
2017-02-01
Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications such as recommending new items to users, friendship suggestion and discovering spurious connections. Many real social networks evolve the connections in multiple layers (e.g. multiple social networking platforms). In this article, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider social networks of the same users in these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in Foursquare network. Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the networks are not highly correlated in the layers, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier results in the best performance with an average accuracy of 89%.
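The cross-layer idea in the abstract above, using the neighborhood of a user pair in one layer to help predict links in another, can be sketched as a small feature extractor. The toy two-layer adjacency data and the particular overlap features below are illustrative assumptions, not the authors' meta-path algorithm:

```python
def jaccard(a, b):
    """Jaccard similarity of two neighbor sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def cross_layer_features(u, v, layer_a, layer_b):
    """Feature vector for candidate link (u, v): within-layer and
    cross-layer neighborhood overlaps (adjacency given as dict of sets)."""
    na_u, na_v = layer_a.get(u, set()), layer_a.get(v, set())
    nb_u, nb_v = layer_b.get(u, set()), layer_b.get(v, set())
    return [
        jaccard(na_u, na_v),   # overlap in the target layer
        jaccard(nb_u, nb_v),   # overlap in the auxiliary layer
        jaccard(na_u, nb_v),   # cross-layer overlaps
        jaccard(nb_u, na_v),
    ]

# Toy two-layer network (hypothetical data, users as integers)
foursquare = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}, 4: {5}, 5: {4}}
twitter    = {1: {2, 4}, 2: {1, 4}, 4: {1, 2, 5}, 5: {4}}

feats = cross_layer_features(1, 2, foursquare, twitter)
```

In the study itself such features would feed a classifier (e.g. an SVM); here the feature vector alone shows how connectivity in the Twitter layer can inform link prediction in the Foursquare layer.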
Albertsen, Karen; Rugulies, Reiner; Garde, Anne Helene; Burr, Hermann
2010-02-01
Interpersonal relations at work as well as individual factors seem to play prominent roles in the modern labour market, and arguably also for the change in stress symptoms. The aim was to examine whether exposures in the psychosocial work environment predicted symptoms of cognitive stress in a sample of Danish knowledge workers (i.e. employees working with sign, communication or exchange of knowledge) and whether performance-based self-esteem had a main effect, over and above the work environmental factors. 349 knowledge workers, selected from a national, representative cohort study, were followed up with two data collections, 12 months apart. We used data on psychosocial work environment factors and cognitive stress symptoms measured with the Copenhagen Psychosocial Questionnaire (COPSOQ), and a measurement of performance-based self-esteem. Effects on cognitive stress symptoms were analyzed with a GLM procedure with and without adjustment for baseline level. Measures at baseline of quantitative demands, role conflicts, lack of role clarity, recognition, predictability, influence and social support from management were positively associated with cognitive stress symptoms 12 months later. After adjustment for baseline level of cognitive stress symptoms, follow-up level was only predicted by lack of predictability. Performance-based self-esteem was prospectively associated with cognitive stress symptoms and had an independent effect above the psychosocial work environment factors on the level of and changes in cognitive stress symptoms. The results suggest that both work environmental and individual characteristics should be taken into account in order to capture sources of stress in modern working life.
Accurate prediction of energy expenditure using a shoe-based activity monitor.
Sazonova, Nadezhda; Browning, Raymond C; Sazonov, Edward
2011-07-01
The aim of this study was to develop and validate a method for predicting energy expenditure (EE) using a footwear-based system with integrated accelerometer and pressure sensors. We developed a footwear-based device with an embedded accelerometer and insole pressure sensors for the prediction of EE. The data from the device can be used to perform accurate recognition of major postures and activities and to estimate EE using the acceleration, pressure, and posture/activity classification information in a branched algorithm without the need for individual calibration. We measured EE via indirect calorimetry as 16 adults (body mass index=19-39 kg·m) performed various low- to moderate-intensity activities and compared measured versus predicted EE using several models based on the acceleration and pressure signals. Inclusion of pressure data resulted in better accuracy of EE prediction during static postures such as sitting and standing. The activity-based branched model that included predictors from accelerometer and pressure sensors (BACC-PS) achieved the lowest error (e.g., root mean squared error (RMSE)=0.69 METs) compared with the accelerometer-only-based branched model BACC (RMSE=0.77 METs) and nonbranched model (RMSE=0.94-0.99 METs). Comparison of EE prediction models using data from both legs versus models using data from a single leg indicates that only one shoe needs to be equipped with sensors. These results suggest that foot acceleration combined with insole pressure measurement, when used in an activity-specific branched model, can accurately estimate the EE associated with common daily postures and activities. The accuracy and unobtrusiveness of a footwear-based device may make it an effective physical activity monitoring tool.
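The branched structure described above, first classifying posture/activity and then applying an activity-specific model, can be sketched minimally. The thresholds and per-branch coefficients below are invented placeholders, not the calibrated BACC-PS model:

```python
def classify_posture(pressure_fore, pressure_heel, accel_var):
    """Crude posture/activity branch from insole pressure and acceleration
    variance (thresholds are illustrative, not calibrated)."""
    if accel_var > 0.5:
        return "walk"
    if pressure_fore + pressure_heel > 0.6:
        return "stand"
    return "sit"

# Hypothetical per-branch linear models: EE (METs) = a + b * accel_count
BRANCH_MODELS = {
    "sit":   (1.0, 0.2),
    "stand": (1.3, 0.3),
    "walk":  (2.0, 1.5),
}

def predict_ee(pressure_fore, pressure_heel, accel_var, accel_count):
    """Branched EE estimate: pick the branch, then apply its linear model."""
    a, b = BRANCH_MODELS[classify_posture(pressure_fore, pressure_heel, accel_var)]
    return a + b * accel_count

ee_sitting = predict_ee(0.1, 0.2, 0.05, 0.1)  # low pressure, low motion
ee_walking = predict_ee(0.5, 0.5, 0.9, 1.0)   # high motion -> walk branch
```

The benefit of branching, as the abstract reports, is that static postures (where acceleration alone is uninformative) get their own models driven by the pressure signals.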
Computational Fluid Dynamic Investigation of Loss Mechanisms in a Pulse-Tube Refrigerator
NASA Astrophysics Data System (ADS)
Martin, K.; Esguerra, J.; Dodson, C.; Razani, A.
2015-12-01
In predicting Pulse-Tube Cryocooler (PTC) performance, One-Dimensional (1-D) PTR design and analysis tools such as Gedeon Associates SAGE® typically include models for performance degradation due to thermodynamically irreversible processes. SAGE®, in particular, accounts for convective loss, turbulent conductive loss and numerical diffusion “loss” via correlation functions based on analysis and empirical testing. In this study, we compare CFD and SAGE® estimates of PTR refrigeration performance for four distinct pulse-tube lengths. Performance predictions from PTR CFD models are compared to SAGE® predictions for all four cases. Then, to further demonstrate the benefits of higher-fidelity and multidimensional CFD simulation, the PTR loss mechanisms are characterized in terms of their spatial and temporal locations.
Performance prediction of a ducted rocket combustor
NASA Astrophysics Data System (ADS)
Stowe, Robert
2001-07-01
The ducted rocket is a supersonic flight propulsion system that takes the exhaust from a solid fuel gas generator, mixes it with air, and burns it to produce thrust. To develop such systems, the use of numerical models based on Computational Fluid Dynamics (CFD) is increasingly popular, but their application to reacting flow requires specific attention and validation. Through a careful examination of the governing equations and experimental measurements, a CFD-based method was developed to predict the performance of a ducted rocket combustor. It uses an equilibrium-chemistry Probability Density Function (PDF) combustion model, with a gaseous and a separate stream of 75 nm diameter carbon spheres to represent the fuel. After extensive validation with water tunnel and direct-connect combustion experiments over a wide range of geometries and test conditions, this CFD-based method was able to predict, within a good degree of accuracy, the combustion efficiency of a ducted rocket combustor.
Retrosynthetic Reaction Prediction Using Neural Sequence-to-Sequence Models
2017-01-01
We describe a fully data driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder–decoder architecture that consists of two recurrent neural networks, which has previously shown great success in solving other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, which span 10 broad reaction types that are commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline model, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step toward solving the challenging problem of computational retrosynthetic analysis. PMID:29104927
NASA Astrophysics Data System (ADS)
Yu, Jiajia; He, Yong
Mango is a popular tropical fruit, and its soluble solids content is an important quality index. In this study, visible and short-wave near-infrared spectroscopy (VIS/SWNIR) was applied to investigate the feasibility of measuring the soluble solids content of mango and to validate the performance of selected sensitive bands. The calibration set was formed by 135 mango samples, while the remaining 45 mango samples formed the prediction set. The combination of partial least squares and back-propagation artificial neural networks (PLS-BP) was used to build the prediction model from the raw spectral data. Based on PLS-BP, the determination coefficient for prediction (Rp) was 0.757, and the process is simple and easy to operate. Compared with the partial least squares (PLS) result, the performance of PLS-BP is better.
The Satellite Clock Bias Prediction Method Based on Takagi-Sugeno Fuzzy Neural Network
NASA Astrophysics Data System (ADS)
Cai, C. L.; Yu, H. G.; Wei, Z. C.; Pan, J. D.
2017-05-01
The continuous improvement of the prediction accuracy of Satellite Clock Bias (SCB) is a key problem in precision navigation. In order to improve the precision of SCB prediction and better reflect the change characteristics of SCB, this paper proposes an SCB prediction method based on the Takagi-Sugeno fuzzy neural network. Firstly, the SCB values are pre-treated based on their characteristics. Then, an accurate Takagi-Sugeno fuzzy neural network model is established based on the preprocessed data to predict SCB. This paper uses the precise SCB data with different sampling intervals provided by the IGS (International GNSS Service) to realize the short-term prediction experiment, and the results are compared with the ARIMA (Auto-Regressive Integrated Moving Average) model, the GM(1,1) model, and the quadratic polynomial model. The results show that the Takagi-Sugeno fuzzy neural network model is feasible and effective for short-term SCB prediction and performs well for different types of clocks. The prediction results of the proposed method are clearly better than those of the conventional methods.
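The quadratic polynomial baseline mentioned above is simple enough to show concretely: fit c0 + c1·t + c2·t² to a clock-bias series by least squares and extrapolate one epoch ahead. The synthetic, noise-free bias series below is an assumption for illustration:

```python
def polyfit2(ts, ys):
    """Fit y ~ c0 + c1*t + c2*t^2 by least squares: build the normal
    equations X'X c = X'y and solve with Gaussian elimination (pure stdlib)."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for t, y in zip(ts, ys):
        row = [1.0, t, t * t]
        for i in range(3):
            b[i] += row[i] * y
            for j in range(3):
                A[i][j] += row[i] * row[j]
    # Gaussian elimination with partial pivoting
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * 3
    for i in (2, 1, 0):  # back substitution
        coef[i] = (b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, 3))) / A[i][i]
    return coef

# Hypothetical clock-bias series sampled at epochs t = 0..5
ts = [0, 1, 2, 3, 4, 5]
ys = [2.0 + 0.5 * t + 0.01 * t * t for t in ts]  # synthetic, noise-free
c0, c1, c2 = polyfit2(ts, ys)
pred_t6 = c0 + c1 * 6 + c2 * 36  # extrapolate one step ahead
```

The fuzzy neural network in the paper replaces this fixed polynomial form with a learned model; the baseline is shown only because it is the conventional comparator named in the abstract.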
A Free Wake Numerical Simulation for Darrieus Vertical Axis Wind Turbine Performance Prediction
NASA Astrophysics Data System (ADS)
Belu, Radian
2010-11-01
In the last four decades, several aerodynamic prediction models have been formulated for Darrieus wind turbine performance and characteristics. Two families can be identified: stream-tube models and vortex models. The paper presents a simplified numerical technique for simulating vertical axis wind turbine flow, based on lifting line theory and a free vortex wake model, including dynamic stall effects, for predicting the performance of a 3-D vertical axis wind turbine. A vortex model is used in which the wake is composed of trailing stream-wise and shedding span-wise vortices, whose strengths are equal to the change in the bound vortex strength as required by the Helmholtz and Kelvin theorems. Performance parameters are computed by application of the Biot-Savart law along with the Kutta-Joukowski theorem and a semi-empirical stall model. We tested the developed model against an adaptation of the earlier multiple stream-tube performance prediction model for Darrieus turbines. Predictions made with our method are shown to compare favorably with existing experimental data and the outputs of other numerical models. The method can accurately predict the local and global performance of a vertical axis wind turbine and can be used in the design and optimization of wind turbines for built-environment applications.
Personalized mortality prediction driven by electronic medical data and a patient similarity metric.
Lee, Joon; Maslove, David M; Dubin, Joel A
2015-01-01
Clinical outcome prediction normally employs static, one-size-fits-all models that perform well for the average patient but are sub-optimal for individual patients with unique characteristics. In the era of digital healthcare, it is feasible to dynamically personalize decision support by identifying and analyzing similar past patients, in a way that is analogous to personalized product recommendation in e-commerce. Our objectives were: 1) to prove that analyzing only similar patients leads to better outcome prediction performance than analyzing all available patients, and 2) to characterize the trade-off between training data size and the degree of similarity between the training data and the index patient for whom prediction is to be made. We deployed a cosine-similarity-based patient similarity metric (PSM) to an intensive care unit (ICU) database to identify patients that are most similar to each patient and subsequently to custom-build 30-day mortality prediction models. Rich clinical and administrative data from the first day in the ICU from 17,152 adult ICU admissions were analyzed. The results confirmed that using data from only a small subset of most similar patients for training improves predictive performance in comparison with using data from all available patients. The results also showed that when too few similar patients are used for training, predictive performance degrades due to the effects of small sample sizes. Our PSM-based approach outperformed well-known ICU severity of illness scores. Although the improved prediction performance is achieved at the cost of increased computational burden, Big Data technologies can help realize personalized data-driven decision support at the point of care. The present study provides crucial empirical evidence for the promising potential of personalized data-driven decision support systems. 
With the increasing adoption of electronic medical record (EMR) systems, our novel medical data analytics contributes to meaningful use of EMR data.
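The core retrieval step of the PSM approach described above, ranking past patients by cosine similarity to the index patient and training only on the top k, can be sketched as follows. The three-patient cohort and its feature vectors are hypothetical:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def most_similar(index_features, cohort, k):
    """Return the k cohort records most similar to the index patient,
    ranked by cosine similarity of their feature vectors."""
    ranked = sorted(cohort, key=lambda p: cosine(index_features, p["x"]),
                    reverse=True)
    return ranked[:k]

# Toy cohort (hypothetical scaled features, e.g. age, lactate, heart rate)
cohort = [
    {"id": "a", "x": [0.9, 0.1, 0.2], "died": 0},
    {"id": "b", "x": [0.8, 0.2, 0.3], "died": 0},
    {"id": "c", "x": [0.1, 0.9, 0.8], "died": 1},
]
train = most_similar([0.85, 0.15, 0.25], cohort, k=2)
ids = [p["id"] for p in train]
```

In the study, a 30-day mortality model would then be fitted on the selected subset; only the similarity-based retrieval step is shown here, and the trade-off the authors characterize is exactly the choice of k.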
Genomic Prediction of Testcross Performance in Canola (Brassica napus)
Jan, Habib U.; Abbadi, Amine; Lücke, Sophie; Nichols, Richard A.; Snowdon, Rod J.
2016-01-01
Genomic selection (GS) is a modern breeding approach where genome-wide single-nucleotide polymorphism (SNP) marker profiles are simultaneously used to estimate performance of untested genotypes. In this study, the potential of genomic selection methods to predict testcross performance for hybrid canola breeding was applied for various agronomic traits based on genome-wide marker profiles. A total of 475 genetically diverse spring-type canola pollinator lines were genotyped at 24,403 single-copy, genome-wide SNP loci. In parallel, the 950 F1 testcross combinations between the pollinators and two representative testers were evaluated for a number of important agronomic traits including seedling emergence, days to flowering, lodging, oil yield and seed yield along with essential seed quality characters including seed oil content and seed glucosinolate content. A ridge-regression best linear unbiased prediction (RR-BLUP) model was applied in combination with 500 cross-validations for each trait to predict testcross performance, both across the whole population as well as within individual subpopulations or clusters, based solely on SNP profiles. Subpopulations were determined using multidimensional scaling and K-means clustering. Genomic prediction accuracy across the whole population was highest for seed oil content (0.81) followed by oil yield (0.75) and lowest for seedling emergence (0.29). For seed yield, seed glucosinolate, lodging resistance and days to onset of flowering (DTF), prediction accuracies were 0.45, 0.61, 0.39 and 0.56, respectively. Prediction accuracies could be increased for some traits by treating subpopulations separately; a strategy which only led to moderate improvements for some traits with low heritability, like seedling emergence. No useful or consistent increase in accuracy was obtained by inclusion of a population substructure covariate in the model. 
Testcross performance prediction using genome-wide SNP markers shows considerable potential for pre-selection of promising hybrid combinations prior to resource-intensive field testing over multiple locations and years. PMID:26824924
Reevaluation of a walleye (Sander vitreus) bioenergetics model
Madenjian, Charles P.; Wang, Chunfang
2013-01-01
Walleye (Sander vitreus) is an important sport fish throughout much of North America, and walleye populations support valuable commercial fisheries in certain lakes as well. Using a corrected algorithm for balancing the energy budget, we reevaluated the performance of the Wisconsin bioenergetics model for walleye in the laboratory. Walleyes were fed rainbow smelt (Osmerus mordax) in four laboratory tanks each day during a 126-day experiment. Feeding rates ranged from 1.4 to 1.7 % of walleye body weight per day. Based on a statistical comparison of bioenergetics model predictions of monthly consumption with observed monthly consumption, we concluded that the bioenergetics model estimated food consumption by walleye without any significant bias. Similarly, based on a statistical comparison of bioenergetics model predictions of weight at the end of the monthly test period with observed weight, we concluded that the bioenergetics model predicted walleye growth without any detectable bias. In addition, the bioenergetics model predictions of cumulative consumption over the 126-day experiment differed from observed cumulative consumption by less than 10 %. Although additional laboratory and field testing will be needed to fully evaluate model performance, based on our laboratory results, the Wisconsin bioenergetics model for walleye appears to be providing unbiased predictions of food consumption.
NASA Astrophysics Data System (ADS)
Dash, Y.; Mishra, S. K.; Panigrahi, B. K.
2017-12-01
Prediction of northeast/post-monsoon rainfall, which occurs during October, November and December (OND) over the Indian peninsula, is a challenging task due to the dynamic nature of the uncertain, chaotic climate. It is imperative to elucidate this issue by examining the performance of different machine learning (ML) approaches. The prime objective of this research is to compare a) statistical prediction using historical rainfall observations and global atmosphere-ocean predictors like Sea Surface Temperature (SST) and Sea Level Pressure (SLP) and b) empirical prediction based on a time series analysis of past rainfall data without using any other predictors. Initially, ML techniques were applied to SST and SLP data (1948-2014) obtained from the NCEP/NCAR reanalysis monthly means provided by the NOAA ESRL PSD. Later, this study investigated the applicability of ML methods using the OND rainfall time series for 1948-2014 and forecasted up to 2018. The predicted values of the aforementioned methods were verified using observed time series data collected from the Indian Institute of Tropical Meteorology, and the results revealed good performance of the ML algorithms with minimal error scores. Thus, it is found that both statistical and empirical methods are useful for long-range climatic projections.
NASA Technical Reports Server (NTRS)
Mcgrath, W. R.; Richards, P. L.; Face, D. W.; Prober, D. E.; Lloyd, F. L.
1988-01-01
A systematic study of the gain and noise in superconductor-insulator-superconductor mixers employing Ta-based, Nb-based, and Pb-alloy-based tunnel junctions was made. These junctions displayed both weak and strong quantum effects at a signal frequency of 33 GHz. The effects of energy gap sharpness and subgap current were investigated and are quantitatively related to mixer performance. Detailed comparisons are made of the mixing results with the predictions of a three-port model approximation to the Tucker theory. Mixer performance was measured with a novel test apparatus which is accurate enough to allow for the first quantitative tests of theoretical noise predictions. It is found that the three-port model of the Tucker theory underestimates the mixer noise temperature by a factor of about 2 for all of the mixers. In addition, predicted values of available mixer gain are in reasonable agreement with experiment when quantum effects are weak. However, as quantum effects become strong, the predicted available gain diverges to infinity, which is in sharp contrast to the experimental results. Predictions of coupled gain do not always show such divergences.
NASA Astrophysics Data System (ADS)
Lim, Yeerang; Jung, Youeyun; Bang, Hyochoong
2018-05-01
This study presents model predictive formation control based on an eccentricity/inclination vector separation strategy. Collision avoidance can be accomplished by using eccentricity/inclination vectors and adding a simple goal-function term to the optimization process. Real-time control is also achievable with a model predictive controller based on a convex formulation. A constraint-tightening approach is addressed as well to improve the robustness of the controller, and simulation results are presented to verify the performance enhancement of the proposed approach.
Kernel-based whole-genome prediction of complex traits: a review.
Morota, Gota; Gianola, Daniel
2014-01-01
Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.
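A concrete instance of the kernel methods reviewed above is Gaussian-kernel ridge regression: estimate alpha = (K + λI)⁻¹ y on the training genotypes and predict a new genotype through the kernel, which can capture the non-additive effects the review discusses. The toy genotypes, phenotypes, and hyperparameters below are assumptions for illustration:

```python
from math import exp

def rbf(u, v, gamma):
    """Gaussian (RBF) kernel between two marker profiles."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

def kernel_ridge_fit(X, y, gamma, lam):
    """alpha = (K + lam*I)^-1 y  -- kernel ridge on marker profiles."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(K, y)

def kernel_ridge_predict(x_new, X, alpha, gamma):
    return sum(a * rbf(x_new, xi, gamma) for a, xi in zip(alpha, X))

# Toy genotypes and phenotypes (hypothetical)
X = [[0, 0], [1, 0], [0, 1], [1, 1]]
y = [0.0, 1.0, 1.0, 2.5]   # slight non-additivity at [1, 1]
alpha = kernel_ridge_fit(X, y, gamma=1.0, lam=0.1)
yhat = kernel_ridge_predict([1, 1], X, alpha, gamma=1.0)
```

With a linear kernel this reduces to GBLUP-like additive prediction; the Gaussian kernel lets the fitted function bend toward the interaction at [1, 1], which is the motivation the review gives for non-parametric kernels.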
Linear regression models for solvent accessibility prediction in proteins.
Wagner, Michael; Adamczak, Rafał; Porollo, Aleksey; Meller, Jarosław
2005-04-01
The relative solvent accessibility (RSA) of an amino acid residue in a protein structure is a real number that represents the solvent exposed surface area of this residue in relative terms. The problem of predicting the RSA from the primary amino acid sequence can therefore be cast as a regression problem. Nevertheless, RSA prediction has so far typically been cast as a classification problem. Consequently, various machine learning techniques have been used within the classification framework to predict whether a given amino acid exceeds some (arbitrary) RSA threshold and would thus be predicted to be "exposed," as opposed to "buried." We have recently developed novel methods for RSA prediction using nonlinear regression techniques which provide accurate estimates of the real-valued RSA and outperform classification-based approaches with respect to commonly used two-class projections. However, while their performance seems to provide a significant improvement over previously published approaches, these Neural Network (NN) based methods are computationally expensive to train and involve several thousand parameters. In this work, we develop alternative regression models for RSA prediction which are computationally much less expensive, involve orders-of-magnitude fewer parameters, and are still competitive in terms of prediction quality. In particular, we investigate several regression models for RSA prediction using linear L1-support vector regression (SVR) approaches as well as standard linear least squares (LS) regression. Using rigorously derived validation sets of protein structures and extensive cross-validation analysis, we compare the performance of the SVR with that of LS regression and NN-based methods. In particular, we show that the flexibility of the SVR (as encoded by metaparameters such as the error insensitivity and the error penalization terms) can be very beneficial to optimize the prediction accuracy for buried residues. 
We conclude that the simple and computationally much more efficient linear SVR performs comparably to nonlinear models and thus can be used in order to facilitate further attempts to design more accurate RSA prediction methods, with applications to fold recognition and de novo protein structure prediction methods.
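The abstract above frames RSA prediction as a regression from windowed sequence features to a real-valued accessibility. The sketch below illustrates that framing only: a one-hot window over a toy alphabet and a plain linear least-squares fit (the LS baseline the abstract compares against). The alphabet, window size, and data are invented for illustration; an L1-SVR would replace the `lstsq` solve with an epsilon-insensitive loss.

```python
import numpy as np

AA = "ACDE"  # toy alphabet; real models use all 20 amino acids

def encode_window(seq, i, half=1):
    """One-hot encode a window of residues centred on position i."""
    vec = []
    for j in range(i - half, i + half + 1):
        one_hot = [0.0] * len(AA)
        if 0 <= j < len(seq):
            one_hot[AA.index(seq[j])] = 1.0
        vec.extend(one_hot)
    return vec

# Toy training data: sequences with per-residue "RSA" values in [0, 1].
seqs = ["ACDEAC", "CADEEC"]
rsas = [[0.2, 0.8, 0.5, 0.9, 0.1, 0.7],
        [0.6, 0.3, 0.4, 0.9, 0.8, 0.5]]

X, y = [], []
for seq, rsa in zip(seqs, rsas):
    for i in range(len(seq)):
        X.append(encode_window(seq, i))
        y.append(rsa[i])
X, y = np.array(X), np.array(y)

# Linear least-squares fit of real-valued RSA on the window encoding.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ w
```

The point of the linear model is the parameter count: one weight per (position, letter) pair, orders of magnitude fewer than a neural network.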
Höhne, Marlene; Jahanbekam, Amirhossein; Bauckhage, Christian; Axmacher, Nikolai; Fell, Juergen
2016-10-01
Mediotemporal EEG characteristics are closely related to long-term memory formation. It has been reported that rhinal and hippocampal EEG measures reflecting the stability of phases across trials are better suited to distinguish subsequently remembered from forgotten trials than event-related potentials or amplitude-based measures. Theoretical models suggest that the phase of EEG oscillations reflects neural excitability and influences cellular plasticity. However, while previous studies have shown that the stability of phase values across trials is indeed a relevant predictor of subsequent memory performance, the effect of absolute single-trial phase values has been little explored. Here, we reanalyzed intracranial EEG recordings from the mediotemporal lobe of 27 epilepsy patients performing a continuous word recognition paradigm. Two-class classification using a support vector machine was performed to predict subsequently remembered vs. forgotten trials based on individually selected frequencies and time points. We demonstrate that it is possible to successfully predict single-trial memory formation in the majority of patients (23 out of 27) based on only three single-trial phase values given by a rhinal phase, a hippocampal phase, and a rhinal-hippocampal phase difference. Overall classification accuracy across all subjects was 69.2% choosing frequencies from the range between 0.5 and 50Hz and time points from the interval between -0.5s and 2s. For 19 patients, above chance prediction of subsequent memory was possible even when choosing only time points from the prestimulus interval (overall accuracy: 65.2%). Furthermore, prediction accuracies based on single-trial phase surpassed those based on single-trial power. Our results confirm the functional relevance of mediotemporal EEG phase for long-term memory operations and suggest that phase information may be utilized for memory enhancement applications based on deep brain stimulation. 
Copyright © 2016 Elsevier Inc. All rights reserved.
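A toy sketch of the classification setup described above (not the authors' pipeline): trials are classified as remembered vs. forgotten from three phase features, a rhinal phase, a hippocampal phase, and their difference. Because phases are circular, each is embedded as (cos, sin) before classifying; a nearest-class-mean rule stands in here for the support vector machine used in the study, and all data are invented.

```python
import math

def embed(rhinal, hippo):
    """Embed two phases and their difference on the unit circle."""
    feats = []
    for p in (rhinal, hippo, rhinal - hippo):
        feats.extend([math.cos(p), math.sin(p)])
    return feats

def fit_class_means(X, y):
    """Mean feature vector per class label."""
    means = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means

def predict(means, x):
    """Assign the label whose class mean is closest."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(means, key=lambda lab: dist2(means[lab], x))

# Toy trials: remembered trials cluster near phase 0, forgotten near pi.
train = [((0.1, 0.2), "rem"), ((-0.2, 0.1), "rem"),
         ((3.0, 3.2), "forg"), ((3.3, 2.9), "forg")]
X = [embed(r, h) for (r, h), _ in train]
y = [lab for _, lab in train]
means = fit_class_means(X, y)
pred = predict(means, embed(0.05, 0.15))
```

The (cos, sin) embedding is the standard way to make circular quantities usable by Euclidean classifiers; without it, phases near 0 and near 2π would look maximally distant.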
Hahn, Sowon; Buttaccio, Daniel R; Hahn, Jungwon; Lee, Taehun
2015-01-01
The present study demonstrates that levels of extraversion and neuroticism can predict attentional performance during a change detection task. After completing a change detection task built on the flicker paradigm, participants were assessed for personality traits using the Revised Eysenck Personality Questionnaire (EPQ-R). Multiple regression analyses revealed that higher levels of extraversion predict increased change detection accuracies, while higher levels of neuroticism predict decreased change detection accuracies. In addition, neurotic individuals exhibited decreased sensitivity A' and increased fixation dwell times. Hierarchical regression analyses further revealed that eye movement measures mediate the relationship between neuroticism and change detection accuracies. Based on the current results, we propose that neuroticism is associated with decreased attentional control over the visual field, presumably due to decreased attentional disengagement. Extraversion can predict increased attentional performance, but the effect is smaller than the relationship between neuroticism and attention.
Predicting Cost/Performance Trade-Offs for Whitney: A Commodity Computing Cluster
NASA Technical Reports Server (NTRS)
Becker, Jeffrey C.; Nitzberg, Bill; VanderWijngaart, Rob F.; Kutler, Paul (Technical Monitor)
1997-01-01
Recent advances in low-end processor and network technology have made it possible to build a "supercomputer" out of commodity components. We develop simple models of the NAS Parallel Benchmarks version 2 (NPB 2) to explore the cost/performance trade-offs involved in building a balanced parallel computer supporting a scientific workload. We develop closed form expressions detailing the number and size of messages sent by each benchmark. Coupling these with measured single processor performance, network latency, and network bandwidth, our models predict benchmark performance to within 30%. A comparison based on total system cost reveals that current commodity technology (200 MHz Pentium Pros with 100baseT Ethernet) is well balanced for the NPBs up to a total system cost of around $1,000,000.
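The closed-form models described above have the general shape: predicted time = computation time + per-message latency cost + data volume / bandwidth. The sketch below shows that shape only; the message counts and machine parameters are illustrative, not the NPB 2 expressions from the paper.

```python
def predict_runtime(t_comp, n_msgs, total_bytes, latency_s, bandwidth_Bps):
    """Simple latency/bandwidth communication model."""
    return t_comp + n_msgs * latency_s + total_bytes / bandwidth_Bps

# Example: 10 s of compute, 1e4 messages at 100 us latency each,
# and 1 GB moved over a 10 MB/s network.
t = predict_runtime(10.0, 1e4, 1e9, 100e-6, 10e6)
```

Splitting communication into a latency term and a bandwidth term is what lets such a model diagnose balance: many small messages stress latency, few large ones stress bandwidth.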
A computational cognitive model of self-efficacy and daily adherence in mHealth.
Pirolli, Peter
2016-12-01
Mobile health (mHealth) applications provide an excellent opportunity for collecting rich, fine-grained data necessary for understanding and predicting day-to-day health behavior change dynamics. A computational predictive model (ACT-R-DStress) is presented and fit to individual daily adherence in 28-day mHealth exercise programs. The ACT-R-DStress model refines the psychological construct of self-efficacy. To explain and predict the dynamics of self-efficacy and predict individual performance of targeted behaviors, the self-efficacy construct is implemented as a theory-based neurocognitive simulation of the interaction of behavioral goals, memories of past experiences, and behavioral performance.
Gao, Ying-Duo; Hu, Yuan; Crespo, Alejandro; Wang, Deping; Armacost, Kira A; Fells, James I; Fradera, Xavier; Wang, Hongwu; Wang, Huijun; Sherborne, Brad; Verras, Andreas; Peng, Zhengwei
2018-01-01
The 2016 D3R Grand Challenge 2 includes both pose and affinity or ranking predictions. This article is focused exclusively on affinity predictions submitted to the D3R challenge from a collaborative effort of the modeling and informatics group. Our submissions include ranking of 102 ligands covering 4 different chemotypes against the FXR ligand binding domain structure, and the relative binding affinity predictions of the two designated free energy subsets of 15 and 18 compounds. Using all the complex structures prepared in the same way allowed us to cover many types of workflows and compare their performances effectively. We evaluated typical workflows used in our daily structure-based design modeling support, which include docking scores, force field-based scores, QM/MM, MMGBSA, MD-MMGBSA, and MacroModel interaction energy estimations. The best performing methods for the two free energy subsets are discussed. Our results suggest that affinity ranking still remains very challenging; that the knowledge of more structural information does not necessarily yield more accurate predictions; and that visual inspection and human intervention are considerably important for ranking. Knowledge of the mode of action and protein flexibility along with visualization tools that depict polar and hydrophobic maps are very useful for visual inspection. QM/MM-based workflows were found to be powerful in affinity ranking and are encouraged to be applied more often. The standardized input and output enable systematic analysis and support methodology development and improvement for high level blinded predictions.
NASA Astrophysics Data System (ADS)
Maljaars, E.; Felici, F.; Blanken, T. C.; Galperti, C.; Sauter, O.; de Baar, M. R.; Carpanese, F.; Goodman, T. P.; Kim, D.; Kim, S. H.; Kong, M.; Mavkov, B.; Merle, A.; Moret, J. M.; Nouailletas, R.; Scheffer, M.; Teplukhina, A. A.; Vu, N. M. T.; The EUROfusion MST1-team; The TCV-team
2017-12-01
The successful performance of a model predictive profile controller is demonstrated in simulations and experiments on the TCV tokamak, employing a profile controller test environment. Stable high-performance tokamak operation in hybrid and advanced plasma scenarios requires control over the safety factor profile (q-profile) and kinetic plasma parameters such as the plasma beta. This demands the establishment of reliable profile control routines in presently operational tokamaks. We present a model predictive profile controller that controls the q-profile and plasma beta using power requests to two clusters of gyrotrons and the plasma current request. The performance of the controller is analyzed in both simulation and TCV L-mode discharges where successful tracking of the estimated inverse q-profile as well as plasma beta is demonstrated under uncertain plasma conditions and the presence of disturbances. The controller exploits the knowledge of the time-varying actuator limits in the actuator input calculation itself such that fast transitions between targets are achieved without overshoot. A software environment is employed to prepare and test this and three other profile controllers in parallel in simulations and experiments on TCV. This set of tools includes the rapid plasma transport simulator RAPTOR and various algorithms to reconstruct the plasma equilibrium and plasma profiles by merging the available measurements with model-based predictions. In this work the estimated q-profile is based solely on RAPTOR model predictions due to the absence of internal current density measurements in TCV. These results encourage further exploitation of model predictive profile control in experiments on TCV and other (future) tokamaks.
An auxiliary optimization method for complex public transit route network based on link prediction
NASA Astrophysics Data System (ADS)
Zhang, Lin; Lu, Jian; Yue, Xianfei; Zhou, Jialin; Li, Yunxuan; Wan, Qian
2018-02-01
Inspired by the missing (new) link prediction and the spurious existing link identification in link prediction theory, this paper establishes an auxiliary optimization method for public transit route network (PTRN) based on link prediction. First, link prediction applied to PTRN is described, and based on reviewing the previous studies, the summary indices set and its algorithms set are collected for the link prediction experiment. Second, through analyzing the topological properties of Jinan’s PTRN established by the Space R method, we found that this is a typical small-world network with a relatively large average clustering coefficient. This phenomenon indicates that the structural similarity-based link prediction will show a good performance in this network. Then, based on the link prediction experiment of the summary indices set, three indices with maximum accuracy are selected for auxiliary optimization of Jinan’s PTRN. Furthermore, these link prediction results show that the overall layout of Jinan’s PTRN is stable and orderly, except for a partial area that requires optimization and reconstruction. The above pattern conforms to the general pattern of the optimal development stage of PTRN in China. Finally, based on the missing (new) link prediction and the spurious existing link identification, we propose optimization schemes that can be used not only to optimize current PTRN but also to evaluate PTRN planning.
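A minimal sketch of a structural-similarity link-prediction index of the kind surveyed above: the resource-allocation (RA) score of a candidate link (u, v) sums 1/degree(w) over common neighbours w, and high scores on non-links flag plausible missing routes. The graph below is a toy example, not Jinan's PTRN.

```python
def ra_score(adj, u, v):
    """Resource-allocation index for the node pair (u, v)."""
    common = adj[u] & adj[v]
    return sum(1.0 / len(adj[w]) for w in common)

# Toy undirected network as an adjacency dict of neighbour sets.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
score_ad = ra_score(adj, "a", "d")  # common neighbours b and c, degree 3 each
```

Such indices work well precisely in the regime the abstract identifies: small-world networks with high clustering, where common-neighbour structure carries real signal.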
A probabilistic approach to photovoltaic generator performance prediction
NASA Astrophysics Data System (ADS)
Khallat, M. A.; Rahman, S.
1986-09-01
A method for predicting the performance of a photovoltaic (PV) generator based on long term climatological data and expected cell performance is described. The equations for cell model formulation are provided. Use of the statistical model for characterizing the insolation level is discussed. The insolation data is fitted to appropriate probability distribution functions (Weibull, beta, normal). The probability distribution functions are utilized to evaluate the capacity factors of PV panels or arrays. An example is presented revealing the applicability of the procedure.
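The probabilistic idea above can be sketched as follows: model the insolation level with a fitted density (here a Weibull, one of the candidate distributions named in the abstract) and take the expected PV output relative to rated output as the capacity factor. All parameters and the clipped-linear PV model are illustrative assumptions, not the paper's cell model.

```python
import math

def weibull_pdf(x, k, lam):
    """Weibull density with shape k and scale lam."""
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-((x / lam) ** k))

def capacity_factor(power_frac, k, lam, x_max=2.0, n=20000):
    """E[P/P_rated] under the insolation density, by the trapezoid rule."""
    dx = x_max / n
    total = 0.0
    for i in range(n + 1):
        x = i * dx
        w = 0.5 if i in (0, n) else 1.0
        total += w * power_frac(x) * weibull_pdf(x, k, lam) * dx
    return total

# Toy PV model: output proportional to insolation, clipped at rated power.
cf = capacity_factor(lambda x: min(x, 1.0), k=2.0, lam=0.8)
```

Swapping in a beta or normal density only changes `weibull_pdf`; the expectation machinery is identical, which is why distribution choice can be driven purely by goodness of fit to the climatological data.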
Predicting trauma patient mortality: ICD [or ICD-10-AM] versus AIS based approaches.
Willis, Cameron D; Gabbe, Belinda J; Jolley, Damien; Harrison, James E; Cameron, Peter A
2010-11-01
The International Classification of Diseases Injury Severity Score (ICISS) has been proposed as an International Classification of Diseases (ICD)-10-based alternative to mortality prediction tools that use Abbreviated Injury Scale (AIS) data, including the Trauma and Injury Severity Score (TRISS). To date, studies have not examined the performance of ICISS using Australian trauma registry data. This study aimed to compare the performance of ICISS with other mortality prediction tools in an Australian trauma registry. This was a retrospective review of prospectively collected data from the Victorian State Trauma Registry. A training dataset was created for model development and a validation dataset for evaluation. The multiplicative ICISS model was compared with a worst injury ICISS approach, Victorian TRISS (V-TRISS, using local coefficients), maximum AIS severity and a multivariable model including ICD-10-AM codes as predictors. Models were investigated for discrimination (C-statistic) and calibration (Hosmer-Lemeshow statistic). The multivariable approach had the highest level of discrimination (C-statistic 0.90) and calibration (H-L 7.65, P= 0.468). Worst injury ICISS, V-TRISS and maximum AIS had similar performance. The multiplicative ICISS produced the lowest level of discrimination (C-statistic 0.80) and poorest calibration (H-L 50.23, P < 0.001). The performance of ICISS may be affected by the data used to develop estimates, the ICD version employed, the methods for deriving estimates and the inclusion of covariates. In this analysis, a multivariable approach using ICD-10-AM codes was the best-performing method. A multivariable ICISS approach may therefore be a useful alternative to AIS-based methods and may have comparable predictive performance to locally derived TRISS models. © 2010 The Authors. ANZ Journal of Surgery © 2010 Royal Australasian College of Surgeons.
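The multiplicative and worst-injury ICISS variants compared above reduce to simple arithmetic on survival risk ratios (SRRs): the multiplicative score is the product of the SRRs of a patient's ICD-coded injuries, and the worst-injury variant takes the minimum SRR instead. The SRR values below are invented for illustration.

```python
def iciss(srrs):
    """Multiplicative ICISS: product of survival risk ratios."""
    prod = 1.0
    for s in srrs:
        prod *= s
    return prod

def worst_injury_iciss(srrs):
    """Worst-injury variant: the single lowest survival risk ratio."""
    return min(srrs)

patient = [0.95, 0.80, 0.99]  # illustrative SRRs for three coded injuries
score = iciss(patient)
worst = worst_injury_iciss(patient)
```

The multiplicative form assumes injuries contribute independently to survival, which is one reason its calibration can suffer relative to a multivariable model fitted on the registry itself.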
Perceived ability and social support as mediators of achievement motivation and performance anxiety.
Abrahamsen, F E; Roberts, G C; Pensgaard, A M; Ronglan, L T
2008-12-01
The present study is founded on achievement goal theory (AGT) and examines the relationship between motivation, social support and performance anxiety with team handball players (n=143) from 10 elite teams. Based on these theories and previous findings, the study has three purposes. First, it was predicted that the female athletes (n=69) would report more performance worries and more social support use than males (n=74). The findings support the hypothesis for anxiety, but not for social support use. However, females report that they felt social support was more available than males. Second, we predicted and found a positive relationship between the interaction of ego orientation and perceptions of a performance climate on performance anxiety, but only for females. As predicted, perceived ability mediated this relationship. Finally, we predicted that perceptions of a performance climate were related to the view that social support was less available especially for the male athletes. Simple correlation supports this prediction, but the regression analyses did not reach significance. Thus, we could not test for mediation of social support between motivational variables and anxiety. The results illustrate that fostering a mastery climate helps elite athletes tackle competitive pressure.
Wang, Ying; Goh, Joshua O; Resnick, Susan M; Davatzikos, Christos
2013-01-01
In this study, we used high-dimensional pattern regression methods based on structural (gray and white matter; GM and WM) and functional (positron emission tomography of regional cerebral blood flow; PET) brain data to identify cross-sectional imaging biomarkers of cognitive performance in cognitively normal older adults from the Baltimore Longitudinal Study of Aging (BLSA). We focused on specific components of executive and memory domains known to decline with aging, including manipulation, semantic retrieval, long-term memory (LTM), and short-term memory (STM). For each imaging modality, brain regions associated with each cognitive domain were generated by adaptive regional clustering. A relevance vector machine was adopted to model the nonlinear continuous relationship between brain regions and cognitive performance, with cross-validation to select the most informative brain regions (using recursive feature elimination) as imaging biomarkers and optimize model parameters. Predicted cognitive scores using our regression algorithm based on the resulting brain regions correlated well with actual performance. Also, regression models obtained using combined GM, WM, and PET imaging modalities outperformed models based on single modalities. Imaging biomarkers related to memory performance included the orbito-frontal and medial temporal cortical regions with LTM showing stronger correlation with the temporal lobe than STM. Brain regions predicting executive performance included orbito-frontal, and occipito-temporal areas. The PET modality had higher contribution to most cognitive domains except manipulation, which had higher WM contribution from the superior longitudinal fasciculus and the genu of the corpus callosum. These findings based on machine-learning methods demonstrate the importance of combining structural and functional imaging data in understanding complex cognitive mechanisms and also their potential usage as biomarkers that predict cognitive status.
Ochoa, David; García-Gutiérrez, Ponciano; Juan, David; Valencia, Alfonso; Pazos, Florencio
2013-01-27
A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility to three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it neither improves these methods, nor the original mirrortree method, in predicting other types of interactions. That improvement comes at no cost in terms of applicability since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
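The tree-similarity signal behind mirrortree-style methods can be sketched as the correlation between the inter-species evolutionary distance matrices of two protein families; restricting alignments to, e.g., predicted-accessible positions changes the trees those matrices are built from. The matrices below are toy values for four species.

```python
def upper_triangle(m):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Toy distance matrices for the same four species in two families.
dist_a = [[0, 1, 2, 3], [1, 0, 2, 3], [2, 2, 0, 3], [3, 3, 3, 0]]
dist_b = [[0, 1.1, 2.2, 2.9], [1.1, 0, 1.9, 3.1],
          [2.2, 1.9, 0, 3.0], [2.9, 3.1, 3.0, 0]]
score = pearson(upper_triangle(dist_a), upper_triangle(dist_b))
```

A high correlation is the co-evolution signal; interacting families are expected to score higher than random pairs.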
NLLSS: Predicting Synergistic Drug Combinations Based on Semi-supervised Learning
Chen, Ming; Wang, Quanxin; Zhang, Lixin; Yan, Guiying
2016-01-01
Fungal infection has become one of the leading causes of hospital-acquired infections with high mortality rates. Furthermore, drug resistance is common for fungus-causing diseases. Synergistic drug combinations could provide an effective strategy to overcome drug resistance. Meanwhile, synergistic drug combinations can increase treatment efficacy and decrease drug dosage to avoid toxicity. Therefore, computational prediction of synergistic drug combinations for fungus-causing diseases becomes attractive. In this study, we proposed a similarity principle for drug combinations: principal drugs that achieve synergistic effects with similar adjuvant drugs are often themselves similar, and vice versa. Furthermore, we developed a novel algorithm termed Network-based Laplacian regularized Least Square Synergistic drug combination prediction (NLLSS) to predict potential synergistic drug combinations by integrating different kinds of information such as known synergistic drug combinations, drug-target interactions, and drug chemical structures. We applied NLLSS to predict antifungal synergistic drug combinations and showed that it achieved excellent performance both in terms of cross validation and independent prediction. Finally, we performed biological experiments for fungal pathogen Candida albicans to confirm 7 out of 13 predicted antifungal synergistic drug combinations. NLLSS provides an efficient strategy to identify potential synergistic antifungal combinations. PMID:27415801
Incorporating uncertainty in predictive species distribution modelling.
Beale, Colin M; Lennon, Jack J
2012-01-19
Motivated by the need to solve ecological problems (climate change, habitat fragmentation and biological invasions), there has been increasing interest in species distribution models (SDMs). Predictions from these models inform conservation policy, invasive species management and disease-control measures. However, predictions are subject to uncertainty, the degree and source of which is often unrecognized. Here, we review the SDM literature in the context of uncertainty, focusing on three main classes of SDM: niche-based models, demographic models and process-based models. We identify sources of uncertainty for each class and discuss how uncertainty can be minimized or included in the modelling process to give realistic measures of confidence around predictions. Because this has typically not been performed, we conclude that uncertainty in SDMs has often been underestimated and a false precision assigned to predictions of geographical distribution. We identify areas where development of new statistical tools will improve predictions from distribution models, notably the development of hierarchical models that link different types of distribution model and their attendant uncertainties across spatial scales. Finally, we discuss the need to develop more defensible methods for assessing predictive performance, quantifying model goodness-of-fit and for assessing the significance of model covariates.
Evaluation of non-negative matrix factorization of grey matter in age prediction.
Varikuti, Deepthi P; Genon, Sarah; Sotiras, Aristeidis; Schwender, Holger; Hoffstaedter, Felix; Patil, Kaustubh R; Jockwitz, Christiane; Caspers, Svenja; Moebus, Susanne; Amunts, Katrin; Davatzikos, Christos; Eickhoff, Simon B
2018-06-01
The relationship between grey matter volume (GMV) patterns and age can be captured by multivariate pattern analysis, allowing prediction of individuals' age based on structural imaging. Raw data, voxel-wise GMV and non-sparse factorization (with Principal Component Analysis, PCA) show good performance but do not promote relatively localized brain components for post-hoc examinations. Here we evaluated a non-negative matrix factorization (NNMF) approach to provide a reduced, but also interpretable representation of GMV data in age prediction frameworks in healthy and clinical populations. This examination was performed using three datasets: a multi-site cohort of life-span healthy adults, a single site cohort of older adults and clinical samples from the ADNI dataset with healthy subjects, participants with Mild Cognitive Impairment and patients with Alzheimer's disease (AD) subsamples. T1-weighted images were preprocessed with VBM8 standard settings to compute GMV values after normalization, segmentation and modulation for non-linear transformations only. Non-negative matrix factorization was computed on the GM voxel-wise values for a range of granularities (50-690 components) and LASSO (Least Absolute Shrinkage and Selection Operator) regression were used for age prediction. First, we compared the performance of our data compression procedure (i.e., NNMF) to various other approaches (i.e., uncompressed VBM data, PCA-based factorization and parcellation-based compression). We then investigated the impact of the granularity on the accuracy of age prediction, as well as the transferability of the factorization and model generalization across datasets. We finally validated our framework by examining age prediction in ADNI samples. Our results showed that our framework favorably compares with other approaches. 
They also demonstrated that the NNMF based factorization derived from one dataset could be efficiently applied to compress VBM data of another dataset and that granularities between 300 and 500 components give an optimal representation for age prediction. In addition to the good performance in healthy subjects our framework provided relatively localized brain regions as the features contributing to the prediction, thereby offering further insights into structural changes due to brain aging. Finally, our validation in clinical populations showed that our framework is sensitive to deviance from normal structural variations in pathological aging. Copyright © 2018 Elsevier Inc. All rights reserved.
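The compress-then-regress pipeline described above can be sketched as follows: factorize non-negative "GMV" data as V ≈ WH with multiplicative updates, then regress age on the subject-wise loadings W. Plain least squares stands in for the LASSO regression actually used, and the data are random stand-ins, so only the structure of the pipeline is illustrated.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subj, n_voxels, n_comp = 30, 50, 4

# Synthetic non-negative data with an age-related gradient.
age = rng.uniform(20, 80, n_subj)
V = rng.random((n_subj, n_voxels)) + np.outer(age / 80.0, rng.random(n_voxels))

# NNMF by multiplicative updates: V ~ W @ H with W, H >= 0.
W = rng.random((n_subj, n_comp)) + 0.1
H = rng.random((n_comp, n_voxels)) + 0.1
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
    W *= (V @ H.T) / (W @ H @ H.T + 1e-9)

# Regress age on the component loadings (LASSO in the actual framework).
X = np.column_stack([W, np.ones(n_subj)])
coef, *_ = np.linalg.lstsq(X, age, rcond=None)
pred_age = X @ coef
```

The non-negativity constraint is what yields the parts-based, relatively localized components the abstract highlights; a PCA factorization of the same data would mix signs and spread loadings across the whole brain.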
A probabilistic neural network based approach for predicting the output power of wind turbines
NASA Astrophysics Data System (ADS)
Tabatabaei, Sajad
2017-03-01
Reliable tools for quantifying the uncertainty of wind speed forecasts are increasingly needed as wind power penetration grows, and traditional models that generate only point forecasts are no longer adequate. Thus, the present paper utilises the concept of prediction intervals (PIs) to assess the uncertainty of wind power generation in power systems. It uses a recently introduced non-parametric approach called lower upper bound estimation (LUBE) to build the PIs, since the forecasting errors cannot be modelled properly by probability distribution functions. In the proposed LUBE method, a PI combination-based fuzzy framework is used to overcome the performance instability of the neural networks (NNs) used in LUBE. In comparison to other methods, this formulation better satisfies the PI coverage probability and PI normalised average width (PINAW) criteria. Since this non-linear problem is highly complex, a new heuristic-based optimisation algorithm with a novel modification is introduced to solve it. Using data sets from a wind farm in Australia, the feasibility and satisfactory performance of the suggested method are demonstrated.
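The two interval-quality measures referenced above are simple to state: PI coverage probability (PICP) is the fraction of targets falling inside their interval, and PI normalised average width (PINAW) is the mean interval width divided by the target range. The intervals below are toy values, not LUBE output.

```python
def picp(lower, upper, y):
    """PI coverage probability: fraction of targets inside their interval."""
    inside = sum(1 for l, u, t in zip(lower, upper, y) if l <= t <= u)
    return inside / len(y)

def pinaw(lower, upper, y):
    """PI normalised average width: mean width over the target range."""
    target_range = max(y) - min(y)
    mean_width = sum(u - l for l, u in zip(lower, upper)) / len(y)
    return mean_width / target_range

lower = [0.0, 1.0, 2.0, 3.0]
upper = [2.0, 3.0, 4.0, 5.0]
y     = [1.0, 2.5, 4.5, 4.0]
cov = picp(lower, upper, y)    # 3 of 4 targets covered
width = pinaw(lower, upper, y)
```

The two measures pull in opposite directions (wider intervals raise coverage but worsen PINAW), which is why LUBE-style methods optimize them jointly.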
Ma, Xin; Guo, Jing; Sun, Xiao
2015-01-01
The prediction of RNA-binding proteins is one of the most challenging problems in computational biology. Although some studies have investigated this problem, the accuracy of prediction is still not sufficient. In this study, a highly accurate method was developed to predict RNA-binding proteins from amino acid sequences using random forests with the minimum redundancy maximum relevance (mRMR) method, followed by incremental feature selection (IFS). We incorporated conjoint triad features and three novel features: binding propensity (BP), nonbinding propensity (NBP), and evolutionary information combined with physicochemical properties (EIPP). The results showed that these novel features have important roles in improving the performance of the predictor. Using the mRMR-IFS method, our predictor achieved the best performance (86.62% accuracy and 0.737 Matthews correlation coefficient). High prediction accuracy and successful prediction performance suggested that our method can be a useful approach to identify RNA-binding proteins from sequence information.
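The mRMR-plus-IFS procedure above has a simple skeleton: rank features by relevance, then grow the feature set one feature at a time, keeping the subset with the best accuracy. In this sketch a class-mean-difference score stands in for mRMR and a nearest-centroid rule stands in for the random forest; the data are toy values.

```python
def score_feature(col, y):
    """Crude relevance proxy: absolute difference of class means."""
    pos = [v for v, lab in zip(col, y) if lab == 1]
    neg = [v for v, lab in zip(col, y) if lab == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def accuracy(X, y, feats):
    """Training accuracy of a nearest-centroid rule on selected features."""
    cents = {}
    for lab in (0, 1):
        rows = [[x[f] for f in feats] for x, l in zip(X, y) if l == lab]
        cents[lab] = [sum(c) / len(rows) for c in zip(*rows)]
    correct = 0
    for x, lab in zip(X, y):
        v = [x[f] for f in feats]
        d = {l: sum((a - b) ** 2 for a, b in zip(v, cents[l])) for l in cents}
        correct += (min(d, key=d.get) == lab)
    return correct / len(y)

# Toy data: feature 0 is informative, features 1-2 are noise.
X = [[0.9, 0.2, 0.5], [0.8, 0.9, 0.1], [0.7, 0.4, 0.8],
     [0.1, 0.8, 0.4], [0.2, 0.1, 0.9], [0.3, 0.6, 0.2]]
y = [1, 1, 1, 0, 0, 0]

# IFS: add features in ranked order, keep the best-scoring prefix.
ranked = sorted(range(3), key=lambda f: -score_feature([x[f] for x in X], y))
best_k, best_acc = 1, 0.0
for k in range(1, len(ranked) + 1):
    acc = accuracy(X, y, ranked[:k])
    if acc > best_acc:
        best_k, best_acc = k, acc
```

In practice the evaluation inside the loop should be cross-validated rather than measured on the training set, as the abstract's jackknife-style protocols do.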
Analysis of Free Modeling Predictions by RBO Aleph in CASP11
Mabrouk, Mahmoud; Werner, Tim; Schneider, Michael; Putz, Ines; Brock, Oliver
2015-01-01
The CASP experiment is a biannual benchmark for assessing protein structure prediction methods. In CASP11, RBO Aleph ranked as one of the top-performing automated servers in the free modeling category. This category consists of targets for which structural templates are not easily retrievable. We analyze the performance of RBO Aleph and show that its success in CASP was a result of its ab initio structure prediction protocol. A detailed analysis of this protocol demonstrates that two components unique to our method greatly contributed to prediction quality: residue–residue contact prediction by EPC-map and contact–guided conformational space search by model-based search (MBS). Interestingly, our analysis also points to a possible fundamental problem in evaluating the performance of protein structure prediction methods: Improvements in components of the method do not necessarily lead to improvements of the entire method. This points to the fact that these components interact in ways that are poorly understood. This problem, if indeed true, represents a significant obstacle to community-wide progress. PMID:26492194
Fourier transform wavefront control with adaptive prediction of the atmosphere.
Poyneer, Lisa A; Macintosh, Bruce A; Véran, Jean-Pierre
2007-09-01
Predictive Fourier control is a temporal power spectral density-based adaptive method for adaptive optics that predicts the atmosphere under the assumption of frozen flow. The predictive controller is based on Kalman filtering and a Fourier decomposition of atmospheric turbulence using the Fourier transform reconstructor. It provides a stable way to compensate for arbitrary numbers of atmospheric layers. For each Fourier mode, efficient and accurate algorithms estimate the necessary atmospheric parameters from closed-loop telemetry and determine the predictive filter, adjusting as conditions change. This prediction improves atmospheric rejection, leading to significant improvements in system performance. For a 48×48 actuator system operating at 2 kHz, five-layer prediction for all modes is achievable in under 2×10⁹ floating-point operations/s.
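A toy illustration of the per-mode prediction idea above: under frozen flow, a single Fourier mode's complex coefficient advances by a fixed phase each frame, so estimating that phase from telemetry lets the controller predict the next frame. This scalar recursion is a stand-in for the full Kalman formulation, and the phase advance is an invented value.

```python
import cmath

def estimate_rotation(samples):
    """Average one-step complex rotation between consecutive samples."""
    ratios = [b / a for a, b in zip(samples, samples[1:]) if a != 0]
    mean = sum(ratios) / len(ratios)
    return mean / abs(mean)  # keep unit modulus (pure phase advance)

omega = 0.3  # true per-frame phase advance in radians (illustrative)
truth = [cmath.exp(1j * omega * t) for t in range(20)]

rot = estimate_rotation(truth)       # recovered from "telemetry"
pred_next = truth[-1] * rot          # one-frame-ahead prediction
err = abs(pred_next - cmath.exp(1j * omega * 20))
```

Because each Fourier mode predicts independently with a scalar recursion like this, the total cost scales linearly in the number of modes, which is what makes the full-system operation count quoted above feasible.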
Chen, Qianting; Dai, Congling; Zhang, Qianjun; Du, Juan; Li, Wen
2016-10-01
To evaluate the prediction performance of five bioinformatics software tools (SIFT, PolyPhen2, MutationTaster, Provean, MutationAssessor). From our own database of genetic mutations collected over the past five years, a Chinese literature database, the Human Gene Mutation Database, and dbSNP, 121 missense mutations confirmed by functional studies and 121 missense mutations suspected to be pathogenic by pedigree analysis were used as the positive gold standard, while 242 missense mutations with minor allele frequency (MAF) > 5% in dominant hereditary diseases were used as the negative gold standard. The selected mutations were predicted with the five software tools. Based on the results, the performance of the five tools was evaluated in terms of sensitivity, specificity, positive predictive value, false positive rate, negative predictive value, false negative rate, false discovery rate, accuracy, and the receiver operating characteristic (ROC) curve. In terms of sensitivity, negative predictive value and false negative rate, the rank was MutationTaster, PolyPhen2, Provean, SIFT, and MutationAssessor. For specificity and false positive rate, the rank was MutationTaster, Provean, MutationAssessor, SIFT, and PolyPhen2. For positive predictive value and false discovery rate, the rank was MutationTaster, Provean, MutationAssessor, PolyPhen2, and SIFT. For area under the ROC curve (AUC) and accuracy, the rank was MutationTaster, Provean, PolyPhen2, MutationAssessor, and SIFT. Prediction performance may differ when different parameters are used. Among the five tools, MutationTaster showed the best prediction performance.
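The evaluation measures used above all derive from a 2×2 confusion matrix of predicted vs. gold-standard pathogenicity calls. The sketch below computes them; the counts are illustrative, not the study's data.

```python
def metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "fdr": fp / (tp + fp),           # false discovery rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Illustrative counts for a 242-positive / 242-negative gold standard.
m = metrics(tp=220, fp=30, tn=212, fn=22)
```

Note the built-in redundancies (e.g. FDR = 1 − PPV, false negative rate = 1 − sensitivity), which is why the rankings above come in paired groups.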
Srinivasulu, Yerukala Sathipati; Wang, Jyun-Rong; Hsu, Kai-Ti; Tsai, Ming-Ju; Charoenkwan, Phasit; Huang, Wen-Lin; Huang, Hui-Ling; Ho, Shinn-Ying
2015-01-01
Protein-protein interactions (PPIs) are involved in various biological processes, and the underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structural and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. It proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded a training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, outperforming existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (pKd) of 200 heterodimeric protein complexes. Prediction performance in a jackknife test was a correlation coefficient of 0.34 and a mean absolute error of 1.4. We further analyzed three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict the binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are greater than those in low binding affinity complexes.
Predicting DNA hybridization kinetics from sequence
NASA Astrophysics Data System (ADS)
Zhang, Jinny X.; Fang, John Z.; Duan, Wei; Wu, Lucia R.; Zhang, Angela W.; Dalchau, Neil; Yordanov, Boyan; Petersen, Rasmus; Phillips, Andrew; Zhang, David Yu
2018-01-01
Hybridization is a key molecular process in biology and biotechnology, but so far there is no predictive model for accurately determining hybridization rate constants based on sequence information. Here, we report a weighted neighbour voting (WNV) prediction algorithm, in which the hybridization rate constant of an unknown sequence is predicted based on its similarity to reactions with known rate constants. To construct this algorithm we first performed 210 fluorescence kinetics experiments to observe the hybridization kinetics of 100 different DNA target and probe pairs (36 nt sub-sequences of the CYCS and VEGF genes) at temperatures ranging from 28 to 55 °C. Automated feature selection and weighting optimization resulted in a final six-feature WNV model, which can predict hybridization rate constants of new sequences to within a factor of 3 with ∼91% accuracy, based on leave-one-out cross-validation. Accurate prediction of hybridization kinetics allows the design of efficient probe sequences for genomics research.
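The core WNV idea, predicting an unknown rate constant as a similarity-weighted average of known ones, can be sketched as follows. The two-feature toy vectors, the Gaussian similarity kernel, and all numbers are assumptions for illustration; the published model uses six selected sequence features with optimized weights.

```python
import math

def wnv_predict(query, neighbors, weights):
    """Weighted neighbour voting: predict log10(k) for a query as the
    similarity-weighted average over reference reactions with known rates.
    `neighbors` holds (feature_vector, log10_rate) pairs; `weights` scale
    each feature inside a Gaussian similarity kernel (toy choice)."""
    def similarity(a, b):
        d = sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))
        return math.exp(-d)
    total = sum(similarity(query, f) for f, _ in neighbors)
    return sum(similarity(query, f) * r for f, r in neighbors) / total

# A query nearly identical to the first reference dominates the vote.
refs = [([0.30, 0.50], 5.8), ([0.90, 0.10], 4.9)]
pred = wnv_predict([0.31, 0.50], refs, weights=[10.0, 10.0])
```

With a sharp enough kernel the prediction tracks the closest reference, which is the behaviour the leave-one-out evaluation in the abstract measures.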
Thomas, Reuben; Thomas, Russell S.; Auerbach, Scott S.; Portier, Christopher J.
2013-01-01
Background Several groups have employed genomic data from subchronic chemical toxicity studies in rodents (90 days) to derive gene-centric predictors of chronic toxicity and carcinogenicity. Genes are annotated as belonging to biological processes or molecular pathways that are mechanistically well understood and are described in public databases. Objectives To develop a molecular pathway-based prediction model of long-term hepatocarcinogenicity using 90-day gene expression data and to evaluate the performance of this model with respect to both intra-species, dose-dependent and cross-species predictions. Methods Genome-wide hepatic mRNA expression was retrospectively measured in B6C3F1 mice following subchronic exposure to twenty-six (26) chemicals (10 positive, 2 equivocal and 14 negative for liver tumors) previously studied by the US National Toxicology Program. Using these data, a pathway-based predictor model for long-term liver cancer risk was derived using random forests. The prediction model was independently validated on test sets associated with liver cancer risk obtained from mice, rats and humans. Results Using 5-fold cross-validation, the developed prediction model had reasonable predictive performance, with an area under the receiver operating characteristic curve (AUC) of 0.66. The model was then used to extrapolate the results to data associated with rat and human liver cancer. The extrapolated model worked well for both species (AUC of 0.74 for rats and 0.91 for humans). The prediction models implied a balanced interplay between all pathway responses leading to carcinogenicity predictions. Conclusions Pathway-based prediction models estimated from subchronic data hold promise for predicting long-term carcinogenicity and for extrapolating results across multiple species. PMID:23737943
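The AUC values quoted above (0.66, 0.74, 0.91) can be computed without drawing the ROC curve at all, via the rank (Mann-Whitney) formulation. A minimal sketch with toy scores, not the study's data:

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney formulation: the
    probability that a randomly chosen positive case receives a higher
    score than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy scores for 4 chemicals (label 1 = liver tumors observed, 0 = none):
# one positive is out-ranked by a negative, so AUC < 1.
example_auc = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

An AUC of 0.5 corresponds to random ranking, which is why 0.66 is described only as "reasonable" performance.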
Prediction of global and local model quality in CASP8 using the ModFOLD server.
McGuffin, Liam J
2009-01-01
The development of effective methods for predicting the quality of three-dimensional (3D) models is fundamentally important for the success of tertiary structure (TS) prediction strategies. Since CASP7, the Quality Assessment (QA) category has existed to gauge the ability of various model quality assessment programs (MQAPs) at predicting the relative quality of individual 3D models. For the CASP8 experiment, automated predictions were submitted in the QA category using two methods from the ModFOLD server-ModFOLD version 1.1 and ModFOLDclust. ModFOLD version 1.1 is a single-model machine learning based method, which was used for automated predictions of global model quality (QMODE1). ModFOLDclust is a simple clustering based method, which was used for automated predictions of both global and local quality (QMODE2). In addition, manual predictions of model quality were made using ModFOLD version 2.0--an experimental method that combines the scores from ModFOLDclust and ModFOLD v1.1. Predictions from the ModFOLDclust method were the most successful of the three in terms of the global model quality, whilst the ModFOLD v1.1 method was comparable in performance to other single-model based methods. In addition, the ModFOLDclust method performed well at predicting the per-residue, or local, model quality scores. Predictions of the per-residue errors in our own 3D models, selected using the ModFOLD v2.0 method, were also the most accurate compared with those from other methods. All of the MQAPs described are publicly accessible via the ModFOLD server at: http://www.reading.ac.uk/bioinf/ModFOLD/. The methods are also freely available to download from: http://www.reading.ac.uk/bioinf/downloads/. Copyright 2009 Wiley-Liss, Inc.
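The clustering strategy behind ModFOLDclust can be reduced to a simple consensus rule: a model's predicted quality is its mean structural similarity to every other model in the ensemble. The sketch below uses toy 1-D "models" and a stand-in similarity function; the real method compares full 3D models with GDT/TM-style scores.

```python
def consensus_quality(models, similarity):
    """Clustering-style QA: score each model by its mean pairwise
    similarity to all other models (more central = presumed better).
    `similarity` is any symmetric comparison function returning [0, 1]."""
    scores = {}
    for name, m in models.items():
        others = [similarity(m, n) for k, n in models.items() if k != name]
        scores[name] = sum(others) / len(others)
    return scores

# Toy "models" are lists of residue positions; similarity is the fraction
# of positions within 1.0 of each other (a crude GDT-like stand-in).
def sim(a, b):
    return sum(abs(x - y) <= 1.0 for x, y in zip(a, b)) / len(a)

models = {"m1": [1, 2, 3, 4], "m2": [1.1, 2.2, 2.9, 4.0], "m3": [5, 9, 0, 7]}
scores = consensus_quality(models, sim)
```

The outlier model m3 receives the lowest score, illustrating why consensus methods work well when most submitted models cluster near the correct fold.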
NASA Astrophysics Data System (ADS)
Li, Zhe; Feng, Jinchao; Liu, Pengyu; Sun, Zhonghua; Li, Gang; Jia, Kebin
2018-05-01
Temperature is usually treated as a fluctuation to be corrected for in near-infrared spectral measurement, and chemometric methods have been extensively studied to correct the effect of temperature variations. However, temperature can also be treated as a constructive parameter that provides detailed chemical information when systematically changed during the measurement. Our group has studied the relationship between temperature-induced spectral variation (TSVC) and normalized squared temperature. In this study, we focused on the influence of the temperature distribution in the calibration set. A multi-temperature calibration set selection (MTCS) method is proposed to improve prediction accuracy by considering the temperature distribution of the calibration samples. Furthermore, a double-temperature calibration set selection (DTCS) method is proposed based on MTCS and the relationship between TSVC and normalized squared temperature. We compare the prediction performance of PLS models based on random sampling and on the proposed methods. The results of experimental studies showed that prediction performance was improved by the proposed methods. MTCS and DTCS are therefore alternative methods for improving prediction accuracy in near-infrared spectral measurement.
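The MTCS idea, choosing calibration samples so their measurement temperatures cover the whole range instead of sampling at random, can be sketched with a simple equal-width binning scheme. The binning rule and sample data are illustrative assumptions, not the paper's selection algorithm.

```python
def mtcs_select(samples, n_bins=3, per_bin=1):
    """Stratified calibration-set selection over temperature: split the
    observed temperature range into `n_bins` equal-width bins and take
    `per_bin` samples from each, so every temperature region is covered.
    `samples` is a list of (sample_id, temperature_C) pairs."""
    temps = [t for _, t in samples]
    lo, hi = min(temps), max(temps)
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range
    chosen = []
    for b in range(n_bins):
        in_bin = [s for s in samples
                  if b == min(int((s[1] - lo) / width), n_bins - 1)]
        chosen.extend(sorted(in_bin, key=lambda s: s[1])[:per_bin])
    return chosen

# Six samples clustered around three temperatures; one is picked per bin.
samples = [("a", 20), ("b", 21), ("c", 30), ("d", 31), ("e", 40), ("f", 41)]
calibration = mtcs_select(samples)
```

A random draw of three samples could easily land all in one temperature cluster; the stratified pick guarantees the PLS model sees the full temperature span.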
Predicting human olfactory perception from chemical features of odor molecules.
Keller, Andreas; Gerkin, Richard C; Guan, Yuanfang; Dhurandhar, Amit; Turu, Gabor; Szalai, Bence; Mainland, Joel D; Ihara, Yusuke; Yu, Chung Wen; Wolfinger, Russ; Vens, Celine; Schietgat, Leander; De Grave, Kurt; Norel, Raquel; Stolovitzky, Gustavo; Cecchi, Guillermo A; Vosshall, Leslie B; Meyer, Pablo
2017-02-24
It is still not possible to predict whether a given molecule will have a perceived odor or what olfactory percept it will produce. We therefore organized the crowd-sourced DREAM Olfaction Prediction Challenge. Using a large olfactory psychophysical data set, teams developed machine-learning algorithms to predict sensory attributes of molecules based on their chemoinformatic features. The resulting models accurately predicted odor intensity and pleasantness and also successfully predicted 8 among 19 rated semantic descriptors ("garlic," "fish," "sweet," "fruit," "burnt," "spices," "flower," and "sour"). Regularized linear models performed nearly as well as random forest-based ones, with a predictive accuracy that closely approaches a key theoretical limit. These models help to predict the perceptual qualities of virtually any molecule with high accuracy and also reverse-engineer the smell of a molecule. Copyright © 2017, American Association for the Advancement of Science.
CD-Based Indices for Link Prediction in Complex Network.
Wang, Tao; Wang, Hongjue; Wang, Xiaoxia
2016-01-01
Many similarity-based algorithms have been designed to deal with the problem of link prediction in the past decade. In order to improve prediction accuracy, a novel cosine similarity index CD, based on the distance between nodes and the cosine value between vectors, is proposed in this paper. First, a node coordinate matrix is obtained from the node distances (this differs from the distance matrix), and the row vectors of the matrix are regarded as node coordinates. The cosine value between node coordinates is then used as their similarity index. A local community density index LD is also proposed. A series of CD-based indices, including CD-LD-k, CD*LD-k, CD-k and CDI, is then presented and applied to ten real networks. Experimental results demonstrate the effectiveness of the CD-based indices. The effects of the network clustering coefficient and assortative coefficient on the prediction accuracy of the indices are analyzed. CD-LD-k and CD*LD-k can improve prediction accuracy regardless of whether the network's assortative coefficient is negative or positive. According to the analysis of the relative precision of each method on each network, the CD-LD-k and CD*LD-k indices have excellent average performance and robustness. The CD and CD-k indices perform better on positively assortative networks than on negatively assortative networks. For negatively assortative networks, we improve and refine the CD index into the CDI index, combining the advantages of the CD index and the evolutionary mechanism of the BA network model. Experimental results reveal that the CDI index can increase the prediction accuracy of CD on negatively assortative networks.
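The core of the CD index, scoring a candidate link by the cosine between the two nodes' coordinate vectors, reduces to a standard cosine computation. The toy coordinates below are assumed for illustration; in the paper the coordinates are derived from node distances, not given directly.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two coordinate vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy node coordinates (rows of a hypothetical coordinate matrix);
# under a CD-style index, a higher cosine means a more likely link.
coords = {"a": [1.0, 0.0, 2.0], "b": [0.9, 0.1, 2.1], "c": [0.0, 3.0, 0.0]}
score_ab = cosine(coords["a"], coords["b"])
score_ac = cosine(coords["a"], coords["c"])
```

Nodes a and b point in nearly the same direction and score close to 1, while a and c are orthogonal and score 0, so the index would rank link (a, b) far above (a, c).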
Pekkala, Timo; Hall, Anette; Lötjönen, Jyrki; Mattila, Jussi; Soininen, Hilkka; Ngandu, Tiia; Laatikainen, Tiina; Kivipelto, Miia; Solomon, Alina
2017-01-01
This study aimed to develop a late-life dementia prediction model using a novel validated supervised machine learning method, the Disease State Index (DSI), in the Finnish population-based CAIDE study. The CAIDE study was based on previous population-based midlife surveys. CAIDE participants were re-examined twice in late-life, and the first late-life re-examination was used as the baseline for the present study. The main study population included 709 cognitively normal subjects at the first re-examination who returned for the second re-examination up to 10 years later (incident dementia n = 39). An extended population (n = 1009, incident dementia n = 151) also included non-participants/non-survivors, based on national register data. DSI was used to develop a dementia index based on first re-examination assessments. Performance in predicting dementia was assessed as the area under the ROC curve (AUC). AUCs for DSI were 0.79 and 0.75 for the main and extended populations, respectively. Included predictors were cognition, vascular factors, age, subjective memory complaints, and APOE genotype. The supervised machine learning method performed well in identifying comprehensive profiles for predicting dementia development up to 10 years later. DSI could thus be useful for identifying individuals who are most at risk and may benefit from dementia prevention interventions.
Peterson, Lenna X; Shin, Woong-Hee; Kim, Hyungrae; Kihara, Daisuke
2018-03-01
We report our group's performance for protein-protein complex structure prediction and scoring in Round 37 of the Critical Assessment of PRediction of Interactions (CAPRI), an objective assessment of protein-protein complex modeling. We demonstrated noticeable improvement in both prediction and scoring compared to previous rounds of CAPRI, with our human predictor group near the top of the rankings and our server scorer group at the top. This is the first time in CAPRI that a server has been the top scorer group. To predict protein-protein complex structures, we used both multi-chain template-based modeling (TBM) and our protein-protein docking program, LZerD. LZerD represents protein surfaces using 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. Because 3DZD are a soft representation of the protein surface, LZerD is tolerant to small conformational changes, making it well suited to docking unbound and TBM structures. The key to our improved performance in CAPRI Round 37 was to combine multi-chain TBM and docking. As opposed to our previous strategy of performing docking for all target complexes, we used TBM when multi-chain templates were available and docking otherwise. We also describe the combination of multiple scoring functions used by our server scorer group, which achieved the top rank for the scorer phase. © 2017 Wiley Periodicals, Inc.
Bao, Yu; Hayashida, Morihiro; Akutsu, Tatsuya
2016-11-25
Dicer is necessary for the process of mature microRNA (miRNA) formation because the Dicer enzyme cleaves pre-miRNA correctly to generate miRNA with correct seed regions. Nonetheless, the mechanism underlying the selection of a Dicer cleavage site is still not fully understood. To date, several studies have been conducted to solve this problem, for example, a recent discovery indicates that the loop/bulge structure plays a central role in the selection of Dicer cleavage sites. In accordance with this breakthrough, a support vector machine (SVM)-based method called PHDCleav was developed to predict Dicer cleavage sites which outperforms other methods based on random forest and naive Bayes. PHDCleav, however, tests only whether a position in the shift window belongs to a loop/bulge structure. In this paper, we used the length of loop/bulge structures (in addition to their presence or absence) to develop an improved method, LBSizeCleav, for predicting Dicer cleavage sites. To evaluate our method, we used 810 empirically validated sequences of human pre-miRNAs and performed fivefold cross-validation. In both 5p and 3p arms of pre-miRNAs, LBSizeCleav showed greater prediction accuracy than PHDCleav did. This result suggests that the length of loop/bulge structures is useful for prediction of Dicer cleavage sites. We developed a novel algorithm for feature space mapping based on the length of a loop/bulge for predicting Dicer cleavage sites. The better performance of our method indicates the usefulness of the length of loop/bulge structures for such predictions.
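The feature that distinguishes LBSizeCleav from PHDCleav is the length, not just the presence, of loop/bulge structures. Those lengths correspond to maximal runs of unpaired bases in a dot-bracket secondary structure, which can be extracted with a few lines; the window handling and feature encoding here are simplified relative to the paper.

```python
def unpaired_run_lengths(dot_bracket):
    """Return the lengths of maximal runs of unpaired bases ('.') in a
    dot-bracket string. These runs correspond to the loops/bulges whose
    sizes LBSizeCleav-style features are built from (simplified sketch)."""
    runs, current = [], 0
    for ch in dot_bracket:
        if ch == ".":
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # flush a trailing run of unpaired bases
        runs.append(current)
    return runs

# A small hairpin-like toy structure: two 2-nt bulges flanking a 3-nt loop.
lengths = unpaired_run_lengths("((..((...))..))")
```

Feeding these lengths (rather than a binary loop/bulge flag) into the SVM feature vector is the paper's reported source of improved accuracy over PHDCleav.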
United3D: a protein model quality assessment program that uses two consensus based methods.
Terashi, Genki; Oosawa, Makoto; Nakamura, Yuuki; Kanou, Kazuhiko; Takeda-Shitaka, Mayuko
2012-01-01
In protein structure prediction, such as template-based modeling and free modeling (ab initio modeling), the step that assesses the quality of protein models is very important. We have developed a model quality assessment (QA) program, United3D, that uses an optimized clustering method and a simple Cα atom contact-based potential. United3D automatically estimates quality scores (Qscore) for predicted protein models that are highly correlated with the actual quality (GDT_TS). The performance of United3D was tested in the ninth Critical Assessment of protein Structure Prediction (CASP9) experiment. In CASP9, United3D showed the lowest average loss of GDT_TS (5.3) among the participating QA methods. This result indicates that United3D's ability to identify high quality models among the models predicted by CASP9 servers on 116 targets was the best among the QA methods tested in CASP9. United3D also produced high average Pearson correlation coefficients (0.93) and acceptable Kendall rank correlation coefficients (0.68) between Qscore and GDT_TS. This performance was competitive with the other top-ranked QA methods tested in CASP9. These results indicate that United3D is a useful tool for selecting high quality models from many candidate model structures provided by various modeling methods. United3D will improve the accuracy of protein structure prediction.
A comparison of arcjet plume properties to model predictions
NASA Technical Reports Server (NTRS)
Cappelli, M. A.; Liebeskind, J. G.; Hanson, R. K.; Butler, G. W.; King, D. Q.
1993-01-01
This paper describes an experimental study of the plasma plume properties of a 1 kW class hydrogen arcjet thruster and the comparison of measured temperature and velocity field to model predictions. The experiments are based on laser-induced fluorescence excitation of the Balmer-alpha transition. The model is based on a single-fluid magnetohydrodynamic description of the flow originally developed to predict arcjet thruster performance. Excellent agreement between model predictions and experimental velocity is found, despite the complex nature of the flow. Measured and predicted exit plane temperatures are in disagreement by as much as 2000K over a range of operating conditions. The possible sources for this discrepancy are discussed.
Aerodynamics and thermal physics of helicopter ice accretion
NASA Astrophysics Data System (ADS)
Han, Yiqiang
Ice accretion on aircraft introduces significant loss in airfoil performance. A reduced lift-to-drag ratio reduces the vehicle's capability to maintain altitude and also limits its maneuverability. Current ice accretion performance degradation modeling approaches are calibrated only to a limited envelope of liquid water content, impact velocity, temperature, and water droplet size; consequently, inaccurate aerodynamic performance degradations are estimated. The reduced ice accretion prediction capabilities in the glaze ice regime are primarily due to a lack of knowledge of surface roughness induced by ice accretion. A comprehensive understanding of the ice roughness effects on airfoil heat transfer, ice accretion shapes, and ultimately aerodynamic performance is critical for the design of ice protection systems. Surface roughness effects on both heat transfer and aerodynamic performance degradation on airfoils have been experimentally evaluated. Novel techniques, such as ice molding and casting methods and transient heat transfer measurement using non-intrusive thermal imaging methods, were developed at the Adverse Environment Rotor Test Stand (AERTS) facility at Penn State. A novel heat transfer scaling method specifically for the turbulent flow regime was also conceived. A heat transfer scaling parameter, labeled the Coefficient of Stanton and Reynolds Number (CSR = St_x/Re_x^(-0.2)), has been validated against reference data found in the literature for rough flat plates with Reynolds number (Re) up to 1×10^7, for rough cylinders with Re ranging from 3×10^4 to 4×10^6, and for turbine blades with Re from 7.5×10^5 to 7×10^6. This is the first time that the effect of Reynolds number is shown to be successfully eliminated on heat transfer magnitudes measured on rough surfaces. Analytical models for ice roughness distribution, heat transfer prediction, and aerodynamic performance degradation due to ice accretion have also been developed.
The ice roughness prediction model was developed based on a set of 82 experimental measurements and was also compared to existing prediction tools. Two reference predictions found in the literature yielded 76% and 54% discrepancy with respect to experimental testing, whereas the proposed ice roughness prediction model was within 31% of the measurements. It must be noted that this accuracy is within the ice shape reproduction uncertainty of icing facilities. Based on the new ice roughness prediction model and the CSR heat transfer scaling method, an icing heat transfer model was developed. The approach achieved high accuracy in heat transfer prediction compared to experiments conducted at the AERTS facility: the discrepancy between predictions and experimental results was within +/-15%, which was within the measurement uncertainty range of the facility. By combining both the ice roughness and heat transfer predictions, and incorporating the modules into an existing ice prediction tool (LEWICE), improved prediction capability was obtained, especially for the glaze regime. With the ice shapes accreted at the AERTS facility and additional experiments found in the literature, 490 sets of experimental ice shapes and corresponding aerodynamic testing data were available. A physics-based empirical performance degradation tool was developed and achieved a mean absolute deviation of 33% when compared to the entire experimental dataset, whereas 60% to 243% discrepancies were observed using legacy drag penalty prediction tools. Rotor torque predictions coupling Blade Element Momentum Theory with the proposed drag performance degradation tool were conducted on a total of 17 validation cases. The coupled prediction tool achieved a 10% prediction error for clean rotor conditions and a 16% error for iced rotor conditions. It was also shown that additional roughness elements could affect the measured drag by up to 25% during experimental testing, emphasizing the need for realistic ice structures during aerodynamic modeling and testing for ice accretion.
PARTS: Probabilistic Alignment for RNA joinT Secondary structure prediction
Harmanci, Arif Ozgun; Sharma, Gaurav; Mathews, David H.
2008-01-01
A novel method is presented for joint prediction of alignment and common secondary structures of two RNA sequences. The joint consideration of common secondary structures and alignment is accomplished by structural alignment over a search space defined by the newly introduced motif called matched helical regions. The matched helical region formulation generalizes previously employed constraints for structural alignment and thereby better accommodates the structural variability within RNA families. A probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities is utilized for scoring structural alignments. Maximum a posteriori (MAP) common secondary structures, sequence alignment and joint posterior probabilities of base pairing are obtained from the model via a dynamic programming algorithm called PARTS. The advantage of the more general structural alignment of PARTS is seen in secondary structure predictions for the RNase P family. For this family, the PARTS MAP predictions of secondary structures and alignment perform significantly better than prior methods that utilize a more restrictive structural alignment model. For the tRNA and 5S rRNA families, the richer structural alignment model of PARTS does not offer a benefit and the method therefore performs comparably with existing alternatives. For all RNA families studied, the posterior probability estimates obtained from PARTS offer an improvement over posterior probability estimates from a single sequence prediction. When considering the base pairings predicted over a threshold value of confidence, the combination of sensitivity and positive predictive value is superior for PARTS than for the single sequence prediction. PARTS source code is available for download under the GNU public license at http://rna.urmc.rochester.edu. PMID:18304945
Prediction of microRNAs Associated with Human Diseases Based on Weighted k Most Similar Neighbors
Guo, Maozu; Guo, Yahong; Li, Jinbao; Ding, Jian; Liu, Yong; Dai, Qiguo; Li, Jin; Teng, Zhixia; Huang, Yufei
2013-01-01
Background The identification of human disease-related microRNAs (disease miRNAs) is important for further investigating their involvement in the pathogenesis of diseases. More experimentally validated miRNA-disease associations have been accumulated recently. On the basis of these associations, it is essential to predict disease miRNAs for various human diseases, as this provides reliable disease miRNA candidates for subsequent experimental studies. Methodology/Principal Findings It is known that miRNAs with similar functions are often associated with similar diseases and vice versa. Therefore, the functional similarity of two miRNAs has been successfully estimated by measuring the semantic similarity of their associated diseases. To effectively predict disease miRNAs, we calculated the functional similarity by incorporating the information content of disease terms and the phenotype similarity between diseases. Furthermore, members of a miRNA family or cluster are assigned higher weight, since they are more likely to be associated with similar diseases. A new prediction method, HDMP, based on weighted k most similar neighbors, is presented for predicting disease miRNAs. Experiments validated that HDMP achieved significantly higher prediction performance than existing methods. In addition, case studies examining prostatic neoplasms, breast neoplasms, and lung neoplasms showed that HDMP can uncover potential disease miRNA candidates. Conclusions The superior performance of HDMP can be attributed to the accurate measurement of miRNA functional similarity, the weight assignment based on miRNA family or cluster, and the effective prediction based on weighted k most similar neighbors. The online prediction and analysis tool is freely available at http://nclab.hit.edu.cn/hdmpred. PMID:23950912
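The HDMP scoring rule, a weighted vote over the k most functionally similar miRNAs with family/cluster members up-weighted, can be sketched as below. The miRNA names, similarity values, boost factor, and k are all hypothetical; the published method additionally builds its similarities from disease semantic and phenotype information.

```python
def hdmp_score(similarities, labels, family, k=3, family_boost=1.5):
    """Score a candidate miRNA for one disease: a weighted vote over its
    k most similar neighbors, where each neighbor contributes its
    (possibly family-boosted) similarity times its known association label
    (1 = associated with the disease, 0 = not). Illustrative sketch only."""
    weighted = []
    for name, sim in similarities.items():
        w = sim * (family_boost if name in family else 1.0)
        weighted.append((w, labels[name]))
    top = sorted(weighted, reverse=True)[:k]          # k largest weights
    return sum(w * y for w, y in top) / sum(w for w, _ in top)

# Hypothetical neighbors of one candidate miRNA: m2 is in the same family
# (boosted) but not associated, pulling the score below a plain average.
sims = {"m1": 0.9, "m2": 0.8, "m3": 0.2, "m4": 0.1}
labels = {"m1": 1, "m2": 0, "m3": 1, "m4": 0}
score = hdmp_score(sims, labels, family={"m2"}, k=3)
```

Candidates are then ranked by this score per disease, which is how the case studies on prostatic, breast, and lung neoplasms surface new candidates.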
Sammour, T; Cohen, L; Karunatillake, A I; Lewis, M; Lawrence, M J; Hunter, A; Moore, J W; Thomas, M L
2017-11-01
Recently published data support the use of a web-based risk calculator ( www.anastomoticleak.com ) for the prediction of anastomotic leak after colectomy. The aim of this study was to externally validate this calculator on a larger dataset. Consecutive adult patients undergoing elective or emergency colectomy for colon cancer at a single institution over a 9-year period were identified using the Binational Colorectal Cancer Audit database. Patients with a rectosigmoid cancer, an R2 resection, or a diverting ostomy were excluded. The primary outcome was anastomotic leak within 90 days as defined by previously published criteria. Area under receiver operating characteristic curve (AUROC) was derived and compared with that of the American College of Surgeons National Surgical Quality Improvement Program ® (ACS NSQIP) calculator and the colon leakage score (CLS) calculator for left colectomy. Commercially available artificial intelligence-based analytics software was used to further interrogate the prediction algorithm. A total of 626 patients were identified. Four hundred and fifty-six patients met the inclusion criteria, and 402 had complete data available for all the calculator variables (126 had a left colectomy). Laparoscopic surgery was performed in 39.6% and emergency surgery in 14.7%. The anastomotic leak rate was 7.2%, with 31.0% requiring reoperation. The anastomoticleak.com calculator was significantly predictive of leak and performed better than the ACS NSQIP calculator (AUROC 0.73 vs 0.58) and the CLS calculator (AUROC 0.96 vs 0.80) for left colectomy. Artificial intelligence-predictive analysis supported these findings and identified an improved prediction model. The anastomotic leak risk calculator is significantly predictive of anastomotic leak after colon cancer resection. Wider investigation of artificial intelligence-based analytics for risk prediction is warranted.
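The AUROC comparisons reported above rest on a simple statistic: the probability that a randomly chosen leak case is scored higher than a randomly chosen non-leak case. A minimal sketch of that computation (not the audit's actual evaluation code):

```python
def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic: the
    probability that a randomly chosen positive outranks a randomly
    chosen negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With this definition, an AUROC of 0.73 vs 0.58 means the calculator ranks a leak above a non-leak 73% of the time, versus 58% for the comparator.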
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wen, Haifang; Li, Xiaojun; Edil, Tuncer
The purpose of this study was to evaluate the performance of cementitious high carbon fly ash (CHCFA) stabilized recycled asphalt pavement as a base course material in a real world setting. Three test road cells were built at MnROAD facility in Minnesota. These cells have the same asphalt surface layers, subbases, and subgrades, but three different base courses: conventional crushed aggregates, untreated recycled pavement materials (RPM), and CHCFA stabilized RPM materials. During and after the construction of the three cells, laboratory and field tests were carried out to characterize the material properties. The test results were used in the mechanistic-empiricalmore » pavement design guide (MEPDG) to predict the pavement performance. Based on the performance prediction, the life cycle analyses of cost, energy consumption, and greenhouse gasses were performed. The leaching impacts of these three types of base materials were compared. The laboratory and field tests showed that fly ash stabilized RPM had higher modulus than crushed aggregate and RPM did. Based on the MEPDG performance prediction, the service life of the Cell 79 containing fly ash stabilized RPM, is 23.5 years, which is about twice the service life (11 years) of the Cell 77 with RPM base, and about three times the service life (7.5 years) of the Cell 78 with crushed aggregate base. The life cycle analysis indicated that the usage of the fly ash stabilized RPM as the base of the flexible pavement can significantly reduce the life cycle cost, the energy consumption, the greenhouse gases emission. Concentrations of many trace elements, particularly those with relatively low water quality standards, diminish over time as water flows through the pavement profile. For many elements, concentrations below US water drinking water quality standards are attained at the bottom of the pavement profile within 2-4 pore volumes of flow.« less
Petersen, Japke F; Stuiver, Martijn M; Timmermans, Adriana J; Chen, Amy; Zhang, Hongzhen; O'Neill, James P; Deady, Sandra; Vander Poorten, Vincent; Meulemans, Jeroen; Wennerberg, Johan; Skroder, Carl; Day, Andrew T; Koch, Wayne; van den Brekel, Michiel W M
2018-05-01
TNM-classification inadequately estimates patient-specific overall survival (OS). We aimed to improve this by developing a risk-prediction model for patients with advanced larynx cancer. Cohort study. We developed a risk prediction model to estimate the 5-year OS rate based on a cohort of 3,442 patients with T3T4N0N+M0 larynx cancer. The model was internally validated using bootstrapping samples and externally validated on patient data from five external centers (n = 770). The main outcome was performance of the model as tested by discrimination, calibration, and the ability to distinguish risk groups based on tertiles from the derivation dataset. The model performance was compared to a model based on T and N classification only. We included age, gender, T and N classification, and subsite as prognostic variables in the standard model. After external validation, the standard model had a significantly better fit than a model based on T and N classification alone (C statistic, 0.59 vs. 0.55, P < .001). The model was able to distinguish well among three risk groups based on tertiles of the risk score. Adding treatment modality to the model did not decrease the predictive power. As a post hoc analysis, we tested the added value of comorbidity as scored by American Society of Anesthesiologists score in a subsample, which increased the C statistic to 0.68. A risk prediction model for patients with advanced larynx cancer, consisting of readily available clinical variables, gives more accurate estimations of the estimated 5-year survival rate when compared to a model based on T and N classification alone. 2c. Laryngoscope, 128:1140-1145, 2018. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.
Peterson, Lenna X; Kim, Hyungrae; Esquivel-Rodriguez, Juan; Roy, Amitava; Han, Xusi; Shin, Woong-Hee; Zhang, Jian; Terashi, Genki; Lee, Matt; Kihara, Daisuke
2017-03-01
We report the performance of protein-protein docking predictions by our group for recent rounds of the Critical Assessment of Prediction of Interactions (CAPRI), a community-wide assessment of state-of-the-art docking methods. Our prediction procedure uses a protein-protein docking program named LZerD developed in our group. LZerD represents a protein surface with 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. The appropriate soft representation of protein surface with 3DZD makes the method more tolerant to conformational change of proteins upon docking, which adds an advantage for unbound docking. Docking was guided by interface residue prediction performed with BindML and cons-PPISP as well as literature information when available. The generated docking models were ranked by a combination of scoring functions, including PRESCO, which evaluates the native-likeness of residues' spatial environments in structure models. First, we discuss the overall performance of our group in the CAPRI prediction rounds and investigate the reasons for unsuccessful cases. Then, we examine the performance of several knowledge-based scoring functions and their combinations for ranking docking models. It was found that the quality of a pool of docking models generated by LZerD, that is whether or not the pool includes near-native models, can be predicted by the correlation of multiple scores. Although the current analysis used docking models generated by LZerD, findings on scoring functions are expected to be universally applicable to other docking methods. Proteins 2017; 85:513-527. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Babcock, Chad; Finley, Andrew O.; Bradford, John B.; Kolka, Randall K.; Birdsey, Richard A.; Ryan, Michael G.
2015-01-01
Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both residual spatial dependence and non-stationarity of model covariates through the introduction of spatial random effects. We explored this objective using four forest inventory datasets that are part of the North American Carbon Program, each comprising point-referenced measures of above-ground forest biomass and discrete LiDAR. For each dataset, we considered at least five regression model specifications of varying complexity. Models were assessed based on goodness of fit criteria and predictive performance using a 10-fold cross-validation procedure. Results showed that the addition of spatial random effects to the regression model intercept improved fit and predictive performance in the presence of substantial residual spatial dependence. Additionally, in some cases, allowing either some or all regression slope parameters to vary spatially, via the addition of spatial random effects, further improved model fit and predictive performance. In other instances, models showed improved fit but decreased predictive performance—indicating over-fitting and underscoring the need for cross-validation to assess predictive ability. The proposed Bayesian modeling framework provided access to pixel-level posterior predictive distributions that were useful for uncertainty mapping, diagnosing spatial extrapolation issues, revealing missing model covariates, and discovering locally significant parameters.
Assessment of driving-related skills for older drivers : traffic tech.
DOT National Transportation Integrated Search
2010-04-01
Relating behind-the-wheel driving performance to performance : on office-based screening tools is challenging. It is : important to use tools that are predictive of poor driving : performance (sensitivity), but also to find tools that do not : have h...
Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts
NASA Astrophysics Data System (ADS)
Balasubramanian, A.; Shamsuddin, R.; Prabhakaran, B.; Sawant, A.
2017-03-01
Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline and online, along with data sufficiency and skew compensation for class imbalances. The performance of different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant, and ensemble Adaboost. The prediction performance is evaluated using metrics such as accuracy, precision, recall, and the area under the curve (AUC) for the receiver operating characteristic curve. The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve the highest prediction accuracies (90.5-91.4%); (ii) the predictive modeling yields the lowest accuracies (50-60%) when the training data do not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive to real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96-0.98).
It also confirms the utility of prior data from previous patients, and also the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient look-ahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors.
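The windowed setup (features from a 1 min window, a shift flagged in the 5 s lookahead) can be illustrated with a crude z-score rule. This stands in for the learned classifiers in the study; the threshold `z` and the toy signal are assumptions.

```python
from statistics import mean, stdev

def label_baseline_shift(signal, window, lookahead, z=2.0):
    """Flag a baseline shift when the mean of the `lookahead` samples
    deviates from the preceding `window` mean by more than z baseline
    standard deviations. A crude z-score rule standing in for the
    learned classifiers; z is an assumed threshold."""
    base = signal[:window]
    ahead = signal[window:window + lookahead]
    return abs(mean(ahead) - mean(base)) > z * stdev(base)
```

In the paper's terms, the real classifiers replace this rule with features of the 1 min trajectory and a trained decision boundary, but the input/output contract is the same: past window in, shift/no-shift out.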
Ma, Yucheng; Wang, Qing; Yang, Jiayin; Yan, Lunan
2015-01-01
In order to provide a good match between donor and recipient in liver transplantation, four scoring systems [the product of donor age and Model for End-stage Liver Disease score (D-MELD), the score to predict survival outcomes following liver transplantation (SOFT), the balance of risk score (BAR), and the transplant risk index (TRI)] based on both donor and recipient parameters were designed. This study was conducted to evaluate the performance of the four scores in living donor liver transplantation (LDLT) and compare them with the MELD score. The clinical data of 249 adult patients undergoing LDLT in our center were retrospectively evaluated. The area under the receiver operating characteristic curves (AUCs) of each score were calculated and compared at 1-, 3-, 6-month and 1-year after LDLT. The BAR at 1-, 3-, 6-month and 1-year after LDLT and the D-MELD and TRI at 1-, 3- and 6-month after LDLT showed acceptable performances in the prediction of survival (AUC>0.6), while the SOFT showed poor discrimination at 6-month after LDLT (AUC = 0.569). In addition, the D-MELD and BAR displayed positive correlations with the length of ICU stay (D-MELD, p = 0.025; BAR, p = 0.022). The SOFT was correlated with the time of mechanical ventilation (p = 0.022). The D-MELD, BAR and TRI provided acceptable performance in predicting survival after LDLT. However, even though these scoring systems were based on both donor and recipient parameters, only the BAR provided better performance than the MELD in predicting 1-year survival after LDLT.
Charlier, Ruben; Caspers, Maarten; Knaeps, Sara; Mertens, Evelien; Lambrechts, Diether; Lefevre, Johan; Thomis, Martine
2017-03-01
Since both muscle mass and strength performance are polygenic in nature, the current study compared four genetic predisposition scores (GPS) in their ability to predict these phenotypes. Data were gathered within the framework of the first-generation Flemish Policy Research Centre "Sport, Physical Activity and Health" (2002-2004). Results are based on muscle characteristics data of 565 Flemish Caucasians (19-73 yr, 365 men). Skeletal muscle mass (SMM) was determined from bioelectrical impedance. The Biodex dynamometer was used to measure isometric (PTstatic120°) and isokinetic strength (PTdynamic60° and PTdynamic240°), ballistic movement speed (S20%), and muscular endurance (Work) of the knee extensors. Genotyping was done for 153 gene variants, selected on the basis of a literature search and the expression quantitative trait loci of selected genes. Four GPS were designed: a total GPS (based on the sum of all 153 variants, each favorable allele = score 1), a data-driven and a weighted GPS [respectively, the sum of favorable alleles of those variants with significant b-coefficients in stepwise regression (GPSdd), and the sum of these variants weighted with their respective partial r² (GPSw)], and an elastic net GPS (based on the variants selected by an elastic net regularization; GPSen). It was found that the four different GPS models were able to significantly predict up to ~7% of the variance in strength performance. GPSen made the best prediction of SMM and Work. However, this was not the case for the remaining strength performance parameters, where the best predictions were made by GPSdd and GPSw. Copyright © 2017 the American Physiological Society.
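The total and weighted GPS constructions described above (one point per favorable allele, optionally scaling each variant, e.g. by its partial r²) can be sketched as follows; the variant ids, alleles, and weights are made up for illustration:

```python
def gps(genotypes, favorable, weights=None):
    """Genetic predisposition score: count favorable alleles per variant
    (two alleles per genotype). With `weights`, each variant's count is
    scaled, e.g. by its partial r^2, as in the weighted GPS."""
    score = 0.0
    for variant, alleles in genotypes.items():
        w = 1.0 if weights is None else weights.get(variant, 0.0)
        score += w * sum(a == favorable[variant] for a in alleles)
    return score

# Hypothetical genotypes and favorable alleles.
genotypes = {"rs1": ("A", "A"), "rs2": ("G", "T")}
favorable = {"rs1": "A", "rs2": "T"}
```

The data-driven and elastic-net variants differ only in which variants enter the dictionary (stepwise-selected vs. elastic-net-selected), not in the scoring arithmetic.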
Study of CNG/diesel dual fuel engine's emissions by means of RBF neural network.
Liu, Zhen-tao; Fei, Shao-mei
2004-08-01
Great efforts have been made to resolve the serious environmental pollution and the inevitable decline of energy resources. A review of Chinese fuel reserves and engine technology showed that the compressed natural gas (CNG)/diesel dual fuel engine (DFE) was one of the best current solutions to these problems. In order to study and improve the emission performance of the CNG/diesel DFE, an emission model for the DFE based on a radial basis function (RBF) neural network was developed; it is a black-box model trained on input-output data that requires no a priori knowledge. The RBF centers and the connection weights could be selected automatically according to the distribution of the training data in input-output space and the given approximation error. Studies showed that the predicted results accorded well with the experimental data over a large range of operating conditions from low load to high load. The developed emissions model based on the RBF neural network could be used to successfully predict and optimize the emissions performance of the DFE. The effects of the main DFE performance parameters, such as rotation speed, load, pilot quantity, and injection timing, were also predicted by means of this model. In summary, an emission prediction model for the CNG/diesel DFE based on an RBF neural network was built for analyzing the effect of the main performance parameters on the CO and NOx emissions of the DFE. The predicted results agreed quite well with the traditional emissions model, which indicated that the model has certain application value, although it still has some limitations because of its high dependence on the quantity of the experimental sample data.
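Once trained, an RBF network predicts by summing Gaussian basis functions centred at the selected centers. A toy one-dimensional sketch of that forward pass (the engine model used multi-dimensional operating-condition inputs and automatically selected centers and weights):

```python
import math

def rbf_predict(x, centers, weights, gamma=1.0):
    """Forward pass of a trained RBF network: a weighted sum of Gaussian
    basis functions centred at `centers`. One-dimensional toy; gamma
    controls the basis-function width."""
    return sum(w * math.exp(-gamma * (x - c) ** 2)
               for c, w in zip(centers, weights))
```

In the emission model, `x` would be the operating-condition vector (speed, load, pilot quantity, injection timing) and the output a predicted emission quantity such as CO or NOx.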
Hernando, Barbara; Ibañez, Maria Victoria; Deserio-Cuesta, Julio Alberto; Soria-Navarro, Raquel; Vilar-Sastre, Inca; Martinez-Cadenas, Conrado
2018-03-01
Prediction of human pigmentation traits, one of the most differentiable externally visible characteristics among individuals, from biological samples represents a useful tool in the field of forensic DNA phenotyping. In spite of freckling being a relatively common pigmentation characteristic in Europeans, little is known about the genetic basis of this largely genetically determined phenotype in southern European populations. In this work, we explored the predictive capacity of eight freckle and sunlight sensitivity-related genes in 458 individuals (266 non-freckled controls and 192 freckled cases) from Spain. Four loci were associated with freckling (MC1R, IRF4, ASIP and BNC2), and female sex was also found to be a predictive factor for having a freckling phenotype in our population. After identifying the most informative genetic variants responsible for human ephelides occurrence in our sample set, we developed a DNA-based freckle prediction model using a multivariate regression approach. Once developed, the capabilities of the prediction model were tested by a repeated 10-fold cross-validation approach. The proportion of correctly predicted individuals using the DNA-based freckle prediction model was 74.13%. The implementation of sex into the DNA-based freckle prediction model slightly improved the overall prediction accuracy by 2.19% (76.32%). Further evaluation of the newly-generated prediction model was performed by assessing the model's performance in a new cohort of 212 Spanish individuals, reaching a classification success rate of 74.61%. Validation of this prediction model may be carried out in larger populations, including samples from different European populations. Further research to validate and improve this newly-generated freckle prediction model will be needed before its forensic application. 
Together with DNA tests already validated for eye and hair colour prediction, this freckle prediction model may lead to a substantially more detailed physical description of unknown individuals from DNA found at the crime scene. Copyright © 2017 Elsevier B.V. All rights reserved.
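The DNA-based freckle model is a multivariate (logistic) regression, so a predicted probability comes from the usual sigmoid of a linear score over the genetic variants (plus sex). A minimal sketch with placeholder coefficients, not the published estimates:

```python
import math

def predicted_probability(features, coef, intercept):
    """Predicted probability from a fitted logistic regression model.
    `features` would encode the MC1R/IRF4/ASIP/BNC2 genotypes and sex;
    the coefficients used here are placeholders, not published values."""
    z = intercept + sum(c * x for c, x in zip(coef, features))
    return 1.0 / (1.0 + math.exp(-z))
```

Classification success rates like the reported 74-76% then come from thresholding this probability (typically at 0.5) and comparing against observed phenotypes under cross-validation.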
Assessing deep and shallow learning methods for quantitative prediction of acute chemical toxicity.
Liu, Ruifeng; Madore, Michael; Glover, Kyle P; Feasel, Michael G; Wallqvist, Anders
2018-05-02
Animal-based methods for assessing chemical toxicity are struggling to meet testing demands. In silico approaches, including machine-learning methods, are promising alternatives. Recently, deep neural networks (DNNs) were evaluated and reported to outperform other machine-learning methods for quantitative structure-activity relationship modeling of molecular properties. However, most of the reported performance evaluations relied on global performance metrics, such as the root mean squared error (RMSE) between the predicted and experimental values of all samples, without considering the impact of sample distribution across the activity spectrum. Here, we carried out an in-depth analysis of DNN performance for quantitative prediction of acute chemical toxicity using several datasets. We found that the overall performance of DNN models on datasets of up to 30,000 compounds was similar to that of random forest (RF) models, as measured by the RMSE and correlation coefficients between the predicted and experimental results. However, our detailed analyses demonstrated that global performance metrics are inappropriate for datasets with a highly uneven sample distribution, because they show a strong bias for the most populous compounds along the toxicity spectrum. For highly toxic compounds, DNN and RF models trained on all samples performed much worse than the global performance metrics indicated. Surprisingly, our variable nearest neighbor method, which utilizes only structurally similar compounds to make predictions, performed reasonably well, suggesting that information of close near neighbors in the training sets is a key determinant of acute toxicity predictions.
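The variable nearest neighbor idea above (predict only from structurally similar training compounds, abstain otherwise) can be sketched as follows; the similarity function and threshold are placeholders, and real usage would compute similarity over molecular fingerprints:

```python
def vnn_predict(query, train, sim, h=0.8):
    """Variable nearest-neighbor style estimate: average the activities
    of all training compounds whose similarity to the query is at least
    h; return None (abstain) when none are close enough."""
    close = [y for x, y in train if sim(query, x) >= h]
    return sum(close) / len(close) if close else None

# Toy data: 1-D "structures" with similarity 1 - |a - b|.
train = [(0.0, 1.0), (0.1, 3.0), (0.9, 10.0)]
sim = lambda a, b: 1.0 - abs(a - b)
```

Abstaining when no close neighbors exist is what lets the method stay reliable for sparsely covered regions such as the highly toxic end of the spectrum.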
Missile Guidance Law Based on Robust Model Predictive Control Using Neural-Network Optimization.
Li, Zhijun; Xia, Yuanqing; Su, Chun-Yi; Deng, Jun; Fu, Jun; He, Wei
2015-08-01
In this brief, the utilization of robust model-based predictive control is investigated for the problem of missile interception. Treating the target acceleration as a bounded disturbance, a novel guidance law using model predictive control is developed by incorporating the missile's internal constraints. The combined model predictive approach can be transformed into a constrained quadratic programming (QP) problem, which may be solved using a linear variational inequality-based primal-dual neural network over a finite receding horizon. Online solutions to multiple parametric QP problems are used so that constrained optimal control decisions can be made in real time. Simulation studies are conducted to illustrate the effectiveness and performance of the proposed guidance control law for missile interception.
ERIC Educational Resources Information Center
Gandy, Robyn A.; Herial, Nabeel A.; Khuder, Sadik A.; Metting, Patricia J.
2008-01-01
This paper studies student performance predictions based on the United States Medical Licensure Exam (USMLE) Step 1. Subjects were second-year medical students from academic years of 2002 through 2006 (n = 711). Three measures of basic science knowledge (two curricular and one extracurricular) were evaluated as predictors of USMLE Step 1 scores.…
2012-10-01
Breakthrough pressure is an important parameter associated with the performance of water-resistant fabrics... predicted values based on the geometry of the samples and the surface energy of the components. The theoretical predictions, however, do not explain... (Edwards AFB, CA 93524; clearance date 7/20/2012.)
NASA Technical Reports Server (NTRS)
Omura, J. K.; Simon, M. K.
1982-01-01
A theory is presented for deducing and predicting the performance of transmitter/receivers for bandwidth-efficient modulations suitable for use on the linear satellite channel. The underlying principle is the development of receiver structures based on the maximum-likelihood decision rule. Performance prediction tools, e.g., channel cutoff rate and bit error probability transfer function bounds, are applied to these modulation/demodulation techniques.
Blagus, Rok; Lusa, Lara
2015-11-04
Prediction models are used in clinical research to develop rules that can be used to accurately predict the outcome of the patients based on some of their characteristics. They represent a valuable tool in the decision making process of clinicians and health policy makers, as they enable them to estimate the probability that patients have or will develop a disease, will respond to a treatment, or that their disease will recur. The interest devoted to prediction models in the biomedical community has been growing in the last few years. Often the data used to develop the prediction models are class-imbalanced as only few patients experience the event (and therefore belong to minority class). Prediction models developed using class-imbalanced data tend to achieve sub-optimal predictive accuracy in the minority class. This problem can be diminished by using sampling techniques aimed at balancing the class distribution. These techniques include under- and oversampling, where a fraction of the majority class samples are retained in the analysis or new samples from the minority class are generated. The correct assessment of how the prediction model is likely to perform on independent data is of crucial importance; in the absence of an independent data set, cross-validation is normally used. While the importance of correct cross-validation is well documented in the biomedical literature, the challenges posed by the joint use of sampling techniques and cross-validation have not been addressed. We show that care must be taken to ensure that cross-validation is performed correctly on sampled data, and that the risk of overestimating the predictive accuracy is greater when oversampling techniques are used. Examples based on the re-analysis of real datasets and simulation studies are provided. 
We identify some results from the biomedical literature where incorrect cross-validation was performed, and where we expect that the performance of oversampling techniques was heavily overestimated.
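The pitfall described above can be made concrete with a minimal sketch (pure NumPy; the 1-nearest-neighbour classifier and pure-noise data are illustrative assumptions, not the paper's setup). Oversampling before splitting leaks duplicated minority samples into the test folds, so the "wrong" estimate beats chance on data that carries no signal at all:

```python
import numpy as np

rng = np.random.default_rng(0)

def oversample(X, y):
    """Duplicate minority-class rows until the classes are balanced."""
    minority, majority = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    idx = np.concatenate([majority, minority, extra])
    return X[idx], y[idx]

def knn_accuracy(Xtr, ytr, Xte, yte):
    """1-nearest-neighbour accuracy (memorizes leaked duplicates perfectly)."""
    d = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    return float(np.mean(ytr[d.argmin(1)] == yte))

def cv_accuracy(X, y, sample_inside, k=5):
    folds = np.arange(len(y)) % k
    accs = []
    for f in range(k):
        tr, te = folds != f, folds == f
        Xtr, ytr = (oversample(X[tr], y[tr]) if sample_inside
                    else (X[tr], y[tr]))
        accs.append(knn_accuracy(Xtr, ytr, X[te], y[te]))
    return float(np.mean(accs))

# Pure-noise features: no honest classifier can beat chance here.
X = rng.normal(size=(200, 10))
y = (rng.random(200) < 0.25).astype(int)        # class-imbalanced labels

Xo, yo = oversample(X, y)                       # WRONG: oversample, then CV
wrong = cv_accuracy(Xo, yo, sample_inside=False)
right = cv_accuracy(X, y, sample_inside=True)   # RIGHT: sample inside each fold
print(f"oversample-then-CV: {wrong:.2f} (optimistic), "
      f"CV-then-oversample: {right:.2f}")
```

The only difference between the two estimates is where the sampling step sits relative to the fold split, which is exactly the error the authors flag in the literature.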
A link prediction approach to cancer drug sensitivity prediction.
Turki, Turki; Wei, Zhi
2017-10-03
Predicting the response to a drug for cancer patients based on genomic information is an important problem in modern clinical oncology. This problem arises in part because many available drug sensitivity prediction algorithms neither consider better-quality cancer cell lines nor adopt new feature representations, both of which lead to more accurate prediction of drug responses. By predicting accurate drug responses to cancer, oncologists gain a more complete understanding of the effective treatments for each patient, which is a core goal in precision medicine. In this paper, we model cancer drug sensitivity as a link prediction problem, which is shown to be an effective technique. We evaluate our proposed link prediction algorithms and compare them with an existing drug sensitivity prediction approach based on clinical trial data. The experimental results based on the clinical trial data show the stability of our link prediction algorithms, which yield the highest area under the ROC curve (AUC), with statistically significant improvements. We propose a link prediction approach to obtain a new feature representation. Compared with an existing approach, the results show that incorporating the new feature representation into the link prediction algorithms significantly improves performance.
Hultsch, D F; Hammer, M; Small, B J
1993-01-01
The predictive relationships among individual differences in self-reported physical health and activity life style and performance on an array of information processing and intellectual ability measures were examined. A sample of 484 men and women aged 55 to 86 years completed a battery of cognitive tasks measuring verbal processing time, working memory, vocabulary, verbal fluency, world knowledge, word recall, and text recall. Hierarchical regression was used to predict performance on these tasks from measures of self-reported physical health, alcohol and tobacco use, and level of participation in everyday activities. The results indicated: (a) individual differences in self-reported health and activity predicted performance on multiple cognitive measures; (b) self-reported health was more predictive of processing resource variables than knowledge-based abilities; (c) interaction effects indicated that participation in cognitively demanding activities was more highly related to performance on some measures for older adults than for middle-aged adults; and (d) age-related differences in performance on multiple measures were attenuated by partialing individual differences in self-reported health and activity.
AbdelRahman, Samir E; Zhang, Mingyuan; Bray, Bruce E; Kawamoto, Kensaku
2014-05-27
The aim of this study was to propose an analytical approach to develop high-performing predictive models for congestive heart failure (CHF) readmission using an operational dataset with incomplete records and changing data over time. Our analytical approach involves three steps: pre-processing, systematic model development, and risk factor analysis. For pre-processing, variables that were absent in >50% of records were removed. Moreover, the dataset was divided into a validation dataset and derivation datasets which were separated into three temporal subsets based on changes to the data over time. For systematic model development, using the different temporal datasets and the remaining explanatory variables, the models were developed by combining the use of various (i) statistical analyses to explore the relationships between the validation and the derivation datasets; (ii) adjustment methods for handling missing values; (iii) classifiers; (iv) feature selection methods; and (v) discretization methods. We then selected the best derivation dataset and the models with the highest predictive performance. For risk factor analysis, factors in the highest-performing predictive models were analyzed and ranked using (i) statistical analyses of the best derivation dataset, (ii) feature rankers, and (iii) a newly developed algorithm to categorize risk factors as being strong, regular, or weak. The analysis dataset consisted of 2,787 CHF hospitalizations at University of Utah Health Care from January 2003 to June 2013. In this study, we used the complete-case analysis and mean-based imputation adjustment methods; the wrapper subset feature selection method; and four ranking strategies based on information gain, gain ratio, symmetrical uncertainty, and wrapper subset feature evaluators.
The best-performing models resulted from the use of a complete-case analysis derivation dataset combined with the Class-Attribute Contingency Coefficient discretization method and a voting classifier which averaged the results of multinomial logistic regression and voting feature intervals classifiers. Of 42 final model risk factors, discharge disposition, discretized age, and indicators of anemia were the most significant. This model achieved a c-statistic of 86.8%. The proposed three-step analytical approach enhanced predictive model performance for CHF readmissions. It could potentially be leveraged to improve predictive model performance in other areas of clinical medicine.
Song, Xiaoying; Huang, Qijun; Chang, Sheng; He, Jin; Wang, Hao
2018-06-01
To improve the compression rates for lossless compression of medical images, an efficient algorithm, based on irregular segmentation and region-based prediction, is proposed in this paper. Considering that the first step of a region-based compression algorithm is segmentation, this paper proposes a hybrid method combining geometry-adaptive partitioning and quadtree partitioning to achieve adaptive irregular segmentation for medical images. Then, least-squares (LS)-based predictors are adaptively designed for each region (regular subblock or irregular subregion). The proposed adaptive algorithm not only exploits the spatial correlation between pixels but also utilizes local structure similarity, resulting in efficient compression performance. Experimental results show that the average compression performance of the proposed algorithm is 10.48, 4.86, 3.58, and 0.10% better than that of JPEG 2000, CALIC, EDP, and JPEG-LS, respectively.
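The LS prediction step can be sketched roughly as follows (a single global predictor over four causal neighbours on a synthetic ramp image; the paper instead fits one predictor per irregular region):

```python
import numpy as np

def ls_predict(img):
    """Least-squares pixel predictor over causal neighbours (W, N, NW, NE).
    Coefficients are fit globally here; a region-based codec fits per region."""
    h, w = img.shape
    # Causal-neighbour design matrix for every interior pixel.
    W_ = img[1:-1, :-2]; N = img[:-2, 1:-1]
    NW = img[:-2, :-2];  NE = img[:-2, 2:]
    A = np.c_[W_.ravel(), N.ravel(), NW.ravel(), NE.ravel()].astype(float)
    t = img[1:-1, 1:-1].ravel().astype(float)   # target pixels
    coef, *_ = np.linalg.lstsq(A, t, rcond=None)
    pred = (A @ coef).reshape(h - 2, w - 2)
    residual = t.reshape(h - 2, w - 2) - pred   # what the codec would entropy-code
    return coef, residual

rng = np.random.default_rng(3)
ramp = np.add.outer(np.arange(32), np.arange(32)).astype(float)  # smooth gradient
img = ramp + rng.normal(0, 0.1, ramp.shape)
coef, res = ls_predict(img)
print(np.abs(res).mean())   # small residual: the predictor captures the gradient
```

Small residuals are the whole point: a lossless codec stores only the prediction residual, so a better-adapted predictor directly shrinks the bitstream.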
Ma, Xiao H; Jia, Jia; Zhu, Feng; Xue, Ying; Li, Ze R; Chen, Yu Z
2009-05-01
Machine learning methods have been explored as ligand-based virtual screening tools for facilitating drug lead discovery. These methods predict compounds of specific pharmacodynamic, pharmacokinetic or toxicological properties based on their structure-derived structural and physicochemical properties. Increasing attention has been directed at these methods because of their capability to predict compounds of diverse structures and complex structure-activity relationships without requiring knowledge of the target 3D structure. This article reviews current progress in using machine learning methods for virtual screening of pharmacodynamically active compounds from large compound libraries, and analyzes and compares the reported performances of machine learning tools with those of structure-based and other ligand-based (such as pharmacophore and clustering) virtual screening methods. The feasibility of improving the performance of machine learning methods in screening large libraries is discussed.
Texture metric that predicts target detection performance
NASA Astrophysics Data System (ADS)
Culpepper, Joanne B.
2015-12-01
Two texture metrics based on gray level co-occurrence error (GLCE) are used to predict the probability of detection and mean search time. The two texture metrics are local clutter metrics based on the statistics of GLCE probability distributions. The degree of correlation between various clutter metrics and the target detection performance for the nine military vehicles in complex natural scenes of the Search_2 dataset is presented. Comparison is also made with four other common clutter metrics found in the literature: root sum of squares, Doyle, statistical variance, and target structure similarity. The experimental results show that the GLCE energy metric is a better predictor of target detection performance when searching for targets in natural scenes than the other clutter metrics studied.
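GLCE is a variant of standard gray-level co-occurrence statistics. A minimal sketch of the closely related GLCM energy measure (the offsets, level count and test patches here are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

def glcm(img, dx, dy, levels=8):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    h, w = img.shape
    P = np.zeros((levels, levels))
    for y in range(h - dy):
        for x in range(w - dx):
            P[img[y, x], img[y + dy, x + dx]] += 1
    return P / P.sum()

def glcm_energy(img, levels=8):
    """Energy (angular second moment), averaged over three offsets.
    Uniform texture -> energy near 1; cluttered texture -> near 0."""
    offsets = [(1, 0), (0, 1), (1, 1)]
    return float(np.mean([np.sum(glcm(img, dx, dy, levels) ** 2)
                          for dx, dy in offsets]))

rng = np.random.default_rng(1)
flat = np.zeros((32, 32), dtype=int)         # uniform patch (low clutter)
noisy = rng.integers(0, 8, size=(32, 32))    # random patch (high clutter)
e_flat, e_noisy = glcm_energy(flat), glcm_energy(noisy)
print(e_flat, e_noisy)   # the uniform patch scores far higher than the clutter
```

A low energy score thus signals a cluttered local background, which is the regime where search times grow and detection probability drops.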
Modeling Heavy/Medium-Duty Fuel Consumption Based on Drive Cycle Properties
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Lijuan; Duran, Adam; Gonder, Jeffrey
This paper presents multiple methods for predicting heavy/medium-duty vehicle fuel consumption based on driving cycle information. A polynomial model, a black box artificial neural net model, a polynomial neural network model, and a multivariate adaptive regression splines (MARS) model were developed and verified using data collected from chassis testing performed on a parcel delivery diesel truck operating over the Heavy Heavy-Duty Diesel Truck (HHDDT), City Suburban Heavy Vehicle Cycle (CSHVC), New York Composite Cycle (NYCC), and hydraulic hybrid vehicle (HHV) drive cycles. Each model was trained using one of the four drive cycles as a training cycle and the other three as testing cycles. By comparing the training and testing results, a representative training cycle was chosen and used to further tune each method. HHDDT as the training cycle gave the best predictive results, because HHDDT contains a variety of drive characteristics, such as high speed, acceleration, idling, and deceleration. Among the four model approaches, MARS gave the best predictive performance, with an average absolute percent error of -1.84% over the four chassis dynamometer drive cycles. To further evaluate the accuracy of the predictive models, the approaches were first applied to real-world data. MARS outperformed the other three approaches, providing an average absolute percent error of -2.2% over four real-world road segments. The MARS model performance over the HHDDT, CSHVC, NYCC, and HHV drive cycles was then compared with that of the Future Automotive System Technology Simulator (FASTSim). The results indicated that the MARS method achieved predictive performance comparable to FASTSim.
Predictors of outcomes following reablement in community-dwelling older adults.
Tuntland, Hanne; Kjeken, Ingvild; Langeland, Eva; Folkestad, Bjarte; Espehaug, Birgitte; Førland, Oddvar; Aaslund, Mona Kristin
2017-01-01
Reablement is a rehabilitation intervention for community-dwelling older adults, which has recently been implemented in several countries. Its purpose is to improve functional ability in daily occupations (everyday activities) perceived as important by the older person. Performance and satisfaction with performance in everyday life are the major outcomes of reablement. However, the evidence base concerning which factors predict better outcomes and who receives the greatest benefit in reablement is lacking. The objective of this study was to determine the potential factors that predict occupational performance and satisfaction with that performance at 10 weeks follow-up. The sample in this study was derived from a nationwide clinical controlled trial evaluating the effects of reablement in Norway and consisted of 712 participants living in 34 municipalities. Multiple linear regression was used to investigate possible predictors of occupational performance (COPM-P) and satisfaction with that performance (COPM-S) at 10 weeks follow-up based on the Canadian Occupational Performance Measure (COPM). The results indicate that the factors that significantly predicted better COPM-P and COPM-S outcomes at 10 weeks follow-up were higher baseline scores of COPM-P and COPM-S respectively, female sex, having a fracture as the major health condition and high motivation for rehabilitation. Conversely, the factors that significantly predicted poorer COPM-P and COPM-S outcomes were having a neurological disease other than stroke, having dizziness/balance problems as the major health condition and having pain/discomfort. In addition, having anxiety/depression was a predictor of poorer COPM-P outcomes. The two regression models explained 38.3% and 38.8% of the total variance of the dependent variables of occupational performance and satisfaction with that performance, respectively. 
The results indicate that diagnosis, functional level, sex and motivation are significant predictors of outcomes following reablement.
Luo, Jiawei; Xiao, Qiu
2017-02-01
MicroRNAs (miRNAs) play a critical role by regulating their targets at the post-transcriptional level. Identification of potential miRNA-disease associations will aid in deciphering the pathogenesis of human polygenic diseases. Several computational models have been developed to uncover novel miRNA-disease associations based on predicted target genes. However, due to the insufficient number of experimentally validated miRNA-target interactions as well as the relatively high false-positive and false-negative rates of predicted target genes, it is still challenging for these prediction models to obtain remarkable performances. The purpose of this study is to prioritize miRNA candidates for diseases. We first construct a heterogeneous network, which consists of a disease similarity network, a miRNA functional similarity network and a known miRNA-disease association network. Then, an unbalanced bi-random walk-based algorithm on the heterogeneous network (BRWH) is adopted to discover potential associations by exploiting bipartite subgraphs. Based on 5-fold cross-validation, the proposed network-based method achieves AUC values ranging from 0.782 to 0.907 for the 22 human diseases and an average AUC of almost 0.846. The experiments indicated that BRWH can achieve better performance compared with several popular methods. In addition, case studies of some common diseases further demonstrated the superior performance of our proposed method in prioritizing disease-related miRNA candidates. Copyright © 2017 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Wu, Zhihao; Lin, Youfang; Zhao, Yiji; Yan, Hongyan
2018-02-01
Networks can represent a wide range of complex systems, such as social, biological and technological systems. Link prediction is one of the most important problems in network analysis, and has attracted much research interest recently. Many link prediction methods have been proposed to solve this problem with various techniques. We note that clustering information plays an important role in solving the link prediction problem. In the previous literature, we find that the node clustering coefficient appears frequently in many link prediction methods. However, the node clustering coefficient is limited in describing the role of a common neighbor in different local networks, because it cannot distinguish the different clustering abilities of a node with respect to different node pairs. In this paper, we shift our focus from nodes to links, and propose the concept of the asymmetric link clustering (ALC) coefficient. Further, we improve three node-clustering-based link prediction methods via the concept of ALC. The experimental results demonstrate that ALC-based methods outperform node-clustering-based methods, especially achieving remarkable improvements on food web, hamster friendship and Internet networks. Besides, compared with other methods, the performance of ALC-based methods is very stable in both globalized and personalized top-L link prediction tasks.
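For context, the node-clustering baseline that ALC refines (a CCLP-style score: common neighbours weighted by their clustering coefficient) can be sketched on a toy graph; the graph itself is illustrative, not from the paper:

```python
from itertools import combinations

# Toy undirected network as adjacency sets.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def clustering(w):
    """Local (node) clustering coefficient of w."""
    nbrs = adj[w]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

def score(u, v):
    """CCLP-style link-prediction score: sum the clustering coefficients
    of the common neighbours of u and v."""
    return sum(clustering(w) for w in adj[u] & adj[v])

nonedges = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
ranked = sorted(nonedges, key=lambda p: score(*p), reverse=True)
print(ranked[0])   # → (0, 3): its common neighbours 1 and 2 are tightly clustered
```

The paper's criticism is visible here: node 3's single clustering coefficient weights every pair it mediates identically, which is precisely what the link-level ALC coefficient is designed to differentiate.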
Naro, Daniel; Rummel, Christian; Schindler, Kaspar; Andrzejak, Ralph G
2014-09-01
The rank-based nonlinear predictability score was recently introduced as a test for determinism in point processes. We here adapt this measure to time series sampled from time-continuous flows. We use noisy Lorenz signals to compare this approach against a classical amplitude-based nonlinear prediction error. Both measures show an almost identical robustness against Gaussian white noise. In contrast, when the amplitude distribution of the noise has a narrower central peak and heavier tails than the normal distribution, the rank-based nonlinear predictability score outperforms the amplitude-based nonlinear prediction error. For this type of noise, the nonlinear predictability score has a higher sensitivity for deterministic structure in noisy signals. It also yields a higher statistical power in a surrogate test of the null hypothesis of linear stochastic correlated signals. We show the high relevance of this improved performance in an application to electroencephalographic (EEG) recordings from epilepsy patients. Here the nonlinear predictability score again appears of higher sensitivity to nonrandomness. Importantly, it yields an improved contrast between signals recorded from brain areas where the first ictal EEG signal changes were detected (focal EEG signals) versus signals recorded from brain areas that were not involved at seizure onset (nonfocal EEG signals).
Method for Predicting the Energy Characteristics of Li-Ion Cells Designed for High Specific Energy
NASA Technical Reports Server (NTRS)
Bennett, William R.
2012-01-01
Novel electrode materials with increased specific capacity and voltage performance are critical to the NASA goals for developing Li-ion batteries with increased specific energy and energy density. Although performance metrics of the individual electrodes are critically important, a fundamental understanding of the interactions of electrodes in a full cell is essential to achieving the desired performance, and for establishing meaningful goals for electrode performance in the first place. This paper presents design considerations for matching positive and negative electrodes in a viable design. Methods for predicting cell-level performance, based on laboratory data for individual electrodes, are presented and discussed.
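The electrode-matching arithmetic such cell-level predictions build on can be sketched in a few lines (all numbers below are hypothetical illustration values, not from the paper): the capacity-limiting electrode sets cell capacity, and inactive mass dilutes specific energy.

```python
# Hypothetical electrode data (illustrative values only).
pos_cap_mAh_g, pos_mass_g, pos_V = 180.0, 2.0, 3.8   # positive electrode
neg_cap_mAh_g, neg_mass_g, neg_V = 340.0, 1.1, 0.1   # negative electrode
inactive_mass_g = 1.2          # foils, separator, electrolyte, packaging

cap_pos = pos_cap_mAh_g * pos_mass_g        # 360 mAh
cap_neg = neg_cap_mAh_g * neg_mass_g        # 374 mAh
cell_cap = min(cap_pos, cap_neg)            # limited by the smaller electrode
cell_V = pos_V - neg_V                      # average cell voltage, 3.7 V
total_mass_g = pos_mass_g + neg_mass_g + inactive_mass_g

# mAh * V / g = mWh/g, numerically equal to Wh/kg.
specific_energy_Wh_kg = cell_cap * cell_V / total_mass_g
print(f"{specific_energy_Wh_kg:.0f} Wh/kg")     # → 310 Wh/kg here
```

Note the negative-to-positive capacity ratio in this toy example (374/360 ≈ 1.04): designs typically keep it above 1, which is one of the matching considerations the paper discusses.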
Le, Duc-Hau; Pham, Van-Huy
2017-06-15
Finding gene-disease and disease-disease associations plays an important role in the biomedical area, and many prioritization methods have been proposed for this goal. Among them, approaches based on a heterogeneous network of genes and diseases are considered state-of-the-art, as they achieve high prediction performance and can be used for diseases with or without a known molecular basis. Here, we developed a Cytoscape app, HGPEC, based on a random walk with restart algorithm on a heterogeneous network of genes and diseases. This app can prioritize candidate genes and diseases by employing a heterogeneous network consisting of a network of genes/proteins and a phenotypic disease similarity network. Based on the rankings, novel disease-gene and disease-disease associations can be identified. These associations can be supported with network- and rank-based visualization as well as evidence and annotations from biomedical data. A case study on prediction of novel breast cancer-associated genes and diseases shows the abilities of HGPEC. In addition, we showed prominence in the performance of HGPEC compared to other tools for prioritization of candidate disease genes. Taken together, our app is expected to effectively predict novel disease-gene and disease-disease associations and to support network- and rank-based visualization as well as biomedical evidence for such associations.
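A random walk with restart of the kind HGPEC employs can be sketched in a few lines of NumPy (the toy gene-disease adjacency matrix and restart probability are illustrative assumptions):

```python
import numpy as np

def rwr(A, seed, restart=0.7, tol=1e-10):
    """Random walk with restart: steady-state visit probabilities."""
    W = A / A.sum(axis=0, keepdims=True)        # column-stochastic transitions
    p0 = np.zeros(len(A)); p0[seed] = 1.0
    p = p0.copy()
    while True:
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next

# Toy heterogeneous network: genes 0-2, diseases 3-4. The off-diagonal
# blocks hold known gene-disease links; all values are illustrative.
A = np.array([[0, 1, 1, 1, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 0, 0],
              [1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)

p = rwr(A, seed=3)                  # restart at disease node 3
gene_rank = np.argsort(-p[:3])      # prioritized candidate genes
print(gene_rank)                    # gene 0 (directly linked) ranks first
```

Genes reachable from the seed disease through short, well-connected paths accumulate the most probability mass, which is the basis of the candidate ranking.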
NASA Astrophysics Data System (ADS)
Caldararu, Silvia; Purves, Drew W.; Smith, Matthew J.
2017-04-01
Improving international food security under a changing climate and increasing human population will be greatly aided by improving our ability to modify, understand and predict crop growth. What we predominantly have at our disposal are either process-based models of crop physiology or statistical analyses of yield datasets, both of which suffer from various sources of error. In this paper, we present a generic process-based crop model (PeakN-crop v1.0) which we parametrise using a Bayesian model-fitting algorithm against three different data sources: satellite-based vegetation indices, eddy covariance productivity measurements and regional crop yields. We show that the model parametrised without data, based on prior knowledge of the parameters, can largely capture the observed behaviour, but the data-constrained model greatly improves the model fit and reduces prediction uncertainty. We investigate the extent to which each dataset contributes to the model performance and show that while all data improve on the prior model fit, the satellite-based data and crop yield estimates are particularly important for reducing model error and uncertainty. Despite these improvements, we conclude that there are still significant knowledge gaps in terms of available data for model parametrisation, but our study can help indicate the necessary data collection to improve our predictions of crop yields and crop responses to environmental changes.
Cao, Renzhi; Bhattacharya, Debswapna; Adhikari, Badri; Li, Jilong; Cheng, Jianlin
2015-01-01
Model evaluation and selection is an important step and a big challenge in template-based protein structure prediction. Individual model quality assessment methods designed for recognizing some specific properties of protein structures often fail to consistently select good models from a model pool because of their limitations. Therefore, combining multiple complementary quality assessment methods is useful for improving model ranking and consequently tertiary structure prediction. Here, we report the performance and analysis of our human tertiary structure predictor (MULTICOM) based on the massive integration of 14 diverse complementary quality assessment methods that was successfully benchmarked in the 11th Critical Assessment of Techniques for Protein Structure Prediction (CASP11). The predictions of MULTICOM for 39 template-based domains were rigorously assessed by six scoring metrics covering global topology of the Cα trace, local all-atom fitness, side chain quality, and physical reasonableness of the model. The results show that the massive integration of complementary, diverse single-model and multi-model quality assessment methods can effectively leverage the strength of single-model methods in distinguishing quality variation among similar good models and the advantage of multi-model quality assessment methods of identifying reasonable average-quality models. The overall excellent performance of the MULTICOM predictor demonstrates that integrating a large number of model quality assessment methods in conjunction with model clustering is a useful approach to improve the accuracy, diversity, and consequently robustness of template-based protein structure prediction. PMID:26369671
Wang, Mingyu; Han, Lijuan; Liu, Shasha; Zhao, Xuebing; Yang, Jinghua; Loh, Soh Kheang; Sun, Xiaomin; Zhang, Chenxi; Fang, Xu
2015-09-01
Renewable energy from lignocellulosic biomass has been deemed an alternative to depleting fossil fuels. In order to improve this technology, we aim to develop robust mathematical models for the enzymatic lignocellulose degradation process. By analyzing 96 groups of previously published and newly obtained lignocellulose saccharification results and fitting them to the Weibull distribution, we discovered that Weibull statistics can accurately predict lignocellulose saccharification data, regardless of the type of substrates, enzymes and saccharification conditions. A mathematical model for enzymatic lignocellulose degradation was subsequently constructed based on Weibull statistics. Further analysis of the mathematical structure of the model and experimental saccharification data showed the significance of the two parameters in this model. In particular, the λ value, defined as the characteristic time, represents the overall performance of the saccharification system. This suggestion was further supported by statistical analysis of experimental saccharification data and analysis of the glucose production levels when λ and n values change. In conclusion, the constructed Weibull statistics-based model can accurately predict lignocellulose hydrolysis behavior and we can use the λ parameter to assess the overall performance of enzymatic lignocellulose degradation. Advantages and potential applications of the model and the λ value in saccharification performance assessment were discussed. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
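Under one common Weibull parameterization (an assumption here, with synthetic data standing in for saccharification measurements), the characteristic time λ falls out of a simple linearized least-squares fit:

```python
import numpy as np

# Assumed Weibull-style saccharification curve:
#   y(t) = y_max * (1 - exp(-(t / lam)**n))
# lam (the "characteristic time") is when 63.2% of y_max is reached.
y_max, lam_true, n_true = 0.9, 12.0, 0.8
t = np.array([2, 4, 8, 16, 24, 48, 72], float)       # hours (synthetic)
y = y_max * (1 - np.exp(-(t / lam_true) ** n_true))  # yields (synthetic)

# Linearize: ln(-ln(1 - y/y_max)) = n*ln(t) - n*ln(lam), then least squares.
z = np.log(-np.log(1 - y / y_max))
n_fit, b = np.polyfit(np.log(t), z, 1)
lam_fit = np.exp(-b / n_fit)
print(f"n = {n_fit:.2f}, lambda = {lam_fit:.1f} h")  # recovers 0.80 and 12.0
```

A smaller fitted λ then indicates a faster-performing saccharification system, which is how the paper proposes using the parameter for performance assessment.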
NASA Astrophysics Data System (ADS)
Dolloff, John; Hottel, Bryant; Edwards, David; Theiss, Henry; Braun, Aaron
2017-05-01
This paper presents an overview of the Full Motion Video-Geopositioning Test Bed (FMV-GTB) developed to investigate algorithm performance and issues related to the registration of motion imagery and subsequent extraction of feature locations along with predicted accuracy. A case study is included corresponding to a video taken from a quadcopter. Registration of the corresponding video frames is performed without the benefit of a priori sensor attitude (pointing) information. In particular, tie points are automatically measured between adjacent frames using standard optical flow matching techniques from computer vision, an a priori estimate of sensor attitude is then computed based on supplied GPS sensor positions contained in the video metadata and a photogrammetric/search-based structure from motion algorithm, and then a Weighted Least Squares adjustment of all a priori metadata across the frames is performed. Extraction of absolute 3D feature locations, including their predicted accuracy based on the principles of rigorous error propagation, is then performed using a subset of the registered frames. Results are compared to known locations (check points) over a test site. Throughout this entire process, no external control information (e.g. surveyed points) is used other than for evaluation of solution errors and corresponding accuracy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, W; Sawant, A; Ruan, D
Purpose: The development of high dimensional imaging systems (e.g. volumetric MRI, CBCT, photogrammetry systems) in image-guided radiotherapy provides important pathways to the ultimate goal of real-time volumetric/surface motion monitoring. This study aims to develop a prediction method for the high dimensional state subject to respiratory motion. Compared to conventional linear dimension reduction based approaches, our method utilizes manifold learning to construct a descriptive feature submanifold, where more efficient and accurate prediction can be performed. Methods: We developed a prediction framework for the high-dimensional state subject to respiratory motion. The proposed method performs dimension reduction in a nonlinear setting to permit more descriptive features compared to its linear counterparts (e.g., classic PCA). Specifically, a kernel PCA is used to construct a proper low-dimensional feature manifold, where low-dimensional prediction is performed. A fixed-point iterative pre-image estimation method is applied subsequently to recover the predicted value in the original state space. We evaluated and compared the proposed method with a PCA-based method on 200 level-set surfaces reconstructed from surface point clouds captured by the VisionRT system. The prediction accuracy was evaluated with respect to root-mean-squared-error (RMSE) for both 200 ms and 600 ms lookahead lengths. Results: The proposed method outperformed the PCA-based approach with statistically higher prediction accuracy. In a one-dimensional feature subspace, our method achieved mean prediction accuracy of 0.86 mm and 0.89 mm for 200 ms and 600 ms lookahead lengths respectively, compared to 0.95 mm and 1.04 mm from the PCA-based method. Paired t-tests further demonstrated the statistical significance of the superiority of our method, with p-values of 6.33e-3 and 5.78e-5, respectively.
Conclusion: The proposed approach benefits from the descriptiveness of a nonlinear manifold and the prediction reliability in such a low dimensional manifold. The fixed-point iterative approach turns out to work well practically for the pre-image recovery. Our approach is particularly suitable for managing respiratory motion in image-guided radiotherapy. This work is supported in part by NIH grant R01 CA169102-02.
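The reduce-predict-recover pipeline can be sketched with scikit-learn (synthetic data; note that scikit-learn's pre-image estimate is kernel-ridge based rather than the fixed-point iteration used in the study, and the linear extrapolation stands in for a trained low-dimensional predictor):

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(2)

# Synthetic stand-in for a surface-state history: 100-dimensional states
# driven by a single periodic "respiratory" phase.
phase = np.linspace(0, 6 * np.pi, 300)
basis = rng.normal(size=(2, 100))
X = np.c_[np.sin(phase), np.cos(phase)] @ basis           # (300, 100)

# Step 1: nonlinear dimension reduction onto a 2-D feature manifold.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=1e-3,
                 fit_inverse_transform=True, alpha=0.1)
Z = kpca.fit_transform(X[:-1])                             # train on history

# Step 2: predict the next feature vector in the low-dimensional space
# (linear extrapolation here, standing in for a trained predictor).
z_pred = (2 * Z[-1] - Z[-2]).reshape(1, -1)

# Step 3: recover the high-dimensional state via the pre-image map.
x_pred = kpca.inverse_transform(z_pred)
rmse = float(np.sqrt(np.mean((x_pred[0] - X[-1]) ** 2)))
print(f"predicted-state RMSE: {rmse:.3f}")
```

The design point is the same as the abstract's: prediction happens in the cheap low-dimensional feature space, and only the final recovery step touches the full state dimension.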
Broad band solar EUV absorption in the earth's upper atmosphere.
NASA Technical Reports Server (NTRS)
Allen, K. H.; Rense, W. A.
1973-01-01
Observational data on solar radiation intensity, based on measurements performed as a function of time for three broad wavelength bands between 280 and 1030 A by a wheel spectrometer on OSO 5 during sunrise and sunset, are compared with predicted intensity variations based on CIRA models. The differences between sunrise and sunset data, as well as those between observed and predicted data, are discussed.
Airplane takeoff and landing performance monitoring system
NASA Technical Reports Server (NTRS)
Middleton, David B. (Inventor); Srivatsan, Raghavachari (Inventor); Person, Lee H. (Inventor)
1989-01-01
The invention is a real-time takeoff and landing performance monitoring system which provides the pilot with graphic and metric information to assist in decisions related to achieving rotation speed (V_R) within the safe zone of the runway, or stopping the aircraft on the runway after landing or a takeoff abort. The system processes information in two segments: a pretakeoff segment and a real-time segment. One-time inputs of ambient conditions and airplane configuration information are used in the pretakeoff segment to generate scheduled performance data. The real-time segment uses the scheduled performance data, runway length data and transducer-measured parameters to monitor the performance of the airplane throughout the takeoff roll. An important feature of this segment is that it updates the estimated runway rolling friction coefficient. Airplane performance predictions also reflect changes in headwind occurring as the takeoff roll progresses. The system displays the position of the airplane on the runway, indicating runway used and runway available, summarizes the critical information into a situation advisory flag, flags engine failures and off-nominal acceleration performance, and indicates where on the runway particular events such as decision speed (V_1), rotation speed (V_R) and expected stop points will occur based on actual or predicted performance. The display also indicates airspeed, wind vector, engine pressure ratios, second segment climb speed, and balanced field length (BFL). The system detects performance deficiencies by comparing the airplane's present performance with a predicted nominal performance based upon the given conditions.
NASA Astrophysics Data System (ADS)
Hardikar, Kedar Y.; Liu, Bill J. J.; Bheemreddy, Venkata
2016-09-01
Gaining an understanding of degradation mechanisms and their characterization are critical in developing relevant accelerated tests to ensure PV module performance warranty over a typical lifetime of 25 years. As newer technologies are adapted for PV, including new PV cell technologies, new packaging materials, and newer product designs, the availability of field data over extended periods of time for product performance assessment cannot be expected within the typical timeframe for business decisions. In this work, to enable product design decisions and product performance assessment for PV modules utilizing newer technologies, Simulation and Mechanism based Accelerated Reliability Testing (SMART) methodology and empirical approaches to predict field performance from accelerated test results are presented. The method is demonstrated for field life assessment of flexible PV modules based on degradation mechanisms observed in two accelerated tests, namely, Damp Heat and Thermal Cycling. The method is based on design of accelerated testing scheme with the intent to develop relevant acceleration factor models. The acceleration factor model is validated by extensive reliability testing under different conditions going beyond the established certification standards. Once the acceleration factor model is validated for the test matrix a modeling scheme is developed to predict field performance from results of accelerated testing for particular failure modes of interest. Further refinement of the model can continue as more field data becomes available. While the demonstration of the method in this work is for thin film flexible PV modules, the framework and methodology can be adapted to other PV products.
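Acceleration factor models for thermally driven degradation are commonly Arrhenius-type in temperature. As a hedged illustration of how accelerated-test hours map to field hours (the activation energy and temperatures below are assumptions for the sketch, not values from the study):

```python
import math

# Arrhenius-type acceleration factor, a common functional form for the
# acceleration-factor models described above. Ea and the temperatures are
# illustrative assumptions, not values from the study.
K_B = 8.617e-5  # Boltzmann constant, eV/K

def acceleration_factor(ea_ev, t_use_c, t_test_c):
    """AF = exp(Ea/k * (1/T_use - 1/T_test)), temperatures converted to kelvin."""
    t_use = t_use_c + 273.15
    t_test = t_test_c + 273.15
    return math.exp(ea_ev / K_B * (1.0 / t_use - 1.0 / t_test))

# Damp Heat chambers run at 85 C; assume a 25 C use climate and Ea = 0.7 eV.
af = acceleration_factor(0.7, 25.0, 85.0)
# Field hours represented by 1000 h in the chamber under these assumptions:
field_hours = 1000.0 * af
```

Validating such a model against reliability tests at several stress levels, as the abstract describes, is what justifies extrapolating chamber hours to a 25-year warranty claim.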
Yuan, Qingjun; Gao, Junning; Wu, Dongliang; Zhang, Shihua; Mamitsuka, Hiroshi; Zhu, Shanfeng
2016-01-01
Motivation: Identifying drug–target interactions is an important task in drug discovery. To reduce heavy time and financial cost in experimental way, many computational approaches have been proposed. Although these approaches have used many different principles, their performance is far from satisfactory, especially in predicting drug–target interactions of new candidate drugs or targets. Methods: Approaches based on machine learning for this problem can be divided into two types: feature-based and similarity-based methods. Learning to rank is the most powerful technique in the feature-based methods. Similarity-based methods are well accepted, due to their idea of connecting the chemical and genomic spaces, represented by drug and target similarities, respectively. We propose a new method, DrugE-Rank, to improve the prediction performance by nicely combining the advantages of the two different types of methods. That is, DrugE-Rank uses LTR, for which multiple well-known similarity-based methods can be used as components of ensemble learning. Results: The performance of DrugE-Rank is thoroughly examined by three main experiments using data from DrugBank: (i) cross-validation on FDA (US Food and Drug Administration) approved drugs before March 2014; (ii) independent test on FDA approved drugs after March 2014; and (iii) independent test on FDA experimental drugs. Experimental results show that DrugE-Rank outperforms competing methods significantly, especially achieving more than 30% improvement in Area under Prediction Recall curve for FDA approved new drugs and FDA experimental drugs. Availability: http://datamining-iip.fudan.edu.cn/service/DrugE-Rank Contact: zhusf@fudan.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307615
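The core idea, scores from several similarity-based component methods used as features for a learned ranker, can be sketched with a toy pairwise-ranking loop. Everything below (the component scores, the perceptron-style update) is an illustrative stand-in, not DrugE-Rank's actual components or LTR algorithm.

```python
# Toy learning-to-rank over similarity-based component scores. Each candidate
# target is (feature vector of component scores, is_true_target); the pairwise
# perceptron update is a stand-in for a real LTR method such as LambdaMART.
candidates = [
    ([0.9, 0.2, 0.8], 1),
    ([0.4, 0.3, 0.5], 0),
    ([0.8, 0.9, 0.7], 1),
    ([0.1, 0.4, 0.2], 0),
]

w = [0.0, 0.0, 0.0]

def score(x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Pairwise training: every true target should outrank every non-target.
for _ in range(20):
    for xp, yp in candidates:
        for xn, yn in candidates:
            if yp == 1 and yn == 0 and score(xp) <= score(xn):
                for i in range(3):
                    w[i] += xp[i] - xn[i]

ranked = sorted(candidates, key=lambda c: -score(c[0]))
top2_labels = [y for _, y in ranked[:2]]  # true targets ranked first
```

The ensemble benefit the paper reports comes from the ranker weighting component methods by how informative their similarity signals are.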
NASA Technical Reports Server (NTRS)
Sinha, Neeraj; Brinckman, Kevin; Jansen, Bernard; Seiner, John
2011-01-01
A method was developed of obtaining propulsive base flow data in both hot and cold jet environments, at Mach numbers and altitude of relevance to NASA launcher designs. The base flow data was used to perform computational fluid dynamics (CFD) turbulence model assessments of base flow predictive capabilities in order to provide increased confidence in base thermal and pressure load predictions obtained from computational modeling efforts. Predictive CFD analyses were used in the design of the experiments, available propulsive models were used to reduce program costs and increase success, and a wind tunnel facility was used. The data obtained allowed assessment of CFD/turbulence models in a complex flow environment, working within a building-block procedure to validation, where cold, non-reacting test data was first used for validation, followed by more complex reacting base flow validation.
ERIC Educational Resources Information Center
Kern, Ben D.; Graber, Kim C.; Shen, Sa; Hillman, Charles H.; McLoughlin, Gabriella
2018-01-01
Background: Socioeconomic status (SES) is the most accurate predictor of academic performance in US schools. Third-grade reading is highly predictive of high school graduation. Chronic physical activity (PA) is shown to improve cognition and academic performance. We hypothesized that school-based PA opportunities (recess and physical education)…
Benchmark data sets for structure-based computational target prediction.
Schomburg, Karen T; Rarey, Matthias
2014-08-25
Structure-based computational target prediction methods identify potential targets for a bioactive compound. Methods based on protein-ligand docking still face many challenges, the greatest probably being the ranking of true targets in a large data set of protein structures. Currently, no standard data sets for evaluation exist, rendering comparison and demonstration of improvements of methods cumbersome. Therefore, we propose two data sets and evaluation strategies for a meaningful evaluation of new target prediction methods, i.e., a small data set consisting of three target classes for detailed proof-of-concept and selectivity studies and a large data set consisting of 7992 protein structures and 72 drug-like ligands allowing statistical evaluation with performance metrics on a drug-like chemical space. Both data sets are built from openly available resources, and any information needed to perform the described experiments is reported. We describe the composition of the data sets, the setup of screening experiments, and the evaluation strategy. Performance metrics capable of measuring early recognition of enrichment, such as AUC, BEDROC, and NSLR, are proposed. We apply a sequence-based target prediction method to the large data set to analyze its content of nontrivial evaluation cases. The proposed data sets are used for method evaluation of our new inverse screening method iRAISE. The small data set reveals the method's capability and limitations in selectively distinguishing between rather similar protein structures. The large data set simulates real target identification scenarios. iRAISE achieves excellent or good enrichment in 55% of cases, a median AUC of 0.67, and RMSDs below 2.0 Å for 74%, and it predicted the first true target within the top 2% of the roughly 8000-structure protein data set in 59 of 72 cases.
Evaluation of the Performance of Texas Pavements Made with Different Coarse Aggregates
DOT National Transportation Integrated Search
2000-10-01
This report summarizes 23 years of work undertaken in Texas to understand the reasons for significant performance differences found in pavements placed around the state. To a significant degree, pavement performance can be predicted based on the conc...
Big Data Toolsets to Pharmacometrics: Application of Machine Learning for Time-to-Event Analysis.
Gong, Xiajing; Hu, Meng; Zhao, Liang
2018-05-01
Additional value can be potentially created by applying big data tools to address pharmacometric problems. The performances of machine learning (ML) methods and the Cox regression model were evaluated based on simulated time-to-event data synthesized under various preset scenarios, i.e., with linear vs. nonlinear and dependent vs. independent predictors in the proportional hazard function, or with high-dimensional data featured by a large number of predictor variables. Our results showed that ML-based methods outperformed the Cox model in prediction performance as assessed by concordance index and in identifying the preset influential variables for high-dimensional data. The prediction performances of ML-based methods are also less sensitive to data size and censoring rates than the Cox regression model. In conclusion, ML-based methods provide a powerful tool for time-to-event analysis, with a built-in capacity for high-dimensional data and better performance when the predictor variables assume nonlinear relationships in the hazard function. © 2018 The Authors. Clinical and Translational Science published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.
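The concordance index used above to compare ML methods with the Cox model can be computed directly from (time, event, risk-score) triples. Below is a minimal pure-Python implementation on simulated data, for illustration only; it is not the study's code.

```python
import random

# Simulate time-to-event data: higher risk score -> higher hazard -> shorter
# survival time, with ~30% random censoring.
random.seed(0)
n = 200
risk = [random.random() for _ in range(n)]
time = [random.expovariate(0.5 + 2.0 * r) for r in risk]
event = [random.random() > 0.3 for _ in range(n)]

def c_index(time, event, risk):
    """Fraction of usable pairs where the higher-risk subject fails first."""
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        for j in range(len(time)):
            # A pair is usable only if the earlier time is an observed event.
            if time[i] < time[j] and event[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

c = c_index(time, event, risk)  # well above 0.5 for an informative score
```

A C-index of 0.5 is chance-level ranking; 1.0 is perfect concordance between predicted risk and observed failure order.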
Wen, Ping-Ping; Shi, Shao-Ping; Xu, Hao-Dong; Wang, Li-Na; Qiu, Jian-Ding
2016-10-15
As one of the most important reversible types of post-translational modification, protein methylation catalyzed by methyltransferases carries many pivotal biological functions and underlies many essential biological processes. Identification of methylation sites is a prerequisite for decoding methylation regulatory networks in living cells and understanding their physiological roles. Experimental methods are labor-intensive and time-consuming, while in silico approaches offer a cost-effective, high-throughput way to predict potential methylation sites; however, previous predictors provide only a mixed model, and their prediction performance is not fully satisfactory. Recently, with the increasing availability of quantitative methylation datasets in diverse species (especially in eukaryotes), there is a growing need to develop species-specific predictors. Here, we designed a tool named PSSMe, based on an information gain (IG) feature optimization method, for species-specific methylation site prediction. The IG method was adopted to analyze the importance and contribution of each feature, then select the valuable feature dimensions to reconstitute a new ordered feature vector, which was applied to build the final prediction model. Our method improves prediction accuracy by about 15% compared with single features. Furthermore, our species-specific model significantly improves predictive performance compared with other general methylation prediction tools. Hence, our prediction results serve as a useful resource to elucidate the mechanism of arginine or lysine methylation and to facilitate hypothesis-driven experimental design and validation. The tool's online service is implemented in C# and freely available at http://bioinfo.ncu.edu.cn/PSSMe.aspx Contact: jdqiu@ncu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
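Information gain, the feature-scoring criterion PSSMe uses, is the reduction in label entropy from conditioning on a feature. A minimal implementation on a made-up binary dataset:

```python
import math

def entropy(labels):
    """Shannon entropy of a list of discrete labels, in bits."""
    h = 0.0
    for v in set(labels):
        p = labels.count(v) / len(labels)
        h -= p * math.log2(p)
    return h

def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - sum_x p(x) * H(Y | X = x)."""
    total = entropy(labels)
    for v in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == v]
        total -= len(subset) / len(labels) * entropy(subset)
    return total

labels        = [1, 1, 1, 0, 0, 0]
informative   = [1, 1, 1, 0, 0, 0]  # perfectly predicts the label -> IG = 1 bit
uninformative = [1, 0, 1, 0, 1, 0]  # nearly independent of the label -> IG ~ 0

ig_hi = information_gain(informative, labels)
ig_lo = information_gain(uninformative, labels)
```

Ranking feature dimensions by this score and keeping only high-IG ones is the kind of selection step the abstract describes before model building.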
Varying execution discipline to increase performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Campbell, P.L.; Maccabe, A.B.
1993-12-22
This research investigates the relationship between execution discipline and performance. The hypothesis has two parts: 1. Different execution disciplines exhibit different performance for different computations, and 2. These differences can be effectively predicted by heuristics. A machine model is developed that can vary its execution discipline. That is, the model can execute a given program using either the control-driven, data-driven or demand-driven execution discipline. This model is referred to as a "variable-execution-discipline" machine. The instruction set for the model is the Program Dependence Web (PDW). The first part of the hypothesis will be tested by simulating the execution of the machine model on a suite of computations, based on the Livermore Fortran Kernel (LFK) Test (a.k.a. the Livermore Loops), using all three execution disciplines. Heuristics are developed to predict relative performance. These heuristics predict (a) the execution time under each discipline for one iteration of each loop and (b) the number of iterations taken by that loop; then the heuristics use those predictions to develop a prediction for the execution of the entire loop. Similar calculations are performed for branch statements. The second part of the hypothesis will be tested by comparing the results of the simulated execution with the predictions produced by the heuristics. If the hypothesis is supported, then the door is open for the development of machines that can vary execution discipline to increase performance.
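The loop heuristic described above, per-iteration cost under each discipline times predicted iteration count, picking the cheapest discipline, can be sketched in a few lines. The cycle counts below are invented for illustration.

```python
# Heuristic whole-loop cost: (predicted cycles per iteration under each
# execution discipline) x (predicted iteration count). The numbers are
# hypothetical, not measurements from the LFK suite.
per_iteration = {
    "control-driven": 120,
    "data-driven":    95,
    "demand-driven":  140,
}
predicted_iterations = 1000

loop_cost = {d: t * predicted_iterations for d, t in per_iteration.items()}
best = min(loop_cost, key=loop_cost.get)  # discipline predicted cheapest
```

A variable-execution-discipline machine would use such a prediction to choose the discipline before entering the loop.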
DOT National Transportation Integrated Search
2013-12-01
Travel forecasting models predict travel demand based on the present transportation system and its use. Transportation modelers must develop, validate, and calibrate models to ensure that predicted travel demand is as close to reality as possible. Mo...
Van Dongen, Hans P. A.; Mott, Christopher G.; Huang, Jen-Kuang; Mollicone, Daniel J.; McKenzie, Frederic D.; Dinges, David F.
2007-01-01
Current biomathematical models of fatigue and performance do not accurately predict cognitive performance for individuals with a priori unknown degrees of trait vulnerability to sleep loss, do not predict performance reliably when initial conditions are uncertain, and do not yield statistically valid estimates of prediction accuracy. These limitations diminish their usefulness for predicting the performance of individuals in operational environments. To overcome these 3 limitations, a novel modeling approach was developed, based on the expansion of a statistical technique called Bayesian forecasting. The expanded Bayesian forecasting procedure was implemented in the two-process model of sleep regulation, which has been used to predict performance on the basis of the combination of a sleep homeostatic process and a circadian process. Employing the two-process model with the Bayesian forecasting procedure to predict performance for individual subjects in the face of unknown traits and uncertain states entailed subject-specific optimization of 3 trait parameters (homeostatic build-up rate, circadian amplitude, and basal performance level) and 2 initial state parameters (initial homeostatic state and circadian phase angle). Prior information about the distribution of the trait parameters in the population at large was extracted from psychomotor vigilance test (PVT) performance measurements in 10 subjects who had participated in a laboratory experiment with 88 h of total sleep deprivation. The PVT performance data of 3 additional subjects in this experiment were set aside beforehand for use in prospective computer simulations. The simulations involved updating the subject-specific model parameters every time the next performance measurement became available, and then predicting performance 24 h ahead. 
Comparison of the predictions to the subjects' actual data revealed that as more data became available for the individuals at hand, the performance predictions became increasingly more accurate and had progressively smaller 95% confidence intervals, as the model parameters converged efficiently to those that best characterized each individual. Even when more challenging simulations were run (mimicking a change in the initial homeostatic state; simulating the data to be sparse), the predictions were still considerably more accurate than would have been achieved by the two-process model alone. Although the work described here is still limited to periods of consolidated wakefulness with stable circadian rhythms, the results obtained thus far indicate that the Bayesian forecasting procedure can successfully overcome some of the major outstanding challenges for biomathematical prediction of cognitive performance in operational settings. Citation: Van Dongen HPA; Mott CG; Huang JK; Mollicone DJ; McKenzie FD; Dinges DF. Optimization of biomathematical model predictions for cognitive performance impairment in individuals: accounting for unknown traits and uncertain states in homeostatic and circadian processes. SLEEP 2007;30(9):1129-1143. PMID:17910385
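The shrinking prediction intervals described above have a simple analogue: a population prior over one subject trait sharpened into a subject-specific posterior as each measurement arrives. The conjugate normal-normal model and all numbers below are illustrative stand-ins for the full two-process model and PVT data.

```python
# Minimal Bayesian-forecasting analogue: sequentially update a normal
# posterior on one trait parameter (here, a basal performance level) from a
# normal population prior, one measurement at a time. All values hypothetical.
prior_mean, prior_var = 250.0, 40.0 ** 2  # population prior on the trait
noise_var = 30.0 ** 2                     # measurement noise variance

mean, var = prior_mean, prior_var
interval_halfwidths = []
for y in [310.0, 295.0, 320.0, 305.0]:    # one subject's successive scores
    post_var = 1.0 / (1.0 / var + 1.0 / noise_var)  # precisions add
    mean = post_var * (mean / var + y / noise_var)  # precision-weighted mean
    var = post_var
    interval_halfwidths.append(1.96 * var ** 0.5)   # 95% interval halfwidth
```

The halfwidths decrease monotonically with each observation, mirroring the progressively smaller 95% confidence intervals reported for the subject-specific model parameters.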
Construction of prediction intervals for Palmer Drought Severity Index using bootstrap
NASA Astrophysics Data System (ADS)
Beyaztas, Ufuk; Bickici Arikan, Bugrayhan; Beyaztas, Beste Hamiye; Kahya, Ercan
2018-04-01
In this study, we propose an approach based on the residual-based bootstrap method to obtain valid prediction intervals using monthly, short-term (three-month) and mid-term (six-month) drought observations. The effects of the North Atlantic and Arctic Oscillation indexes on the constructed prediction intervals are also examined. Performance of the proposed approach is evaluated for the Palmer Drought Severity Index (PDSI) obtained from the Konya closed basin located in Central Anatolia, Turkey. The finite-sample properties of the proposed method are further illustrated by an extensive simulation study. Our results revealed that the proposed approach is capable of producing valid prediction intervals for future PDSI values.
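A residual-based bootstrap prediction interval can be sketched on a toy AR(1) series standing in for a monthly drought index. This is a simplified illustration of the approach, not the study's model, which works with PDSI and climate covariates.

```python
import random

# Residual bootstrap of a one-step-ahead prediction interval for an AR(1)
# series x_t = phi * x_{t-1} + e_t. Synthetic data stands in for PDSI.
random.seed(1)
series = [0.0]
for _ in range(120):  # 10 years of synthetic monthly values
    series.append(0.7 * series[-1] + random.gauss(0, 1))

# Fit phi by least squares through the origin.
x, y = series[:-1], series[1:]
phi = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
residuals = [b - phi * a for a, b in zip(x, y)]

# Bootstrap the one-step forecast by resampling fitted residuals.
boot = sorted(phi * series[-1] + random.choice(residuals) for _ in range(2000))
lower, upper = boot[49], boot[1949]  # approximate 95% prediction interval
```

Because the interval is built from the empirical residual distribution, it stays valid when the innovations are non-Gaussian, which is the main appeal of the approach.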
Xie, Dan; Li, Ao; Wang, Minghui; Fan, Zhewen; Feng, Huanqing
2005-01-01
Subcellular location of a protein is one of its key functional characteristics, as proteins must be localized correctly at the subcellular level to have normal biological function. In this paper, a novel method named LOCSVMPSI is introduced, based on the support vector machine (SVM) and the position-specific scoring matrix generated from profiles of PSI-BLAST. With a jackknife test on the RH2427 data set, LOCSVMPSI achieved a high overall prediction accuracy of 90.2%, which is higher than the prediction results of SubLoc and ESLpred on this data set. In addition, the prediction performance of LOCSVMPSI was evaluated with a 5-fold cross-validation test on the PK7579 data set, and the prediction results were consistently better than those of the previous method based on several SVMs using the composition of both amino acids and amino acid pairs. A further test on the SWISS-PROT new-unique data set showed that LOCSVMPSI also performed better than some widely used prediction methods, such as PSORT II, TargetP and LOCnet. All these results indicate that LOCSVMPSI is a powerful tool for the prediction of eukaryotic protein subcellular localization. An online web server (current version 1.3) based on this method has been developed and is freely available to both academic and commercial users. PMID:15980436
Genomic prediction using imputed whole-genome sequence data in Holstein Friesian cattle.
van Binsbergen, Rianne; Calus, Mario P L; Bink, Marco C A M; van Eeuwijk, Fred A; Schrooten, Chris; Veerkamp, Roel F
2015-09-17
In contrast to currently used single nucleotide polymorphism (SNP) panels, the use of whole-genome sequence data is expected to enable the direct estimation of the effects of causal mutations on a given trait. This could lead to higher reliabilities of genomic predictions compared to those based on SNP genotypes. Also, at each generation of selection, recombination events between a SNP and a mutation can cause decay in reliability of genomic predictions based on markers rather than on the causal variants. Our objective was to investigate the use of imputed whole-genome sequence genotypes versus high-density SNP genotypes on (the persistency of) the reliability of genomic predictions using real cattle data. Highly accurate phenotypes based on daughter performance and Illumina BovineHD Beadchip genotypes were available for 5503 Holstein Friesian bulls. The BovineHD genotypes (631,428 SNPs) of each bull were used to impute whole-genome sequence genotypes (12,590,056 SNPs) using the Beagle software. Imputation was done using a multi-breed reference panel of 429 sequenced individuals. Genomic estimated breeding values for three traits were predicted using a Bayesian stochastic search variable selection (BSSVS) model and a genome-enabled best linear unbiased prediction model (GBLUP). Reliabilities of predictions were based on 2087 validation bulls, while the other 3416 bulls were used for training. Prediction reliabilities ranged from 0.37 to 0.52. BSSVS performed better than GBLUP in all cases. Reliabilities of genomic predictions were slightly lower with imputed sequence data than with BovineHD chip data. Also, the reliabilities tended to be lower for both sequence data and BovineHD chip data when relationships between training animals were low. No increase in persistency of prediction reliability using imputed sequence data was observed. Compared to BovineHD genotype data, using imputed sequence data for genomic prediction produced no advantage. 
To investigate the putative advantage of genomic prediction using (imputed) sequence data, a training set with a larger number of individuals that are distantly related to each other and genomic prediction models that incorporate biological information on the SNPs or that apply stricter SNP pre-selection should be considered.
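A GBLUP-style genomic prediction is, in its simplest SNP-BLUP form, a ridge regression of phenotypes on marker genotypes: all marker effects are shrunk toward zero, then summed to give a genomic estimated breeding value (GEBV) for a new animal. The two-marker toy below keeps the normal equations solvable by hand; real applications use hundreds of thousands of SNPs, and the data here are invented.

```python
# SNP-BLUP / ridge-regression sketch of genomic prediction. Rows are animals,
# columns are SNP allele counts (0/1/2); phenotypes and lambda are made up.
train_X = [[0, 2], [1, 1], [2, 0], [2, 2], [0, 0]]
train_y = [1.0, 0.5, 0.2, 1.4, -0.3]
lam = 1.0  # shrinkage parameter (ratio of residual to marker variance)

# Normal equations (X'X + lam*I) b = X'y for two markers, solved explicitly.
xtx = [[sum(r[i] * r[j] for r in train_X) + (lam if i == j else 0.0)
        for j in range(2)] for i in range(2)]
xty = [sum(r[i] * yv for r, yv in zip(train_X, train_y)) for i in range(2)]
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
b = [(xtx[1][1] * xty[0] - xtx[0][1] * xty[1]) / det,
     (xtx[0][0] * xty[1] - xtx[1][0] * xty[0]) / det]

# GEBV for a validation animal with genotype [1, 2]: sum of marker effects.
gebv = b[0] * 1 + b[1] * 2
```

With sequence data the design matrix simply gains more columns; as the abstract notes, that alone does not guarantee higher prediction reliability.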
Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W
2015-08-01
Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
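The evaluation metrics listed above (sensitivity, specificity, positive predictive value, Youden's index) all derive from a confusion matrix. The counts below are invented for illustration.

```python
# Binary-classification evaluation metrics from confusion-matrix counts.
# tp/fp/tn/fn are hypothetical, e.g. deaths vs. survivals in a burn cohort.
tp, fp, tn, fn = 80, 30, 870, 20

sensitivity = tp / (tp + fn)            # true positive rate
specificity = tn / (tn + fp)            # true negative rate
ppv = tp / (tp + fp)                    # positive predictive value
youden = sensitivity + specificity - 1  # Youden's index J
```

Youden's index summarizes discrimination at a single threshold; the area under the ROC curve, also used in the study, integrates it over all thresholds.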
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jahandideh, Sepideh; Jahandideh, Samad; Asadabadi, Ebrahim Barzegari
2009-11-15
Prediction of the amount of hospital waste production will be helpful in the storage, transportation and disposal stages of hospital waste management. Based on this fact, two predictor models, artificial neural networks (ANNs) and multiple linear regression (MLR), were applied to predict the rate of medical waste generation in total and by type (sharp, infectious and general). In this study, a 5-fold cross-validation procedure on a database containing a total of 50 hospitals of Fars province (Iran) was used to verify the performance of the models. Three performance measures, MAR, RMSE and R^2, were used to evaluate the models. MLR, as a conventional model, obtained poor prediction performance measure values. However, MLR distinguished hospital capacity and bed occupancy as the more significant parameters. On the other hand, ANNs, a more powerful model that had not previously been applied to predicting the rate of medical waste generation, showed high performance measure values, especially an R^2 value of 0.99 confirming the good fit of the data. Such satisfactory results could be attributed to the non-linear nature of ANNs in problem solving, which provides the opportunity to relate independent variables to dependent ones non-linearly. In conclusion, the obtained results showed that our ANN-based model approach is very promising and may play a useful role in developing a better cost-effective strategy for waste management in the future.
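The three performance measures used above (MAR, RMSE, R^2) are straightforward to compute from predicted vs. observed values; the waste-generation numbers below are made up for illustration.

```python
import math

# MAR (mean absolute residual), RMSE, and R^2 for a regression model's
# predictions; the observed/predicted pairs are hypothetical.
observed  = [120.0, 85.0, 200.0, 150.0, 95.0]
predicted = [115.0, 90.0, 190.0, 155.0, 100.0]

n = len(observed)
mar  = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

mean_obs = sum(observed) / n
ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
ss_tot = sum((o - mean_obs) ** 2 for o in observed)
r2 = 1 - ss_res / ss_tot  # fraction of variance explained
```

RMSE penalizes large errors more heavily than MAR, which is why the two are usually reported together.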
Predicting Power Outages Using Multi-Model Ensemble Forecasts
NASA Astrophysics Data System (ADS)
Cerrai, D.; Anagnostou, E. N.; Yang, J.; Astitha, M.
2017-12-01
Every year, power outages affect millions of people in the United States, disrupting the economy and everyday life. An Outage Prediction Model (OPM) has been developed at the University of Connecticut to help utilities restore outages quickly and limit their adverse consequences for the population. The OPM, operational since 2015, combines several non-parametric machine learning (ML) models that use historical weather storm simulations and high-resolution weather forecasts, satellite remote sensing data, and infrastructure and land cover data to predict the number and spatial distribution of power outages. A new methodology, developed to improve outage model performance by combining weather- and soil-related variables from three different weather models (WRF 3.7, WRF 3.8 and RAMS/ICLAMS), will be presented in this study. First, we will present a performance evaluation of each model variable, comparing historical weather analyses with station data or reanalysis over the entire storm data set. Each variable of the new outage model version is then extracted from the weather model that performs best for that variable, and sensitivity tests are performed to find the most efficient variable combination for outage prediction. Although the final variable combination is drawn from different weather models, this multi-weather-forcing, multi-statistical-model ensemble outperforms the currently operational OPM version based on a single weather forcing (WRF 3.7), because each model component is closest to the actual atmospheric state.
Zhang, Jing; Liang, Lichen; Anderson, Jon R; Gatewood, Lael; Rottenberg, David A; Strother, Stephen C
2008-01-01
As functional magnetic resonance imaging (fMRI) becomes widely used, the demands for evaluation of fMRI processing pipelines and validation of fMRI analysis results are increasing rapidly. The current NPAIRS package, an IDL-based fMRI processing pipeline evaluation framework, lacks system interoperability and the ability to evaluate general linear model (GLM)-based pipelines using prediction metrics. Thus, it cannot fully evaluate fMRI analytical software modules such as FSL.FEAT and NPAIRS.GLM. To overcome these limitations, a Java-based fMRI processing pipeline evaluation system was developed. It integrated YALE (a machine learning environment) into Fiswidgets (an fMRI software environment) to obtain system interoperability and applied an algorithm to measure GLM prediction accuracy. The results demonstrated that the system can evaluate fMRI processing pipelines with univariate GLM and multivariate canonical variates analysis (CVA)-based models on real fMRI data based on prediction accuracy (classification accuracy) and statistical parametric image (SPI) reproducibility. In addition, a preliminary study was performed in which four fMRI processing pipelines with GLM and CVA modules such as FSL.FEAT and NPAIRS.CVA were evaluated with the system. The results indicated that (1) the system can compare different fMRI processing pipelines with heterogeneous models (NPAIRS.GLM, NPAIRS.CVA and FSL.FEAT) and rank their performance by automatic performance scoring, and (2) the ranking of pipeline performance is highly dependent on the preprocessing operations. These results suggest that the system will be of value for the comparison, validation, standardization and optimization of functional neuroimaging software packages and fMRI processing pipelines.
Eronen, Lauri; Toivonen, Hannu
2012-06-06
Biological databases contain large amounts of data concerning the functions and associations of genes and proteins. Integration of data from several such databases into a single repository can aid the discovery of previously unknown connections spanning multiple types of relationships and databases. Biomine is a system that integrates cross-references from several biological databases into a graph model with multiple types of edges, such as protein interactions, gene-disease associations and gene ontology annotations. Edges are weighted based on their type, reliability, and informativeness. We present Biomine and evaluate its performance in link prediction, where the goal is to predict pairs of nodes that will be connected in the future, based on current data. In particular, we formulate protein interaction prediction and disease gene prioritization tasks as instances of link prediction. The predictions are based on a proximity measure computed on the integrated graph. We consider and experiment with several such measures, and perform a parameter optimization procedure where different edge types are weighted to optimize link prediction accuracy. We also propose a novel method for disease-gene prioritization, defined as finding a subset of candidate genes that cluster together in the graph. We experimentally evaluate Biomine by predicting future annotations in the source databases and prioritizing lists of putative disease genes. The experimental results show that Biomine has strong potential for predicting links when a set of selected candidate links is available. The predictions obtained using the entire Biomine dataset are shown to clearly outperform ones obtained using any single source of data alone, when different types of links are suitably weighted. 
In the gene prioritization task, an established reference set of disease-associated genes is useful, but the results show that under favorable conditions, Biomine can also perform well when no such information is available. The Biomine system is a proof of concept. Its current version contains 1.1 million entities and 8.1 million relations between them, with focus on human genetics. Some of its functionalities are available in a public query interface at http://biomine.cs.helsinki.fi, allowing searching for and visualizing connections between given biological entities.
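The proximity-on-a-weighted-graph idea evaluated above can be sketched minimally as a best-path-probability search, where edge weights in [0, 1] act as reliabilities and the proximity of two nodes is the largest product of weights along any connecting path. The toy graph, node names, and this specific measure are illustrative assumptions, not Biomine's actual data or scoring:

```python
import heapq

def best_path_probability(graph, source, target):
    """Proximity of two nodes as the maximum product of edge
    reliabilities over any connecting path (one common proximity
    measure for probabilistic graphs; Biomine's own measures are
    more elaborate)."""
    # Dijkstra on -log(weight) maximizes the product of weights;
    # here we carry the product directly and negate for a max-heap.
    best = {source: 1.0}
    heap = [(-1.0, source)]
    while heap:
        neg_p, node = heapq.heappop(heap)
        p = -neg_p
        if node == target:
            return p
        if p < best.get(node, 0.0):
            continue  # stale heap entry
        for nbr, w in graph.get(node, []):
            q = p * w
            if q > best.get(nbr, 0.0):
                best[nbr] = q
                heapq.heappush(heap, (-q, nbr))
    return best.get(target, 0.0)

# Toy integrated graph: edges weighted by reliability in [0, 1].
toy = {
    "geneA": [("protein1", 0.9), ("diseaseX", 0.3)],
    "protein1": [("diseaseX", 0.8)],
}
# The direct edge (0.3) loses to the two-step path 0.9 * 0.8.
print(best_path_probability(toy, "geneA", "diseaseX"))  # ≈ 0.72
```

Link prediction then amounts to ranking candidate node pairs by this proximity and predicting the highest-scoring pairs.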
Basak, Chandramallika; Voss, Michelle W.; Erickson, Kirk I.; Boot, Walter R.; Kramer, Arthur F.
2015-01-01
Previous studies have found that differences in brain volume among older adults predict performance in laboratory tasks of executive control, memory, and motor learning. In the present study we asked whether regional differences in brain volume as assessed by the application of a voxel-based morphometry technique on high resolution MRI would also be useful in predicting the acquisition of skill in complex tasks, such as strategy-based video games. Twenty older adults were trained for over 20 hours to play Rise of Nations, a complex real-time strategy game. These adults showed substantial improvements over the training period in game performance. MRI scans obtained prior to training revealed that the volume of a number of brain regions, which have been previously associated with subsets of the trained skills, predicted a substantial amount of variance in learning on the complex game. Thus, regional differences in brain volume can predict learning in complex tasks that entail the use of a variety of perceptual, cognitive and motor processes. PMID:21546146
Estimation of relative effectiveness of phylogenetic programs by machine learning.
Krivozubov, Mikhail; Goebels, Florian; Spirin, Sergei
2014-04-01
Reconstruction of the phylogeny of a protein family from a sequence alignment can produce results of different quality. Our goal is to predict the quality of phylogeny reconstruction based on features that can be extracted from the input alignment. We used the Fitch-Margoliash (FM) method of phylogeny reconstruction and a random forest as a predictor. For training and testing the predictor, alignments of orthologous series (OS) were used, for which the result of phylogeny reconstruction can be evaluated by comparison with the trees of the corresponding organisms. Our results show that the quality of phylogeny reconstruction can be predicted with more than 80% precision. We also tried to predict which phylogeny reconstruction method, FM or UPGMA, is better for a particular alignment. With the feature set used, among the alignments for which the predictor favours UPGMA, 56% really give a better result with UPGMA. Given that UPGMA performs better for only 34% of the alignments in our testing set, this result demonstrates that it is possible in principle to predict the better phylogeny reconstruction method from features of a sequence alignment.
Predicting chroma from luma with frequency domain intra prediction
NASA Astrophysics Data System (ADS)
Egge, Nathan E.; Valin, Jean-Marc
2015-03-01
This paper describes a technique for performing intra prediction of the chroma planes based on the reconstructed luma plane in the frequency domain. This prediction exploits the fact that, while RGB to YUV color conversion decorrelates the color planes globally across an image, there is still some correlation locally at the block level [1]. Previous proposals compute a linear model of the spatial relationship between the luma plane (Y) and the two chroma planes (U and V) [2]. In codecs that use lapped transforms this is not possible, since transform support extends across the block boundaries [3] and thus neighboring blocks are unavailable during intra-prediction. We design a frequency domain intra predictor for chroma that exploits the same local correlation with lower complexity than the spatial predictor and which works with lapped transforms. We then describe a low-complexity algorithm that directly uses luma coefficients as a chroma predictor based on gain-shape quantization and band partitioning. An experiment is performed that compares these two techniques inside the experimental Daala video codec and shows the lower complexity algorithm to be a better chroma predictor.
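For context, the spatial linear model that earlier chroma-from-luma proposals fit per block can be written as chroma ≈ α·luma + β, solved by least squares; the sketch below shows that fit on made-up sample values (the paper's actual contribution replaces this spatial fit with a frequency-domain predictor):

```python
def fit_cfl(luma, chroma):
    """Least-squares fit chroma ≈ alpha * luma + beta over one block.
    This is the spatial linear model the paper builds on; the paper
    itself moves the prediction into the frequency domain."""
    n = len(luma)
    mean_l = sum(luma) / n
    mean_c = sum(chroma) / n
    cov = sum((l - mean_l) * (c - mean_c) for l, c in zip(luma, chroma))
    var = sum((l - mean_l) ** 2 for l in luma)
    alpha = cov / var
    beta = mean_c - alpha * mean_l
    return alpha, beta

# Invented block samples: chroma tracks luma exactly as 2 * L + 5.
luma = [10, 20, 30, 40]
chroma = [25, 45, 65, 85]
alpha, beta = fit_cfl(luma, chroma)
print(alpha, beta)  # 2.0 5.0
```

The predicted chroma for a new luma sample is then simply `alpha * l + beta`.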
DOE Office of Scientific and Technical Information (OSTI.GOV)
Srinivas, Nisha; Rose, Derek C; Bolme, David S
This paper examines the difficulty associated with performing machine-based automatic demographic prediction on a sub-population of Asian faces. We introduce the Wild East Asian Face dataset (WEAFD), a new and unique dataset for the research community. This dataset consists primarily of labeled face images of individuals from East Asian countries, including Vietnam, Burma, Thailand, China, Korea, Japan, Indonesia, and Malaysia. Mechanical Turk annotators from East Asia were specifically used to judge the age and fine-grained ethnicity attributes, to reduce the impact of the other-race effect and to improve the quality of annotations. We focus on predicting the age, gender and fine-grained ethnicity of an individual by providing baseline results with a convolutional neural network (CNN). Fine-grained ethnicity prediction refers to predicting the ethnicity of an individual by country or sub-region (Chinese, Japanese, Korean, etc.) of the East Asian continent. Performance for two CNN architectures is presented, highlighting the difficulty of these tasks and showcasing potential design considerations that ease network optimization by promoting region-based feature extraction.
Music performance and the perception of key.
Thompson, W F; Cuddy, L L
1997-02-01
The effect of music performance on perceived key movement was examined. Listeners judged key movement in sequences presented without performance expression (mechanical) in Experiment 1 and with performance expression in Experiment 2. Modulation distance varied. Judgments corresponded to predictions based on the cycle of fifths and toroidal models of key relatedness, with the highest correspondence for performed versions with the toroidal model. In Experiment 3, listeners compared mechanical sequences with either performed sequences or modifications of performed sequences. Modifications preserved expressive differences between chords, but not between voices. Predictions from Experiments 1 and 2 held only for performed sequences, suggesting that differences between voices are informative of key movement. Experiment 4 confirmed that modifications did not disrupt musicality. Analyses of performances further suggested a link between performance expression and key.
Austin, Peter C.; Tu, Jack V.; Ho, Jennifer E.; Levy, Daniel; Lee, Douglas S.
2014-01-01
Objective Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines. Study design and Setting We compared the performance of these classification methods with those of conventional classification trees to classify patients with heart failure according to the following sub-types: heart failure with preserved ejection fraction (HFPEF) vs. heart failure with reduced ejection fraction (HFREF). We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression. Results We found that modern, flexible tree-based methods from the data mining literature offer substantial improvement in prediction and classification of heart failure sub-type compared to conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared to the methods proposed in the data mining literature. Conclusion The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying heart failure subtypes in a population-based sample of patients from Ontario. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF. PMID:23384592
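A minimal sketch of the bagging idea compared against single trees above: resample the training set with replacement, fit a weak tree (here a depth-1 stump) to each resample, and take a majority vote. The single "ejection fraction" feature and the labels are invented for illustration and have nothing to do with the study's Ontario data:

```python
import random

def stump_fit(X, y):
    """Best single-feature threshold split (a depth-1 decision tree)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            def maj(lab):
                # majority class on one side (0 if that side is empty)
                return max(set(lab), key=lab.count) if lab else 0
            pl, pr = maj(left), maj(right)
            acc = sum((yi == (pl if row[j] <= t else pr))
                      for row, yi in zip(X, y)) / len(y)
            if best is None or acc > best[0]:
                best = (acc, j, t, pl, pr)
    return best[1:]

def bagged_predict(X, y, x_new, n_trees=25, seed=0):
    """Bootstrap aggregation: majority vote over stumps fit on resamples."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]
        j, t, pl, pr = stump_fit([X[i] for i in idx], [y[i] for i in idx])
        votes.append(pl if x_new[j] <= t else pr)
    return max(set(votes), key=votes.count)

# Invented feature separating two heart-failure-like subtypes.
X = [[0.25], [0.30], [0.35], [0.55], [0.60], [0.65]]
y = [1, 1, 1, 0, 0, 0]    # 1 = reduced-EF-like, 0 = preserved-EF-like
print(bagged_predict(X, y, [0.20]))  # 1
```

Random forests extend this by also sampling a random subset of features at each split; boosting instead reweights the training points between fits.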
NASA Astrophysics Data System (ADS)
Bellugi, D. G.; Tennant, C.; Larsen, L.
2016-12-01
Catchment and climate heterogeneity complicate prediction of runoff across time and space, and resulting parameter uncertainty can lead to large accumulated errors in hydrologic models, particularly in ungauged basins. Recently, data-driven modeling approaches have been shown to avoid the accumulated uncertainty associated with many physically-based models, providing an appealing alternative for hydrologic prediction. However, the effectiveness of different methods in hydrologically and geomorphically distinct catchments, and the robustness of these methods to changing climate and changing hydrologic processes remain to be tested. Here, we evaluate the use of machine learning techniques to predict daily runoff across time and space using only essential climatic forcing (e.g. precipitation, temperature, and potential evapotranspiration) time series as model input. Model training and testing was done using a high quality dataset of daily runoff and climate forcing data for 25+ years for 600+ minimally-disturbed catchments (drainage area range 5-25,000 km2, median size 336 km2) that cover a wide range of climatic and physical characteristics. Preliminary results using Support Vector Regression (SVR) suggest that in some catchments this nonlinear-based regression technique can accurately predict daily runoff, while the same approach fails in other catchments, indicating that the representation of climate inputs and/or catchment filter characteristics in the model structure need further refinement to increase performance. We bolster this analysis by using Sparse Identification of Nonlinear Dynamics (a sparse symbolic regression technique) to uncover the governing equations that describe runoff processes in catchments where SVR performed well and for ones where it performed poorly, thereby enabling inference about governing processes. 
This provides a robust means of examining how catchment complexity influences runoff prediction skill, and represents a contribution towards the integration of data-driven inference and physically-based models.
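As a toy stand-in for the nonlinear regression used above (SVR itself requires a quadratic-programming solver), a Nadaraya-Watson kernel smoother shows the general shape of data-driven runoff prediction from climate forcing. The precipitation-runoff pairs and the bandwidth are invented, and this smoother is an illustration only, not the authors' method:

```python
import math

def kernel_smooth(train_x, train_y, x, bandwidth=1.0):
    """Nadaraya-Watson kernel regression: predict y at x as a
    Gaussian-weighted average of the training targets. A minimal
    nonlinear, data-driven regressor standing in for SVR."""
    weights = [math.exp(-((x - xi) / bandwidth) ** 2) for xi in train_x]
    return sum(w * yi for w, yi in zip(weights, train_y)) / sum(weights)

# Invented forcing-to-runoff relation: runoff rises nonlinearly
# with daily precipitation.
precip = [0.0, 1.0, 2.0, 3.0, 4.0]
runoff = [0.0, 0.1, 0.5, 1.8, 3.9]
print(round(kernel_smooth(precip, runoff, 3.5, bandwidth=0.5), 2))  # 2.85
```

A real application would use multivariate forcing (precipitation, temperature, potential evapotranspiration) and fit the bandwidth or SVR hyperparameters by cross-validation across catchments.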
Evaluating the accuracy of SHAPE-directed RNA secondary structure predictions
Sükösd, Zsuzsanna; Swenson, M. Shel; Kjems, Jørgen; Heitsch, Christine E.
2013-01-01
Recent advances in RNA structure determination include using data from high-throughput probing experiments to improve thermodynamic prediction accuracy. We evaluate the extent and nature of improvements in data-directed predictions for a diverse set of 16S/18S ribosomal sequences using a stochastic model of experimental SHAPE data. The average accuracy for 1000 data-directed predictions always improves over the original minimum free energy (MFE) structure. However, the amount of improvement varies with the sequence, exhibiting a correlation with MFE accuracy. Further analysis of this correlation shows that accurate MFE base pairs are typically preserved in a data-directed prediction, whereas inaccurate ones are not. Thus, the positive predictive value of common base pairs is consistently higher than the directed prediction accuracy. Finally, we confirm sequence dependencies in the directability of thermodynamic predictions and investigate the potential for greater accuracy improvements in the worst performing test sequence. PMID:23325843
Liu, Zhichao; Kelly, Reagan; Fang, Hong; Ding, Don; Tong, Weida
2011-07-18
The primary testing strategy to identify nongenotoxic carcinogens largely relies on the 2-year rodent bioassay, which is time-consuming and labor-intensive. There is an increasing effort to develop alternative approaches to prioritize chemicals for the cancer bioassay, to supplement it, or even to replace it. In silico approaches based on quantitative structure-activity relationships (QSAR) are rapid and inexpensive and thus have been investigated for such purposes. A slightly more expensive approach based on short-term animal studies with toxicogenomics (TGx) represents another attractive option for this application. Thus, the primary questions are how much better predictive performance can be achieved using short-term TGx models compared to QSAR models, and what length of exposure is sufficient for high-quality prediction based on TGx. In this study, we developed predictive models for rodent liver carcinogenicity using gene expression data generated from short-term animal models at different time points and QSAR. The study was focused on the prediction of nongenotoxic carcinogenicity since genotoxic chemicals can be inexpensively removed from further development using various in vitro assays individually or in combination. We identified 62 chemicals whose hepatocarcinogenic potential was available from the National Center for Toxicological Research liver cancer database (NCTRlcdb). The gene expression profiles of liver tissue obtained from rats treated with these chemicals at different time points (1 day, 3 days, and 5 days) are available from the Gene Expression Omnibus (GEO) database. Both TGx and QSAR models were developed on the basis of the same set of chemicals using the same modeling approach, a nearest-centroid method with minimum-redundancy maximum-relevancy feature selection, with performance assessed using compound-based 5-fold cross-validation. We found that the TGx models outperformed QSAR in every aspect of modeling. 
For example, the TGx models' predictive accuracy (0.77, 0.77, and 0.82 for the 1-day, 3-day, and 5-day models, respectively) was much higher for an independent validation set than that of a QSAR model (0.55). Permutation tests confirmed the statistical significance of the model's prediction performance. The study concluded that a short-term 5-day TGx animal model holds the potential to predict nongenotoxic hepatocarcinogenicity. © 2011 American Chemical Society
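The nearest-centroid classifier named above is simple enough to sketch directly. This version omits the mRMR feature selection and the 5-fold cross-validation the study used, and the two-gene "expression profiles" are invented:

```python
def nearest_centroid(train, labels, x):
    """Nearest-centroid classification: average the training rows of
    each class into a centroid, then assign x to the class whose
    centroid is closest (squared Euclidean distance)."""
    centroids = {}
    for row, lab in zip(train, labels):
        centroids.setdefault(lab, []).append(row)
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda lab: dist2(mean(centroids[lab]), x))

# Invented two-gene expression profiles for two chemical classes.
train = [[2.0, 0.1], [1.8, 0.2], [0.2, 1.9], [0.1, 2.1]]
labels = ["carcinogen", "carcinogen", "non", "non"]
print(nearest_centroid(train, labels, [1.7, 0.3]))  # carcinogen
```

In the study's setting the rows would be gene expression vectors for treated rats and the feature columns would first be filtered by mRMR before centroids are formed.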
EFICAz2: enzyme function inference by a combined approach enhanced by machine learning.
Arakaki, Adrian K; Huang, Ying; Skolnick, Jeffrey
2009-04-13
We previously developed EFICAz, an enzyme function inference approach that combines predictions from non-completely overlapping component methods. Two of the four components in the original EFICAz are based on the detection of functionally discriminating residues (FDRs). FDRs distinguish between members of an enzyme family that are homofunctional (classified under the EC number of interest) or heterofunctional (annotated with another EC number or lacking enzymatic activity). Each of the two FDR-based components is associated with one of two specific kinds of enzyme families. EFICAz exhibits high precision performance, except when the maximal test to training sequence identity (MTTSI) is lower than 30%. To improve EFICAz's performance in this regime, we: i) increased the number of predictive components and ii) took advantage of consensual information from the different components to make the final EC number assignment. We have developed two new EFICAz components, analogous to the two FDR-based components, where the discrimination between homo- and heterofunctional members is based on the evaluation, via Support Vector Machine models, of all the aligned positions between the query sequence and the multiple sequence alignments associated with the enzyme families. Benchmark results indicate that: i) the new SVM-based components outperform their FDR-based counterparts, and ii) both SVM-based and FDR-based components generate unique predictions. We developed classification tree models to optimally combine the results from the six EFICAz components into a final EC number prediction. The new implementation of our approach, EFICAz2, exhibits a highly improved prediction precision at MTTSI < 30% compared to the original EFICAz, with only a slight decrease in prediction recall. 
A comparative analysis of enzyme function annotation of the human proteome by EFICAz2 and KEGG shows that: i) when both sources make EC number assignments for the same protein sequence, the assignments tend to be consistent and ii) EFICAz2 generates considerably more unique assignments than KEGG. Performance benchmarks and the comparison with KEGG demonstrate that EFICAz2 is a powerful and precise tool for enzyme function annotation, with multiple applications in genome analysis and metabolic pathway reconstruction. The EFICAz2 web service is available at: http://cssb.biology.gatech.edu/skolnick/webservice/EFICAz2/index.html.
Smeers, Inge; Decorte, Ronny; Van de Voorde, Wim; Bekaert, Bram
2018-05-01
DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is the fact that prediction errors become larger with increasing age due to interindividual differences in epigenetic ageing rates. This phenomenon of non-constant variance or heteroscedasticity violates an assumption of the often used method of ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that do take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression is proposed as well as a quantile regression model. Their performances were compared against an OLS regression model based on the same dataset. Both models provided age-dependent prediction intervals which account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be a preferred method when dealing with a variance that is not only non-constant, but also not normally distributed. Ultimately the choice of which model to use should depend on the observed characteristics of the data. Copyright © 2018 Elsevier B.V. All rights reserved.
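The WLS regression proposed above differs from OLS only in attaching per-sample weights to the squared residuals, so that more variable (e.g. older) samples pull the fit less. A closed-form single-predictor fit makes this concrete; the methylation/age numbers are invented, and with equal weights the fit reduces to ordinary least squares:

```python
def wls_fit(x, y, w):
    """Weighted least squares for one predictor: minimises
    sum_i w_i * (y_i - a - b * x_i)^2 in closed form via
    weighted means. Setting all w_i equal recovers OLS."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x)))
    a = ybar - b * xbar
    return a, b

# Invented data: methylation percentage vs chronological age.
meth = [10.0, 20.0, 30.0, 40.0]
age = [20.0, 30.0, 40.0, 50.0]
a, b = wls_fit(meth, age, [1.0, 1.0, 1.0, 1.0])  # equal weights = OLS
print(a, b)  # 10.0 1.0
```

In the forensic setting the weights would typically be the inverse of the residual variance estimated as a function of age, which is what yields the age-dependent prediction intervals discussed above.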
Prediction of essential proteins based on gene expression programming.
Zhong, Jiancheng; Wang, Jianxin; Peng, Wei; Zhang, Zhen; Pan, Yi
2013-01-01
Essential proteins are indispensable for cell survival. Identifying essential proteins is very important for improving our understanding of how a cell works. There are various types of features related to the essentiality of proteins, and many methods have been proposed to combine some of them to predict essential proteins. However, it is still a big challenge to design an effective method that predicts essential proteins by integrating different features and explains how the selected features determine protein essentiality. Gene expression programming (GEP) is a learning algorithm that learns relationships between variables in sets of data and then builds models to explain these relationships. In this work, we propose a GEP-based method to predict essential proteins by combining some biological features and topological features. We carry out experiments on S. cerevisiae data. The experimental results show that our method achieves better prediction performance than methods using individual features. Moreover, our method outperforms some machine learning methods and performs as well as a method obtained by combining the outputs of eight machine learning methods. The accuracy of predicting essential proteins can be improved by using the GEP method to combine some topological features and biological features.
Combined QSAR and molecule docking studies on predicting P-glycoprotein inhibitors
NASA Astrophysics Data System (ADS)
Tan, Wen; Mei, Hu; Chao, Li; Liu, Tengfei; Pan, Xianchao; Shu, Mao; Yang, Li
2013-12-01
P-glycoprotein (P-gp) is an ATP-binding cassette multidrug transporter. The overexpression of P-gp leads to the development of multidrug resistance (MDR), which is a major obstacle to effective treatment of cancer. Thus, designing effective P-gp inhibitors plays an extremely important role in overcoming MDR. In this paper, both ligand-based quantitative structure-activity relationship (QSAR) and receptor-based molecular docking are used to predict P-gp inhibitors. The results show that each method achieves good prediction performance. According to the results of tenfold cross-validation, an optimal linear SVM model with only three descriptors is established on 857 training samples, of which the overall accuracy (Acc), sensitivity, specificity, and Matthews correlation coefficient are 0.840, 0.873, 0.813, and 0.683, respectively. The SVM model is further validated by 418 test samples with an overall Acc of 0.868. Based on an established homology model of human P-gp, Surflex-dock is also performed to give binding free energy-based evaluations, with an overall accuracy of 0.823 for the test set. Furthermore, a consensus evaluation is also performed using these two methods. Both the QSAR and molecular docking studies indicate that molecular volume, hydrophobicity and aromaticity are three dominant factors influencing the inhibitory activities.
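The four figures of merit reported for the SVM model above (Acc, sensitivity, specificity, MCC) follow directly from the confusion matrix. The counts below are hypothetical, not the paper's:

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and Matthews correlation
    coefficient from binary confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sens = tp / (tp + fn)          # true positive rate
    spec = tn / (tn + fp)          # true negative rate
    mcc = ((tp * tn - fp * fn) /
           math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return acc, sens, spec, mcc

# Hypothetical confusion matrix for a P-gp inhibitor classifier.
acc, sens, spec, mcc = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(round(acc, 3), round(sens, 3), round(spec, 3), round(mcc, 3))
# 0.85 0.8 0.9 0.704
```

MCC is often preferred over accuracy alone because it stays informative when the two classes are imbalanced.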
Applying theory of planned behavior to predict exercise maintenance in sarcopenic elderly.
Ahmad, Mohamad Hasnan; Shahar, Suzana; Teng, Nur Islami Mohd Fahmi; Manaf, Zahara Abdul; Sakian, Noor Ibrahim Mohd; Omar, Baharudin
2014-01-01
This study aimed to determine the factors associated with exercise behavior based on the theory of planned behavior (TPB) among the sarcopenic elderly people in Cheras, Kuala Lumpur. A total of 65 subjects with mean ages of 67.5±5.2 (men) and 66.1±5.1 (women) years participated in this study. Subjects were divided into two groups: 1) exercise group (n=34; 25 men, nine women); and 2) the control group (n=31; 22 men, nine women). Structural equation modeling, based on TPB components, was applied to determine the specific factors that most contribute to and predict actual behavior toward exercise. Based on the TPB model, attitude (β=0.60) and perceived behavioral control (β=0.24) were the major predictors of intention to exercise among men at the baseline. Among women, the subjective norm (β=0.82) was the major predictor of intention to perform the exercise at the baseline. After 12 weeks, attitude (men's, β=0.68; women's, β=0.24) and subjective norm (men's, β=0.12; women's, β=0.87) were the predictors of the intention to perform the exercise. "Feels healthier with exercise" was the specific factor improving the intention to perform and to maintain exercise behavior in men (β=0.36) and women (β=0.49). "Not motivated to perform exercise" was the main barrier to men's intention to exercise. The intention to perform the exercise was able to predict actual behavior regarding exercise at the baseline and at 12 weeks of an intervention program. In conclusion, TPB is a useful model to determine and to predict maintenance of exercise in the sarcopenic elderly.
Archfield, Stacey A.; Pugliese, Alessio; Castellarin, Attilio; Skøien, Jon O.; Kiang, Julie E.
2013-01-01
In the United States, estimation of flood frequency quantiles at ungauged locations has been largely based on regional regression techniques that relate measurable catchment descriptors to flood quantiles. More recently, spatial interpolation techniques of point data have been shown to be effective for predicting streamflow statistics (i.e., flood flows and low-flow indices) in ungauged catchments. Literature reports successful applications of two techniques, canonical kriging, CK (or physiographical-space-based interpolation, PSBI), and topological kriging, TK (or top-kriging). CK performs the spatial interpolation of the streamflow statistic of interest in the two-dimensional space of catchment descriptors. TK predicts the streamflow statistic along river networks taking both the catchment area and nested nature of catchments into account. It is of interest to understand how these spatial interpolation methods compare with generalized least squares (GLS) regression, one of the most common approaches to estimate flood quantiles at ungauged locations. By means of a leave-one-out cross-validation procedure, the performance of CK and TK was compared to GLS regression equations developed for the prediction of 10, 50, 100 and 500 yr floods for 61 streamgauges in the southeast United States. TK substantially outperforms GLS and CK for the study area, particularly for large catchments. The performance of TK over GLS highlights an important distinction between the treatments of spatial correlation when using regression-based or spatial interpolation methods to estimate flood quantiles at ungauged locations. The analysis also shows that coupling TK with CK slightly improves the performance of TK; however, the improvement is marginal when compared to the improvement in performance over GLS.
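The leave-one-out cross-validation design used above to compare CK, TK and GLS can be sketched generically: refit on all sites but one, predict the held-out site, and aggregate the errors. The "model" below is just a mean predictor over invented flood quantiles, standing in for the real regression or kriging fits:

```python
def loocv_rmse(x, y, fit, predict):
    """Leave-one-out cross-validation: for each site i, refit on the
    other n-1 sites, predict site i, and return the RMSE of the
    held-out predictions."""
    errs = []
    for i in range(len(y)):
        xt = x[:i] + x[i + 1:]
        yt = y[:i] + y[i + 1:]
        model = fit(xt, yt)
        errs.append((predict(model, x[i]) - y[i]) ** 2)
    return (sum(errs) / len(errs)) ** 0.5

# Simplest possible "regional model": predict the mean of the gauged sites.
fit_mean = lambda xt, yt: sum(yt) / len(yt)
predict_mean = lambda model, xi: model
flood_q = [100.0, 110.0, 90.0, 100.0]   # invented flood quantiles
print(loocv_rmse([0] * 4, flood_q, fit_mean, predict_mean))  # ≈ 9.43
```

Swapping in a GLS regression or a kriging interpolator for `fit`/`predict` reproduces the study's comparison design; the catchment descriptors would go in `x`.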
Ducatman, Barbara S.; Williams, H. James; Hobbs, Gerald; Gyure, Kymberly A.
2009-01-01
Objectives To determine whether a longitudinal, case-based evaluation system can predict acquisition of competency in surgical pathology and how trainees at risk can be identified early. Design Data were collected for trainee performance on surgical pathology cases (how well their diagnosis agreed with the faculty diagnosis) and compared with training outcomes. Negative training outcomes included failure to complete the residency, failure to pass the anatomic pathology component of the American Board of Pathology examination, and/or failure to obtain or hold a position immediately following training. Findings Thirty-three trainees recorded diagnoses for 54 326 surgical pathology cases, with outcome data available for 15 residents. Mean case-based performance was significantly higher for those with positive outcomes, and outcome status could be predicted as early as postgraduate year-1 (P = .0001). Performance on the first postgraduate year-1 rotation was significantly associated with the outcome (P = .02). Although trainees with unsuccessful outcomes improved their performance more rapidly, they started below residents with successful outcomes and did not make up the difference during training. There was no significant difference in Step 1 or 2 United States Medical Licensing Examination (USMLE) scores when compared with performance or final outcomes (P = .43 and P = .68, respectively) and the resident in-service examination (RISE) had limited predictive ability. Discussion Differences between successful- and unsuccessful-outcome residents were most evident in early residency, ideal for designing interventions or counseling residents to consider another specialty. Conclusion Our longitudinal case-based system successfully identified trainees at risk for failure to acquire critical competencies for surgical pathology early in the program. PMID:21975705
NASA Technical Reports Server (NTRS)
Koenig, D. G.; Stoll, F.; Aoyagi, K.
1981-01-01
The status of ejector development in terms of application to V/STOL aircraft is reported in three categories: aircraft systems and ejector concepts; ejector performance including prediction techniques and experimental data base available; and, integration of the ejector with complete aircraft configurations. Available prediction techniques are reviewed and performance of three ejector designs with vertical lift capability is summarized. Applications of the 'fuselage' and 'short diffuser' ejectors to fighter aircraft are related to current and planned research programs. Recommendations are listed for effort needed to evaluate installed performance.
NASA Technical Reports Server (NTRS)
Bahler, D. D.; Owen, H. A., Jr.; Wilson, T. G.
1978-01-01
A model describing the turning-on period of a power switching transistor in an energy storage voltage step-up converter is presented. Comparisons between an experimental layout and the circuit model during the turning-on interval demonstrate the ability of the model to closely predict the effects of circuit topology on the performance of the converter. A phenomenon of particular importance that is observed in the experimental circuits and is predicted by the model is the deleterious feedback effect of the parasitic emitter lead inductance on the base current waveform during the turning-on interval.
Prediction of Regulation Reserve Requirements in California ISO Control Area based on BAAL Standard
DOE Office of Scientific and Technical Information (OSTI.GOV)
Etingov, Pavel V.; Makarov, Yuri V.; Samaan, Nader A.
This paper presents new methodologies developed at Pacific Northwest National Laboratory (PNNL) to estimate regulation capacity requirements in the California ISO control area. Two approaches have been developed: (1) an approach based on statistical analysis of actual historical area control error (ACE) and regulation data, and (2) an approach based on the balancing authority ACE limit (BAAL) control performance standard. The approaches predict regulation reserve requirements on a day-ahead basis, including upward and downward requirements, for each operating hour of a day. California ISO data have been used to test the performance of the proposed algorithms. Results show that the software tool allows saving up to 30% on regulation procurement costs.
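The abstract does not detail the statistical procedure; a minimal sketch of one plausible percentile-based variant, assuming hourly historical ACE samples in MW (the function name, confidence level, and synthetic data are our assumptions, not PNNL's method):

```python
import numpy as np

def regulation_requirements(ace_samples, confidence=0.95):
    """Estimate (downward, upward) regulation requirements in MW for one
    operating hour from historical ACE samples, using symmetric
    percentiles of the empirical ACE distribution."""
    lo = np.percentile(ace_samples, 100 * (1 - confidence) / 2)
    hi = np.percentile(ace_samples, 100 * (1 + confidence) / 2)
    # Negative ACE excursions set the downward need, positive the upward.
    return max(-lo, 0.0), max(hi, 0.0)

# Synthetic one-hour ACE history (MW) standing in for real CAISO data.
rng = np.random.default_rng(0)
ace = rng.normal(0.0, 50.0, size=1000)
down_mw, up_mw = regulation_requirements(ace)
```

Repeating the estimate per operating hour yields the day-ahead upward/downward requirement profile the abstract describes.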
Tiffin, Paul A; Paton, Lewis W; Kasim, Adetayo S; Böhnke, Jan R
2018-01-01
Objectives University academic achievement may be inversely related to the performance of the secondary (high) school an entrant attended. Indeed, some medical schools already offer ‘grade discounts’ to applicants from less well-performing schools. However, evidence to guide such policies is lacking. In this study, we analyse a national dataset in order to understand the relationship between the two main predictors of medical school admission in the UK (prior educational attainment (PEA) and performance on the United Kingdom Clinical Aptitude Test (UKCAT)) and subsequent undergraduate knowledge and skills-related outcomes analysed separately. Methods The study was based on national selection data and linked medical school outcomes for knowledge and skills-based tests during the first five years of medical school. UKCAT scores and PEA grades were available for 2107 students enrolled at 18 medical schools. Models were developed to investigate the potential mediating role played by a student’s previous secondary school’s performance. Multilevel models were created to explore the influence of students’ secondary schools on undergraduate achievement in medical school. Results The ability of the UKCAT scores to predict undergraduate academic performance was significantly mediated by PEA in all five years of medical school. Undergraduate achievement was inversely related to secondary school-level performance. This effect waned over time and was less marked for skills, compared with undergraduate knowledge-based outcomes. Thus, the predictive value of secondary school grades was generally dependent on the secondary school in which they were obtained. Conclusions The UKCAT scores added some value, above and beyond secondary school achievement, in predicting undergraduate performance, especially in the later years of study. 
Importantly, the findings suggest that the academic entry criteria should be relaxed for candidates applying from the least well performing secondary schools. In the UK, this would translate into a decrease of approximately one to two A-level grades. PMID:29792300
English, Devin; Lambert, Sharon F.; Ialongo, Nicholas S.
2015-01-01
Although the United States faces a seemingly intractable divide between white and African American academic performance, there remains a dearth of longitudinal research investigating factors that work to maintain this gap. The present study examined whether racial discrimination predicted the academic performance of African American students through its effect on depressive symptoms. Participants were a community sample of African American adolescents (N = 495) attending urban public schools from grade 7 to grade 9 (Mage = 12.5). Structural equation modeling revealed that experienced racial discrimination predicted increases in depressive symptoms 1 year later, which, in turn, predicted decreases in academic performance the following year. These results suggest that racial discrimination continues to play a critical role in the academic performance of African American students and, as such, contributes to the maintenance of the race-based academic achievement gap in the United States. PMID:27425564
Girardat-Rotar, Laura; Braun, Julia; Puhan, Milo A; Abraham, Alison G; Serra, Andreas L
2017-07-17
Prediction models in autosomal dominant polycystic kidney disease (ADPKD) are useful in clinical settings to identify patients with greater risk of a rapid disease progression in whom a treatment may have more benefits than harms. Mayo Clinic investigators developed a risk prediction tool for ADPKD patients using a single kidney value. Our aim was to perform an independent geographical and temporal external validation as well as evaluate the potential for improving the predictive performance by including additional information on total kidney volume. We used data from the on-going Swiss ADPKD study from 2006 to 2016. The main analysis included a sample size of 214 patients with Typical ADPKD (Class 1). We evaluated the Mayo Clinic model's calibration and discrimination in our external sample and assessed whether predictive performance could be improved through the addition of subsequent kidney volume measurements beyond the baseline assessment. The calibration of both versions of the Mayo Clinic prediction model, using continuous height-adjusted total kidney volume (HtTKV) and using risk subclasses, was good, with R² of 78% and 70%, respectively. Accuracy was also good, with 91.5% and 88.7% of the predicted within 30% of the observed, respectively. Additional information regarding kidney volume did not substantially improve the model performance. The Mayo Clinic prediction models are generalizable to other clinical settings and provide an accurate tool based on available predictors to identify patients at high risk for rapid disease progression.
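The accuracy figure quoted ("predicted within 30% of the observed") corresponds to a simple relative-coverage metric; a minimal sketch with illustrative numbers (function name and values are ours, not study data):

```python
def fraction_within(predicted, observed, tolerance=0.30):
    """Fraction of predictions lying within a relative ±tolerance of the
    observed value -- the accuracy measure quoted in the validation."""
    hits = sum(1 for p, o in zip(predicted, observed)
               if abs(p - o) <= tolerance * abs(o))
    return hits / len(observed)

# Illustrative values only: one of four predictions falls outside
# the ±30% band, so the metric returns 0.75.
predicted = [100.0, 140.0, 95.0, 260.0]
observed = [110.0, 100.0, 100.0, 250.0]
accuracy = fraction_within(predicted, observed)
```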
The neural bases of distracter-resistant working memory
Wager, Tor D.; Spicer, Julie; Insler, Rachel; Smith, Edward E.
2014-01-01
A major difference between humans and other animals is our capacity to maintain information in working memory (WM) while performing secondary tasks, which enables sustained, complex cognition. A common assumption is that the lateral prefrontal cortex (PFC) is critical for WM performance in the presence of distracters, but direct evidence is scarce. We assessed the relationship between fMRI activity and WM performance within-subjects, with performance matched across Distracter and No-distracter conditions. Activity in ventrolateral PFC during WM encoding and maintenance positively predicted performance in both conditions, whereas activity in the pre-supplementary motor area (pre-SMA) predicted performance only under distraction. Other parts of dorsolateral and ventrolateral PFC predicted performance only in the No-distracter condition. These findings challenge a lateral PFC-centered view of distracter-resistance, and suggest that the lateral PFC supports a type of WM representation that is efficient for dealing with task-irrelevant input but is nonetheless easily disrupted by dual-task demands. PMID:24366656
Kim, Esther S H; Ishwaran, Hemant; Blackstone, Eugene; Lauer, Michael S
2007-11-06
The purpose of this study was to externally validate the prognostic value of age- and gender-based nomograms and categorical definitions of impaired exercise capacity (EC). Exercise capacity predicts death, but its use in routine clinical practice is hampered by its close correlation with age and gender. For a median of 5 years, we followed 22,275 patients without known heart disease who underwent symptom-limited stress testing. Models for predicted or impaired EC were identified by literature search. Gender-specific multivariable proportional hazards models were constructed. Four methods were used to assess validity: the Akaike Information Criterion (AIC), the right-censored c-index in 100 out-of-bootstrap samples, the Nagelkerke index R², and calculation of calibration error in 100 bootstrap samples. There were 646 and 430 deaths in 13,098 men and 9,177 women, respectively. Of the 7 models tested in men, a model based on a Veterans Affairs cohort (predicted metabolic equivalents [METs] = 18 - [0.15 x age]) had the highest AIC and R². In women, a model based on the St. James Take Heart Project (predicted METs = 14.7 - [0.13 x age]) performed best. Categorical definitions of fitness performed less well. Even after accounting for age and gender, there was still an important interaction with age, whereby predicted EC was a weaker predictor in older subjects (p for interaction <0.001 in men and 0.003 in women). Several methods describe EC accounting for age- and gender-related differences, but their ability to predict mortality differs. Simple cutoff values fail to fully describe EC's strong predictive value.
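The two best-performing nomogram equations are stated explicitly in the abstract; a small sketch applying them (helper names and the percent-of-predicted wrapper are ours):

```python
def predicted_mets(age, sex):
    """Age- and gender-predicted exercise capacity in METs, using the
    best-performing model for each sex reported in the study:
    the Veterans Affairs cohort model for men and the St. James
    Take Heart Project model for women."""
    if sex == "male":
        return 18.0 - 0.15 * age
    if sex == "female":
        return 14.7 - 0.13 * age
    raise ValueError("sex must be 'male' or 'female'")

def percent_of_predicted(achieved_mets, age, sex):
    """Achieved exercise capacity as a fraction of the predicted value."""
    return achieved_mets / predicted_mets(age, sex)

# A 60-year-old man is predicted 18 - 0.15*60 = 9 METs; achieving
# 7.2 METs corresponds to 80% of predicted.
ratio = percent_of_predicted(7.2, 60, "male")
```

Expressing achieved EC as a fraction of the age/gender-predicted value is exactly what lets these nomograms sidestep EC's close correlation with age and gender.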
A perturbative approach for enhancing the performance of time series forecasting.
de Mattos Neto, Paulo S G; Ferreira, Tiago A E; Lima, Aranildo R; Vasconcelos, Germano C; Cavalcanti, George D C
2017-04-01
This paper proposes a method to perform time series prediction based on perturbation theory. The approach is based on continuously adjusting an initial forecasting model to asymptotically approximate a desired time series model. First, a predictive model generates an initial forecast for a time series. Second, a residual time series is calculated as the difference between the original time series and the initial forecast. If that residual series is not white noise, it can be used to improve the accuracy of the initial model, and a new predictive model is adjusted using the residual series. The whole process is repeated until convergence or until the residual series becomes white noise. The output of the method is then given by summing the outputs of all trained predictive models in a perturbative sense. To test the method, an experimental investigation was conducted on six real-world time series. A comparison was made with six other methods and with ten results previously reported in the literature. The results show not only that the performance of the initial model is significantly improved but also that the proposed method outperforms previously published results. Copyright © 2017 Elsevier Ltd. All rights reserved.
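The iterative residual-modelling loop described above can be sketched as follows; this is a minimal in-sample illustration using least-squares AR(2) models and a lag-1 autocorrelation check as the white-noise test (the paper's models and stopping test are more general):

```python
import numpy as np

def fit_ar(series, lags=2):
    """Least-squares AR(lags) fit with intercept; returns coefficients."""
    X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)]
                        + [np.ones(len(series) - lags)])
    coef, *_ = np.linalg.lstsq(X, series[lags:], rcond=None)
    return coef

def predict_ar(coef, series, lags=2):
    """In-sample one-step predictions of the AR model."""
    X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)]
                        + [np.ones(len(series) - lags)])
    return X @ coef

def perturbative_fit(series, lags=2, max_rounds=5, tol=0.05):
    """Perturbative scheme sketch: fit an initial model, then repeatedly
    model the residual series and add its output, stopping when the
    residual resembles white noise (small lag-1 autocorrelation)."""
    resid = np.asarray(series, dtype=float).copy()
    total = np.zeros(len(resid) - lags)
    for _ in range(max_rounds):
        coef = fit_ar(resid, lags)
        pred = predict_ar(coef, resid, lags)
        total += pred                      # sum outputs of all models
        resid = np.concatenate([resid[:lags], resid[lags:] - pred])
        r = resid - resid.mean()
        denom = float(r @ r)
        if denom == 0.0 or abs(float(r[1:] @ r[:-1]) / denom) < tol:
            break                          # residual looks like white noise
    return total

t = np.arange(60, dtype=float)
series = 0.5 * t + np.sin(t / 3.0)         # trend plus oscillation
fitted = perturbative_fit(series)
```

By construction, the gap between the target and the summed model outputs equals the final residual, so each round can only tighten the fit.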
1993-03-01
The stated goal of the thesis is to develop an experimental mathematical model for predicting the job performance of enlisted personnel in AFS 454X1.
Review of numerical models to predict cooling tower performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, B.M.; Nomura, K.K.; Bartz, J.A.
1987-01-01
Four state-of-the-art computer models developed to predict the thermal performance of evaporative cooling towers are summarized. The formulation of these models, STAR and TEFERI (developed in Europe) and FACTS and VERA2D (developed in the U.S.), is summarized. A fifth code, based on Merkel analysis, is also discussed. Principal features of the codes, computation time and storage requirements are described. A discussion of model validation is also provided.
DOT National Transportation Integrated Search
1998-09-01
This report summarizes 23 years of work undertaken in Texas to understand the reasons for significant performance differences found in pavements placed around the state. To a significant degree, pavement performance can be predicted based on the conc...
Distributed Prognostics based on Structural Model Decomposition
NASA Technical Reports Server (NTRS)
Daigle, Matthew J.; Bregon, Anibal; Roychoudhury, I.
2014-01-01
Within systems health management, prognostics focuses on predicting the remaining useful life of a system. In the model-based prognostics paradigm, physics-based models are constructed that describe the operation of a system and how it fails. Such approaches consist of an estimation phase, in which the health state of the system is first identified, and a prediction phase, in which the health state is projected forward in time to determine the end of life. Centralized solutions to these problems are often computationally expensive, do not scale well as the size of the system grows, and introduce a single point of failure. In this paper, we propose a novel distributed model-based prognostics scheme that formally describes how to decompose both the estimation and prediction problems into independent local subproblems whose solutions may be easily composed into a global solution. The decomposition of the prognostics problem is achieved through structural decomposition of the underlying models. The decomposition algorithm creates from the global system model a set of local submodels suitable for prognostics. Independent local estimation and prediction problems are formed based on these local submodels, resulting in a scalable distributed prognostics approach that allows the local subproblems to be solved in parallel, thus offering increases in computational efficiency. Using a centrifugal pump as a case study, we perform a number of simulation-based experiments to demonstrate the distributed approach, compare the performance with a centralized approach, and establish its scalability. Index Terms: model-based prognostics, distributed prognostics, structural model decomposition
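The compose-from-local-solutions idea can be illustrated with a toy decomposition: independent submodels are projected to end of life in parallel, and the global EOL is the earliest local EOL. The submodel names, linear degradation law, and numbers below are hypothetical placeholders, not the paper's pump model:

```python
from concurrent.futures import ThreadPoolExecutor

def local_eol(submodel):
    """Project one submodel's damage state forward until its failure
    threshold is crossed; returns the predicted end of life in steps.
    Linear degradation stands in for a physics-based damage model."""
    state, rate, threshold = (submodel["state"], submodel["rate"],
                              submodel["threshold"])
    steps = 0
    while state < threshold:
        state += rate
        steps += 1
    return steps

# Hypothetical decomposition of a pump model into independent local
# submodels, each with its own damage state and growth rate.
submodels = [
    {"name": "impeller_wear", "state": 0.2, "rate": 0.01, "threshold": 1.0},
    {"name": "bearing_wear", "state": 0.5, "rate": 0.004, "threshold": 1.0},
]

# The local prediction problems share no state, so they run in parallel;
# the system fails when the first subsystem does.
with ThreadPoolExecutor() as pool:
    local_eols = list(pool.map(local_eol, submodels))
system_eol = min(local_eols)
```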
Toward Biopredictive Dissolution for Enteric Coated Dosage Forms.
Al-Gousous, J; Amidon, G L; Langguth, P
2016-06-06
The aim of this work was to develop a phosphate buffer based dissolution method for enteric-coated formulations with improved biopredictivity for fasted conditions. Two commercially available enteric-coated aspirin products were used as model formulations (Aspirin Protect 300 mg, and Walgreens Aspirin 325 mg). The disintegration performance of these products in a physiological 8 mM pH 6.5 bicarbonate buffer (representing the conditions in the proximal small intestine) was used as a standard to optimize the employed phosphate buffer molarity. To account for the fact that a pH and buffer molarity gradient exists along the small intestine, the introduction of such a gradient was proposed for products with prolonged lag times (when it leads to a release lower than 75% in the first hour post acid stage) in the proposed buffer. This would allow the method also to predict the performance of later-disintegrating products. Dissolution performance using the accordingly developed method was compared to that observed when using two well-established dissolution methods: the United States Pharmacopeia (USP) method and blank fasted state simulated intestinal fluid (FaSSIF). The resulting dissolution profiles were convoluted using GastroPlus software to obtain predicted pharmacokinetic profiles. A pharmacokinetic study on healthy human volunteers was performed to evaluate the predictions made by the different dissolution setups. The novel method provided the best prediction, by a relatively wide margin, for the difference between the lag times of the two tested formulations, indicating its being able to predict the post gastric emptying onset of drug release with reasonable accuracy. Both the new and the blank FaSSIF methods showed potential for establishing in vitro-in vivo correlation (IVIVC) concerning the prediction of Cmax and AUC0-24 (prediction errors not more than 20%). 
However, these predictions are strongly affected by the highly variable first-pass metabolism, necessitating the evaluation of an absorption rate metric that is more independent of the first-pass effect. The Cmax/AUC0-24 ratio was selected for this purpose. Regarding this metric's predictions, the new method provided very good prediction of the two products' performances relative to each other (only 1.05% prediction error in this regard), while its predictions for the individual products' values in absolute terms were borderline, narrowly missing the regulatory 20% prediction error limits (21.51% for Aspirin Protect and 22.58% for Walgreens Aspirin). The blank FaSSIF-based method provided good Cmax/AUC0-24 ratio prediction, in absolute terms, for Aspirin Protect (9.05% prediction error), but its prediction for Walgreens Aspirin (33.97% prediction error) was overwhelmingly poor. Thus it gave practically the same average but much higher maximum prediction errors compared to the new method, and it was strongly overdiscriminating in predicting the products' performances relative to one another. The USP method, despite not being overdiscriminating, provided poor predictions of the individual products' Cmax/AUC0-24 ratios. This indicates that, overall, the new method offers improved biopredictivity compared to established methods.
EOID System Model Validation, Metrics, and Synthetic Clutter Generation
2003-09-30
Our long-term goal is to accurately predict the capability of the current generation of laser-based underwater imaging sensors to perform Electro-Optic Identification (EOID) against relevant targets in a variety of realistic environmental conditions. The models will predict the impact of
Boosting compound-protein interaction prediction by deep learning.
Tian, Kai; Shao, Mingyu; Wang, Yang; Guan, Jihong; Zhou, Shuigeng
2016-11-01
The identification of interactions between compounds and proteins plays an important role in network pharmacology and drug discovery. However, experimentally identifying compound-protein interactions (CPIs) is generally expensive and time-consuming, so computational approaches have been introduced. Among these, machine-learning-based methods have achieved considerable success. However, due to the nonlinear and imbalanced nature of biological data, many machine learning approaches have their own limitations. Recently, deep learning techniques have shown advantages over many state-of-the-art machine learning methods in some applications. In this study, we aim at improving the performance of CPI prediction based on deep learning, and propose a method called DL-CPI (the abbreviation of Deep Learning for Compound-Protein Interactions prediction), which employs a deep neural network (DNN) to effectively learn the representations of compound-protein pairs. Extensive experiments show that DL-CPI can learn useful features of compound-protein pairs by a layerwise abstraction, and thus achieves better prediction performance than existing methods on both balanced and imbalanced datasets. Copyright © 2016 Elsevier Inc. All rights reserved.
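The input representation and layerwise abstraction can be illustrated with a tiny untrained feed-forward pass over a concatenated compound-protein feature vector. The layer sizes, feature dimensions, and random weights below are illustrative assumptions, not DL-CPI's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(42)

def forward(pair_vector, layers):
    """Layerwise forward pass of a small feed-forward DNN over a
    concatenated compound-protein feature vector, ending in a
    sigmoid interaction probability."""
    h = pair_vector
    for W, b in layers[:-1]:
        h = np.maximum(W @ h + b, 0.0)         # ReLU hidden layers
    W, b = layers[-1]
    return 1.0 / (1.0 + np.exp(-(W @ h + b)))  # sigmoid output

# Illustrative sizes: a 166-bit compound fingerprint concatenated with
# a 128-dimensional protein descriptor, two hidden layers, one output.
dims = [166 + 128, 64, 32, 1]
layers = [(rng.normal(0.0, 0.1, (dims[i + 1], dims[i])),
           np.zeros(dims[i + 1])) for i in range(len(dims) - 1)]

compound = rng.integers(0, 2, 166).astype(float)  # fingerprint bits
protein = rng.normal(size=128)                    # protein descriptors
prob = float(forward(np.concatenate([compound, protein]), layers)[0])
```

Training such a network on labeled CPI pairs (e.g. with cross-entropy loss) is what lets the hidden layers build the increasingly abstract pair representations the abstract refers to.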