Multivariate Strategies in Functional Magnetic Resonance Imaging
ERIC Educational Resources Information Center
Hansen, Lars Kai
2007-01-01
We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a "mind reading" predictive multivariate fMRI model.
Can multivariate models based on MOAKS predict OA knee pain? Data from the Osteoarthritis Initiative
NASA Astrophysics Data System (ADS)
Luna-Gómez, Carlos D.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Galván-Tejada, Carlos E.; Celaya-Padilla, José M.
2017-03-01
Osteoarthritis is the most common rheumatic disease in the world. Knee pain is the most disabling symptom of the disease, and predicting pain is one of the targets of preventive medicine, since it can inform new therapies or treatments. Using magnetic resonance imaging and grading scales, a multivariate model based on genetic algorithms is presented. A predictive model can be useful for associating minor structural changes in the joint with future knee pain. Results suggest that multivariate models can be predictive of future chronic knee pain. All models (T0, T1 and T2) were statistically significant: all p values were < 0.05 and all AUC > 0.60.
Multivariate Radiological-Based Models for the Prediction of Future Knee Pain: Data from the OAI
Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Treviño, Victor; Tamez-Peña, José G.
2015-01-01
In this work, the potential of X-ray based multivariate prognostic models to predict the onset of chronic knee pain is presented. Using quantitative X-ray image assessments of joint-space-width (JSW) and paired semiquantitative central X-ray scores from the Osteoarthritis Initiative (OAI), a case-control study is presented. The pain assessments of the right knee at the baseline and the 60-month visits were used to screen for case/control subjects. Scores were analyzed at the time of pain incidence (T-0), the year prior to incidence (T-1), and two years before pain incidence (T-2). Multivariate models were created by a cross-validated, elastic-net regularized generalized linear model feature selection tool. Univariate differences between cases and controls were reported by AUC, C-statistics, and odds ratios. Univariate analysis indicated that the medial osteophytes were significantly more prevalent in cases than controls: C-stat 0.62, 0.62, and 0.61, at T-0, T-1, and T-2, respectively. The multivariate JSW models significantly predicted pain: AUC = 0.695, 0.623, and 0.620, at T-0, T-1, and T-2, respectively. Semiquantitative multivariate models predicted pain with C-stat = 0.671, 0.648, and 0.645 at T-0, T-1, and T-2, respectively. Multivariate models derived from plain X-ray radiography assessments may be used to predict subjects that are at risk of developing knee pain. PMID:26504490
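The AUC (equivalently, the C-statistic for a binary outcome) reported above can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The following is a minimal sketch of that pairwise definition; the function name `auc` and the risk scores are hypothetical and are not taken from the study.

```python
def auc(scores_cases, scores_controls):
    """AUC as the fraction of case/control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for c in scores_cases:
        for k in scores_controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))


# Hypothetical model scores for subjects who did / did not develop pain.
cases = [0.9, 0.7, 0.6, 0.4]
controls = [0.8, 0.5, 0.3, 0.2]
print(auc(cases, controls))  # 12 of 16 pairs are ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to chance-level ranking, which is why values near 0.62 to 0.70 indicate modest discrimination.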
NASA Astrophysics Data System (ADS)
Yu, H.; Gu, H.
2017-12-01
A novel multivariate seismic formation pressure prediction methodology is presented, which incorporates high-resolution seismic velocity data from prestack AVO inversion, and petrophysical data (porosity and shale volume) derived from poststack seismic motion inversion. In contrast to traditional seismic formation prediction methods, the proposed methodology is based on a multivariate pressure prediction model and utilizes a trace-by-trace multivariate regression analysis on seismic-derived petrophysical properties to calibrate model parameters in order to make accurate predictions with higher resolution in both vertical and lateral directions. With prestack time migration velocity as initial velocity model, an AVO inversion was first applied to prestack dataset to obtain high-resolution seismic velocity with higher frequency that is to be used as the velocity input for seismic pressure prediction, and the density dataset to calculate accurate Overburden Pressure (OBP). Seismic Motion Inversion (SMI) is an inversion technique based on Markov Chain Monte Carlo simulation. Both structural variability and similarity of seismic waveform are used to incorporate well log data to characterize the variability of the property to be obtained. In this research, porosity and shale volume are first interpreted on well logs, and then combined with poststack seismic data using SMI to build porosity and shale volume datasets for seismic pressure prediction. A multivariate effective stress model is used to convert velocity, porosity and shale volume datasets to effective stress. After a thorough study of the regional stratigraphic and sedimentary characteristics, a regional normally compacted interval model is built, and then the coefficients in the multivariate prediction model are determined in a trace-by-trace multivariate regression analysis on the petrophysical data. 
The coefficients are used to convert velocity, porosity and shale volume datasets to effective stress and then to calculate formation pressure with OBP. Application of the proposed methodology to a research area in the East China Sea has shown that the method can bridge the gap between seismic and well log pressure prediction and give predicted pressure values close to pressure measurements from well testing.
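The abstract does not state the formulas used. As a hedged sketch, the final conversion from effective stress to formation pressure is commonly written with Terzaghi's effective-stress relation, with the overburden pressure obtained by integrating density over depth. The symbols below are assumptions introduced for illustration, and the function f stands for the unspecified multivariate effective stress model.

```latex
\mathrm{OBP}(z) = g \int_{0}^{z} \rho(z')\,\mathrm{d}z', \qquad
\sigma_{\mathrm{eff}} = f\left(V, \phi, V_{\mathrm{sh}}\right), \qquad
P_{\mathrm{f}}(z) = \mathrm{OBP}(z) - \sigma_{\mathrm{eff}}(z)
```

Here V is seismic velocity, \phi is porosity, V_sh is shale volume, \rho is bulk density, and P_f is the predicted formation (pore) pressure.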
Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L
2017-05-07
In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
Characterizing multivariate decoding models based on correlated EEG spectral features
McFarland, Dennis J.
2013-01-01
Objective: Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods: Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results: The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions: Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance: While multivariate decoding algorithms are very useful for prediction, their utility for interpretation may be limited when predictors are correlated. PMID:23466267
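One common way to quantify the multicollinearity discussed above is the variance inflation factor (VIF). This is not stated to be the measure used in the study; the sketch below only illustrates the idea for the two-predictor case, where VIF = 1 / (1 - r^2) and r is the Pearson correlation between the predictors. The feature values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor when a model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical spectral features from two neighbouring frequency bins.
bin_a = [1.0, 2.0, 3.0, 4.0, 5.0]
bin_b = [1.1, 1.9, 3.2, 3.9, 5.1]   # nearly a copy of bin_a
bin_c = [2.0, 1.0, 3.0, 1.0, 2.0]   # uncorrelated with bin_a

print(vif_two_predictors(bin_a, bin_b))  # large: weights are unstable
print(vif_two_predictors(bin_a, bin_c))  # 1.0: no inflation
```

A large VIF means the individual regression weights are poorly determined even when the combined prediction is good, which matches the abstract's caution about interpreting weight patterns.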
Multivariate regression model for predicting lumber grade volumes of northern red oak sawlogs
Daniel A. Yaussy; Robert L. Brisbin
1983-01-01
A multivariate regression model was developed to predict green board-foot yields for the seven common factory lumber grades processed from northern red oak (Quercus rubra L.) factory grade logs. The model uses the standard log measurements of grade, scaling diameter, length, and percent defect. It was validated with an independent data set. The model...
2017-09-01
efficacy of statistical post-processing methods downstream of these dynamical model components with a hierarchical multivariate Bayesian approach to... Bayesian hierarchical modeling, Markov chain Monte Carlo methods, Metropolis algorithm, machine learning, atmospheric prediction ...scale processes. However, this dissertation explores the efficacy of statistical post-processing methods downstream of these dynamical model components
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Predictive model for falling in Parkinson disease patients.
Custodio, Nilton; Lira, David; Herrera-Perez, Eder; Montesinos, Rosa; Castro-Suarez, Sheila; Cuenca-Alfaro, Jose; Cortijo, Patricia
2016-12-01
Falls are a common complication of advancing Parkinson's disease (PD). Although numerous risk factors are known, reliable predictors of future falls are still lacking. The aim of this study was to develop a multivariate model to predict falling in PD patients. Prospective cohort with forty-nine PD patients. The area under the receiver-operating characteristic curve (AUC) was calculated to evaluate predictive performance of the proposed multivariate model. The median PD duration and UPDRS-III score in the cohort were 6 years and 24 points, respectively. Falls occurred in 18 PD patients (30%). Predictive factors for falling identified by univariate analysis were age, PD duration, physical activity, and scores of UPDRS motor, FOG, ACE, IFS, PFAQ and GDS (p-value < 0.001), as well as fear of falling score (p-value = 0.04). The final multivariate model (PD duration, FOG, ACE, and physical activity) showed an AUC = 0.9282 (correctly classified = 89.83%; sensitivity = 92.68%; specificity = 83.33%). This study showed that our multivariate model performs well in predicting falling in a sample of PD patients.
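Sensitivity, specificity and the proportion correctly classified all follow from a 2x2 confusion table. The sketch below shows the arithmetic. The counts are hypothetical: they are one set of whole numbers that happens to reproduce the reported percentages, they total 59 rather than the 49 patients stated in the abstract, and they should not be read as the study's actual table.

```python
def classification_summary(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-table counts."""
    sensitivity = tp / (tp + fn)          # true positives among actual positives
    specificity = tn / (tn + fp)          # true negatives among actual negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen only to illustrate the formulas.
sens, spec, acc = classification_summary(tp=38, fn=3, tn=15, fp=3)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```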
A multivariate model for predicting segmental body composition.
Tian, Simiao; Mioche, Laurence; Denis, Jean-Baptiste; Morio, Béatrice
2013-12-01
The aims of the present study were to propose a multivariate model for simultaneously predicting body, trunk and appendicular fat and lean masses from easily measured variables and to compare its predictive capacity with that of the available univariate models that predict body fat percentage (BF%). The dual-energy X-ray absorptiometry (DXA) dataset (52% men and 48% women) with White, Black and Hispanic ethnicities (1999-2004, National Health and Nutrition Examination Survey) was randomly divided into three sub-datasets: a training dataset (TRD), a test dataset (TED) and a validation dataset (VAD), comprising 3835, 1917 and 1917 subjects, respectively. For each sex, several multivariate prediction models were fitted from the TRD using age, weight, height and possibly waist circumference. The most accurate model was selected from the TED and then applied to the VAD and a French DXA dataset (French DB) (526 men and 529 women) to assess the prediction accuracy in comparison with that of five published univariate models, for which adjusted formulas were re-estimated using the TRD. Waist circumference was found to improve the prediction accuracy, especially in men. For BF%, the standard error of prediction (SEP) values were 3.26% (3.75%) for men and 3.47% (3.95%) for women in the VAD (French DB), as good as those of the adjusted univariate models. Moreover, the SEP values for the prediction of body and appendicular lean masses ranged from 1.39 to 2.75 kg for both sexes. The prediction accuracy was best for age < 65 years, BMI < 30 kg/m² and the Hispanic ethnicity. The application of our multivariate model to large populations could be useful to address various public health issues.
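The three-way random division described above (fit on the TRD, select on the TED, assess on the VAD) can be sketched as follows. The function name, the seed and the use of integer subject IDs are assumptions made for illustration; only the three subset sizes come from the abstract.

```python
import random


def three_way_split(ids, n_train, n_test, seed=0):
    """Shuffle subject IDs and cut them into training, test and validation sets."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    valid = shuffled[n_train + n_test:]   # whatever remains
    return train, test, valid


# 3835 + 1917 + 1917 = 7669 subjects in total.
train, test, valid = three_way_split(range(7669), n_train=3835, n_test=1917)
print(len(train), len(test), len(valid))
```

Keeping the validation set untouched until a single model has been chosen is what makes its SEP an honest estimate of out-of-sample error.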
Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs
Andrew F. Howard; Daniel A. Yaussy
1986-01-01
A multivariate regression model was developed to predict green board-foot yields for the common grades of factory lumber processed from yellow birch factory-grade logs. The model incorporates the standard log measurements of scaling diameter, length, proportion of scalable defects, and the assigned USDA Forest Service log grade. Differences in yields between band and...
A multivariate model and statistical method for validating tree grade lumber yield equations
Donald W. Seegrist
1975-01-01
Lumber yields within lumber grades can be described by a multivariate linear model. A method for validating lumber yield prediction equations when there are several tree grades is presented. The method is based on multivariate simultaneous test procedures.
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.
Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P
2015-11-01
To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
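The "3.7 times more frequently" figure is consistent with the ratio of the two models' error rates. A small worked check, using the rounded accuracies quoted in the abstract (the function name is hypothetical):

```python
def error_ratio(acc_reference, acc_best):
    """Ratio of misclassification rates between a reference and an improved model."""
    return (1.0 - acc_reference) / (1.0 - acc_best)


# Reference model: 78% accuracy; best model: 94% accuracy.
ratio = error_ratio(0.78, 0.94)
print(round(ratio, 1))  # 0.22 / 0.06, about 3.7
```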
Piecewise multivariate modelling of sequential metabolic profiling data.
Rantalainen, Mattias; Cloarec, Olivier; Ebbels, Timothy M D; Lundstedt, Torbjörn; Nicholson, Jeremy K; Holmes, Elaine; Trygg, Johan
2008-02-19
Modelling the time-related behaviour of biological systems is essential for understanding their dynamic responses to perturbations. In metabolic profiling studies, the sampling rate and number of sampling points are often restricted due to experimental and biological constraints. A supervised multivariate modelling approach with the objective to model the time-related variation in the data for short and sparsely sampled time-series is described. A set of piecewise Orthogonal Projections to Latent Structures (OPLS) models are estimated, describing changes between successive time points. The individual OPLS models are linear, but the piecewise combination of several models accommodates modelling and prediction of changes which are non-linear with respect to the time course. We demonstrate the method on both simulated and metabolic profiling data, illustrating how time related changes are successfully modelled and predicted. The proposed method is effective for modelling and prediction of short and multivariate time series data. A key advantage of the method is model transparency, allowing easy interpretation of time-related variation in the data. The method provides a competitive complement to commonly applied multivariate methods such as OPLS and Principal Component Analysis (PCA) for modelling and analysis of short time-series data.
Multivariate Analysis of Seismic Field Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alam, M. Kathleen
1999-06-01
This report includes the details of the model building procedure and prediction of seismic field data. Principal Components Regression, a multivariate analysis technique, was used to model seismic data collected as two pieces of equipment were cycled on and off. Models built that included only the two pieces of equipment of interest had trouble predicting data containing signals not included in the model. Evidence for poor predictions came from the prediction curves as well as spectral F-ratio plots. Once the extraneous signals were included in the model, predictions improved dramatically. While Principal Components Regression performed well for the present data sets, the present data analysis suggests further work will be needed to develop more robust modeling methods as the data become more complex.
Multivariable Time Series Prediction for the Icing Process on Overhead Power Transmission Line
Li, Peng; Zhao, Na; Zhou, Donghua; Cao, Min; Li, Jingjie; Shi, Xinling
2014-01-01
The design of monitoring and predictive alarm systems is necessary for successfully managing overhead power transmission line icing. Given the characteristics of complexity, nonlinearity, and fitfulness in the line icing process, a model based on a multivariable time series is presented here to predict the icing load of a transmission line. In this model, the time effects of micrometeorology parameters for the icing process have been analyzed. The phase-space reconstruction theory and machine learning method were then applied to establish the prediction model, which fully utilized the history of multivariable time series data in local monitoring systems to represent the mapping relationship between icing load and micrometeorology factors. Relevant to the characteristic of fitfulness in line icing, the simulations were carried out within the same icing process or across different processes to test the model's prediction precision and robustness. According to the simulation results for the Tao-Luo-Xiong Transmission Line, this model demonstrates good prediction accuracy across different processes if the prediction length is less than two hours, and would be helpful for power grid departments when deciding to take action in advance to address potential icing disasters. PMID:25136653
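Phase-space reconstruction is usually carried out by time-delay embedding: each observation is replaced by a vector of its current and lagged values, and for a multivariable series the per-variable delay vectors are concatenated. The sketch below illustrates that step only. The embedding dimension, delay and sample values are hypothetical, and the abstract does not say how the authors chose them.

```python
def delay_embed(series, dim, tau):
    """Delay vectors [x[t], x[t - tau], ..., x[t - (dim - 1) * tau]] for each valid t."""
    start = (dim - 1) * tau
    return [[series[t - k * tau] for k in range(dim)]
            for t in range(start, len(series))]


def embed_multivariate(columns, dim, tau):
    """Concatenate the delay vectors of several equally long series, time step by time step."""
    per_variable = [delay_embed(col, dim, tau) for col in columns]
    return [sum(rows, []) for rows in zip(*per_variable)]


# Hypothetical hourly readings: icing load and air temperature.
icing_load = [1.0, 2.0, 3.0, 4.0]
temperature = [10.0, 20.0, 30.0, 40.0]
features = embed_multivariate([icing_load, temperature], dim=2, tau=1)
print(features)
```

Each row of `features` could then serve as the input to a regression model that predicts the icing load some steps ahead.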
ERIC Educational Resources Information Center
McKinney, Cliff; Renk, Kimberly
2008-01-01
Although parent-adolescent interactions have been examined, relevant variables have not been integrated into a multivariate model. As a result, this study examined a multivariate model of parent-late adolescent gender dyads in an attempt to capture important predictors in late adolescents' important and unique transition to adulthood. The sample…
Accuracies of univariate and multivariate genomic prediction models in African cassava.
Okeke, Uche Godfrey; Akdemir, Deniz; Rabbi, Ismail; Kulakow, Peter; Jannink, Jean-Luc
2017-12-04
Genomic selection (GS) promises to accelerate genetic gain in plant breeding programs especially for crop species such as cassava that have long breeding cycles. Practically, to implement GS in cassava breeding, it is necessary to evaluate different GS models and to develop suitable models for an optimized breeding pipeline. In this paper, we compared (1) prediction accuracies from a single-trait (uT) and a multi-trait (MT) mixed model for a single-environment genetic evaluation (Scenario 1), and (2) accuracies from a compound symmetric multi-environment model (uE) parameterized as a univariate multi-kernel model to a multivariate (ME) multi-environment mixed model that accounts for genotype-by-environment interaction for multi-environment genetic evaluation (Scenario 2). For these analyses, we used 16 years of public cassava breeding data for six target cassava traits and a fivefold cross-validation scheme with 10-repeat cycles to assess model prediction accuracies. In Scenario 1, the MT models had higher prediction accuracies than the uT models for all traits and locations analyzed, which amounted to on average a 40% improved prediction accuracy. For Scenario 2, we observed that the ME model had on average (across all locations and traits) a 12% improved prediction accuracy compared to the uE model. We recommend the use of multivariate mixed models (MT and ME) for cassava genetic evaluation. These models may be useful for other plant species.
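A fivefold cross-validation scheme with 10 repeat cycles means the lines are reshuffled ten times and, within each shuffle, each of the five folds is held out once, giving 50 train/test splits. The following is a minimal sketch of generating such splits; the generator name, the seed and the sample count are hypothetical, and the fitting of the mixed models is not shown.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)                         # new random partition each repeat
        folds = [idx[i::k] for i in range(k)]    # k nearly equal folds
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield rep, f, train, test


splits = list(repeated_kfold(n_samples=23))
print(len(splits))  # 10 repeats x 5 folds
```

Prediction accuracy would typically be computed on each held-out fold and then averaged over all splits.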
NASA Astrophysics Data System (ADS)
Chen, Quansheng; Qi, Shuai; Li, Huanhuan; Han, Xiaoyan; Ouyang, Qin; Zhao, Jiewen
2014-10-01
To rapidly and efficiently detect the presence of adulterants in honey, a three-dimensional fluorescence spectroscopy (3DFS) technique was employed with the help of multivariate calibration. The 3D fluorescence spectra were compressed using characteristic extraction and principal component analysis (PCA). Then, partial least squares (PLS) and back propagation neural network (BP-ANN) algorithms were used for modeling. The model was optimized by cross validation, and its performance was evaluated according to the root mean square error of prediction (RMSEP) and the correlation coefficient (R) in the prediction set. The results showed that the BP-ANN model was superior to the PLS models, and the optimum prediction results of the mixed group (sunflower ± longan ± buckwheat ± rape) model were achieved as follows: RMSEP = 0.0235 and R = 0.9787 in the prediction set. The study demonstrated that the 3D fluorescence spectroscopy technique combined with multivariate calibration has high potential for rapid, nondestructive, and accurate quantitative analysis of honey adulteration.
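The two figures of merit named above can be computed directly from reference and predicted values in the prediction set. The sketch below uses the standard definitions; the adulteration levels and predictions are hypothetical and are not data from the study.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a held-out prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def correlation(y_true, y_pred):
    """Pearson correlation coefficient R between reference and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical adulterant fractions and model predictions.
reference = [0.00, 0.10, 0.20, 0.30, 0.40]
predicted = [0.02, 0.08, 0.21, 0.33, 0.38]
print(rmsep(reference, predicted), correlation(reference, predicted))
```

A lower RMSEP and an R closer to 1 both indicate better agreement, but R alone can hide a systematic offset, so the two are usually reported together.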
Snell, Kym I E; Hua, Harry; Debray, Thomas P A; Ensor, Joie; Look, Maxime P; Moons, Karel G M; Riley, Richard D
2016-01-01
Our aim was to improve meta-analysis methods for summarizing a prediction model's performance when individual participant data are available from multiple studies for external validation. We suggest multivariate meta-analysis for jointly synthesizing calibration and discrimination performance, while accounting for their correlation. The approach estimates a prediction model's average performance, the heterogeneity in performance across populations, and the probability of "good" performance in new populations. This allows different implementation strategies (e.g., recalibration) to be compared. Application is made to a diagnostic model for deep vein thrombosis (DVT) and a prognostic model for breast cancer mortality. In both examples, multivariate meta-analysis reveals that calibration performance is excellent on average but highly heterogeneous across populations unless the model's intercept (baseline hazard) is recalibrated. For the cancer model, the probability of "good" performance (defined by C statistic ≥0.7 and calibration slope between 0.9 and 1.1) in a new population was 0.67 with recalibration but 0.22 without recalibration. For the DVT model, even with recalibration, there was only a 0.03 probability of "good" performance. Multivariate meta-analysis can be used to externally validate a prediction model's calibration and discrimination performance across multiple populations and to evaluate different implementation strategies. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
Finley, Andrew O.; Banerjee, Sudipto; Cook, Bruce D.; Bradford, John B.
2013-01-01
In this paper we detail a multivariate spatial regression model that couples LiDAR, hyperspectral and forest inventory data to predict forest outcome variables at a high spatial resolution. The proposed model is used to analyze forest inventory data collected on the US Forest Service Penobscot Experimental Forest (PEF), ME, USA. In addition to helping meet the regression model's assumptions, results from the PEF analysis suggest that the addition of multivariate spatial random effects improves model fit and predictive ability, compared with two commonly applied modeling approaches. This improvement results from explicitly modeling the covariation among forest outcome variables and spatial dependence among observations through the random effects. Direct application of such multivariate models to even moderately large datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. We apply a spatial dimension reduction technique to help overcome this computational hurdle without sacrificing richness in modeling.
Galván-Tejada, Carlos E.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L.
2017-01-01
Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, mammography for early detection has been demonstrated to be a very important tool for increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using Random Forest, Nearest Centroid, and K-Nearest Neighbor (K-NN) strategies as the cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions. PMID:28216571
Lee, Tsair-Fwu; Liou, Ming-Hsiang; Huang, Yu-Jie; Chao, Pei-Ju; Ting, Hui-Min; Lee, Hsiao-Yi
2014-01-01
To predict the incidence of moderate-to-severe patient-reported xerostomia among head and neck squamous cell carcinoma (HNSCC) and nasopharyngeal carcinoma (NPC) patients treated with intensity-modulated radiotherapy (IMRT). Multivariable normal tissue complication probability (NTCP) models were developed using quality of life questionnaire datasets from 152 patients with HNSCC and 84 patients with NPC. The primary endpoint was defined as moderate-to-severe xerostomia after IMRT. The number of predictive factors for a multivariable logistic regression model was determined using the least absolute shrinkage and selection operator (LASSO) with a bootstrapping technique. LASSO yielded four predictive models with the smallest number of factors that still preserved predictive value, as indicated by higher AUC. For all models, the mean doses given to the contralateral and ipsilateral parotid glands were selected as the most significant predictors. Different clinical and socio-economic factors, namely age, financial status, T stage, and education, were then selected for the different models. The predicted incidence of xerostomia for HNSCC and NPC patients can be improved by using multivariable logistic regression models with the LASSO technique. The predictive model developed in HNSCC cannot be generalized to an NPC cohort treated with IMRT without validation, and vice versa. PMID:25163814
Sun, Jin; Rutkoski, Jessica E; Poland, Jesse A; Crossa, José; Jannink, Jean-Luc; Sorrells, Mark E
2017-07-01
High-throughput phenotyping (HTP) platforms can be used to measure traits that are genetically correlated with wheat (Triticum aestivum L.) grain yield across time. Incorporating such secondary traits in the multivariate pedigree and genomic prediction models would be desirable to improve indirect selection for grain yield. In this study, we evaluated three statistical models, simple repeatability (SR), multitrait (MT), and random regression (RR), for the longitudinal data of secondary traits and compared the impact of the proposed models for secondary traits on their predictive abilities for grain yield. Grain yield and secondary traits, canopy temperature (CT) and normalized difference vegetation index (NDVI), were collected in five diverse environments for 557 wheat lines with available pedigree and genomic information. A two-stage analysis was applied for pedigree and genomic selection (GS). First, secondary traits were fitted by SR, MT, or RR models, separately, within each environment. Then, best linear unbiased predictions (BLUPs) of secondary traits from the above models were used in the multivariate prediction models to compare predictive abilities for grain yield. Predictive ability was substantially improved by 70%, on average, from multivariate pedigree and genomic models when including secondary traits in both training and test populations. Additionally, (i) predictive abilities slightly varied for MT, RR, or SR models in this data set, (ii) results indicated that including BLUPs of secondary traits from the MT model was the best in severe drought, and (iii) the RR model was slightly better than SR and MT models under drought environment. Copyright © 2017 Crop Science Society of America.
Multivariate Regression Analysis and Slaughter Livestock
(AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
Wang, Ming; Li, Zheng; Lee, Eun Young; Lewis, Mechelle M; Zhang, Lijun; Sterling, Nicholas W; Wagner, Daymond; Eslinger, Paul; Du, Guangwei; Huang, Xuemei
2017-09-25
It is challenging for current statistical models to predict clinical progression of Parkinson's disease (PD) because of the involvement of multiple domains and longitudinal data. Past univariate longitudinal or multivariate analyses from cross-sectional trials have limited power to predict individual outcomes or a single moment. The multivariate generalized linear mixed-effect model (GLMM) under the Bayesian framework was proposed to study multi-domain longitudinal outcomes obtained at baseline, 18 months, and 36 months. The outcomes included motor, non-motor, and postural instability scores from the MDS-UPDRS, and demographic and standardized clinical data were used as covariates. Dynamic prediction was performed for both internal and external subjects using samples from the posterior distributions of the parameter estimates and random effects, and predictive accuracy was evaluated based on the root mean square error (RMSE), absolute bias (AB), and the area under the receiver operating characteristic (ROC) curve. First, our prediction model identified clinical data that were differentially associated with motor, non-motor, and postural stability scores. Second, the predictive accuracy of our model for the training data was assessed, and improved prediction was gained, particularly for non-motor scores (RMSE and AB: 2.89 and 2.20), compared to univariate analysis (RMSE and AB: 3.04 and 2.35). Third, individual-level predictions of longitudinal trajectories for the testing data were performed, with ~80% of observed values falling within the 95% credible intervals. Multivariate general mixed models hold promise to predict clinical progression of individual outcomes in PD. The data were obtained from Dr. Xuemei Huang's NIH grant R01 NS060722, part of the NINDS PD Biomarker Program (PDBP). All data were entered within 24 h of collection into the Data Management Repository (DMR), which is publicly available (https://pdbp.ninds.nih.gov/data-management).
Response Surface Modeling Using Multivariate Orthogonal Functions
NASA Technical Reports Server (NTRS)
Morelli, Eugene A.; DeLoach, Richard
2001-01-01
A nonlinear modeling technique was used to characterize response surfaces for non-dimensional longitudinal aerodynamic force and moment coefficients, based on wind tunnel data from a commercial jet transport model. Data were collected using two experimental procedures: one based on modern design of experiments (MDOE), and one using a classical one factor at a time (OFAT) approach. The nonlinear modeling technique used multivariate orthogonal functions generated from the independent variable data as modeling functions in a least squares context to characterize the response surfaces. Model terms were selected automatically using a prediction error metric. Prediction error bounds computed from the modeling data alone were found to be a good measure of actual prediction error for prediction points within the inference space. Root-mean-square model fit error and prediction error were less than 4 percent of the mean response value in all cases. Efficacy and prediction performance of the response surface models identified from both MDOE and OFAT experiments were investigated.
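The idea of generating orthogonal modeling functions from the data and fitting them by least squares can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain polynomial terms as candidate functions, orthogonalizes them with Gram-Schmidt over the sampled points, and uses hypothetical function names and data.

```python
def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))

def orthogonalize(columns):
    """Gram-Schmidt: make each candidate regressor orthogonal to the earlier ones."""
    ortho = []
    for col in columns:
        p = list(col)
        for q in ortho:
            g = dot(q, col) / dot(q, q)
            p = [pi - g * qi for pi, qi in zip(p, q)]
        ortho.append(p)
    return ortho

def fit_orthogonal(ortho, y):
    """With orthogonal functions, each least-squares coefficient decouples."""
    return [dot(p, y) / dot(p, p) for p in ortho]

def predict(ortho, coeffs):
    """Evaluate the fitted model at the sampled points."""
    n = len(ortho[0])
    return [sum(a * p[i] for a, p in zip(coeffs, ortho)) for i in range(n)]

# Hypothetical data: y = 1 + 2x + 3x^2 sampled at five points.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [1 + 2 * xi + 3 * xi ** 2 for xi in x]
candidates = [[1.0] * len(x), x, [xi ** 2 for xi in x]]

ortho = orthogonalize(candidates)
coeffs = fit_orthogonal(ortho, y)
fitted = predict(ortho, coeffs)
max_error = max(abs(f - yi) for f, yi in zip(fitted, y))
```

Because the coefficients decouple, the contribution of each candidate term to the fit can be judged on its own, which is what makes automatic term selection by a prediction error metric convenient.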
Cantiello, Francesco; Russo, Giorgio Ivan; Cicione, Antonio; Ferro, Matteo; Cimino, Sebastiano; Favilla, Vincenzo; Perdonà, Sisto; De Cobelli, Ottavio; Magno, Carlo; Morgia, Giuseppe; Damiano, Rocco
2016-04-01
To assess the performance of prostate health index (PHI) and prostate cancer antigen 3 (PCA3) when added to the PRIAS or Epstein criteria in predicting the presence of pathologically insignificant prostate cancer (IPCa) in patients who underwent radical prostatectomy (RP) but were eligible for active surveillance (AS). An observational retrospective study was performed in 188 PCa patients treated with laparoscopic or robot-assisted RP but eligible for AS according to Epstein or PRIAS criteria. Blood and urinary specimens were collected before initial prostate biopsy for PHI and PCA3 measurements. Multivariate logistic regression analyses and decision curve analysis (DCA) were carried out to identify predictors of IPCa using the updated ERSPC definition. In the multivariate analyses, the inclusion of both PCA3 and PHI significantly increased the accuracy of the Epstein multivariate model in predicting IPCa, with an increase of 17% (AUC = 0.77) and of 32% (AUC = 0.92), respectively. The inclusion of both PCA3 and PHI also increased the predictive accuracy of the PRIAS multivariate model, with an increase of 29% (AUC = 0.87) and of 39% (AUC = 0.97), respectively. DCA revealed that the multivariable models with the addition of PHI or PCA3 showed a greater net benefit and performed better than the reference models. In a direct comparison, PHI outperformed PCA3, resulting in higher net benefit. In the same cohort of patients eligible for AS, the addition of PHI and PCA3 to Epstein or PRIAS models improved their prognostic performance. PHI resulted in greater net benefit in predicting IPCa compared to PCA3.
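For readers unfamiliar with decision curve analysis, the net benefit it reports can be computed as in the sketch below. This uses the standard definition (true positives minus false positives weighted by the odds of the threshold probability, both per patient); the outcomes and predicted risks are made-up illustrative values, not data from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return tp / n - (fp / n) * weight

# Hypothetical outcomes (1 = event present) and model-predicted risks.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.1, 0.4, 0.3, 0.2]

nb_model = net_benefit(y_true, y_prob, 0.5)
# Reference strategy: classify every patient as positive.
nb_treat_all = net_benefit(y_true, [1.0] * len(y_true), 0.5)
```

A decision curve is obtained by repeating this calculation over a range of thresholds and plotting the model against the "treat all" and "treat none" (net benefit 0) strategies.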
Load compensation in a lean burn natural gas vehicle
NASA Astrophysics Data System (ADS)
Gangopadhyay, Anupam
A new multivariable PI tuning technique, intended primarily for regulation purposes, is developed in this research. Design guidelines are developed based on closed-loop stability. The new multivariable design is applied in a natural gas vehicle to combine idle and A/F ratio control loops. This results in better recovery during low idle operation of a vehicle under external step torques. A powertrain model of a natural gas engine is developed and validated for steady-state and transient operation. The nonlinear model has three states: engine speed, intake manifold pressure and fuel fraction in the intake manifold. The model includes the effect of fuel partial pressure in the intake manifold filling and emptying dynamics. Due to the inclusion of fuel fraction as a state, fuel flow rate into the cylinders is also accurately modeled. A linear system identification is performed on the nonlinear model. The linear model structure is predicted analytically from the nonlinear model and the coefficients of the predicted transfer function are shown to be functions of key physical parameters in the plant. Simulations of linear system and model parameter identification are shown to converge to the predicted values of the model coefficients. The multivariable controller developed in this research can be designed algebraically once the plant model is known. It is thus possible to implement the multivariable PI design in an adaptive fashion, combining the controller with an identified plant model on-line. This will result in a self-tuning regulator (STR) type controller in which the underlying design criterion is the multivariable tuning technique developed in this research.
An effective drift correction for dynamical downscaling of decadal global climate predictions
NASA Astrophysics Data System (ADS)
Paeth, Heiko; Li, Jingmin; Pollinger, Felix; Müller, Wolfgang A.; Pohlmann, Holger; Feldmann, Hendrik; Panitz, Hans-Jürgen
2018-04-01
Initialized decadal climate predictions with coupled climate models are often marked by substantial climate drifts that emanate from a mismatch between the climatology of the coupled model system and the data set used for initialization. While such drifts may be easily removed from the prediction system when analyzing individual variables, a major problem prevails for multivariate issues and, especially, when the output of the global prediction system shall be used for dynamical downscaling. In this study, we present a statistical approach to remove climate drifts in a multivariate context and demonstrate the effect of this drift correction on regional climate model simulations over the Euro-Atlantic sector. The statistical approach is based on an empirical orthogonal function (EOF) analysis adapted to a very large data matrix. The climate drift emerges as a dramatic cooling trend in North Atlantic sea surface temperatures (SSTs) and is captured by the leading EOF of the multivariate output from the global prediction system, accounting for 7.7% of total variability. The SST cooling pattern also imposes drifts in various atmospheric variables and levels. The removal of the first EOF effectuates the drift correction while retaining other components of intra-annual, inter-annual and decadal variability. In the regional climate model, the multivariate drift correction of the input data removes the cooling trends in most western European land regions and systematically reduces the discrepancy between the output of the regional climate model and observational data. In contrast, removing the drift only in the SST field from the global model has hardly any positive effect on the regional climate model.
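The principle of removing a drift by subtracting the leading EOF can be sketched on a toy data set. This is only an illustration under strong assumptions: a tiny two-variable anomaly matrix, power iteration instead of a decomposition suited to very large matrices, and hypothetical names and numbers throughout.

```python
import math

def leading_eof(anomalies, iterations=100):
    """Power iteration for the leading eigenvector of X^T X (the first EOF)."""
    n_var = len(anomalies[0])
    cov = [[sum(row[i] * row[j] for row in anomalies) for j in range(n_var)]
           for i in range(n_var)]
    v = [1.0] * n_var
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(n_var)) for i in range(n_var)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(n_var)) for i in range(n_var)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))
    total = sum(cov[i][i] for i in range(n_var))
    return v, eigenvalue / total  # pattern and fraction of total variability

def remove_mode(anomalies, eof):
    """Subtract each time step's projection onto the given EOF."""
    cleaned = []
    for row in anomalies:
        pc = sum(r * e for r, e in zip(row, eof))  # principal component value
        cleaned.append([r - pc * e for r, e in zip(row, eof)])
    return cleaned

# Hypothetical anomalies: a shared drift along (1, 2) plus a small signal along (2, -1).
drift = [-2.0, -1.0, 0.0, 1.0, 2.0]
signal = [0.1, -0.1, 0.0, -0.1, 0.1]
anomalies = [[d * 1.0 + s * 2.0, d * 2.0 - s * 1.0] for d, s in zip(drift, signal)]

eof, explained = leading_eof(anomalies)
cleaned = remove_mode(anomalies, eof)
```

In this toy case the drift dominates the variance, so the first EOF captures it almost entirely; in the study the drift mode accounts for only 7.7% of total variability, so whether the leading mode really represents the drift has to be checked rather than assumed.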
Classical least squares multivariate spectral analysis
Haaland, David M.
2002-01-01
An improved classical least squares (CLS) multivariate spectral analysis method adds spectral shapes describing non-calibrated components and system effects (other than baseline corrections) present in the analyzed mixture to the prediction phase of the method. These improvements decrease or eliminate many of the restrictions of CLS-type methods and greatly extend their capabilities, accuracy, and precision. One new application of prediction-augmented CLS (PACLS) is the ability to accurately predict unknown sample concentrations when new unmodeled spectral components are present in the unknown samples. Other applications of PACLS include the incorporation of spectrometer drift into the quantitative multivariate model and the maintenance of a calibration on a drifting spectrometer. Finally, the ability of PACLS to transfer a multivariate model between spectrometers is demonstrated.
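In standard CLS notation, the augmentation of the prediction phase can be written roughly as follows. The symbols are assumptions introduced here for illustration (they do not appear in the abstract): A holds the calibration spectra, C the reference concentrations, K the pure-component spectra, and P the added spectral shapes.

```latex
% Calibration: spectra A (n x p), concentrations C (n x m), pure-component spectra K (m x p)
\[
A = C K + E_A, \qquad \hat{K} = (C^{\top} C)^{-1} C^{\top} A
\]
% Prediction-phase augmentation: append r spectral shapes P (r x p) that were not calibrated
\[
\tilde{K} = \begin{bmatrix} \hat{K} \\ P \end{bmatrix}
\]
% Prediction for an unknown spectrum a (1 x p); the first m entries estimate the concentrations
\[
\hat{c}_{\mathrm{aug}} = a \, \tilde{K}^{\top} \bigl( \tilde{K} \tilde{K}^{\top} \bigr)^{-1}
\]
```

The remaining r entries of the augmented estimate are the fitted amplitudes of the added shapes, which absorb variation that would otherwise bias the concentration estimates.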
NASA Astrophysics Data System (ADS)
Wahid, A.; Putra, I. G. E. P.
2018-03-01
Dimethyl ether (DME), as an alternative clean energy source, has attracted growing attention in recent years. DME production via reactive distillation has potential for savings in capital cost and energy requirements. However, combining reaction and distillation in a single column makes the reactive distillation process a very complex multivariable system, with highly nonlinear behavior and strong interaction between process variables. This study investigates multivariable model predictive control (MPC) based on a two-point temperature control strategy for the DME reactive distillation column to maintain the purities of both product streams. The process model is estimated as a first-order-plus-dead-time model. The DME and water purities are maintained by controlling a stage temperature in the rectifying and stripping sections, respectively. The results show that the model predictive controller responds faster than a conventional PI controller, as indicated by its smaller integral of squared error (ISE) values. In addition, the MPC controller handles the loop interactions well.
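Two ingredients mentioned above, the first-order-plus-dead-time (FOPDT) step response and the ISE performance index, are simple to compute. The sketch below is illustrative only: the gain, time constants, and dead time are hypothetical, and the FOPDT form is reused purely as a stand-in for a fast and a slow closed-loop trajectory, not as the controllers from the study.

```python
import math

def fopdt_step(t, gain, tau, dead_time):
    """Unit-step response of a first-order-plus-dead-time model at time t."""
    if t < dead_time:
        return 0.0
    return gain * (1.0 - math.exp(-(t - dead_time) / tau))

def ise(errors, dt):
    """Integral of squared error, trapezoidal rule on a uniform time grid."""
    sq = [e * e for e in errors]
    return dt * (sum(sq) - 0.5 * (sq[0] + sq[-1]))

dt = 0.01
times = [i * dt for i in range(2001)]  # 0 to 20 time units
setpoint = 1.0

# Hypothetical trajectories: same gain and dead time, different time constants.
fast = [fopdt_step(t, 1.0, 1.0, 0.5) for t in times]
slow = [fopdt_step(t, 1.0, 3.0, 0.5) for t in times]

ise_fast = ise([setpoint - y for y in fast], dt)
ise_slow = ise([setpoint - y for y in slow], dt)
```

For these trajectories the ISE is approximately the dead time plus half the time constant, so the faster response yields the smaller value, which is the sense in which a smaller ISE indicates better tracking.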
NASA Astrophysics Data System (ADS)
Niedzielski, Tomasz; Kosek, Wiesław
2008-02-01
This article presents the application of a multivariate prediction technique for predicting universal time (UT1-UTC), length of day (LOD) and the axial component of atmospheric angular momentum (AAM χ3). The multivariate predictions of LOD and UT1-UTC are generated by means of the combination of (1) least-squares (LS) extrapolation of models for annual, semiannual, 18.6-year, 9.3-year oscillations and for the linear trend, and (2) multivariate autoregressive (MAR) stochastic prediction of LS residuals (LS + MAR). The MAR technique enables the use of the AAM χ3 time-series as the explanatory variable for the computation of LOD or UT1-UTC predictions. In order to evaluate the performance of this approach, two other prediction schemes are also applied: (1) LS extrapolation, (2) combination of LS extrapolation and univariate autoregressive (AR) prediction of LS residuals (LS + AR). The multivariate predictions of AAM χ3 data, however, are computed as a combination of the extrapolation of the LS model for annual and semiannual oscillations and the LS + MAR. The AAM χ3 predictions are also compared with LS extrapolation and LS + AR prediction. It is shown that the predictions of LOD and UT1-UTC based on LS + MAR taking into account the axial component of AAM are more accurate than the predictions of LOD and UT1-UTC based on LS extrapolation or on LS + AR. In particular, the UT1-UTC predictions based on LS + MAR during El Niño/La Niña events exhibit considerably smaller prediction errors than those calculated by means of LS or LS + AR. The AAM χ3 time-series is predicted using LS + MAR with higher accuracy than applying LS extrapolation itself in the case of medium-term predictions (up to 100 days in the future). However, the predictions of AAM χ3 reveal the best accuracy for LS + AR.
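The two-stage structure of these schemes (deterministic LS model, then autoregressive prediction of its residuals) can be sketched in a heavily simplified form. The sketch assumes a univariate LS + AR scheme with only a linear trend and an AR(1) residual model; the harmonic terms and the multivariate (MAR) extension used in the article are omitted, and all names are hypothetical.

```python
def fit_trend(y):
    """Closed-form least-squares line y ~ a + b*t for t = 0..n-1."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in enumerate(y))
    b = sxy / sxx
    return y_mean - b * t_mean, b

def fit_ar1(residuals):
    """Least-squares AR(1) coefficient for r[t] ~ phi * r[t-1]."""
    num = sum(r1 * r0 for r0, r1 in zip(residuals[:-1], residuals[1:]))
    den = sum(r0 * r0 for r0 in residuals[:-1])
    return num / den if den else 0.0

def predict_ls_ar(y, horizon):
    """Extrapolate the LS trend and add the AR(1) forecast of the residual."""
    a, b = fit_trend(y)
    n = len(y)
    residuals = [yi - (a + b * t) for t, yi in enumerate(y)]
    phi = fit_ar1(residuals)
    r = residuals[-1]
    forecasts = []
    for h in range(1, horizon + 1):
        r = phi * r  # residual forecast decays geometrically with lead time
        forecasts.append(a + b * (n - 1 + h) + r)
    return forecasts

# Hypothetical series with a trend and correlated departures from it.
series = [1.0, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1]
forecast = predict_ls_ar(series, 3)
```

A multivariate version would replace the scalar coefficient with coefficient matrices acting on a vector of residual series, so that one series can inform the prediction of another.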
Opportunities of probabilistic flood loss models
NASA Astrophysics Data System (ADS)
Schröter, Kai; Kreibich, Heidi; Lüdtke, Stefan; Vogel, Kristin; Merz, Bruno
2016-04-01
Oftentimes, traditional uni-variate damage models, such as depth-damage curves, fail to reproduce the variability of observed flood damage. However, reliable flood damage models are a prerequisite for the practical usefulness of the model results. Innovative multi-variate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks, with that of traditional stage damage functions. For model evaluation we use empirical damage data from computer-aided telephone interviews that were compiled after the floods in 2002, 2005, 2006 and 2013 in the Elbe and Danube catchments in Germany. We carry out a split sample test by sub-setting the damage records. One sub-set is used to derive the models and the remaining records are used to evaluate the predictive performance of the model. Further, we stratify the sample according to catchments, which allows studying model performance in a spatial transfer context. Flood damage estimation is carried out on the scale of the individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error), and the sharpness and reliability of the predictions; reliability is represented by the proportion of observations that fall within the predictive interval between the 5% and 95% quantiles. The comparison of the uni-variable stage damage function and the multi-variable model approach emphasises the importance of quantifying predictive uncertainty. With each explanatory variable, the multi-variable model reveals an additional source of uncertainty. However, the predictive performance in terms of systematic deviation (MBE), precision (MAE) and reliability (HR) is clearly improved in comparison to the uni-variable stage damage function. Overall, probabilistic models provide quantitative information about prediction uncertainty, which is crucial to assess the reliability of model predictions and improves the usefulness of model results.
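The three evaluation measures can be computed as in the following sketch. It assumes that HR denotes a hit rate, that is, the share of observations inside the 5% to 95% predictive interval; the abstract does not expand the abbreviation, and the relative damage values are made up for illustration.

```python
def mean_bias_error(pred, obs):
    """Average signed deviation of predictions from observations."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)

def mean_absolute_error(pred, obs):
    """Average absolute deviation of predictions from observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def hit_rate(lower, upper, obs):
    """Fraction of observations inside their predictive interval."""
    hits = sum(1 for lo, hi, o in zip(lower, upper, obs) if lo <= o <= hi)
    return hits / len(obs)

# Hypothetical relative building damage (0..1) for four buildings.
observed = [0.10, 0.25, 0.40, 0.05]
predicted = [0.12, 0.20, 0.45, 0.05]  # point predictions
q05 = [0.05, 0.10, 0.30, 0.06]        # 5% predictive quantiles
q95 = [0.20, 0.30, 0.60, 0.15]        # 95% predictive quantiles

mbe = mean_bias_error(predicted, observed)
mae = mean_absolute_error(predicted, observed)
hr = hit_rate(q05, q95, observed)
```

For a well-calibrated model, the hit rate of a 5% to 95% interval should be close to 0.9; values far below that indicate overconfident predictive intervals.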
2011-01-01
Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook’s distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards. PMID:21966586
Keithley, Richard B; Wightman, R Mark
2011-06-07
Ecological prediction with nonlinear multivariate time-frequency functional data models
Yang, Wen-Hsi; Wikle, Christopher K.; Holan, Scott H.; Wildhaber, Mark L.
2013-01-01
Time-frequency analysis has become a fundamental component of many scientific inquiries. Due to improvements in technology, the amount of high-frequency signals that are collected for ecological and other scientific processes is increasing at a dramatic rate. In order to facilitate the use of these data in ecological prediction, we introduce a class of nonlinear multivariate time-frequency functional models that can identify important features of each signal as well as the interaction of signals corresponding to the response variable of interest. Our methodology is of independent interest and utilizes stochastic search variable selection to improve model selection and performs model averaging to enhance prediction. We illustrate the effectiveness of our approach through simulation and by application to predicting spawning success of shovelnose sturgeon in the Lower Missouri River.
IRT-ZIP Modeling for Multivariate Zero-Inflated Count Data
ERIC Educational Resources Information Center
Wang, Lijuan
2010-01-01
This study introduces an item response theory-zero-inflated Poisson (IRT-ZIP) model to investigate psychometric properties of multiple items and predict individuals' latent trait scores for multivariate zero-inflated count data. In the model, two link functions are used to capture two processes of the zero-inflated count data. Item parameters are…
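The zero-inflated Poisson part of such a model mixes a point mass at zero with an ordinary Poisson count. The sketch below shows that probability function and one plausible way a latent trait could enter through two link functions (a logit link for the zero-inflation probability and a log link for the Poisson mean). The links and the item parameter names are assumptions for illustration, since the truncated abstract does not specify them.

```python
import math

def zip_pmf(k, pi_zero, lam):
    """P(Y = k) under a zero-inflated Poisson with extra-zero probability pi_zero."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson

def item_parameters(theta, a_zero, b_zero, a_count, b_count):
    """Map a latent trait theta to (pi_zero, lam) via hypothetical logit and log links."""
    pi_zero = 1.0 / (1.0 + math.exp(-(a_zero + b_zero * theta)))
    lam = math.exp(a_count + b_count * theta)
    return pi_zero, lam

# Hypothetical item: higher theta lowers the chance of a structural zero
# and raises the expected count.
pi_zero, lam = item_parameters(theta=0.0, a_zero=-1.0, b_zero=-0.8,
                               a_count=0.5, b_count=0.6)
p_zero = zip_pmf(0, pi_zero, lam)
total = sum(zip_pmf(k, pi_zero, lam) for k in range(60))
```

The two processes are visible in the zero case: a zero can arise either from the inflation component or from the Poisson component, so its probability is the sum of both contributions.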
Willis, Michael; Asseburg, Christian; Nilsson, Andreas; Johnsson, Kristina; Kartman, Bernt
2017-03-01
Type 2 diabetes mellitus (T2DM) is chronic and progressive and the cost-effectiveness of new treatment interventions must be established over long time horizons. Given the limited durability of drugs, assumptions regarding downstream rescue medication can drive results. Especially for insulin, for which treatment effects and adverse events are known to depend on patient characteristics, this can be problematic for health economic evaluation involving modeling. To estimate parsimonious multivariate equations of treatment effects and hypoglycemic event risks for use in parameterizing insulin rescue therapy in model-based cost-effectiveness analysis. Clinical evidence for insulin use in T2DM was identified in PubMed and from published reviews and meta-analyses. Study and patient characteristics, treatment effects, and adverse event rates were extracted, and the data were used to estimate parsimonious treatment effect and hypoglycemic event risk equations using multivariate regression analysis. Data from 91 studies featuring 171 usable study arms were identified, mostly for premix and basal insulin types. Multivariate prediction equations for glycated hemoglobin A1c lowering and weight change were estimated separately for insulin-naive and insulin-experienced patients. Goodness of fit (R²) for both outcomes was generally good, ranging from 0.44 to 0.84. Multivariate prediction equations for symptomatic, nocturnal, and severe hypoglycemic events were also estimated, though considerable heterogeneity in definitions limits their usefulness. Parsimonious and robust multivariate prediction equations were estimated for glycated hemoglobin A1c and weight change, separately for insulin-naive and insulin-experienced patients. Using these in economic simulation modeling in T2DM can improve realism and flexibility in modeling insulin rescue medication. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M
2015-06-01
Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). 
Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
Augmented classical least squares multivariate spectral analysis
Haaland, David M.; Melgaard, David K.
2004-02-03
A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Augmented Classical Least Squares Multivariate Spectral Analysis
Haaland, David M.; Melgaard, David K.
2005-07-26
A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Augmented Classical Least Squares Multivariate Spectral Analysis
Haaland, David M.; Melgaard, David K.
2005-01-11
A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
ERIC Educational Resources Information Center
Siman-Tov, Ayelet; Kaniel, Shlomo
2011-01-01
The research validates a multivariate model that predicts parental adjustment to coping successfully with an autistic child. The model comprises four elements: parental stress, parental resources, parental adjustment and the child's autism symptoms. A total of 176 parents of children aged 6 to 16 diagnosed with PDD answered several questionnaires…
A Multivariate Model of Parent-Adolescent Relationship Variables in Early Adolescence
ERIC Educational Resources Information Center
McKinney, Cliff; Renk, Kimberly
2011-01-01
Given the importance of predicting outcomes for early adolescents, this study examines a multivariate model of parent-adolescent relationship variables, including parenting, family environment, and conflict. Participants, who completed measures assessing these variables, included 710 culturally diverse 11-14-year-olds who were attending a middle…
Ferreira, Ana Paula A; Póvoa, Luciana C; Zanier, José F C; Ferreira, Arthur S
2017-02-01
The aim of this study was to develop and validate a multivariate prediction model, guided by palpation and personal information, for locating the seventh cervical spinous process (C7SP). A single-blinded, cross-sectional study at a primary to tertiary health care center was conducted for model development and temporal validation. One hundred sixty participants were prospectively included for model development (n = 80) and time-split validation stages (n = 80). The C7SP was located using the thorax-rib static method (TRSM). Participants underwent chest radiography for assessment of the inner body structure located with TRSM and using radio-opaque markers placed over the skin. Age, sex, height, body mass, body mass index, and vertex-marker distance (D_V-M) were used to predict the distance from the C7SP to the vertex (D_V-C7). Multivariate linear regression modeling, limits of agreement plot, histogram of residuals, receiver operating characteristic curves, and confusion tables were analyzed. The multivariate linear prediction model for D_V-C7 (in centimeters) was D_V-C7 = 0.986(D_V-M) + 0.018(mass) + 0.014(age) - 1.008. Receiver operating characteristic curves had better discrimination of D_V-C7 (area under the curve = 0.661; 95% confidence interval = 0.541-0.782; P = .015) than D_V-M (area under the curve = 0.480; 95% confidence interval = 0.345-0.614; P = .761), with respective cutoff points at 23.40 cm (sensitivity = 41%, specificity = 63%) and 24.75 cm (sensitivity = 69%, specificity = 52%). The C7SP was correctly located more often when using predicted D_V-C7 in the validation sample than when using the TRSM in the development sample: n = 53 (66%) vs n = 32 (40%), P < .001. Better accuracy was obtained when locating the C7SP by use of a multivariate model that incorporates palpation and personal information. Copyright © 2016. Published by Elsevier Inc.
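The regression equation reported in this abstract can be evaluated directly. The following is a minimal sketch: the function name and the example inputs are illustrative assumptions, and the units for mass and age (kilograms and years) are assumed rather than stated in the abstract.

```python
def predict_vertex_to_c7_cm(vertex_marker_cm, mass, age):
    """Vertex-to-C7SP distance (cm) from the linear model quoted in the abstract.

    Units of mass (kg) and age (years) are assumptions; the abstract does not state them.
    """
    return 0.986 * vertex_marker_cm + 0.018 * mass + 0.014 * age - 1.008


# Hypothetical participant, not taken from the study data.
estimate = predict_vertex_to_c7_cm(24.0, 70.0, 40.0)
print(round(estimate, 3))
```

The sketch only reproduces the published coefficients; it says nothing about the model's validity outside the sample described in the abstract.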
Boersen, Nathan; Carvajal, M Teresa; Morris, Kenneth R; Peck, Garnet E; Pinal, Rodolfo
2015-01-01
While previous research has demonstrated that roller compaction operating parameters strongly influence the properties of the final product, a greater emphasis might be placed on the raw material attributes of the formulation. There were two main objectives to this study. The first was to assess the effects of different process variables on the properties of the obtained ribbons and downstream granules produced from the roller-compacted ribbons. The second was to establish whether models obtained with formulations of one active pharmaceutical ingredient (API) could predict the properties of formulations that are similar in terms of the excipients used, but contain a different API. Tolmetin and acetaminophen, chosen for their different compaction properties, were roller compacted on a Fitzpatrick roller compactor using the same formulation. Models were created using tolmetin and tested using acetaminophen. The physical properties of the blends, ribbon, granule and tablet were characterized. Multivariate analysis using partial least squares was used to analyze all data. Multivariate models showed that the operating parameters and raw material attributes were essential in the prediction of ribbon porosity and post-milled particle size. The post-compaction ribbon and granule attributes also significantly contributed to the prediction of the tablet tensile strength. Models derived using tolmetin could reasonably predict the ribbon porosity of a second API. After further processing, the post-milled ribbon and granule properties, rather than the physical attributes of the formulation, were needed to predict downstream tablet properties. An understanding of the percolation threshold of the formulation significantly improved the predictive ability of the models.
Liu, Zitao; Hauskrecht, Milos
2017-11-01
Building an accurate predictive model of clinical time series for a patient is critical for understanding the patient's condition, its dynamics, and optimal patient management. Unfortunately, this process is not straightforward. First, patient-specific variations are typically large, and population-based models derived or learned from many different patients are often unable to support accurate predictions for each individual patient. Moreover, time series observed for one patient at any point in time may be too short and insufficient to learn a high-quality patient-specific model just from the patient's own data. To address these problems we propose, develop and experiment with a new adaptive forecasting framework for building multivariate clinical time series models for a patient and for supporting patient-specific predictions. The framework relies on an adaptive model switching approach that at any point in time selects the most promising time series model out of a pool of many possible models, and consequently combines the advantages of the population, patient-specific and short-term individualized predictive models. We demonstrate that the adaptive model switching framework is a very promising approach to support personalized time series prediction, and that it is able to outperform predictions based on pure population and patient-specific models, as well as other patient-specific model adaptation strategies.
Sharif, K M; Rahman, M M; Azmir, J; Khatib, A; Sabina, E; Shamsudin, S H; Zaidul, I S M
2015-12-01
Multivariate analysis of thin-layer chromatography (TLC) images was modeled to predict antioxidant activity of Pereskia bleo leaves and to identify the contributing compounds of the activity. TLC was developed in optimized mobile phase using the 'PRISMA' optimization method and the image was then converted to wavelet signals and imported for multivariate analysis. An orthogonal partial least square (OPLS) model was developed consisting of a wavelet-converted TLC image and 2,2-diphenyl-1-picrylhydrazyl free radical scavenging activity of 24 different preparations of P. bleo as the x- and y-variables, respectively. The quality of the constructed OPLS model (1 + 1 + 0) with one predictive and one orthogonal component was evaluated by internal and external validity tests. The validated model was then used to identify the contributing spot from the TLC plate that was then analyzed by GC-MS after trimethylsilyl derivatization. Glycerol and amine compounds were mainly found to contribute to the antioxidant activity of the sample. An alternative method to predict the antioxidant activity of a new sample of P. bleo leaves has been developed. Copyright © 2015 John Wiley & Sons, Ltd.
Corron, Louise; Marchal, François; Condemi, Silvana; Chaumoître, Kathia; Adalian, Pascal
2017-01-01
Juvenile age estimation methods used in forensic anthropology generally lack methodological consistency and/or statistical validity. Considering this, a standard approach using nonparametric Multivariate Adaptive Regression Splines (MARS) models was tested to predict age from iliac biometric variables of male and female juveniles from Marseilles, France, aged 0-12 years. Models using unidimensional (length and width) and bidimensional iliac data (module and surface) were constructed on a training sample of 176 individuals and validated on an independent test sample of 68 individuals. Results show that MARS prediction models using iliac width, module and area give overall better and statistically valid age estimates. These models integrate punctual nonlinearities of the relationship between age and osteometric variables. By constructing valid prediction intervals whose size increases with age, MARS models take into account the normal increase of individual variability. MARS models can qualify as a practical and standardized approach for juvenile age estimation. © 2016 American Academy of Forensic Sciences.
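MARS models capture localized nonlinearities by summing hinge functions of the form max(0, x - knot) and max(0, knot - x). The following is a minimal sketch of that building block only; the knot, the coefficients and the function names are hypothetical and are not taken from the study, and a real MARS fit would select knots and terms from data.

```python
def hinge(value, knot):
    """Hinge basis function: zero below the knot, linear above it."""
    return max(0.0, value - knot)


def mars_like_prediction(x):
    """Piecewise-linear response built from a mirrored pair of hinges.

    The knot (40.0) and the coefficients are hypothetical, for illustration only.
    """
    knot = 40.0
    return 0.5 + 0.10 * hinge(x, knot) - 0.02 * hinge(knot, x)


# The slope changes at the knot: 0.02 per unit below it, 0.10 per unit above it.
for x in (30.0, 40.0, 50.0):
    print(x, mars_like_prediction(x))
```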
Prediction of mortality rates using a model with stochastic parameters
NASA Astrophysics Data System (ADS)
Tan, Chon Sern; Pooi, Ah Hin
2016-10-01
Prediction of future mortality rates is crucial to insurance companies because they face longevity risks while providing retirement benefits to a population whose life expectancy is increasing. In the past literature, a time series model based on multivariate power-normal distribution has been applied on mortality data from the United States for the years 1933 till 2000 to forecast the future mortality rates for the years 2001 till 2010. In this paper, a more dynamic approach based on the multivariate time series will be proposed where the model uses stochastic parameters that vary with time. The resulting prediction intervals obtained using the model with stochastic parameters perform better because apart from having good ability in covering the observed future mortality rates, they also tend to have distinctly shorter interval lengths.
NASA Technical Reports Server (NTRS)
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
Modeling a multivariable reactor and on-line model predictive control.
Yu, D W; Yu, D L
2005-10-01
A nonlinear first-principles model is developed in this paper for a laboratory-scale multivariable chemical reactor rig, and on-line model predictive control (MPC) is implemented on the rig. The reactor has three variables (temperature, pH, and dissolved oxygen) with nonlinear dynamics and is therefore used as a pilot system for the biochemical industry. A nonlinear discrete-time model is derived for each of the three output variables and their model parameters are estimated from the real data using an adaptive optimization method. The developed model is used in a nonlinear MPC scheme. An accurate multistep-ahead prediction is obtained for MPC, where the extended Kalman filter is used to estimate the unknown system states. The on-line control is implemented and a satisfactory tracking performance is achieved. The MPC is compared with three decentralized PID controllers and the advantage of the nonlinear MPC over the PID is clearly shown.
Dorota, Myszkowska
2013-03-01
The aim of the study was to construct a model forecasting the birch pollen season characteristics in Cracow on the basis of an 18-year data series. The study was performed using the volumetric method (Lanzoni/Burkard trap). The 98/95 % method was used to calculate the pollen season. Spearman's correlation test was applied to find the relationship between the meteorological parameters and pollen season characteristics. To construct the predictive model, backward stepwise multiple regression analysis was used, taking into account the multi-collinearity of variables. The predictive models best fitted the pollen season start and end, especially models containing two independent variables. The peak concentration value was predicted with a higher prediction error. Also, the accuracy of the models predicting the pollen season characteristics was higher for 2009 than for 2010. Both the multi-variable model and the one-variable model for the beginning of the pollen season included air temperature during the last 10 days of February, while the multi-variable model also included humidity at the beginning of April. The models forecasting the end of the pollen season were based on temperature in March-April, while the peak day was predicted using the temperature during the last 10 days of March.
Wang, Li; Wang, Xiaoyi; Jin, Xuebo; Xu, Jiping; Zhang, Huiyan; Yu, Jiabin; Sun, Qian; Gao, Chong; Wang, Lingbin
2017-03-01
Current methods describe the formation process of algae inaccurately and predict water blooms with low precision. In this paper, the chemical mechanism of algae growth is analyzed, and a correlation analysis of chlorophyll-a and algal density is conducted by chemical measurement. Taking into account the influence of multiple factors on algae growth and water blooms, a comprehensive prediction method that combines multivariate time series with an intelligent model is put forward. Firstly, the main factors that affect the reproduction of the algae are analyzed through the process of photosynthesis. A compensation prediction method for multivariate time series analysis, based on a neural network and a Support Vector Machine, is put forward and combined with Kernel Principal Component Analysis to reduce the dimension of the factors influencing blooms. Then, a Genetic Algorithm is applied to improve the generalization ability of the BP network and the Least Squares Support Vector Machine. Experimental results show that this method better compensates the prediction model of multivariate time series analysis, which is an effective way to improve the description accuracy of algae growth and the prediction precision of water blooms.
Multivariate regression model for partitioning tree volume of white oak into round-product classes
Daniel A. Yaussy; David L. Sonderman
1984-01-01
Describes the development of multivariate equations that predict the expected cubic volume of four round-product classes from independent variables composed of individual tree-quality characteristics. Although the model has limited application at this time, it does demonstrate the feasibility of partitioning total tree cubic volume into round-product classes based on...
Oviedo de la Fuente, Manuel; Febrero-Bande, Manuel; Muñoz, María Pilar; Domínguez, Àngela
2018-01-01
This paper proposes a novel approach that uses meteorological information to predict the incidence of influenza in Galicia (Spain). It extends the Generalized Least Squares (GLS) methods in the multivariate framework to functional regression models with dependent errors. These kinds of models are useful when the recent history of the incidence of influenza is not readily available (for instance, because of delays in communication with health informants) and the prediction must be constructed by correcting the temporal dependence of the residuals and using more accessible variables. A simulation study shows that the GLS estimators render better estimations of the parameters associated with the regression model than they do with the classical models. They obtain extremely good results from the predictive point of view and are competitive with the classical time series approach for the incidence of influenza. An iterative version of the GLS estimator (called iGLS) was also proposed that can help to model complicated dependence structures. For constructing the model, the distance correlation measure was employed to select relevant information to predict the influenza rate by mixing multivariate and functional variables. These kinds of models are extremely useful to health managers in allocating resources in advance to manage influenza epidemics.
Multivariate Models of Men's and Women's Partner Aggression
ERIC Educational Resources Information Center
O'Leary, K. Daniel; Smith Slep, Amy M.; O'Leary, Susan G.
2007-01-01
This exploratory study was designed to address how multiple factors drawn from varying focal models and ecological levels of influence might operate relative to each other to predict partner aggression, using data from 453 representatively sampled couples. The resulting cross-validated models predicted approximately 50% of the variance in men's…
Lindberg, Ann-Sofie; Oksa, Juha; Antti, Henrik; Malm, Christer
2015-01-01
Physical capacity has previously been deemed important for firefighters' physical work capacity, and aerobic fitness, muscular strength, and muscular endurance are the most frequently investigated parameters of importance. Traditionally, bivariate and multivariate linear regression statistics have been used to study relationships between physical capacities and work capacities among firefighters. An alternative way to handle datasets consisting of numerous correlated variables is to use multivariate projection analyses, such as Orthogonal Projection to Latent Structures. The first aim of the present study was to evaluate the prediction and predictive power of field and laboratory tests, respectively, on firefighters' physical work capacity on selected work tasks, and to study whether valid predictions could be achieved without anthropometric data. The second aim was to externally validate selected models. The third aim was to validate selected models on firefighters and on civilians. A total of 38 (26 men and 12 women) + 90 (38 men and 52 women) subjects were included in the models and the external validation, respectively. The best prediction (R2) and predictive power (Q2) of Stairs, Pulling, Demolition, Terrain, and Rescue work capacities included field tests (R2 = 0.73 to 0.84, Q2 = 0.68 to 0.82). The best external validation was for Stairs work capacity (R2 = 0.80) and the worst for Demolition work capacity (R2 = 0.40). In conclusion, field and laboratory tests could equally well predict physical work capacities for firefighting work tasks, and models excluding anthropometric data were valid. The predictive power was satisfactory for all included work tasks except Demolition.
Many multivariate methods are used in describing and predicting relation; each has its unique usage of categorical and non-categorical data. In multivariate analysis of variance (MANOVA), many response variables (y's) are related to many independent variables that are categorical...
A time domain frequency-selective multivariate Granger causality approach.
Leistritz, Lutz; Witte, Herbert
2016-08-01
The investigation of effective connectivity is one of the major topics in computational neuroscience to understand the interaction between spatially distributed neuronal units of the brain. Thus, a wide variety of methods has been developed during the last decades to investigate functional and effective connectivity in multivariate systems. Their spectrum ranges from model-based to model-free approaches with a clear separation into time and frequency range methods. We present in this simulation study a novel time domain approach based on Granger's principle of predictability, which allows frequency-selective considerations of directed interactions. It is based on a comparison of prediction errors of multivariate autoregressive models fitted to systematically modified time series. These modifications are based on signal decompositions, which enable a targeted cancellation of specific signal components with specific spectral properties. Depending on the embedded signal decomposition method, a frequency-selective or data-driven signal-adaptive Granger Causality Index may be derived.
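The comparison of prediction errors that underlies Granger's principle can be sketched in a few lines. The example below is an assumption-laden illustration, not the authors' method: it uses first-order autoregressive models without intercept, simulated data, and hypothetical function names, and it omits the signal decomposition step that makes the published approach frequency-selective. The index is the log ratio of the residual sum of squares of a model that ignores the source series to that of a model that includes it.

```python
import math
import random


def rss_restricted(target):
    """RSS of target[t] regressed on target[t-1] only (no intercept)."""
    n = len(target)
    num = sum(target[t] * target[t - 1] for t in range(1, n))
    den = sum(target[t - 1] ** 2 for t in range(1, n))
    a = num / den
    return sum((target[t] - a * target[t - 1]) ** 2 for t in range(1, n))


def rss_full(target, source):
    """RSS of target[t] regressed on target[t-1] and source[t-1] (no intercept)."""
    n = len(target)
    sxx = sum(target[t - 1] ** 2 for t in range(1, n))
    syy = sum(source[t - 1] ** 2 for t in range(1, n))
    sxy = sum(target[t - 1] * source[t - 1] for t in range(1, n))
    bx = sum(target[t] * target[t - 1] for t in range(1, n))
    by = sum(target[t] * source[t - 1] for t in range(1, n))
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 normal equations
    a = (bx * syy - by * sxy) / det
    b = (by * sxx - bx * sxy) / det
    return sum(
        (target[t] - a * target[t - 1] - b * source[t - 1]) ** 2 for t in range(1, n)
    )


def granger_index(target, source):
    """Log ratio of prediction errors; larger values suggest source -> target influence."""
    return math.log(rss_restricted(target) / rss_full(target, source))


# Simulated example in which y drives x, but x does not drive y.
random.seed(0)
x, y = [0.0], [0.0]
for _ in range(500):
    x_prev, y_prev = x[-1], y[-1]
    y.append(0.5 * y_prev + random.gauss(0, 1))
    x.append(0.3 * x_prev + 0.8 * y_prev + random.gauss(0, 1))

print("y -> x:", granger_index(x, y))
print("x -> y:", granger_index(y, x))
```

Because the restricted model is nested in the full one, the index cannot be negative apart from rounding error; in practice a significance test would be needed before interpreting a small positive value.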
Predicting trauma patient mortality: ICD [or ICD-10-AM] versus AIS based approaches.
Willis, Cameron D; Gabbe, Belinda J; Jolley, Damien; Harrison, James E; Cameron, Peter A
2010-11-01
The International Classification of Diseases Injury Severity Score (ICISS) has been proposed as an International Classification of Diseases (ICD)-10-based alternative to mortality prediction tools that use Abbreviated Injury Scale (AIS) data, including the Trauma and Injury Severity Score (TRISS). To date, studies have not examined the performance of ICISS using Australian trauma registry data. This study aimed to compare the performance of ICISS with other mortality prediction tools in an Australian trauma registry. This was a retrospective review of prospectively collected data from the Victorian State Trauma Registry. A training dataset was created for model development and a validation dataset for evaluation. The multiplicative ICISS model was compared with a worst injury ICISS approach, Victorian TRISS (V-TRISS, using local coefficients), maximum AIS severity and a multivariable model including ICD-10-AM codes as predictors. Models were investigated for discrimination (C-statistic) and calibration (Hosmer-Lemeshow statistic). The multivariable approach had the highest level of discrimination (C-statistic 0.90) and calibration (H-L 7.65, P = 0.468). Worst injury ICISS, V-TRISS and maximum AIS had similar performance. The multiplicative ICISS produced the lowest level of discrimination (C-statistic 0.80) and poorest calibration (H-L 50.23, P < 0.001). The performance of ICISS may be affected by the data used to develop estimates, the ICD version employed, the methods for deriving estimates and the inclusion of covariates. In this analysis, a multivariable approach using ICD-10-AM codes was the best-performing method. A multivariable ICISS approach may therefore be a useful alternative to AIS-based methods and may have comparable predictive performance to locally derived TRISS models. © 2010 The Authors. ANZ Journal of Surgery © 2010 Royal Australasian College of Surgeons.
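The difference between the multiplicative and worst-injury ICISS variants compared in this abstract can be shown with a short sketch. It assumes the conventional definition of ICISS as a combination of per-code survival risk ratios (SRRs); the function names and the SRR values are hypothetical and are not drawn from the registry data.

```python
from math import prod


def multiplicative_iciss(srrs):
    """Product of the survival risk ratios of all of a patient's injury codes."""
    return prod(srrs)


def worst_injury_iciss(srrs):
    """Survival risk ratio of the single most lethal injury (the smallest SRR)."""
    return min(srrs)


# Hypothetical SRRs for one patient with three coded injuries.
patient_srrs = [0.95, 0.80, 0.99]
print(multiplicative_iciss(patient_srrs))
print(worst_injury_iciss(patient_srrs))
```

The multiplicative score can never exceed the worst-injury score, since every additional SRR is at most 1 and can only lower the product.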
Early experiences building a software quality prediction model
NASA Technical Reports Server (NTRS)
Agresti, W. W.; Evanco, W. M.; Smith, M. C.
1990-01-01
Early experiences building a software quality prediction model are discussed. The overall research objective is to establish a capability to project a software system's quality from an analysis of its design. The technical approach is to build multivariate models for estimating reliability and maintainability. Data from 21 Ada subsystems were analyzed to test hypotheses about various design structures leading to failure-prone or unmaintainable systems. Current design variables highlight the interconnectivity and visibility of compilation units. Other model variables provide for the effects of reusability and software changes. Reported results are preliminary because additional project data is being obtained and new hypotheses are being developed and tested. Current multivariate regression models are encouraging, explaining 60 to 80 percent of the variation in error density of the subsystems.
NASA Astrophysics Data System (ADS)
Eyarkai Nambi, Vijayaram; Thangavel, Kuladaisamy; Manickavasagan, Annamalai; Shahir, Sultan
2017-01-01
Prediction of ripeness level in climacteric fruits is essential for post-harvest handling. An index capable of predicting ripening level with minimum inputs would be highly beneficial to the handlers, processors and researchers in the fruit industry. A study was conducted with Indian mango cultivars to develop a ripeness index and associated model. Changes in physicochemical, colour and textural properties were measured throughout the ripening period and the period was classified into five stages (unripe, early ripe, partially ripe, ripe and over ripe). Multivariate regression techniques, namely partial least squares regression, principal component regression and multiple linear regression, were compared and evaluated for their prediction performance. A multiple linear regression model with 12 parameters was found more suitable for ripening prediction. A variable reduction method was adopted to simplify the developed model. Better prediction was achieved with either 2 or 3 variables (total soluble solids, colour and acidity). Cross validation was done to increase the robustness, and it was found that the proposed ripening index was more effective in predicting ripening stages. The three-variable model would be suitable for commercial applications where reasonable accuracy is sufficient. However, the 12-variable model can be used to obtain more precise results in research and development applications.
Aboagye-Sarfo, Patrick; Mai, Qun; Sanfilippo, Frank M; Preen, David B; Stewart, Louise M; Fatovich, Daniel M
2015-10-01
To develop multivariate vector-ARMA (VARMA) forecast models for predicting emergency department (ED) demand in Western Australia (WA) and compare them to the benchmark univariate autoregressive moving average (ARMA) and Winters' models. Seven-year monthly WA state-wide public hospital ED presentation data from 2006/07 to 2012/13 were modelled. Graphical and VARMA modelling methods were used for descriptive analysis and model fitting. The VARMA models were compared to the benchmark univariate ARMA and Winters' models to determine their accuracy to predict ED demand. The best models were evaluated by using error correction methods for accuracy. Descriptive analysis of all the dependent variables showed an increasing pattern of ED use with seasonal trends over time. The VARMA models provided a more precise and accurate forecast with smaller confidence intervals and better measures of accuracy in predicting ED demand in WA than the ARMA and Winters' method. VARMA models are a reliable forecasting method to predict ED demand for strategic planning and resource allocation. While the ARMA models are a closely competing alternative, they under-estimated future ED demand. Copyright © 2015 Elsevier Inc. All rights reserved.
USDA-ARS's Scientific Manuscript database
Accurate, nonintrusive, and inexpensive techniques are needed to measure energy expenditure (EE) in free-living populations. Our primary aim in this study was to validate cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on observable participant cha...
Copula-based prediction of economic movements
NASA Astrophysics Data System (ADS)
García, J. E.; González-López, V. A.; Hirsh, I. D.
2016-06-01
In this paper we model the discretized returns of two paired time series, the BM&FBOVESPA Dividend Index and the BM&FBOVESPA Public Utilities Index, using multivariate Markov models. The discretization corresponds to three categories: high losses, high profits and the complementary periods of the series. In technical terms, the maximal memory that can be considered for a Markov model can be derived from the size of the alphabet and the dataset. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain. In this case the size of the database is not large enough for a consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists of obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the two marginal processes and the partition corresponding to the multivariate Markov chain. In order to estimate the transition probabilities, all the partitions are linked using a copula. In our application this strategy provides a significant improvement in the movement predictions.
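The exponential growth in parameters mentioned in this abstract is easy to quantify. The sketch below counts the free transition probabilities of a full (unrestricted) discrete Markov chain; the function name is an assumption, and the setting of three categories and two series mirrors the abstract's description.

```python
def markov_free_parameters(alphabet_size, dimension, order):
    """Free transition probabilities of a full multivariate Markov chain.

    Each joint state is a tuple of `dimension` symbols, each history is a
    sequence of `order` joint states, and each history needs one probability
    per joint state minus one (the probabilities sum to 1).
    """
    joint_states = alphabet_size ** dimension
    histories = joint_states ** order
    return histories * (joint_states - 1)


# Three categories per series and two paired series, as in the abstract.
for order in range(1, 5):
    print(order, markov_free_parameters(3, 2, order))
```

With 9 joint states, each extra unit of memory multiplies the parameter count by 9, which is why a modest dataset quickly becomes too small for consistent estimation.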
Anantha M. Prasad; Louis R. Iverson; Andy Liaw
2006-01-01
We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.
ERIC Educational Resources Information Center
Owen, Steven V.; Feldhusen, John F.
This study compares the effectiveness of three models of multivariate prediction for academic success in identifying the criterion variance of achievement in nursing education. The first model involves the use of an optimum set of predictors and one equation derived from a regression analysis on first semester grade average in predicting the…
Katseanes, Chelsea K; Chappell, Mark A; Hopkins, Bryan G; Durham, Brian D; Price, Cynthia L; Porter, Beth E; Miller, Lesley F
2016-11-01
After nearly a century of use in numerous munition platforms, TNT and RDX contamination has turned up in the environment, largely due to ammunition manufacturing or as part of releases from low-order detonations during training activities. Although the basic principles governing the environmental fate of TNT and RDX are known, accurate predictions of TNT and RDX persistence in soil remain elusive, particularly given the universal heterogeneity of pedomorphic soil types. In this work, we proposed a new solution for modeling the sorption and persistence of these munition constituents as multivariate mathematical functions correlating soil attribute data over a variety of taxonomically distinct soil types to contaminant behavior, instead of a single constant or parameter of a specific absolute value. To test this idea, we conducted experiments measuring the sorption of TNT and RDX on taxonomically different soil types that were extensively physically and chemically characterized. Statistical decomposition of the log-transformed and auto-scaled soil characterization data using the dimension-reduction technique PCA (principal component analysis) revealed a strong latent structure based on the multiple pairwise correlations among the soil properties. TNT and RDX sorption partitioning coefficients (KD-TNT and KD-RDX) were regressed against this latent structure using partial least squares regression (PLSR), generating 3-factor multivariate linear functions. Here, PLSR models predicted KD-TNT and KD-RDX values based on attributes contributing to endogenous alkaline/calcareous and soil fertility criteria, respectively, exhibited among the different soil types. We hypothesized that the latent structure arising from the strong covariance of the full multivariate geochemical matrix describing taxonomically distinguished soil types may provide the means for potentially predicting complex phenomena in soils. The development of predictive multivariate models tuned to a local soil's taxonomic designation would have direct benefit to military range managers seeking to anticipate the environmental risks of training activities on impact sites. Published by Elsevier Ltd.
Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A; van't Veld, Aart A
2012-03-15
To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended. Copyright © 2012 Elsevier Inc. All rights reserved.
Zhang, Yong; Zhong, Miner; Geng, Nana; Jiang, Yunjian
2017-01-01
The market demand for electric vehicles (EVs) has increased in recent years. Suitable models are necessary to understand and forecast EV sales. This study presents a singular spectrum analysis (SSA) as a univariate time-series model and vector autoregressive model (VAR) as a multivariate model. Empirical results suggest that SSA satisfactorily indicates the evolving trend and provides reasonable results. The VAR model, which comprised exogenous parameters related to the market on a monthly basis, can significantly improve the prediction accuracy. The EV sales in China, which are categorized into battery and plug-in EVs, are predicted in both short term (up to December 2017) and long term (up to 2020), as statistical proofs of the growth of the Chinese EV industry. PMID:28459872
Christensen, Daniel; Zubrick, Stephen R; Lawrence, David; Mitrou, Francis; Taylor, Catherine L
2014-01-01
Receptive vocabulary development is a component of the human language system that emerges in the first year of life and is characterised by onward expansion throughout life. Beginning in infancy, children's receptive vocabulary knowledge builds the foundation for oral language and reading skills. The foundations for success at school are built early, hence the public health policy focus on reducing developmental inequalities before children start formal school. The underlying assumption is that children's development is stable, and therefore predictable, over time. This study investigated this assumption in relation to children's receptive vocabulary ability. We investigated the extent to which low receptive vocabulary ability at 4 years was associated with low receptive vocabulary ability at 8 years, and the predictive utility of a multivariate model that included child, maternal and family risk factors measured at 4 years. The study sample comprised 3,847 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Multivariate logistic regression was used to investigate risks for low receptive vocabulary ability from 4-8 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. In the multivariate model, substantial risk factors for receptive vocabulary delay from 4-8 years, in order of descending magnitude, were low receptive vocabulary ability at 4 years, low maternal education, and low school readiness. Moderate risk factors, in order of descending magnitude, were low maternal parenting consistency, socio-economic area disadvantage, low temperamental persistence, and NESB status. The following risk factors were not significant: One or more siblings, low family income, not reading to the child, high maternal work hours, and Aboriginal or Torres Strait Islander ethnicity. 
The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model featuring risks of substantive magnitude does not do particularly well in predicting low receptive vocabulary ability from 4-8 years.
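The sensitivity-specificity analysis referred to above rests on two ratios from the confusion table. A minimal sketch, using made-up labels rather than any LSAC data:

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Illustrative labels only: 1 = low receptive vocabulary at follow-up.
observed = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(observed, predicted)
```

In this toy example specificity is high while sensitivity is low, which is the pattern one would expect when a well-fitted risk model still misses most of the children who go on to have low scores.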
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Reports of surgical complications are common, but few provide information about the severity of complications or estimate their risk factors, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012 and established a multivariate logistic regression model to predict risk factors for postoperative complications graded according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis, and the 11 significant variables entered into multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation, $p = \exp(\sum_i B_i X_i) / (1 + \exp(\sum_i B_i X_i))$, a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: $p = 1/\left(1 + e^{4.810 - 1.287X_1 - 0.504X_2 - 0.500X_3 - 0.474X_4 - 0.405X_5 - 0.318X_6 - 0.316X_7 - 0.305X_8 - 0.278X_9 - 0.255X_{10} - 0.138X_{11}}\right)$. The accuracy, sensitivity and specificity of the model to predict the postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors and estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
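The risk equation quoted in this abstract can be evaluated directly. The sketch below uses the published constant and coefficients; the abstract does not say which risk factor corresponds to which X1..X11 or how each is coded, so the input vectors here are purely hypothetical.

```python
import math

# Constant and coefficients as printed in the abstract's equation.
CONSTANT = 4.810
COEFFS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
          0.316, 0.305, 0.278, 0.255, 0.138]


def complication_risk(x):
    """Return p = 1 / (1 + exp(CONSTANT - sum(B_i * X_i))).

    `x` holds the 11 predictor values X1..X11; their coding is an
    assumption, since the abstract does not define it.
    """
    if len(x) != len(COEFFS):
        raise ValueError("expected 11 predictor values")
    exponent = CONSTANT - sum(b * xi for b, xi in zip(COEFFS, x))
    return 1.0 / (1.0 + math.exp(exponent))


baseline = complication_risk([0] * 11)   # no risk factor present
all_present = complication_risk([1] * 11)  # hypothetical 0/1 coding, all present
```

Because every coefficient enters the exponent with a negative sign, each additional risk factor lowers the exponent and raises the predicted probability.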
Mortality Prediction Model of Septic Shock Patients Based on Routinely Recorded Data
Carrara, Marta; Baselli, Giuseppe; Ferrario, Manuela
2015-01-01
We studied the problem of mortality prediction in two datasets, the first composed of 23 septic shock patients and the second composed of 73 septic subjects selected from the public database MIMIC-II. For each patient we derived hemodynamic variables, laboratory results, and clinical information of the first 48 hours after shock onset and we performed univariate and multivariate analyses to predict mortality in the following 7 days. The results show interesting features that individually identify significant differences between survivors and nonsurvivors and features which gain importance only when considered together with the others in a multivariate regression model. This preliminary study on two small septic shock populations represents a novel contribution towards new personalized models for an integration of multiparameter patient information to improve critical care management of shock patients. PMID:26557154
Phillips, Robert S; Sung, Lillian; Amman, Roland A; Riley, Richard D; Castagnola, Elio; Haeusler, Gabrielle M; Klaassen, Robert; Tissing, Wim J E; Lehrnbecher, Thomas; Chisholm, Julia; Hakim, Hana; Ranasinghe, Neil; Paesmans, Marianne; Hann, Ian M; Stewart, Lesley A
2016-01-01
Background: Risk-stratified management of fever with neutropenia (FN) allows intensive management of high-risk cases and early discharge of low-risk cases. No single, internationally validated, prediction model of the risk of adverse outcomes exists for children and young people. An individual patient data (IPD) meta-analysis was undertaken to devise one. Methods: The ‘Predicting Infectious Complications in Children with Cancer’ (PICNICC) collaboration was formed by parent representatives, international clinical and methodological experts. Univariable and multivariable analyses, using random effects logistic regression, were undertaken to derive and internally validate a risk-prediction model for outcomes of episodes of FN based on clinical and laboratory data at presentation. Results: Data came from 22 different study groups from 15 countries, of 5127 episodes of FN in 3504 patients. There were 1070 episodes in 616 patients from seven studies available for multivariable analysis. Univariable analyses showed associations with microbiologically defined infection (MDI) in many items, including higher temperature, lower white cell counts and acute myeloid leukaemia, but not age. Patients with osteosarcoma/Ewing's sarcoma and those with more severe mucositis were associated with a decreased risk of MDI. The predictive model included: malignancy type, temperature, clinically ‘severely unwell’, haemoglobin, white cell count and absolute monocyte count. It showed moderate discrimination (AUROC 0.723, 95% confidence interval 0.711–0.759) and good calibration (calibration slope 0.95). The model was robust to bootstrap and cross-validation sensitivity analyses. Conclusions: This new prediction model for risk of MDI appears accurate. It requires prospective studies assessing implementation to assist clinicians and parents/patients in individualised decision making. PMID:26954719
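Discrimination as reported here (AUROC) is the probability that a randomly chosen episode with the outcome receives a higher predicted risk than a randomly chosen episode without it. A minimal sketch of that definition, with made-up scores rather than PICNICC data:

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative predicted risks and outcomes (1 = infection documented).
risk = [0.9, 0.8, 0.4, 0.3, 0.2]
outcome = [1, 0, 1, 0, 0]
area = auroc(risk, outcome)
```

The pairwise form is quadratic in the number of episodes, which is fine for illustration; rank-based formulas give the same value more efficiently on large datasets.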
Sepehrband, Farshid; Lynch, Kirsten M; Cabeen, Ryan P; Gonzalez-Zacarias, Clio; Zhao, Lu; D'Arcy, Mike; Kesselman, Carl; Herting, Megan M; Dinov, Ivo D; Toga, Arthur W; Clark, Kristi A
2018-05-15
Exploring neuroanatomical sex differences using a multivariate statistical learning approach can yield insights that cannot be derived with univariate analysis. While gross differences in total brain volume are well-established, uncovering the more subtle, regional sex-related differences in neuroanatomy requires a multivariate approach that can accurately model spatial complexity as well as the interactions between neuroanatomical features. Here, we developed a multivariate statistical learning model using a support vector machine (SVM) classifier to predict sex from MRI-derived regional neuroanatomical features from a single-site study of 967 healthy youth from the Philadelphia Neurodevelopmental Cohort (PNC). Then, we validated the multivariate model on an independent dataset of 682 healthy youth from the multi-site Pediatric Imaging, Neurocognition and Genetics (PING) cohort study. The trained model exhibited an 83% cross-validated prediction accuracy, and correctly predicted the sex of 77% of the subjects from the independent multi-site dataset. Results showed that cortical thickness of the middle occipital lobes and the angular gyri are major predictors of sex. Results also demonstrated the inferential benefits of going beyond classical regression approaches to capture the interactions among brain features in order to better characterize sex differences in male and female youths. We also identified specific cortical morphological measures and parcellation techniques, such as cortical thickness as derived from the Destrieux atlas, that are better able to discriminate between males and females in comparison to other brain atlases (Desikan-Killiany, Brodmann and subcortical atlases). Copyright © 2018 Elsevier Inc. All rights reserved.
Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M
2015-01-20
Prediction models are developed to aid health-care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health-care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M
2015-02-01
Prediction models are developed to aid healthcare providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision-making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) initiative developed a set of recommendations for the reporting of studies developing, validating or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, healthcare professionals and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 Stichting European Society for Clinical Investigation Journal Foundation.
Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M
2015-01-06
Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Reitsma, Johannes B.; Altman, Douglas G.; Moons, Karel G.M.
2015-01-01
Background— Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. Methods— The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. Results— The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. Conclusions— To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). PMID:25561516
Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M
2015-01-01
Prediction models are developed to aid health-care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health-care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). PMID:25562432
Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M
2015-02-01
Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 Royal College of Obstetricians and Gynaecologists.
Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M
2015-01-13
Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 The Authors.
Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M
2015-02-01
Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). Copyright © 2015 Elsevier Inc. All rights reserved.
An Efficient Pattern Mining Approach for Event Detection in Multivariate Temporal Data
Batal, Iyad; Cooper, Gregory; Fradkin, Dmitriy; Harrison, James; Moerchen, Fabian; Hauskrecht, Milos
2015-01-01
This work proposes a pattern mining approach to learn event detection models from complex multivariate temporal data, such as electronic health records. We present Recent Temporal Pattern mining, a novel approach for efficiently finding predictive patterns for event detection problems. This approach first converts the time series data into time-interval sequences of temporal abstractions. It then constructs more complex time-interval patterns backward in time using temporal operators. We also present the Minimal Predictive Recent Temporal Patterns framework for selecting a small set of predictive and non-spurious patterns. We apply our methods for predicting adverse medical events in real-world clinical data. The results demonstrate the benefits of our methods in learning accurate event detection models, which is a key step for developing intelligent patient monitoring and decision support systems. PMID:26752800
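The first step described above, converting a numeric time series into time-interval sequences of temporal abstractions, might look like the sketch below. The three-state value abstraction, the thresholds and the function name are assumptions for illustration; the abstract does not specify which abstractions the authors use.

```python
def to_state_intervals(times, values, low, high):
    """Convert a time series into (state, start, end) intervals.

    Each value is abstracted to "L" (below `low`), "H" (above `high`)
    or "N" (in between); consecutive equal states are merged.
    """
    def state(v):
        if v < low:
            return "L"
        if v > high:
            return "H"
        return "N"

    intervals = []
    for t, v in zip(times, values):
        s = state(v)
        if intervals and intervals[-1][0] == s:
            # Extend the current interval to this time point.
            intervals[-1] = (s, intervals[-1][1], t)
        else:
            intervals.append((s, t, t))
    return intervals


# Hypothetical lab values sampled at times 1..5 with a normal range of 4 to 10.
abstracted = to_state_intervals([1, 2, 3, 4, 5], [5, 6, 12, 13, 7], low=4, high=10)
```

Interval sequences like these are the kind of input on which time-interval patterns can then be built with temporal operators.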
NASA Astrophysics Data System (ADS)
Hao, Zengchao; Hao, Fanghua; Singh, Vijay P.
2016-08-01
Drought is among the costliest natural hazards worldwide and extreme drought events in recent years have caused huge losses to various sectors. Drought prediction is therefore critically important for providing early warning information to aid decision making to cope with drought. Due to the complicated nature of drought, it has been recognized that the univariate drought indicator may not be sufficient for drought characterization and hence multivariate drought indices have been developed for drought monitoring. Alongside the substantial effort in drought monitoring with multivariate drought indices, it is of equal importance to develop a drought prediction method with multivariate drought indices to integrate drought information from various sources. This study proposes a general framework for multivariate multi-index drought prediction that is capable of integrating complementary prediction skills from multiple drought indices. The Multivariate Ensemble Streamflow Prediction (MESP) is employed to sample from historical records for obtaining statistical prediction of multiple variables, which is then used as inputs to achieve multivariate prediction. The framework is illustrated with a linearly combined drought index (LDI), which is a commonly used multivariate drought index, based on climate division data in California and New York in the United States with different seasonality of precipitation. The predictive skill of LDI (represented with persistence) is assessed by comparison with the univariate drought index and results show that the LDI prediction skill is less affected by seasonality than the meteorological drought prediction based on SPI. Prediction results from the case study show that the proposed multivariate drought prediction outperforms the persistence prediction, implying a satisfactory performance of multivariate drought prediction. 
The proposed method would be useful for drought prediction to integrate drought information from various sources for early drought warning.
USDA-ARS's Scientific Manuscript database
Spectral scattering is useful for nondestructive sensing of fruit firmness. Prediction models, however, are typically built using multivariate statistical methods such as partial least squares regression (PLSR), whose performance generally depends on the characteristics of the data. The aim of this ...
NASA Astrophysics Data System (ADS)
Teye, Ernest; Huang, Xingyi; Dai, Huang; Chen, Quansheng
2013-10-01
A quick, accurate and reliable technique for discriminating cocoa beans according to geographical origin is essential for quality control and traceability management. This study presents the application of near infrared (NIR) spectroscopy and multivariate classification to the differentiation of Ghana cocoa beans. A total of 194 cocoa bean samples from seven cocoa growing regions were used. Principal component analysis (PCA) was used to extract relevant information from the spectral data and this gave visible cluster trends. The performance of four multivariate classification methods was compared: linear discriminant analysis (LDA), K-nearest neighbors (KNN), back propagation artificial neural network (BPANN) and support vector machine (SVM). The performances of the models were optimized by cross validation. The results revealed that the SVM model was superior to all the other mathematical methods, with a discrimination rate of 100% in both the training and prediction sets after preprocessing with mean centering (MC). BPANN had a discrimination rate of 99.23% for the training set and 96.88% for the prediction set, while the LDA model had 96.15% and 90.63% for the training and prediction sets, respectively. The KNN model had 75.01% for the training set and 72.31% for the prediction set. The non-linear classification methods used were superior to the linear ones. Generally, the results revealed that NIR spectroscopy coupled with the SVM model could be used successfully to discriminate cocoa beans according to their geographical origins for effective quality assurance.
Prostate Health Index improves multivariable risk prediction of aggressive prostate cancer.
Loeb, Stacy; Shin, Sanghyuk S; Broyles, Dennis L; Wei, John T; Sanda, Martin; Klee, George; Partin, Alan W; Sokoll, Lori; Chan, Daniel W; Bangma, Chris H; van Schaik, Ron H N; Slawin, Kevin M; Marks, Leonard S; Catalona, William J
2017-07-01
To examine the use of the Prostate Health Index (PHI) as a continuous variable in multivariable risk assessment for aggressive prostate cancer in a large multicentre US study. The study population included 728 men, with prostate-specific antigen (PSA) levels of 2-10 ng/mL and a negative digital rectal examination, enrolled in a prospective, multi-site early detection trial. The primary endpoint was aggressive prostate cancer, defined as biopsy Gleason score ≥7. First, we evaluated whether the addition of PHI improves the performance of currently available risk calculators (the Prostate Cancer Prevention Trial [PCPT] and European Randomised Study of Screening for Prostate Cancer [ERSPC] risk calculators). We also designed and internally validated a new PHI-based multivariable predictive model, and created a nomogram. Of 728 men undergoing biopsy, 118 (16.2%) had aggressive prostate cancer. The PHI predicted the risk of aggressive prostate cancer across the spectrum of values. Adding PHI significantly improved the predictive accuracy of the PCPT and ERSPC risk calculators for aggressive disease. A new model was created using age, previous biopsy, prostate volume, PSA and PHI, with an area under the curve of 0.746. The bootstrap-corrected model showed good calibration with observed risk for aggressive prostate cancer and had net benefit on decision-curve analysis. Using PHI as part of multivariable risk assessment leads to a significant improvement in the detection of aggressive prostate cancer, potentially reducing harms from unnecessary prostate biopsy and overdiagnosis. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.
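The area under the curve reported above is a discrimination measure: the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. A minimal sketch of that calculation follows, assuming binary outcome labels and using hypothetical scores that are not data from the study; the function name `auc` is illustrative only.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison of cases and controls.

    scores: predicted risks; labels: 1 for a case, 0 for a control.
    Ties between a case and a control count as half a concordant pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical predicted risks (illustration only, not study data)
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
print(round(auc(example_scores, example_labels), 3))
```

The pairwise form is quadratic in sample size, which is fine for a sketch; rank-based formulas give the same value more efficiently on large cohorts.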
Decision-support models for empiric antibiotic selection in Gram-negative bloodstream infections.
MacFadden, D R; Coburn, B; Shah, N; Robicsek, A; Savage, R; Elligsen, M; Daneman, N
2018-04-25
Early empiric antibiotic therapy in patients can improve clinical outcomes in Gram-negative bacteraemia. However, the widespread prevalence of antibiotic-resistant pathogens compromises our ability to provide adequate therapy while minimizing use of broad antibiotics. We sought to determine whether readily available electronic medical record data could be used to develop predictive models for decision support in Gram-negative bacteraemia. We performed a multi-centre cohort study, in Canada and the USA, of hospitalized patients with Gram-negative bloodstream infection from April 2010 to March 2015. We analysed multivariable models for prediction of antibiotic susceptibility at two empiric windows: Gram-stain-guided and pathogen-guided treatment. Decision-support models for empiric antibiotic selection were developed based on three clinical decision thresholds of acceptable adequate coverage (80%, 90% and 95%). A total of 1832 patients with Gram-negative bacteraemia were evaluated. Multivariable models showed good discrimination across countries and at both Gram-stain-guided (12 models, areas under the curve (AUCs) 0.68-0.89, optimism-corrected AUCs 0.63-0.85) and pathogen-guided (12 models, AUCs 0.75-0.98, optimism-corrected AUCs 0.64-0.95) windows. Compared to antibiogram-guided therapy, decision-support models of antibiotic selection incorporating individual patient characteristics and prior culture results have the potential to increase use of narrower-spectrum antibiotics (in up to 78% of patients) while reducing inadequate therapy. Multivariable models using readily available epidemiologic factors can be used to predict antimicrobial susceptibility in infecting pathogens with reasonable discriminatory ability. 
Implementation of sequential predictive models for real-time individualized empiric antibiotic decision-making has the potential to both optimize adequate coverage for patients while minimizing overuse of broad-spectrum antibiotics, and therefore requires further prospective evaluation. Readily available epidemiologic risk factors can be used to predict susceptibility of Gram-negative organisms among patients with bacteraemia, using automated decision-making models. Copyright © 2018 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Scalable Joint Models for Reliable Uncertainty-Aware Event Prediction.
Soleimani, Hossein; Hensman, James; Saria, Suchi
2017-08-21
Missing data and noisy observations pose significant challenges for reliably predicting events from irregularly sampled multivariate time series (longitudinal) data. Imputation methods, which are typically used for completing the data prior to event prediction, lack a principled mechanism to account for the uncertainty due to missingness. Alternatively, state-of-the-art joint modeling techniques can be used for jointly modeling the longitudinal and event data and computing event probabilities conditioned on the longitudinal observations. These approaches, however, make strong parametric assumptions and do not easily scale to multivariate signals with many observations. Our proposed approach consists of several key innovations. First, we develop a flexible and scalable joint model based upon sparse multiple-output Gaussian processes. Unlike state-of-the-art joint models, the proposed model can explain highly challenging structure including non-Gaussian noise while scaling to large data. Second, we derive an optimal policy for predicting events using the distribution of the event occurrence estimated by the joint model. The derived policy trades off the cost of a delayed detection against incorrect assessments and abstains from making decisions when the estimated event probability does not satisfy the derived confidence criteria. Experiments on a large dataset show that the proposed framework significantly outperforms state-of-the-art techniques in event prediction.
Horner, Fleur; Bilzon, James L; Rayson, Mark; Blacker, Sam; Richmond, Victoria; Carter, James; Wright, Anthony; Nevill, Alan
2013-01-01
This study developed a multivariate model to predict free-living energy expenditure (EE) in independent military cohorts. Two hundred and eighty-eight individuals (20.6 ± 3.9 years, 67.9 ± 12.0 kg, 1.71 ± 0.10 m) from 10 cohorts wore accelerometers during observation periods of 7 or 10 days. Accelerometer counts (PAC) were recorded at 1-minute epochs. Total energy expenditure (TEE) and physical activity energy expenditure (PAEE) were derived using the doubly labelled water technique. Data were reduced to n = 155 based on wear-time. Associations between PAC and EE were assessed using allometric modelling. Models were derived using multiple log-linear regression analysis and gender differences assessed using analysis of covariance. In all models PAC, height and body mass were related to TEE (P < 0.01). For models predicting TEE (r² = 0.65, SE = 462 kcal·d⁻¹ (13.0%)), PAC explained 4% of the variance. For models predicting PAEE (r² = 0.41, SE = 490 kcal·d⁻¹ (32.0%)), PAC accounted for 6% of the variance. Accelerometry increases the accuracy of EE estimation in military populations. However, the unique nature of military life means accurate prediction of individual free-living EE is highly dependent on anthropometric measurements.
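An allometric model is multiplicative in its predictors, which is why it can be fitted by linear regression after taking logarithms. The sketch below shows only that equivalence. The coefficient values, the predictor values and the function names are hypothetical; the abstract does not report the fitted model terms.

```python
import math


def predict_power_form(scale, exponents, predictors):
    """Allometric (power-law) form: y = scale * x1**b1 * x2**b2 * ..."""
    y = scale
    for x, b in zip(predictors, exponents):
        y *= x ** b
    return y


def predict_log_linear_form(scale, exponents, predictors):
    """Same model on the log scale: ln y = ln(scale) + sum(b_i * ln x_i)."""
    log_y = math.log(scale)
    for x, b in zip(predictors, exponents):
        log_y += b * math.log(x)
    return math.exp(log_y)


# Hypothetical coefficients and inputs (counts, body mass in kg, height in m)
hypothetical_scale = 50.0
hypothetical_exponents = [0.1, 0.6, 0.5]
hypothetical_inputs = [400.0, 68.0, 1.71]

power = predict_power_form(hypothetical_scale, hypothetical_exponents, hypothetical_inputs)
loglin = predict_log_linear_form(hypothetical_scale, hypothetical_exponents, hypothetical_inputs)
print(abs(power - loglin) < 1e-9 * power)
```

Because the exponents enter linearly on the log scale, ordinary multiple regression of ln(EE) on the logged predictors estimates them directly.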
Zubrick, Stephen R; Taylor, Catherine L; Christensen, Daniel
2015-01-01
Oral language is the foundation of literacy. Naturally, policies and practices to promote children's literacy begin in early childhood and have a strong focus on developing children's oral language, especially for children with known risk factors for low language ability. The underlying assumption is that children's progress along the oral to literate continuum is stable and predictable, such that low language ability foretells low literacy ability. This study investigated patterns and predictors of children's oral language and literacy abilities at 4, 6, 8 and 10 years. The study sample comprised 2,316 to 2,792 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Six developmental patterns were observed, a stable middle-high pattern, a stable low pattern, an improving pattern, a declining pattern, a fluctuating low pattern, and a fluctuating middle-high pattern. Most children (69%) fit a stable middle-high pattern. By contrast, less than 1% of children fit a stable low pattern. These results challenged the view that children's progress along the oral to literate continuum is stable and predictable. Multivariate logistic regression was used to investigate risks for low literacy ability at 10 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. Predictors were modelled as risk variables with the lowest level of risk as the reference category. In the multivariate model, substantial risks for low literacy ability at 10 years, in order of descending magnitude, were: low school readiness, Aboriginal and/or Torres Strait Islander status and low language ability at 8 years. Moderate risks were high temperamental reactivity, low language ability at 4 years, and low language ability at 6 years. 
The following risk factors were not statistically significant in the multivariate model: Low maternal consistency, low family income, health care card, child not read to at home, maternal smoking, maternal education, family structure, temperamental persistence, and socio-economic area disadvantage. The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model featuring risks of substantive magnitude did not do particularly well in predicting low literacy ability at 10 years.
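Sensitivity-specificity analysis asks how well a model's classifications match observed outcomes, separately for those with and without the outcome. A minimal sketch follows, assuming binary predictions and outcomes; the example vectors are hypothetical and do not come from the LSAC data.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for binary predictions.

    sensitivity = true positives / all actual positives
    specificity = true negatives / all actual negatives
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classifications (illustration only)
example_predicted = [1, 0, 0, 0, 1, 0, 0, 0]
example_actual = [1, 1, 1, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(example_predicted, example_actual)
print(round(sens, 3), round(spec, 3))
```

A model can show statistically significant risk factors and still have low sensitivity, as in this toy example, where most children with the outcome are missed.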
Harris, Jenny; Cornelius, Victoria; Ream, Emma; Cheevers, Katy; Armes, Jo
2017-07-01
The purpose of this review was to identify potential candidate predictors of anxiety in women with early-stage breast cancer (BC) after adjuvant treatments and evaluate methodological development of existing multivariable models to inform the future development of a predictive risk stratification model (PRSM). Databases (MEDLINE, Web of Science, CINAHL, CENTRAL and PsycINFO) were searched from inception to November 2015. Eligible studies were prospective, recruited women with stage 0-3 BC, used a validated anxiety outcome ≥3 months post-treatment completion and used multivariable prediction models. Internationally accepted quality standards were used to assess predictive risk of bias and strength of evidence. Seven studies were identified: five were observational cohorts and two secondary analyses of RCTs. Variability of measurement and selective reporting precluded meta-analysis. Twenty-one candidate predictors were identified in total. Younger age and previous mental health problems were identified as risk factors in ≥3 studies. Clinical variables (e.g. treatment, tumour grade) were not identified as predictors in any studies. No studies adhered to all quality standards. Pre-existing vulnerability to mental health problems and younger age increased the risk of anxiety after completion of treatment for BC survivors, but there was no evidence that chemotherapy was a predictor. Multiple predictors were identified but many lacked reproducibility or were not measured across studies, and inadequate reporting did not allow full evaluation of the multivariable models. The use of quality standards in the development of PRSM within supportive cancer care would improve model quality and performance, thereby allowing professionals to better target support for patients.
de Godoy, Luiz Antonio Fonseca; Hantao, Leandro Wang; Pedroso, Marcio Pozzobon; Poppi, Ronei Jesus; Augusto, Fabio
2011-08-05
The use of multivariate curve resolution (MCR) to build multivariate quantitative models using data obtained from comprehensive two-dimensional gas chromatography with flame ionization detection (GC×GC-FID) is presented and evaluated. The MCR algorithm presents some important features, such as second order advantage and the recovery of the instrumental response for each pure component after optimization by an alternating least squares (ALS) procedure. A model to quantify the essential oil of rosemary was built using a calibration set containing only known concentrations of the essential oil and cereal alcohol as solvent. A calibration curve correlating the concentration of the essential oil of rosemary and the instrumental response obtained from the MCR-ALS algorithm was obtained, and this calibration model was applied to predict the concentration of the oil in complex samples (mixtures of the essential oil, pineapple essence and commercial perfume). The values of the root mean square error of prediction (RMSEP) and of the root mean square error of the percentage deviation (RMSPD) obtained were 0.4% (v/v) and 7.2%, respectively. Additionally, a second model was built and used to evaluate the accuracy of the method. A model to quantify the essential oil of lemon grass was built and its concentration was predicted in the validation set and real perfume samples. The RMSEP and RMSPD obtained were 0.5% (v/v) and 6.9%, respectively, and the concentration of the essential oil of lemon grass in perfume agreed to the value informed by the manufacturer. The result indicates that the MCR algorithm is adequate to resolve the target chromatogram from the complex sample and to build multivariate models of GC×GC-FID data. Copyright © 2011 Elsevier B.V. All rights reserved.
Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M
2015-02-01
Prediction models are developed to aid healthcare providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision-making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a web-based survey and revised during a 3-day meeting in June 2011 with methodologists, healthcare professionals and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. A complete checklist is available at http://www.tripod-statement.org. © 2015 American College of Physicians.
Miaw, Carolina Sheng Whei; Assis, Camila; Silva, Alessandro Rangel Carolino Sales; Cunha, Maria Luísa; Sena, Marcelo Martins; de Souza, Scheilla Vitorino Carvalho
2018-07-15
Grape, orange, peach and passion fruit nectars were formulated and adulterated by dilution with syrup, apple and cashew juices at 10 levels for each adulterant. Attenuated total reflectance Fourier transform mid infrared (ATR-FTIR) spectra were obtained. Partial least squares (PLS) multivariate calibration models allied to different variable selection methods, such as interval partial least squares (iPLS), ordered predictors selection (OPS) and genetic algorithm (GA), were used to quantify the main fruits. PLS improved by iPLS-OPS variable selection showed the highest predictive capacity to quantify the main fruit contents. The selected variables in the final models varied from 72 to 100; the root mean square errors of prediction were estimated from 0.5 to 2.6%; the correlation coefficients of prediction ranged from 0.948 to 0.990; and, the mean relative errors of prediction varied from 3.0 to 6.7%. All of the developed models were validated. Copyright © 2018 Elsevier Ltd. All rights reserved.
Song, Seung Yeob; Lee, Young Koung; Kim, In-Jung
2016-01-01
A high-throughput screening system for Citrus lines with higher sugar and acid contents was established using Fourier transform infrared (FT-IR) spectroscopy in combination with multivariate analysis. FT-IR spectra confirmed typical spectral differences in the frequency regions of 950-1100 cm⁻¹, 1300-1500 cm⁻¹, and 1500-1700 cm⁻¹. Principal component analysis (PCA) and subsequent partial least square-discriminant analysis (PLS-DA) were able to discriminate five Citrus lines into three separate clusters corresponding to their taxonomic relationships. The quantitative predictive modeling of sugar and acid contents from Citrus fruits was established using partial least square regression algorithms from FT-IR spectra. The regression coefficients (R²) between predicted values and estimated sugar and acid content values were 0.99. These results demonstrate that by using FT-IR spectra and applying quantitative prediction modeling to Citrus sugar and acid contents, excellent Citrus lines can be detected early with greater accuracy. Copyright © 2015 Elsevier Ltd. All rights reserved.
2014-01-01
This paper examined the efficiency of multivariate linear regression (MLR) and artificial neural network (ANN) models in prediction of two major water quality parameters in a wastewater treatment plant. Biochemical oxygen demand (BOD) and chemical oxygen demand (COD), as well as indirect indicators of organic matter, are representative parameters for sewer water quality. Performance of the ANN models was evaluated using coefficient of correlation (r), root mean square error (RMSE) and bias values. The values of BOD and COD computed by the model, the ANN method and regression analysis were in close agreement with their respective measured values. Results showed that the ANN model performed better than the MLR model. Comparative indices of the optimized ANN with input values of temperature (T), pH, total suspended solid (TSS) and total suspended (TS) were RMSE = 25.1 mg/L, r = 0.83 for prediction of BOD and RMSE = 49.4 mg/L, r = 0.81 for prediction of COD. It was found that the ANN model could be employed successfully in estimating the BOD and COD in the inlet of wastewater biochemical treatment plants. Moreover, sensitivity analysis results showed that the pH parameter has more effect on BOD and COD prediction than the other parameters. Also, both implemented models predicted BOD better than COD. PMID:24456676
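The three evaluation measures named in this abstract (correlation coefficient r, RMSE and bias) can be computed directly from paired observed and predicted values. The sketch below assumes bias is defined as the mean of predicted minus observed, which is one common convention and is not stated in the abstract; the sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mean_bias(observed, predicted):
    """Mean of (predicted - observed); sign convention is an assumption."""
    n = len(observed)
    return sum(p - o for o, p in zip(observed, predicted)) / n


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(ss_o * ss_p)


# Hypothetical BOD values in mg/L (illustration only)
observed_bod = [100.0, 200.0, 300.0]
predicted_bod = [110.0, 190.0, 330.0]
print(round(rmse(observed_bod, predicted_bod), 2))
print(mean_bias(observed_bod, predicted_bod))
print(round(pearson_r(observed_bod, predicted_bod), 3))
```

Reporting all three together is useful because a model can correlate well with the measurements while still being systematically offset.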
Comparison of Optimum Interpolation and Cressman Analyses
NASA Technical Reports Server (NTRS)
Baker, W. E.; Bloom, S. C.; Nestler, M. S.
1984-01-01
The objective of this investigation is to develop a state-of-the-art optimum interpolation (O/I) objective analysis procedure for use in numerical weather prediction studies. A three-dimensional multivariate O/I analysis scheme has been developed. Some characteristics of the GLAS O/I compared with those of the NMC and ECMWF systems are summarized. Some recent enhancements of the GLAS scheme include a univariate analysis of water vapor mixing ratio, a geographically dependent model prediction error correlation function and a multivariate oceanic surface analysis.
Comparison of Optimum Interpolation and Cressman Analyses
NASA Technical Reports Server (NTRS)
Baker, W. E.; Bloom, S. C.; Nestler, M. S.
1985-01-01
The development of a state of the art optimum interpolation (O/I) objective analysis procedure for use in numerical weather prediction studies was investigated. A three dimensional multivariate O/I analysis scheme was developed. Some characteristics of the GLAS O/I compared with those of the NMC and ECMWF systems are summarized. Some recent enhancements of the GLAS scheme include a univariate analysis of water vapor mixing ratio, a geographically dependent model prediction error correlation function and a multivariate oceanic surface analysis.
NASA Astrophysics Data System (ADS)
Schwartz, Craig R.; Thelen, Brian J.; Kenton, Arthur C.
1995-06-01
A statistical parametric multispectral sensor performance model was developed by ERIM to support mine field detection studies, multispectral sensor design/performance trade-off studies, and target detection algorithm development. The model assumes target detection algorithms and their performance models which are based on data assumed to obey multivariate Gaussian probability distribution functions (PDFs). The applicability of these algorithms and performance models can be generalized to data having non-Gaussian PDFs through the use of transforms which convert non-Gaussian data to Gaussian (or near-Gaussian) data. An example of one such transform is the Box-Cox power law transform. In practice, such a transform can be applied to non-Gaussian data prior to the introduction of a detection algorithm that is formally based on the assumption of multivariate Gaussian data. This paper presents an extension of these techniques to the case where the joint multivariate probability density function of the non-Gaussian input data is known, and where the joint estimate of the multivariate Gaussian statistics, under the Box-Cox transform, is desired. The jointly estimated multivariate Gaussian statistics can then be used to predict the performance of a target detection algorithm which has an associated Gaussian performance model.
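For reference, the Box-Cox power transform mentioned above maps a positive value x to (x^λ − 1)/λ, with the natural logarithm as the limiting case at λ = 0. A minimal single-variable sketch follows; applying it band by band to multispectral data, and the choice of λ, are assumptions for illustration and not the joint estimation procedure the paper describes.

```python
import math


def box_cox(x, lam):
    """Box-Cox power transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)  # limiting case as lam -> 0
    return (x ** lam - 1.0) / lam


# Hypothetical skewed samples from one spectral band (illustration only)
band_samples = [1.0, 4.0, 9.0, 16.0]
transformed = [box_cox(v, 0.5) for v in band_samples]
print(transformed)
```

With λ = 0.5 the transform behaves like a shifted, scaled square root, which compresses the long right tail of the hypothetical samples.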
Using state-space models to predict the abundance of juvenile and adult sea lice on Atlantic salmon.
Elghafghuf, Adel; Vanderstichel, Raphael; St-Hilaire, Sophie; Stryhn, Henrik
2018-04-11
Sea lice are marine parasites affecting salmon farms, and are considered one of the most costly pests of the salmon aquaculture industry. Infestations of sea lice on farms significantly increase opportunities for the parasite to spread in the surrounding ecosystem, making control of this pest a challenging issue for salmon producers. The complexity of controlling sea lice on salmon farms requires frequent monitoring of the abundance of different sea lice stages over time. Industry-based data sets of counts of lice are amenable to multivariate time-series data analyses. In this study, two sets of multivariate autoregressive state-space models were applied to Chilean sea lice data from six Atlantic salmon production cycles on five isolated farms (at least 20 km seaway distance away from other known active farms), to evaluate the utility of these models for predicting sea lice abundance over time on farms. The models were constructed with different parameter configurations, and the analysis demonstrated large heterogeneity between production cycles for the autoregressive parameter, the effects of chemotherapeutant bath treatments, and the process-error variance. A model allowing for different parameters across production cycles had the best fit and the smallest overall prediction errors. However, pooling information across cycles for the drift and observation error parameters did not substantially affect model performance, thus reducing the number of necessary parameters in the model. Bath treatments had strong but variable effects for reducing sea lice burdens, and these effects were stronger for adult lice than juvenile lice. Our multivariate state-space models were able to handle different sea lice stages and provide predictions for sea lice abundance with reasonable accuracy up to five weeks out. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
A model for prediction of color change after tooth bleaching based on CIELAB color space
NASA Astrophysics Data System (ADS)
Herrera, Luis J.; Santana, Janiley; Yebra, Ana; Rivas, María José; Pulgar, Rosa; Pérez, María M.
2017-08-01
An experimental study aiming to develop a model based on CIELAB color space for prediction of color change after a tooth bleaching procedure is presented. Multivariate linear regression models were obtained to predict the L*, a*, b* and W* post-bleaching values using the pre-bleaching L*, a* and b* values. Moreover, univariate linear regression models were obtained to predict the variation in chroma (C*), hue angle (h°) and W*. The results demonstrated that it is possible to estimate color change when using a carbamide peroxide tooth-bleaching system. The models obtained can be applied in the clinic to predict the color change after bleaching.
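Chroma and hue angle are derived from the CIELAB a* and b* coordinates by the standard polar conversion: C* = sqrt(a*² + b*²) and h° = atan2(b*, a*) expressed in degrees. A minimal sketch follows; the function name and the sample coordinates are hypothetical, and the whiteness index W* used in the study is not computed here.

```python
import math


def chroma_and_hue(a_star, b_star):
    """Return (C*, h in degrees within [0, 360)) from CIELAB a* and b*."""
    chroma = math.hypot(a_star, b_star)
    hue_deg = math.degrees(math.atan2(b_star, a_star)) % 360.0
    return chroma, hue_deg


# Hypothetical a*, b* coordinates (illustration only)
c_star, h_deg = chroma_and_hue(3.0, 4.0)
print(c_star, round(h_deg, 2))
```

Using `atan2` rather than a plain arctangent of b*/a* keeps the hue angle in the correct quadrant when a* is negative or zero.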
Cole-Cole, linear and multivariate modeling of capacitance data for on-line monitoring of biomass.
Dabros, Michal; Dennewald, Danielle; Currie, David J; Lee, Mark H; Todd, Robert W; Marison, Ian W; von Stockar, Urs
2009-02-01
This work evaluates three techniques of calibrating capacitance (dielectric) spectrometers used for on-line monitoring of biomass: modeling of cell properties using the theoretical Cole-Cole equation, linear regression of dual-frequency capacitance measurements on biomass concentration, and multivariate (PLS) modeling of scanning dielectric spectra. The performance and robustness of each technique is assessed during a sequence of validation batches in two experimental settings of differing signal noise. In more noisy conditions, the Cole-Cole model had significantly higher biomass concentration prediction errors than the linear and multivariate models. The PLS model was the most robust in handling signal noise. In less noisy conditions, the three models performed similarly. Estimates of the mean cell size were done additionally using the Cole-Cole and PLS models, the latter technique giving more satisfactory results.
USDA-ARS's Scientific Manuscript database
High-throughput phenotyping (HTP) platforms can be used to measure traits that are genetically correlated with wheat (Triticum aestivum L.) grain yield across time. Incorporating such secondary traits in the multivariate pedigree and genomic prediction models would be desirable to improve indirect s...
NASA Astrophysics Data System (ADS)
Rounaghi, Mohammad Mahdi; Abbaszadeh, Mohammad Reza; Arashi, Mohammad
2015-11-01
One of the most important topics of interest to investors is stock price changes. Investors whose goals are long term are sensitive to stock price and its changes and react to them. In this regard, we used the multivariate adaptive regression splines (MARS) model and a semi-parametric splines technique for predicting stock price in this study. The MARS model, as a nonparametric method, is an adaptive method for regression and is suited to problems with high dimensions and several variables. The semi-parametric splines technique is based on smoothing splines, a nonparametric regression method. In this study, we used 40 variables (30 accounting variables and 10 economic variables) for predicting stock price using the MARS model and the semi-parametric splines technique. After investigating the models, we selected 4 accounting variables (book value per share, predicted earnings per share, P/E ratio and risk) as variables influencing stock price prediction using the MARS model. After fitting the semi-parametric splines technique, only 4 accounting variables (dividends, net EPS, EPS forecast and P/E ratio) were selected as variables effective in forecasting stock prices.
Madaniyazi, Lina; Guo, Yuming; Chen, Renjie; Kan, Haidong; Tong, Shilu
2016-01-01
Estimating the burden of mortality associated with particulates requires knowledge of exposure-response associations. However, the evidence on exposure-response associations is limited in many cities, especially in developing countries. In this study, we predicted associations of particulates smaller than 10 μm in aerodynamic diameter (PM10) with mortality in 73 Chinese cities. The meta-regression model was used to test and quantify which city-specific characteristics contributed significantly to the heterogeneity of PM10-mortality associations for 16 Chinese cities. Then, those city-specific characteristics with statistically significant regression coefficients were treated as independent variables to build multivariate meta-regression models. The model with the best fitness was used to predict PM10-mortality associations in 73 Chinese cities in 2010. Mean temperature, PM10 concentration and green space per capita could best explain the heterogeneity in PM10-mortality associations. Based on city-specific characteristics, we were able to develop multivariate meta-regression models to predict associations between air pollutants and health outcomes reasonably well. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Harris, C. D.; Profeta, Luisa T. M.; Akpovo, Codjo A.; Johnson, Lewis; Stowe, Ashley C.
2017-05-01
A calibration model was created to illustrate the detection capabilities of laser ablation molecular isotopic spectroscopy (LAMIS) discrimination in isotopic analysis. The sample set contained boric acid pellets that varied in isotopic concentrations of ¹⁰B and ¹¹B. Each sample set was interrogated with a Q-switched Nd:YAG ablation laser operating at 532 nm. A minimum of four band heads of the β system B²Σ → X²Σ transitions were identified and verified with previous literature on BO molecular emission lines. Isotopic shifts were observed in the spectra for each transition and used as the predictors in the calibration model. The spectra along with their respective ¹⁰B/¹¹B isotopic ratios were analyzed using Partial Least Squares Regression (PLSR). A novel IUPAC approach for determining a multivariate Limit of Detection (LOD) interval was used to predict the detection of the desired isotopic ratios. The predicted multivariate LOD is dependent on the variation of the instrumental signal and other composites in the calibration model space.
Lemmens, Louise; Kos, Snjezana; Beijer, Cornelis; Brinkman, Jacoline W; van der Horst, Frans A L; van den Hoven, Leonie; Kieslinger, Dorit C; van Trooyen-van Vrouwerff, Netty J; Wolthuis, Albert; Hendriks, Jan C M; Wetzels, Alex M M
2016-06-01
To investigate the value of sperm parameters to predict an ongoing pregnancy outcome in couples treated with intrauterine insemination (IUI), during a methodologically stable period of time. Retrospective, observational study with logistic regression analyses. University hospital. A total of 1,166 couples visiting the fertility laboratory for their first IUI episode, including 4,251 IUI cycles. None. Sperm morphology, total progressively motile sperm count (TPMSC), and number of inseminated progressively motile spermatozoa (NIPMS); odds ratios (ORs) of the sperm parameters after the first IUI cycle and the first finished IUI episode; discriminatory accuracy of the multivariable model. None of the sperm parameters was of predictive value for pregnancy after the first IUI cycle. In the first finished IUI episode, a positive relationship was found for ≤4% of morphologically normal spermatozoa (OR 1.39) and a moderate NIPMS (5-10 million; OR 1.73). Low NIPMS showed a negative relation (≤1 million; OR 0.42). The TPMSC had no predictive value. The multivariable model (i.e., sperm morphology, NIPMS, female age, male age, and the number of cycles in the episode) had a moderate discriminatory accuracy (area under the curve 0.73). Intrauterine insemination is especially relevant for couples with moderate male factor infertility (sperm morphology ≤4%, NIPMS 5-10 million). In the multivariable model, however, the predictive power of these sperm parameters is rather low. Copyright © 2016 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
The extension of total gain (TG) statistic in survival models: properties and applications.
Choodari-Oskooei, Babak; Royston, Patrick; Parmar, Mahesh K B
2015-07-01
The results of multivariable regression models are usually summarized in the form of parameter estimates for the covariates, goodness-of-fit statistics, and the relevant p-values. These statistics do not inform us about whether covariate information will lead to any substantial improvement in prediction. Predictive ability measures can be used for this purpose since they provide important information about the practical significance of prognostic factors. R²-type indices are the most familiar forms of such measures in survival models, but they all have limitations and none is widely used. In this paper, we extend the total gain (TG) measure, proposed for a logistic regression model, to survival models and explore its properties using simulations and real data. TG is based on the binary regression quantile plot, otherwise known as the predictiveness curve. Standardised TG ranges from 0 (no explanatory power) to 1 ('perfect' explanatory power). The results of our simulations show that unlike many of the other R²-type predictive ability measures, TG is independent of random censoring. It increases as the effect of a covariate increases and can be applied to different types of survival models, including models with time-dependent covariate effects. We also apply TG to quantify the predictive ability of multivariable prognostic models developed in several disease areas. Overall, TG performs well in our simulation studies and can be recommended as a measure to quantify the predictive ability in survival models.
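A minimal sketch of the standardised TG for the binary-outcome case may help fix ideas. It assumes the usual definition from the predictiveness-curve literature: TG is the mean absolute deviation of the predicted risks from their average, and it is standardised by its maximum, 2ρ(1 − ρ), where ρ is the average risk. The function name is hypothetical, and the survival extension described in the abstract (risk at a chosen time point, with censoring) is not reproduced here.

```python
def standardised_total_gain(risks):
    """Standardised total gain for a list of predicted risks in [0, 1].

    TG is the area between the predictiveness curve and the horizontal
    line at the mean risk, estimated as the mean absolute deviation of
    the risks from their mean.
    """
    n = len(risks)
    mean_risk = sum(risks) / n
    if mean_risk <= 0.0 or mean_risk >= 1.0:
        raise ValueError("mean risk must lie strictly between 0 and 1")
    total_gain = sum(abs(r - mean_risk) for r in risks) / n
    max_total_gain = 2.0 * mean_risk * (1.0 - mean_risk)
    return total_gain / max_total_gain


# A model that assigns everyone the same risk has no explanatory power,
# while a model that separates subjects into risks of 0 and 1 is 'perfect'.
uninformative = standardised_total_gain([0.3, 0.3, 0.3, 0.3])
perfect = standardised_total_gain([0.0, 0.0, 1.0, 1.0])
```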
NASA Astrophysics Data System (ADS)
Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.
2009-08-01
In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
A multivariable model for predicting the frictional behaviour and hydration of the human skin.
Veijgen, N K; van der Heide, E; Masen, M A
2013-08-01
The frictional characteristics of skin-object interactions are important when handling objects, in the assessment of perception and comfort of products and materials and in the origins and prevention of skin injuries. In this study, based on statistical methods, a quantitative model is developed that describes the friction behaviour of human skin as a function of the subject characteristics, contact conditions, the properties of the counter material as well as environmental conditions. Although the frictional behaviour of human skin is a multivariable problem, in the literature the variables that are associated with skin friction have been studied using univariable methods. In this work, multivariable models for the static and dynamic coefficients of friction as well as for the hydration of the skin are presented. A total of 634 skin-friction measurements were performed using a recently developed tribometer. Using a statistical analysis, previously defined potential influential variables were linked to the static and dynamic coefficient of friction and to the hydration of the skin, resulting in three predictive quantitative models that describe the friction behaviour and the hydration of human skin respectively. Increased dynamic coefficients of friction were obtained from older subjects, on the index finger, with materials with a higher surface energy at higher room temperatures, whereas lower dynamic coefficients of friction were obtained at lower skin temperatures, on the temple with rougher contact materials. The static coefficient of friction increased with higher skin hydration, increasing age, on the index finger, with materials with a higher surface energy and at higher ambient temperatures. The hydration of the skin was associated with the skin temperature, anatomical location, presence of hair on the skin and the relative air humidity. Predictive models have been derived for the static and dynamic coefficient of friction using a multivariable approach. 
These two coefficients of friction show a strong correlation. Consequently, the two multivariable models resemble each other, with the static coefficient of friction being on average 18% lower than the dynamic coefficient of friction. The multivariable models in this study can be used to describe the data set that was the basis for this study. Care should be taken when generalising these results. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
The EXCITE Trial: Predicting a Clinically Meaningful Motor Activity Log Outcome
Park, Si-Woon; Wolf, Steven L.; Blanton, Sarah; Winstein, Carolee; Nichols-Larsen, Deborah S.
2013-01-01
Background and Objective: This study determined which baseline clinical measurements best predicted a predefined clinically meaningful outcome on the Motor Activity Log (MAL) and developed a predictive multivariate model to determine outcome after 2 weeks of constraint-induced movement therapy (CIMT) and 12 months later using the database from participants in the Extremity Constraint Induced Therapy Evaluation (EXCITE) Trial. Methods: A clinically meaningful CIMT outcome was defined as achieving higher than 3 on the MAL Quality of Movement (QOM) scale. Predictive variables included baseline MAL, Wolf Motor Function Test (WMFT), the sensory and motor portion of the Fugl-Meyer Assessment (FMA), spasticity, visual perception, age, gender, type of stroke, concordance, and time after stroke. Significant predictors identified by univariate analysis were used to develop the multivariate model. Predictive equations were generated and odds ratios for predictors were calculated from the multivariate model. Results: Pretreatment motor function measured by MAL QOM, WMFT, and FMA were significantly associated with outcome immediately after CIMT. Pretreatment MAL QOM, WMFT, proprioception, and age were significantly associated with outcome after 12 months. Each unit of higher pretreatment MAL QOM score and each unit of faster pretreatment WMFT log mean time improved the probability of achieving a clinically meaningful outcome by 7 and 3 times at posttreatment, and 5 and 2 times after 12 months, respectively. Patients with impaired proprioception had a 20% probability of achieving a clinically meaningful outcome compared with those with intact proprioception. Conclusions: Baseline clinical measures of motor and sensory function can be used to predict a clinically meaningful outcome after CIMT. PMID:18780883
Black, L E; Brion, G M; Freitas, S J
2007-06-01
Predicting the presence of enteric viruses in surface waters is a complex modeling problem. Multiple water quality parameters that indicate the presence of human fecal material, the load of fecal material, and the amount of time fecal material has been in the environment are needed. This paper presents the results of a multiyear study of raw-water quality at the inlet of a potable-water plant that related 17 physical, chemical, and biological indices to the presence of enteric viruses as indicated by cytopathic changes in cell cultures. It was found that several simple, multivariate logistic regression models that could reliably identify observations of the presence or absence of total culturable virus could be fitted. The best models developed combined a fecal age indicator (the atypical coliform [AC]/total coliform [TC] ratio), the detectable presence of a human-associated sterol (epicoprostanol) to indicate the fecal source, and one of several fecal load indicators (the levels of Giardia species cysts, coliform bacteria, and coprostanol). The best fit to the data was found when the AC/TC ratio, the presence of epicoprostanol, and the density of fecal coliform bacteria were input into a simple, multivariate logistic regression equation, resulting in 84.5% and 78.6% accuracies for the identification of the presence and absence of total culturable virus, respectively. The AC/TC ratio was the most influential input variable in all of the models generated, but producing the best prediction required additional input related to the fecal source and the fecal load. The potential for replacing microbial indicators of fecal load with levels of coprostanol was proposed and evaluated by multivariate logistic regression modeling for the presence and absence of virus.
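The following sketch shows how a fitted logistic regression equation of this kind could be applied and scored separately for virus-present and virus-absent observations. The three inputs follow the abstract (AC/TC ratio, epicoprostanol detected or not, fecal coliform density), but the intercept, the coefficients, the 0.5 threshold, and the sample values are hypothetical placeholders, not the published model.

```python
import math


def logistic_probability(intercept, coefficients, features):
    """Probability of virus presence from a logistic regression equation."""
    linear = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


def presence_absence_accuracy(probabilities, observed, threshold=0.5):
    """Return (accuracy on virus-present samples, accuracy on virus-absent samples).

    Assumes at least one present and one absent observation.
    """
    present_hits = present_total = absent_hits = absent_total = 0
    for p, y in zip(probabilities, observed):
        predicted = 1 if p >= threshold else 0
        if y == 1:
            present_total += 1
            present_hits += int(predicted == 1)
        else:
            absent_total += 1
            absent_hits += int(predicted == 0)
    return present_hits / present_total, absent_hits / absent_total


# Hypothetical coefficients for: AC/TC ratio, epicoprostanol detected (0/1),
# log10 fecal coliform density. These are illustrative values only.
INTERCEPT = -1.0
COEFFICIENTS = [-0.2, 1.5, 0.8]

samples = [[2.0, 1, 2.5], [12.0, 0, 1.0], [4.0, 1, 1.8]]
probabilities = [logistic_probability(INTERCEPT, COEFFICIENTS, s) for s in samples]
```

Reporting the two accuracies separately, as the abstract does, avoids hiding poor detection of the rarer class behind a single overall figure.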
Corron, Louise; Marchal, François; Condemi, Silvana; Telmon, Norbert; Chaumoitre, Kathia; Adalian, Pascal
2018-05-31
Subadult age estimation should rely on sampling and statistical protocols capturing development variability for more accurate age estimates. In this perspective, measurements were taken on the fifth lumbar vertebrae and/or clavicles of 534 French males and females aged 0-19 years and the ilia of 244 males and females aged 0-12 years. These variables were fitted in nonparametric multivariate adaptive regression splines (MARS) models with 95% prediction intervals (PIs) of age. The models were tested on two independent samples from Marseille and the Luis Lopes reference collection from Lisbon. Models using ilium width and module, maximum clavicle length, and lateral vertebral body heights were more than 92% accurate. Precision was lower for postpubertal individuals. Integrating punctual nonlinearities of the relationship between age and the variables and dynamic prediction intervals incorporated the normal increase in interindividual growth variability (heteroscedasticity of variance) with age for more biologically accurate predictions. © 2018 American Academy of Forensic Sciences.
Multivariate neural biomarkers of emotional states are categorically distinct
Kragel, Philip A.
2015-01-01
Understanding how emotions are represented neurally is a central aim of affective neuroscience. Despite decades of neuroimaging efforts addressing this question, it remains unclear whether emotions are represented as distinct entities, as predicted by categorical theories, or are constructed from a smaller set of underlying factors, as predicted by dimensional accounts. Here, we capitalize on multivariate statistical approaches and computational modeling to directly evaluate these theoretical perspectives. We elicited discrete emotional states using music and films during functional magnetic resonance imaging scanning. Distinct patterns of neural activation predicted the emotion category of stimuli and tracked subjective experience. Bayesian model comparison revealed that combining dimensional and categorical models of emotion best characterized the information content of activation patterns. Surprisingly, categorical and dimensional aspects of emotion experience captured unique and opposing sources of neural information. These results indicate that diverse emotional states are poorly differentiated by simple models of valence and arousal, and that activity within separable neural systems can be mapped to unique emotion categories. PMID:25813790
Bayesian transformation cure frailty models with multivariate failure time data.
Yin, Guosheng
2008-12-10
We propose a class of transformation cure frailty models to accommodate a survival fraction in multivariate failure time data. Established through a general power transformation, this family of cure frailty models includes the proportional hazards and the proportional odds modeling structures as two special cases. Within the Bayesian paradigm, we obtain the joint posterior distribution and the corresponding full conditional distributions of the model parameters for the implementation of Gibbs sampling. Model selection is based on the conditional predictive ordinate statistic and deviance information criterion. As an illustration, we apply the proposed method to a real data set from dentistry.
Development and evaluation of height diameter at breast models for native Chinese Metasequoia.
Liu, Mu; Feng, Zhongke; Zhang, Zhixiang; Ma, Chenghui; Wang, Mingming; Lian, Bo-Ling; Sun, Renjie; Zhang, Li
2017-01-01
Accurate tree height and diameter at breast height (dbh) are important input variables for growth and yield models. A total of 5503 Chinese Metasequoia trees were used in this study. We studied 53 fitted models, of which 7 were linear models and 46 were non-linear models. These models were divided into two groups of single models and multivariate models according to the number of independent variables. The results show that the allometry equation of tree height which has diameter at breast height as independent variable can better reflect the change of tree height; in addition the prediction accuracy of the multivariate composite models is higher than that of the single variable models. Although tree age is not the most important variable in the study of the relationship between tree height and dbh, the consideration of tree age when choosing models and parameters in model selection can make the prediction of tree height more accurate. The amount of data is also an important parameter that can improve the reliability of models. Other variables such as tree height, main dbh and altitude, etc., can also affect models. In this study, the method of developing the recommended models for predicting the tree height of native Metasequoias aged 50–485 years is statistically reliable and can be used for reference in predicting the growth and production of mature native Metasequoia. PMID:28817600
Hirai, Toshinori; Itoh, Toshimasa; Kimura, Toshimi; Echizen, Hirotoshi
2018-06-06
Febuxostat is an active xanthine oxidase (XO) inhibitor that is widely used to treat hyperuricemia. We aimed to evaluate the predictive performance of a pharmacokinetic-pharmacodynamic (PK-PD) model for hypouricemic effects of febuxostat. Previously, we have formulated a PK-PD model for predicting hypouricemic effects of febuxostat as a function of baseline serum urate levels, body weight, renal function, and drug dose using datasets reported in preapproval studies (Hirai T et al., Biol Pharm Bull 2016; 39: 1013-21). Using an updated model with sensitivity analysis, we examined the predictive performance of the PK-PD model using datasets obtained from the medical records of patients who received febuxostat from March 2011 to December 2015 at Tokyo Women's Medical University Hospital. Multivariate regression analysis was performed to explore clinical variables to improve the predictive performance of the model. A total of 1,199 serum urate data were retrieved from 168 patients (age: 60.5 ± 17.7 years, 71.4% males) who received febuxostat as hyperuricemia treatment. There was a significant correlation (r = 0.68, p < 0.01) between serum urate levels observed and those predicted by the modified PK-PD model. A multivariate regression analysis revealed that the predictive performance of the model may be improved further by considering comorbidities, such as diabetes mellitus, estimated glomerular filtration rate (eGFR), and co-administration of loop diuretics (r = 0.77, p < 0.01). The PK-PD model may be useful for predicting individualized maintenance doses of febuxostat in real-world patients.
Nomogram Prediction of Overall Survival After Curative Irradiation for Uterine Cervical Cancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seo, YoungSeok; Yoo, Seong Yul; Kim, Mi-Sook
Purpose: The purpose of this study was to develop a nomogram capable of predicting the probability of 5-year survival after radical radiotherapy (RT) without chemotherapy for uterine cervical cancer. Methods and Materials: We retrospectively analyzed 549 patients that underwent radical RT for uterine cervical cancer between March 1994 and April 2002 at our institution. Multivariate analysis using Cox proportional hazards regression was performed and this Cox model was used as the basis for the devised nomogram. The model was internally validated for discrimination and calibration by bootstrap resampling. Results: By multivariate regression analysis, the model showed that age, hemoglobin level before RT, Federation Internationale de Gynecologie Obstetrique (FIGO) stage, maximal tumor diameter, lymph node status, and RT dose at Point A significantly predicted overall survival. The survival prediction model demonstrated good calibration and discrimination. The bootstrap-corrected concordance index was 0.67. The predictive ability of the nomogram proved to be superior to FIGO stage (p = 0.01). Conclusions: The devised nomogram offers a significantly better level of discrimination than the FIGO staging system. In particular, it improves predictions of survival probability and could be useful for counseling patients, choosing treatment modalities and schedules, and designing clinical trials. However, before this nomogram is used clinically, it should be externally validated.
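For readers unfamiliar with the concordance index quoted above, a minimal sketch of Harrell's C for right-censored data follows. It is an illustration under simplifying assumptions: pairs with tied survival times are skipped, the bootstrap optimism correction used in the study is not reproduced, and the function name and sample data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       : observed follow-up times
    events      : 1 if death was observed, 0 if the subject was censored
    risk_scores : model output, where a higher score means a worse prognosis

    A pair (i, j) is comparable when subject i died before time j.
    Pairs with tied times are ignored (a simplification).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data: the second subject is censored at time 2.
c_index = concordance_index(times=[1, 2, 3], events=[1, 0, 1],
                            risk_scores=[3.0, 1.0, 2.0])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported 0.67 indicates moderate discrimination.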
Lee, Byeong-Ju; Zhou, Yaoyao; Lee, Jae Soung; Shin, Byeung Kon; Seo, Jeong-Ah; Lee, Doyup; Kim, Young-Suk
2018-01-01
The ability to determine the origin of soybeans is an important issue following the inclusion of this information in the labeling of agricultural food products becoming mandatory in South Korea in 2017. This study was carried out to construct a prediction model for discriminating Chinese and Korean soybeans using Fourier-transform infrared (FT-IR) spectroscopy and multivariate statistical analysis. The optimal prediction models for discriminating soybean samples were obtained by selecting appropriate scaling methods, normalization methods, variable influence on projection (VIP) cutoff values, and wave-number regions. The factors for constructing the optimal partial-least-squares regression (PLSR) prediction model were using second derivatives, vector normalization, unit variance scaling, and the 4000–400 cm⁻¹ region (excluding water vapor and carbon dioxide). The PLSR model for discriminating Chinese and Korean soybean samples had the best predictability when a VIP cutoff value was not applied. When Chinese soybean samples were identified, a PLSR model that has the lowest root-mean-square error of the prediction value was obtained using a VIP cutoff value of 1.5. The optimal PLSR prediction model for discriminating Korean soybean samples was also obtained using a VIP cutoff value of 1.5. This is the first study that has combined FT-IR spectroscopy with normalization methods, VIP cutoff values, and selected wave-number regions for discriminating Chinese and Korean soybeans. PMID:29689113
Feng, Zhong-xiang; Lu, Shi-sheng; Zhang, Wei-hua; Zhang, Nan-nan
2014-01-01
To build a combined model that follows the variation pattern of road traffic accident death toll data, reflects the influence of multiple factors on traffic accidents, and improves prediction accuracy, a Verhulst model was built from the death toll of road traffic accidents in China from 2002 to 2011; car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors for a multivariate linear regression model of the death toll. The two models were then combined into a weighted prediction model. The Shapley value method was applied to calculate the weight coefficients by assessing each model's contribution. Finally, the combined model was used to recalculate the death toll from 2002 to 2011 and was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data but also quantify the degree of influence of each factor on the death toll, and that it had high accuracy and strong practicability. PMID:25610454
Binquet, C; Abrahamowicz, M; Mahboubi, A; Jooste, V; Faivre, J; Bonithon-Kopp, C; Quantin, C
2008-12-30
Flexible survival models, which avoid assumptions about hazards proportionality (PH) or linearity of continuous covariates effects, bring the issues of model selection to a new level of complexity. Each 'candidate covariate' requires inter-dependent decisions regarding (i) its inclusion in the model, and representation of its effects on the log hazard as (ii) either constant over time or time-dependent (TD) and, for continuous covariates, (iii) either loglinear or non-loglinear (NL). Moreover, 'optimal' decisions for one covariate depend on the decisions regarding others. Thus, some efficient model-building strategy is necessary. We carried out an empirical study of the impact of the model selection strategy on the estimates obtained in flexible multivariable survival analyses of prognostic factors for mortality in 273 gastric cancer patients. We used 10 different strategies to select alternative multivariable parametric as well as spline-based models, allowing flexible modeling of non-parametric (TD and/or NL) effects. We employed 5-fold cross-validation to compare the predictive ability of alternative models. All flexible models indicated significant non-linearity and changes over time in the effect of age at diagnosis. Conventional 'parametric' models suggested the lack of period effect, whereas more flexible strategies indicated a significant NL effect. Cross-validation confirmed that flexible models predicted mortality better. The resulting differences in the 'final model' selected by various strategies also had an impact on the risk prediction for individual subjects. Overall, our analyses underline (a) the importance of accounting for significant non-parametric effects of covariates and (b) the need for developing accurate model selection strategies for flexible survival analyses. Copyright 2008 John Wiley & Sons, Ltd.
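The 5-fold cross-validation step mentioned above can be sketched as a partition of subject indices into five held-out folds. This is an assumption-laden illustration: the random shuffling, the seed, and the function name are hypothetical, and the model fitting and scoring that would happen inside the loop are left out.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Every subject appears in exactly one test fold, so each prediction
    is made by a model that never saw that subject during fitting.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds)
                 if f != held_out for idx in fold]
        yield train, test


# 273 subjects, as in the gastric cancer cohort described above.
fold_sizes = [len(test) for _, test in k_fold_indices(273, k=5)]
```

Each candidate model-selection strategy would be rerun on every training set, so that the comparison also reflects the variability of the selection itself.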
Belay, T K; Dagnachew, B S; Kowalski, Z M; Ådnøy, T
2017-08-01
Fourier transform mid-infrared (FT-MIR) spectra of milk are commonly used for phenotyping of traits of interest through links developed between the traits and milk FT-MIR spectra. Predicted traits are then used in genetic analysis for ultimate phenotypic prediction using a single-trait mixed model that accounts for cows' circumstances at a given test day. Here, this approach is referred to as indirect prediction (IP). Alternatively, FT-MIR spectral variables can be kept multivariate in the form of factor scores in REML and BLUP analyses. These BLUP predictions, including phenotype (predicted factor scores), were converted to single-trait through calibration outputs; this method is referred to as direct prediction (DP). The main aim of this study was to verify whether mixed modeling of milk spectra in the form of factor scores (DP) gives better prediction of blood β-hydroxybutyrate (BHB) than the univariate approach (IP). Models to predict blood BHB from milk spectra were also developed. Two data sets that contained milk FT-MIR spectra and other information on Polish dairy cattle were used in this study. Data set 1 (n = 826) also contained BHB measured in blood samples, whereas data set 2 (n = 158,028) did not contain measured blood values. Part of data set 1 was used to calibrate a prediction model (n = 496) and the remaining part of data set 1 (n = 330) was used to validate the calibration models, as well as to evaluate the DP and IP approaches. Dimensions of FT-MIR spectra in data set 2 were reduced either into 5 or 10 factor scores (DP) or into a single trait (IP) with calibration outputs. The REML estimates for these factor scores were found using WOMBAT. The BLUP values and predicted BHB for observations in the validation set were computed using the REML estimates. Blood BHB predicted from milk FT-MIR spectra by both approaches were regressed on reference blood BHB that had not been used in the model development. 
Coefficients of determination in cross-validation for untransformed blood BHB were from 0.21 to 0.32, whereas those for the log-transformed BHB were from 0.31 to 0.38. The corresponding estimates in validation were from 0.29 to 0.37 and 0.21 to 0.43, respectively, for untransformed and logarithmic BHB. Contrary to expectation, slightly better predictions of BHB were found when univariate variance structure was used (IP) than when multivariate covariance structures were used (DP). Conclusive remarks on the importance of keeping spectral data in multivariate form for prediction of phenotypes may be found in data sets where the trait of interest has strong relationships with spectral variables. The Authors. Published by the Federation of Animal Science Societies and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
Mwanza, Jean-Claude; Warren, Joshua L; Hochberg, Jessica T; Budenz, Donald L; Chang, Robert T; Ramulu, Pradeep Y
2015-01-01
To determine the ability of frequency doubling technology (FDT) and scanning laser polarimetry with variable corneal compensation (GDx-VCC) to detect glaucoma when used individually and in combination. One hundred ten normal and 114 glaucomatous subjects were tested with FDT C-20-5 screening protocol and the GDx-VCC. The discriminating ability was tested for each device individually and for both devices combined using GDx-NFI, GDx-TSNIT, number of missed points of FDT, and normal or abnormal FDT. Measures of discrimination included sensitivity, specificity, area under the curve (AUC), Akaike's information criterion (AIC), and prediction confidence interval lengths. For detecting glaucoma regardless of severity, the multivariable model resulting from the combination of GDx-TSNIT, number of abnormal points on FDT (NAP-FDT), and the interaction GDx-TSNIT×NAP-FDT (AIC: 88.28, AUC: 0.959, sensitivity: 94.6%, specificity: 89.5%) outperformed the best single-variable model provided by GDx-NFI (AIC: 120.88, AUC: 0.914, sensitivity: 87.8%, specificity: 84.2%). The multivariable model combining GDx-TSNIT, NAP-FDT, and interaction GDx-TSNIT×NAP-FDT consistently provided better discriminating abilities for detecting early, moderate, and severe glaucoma than the best single-variable models. The multivariable model including GDx-TSNIT, NAP-FDT, and the interaction GDx-TSNIT×NAP-FDT provides the best glaucoma prediction compared with all other multivariable and univariable models. Combining the FDT C-20-5 screening protocol and GDx-VCC improves glaucoma detection compared with using GDx or FDT alone.
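The discrimination measures named in the abstract above (AUC, sensitivity, specificity) can be sketched in a few lines. This is a minimal illustration, assuming each subject already has a model score where higher means more likely glaucomatous; the function names and the scores are made up for the example and are not data or code from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via the Mann-Whitney pair-counting identity."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tied pair counts as half a win
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Call a subject positive when score >= threshold; return (sensitivity, specificity)."""
    true_pos = sum(1 for s in scores_pos if s >= threshold)
    true_neg = sum(1 for s in scores_neg if s < threshold)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


# Hypothetical scores, for illustration only.
glaucoma = [0.9, 0.8, 0.7, 0.4]
normal = [0.6, 0.3, 0.2, 0.1]

print(auc(glaucoma, normal))
print(sensitivity_specificity(glaucoma, normal, 0.5))
```

AUC does not depend on a threshold, whereas sensitivity and specificity do, which is one reason abstracts like this one report all three.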
Simulation analysis of adaptive cruise prediction control
NASA Astrophysics Data System (ADS)
Zhang, Li; Cui, Sheng Min
2017-09-01
Predictive control is suitable for multi-variable and multi-constraint system control. In order to discuss the effect of predictive control on the vehicle longitudinal motion, this paper establishes the expected spacing model by combining variable pitch spacing and the safety distance strategy. The model predictive control theory and the optimization method based on secondary planning are designed to obtain and track the best expected acceleration trajectory quickly. Simulation models are established including predictive and adaptive fuzzy control. Simulation results show that predictive control can realize the basic function of the system while ensuring safety. The application of the predictive and fuzzy adaptive algorithms in cruise conditions indicates that the predictive control effect is better.
Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M
2015-02-01
Prediction models are developed to aid healthcare providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision-making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) initiative developed a set of recommendations for the reporting of studies developing, validating or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a web-based survey and revised during a 3-day meeting in June 2011 with methodologists, healthcare professionals and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study, regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 Joint copyright. The Authors and Annals of Internal Medicine. Diabetic Medicine published by John Wiley Ltd. on behalf of Diabetes UK.
Vial, Flavie; Wei, Wei; Held, Leonhard
2016-12-20
In an era of ubiquitous electronic collection of animal health data, multivariate surveillance systems (which concurrently monitor several data streams) should have a greater probability of detecting disease events than univariate systems. However, despite their limitations, univariate aberration detection algorithms are used in most active syndromic surveillance (SyS) systems because of their ease of application and interpretation. On the other hand, a stochastic modelling-based approach to multivariate surveillance offers more flexibility, allowing for the retention of historical outbreaks, for overdispersion and for non-stationarity. While such methods are not new, they are yet to be applied to animal health surveillance data. We applied an example of such a stochastic model, Held and colleagues' two-component model, to two multivariate animal health datasets from Switzerland. In our first application, multivariate time series of the number of laboratory test requests were derived from Swiss animal diagnostic laboratories. We compared the performance of the two-component model to parallel monitoring using an improved Farrington algorithm and found that both methods yield a satisfactorily low false alarm rate. However, the calibration test of the two-component model on the one-step ahead predictions proved satisfactory, making such an approach suitable for outbreak prediction. In our second application, the two-component model was applied to the multivariate time series of the number of cattle abortions and the number of test requests for bovine viral diarrhea (a disease that often results in abortions). We found that there is a two-day lagged effect from the number of abortions to the number of test requests. We further compared the joint modelling and univariate modelling of the number of laboratory test requests time series. The joint modelling approach showed evidence of superiority in terms of forecasting abilities.
Stochastic modelling approaches offer the potential to address more realistic surveillance scenarios through, for example, the inclusion of time series-specific parameters, or of covariates known to have an impact on syndrome counts. Nevertheless, many methodological challenges to multivariate surveillance of animal SyS data remain. Deciding on the amount of corroboration among data streams that is required to escalate into an alert is not a trivial task given the sparse data on the events under consideration (e.g. disease outbreaks).
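A lagged effect between two count series, such as the two-day lag from abortions to test requests reported above, can be explored descriptively before any model is fitted. The sketch below is only a lagged Pearson correlation on synthetic counts; it is not the two-component model used in the study, and the series and function name are assumptions made for illustration.

```python
from statistics import mean


def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag], for lag >= 0."""
    xs = x[:len(x) - lag]  # drop the last `lag` points of the leading series
    ys = y[lag:]           # drop the first `lag` points of the following series
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


# Synthetic daily counts: the second series repeats the first two days later.
abortions = [1, 4, 2, 6, 3, 7, 2, 5, 1, 4]
requests = [0, 0] + abortions[:-2]

# Pick the lag (0 to 4 days) with the strongest correlation.
best_lag = max(range(0, 5), key=lambda k: lagged_correlation(abortions, requests, k))
print(best_lag)
```

A correlation scan like this ignores overdispersion and seasonality, which is part of the case the abstract makes for a model-based approach.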
A Study of Effects of MultiCollinearity in the Multivariable Analysis
Yoo, Wonsuk; Mayberry, Robert; Bae, Sejong; Singh, Karan; (Peter) He, Qinghua; Lillard, James W.
2015-01-01
A multivariable analysis is the most popular approach when investigating associations between risk factors and disease. However, efficiency of multivariable analysis highly depends on correlation structure among predictive variables. When the covariates in the model are not independent of one another, collinearity/multicollinearity problems arise in the analysis, which leads to biased estimation. This work aims to perform a simulation study with various scenarios of different collinearity structures to investigate the effects of collinearity under various correlation structures amongst predictive and explanatory variables and to compare these results with existing guidelines to decide harmful collinearity. Three correlation scenarios among predictor variables are considered: (1) bivariate collinear structure as the simplest collinearity case, (2) multivariate collinear structure where an explanatory variable is correlated with two other covariates, (3) a more realistic scenario when an independent variable can be expressed by various functions including the other variables. PMID:25664257
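For the bivariate collinear case, scenario (1) above, a standard diagnostic is the variance inflation factor, which with exactly two predictors reduces to 1 / (1 - r²), where r is their Pearson correlation. The sketch below assumes that two-predictor setting; the data are invented, and the often-quoted "VIF above 10" cut-off is a rule of thumb rather than a result of this study.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF of either predictor in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Invented predictors, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_collinear = [1.1, 1.9, 3.2, 3.9, 5.1]      # almost a copy of x1
x2_unrelated = [1.0, -2.0, 2.0, -2.0, 1.0]    # uncorrelated with x1

print(vif_two_predictors(x1, x2_collinear))   # large: coefficient variance is inflated
print(vif_two_predictors(x1, x2_unrelated))   # 1.0: no inflation
```

With more than two predictors the same quantity is obtained by regressing each predictor on all the others and using that regression's R² in place of r².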
A Study of Effects of MultiCollinearity in the Multivariable Analysis.
Yoo, Wonsuk; Mayberry, Robert; Bae, Sejong; Singh, Karan; Peter He, Qinghua; Lillard, James W
2014-10-01
A multivariable analysis is the most popular approach when investigating associations between risk factors and disease. However, efficiency of multivariable analysis highly depends on correlation structure among predictive variables. When the covariates in the model are not independent of one another, collinearity/multicollinearity problems arise in the analysis, which leads to biased estimation. This work aims to perform a simulation study with various scenarios of different collinearity structures to investigate the effects of collinearity under various correlation structures amongst predictive and explanatory variables and to compare these results with existing guidelines to decide harmful collinearity. Three correlation scenarios among predictor variables are considered: (1) bivariate collinear structure as the simplest collinearity case, (2) multivariate collinear structure where an explanatory variable is correlated with two other covariates, (3) a more realistic scenario when an independent variable can be expressed by various functions including the other variables.
DiMagno, Matthew J; Spaete, Joshua P; Ballard, Darren D; Wamsteker, Erik-Jan; Saini, Sameer D
2013-08-01
We investigated which variables were independently associated with protection against or development of post-endoscopic retrograde cholangiopancreatography (ERCP) pancreatitis (PEP) and severity of PEP. Subsequently, we derived predictive risk models for PEP. In a case-control design, 6505 patients had 8264 ERCPs, 211 patients had PEP, and 22 patients had severe PEP. We randomly selected 348 non-PEP controls. We examined 7 established and 9 investigational variables. In univariate analysis, 7 variables predicted PEP: younger age, female sex, suspected sphincter of Oddi dysfunction (SOD), pancreatic sphincterotomy, moderate-difficult cannulation (MDC), pancreatic stent placement, and lower Charlson score. Protective variables were current smoking, former drinking, diabetes, and chronic liver disease (CLD, biliary/transplant complications). Multivariate analysis identified 7 independent variables for PEP, 3 protective (current smoking, CLD-biliary, CLD-transplant/hepatectomy complications) and 4 predictive (younger age, suspected SOD, pancreatic sphincterotomy, MDC). Pre- and post-ERCP risk models of 7 variables have a C-statistic of 0.74. Removing age (seventh variable) did not significantly affect the predictive value (C-statistic of 0.73) and reduced model complexity. Severity of PEP was not associated with any variables by multivariate analysis. By using the newly identified protective variables with 3 predictive variables, we derived 2 risk models with a higher predictive value for PEP compared to prior studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.
Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least square regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%, whereas the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions (equivalence ratio > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.
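Leave-one-out cross validation, mentioned above for the PLS-R model, refits the calibration once per sample with that sample held out and then scores the held-out prediction. The sketch below shows the procedure with a one-variable straight-line calibration standing in for PLS-R, purely to keep it dependency-free; the signal values and function names are assumptions, not data or code from the paper.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def loocv_rmse(x, y):
    """Root mean square error of leave-one-out predictions."""
    squared_errors = []
    for i in range(len(x)):
        # Refit on every sample except the i-th, then predict the i-th.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Synthetic calibration data: a hypothetical signal versus equivalence ratio.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
ratio_exact = [0.7, 0.9, 1.1, 1.3, 1.5]      # exactly linear in the signal
ratio_noisy = [0.72, 0.88, 1.13, 1.27, 1.52]  # same trend with small perturbations

print(loocv_rmse(signal, ratio_exact))
print(loocv_rmse(signal, ratio_noisy))
```

Because each prediction comes from a model that never saw that sample, the resulting error is a less optimistic estimate than the fit error on the calibration set itself.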
All-Possible-Subsets for MANOVA and Factorial MANOVAs: Less than a Weekend Project
ERIC Educational Resources Information Center
Nimon, Kim; Zientek, Linda Reichwein; Kraha, Amanda
2016-01-01
Multivariate techniques are increasingly popular as researchers attempt to accurately model a complex world. MANOVA is a multivariate technique used to investigate the dimensions along which groups differ, and how these dimensions may be used to predict group membership. A concern in a MANOVA analysis is to determine if a smaller subset of…
Efficient Global Aerodynamic Modeling from Flight Data
NASA Technical Reports Server (NTRS)
Morelli, Eugene A.
2012-01-01
A method for identifying global aerodynamic models from flight data in an efficient manner is explained and demonstrated. A novel experiment design technique was used to obtain dynamic flight data over a range of flight conditions with a single flight maneuver. Multivariate polynomials and polynomial splines were used with orthogonalization techniques and statistical modeling metrics to synthesize global nonlinear aerodynamic models directly and completely from flight data alone. Simulation data and flight data from a subscale twin-engine jet transport aircraft were used to demonstrate the techniques. Results showed that global multivariate nonlinear aerodynamic dependencies could be accurately identified using flight data from a single maneuver. Flight-derived global aerodynamic model structures, model parameter estimates, and associated uncertainties were provided for all six nondimensional force and moment coefficients for the test aircraft. These models were combined with a propulsion model identified from engine ground test data to produce a high-fidelity nonlinear flight simulation very efficiently. Prediction testing using a multi-axis maneuver showed that the identified global model accurately predicted aircraft responses.
Establishment of a mathematic model for predicting malignancy in solitary pulmonary nodules.
Zhang, Man; Zhuo, Na; Guo, Zhanlin; Zhang, Xingguang; Liang, Wenhua; Zhao, Sheng; He, Jianxing
2015-10-01
The aim of this study was to establish a model for predicting the probability of malignancy in solitary pulmonary nodules (SPNs) and provide guidance for the diagnosis and follow-up intervention of SPNs. We retrospectively analyzed the clinical data and computed tomography (CT) images of 294 patients with a clear pathological diagnosis of SPN. Multivariate logistic regression analysis was used to screen independent predictors of the probability of malignancy in the SPN and to establish a model for predicting malignancy in SPNs. Then, another 120 SPN patients who did not participate in the model establishment were chosen as group B and used to verify the accuracy of the prediction model. Multivariate logistic regression analysis showed that there were significant differences in age, smoking history, maximum diameter of nodules, spiculation, clear borders, and Cyfra21-1 levels between subgroups with benign and malignant SPNs (P<0.05). These factors were identified as independent predictors of malignancy in SPNs. The area under the curve (AUC) was 0.910 [95% confidence interval (CI), 0.857-0.963] in the model with Cyfra21-1, significantly better than 0.812 (95% CI, 0.763-0.861) in the model without Cyfra21-1 (P=0.008). The area under the receiver operating characteristic (ROC) curve of our model was significantly higher than that of the Mayo model, the VA model and the Peking University People's (PKUPH) model. The difference between our model (AUC = 0.910) and the Brock model (AUC = 0.878, P = 0.350) was not statistically significant. Adding Cyfra21-1 to the model could improve prediction. The prediction model established in this study can be used to assess the probability of malignancy in SPNs, thereby providing help for the diagnosis of SPNs and the selection of follow-up interventions.
Analysis/forecast experiments with a multivariate statistical analysis scheme using FGGE data
NASA Technical Reports Server (NTRS)
Baker, W. E.; Bloom, S. C.; Nestler, M. S.
1985-01-01
A three-dimensional, multivariate, statistical analysis method, optimal interpolation (OI) is described for modeling meteorological data from widely dispersed sites. The model was developed to analyze FGGE data at the NASA-Goddard Laboratory of Atmospherics. The model features a multivariate surface analysis over the oceans, including maintenance of the Ekman balance and a geographically dependent correlation function. Preliminary comparisons are made between the OI model and similar schemes employed at the European Center for Medium Range Weather Forecasts and the National Meteorological Center. The OI scheme is used to provide input to a GCM, and model error correlations are calculated for forecasts of 500 mb vertical water mixing ratios and the wind profiles. Comparisons are made between the predictions and measured data. The model is shown to be as accurate as a successive corrections model out to 4.5 days.
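The core of optimal interpolation is a weighted correction of a background (first-guess) value toward an observation, with the weight set by the two error variances. The sketch below is the scalar, single-observation case with the observation located at the analysis point; the scheme described above is three-dimensional and multivariate, so this is only the simplest instance, and the numbers are invented.

```python
def oi_update(background, observation, var_background, var_observation):
    """Scalar optimal-interpolation analysis for one co-located observation.

    Returns the analysis value and its error variance.
    """
    # Weight given to the observation increment; it approaches 1 when the
    # background is much less certain than the observation.
    weight = var_background / (var_background + var_observation)
    analysis = background + weight * (observation - background)
    var_analysis = (1.0 - weight) * var_background
    return analysis, var_analysis


# Invented values: background 10.0 (error variance 3.0), observation 14.0 (error variance 1.0).
print(oi_update(10.0, 14.0, 3.0, 1.0))
```

The analysis error variance is always smaller than both input variances, which is the sense in which the combination is optimal under the stated error statistics.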
Fadzillah, Nurrulhidayah Ahmad; Man, Yaakob bin Che; Rohman, Abdul; Rosman, Arieff Salleh; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi
2015-01-01
The authentication of food products with respect to the presence of components not permitted by certain religions, such as lard, is very important. In this study, we used proton nuclear magnetic resonance (¹H-NMR) spectroscopy for the analysis of butter adulterated with lard by simultaneous quantification of all proton-bearing compounds, and consequently all relevant sample classes. Since the spectra obtained were too complex to be analyzed visually, the classification of spectra was carried out. The multivariate calibration of partial least squares (PLS) regression was used for modelling the relationship between the actual and predicted values of lard. The model yielded the highest regression coefficient (R²) of 0.998 and the lowest root mean square error of calibration (RMSEC) of 0.0091% and root mean square error of prediction (RMSEP) of 0.0090, respectively. Cross validation testing evaluates the predictive power of the model. The PLS model was shown to be a good model, as the intercepts of R²Y and Q²Y were 0.0853 and -0.309, respectively.
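The figures of merit quoted above (R², RMSEC, RMSEP) all compare actual with predicted values; RMSEC is computed on the calibration samples and RMSEP on samples held out from calibration. A minimal sketch follows, assuming the plain mean-squared form with n in the denominator (some chemometrics software applies a degrees-of-freedom correction instead); the numbers are invented.

```python
def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)) ** 0.5


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented adulterant levels (%), for illustration only.
actual = [0.0, 10.0, 20.0, 30.0]
predicted = [1.0, 9.0, 21.0, 29.0]

print(rmse(actual, predicted))
print(r_squared(actual, predicted))
```

Calling `rmse` on calibration samples gives an RMSEC-style value and calling it on an independent prediction set gives an RMSEP-style value; a large gap between the two is a common sign of overfitting.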
Prediction of energy expenditure and physical activity in preschoolers
USDA-ARS's Scientific Manuscript database
Accurate, nonintrusive, and feasible methods are needed to predict energy expenditure (EE) and physical activity (PA) levels in preschoolers. Herein, we validated cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on accelerometry and heart rate (HR) ...
Dong, Jian-Jun; Li, Qing-Liang; Yin, Hua; Zhong, Cheng; Hao, Jun-Guang; Yang, Pan-Fei; Tian, Yu-Hong; Jia, Shi-Ru
2014-10-15
Sensory evaluation is regarded as a necessary procedure to ensure a reproducible quality of beer. Meanwhile, high-throughput analytical methods provide a powerful tool to analyse various flavour compounds, such as higher alcohols and esters. In this study, the relationship between flavour compounds and sensory evaluation was established by non-linear models such as partial least squares (PLS), genetic algorithm back-propagation neural network (GA-BP), and support vector machine (SVM). It was shown that SVM with a radial basis function (RBF) kernel had better prediction accuracy for both the calibration set (94.3%) and the validation set (96.2%) than the other models. Relatively lower prediction abilities were observed for GA-BP (52.1%) and PLS (31.7%). In addition, the kernel function of SVM played an essential role in model training: the prediction accuracy of SVM with a polynomial kernel function was 32.9%. As a powerful multivariate statistics method, SVM holds great potential to assess beer quality. Copyright © 2014 Elsevier Ltd. All rights reserved.
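The two SVM kernels compared above differ only in how they score the similarity of two samples. The sketch below writes out the usual textbook forms of both; the parameter names (`gamma`, `degree`, `coef0`) and the sample vectors are assumptions chosen for illustration and are not settings reported by the study.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, y, degree, coef0=1.0):
    """Polynomial kernel: (dot(x, y) + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


# Two hypothetical samples described by two flavour-compound measurements each.
sample_a = [1.0, 2.0]
sample_b = [2.0, 0.0]

print(rbf_kernel(sample_a, sample_a, 0.5))       # identical samples give the maximum, 1.0
print(rbf_kernel(sample_a, sample_b, 0.5))       # decays with distance
print(polynomial_kernel(sample_a, sample_b, 2))  # (2 + 1) ** 2
```

The RBF kernel is bounded between 0 and 1 and depends only on distance, while the polynomial kernel grows with the magnitude of the inputs, so the two can behave very differently on unscaled features.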
User Selection Criteria of Airspace Designs in Flexible Airspace Management
NASA Technical Reports Server (NTRS)
Lee, Hwasoo E.; Lee, Paul U.; Jung, Jaewoo; Lai, Chok Fung
2011-01-01
Westman, Eric; Aguilar, Carlos; Muehlboeck, J-Sebastian; Simmons, Andrew
2013-01-01
Automated structural magnetic resonance imaging (MRI) processing pipelines are gaining popularity for Alzheimer's disease (AD) research. They generate regional volumes, cortical thickness measures and other measures, which can be used as input for multivariate analysis. It is not clear which combination of measures and normalization approach are most useful for AD classification and to predict mild cognitive impairment (MCI) conversion. The current study includes MRI scans from 699 subjects [AD, MCI and controls (CTL)] from the Alzheimer's disease Neuroimaging Initiative (ADNI). The Freesurfer pipeline was used to generate regional volume, cortical thickness, gray matter volume, surface area, mean curvature, gaussian curvature, folding index and curvature index measures. 259 variables were used for orthogonal partial least square to latent structures (OPLS) multivariate analysis. Normalisation approaches were explored and the optimal combination of measures determined. Results indicate that cortical thickness measures should not be normalized, while volumes should probably be normalized by intracranial volume (ICV). Combining regional cortical thickness measures (not normalized) with cortical and subcortical volumes (normalized with ICV) using OPLS gave a prediction accuracy of 91.5 % when distinguishing AD versus CTL. This model prospectively predicted future decline from MCI to AD with 75.9 % of converters correctly classified. Normalization strategy did not have a significant effect on the accuracies of multivariate models containing multiple MRI measures for this large dataset. The appropriate choice of input for multivariate analysis in AD and MCI is of great importance. The results support the use of un-normalised cortical thickness measures and volumes normalised by ICV.
Zubrick, Stephen R.; Taylor, Catherine L.; Christensen, Daniel
2015-01-01
Aims: Oral language is the foundation of literacy. Naturally, policies and practices to promote children’s literacy begin in early childhood and have a strong focus on developing children’s oral language, especially for children with known risk factors for low language ability. The underlying assumption is that children’s progress along the oral to literate continuum is stable and predictable, such that low language ability foretells low literacy ability. This study investigated patterns and predictors of children’s oral language and literacy abilities at 4, 6, 8 and 10 years. The study sample comprised 2,316 to 2,792 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Six developmental patterns were observed: a stable middle-high pattern, a stable low pattern, an improving pattern, a declining pattern, a fluctuating low pattern, and a fluctuating middle-high pattern. Most children (69%) fit a stable middle-high pattern. By contrast, less than 1% of children fit a stable low pattern. These results challenged the view that children’s progress along the oral to literate continuum is stable and predictable. Findings: Multivariate logistic regression was used to investigate risks for low literacy ability at 10 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. Predictors were modelled as risk variables with the lowest level of risk as the reference category. In the multivariate model, substantial risks for low literacy ability at 10 years, in order of descending magnitude, were: low school readiness, Aboriginal and/or Torres Strait Islander status and low language ability at 8 years. Moderate risks were high temperamental reactivity, low language ability at 4 years, and low language ability at 6 years.
The following risk factors were not statistically significant in the multivariate model: Low maternal consistency, low family income, health care card, child not read to at home, maternal smoking, maternal education, family structure, temperamental persistence, and socio-economic area disadvantage. The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model featuring risks of substantive magnitude did not do particularly well in predicting low literacy ability at 10 years. PMID:26352436
Multivariate Modelling of the Career Intent of Air Force Personnel.
1980-09-01
index (HOPP) was used as a measure of current job satisfaction. As with the Vroom and Fishbein/Graen models, two separate validations were accom..." Organizational Behavior and Human Performance, 23: 251-267, 1979. Lewis, Logan M. "Expectancy Theory as a Predictive Model of Career Intent, Job Satisfaction ... W. Albright. "Expectancy Theory Predictions of the Satisfaction, Effort, Performance, and Retention of Naval Aviation Officers," Organizational
A novel strategy for forensic age prediction by DNA methylation and support vector regression model
Xu, Cheng; Qu, Hongzhu; Wang, Guangyu; Xie, Bingbing; Shi, Yi; Yang, Yaran; Zhao, Zhao; Hu, Lan; Fang, Xiangdong; Yan, Jiangwei; Feng, Lei
2015-01-01
High deviations resulting from prediction model, gender and population difference have limited age estimation application of DNA methylation markers. Here we identified 2,957 novel age-associated DNA methylation sites (P < 0.01 and R² > 0.5) in blood of eight pairs of Chinese Han female monozygotic twins. Among them, nine novel sites (false discovery rate < 0.01), along with three other reported sites, were further validated in 49 unrelated female volunteers with ages of 20–80 years by Sequenom Massarray. A total of 95 CpGs were covered in the PCR products and 11 of them were used to build the age prediction models. After comparing four different models, including multivariate linear regression, multivariate nonlinear regression, back propagation neural network and support vector regression (SVR), SVR was identified as the most robust model with the least mean absolute deviation from real chronological age (2.8 years) and an average accuracy of 4.7 years predicted by only six loci from the 11 loci, as well as a lower cross-validated error compared with the linear regression model. Our novel strategy provides an accurate measurement that is highly useful in estimating the individual age in forensic practice as well as in tracking the aging process in other related applications. PMID:26635134
Simultaneous calibration of ensemble river flow predictions over an entire range of lead times
NASA Astrophysics Data System (ADS)
Hemri, S.; Fundel, F.; Zappa, M.
2013-10-01
Probabilistic estimates of future water levels and river discharge are usually simulated with hydrologic models using ensemble weather forecasts as main inputs. As hydrologic models are imperfect and the meteorological ensembles tend to be biased and underdispersed, the ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, in order to achieve both reliable and sharp predictions statistical postprocessing is required. In this work Bayesian model averaging (BMA) is applied to statistically postprocess ensemble runoff raw forecasts for a catchment in Switzerland, at lead times ranging from 1 to 240 h. The raw forecasts have been obtained using deterministic and ensemble forcing meteorological models with different forecast lead time ranges. First, BMA is applied based on mixtures of univariate normal distributions, subject to the assumption of independence between distinct lead times. Then, the independence assumption is relaxed in order to estimate multivariate runoff forecasts over the entire range of lead times simultaneously, based on a BMA version that uses multivariate normal distributions. Since river runoff is a highly skewed variable, Box-Cox transformations are applied in order to achieve approximate normality. Both univariate and multivariate BMA approaches are able to generate well calibrated probabilistic forecasts that are considerably sharper than climatological forecasts. Additionally, multivariate BMA provides a promising approach for incorporating temporal dependencies into the postprocessed forecasts. Its major advantage against univariate BMA is an increase in reliability when the forecast system is changing due to model availability.
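The Box-Cox transformation mentioned above maps a positive, skewed variable such as runoff toward approximate normality, and its inverse maps forecasts back to the original scale. The sketch below implements the standard one-parameter form; the `lam` values and runoff numbers are illustrative assumptions, not the settings used in the study.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if lam == 0:
        return math.log(y)  # limiting case as lam approaches 0
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Invented runoff values, for illustration only.
runoff = [0.5, 2.0, 8.0, 50.0]
for lam in (0, 0.5):
    transformed = [box_cox(q, lam) for q in runoff]
    restored = [inverse_box_cox(z, lam) for z in transformed]
    print(lam, transformed, restored)
```

In a postprocessing workflow the statistical model is fitted on the transformed scale, and predictive quantiles are passed through the inverse transform before being reported.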
Parrish, Rudolph S.; Smith, Charles N.
1990-01-01
A quantitative method is described for testing whether model predictions fall within a specified factor of true values. The technique is based on classical theory for confidence regions on unknown population parameters and can be related to hypothesis testing in both univariate and multivariate situations. A capability index is defined that can be used as a measure of predictive capability of a model, and its properties are discussed. The testing approach and the capability index should facilitate model validation efforts and permit comparisons among competing models. An example is given for a pesticide leaching model that predicts chemical concentrations in the soil profile.
Pedersen, Kristine Bondo; Kirkelund, Gunvor M; Ottosen, Lisbeth M; Jensen, Pernille E; Lejon, Tore
2015-01-01
Chemometrics was used to develop a multivariate model based on 46 previously reported electrodialytic remediation experiments (EDR) of five different harbour sediments. The model predicted final concentrations of Cd, Cu, Pb and Zn as a function of current density, remediation time, stirring rate, dry/wet sediment, cell set-up as well as sediment properties. Evaluation of the model showed that remediation time and current density had the highest comparative influence on the clean-up levels. Individual models for each heavy metal showed variance in the variable importance, indicating that the targeted heavy metals were bound to different sediment fractions. Based on the results, a PLS model was used to design five new EDR experiments of a sixth sediment to achieve specified clean-up levels of Cu and Pb. The removal efficiencies were up to 82% for Cu and 87% for Pb and the targeted clean-up levels were met in four out of five experiments. The clean-up levels were better than predicted by the model, which could hence be used for predicting an approximate remediation strategy; the modelling power will however improve with more data included. Copyright © 2014 Elsevier B.V. All rights reserved.
Estimating the decomposition of predictive information in multivariate systems
NASA Astrophysics Data System (ADS)
Faes, Luca; Kugiumtzis, Dimitris; Nollo, Giandomenico; Jurysta, Fabrice; Marinazzo, Daniele
2015-03-01
In the study of complex systems from observed multivariate time series, insight into the evolution of one system may be under investigation, which can be explained by the information storage of the system and the information transfer from other interacting systems. We present a framework for the model-free estimation of information storage and information transfer computed as the terms composing the predictive information about the target of a multivariate dynamical process. The approach tackles the curse of dimensionality employing a nonuniform embedding scheme that selects progressively, among the past components of the multivariate process, only those that contribute most, in terms of conditional mutual information, to the present target process. Moreover, it computes all information-theoretic quantities using a nearest-neighbor technique designed to compensate the bias due to the different dimensionality of individual entropy terms. The resulting estimators of prediction entropy, storage entropy, transfer entropy, and partial transfer entropy are tested on simulations of coupled linear stochastic and nonlinear deterministic dynamic processes, demonstrating the superiority of the proposed approach over the traditional estimators based on uniform embedding. The framework is then applied to multivariate physiologic time series, resulting in physiologically well-interpretable information decompositions of cardiovascular and cardiorespiratory interactions during head-up tilt and of joint brain-heart dynamics during sleep.
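The information transfer term can be illustrated with a much simpler estimator than the one in the abstract. The sketch below computes a plug-in (counting) estimate of transfer entropy for discrete series with a single lag, written as a conditional mutual information. It does not use the nonuniform embedding or the nearest-neighbor bias compensation the authors describe; the function names and the toy series are assumptions.

```python
import math
import random
from collections import Counter

def conditional_mutual_information(triples):
    """Plug-in estimate of I(A; B | C) in bits from (a, b, c) samples."""
    n = len(triples)
    abc = Counter(triples)
    ac = Counter((a, c) for a, b, c in triples)
    bc = Counter((b, c) for a, b, c in triples)
    c_only = Counter(c for a, b, c in triples)
    total = 0.0
    for (a, b, c), count in abc.items():
        # p(a,b,c) * log2[ p(a,b,c) p(c) / (p(a,c) p(b,c)) ], written with counts
        ratio = count * c_only[c] / (ac[(a, c)] * bc[(b, c)])
        total += (count / n) * math.log2(ratio)
    return total

def transfer_entropy(source, target):
    """TE(source -> target) = I(target_t ; source_{t-1} | target_{t-1})."""
    triples = [(target[t], source[t - 1], target[t - 1])
               for t in range(1, len(target))]
    return conditional_mutual_information(triples)

# Toy system: y copies x with a one-step delay, so information flows x -> y only.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
print(te_xy, te_yx)
```

In this toy case the estimate from x to y should be close to one bit and the reverse direction close to zero. Plug-in estimates become badly biased as the number of conditioning variables grows, which is the dimensionality problem the abstract's embedding scheme is meant to address.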
NASA Astrophysics Data System (ADS)
Vallières, M.; Freeman, C. R.; Skamene, S. R.; El Naqa, I.
2015-07-01
This study aims at developing a joint FDG-PET and MRI texture-based model for the early evaluation of lung metastasis risk in soft-tissue sarcomas (STSs). We investigate if the creation of new composite textures from the combination of FDG-PET and MR imaging information could better identify aggressive tumours. Towards this goal, a cohort of 51 patients with histologically proven STSs of the extremities was retrospectively evaluated. All patients had pre-treatment FDG-PET and MRI scans comprised of T1-weighted and T2-weighted fat-suppression sequences (T2FS). Nine non-texture features (SUV metrics and shape features) and forty-one texture features were extracted from the tumour region of separate (FDG-PET, T1 and T2FS) and fused (FDG-PET/T1 and FDG-PET/T2FS) scans. Volume fusion of the FDG-PET and MRI scans was implemented using the wavelet transform. The influence of six different extraction parameters on the predictive value of textures was investigated. The incorporation of features into multivariable models was performed using logistic regression. The multivariable modeling strategy involved imbalance-adjusted bootstrap resampling in the following four steps leading to final prediction model construction: (1) feature set reduction; (2) feature selection; (3) prediction performance estimation; and (4) computation of model coefficients. Univariate analysis showed that the isotropic voxel size at which texture features were extracted had the most impact on predictive value. In multivariable analysis, texture features extracted from fused scans significantly outperformed those from separate scans in terms of lung metastases prediction estimates. The best performance was obtained using a combination of four texture features extracted from FDG-PET/T1 and FDG-PET/T2FS scans. This model reached an area under the receiver-operating characteristic curve of 0.984 ± 0.002, a sensitivity of 0.955 ± 0.006, and a specificity of 0.926 ± 0.004 in bootstrapping evaluations. 
Ultimately, lung metastasis risk assessment at diagnosis of STSs could improve patient outcomes by allowing better treatment adaptation.
Voss, Jesse S; Iqbal, Seher; Jenkins, Sarah M; Henry, Michael R; Clayton, Amy C; Jett, James R; Kipp, Benjamin R; Halling, Kevin C; Maldonado, Fabien
2014-01-01
Studies have shown that fluorescence in situ hybridization (FISH) testing increases lung cancer detection on cytology specimens in peripheral nodules. The goal of this study was to determine whether a predictive model using clinical features and routine cytology with FISH results could predict lung malignancy after a nondiagnostic bronchoscopic evaluation. Patients with an indeterminate peripheral lung nodule that had a nondiagnostic bronchoscopic evaluation were included in this study (N = 220). FISH was performed on residual bronchial brushing cytology specimens diagnosed as negative (n = 195), atypical (n = 16), or suspicious (n = 9). FISH results included hypertetrasomy (n = 30) and negative (n = 190). Primary study end points included lung cancer status along with time to diagnosis of lung cancer or date of last clinical follow-up. Hazard ratios (HRs) were calculated using Cox proportional hazards regression model analyses, and P values < .05 were considered statistically significant. The mean age of the 220 patients was 66.7 years (range, 35-91), and most (58%) were men. Most patients (79%) were current or former smokers with a mean pack year history of 43.2 years (median, 40; range, 1-200). After multivariate analysis, hypertetrasomy FISH (HR = 2.96, P < .001), pack years (HR = 1.03 per pack year up to 50, P = .001), age (HR = 1.04 per year, P = .02), atypical or suspicious cytology (HR = 2.02, P = .04), and nodule spiculation (HR = 2.36, P = .003) were independent predictors of malignancy over time and were used to create a prediction model (C-statistic = 0.78). These results suggest that this multivariate model including test results and clinical features may be useful following a nondiagnostic bronchoscopic examination. © 2013.
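To make the reported hazard ratios concrete, the sketch below combines a few of them on the log-hazard scale, which is how a Cox proportional hazards model without interaction terms would combine covariates. The function, its arguments, and the choice of reference patient (negative FISH, zero pack years, same age) are assumptions for illustration; the abstract does not publish the fitted model in this form.

```python
import math

# Hazard ratios as reported in the abstract.
HR_FISH = 2.96            # hypertetrasomy FISH versus negative
HR_PER_PACK_YEAR = 1.03   # per pack year, capped at 50 pack years
HR_PER_YEAR_OF_AGE = 1.04

def relative_hazard(fish_positive, pack_years, age_difference_years):
    """Hazard relative to a hypothetical reference patient, other covariates equal."""
    log_hr = 0.0
    if fish_positive:
        log_hr += math.log(HR_FISH)
    # The abstract states the pack-year effect applies "up to 50".
    log_hr += min(pack_years, 50) * math.log(HR_PER_PACK_YEAR)
    log_hr += age_difference_years * math.log(HR_PER_YEAR_OF_AGE)
    return math.exp(log_hr)

print(relative_hazard(fish_positive=False, pack_years=50, age_difference_years=0))
print(relative_hazard(fish_positive=True, pack_years=20, age_difference_years=10))
```

Because per-unit ratios compound multiplicatively, a ratio of 1.03 per pack year corresponds to roughly a fourfold hazard at the 50 pack-year cap.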
The natural mathematics of behavior analysis.
Li, Don; Hautus, Michael J; Elliffe, Douglas
2018-04-19
Models that generate event records have very general scope regarding the dimensions of the target behavior that we measure. From a set of predicted event records, we can generate predictions for any dependent variable that we could compute from the event records of our subjects. In this sense, models that generate event records permit us a freely multivariate analysis. To explore this proposition, we conducted a multivariate examination of Catania's Operant Reserve on single VI schedules in transition using a Markov Chain Monte Carlo scheme for Approximate Bayesian Computation. Although we found systematic deviations between our implementation of Catania's Operant Reserve and our observed data (e.g., mismatches in the shape of the interresponse time distributions), the general approach that we have demonstrated represents an avenue for modelling behavior that transcends the typical constraints of algebraic models. © 2018 Society for the Experimental Analysis of Behavior.
[Multivariate Adaptive Regression Splines (MARS), an alternative for the analysis of time series].
Vanegas, Jairo; Vásquez, Fabián
Multivariate Adaptive Regression Splines (MARS) is a non-parametric modelling method that extends the linear model, incorporating nonlinearities and interactions between variables. It is a flexible tool that automates the construction of predictive models: selecting relevant variables, transforming the predictor variables, processing missing values and preventing overfitting using a self-test. It is also able to predict, taking into account structural factors that might influence the outcome variable, thereby generating hypothetical models. The end result could identify relevant cut-off points in data series. It is rarely used in health, so it is proposed as a tool for the evaluation of relevant public health indicators. For demonstrative purposes, data series regarding the mortality of children under 5 years of age in Costa Rica were used, comprising the period 1978-2008. Copyright © 2016 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
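MARS models are generally built from hinge functions of the form max(0, x - c), whose knots c act as the cut-off points mentioned above. The sketch below shows how a fitted model of that shape would be evaluated for one predictor. The knots and coefficients are invented for illustration and are not taken from the Costa Rica series; the function names are assumptions.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot), or max(0, knot - x) if direction is -1."""
    return max(0.0, direction * (x - knot))

def mars_like_prediction(x, intercept, terms):
    """Evaluate an additive model; terms is a list of (coefficient, knot, direction)."""
    return intercept + sum(coef * hinge(x, knot, d) for coef, knot, d in terms)

# Hypothetical piecewise-linear trend with a change of slope at year 1990.
terms = [(-0.8, 1990, -1),   # active before 1990
         (-0.2, 1990, 1)]    # active after 1990
for year in (1980, 1990, 2000):
    print(year, mars_like_prediction(year, intercept=20.0, terms=terms))
```

A real MARS fit selects the knots and terms in a forward pass and then prunes them; this sketch covers only the evaluation step.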
Lee, Michael J; Cizik, Amy M; Hamilton, Deven; Chapman, Jens R
2014-09-01
The impact of surgical site infection (SSI) is substantial. Although previous studies have determined relative risk and odds ratio (OR) values to quantify risk factors, these values may be difficult to translate to the patient during counseling of surgical options. Ideally, a model that predicts absolute risk of SSI, rather than relative risk or OR values, would greatly enhance the discussion of safety of spine surgery. To date, there is no risk stratification model that specifically predicts the risk of medical complication. The purpose of this study was to create and validate a predictive model for the risk of SSI after spine surgery. This study performs a multivariate analysis of SSI after spine surgery using a large prospective surgical registry. Using the results of this analysis, this study will then create and validate a predictive model for SSI after spine surgery. The patient sample is from a high-quality surgical registry from our two institutions with prospectively collected, detailed demographic, comorbidity, and complication data. The outcome measure was an SSI that required return to the operating room for surgical debridement. Using a prospectively collected surgical registry of more than 1,532 patients with extensive demographic, comorbidity, surgical, and complication details recorded for 2 years after the surgery, we identified several risk factors for SSI after multivariate analysis. Using the beta coefficients from those regression analyses, we created a model to predict the occurrence of SSI after spine surgery. We split our data into two subsets for internal and cross-validation of our model. We created a predictive model based on our beta coefficients from our multivariate analysis. The final predictive model for SSI had an area under the receiver operating characteristic curve of 0.72, considered to be a fair measure. The final model has been uploaded for use on SpineSage.com. We present a validated model for predicting SSI after spine surgery. 
The value in this model is that it gives the user an absolute percent likelihood of SSI after spine surgery based on the patient's comorbidity profile and invasiveness of surgery. Patients are far more likely to understand an absolute percentage, rather than relative risk and confidence interval values. A model such as this is of paramount importance in counseling patients and enhancing the safety of spine surgery. In addition, a tool such as this can be of great use particularly as health care trends toward pay for performance, quality metrics (such as SSI), and risk adjustment. To facilitate the use of this model, we have created a Web site (SpineSage.com) where users can enter patient data to determine likelihood for SSI. Copyright © 2014 Elsevier Inc. All rights reserved.
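The step from beta coefficients to an absolute percentage can be sketched as follows, assuming the underlying model is a logistic regression (the abstract does not name the regression type). The intercept, coefficient values, and feature names are hypothetical and are not the SpineSage.com model.

```python
import math

def absolute_risk(intercept, coefficients, features):
    """Predicted probability from a logistic model: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(coefficients[name] * value
                             for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients, invented for illustration only.
coefficients = {"diabetes": 0.6, "invasiveness_index": 0.05}
low = absolute_risk(-4.0, coefficients, {"diabetes": 0, "invasiveness_index": 5})
high = absolute_risk(-4.0, coefficients, {"diabetes": 1, "invasiveness_index": 30})
print(f"{100 * low:.1f}% vs {100 * high:.1f}%")
```

An odds ratio describes only how the odds change with a risk factor, whereas the output here is a single percentage for a given patient profile, which is the form the authors argue is easier to use in counseling.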
Multivariate Bias Correction Procedures for Improving Water Quality Predictions from the SWAT Model
NASA Astrophysics Data System (ADS)
Arumugam, S.; Libera, D.
2017-12-01
Water quality observations are usually not available on a continuous basis for longer than 1-2 years at a time over a decadal period given the labor requirements, which makes calibrating and validating mechanistic models difficult. Further, any physical model predictions inherently have bias (i.e., under/over estimation) and require post-simulation techniques to preserve the long-term mean monthly attributes. This study suggests a multivariate bias-correction technique and compares it with a common technique in improving the performance of the SWAT model in predicting daily streamflow and TN loads across the southeast based on split-sample validation. The approach is a dimension reduction technique, canonical correlation analysis (CCA), which regresses the observed multivariate attributes on the SWAT model simulated values. The common approach is a regression based technique that uses an ordinary least squares regression to adjust model values. The observed cross-correlation between loadings and streamflow is better preserved when using canonical correlation while simultaneously reducing individual biases. Additionally, canonical correlation analysis does a better job in preserving the observed joint likelihood of observed streamflow and loadings. These procedures were applied to 3 watersheds chosen from the Water Quality Network in the Southeast Region; specifically, watersheds with sufficiently large drainage areas and number of observed data points. The performance of these two approaches is compared for the observed period and over a multi-decadal period using loading estimates from the USGS LOADEST model. Lastly, the CCA technique is applied in a forecasting sense by using 1-month ahead forecasts of P & T from ECHAM4.5 as forcings in the SWAT model. Skill in using the SWAT model for forecasting loadings and streamflow at the monthly and seasonal timescale is also discussed.
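The simpler of the two approaches, an ordinary least squares adjustment of simulated values, can be sketched in a few lines with a split-sample layout: fit on a calibration period, then apply to a validation period. This covers only the univariate baseline, not the CCA method; the function names and the numbers are assumptions for illustration.

```python
def fit_ols(simulated, observed):
    """Least-squares line observed = intercept + slope * simulated."""
    n = len(simulated)
    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n
    sxx = sum((s - mean_s) ** 2 for s in simulated)
    sxy = sum((s - mean_s) * (o - mean_o) for s, o in zip(simulated, observed))
    slope = sxy / sxx
    intercept = mean_o - slope * mean_s
    return intercept, slope

def bias_correct(simulated, intercept, slope):
    """Apply the fitted adjustment to new simulated values."""
    return [intercept + slope * s for s in simulated]

# Hypothetical calibration period: the model systematically under-predicts.
calibration_sim = [1.0, 2.0, 3.0, 4.0]
calibration_obs = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_ols(calibration_sim, calibration_obs)

# Validation period: correct simulated values not used in the fit.
corrected = bias_correct([5.0, 6.0], intercept, slope)
print(intercept, slope, corrected)
```

Fitting one such line per variable corrects each bias separately, which is why it does not by itself preserve the cross-correlation between streamflow and loadings that the multivariate approach targets.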
Predicting early cognitive decline in newly-diagnosed Parkinson's patients: A practical model.
Hogue, Olivia; Fernandez, Hubert H; Floden, Darlene P
2018-06-19
To create a multivariable model to predict early cognitive decline among de novo patients with Parkinson's disease, using brief, inexpensive assessments that are easily incorporated into clinical flow. Data for 351 drug-naïve patients diagnosed with idiopathic Parkinson's disease were obtained from the Parkinson's Progression Markers Initiative. Baseline demographic, disease history, motor, and non-motor features were considered as candidate predictors. Best subsets selection was used to determine the multivariable baseline symptom profile that most accurately predicted individual cognitive decline within three years. Eleven per cent of the sample experienced cognitive decline. The final logistic regression model predicting decline included five baseline variables: verbal memory retention, right-sided bradykinesia, years of education, subjective report of cognitive impairment, and REM behavior disorder. Model discrimination was good (optimism-adjusted concordance index = .749). The associated nomogram provides a tool to determine individual patient risk of meaningful cognitive change in the early stages of the disease. Through the consideration of easily-implemented or routinely-gathered assessments, we have identified a multidimensional baseline profile and created a convenient, inexpensive tool to predict cognitive decline in the earliest stages of Parkinson's disease. The use of this tool would generate prediction at the individual level, allowing clinicians to tailor medical management for each patient and identify at-risk patients for clinical trials aimed at disease modifying therapies. Copyright © 2018. Published by Elsevier Ltd.
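For a binary outcome, the concordance index is the probability that a randomly chosen patient who declined received a higher predicted risk than a randomly chosen patient who did not. The sketch below computes the apparent (unadjusted) value by comparing all such pairs. The optimism adjustment reported in the abstract, typically obtained by resampling, is not included; the function name and the toy data are assumptions.

```python
def concordance_index(risk_scores, outcomes):
    """Apparent concordance for binary outcomes; ties in risk count as one half."""
    cases = [r for r, o in zip(risk_scores, outcomes) if o == 1]
    controls = [r for r, o in zip(risk_scores, outcomes) if o == 0]
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))

# Hypothetical predicted risks and observed decline (1) or no decline (0).
risks = [0.9, 0.2, 0.6, 0.4]
declined = [1, 0, 0, 1]
c_index = concordance_index(risks, declined)
print(c_index)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.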
Mwanza, Jean-Claude; Warren, Joshua L.; Hochberg, Jessica T.; Budenz, Donald L.; Chang, Robert T.; Ramulu, Pradeep Y.
2014-01-01
Purpose: To determine the ability of frequency doubling technology (FDT) and scanning laser polarimetry with variable corneal compensation (GDx-VCC) to detect glaucoma when used individually and in combination. Methods: One hundred and ten normal and 114 glaucomatous subjects were tested with FDT C-20-5 screening protocol and the GDx-VCC. The discriminating ability was tested for each device individually and for both devices combined using GDx-NFI, GDx-TSNIT, number of missed points of FDT, and normal or abnormal FDT. Measures of discrimination included sensitivity, specificity, area under the curve (AUC), Akaike’s information criterion (AIC), and prediction confidence interval lengths (PIL). Results: For detecting glaucoma regardless of severity, the multivariable model resulting from the combination of GDx-TSNIT, number of abnormal points on FDT (NAP-FDT), and the interaction GDx-TSNIT * NAP-FDT (AIC: 88.28, AUC: 0.959, sensitivity: 94.6%, specificity: 89.5%) outperformed the best single variable model provided by GDx-NFI (AIC: 120.88, AUC: 0.914, sensitivity: 87.8%, specificity: 84.2%). The multivariable model combining GDx-TSNIT, NAP-FDT, and interaction GDx-TSNIT * NAP-FDT consistently provided better discriminating abilities for detecting early, moderate and severe glaucoma than the best single variable models. Conclusions: The multivariable model including GDx-TSNIT, NAP-FDT, and the interaction GDx-TSNIT * NAP-FDT provides the best glaucoma prediction compared to all other multivariable and univariable models. Combining the FDT C-20-5 screening protocol and GDx-VCC improves glaucoma detection compared to using GDx or FDT alone. PMID:24777046
Davis, Matthew H.
2016-01-01
Successful perception depends on combining sensory input with prior knowledge. However, the underlying mechanism by which these two sources of information are combined is unknown. In speech perception, as in other domains, two functionally distinct coding schemes have been proposed for how expectations influence representation of sensory evidence. Traditional models suggest that expected features of the speech input are enhanced or sharpened via interactive activation (Sharpened Signals). Conversely, Predictive Coding suggests that expected features are suppressed so that unexpected features of the speech input (Prediction Errors) are processed further. The present work is aimed at distinguishing between these two accounts of how prior knowledge influences speech perception. By combining behavioural, univariate, and multivariate fMRI measures of how sensory detail and prior expectations influence speech perception with computational modelling, we provide evidence in favour of Prediction Error computations. Increased sensory detail and informative expectations have additive behavioural and univariate neural effects because they both improve the accuracy of word report and reduce the BOLD signal in lateral temporal lobe regions. However, sensory detail and informative expectations have interacting effects on speech representations shown by multivariate fMRI in the posterior superior temporal sulcus. When prior knowledge was absent, increased sensory detail enhanced the amount of speech information measured in superior temporal multivoxel patterns, but with informative expectations, increased sensory detail reduced the amount of measured information. Computational simulations of Sharpened Signals and Prediction Errors during speech perception could both explain these behavioural and univariate fMRI observations. However, the multivariate fMRI observations were uniquely simulated by a Prediction Error and not a Sharpened Signal model. 
The interaction between prior expectation and sensory detail provides evidence for a Predictive Coding account of speech perception. Our work establishes methods that can be used to distinguish representations of Prediction Error and Sharpened Signals in other perceptual domains. PMID:27846209
Multivariate neural biomarkers of emotional states are categorically distinct.
Kragel, Philip A; LaBar, Kevin S
2015-11-01
Understanding how emotions are represented neurally is a central aim of affective neuroscience. Despite decades of neuroimaging efforts addressing this question, it remains unclear whether emotions are represented as distinct entities, as predicted by categorical theories, or are constructed from a smaller set of underlying factors, as predicted by dimensional accounts. Here, we capitalize on multivariate statistical approaches and computational modeling to directly evaluate these theoretical perspectives. We elicited discrete emotional states using music and films during functional magnetic resonance imaging scanning. Distinct patterns of neural activation predicted the emotion category of stimuli and tracked subjective experience. Bayesian model comparison revealed that combining dimensional and categorical models of emotion best characterized the information content of activation patterns. Surprisingly, categorical and dimensional aspects of emotion experience captured unique and opposing sources of neural information. These results indicate that diverse emotional states are poorly differentiated by simple models of valence and arousal, and that activity within separable neural systems can be mapped to unique emotion categories. © The Author (2015). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Forecasting of municipal solid waste quantity in a developing country using multivariate grey models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Intharathirat, Rotchana, E-mail: rotchana.in@gmail.com; Abdul Salam, P., E-mail: salam@ait.ac.th; Kumar, S., E-mail: kumar@ait.ac.th
Highlights: • Grey model can be used to forecast MSW quantity accurately with the limited data. • Prediction interval overcomes the uncertainty of MSW forecast effectively. • A multivariate model gives accuracy associated with factors affecting MSW quantity. • Population, urbanization, employment and household size play role for MSW quantity. - Abstract: In order to plan, manage and use municipal solid waste (MSW) in a sustainable way, accurate forecasting of MSW generation and composition plays a key role. It is difficult to carry out the reliable estimates using the existing models due to the limited data available in the developing countries. This study aims to forecast MSW collected in Thailand with prediction interval in long term period by using the optimized multivariate grey model which is the mathematical approach. For multivariate models, the representative factors of residential and commercial sectors affecting waste collected are identified, classified and quantified based on statistics and mathematics of grey system theory. Results show that GMC (1, 5), the grey model with convolution integral, is the most accurate with the least error of 1.16% MAPE. MSW collected would increase 1.40% per year from 43,435–44,994 tonnes per day in 2013 to 55,177–56,735 tonnes per day in 2030. This model also illustrates that population density is the most important factor affecting MSW collected, followed by urbanization, proportion employment and household size, respectively. These mean that the representative factors of commercial sector may affect more MSW collected than that of residential sector. Results can help decision makers to develop the measures and policies of waste management in long term period.
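Two of the quantities quoted above are simple to reproduce: the mean absolute percentage error (MAPE) used to rank the models, and the compound growth implied by a fixed annual rate. The sketch below defines both. It is not an implementation of the GMC (1, 5) grey model, and the sample values passed to `mape` are invented for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)

def compound_growth(start_value, annual_rate, years):
    """Value after growing at a constant annual rate for a number of years."""
    return start_value * (1.0 + annual_rate) ** years

# Hypothetical observed and forecast values.
error = mape([100.0, 200.0], [110.0, 180.0])

# Lower bound quoted in the abstract, grown at 1.40% per year from 2013 to 2030.
projected_2030 = compound_growth(43435.0, 0.014, 2030 - 2013)
print(error, round(projected_2030))
```

Growing the 2013 lower bound at a constant 1.40% per year lands within about one percent of the 55,177 tonnes per day reported for 2030, so the quoted rate and endpoints are mutually consistent.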
An analytics approach to designing patient centered medical homes.
Ajorlou, Saeede; Shams, Issac; Yang, Kai
2015-03-01
Recently the patient centered medical home (PCMH) model has become a popular team based approach focused on delivering more streamlined care to patients. In current practices of medical homes, a clinical based prediction frame is recommended because it can help match the portfolio capacity of PCMH teams with the actual load generated by a set of patients. Without such balances in clinical supply and demand, issues such as excessive under and over utilization of physicians, long waiting time for receiving the appropriate treatment, and non-continuity of care will eliminate many advantages of the medical home strategy. In this paper, by using the hierarchical generalized linear model with multivariate responses, we develop a clinical workload prediction model for care portfolio demands in a Bayesian framework. The model allows for heterogeneous variances and unstructured covariance matrices for nested random effects that arise through complex hierarchical care systems. We show that using a multivariate approach substantially enhances the precision of workload predictions at both primary and non primary care levels. We also demonstrate that care demands depend not only on patient demographics but also on other utilization factors, such as length of stay. Our analyses of a recent data from Veteran Health Administration further indicate that risk adjustment for patient health conditions can considerably improve the prediction power of the model.
Selecting an Informative/Discriminating Multivariate Response for Inverse Prediction
Thomas, Edward V.; Lewis, John R.; Anderson-Cook, Christine M.; ...
2017-11-21
Inverse prediction is important in a wide variety of scientific and engineering contexts. One might use inverse prediction to predict fundamental properties/characteristics of an object using measurements obtained from it. This can be accomplished by “inverting” parameterized forward models that relate the measurements (responses) to the properties/characteristics of interest. Sometimes forward models are science based; but often, forward models are empirically based, using the results of experimentation. For empirically-based forward models, it is important that the experiments provide a sound basis to develop accurate forward models in terms of the properties/characteristics (factors). While nature dictates the causal relationship between factors and responses, experimenters can influence control of the type, accuracy, and precision of forward models that can be constructed via selection of factors, factor levels, and the set of trials that are performed. Whether the forward models are based on science, experiments or both, researchers can influence the ability to perform inverse prediction by selecting informative response variables. By using an errors-in-variables framework for inverse prediction, this paper shows via simple analysis and examples how the capability of a multivariate response (with respect to being informative and discriminating) can vary depending on how well the various responses complement one another over the range of the factor-space of interest. Insights derived from this analysis could be useful for selecting a set of response variables among candidates in cases where the number of response variables that can be acquired is limited by difficulty, expense, and/or availability of material.
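A minimal sketch of inverting forward models, assuming a single factor x and linear forward models y_j = a_j + b_j * x with independent measurement noise of known standard deviation. Under those assumptions the weighted least-squares estimate of x has a closed form. The paper's errors-in-variables framework also accounts for uncertainty in the forward models themselves, which this sketch ignores; all names and numbers are hypothetical.

```python
def inverse_predict(responses, intercepts, slopes, noise_sds):
    """Weighted least-squares estimate of x and its standard error."""
    numerator = 0.0
    denominator = 0.0
    for y, a, b, sd in zip(responses, intercepts, slopes, noise_sds):
        weight = 1.0 / sd ** 2
        numerator += weight * b * (y - a)
        denominator += weight * b * b
    estimate = numerator / denominator
    standard_error = denominator ** -0.5
    return estimate, standard_error

# Two hypothetical responses measured on an object whose true factor value is 3.
intercepts, slopes, sds = [1.0, 0.0], [0.5, 2.0], [0.2, 0.2]
responses = [1.0 + 0.5 * 3.0, 0.0 + 2.0 * 3.0]

x_hat, se_both = inverse_predict(responses, intercepts, slopes, sds)
_, se_first_only = inverse_predict(responses[:1], intercepts[:1], slopes[:1], sds[:1])
print(x_hat, se_both, se_first_only)
```

The standard error depends on the sum of squared slope-to-noise ratios, so a response that is steep relative to its noise contributes far more to discrimination than a flat one. That is one simple sense in which some response variables are more informative than others.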
A modeling study of 2006 Huntington Beach (Lake Erie) beach bacteria concentrations indicates multi-variable linear regression (MLR) can effectively estimate bacteria concentrations compared to the persistence model. Our use of the Virtual Beach (VB) model affirms that fact. VB i...
Gu, Jiwei; Andreasen, Jan J; Melgaard, Jacob; Lundbye-Christensen, Søren; Hansen, John; Schmidt, Erik B; Thorsteinsson, Kristinn; Graff, Claus
2017-02-01
To investigate if electrocardiogram (ECG) markers from routine preoperative ECGs can be used in combination with clinical data to predict new-onset postoperative atrial fibrillation (POAF) following cardiac surgery. Retrospective observational case-control study. Single-center university hospital. One hundred consecutive adult patients (50 POAF, 50 without POAF) who underwent coronary artery bypass grafting, valve surgery, or combinations. Retrospective review of medical records and registration of POAF. Clinical data and demographics were retrieved from the Western Denmark Heart Registry and patient records. Paper tracings of preoperative ECGs were collected from patient records, and ECG measurements were read by two independent readers blinded to outcome. A subset of four clinical variables (age, gender, body mass index, and type of surgery) were selected to form a multivariate clinical prediction model for POAF and five ECG variables (QRS duration, PR interval, P-wave duration, left atrial enlargement, and left ventricular hypertrophy) were used in a multivariate ECG model. Adding ECG variables to the clinical prediction model significantly improved the area under the receiver operating characteristic curve from 0.54 to 0.67 (with cross-validation). The best predictive model for POAF was a combined clinical and ECG model with the following four variables: age, PR-interval, QRS duration, and left atrial enlargement. ECG markers obtained from a routine preoperative ECG may be helpful in predicting new-onset POAF in patients undergoing cardiac surgery. Copyright © 2017 Elsevier Inc. All rights reserved.
A Novel Early Pregnancy Risk Prediction Model for Gestational Diabetes Mellitus.
Sweeting, Arianne N; Wong, Jencia; Appelblom, Heidi; Ross, Glynis P; Kouru, Heikki; Williams, Paul F; Sairanen, Mikko; Hyett, Jon A
2018-06-13
Accurate early risk prediction for gestational diabetes mellitus (GDM) would target intervention and prevention in women at the highest risk. We evaluated novel biomarker predictors to develop a first-trimester risk prediction model in a large multiethnic cohort. Maternal clinical, aneuploidy and pre-eclampsia screening markers (PAPP-A, free hCGβ, mean arterial pressure, uterine artery pulsatility index) were measured prospectively at 11-13+6 weeks' gestation in 980 women (248 with GDM; 732 controls). Nonfasting glucose, lipids, adiponectin, leptin, lipocalin-2, and plasminogen activator inhibitor-2 were measured on banked serum. The relationship between marker multiples-of-the-median and GDM was examined with multivariate regression. Model predictive performance for early (< 24 weeks' gestation) and overall GDM diagnosis was evaluated by receiver operating characteristic curves. Glucose, triglycerides, leptin, and lipocalin-2 were higher, while adiponectin was lower, in GDM (p < 0.05). Lipocalin-2 performed best in Caucasians, and triglycerides in South Asians with GDM. Family history of diabetes, previous GDM, South/East Asian ethnicity, parity, BMI, PAPP-A, triglycerides, and lipocalin-2 were significant independent GDM predictors (all p < 0.01), achieving an area under the curve of 0.91 (95% confidence interval [CI] 0.89-0.94) overall, and 0.93 (95% CI 0.89-0.96) for early GDM, in a combined multivariate prediction model. A first-trimester risk prediction model, which incorporates novel maternal lipid markers, accurately identifies women at high risk of GDM, including early GDM. © 2018 S. Karger AG, Basel.
Dienstmann, R; Mason, M J; Sinicrope, F A; Phipps, A I; Tejpar, S; Nesbakken, A; Danielsen, S A; Sveen, A; Buchanan, D D; Clendenning, M; Rosty, C; Bot, B; Alberts, S R; Milburn Jessup, J; Lothe, R A; Delorenzi, M; Newcomb, P A; Sargent, D; Guinney, J
2017-05-01
TNM staging alone does not accurately predict outcome in colon cancer (CC) patients who may be eligible for adjuvant chemotherapy. It is unknown to what extent the molecular markers microsatellite instability (MSI) and mutations in BRAF or KRAS improve prognostic estimation in multivariable models that include detailed clinicopathological annotation. After imputation of data missing at random, a subset of patients accrued in phase 3 trials with adjuvant chemotherapy (n = 3016), N0147 (NCT00079274) and PETACC3 (NCT00026273), was aggregated to construct multivariable Cox models for 5-year overall survival that were subsequently validated internally in the remaining clinical trial samples (n = 1499), and also externally in different population cohorts of chemotherapy-treated (n = 949) or -untreated (n = 1080) CC patients, and an additional series without treatment annotation (n = 782). TNM staging, MSI and BRAFV600E mutation status remained independent prognostic factors in multivariable models across clinical trial cohorts and observational studies. Concordance indices increased from 0.61-0.68 in the TNM alone model to 0.63-0.71 in models with added molecular markers, 0.65-0.73 with clinicopathological features and 0.66-0.74 with all covariates. In validation cohorts with complete annotation, the integrated time-dependent AUC rose from 0.64 for the TNM alone model to 0.67 for models that included clinicopathological features, with or without molecular markers. In patient cohorts that received adjuvant chemotherapy, the relative proportion of variance explained (R²) by TNM, clinicopathological features and molecular markers was on average 65%, 25% and 10%, respectively.
Incorporation of MSI, BRAFV600E and KRAS mutation status to overall survival models with TNM staging improves the ability to precisely prognosticate in stage II and III CC patients, but only modestly increases prediction accuracy in multivariable models that include clinicopathological features, particularly in chemotherapy-treated patients. © The Author 2017. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
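The concordance indices reported in this abstract can be illustrated with a minimal sketch of Harrell's C-index. This is a simplified version that assumes right-censored data, ignores tied survival times, and uses made-up inputs; the function name and the example values are hypothetical and are not taken from the study.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) strictly before subject j's follow-up time.
    The pair is concordant when i also has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: follow-up time, event indicator (1 = death, 0 = censored),
# and a model-derived risk score for four patients.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
c_index = concordance_index(times, events, [0.9, 0.1, 0.5, 0.7])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why an increase from roughly 0.61-0.68 to 0.66-0.74 is described as a modest gain.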
Stegmaier, Petra; Drendel, Vanessa; Mo, Xiaokui; Ling, Stella; Fabian, Denise; Manring, Isabel; Jilg, Cordula A.; Schultze-Seemann, Wolfgang; McNulty, Maureen; Zynger, Debra L.; Martin, Douglas; White, Julia; Werner, Martin; Grosu, Anca L.; Chakravarti, Arnab
2015-01-01
Purpose To develop a microRNA (miRNA)-based predictive model for prostate cancer patients of 1) time to biochemical recurrence after radical prostatectomy and 2) biochemical recurrence after salvage radiation therapy following documented biochemical disease progression post-radical prostatectomy. Methods Forty-three patients who had undergone salvage radiation therapy following biochemical failure after radical prostatectomy with greater than 4 years of follow-up data were identified. Formalin-fixed, paraffin-embedded tissue blocks were collected for all patients and total RNA was isolated from 1 mm cores enriched for tumor (>70%). Eight hundred miRNAs were analyzed simultaneously using the nCounter human miRNA v2 assay (NanoString Technologies; Seattle, WA). Univariate and multivariate Cox proportional hazards regression models as well as receiver operating characteristics were used to identify statistically significant miRNAs that were predictive of biochemical recurrence. Results Eighty-eight miRNAs were identified to be significantly (p<0.05) associated with biochemical failure post-prostatectomy by multivariate analysis and clustered into two groups that correlated with early (≤ 36 months) versus late recurrence (>36 months). Nine miRNAs were identified to be significantly (p<0.05) associated by multivariate analysis with biochemical failure after salvage radiation therapy. A new predictive model for biochemical recurrence after salvage radiation therapy was developed; this model consisted of miR-4516 and miR-601 together with Gleason score and lymph node status. The area under the ROC curve (AUC) improved to 0.83, compared with 0.66 for Gleason score and lymph node status alone. Conclusion miRNA signatures can distinguish patients who fail soon after radical prostatectomy from those who fail late, giving insight into which patients may need adjuvant therapy. 
Notably, two novel miRNAs (miR-4516 and miR-601) were identified that significantly improve prediction of biochemical failure post-salvage radiation therapy compared to clinico-histopathological factors, supporting the use of miRNAs within clinically used predictive models. Both findings warrant further validation studies. PMID:25760964
Retention of community college students in online courses
NASA Astrophysics Data System (ADS)
Krajewski, Sarah
The issue of attrition in online courses at higher learning institutions remains a high priority in the United States. A recent rapid growth of online courses at community colleges has been instigated by student demand, as they meet the time constraints many nontraditional community college students have as a result of the need to work and care for dependents. Failure in an online course can cause students to become frustrated with the college experience, financially burdened, or to even give up and leave college. Attrition could be avoided by proper guidance of who is best suited for online courses. This study examined factors related to retention (i.e., course completion) and success (i.e., receiving a C or better) in an online biology course at a community college in the Midwest by operationalizing student characteristics (age, race, gender), student skills (whether or not the student met the criteria to be placed in an AFP course), and external factors (Pell recipient, full/part time status, first term) from the persistence model developed by Rovai. Internal factors from this model were not included in this study. Both univariate analyses and multivariate logistic regression were used to analyze the variables. Results suggest that race and Pell recipient were both predictive of course completion on univariate analyses. However, multivariate analyses showed that age, race, academic load and first term were predictive of completion and Pell recipient was no longer predictive. The univariate results for the C or better showed that age, race, Pell recipient, academic load, and meeting AFP criteria were predictive of success. Multivariate analyses showed that only age, race, and Pell recipient were significant predictors of success. Both regression models explained very little (<15%) of the variability within the outcome variables of retention and success. 
Therefore, although significant predictors were identified for retention and success, many factors remain unaccounted for in both regression models. Further research into the operationalization of Rovai's model, including internal factors, to predict completion and success is necessary.
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.
2008-01-01
Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. 
The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.
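As a rough illustration of how a fitted logistic regression model turns basin variables into a debris-flow probability, the sketch below evaluates the logistic function on a linear predictor. The coefficient values, the intercept, and the variable names are hypothetical placeholders chosen for illustration; they are not the coefficients of any of the five models in the study.

```python
import math


def debris_flow_probability(intercept, coefficients, predictors):
    """Return P(debris flow) from a fitted logistic regression model."""
    # Linear predictor: intercept plus the weighted sum of basin variables.
    z = intercept + sum(coefficients[name] * predictors[name] for name in coefficients)
    # The logistic function maps the linear predictor to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and basin values, for illustration only.
coefficients = {"pct_high_burn_severity": 0.03, "peak_3h_rainfall_mm_per_h": 0.15}
basin = {"pct_high_burn_severity": 40.0, "peak_3h_rainfall_mm_per_h": 10.0}
p = debris_flow_probability(-3.0, coefficients, basin)
```

Applying such a function to every basin (or raster cell) in a geographic information system is one way the probability maps described above could be produced.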
Guglielminotti, Jean; Dechartres, Agnès; Mentré, France; Montravers, Philippe; Longrois, Dan; Laouénan, Cedric
2015-10-01
Prognostic research studies in anesthesiology aim to identify risk factors for an outcome (explanatory studies) or calculate the risk of this outcome on the basis of patients' risk factors (predictive studies). Multivariable models express the relationship between predictors and an outcome and are used in both explanatory and predictive studies. Model development demands a strict methodology and a clear reporting to assess its reliability. In this methodological descriptive review, we critically assessed the reporting and methodology of multivariable analysis used in observational prognostic studies published in anesthesiology journals. A systematic search was conducted on Medline through Web of Knowledge, PubMed, and journal websites to identify observational prognostic studies with multivariable analysis published in Anesthesiology, Anesthesia & Analgesia, British Journal of Anaesthesia, and Anaesthesia in 2010 and 2011. Data were extracted by 2 independent readers. First, studies were analyzed with respect to reporting of outcomes, design, size, methods of analysis, model performance (discrimination and calibration), model validation, clinical usefulness, and STROBE (i.e., Strengthening the Reporting of Observational Studies in Epidemiology) checklist. A reporting rate was calculated on the basis of 21 items of the aforementioned points. Second, they were analyzed with respect to some predefined methodological points. Eighty-six studies were included: 87.2% were explanatory and 80.2% investigated a postoperative event. The reporting was fairly good, with a median reporting rate of 79% (75% in explanatory studies and 100% in predictive studies). 
Six items had a reporting rate <36% (i.e., the 25th percentile), with some of them not identified in the STROBE checklist: blinded evaluation of the outcome (11.9%), reason for sample size (15.1%), handling of missing data (36.0%), assessment of collinearity (17.4%), assessment of interactions (13.9%), and calibration (34.9%). When reported, a few methodological shortcomings were observed, both in explanatory and predictive studies, such as an insufficient number of events of the outcome (44.6%), exclusion of cases with missing data (93.6%), or categorization of continuous variables (65.1%). The reporting of multivariable analysis was fairly good and could be further improved by checking reporting guidelines and the EQUATOR Network website. Limiting the number of candidate variables, including cases with missing data, and not arbitrarily categorizing continuous variables should be encouraged.
NASA Astrophysics Data System (ADS)
Tsao, Sinchai; Gajawelli, Niharika; Zhou, Jiayu; Shi, Jie; Ye, Jieping; Wang, Yalin; Lepore, Natasha
2014-03-01
Prediction of Alzheimer's disease (AD) progression based on baseline measures allows us to understand disease progression and has implications for decisions concerning treatment strategy. To this end we combine a predictive multi-task machine learning method [1] with a novel MR-based multivariate morphometric surface map of the hippocampus [2] to predict future cognitive scores of patients. Previous work by Zhou et al. [1] has shown that a multi-task learning framework that performs prediction of all future time points (or tasks) simultaneously can be used to encode both sparsity and temporal smoothness. They showed that this can be used in predicting cognitive outcomes of Alzheimer's Disease Neuroimaging Initiative (ADNI) subjects based on FreeSurfer-based baseline MRI features, MMSE score, demographic information and ApoE status. Whilst volumetric information may hold generalized information on brain status, we hypothesized that hippocampus-specific information may be more useful in predictive modeling of AD. To this end, we applied Shi et al.'s [2] recently developed multivariate tensor-based morphometry (mTBM) parametric surface analysis method to extract features from the hippocampal surface. We show that by combining the power of the multi-task framework with the sensitivity of mTBM features of the hippocampal surface, we are able to significantly improve predictive performance of ADAS cognitive scores 6, 12, 24, 36 and 48 months from baseline.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
NASA Astrophysics Data System (ADS)
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
Computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed in the past decade to estimate human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method named higher-order multivariable polynomial regression to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects' affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide some indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-01-01
Computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed in the past decade to estimate human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method named higher-order multivariable polynomial regression to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects' affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide some indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states. PMID:26996254
Poláček, Roman; Májek, Pavel; Hroboňová, Katarína; Sádecká, Jana
2016-04-01
Fluoxetine is the most prescribed antidepressant chiral drug worldwide. Its enantiomers have a different duration of serotonin inhibition. A novel simple and rapid method for determination of the enantiomeric composition of fluoxetine in pharmaceutical pills is presented. Specifically, emission, excitation, and synchronous fluorescence techniques were employed to obtain the spectral data, which were analyzed with multivariate calibration methods, namely principal component regression (PCR) and partial least squares (PLS). The chiral recognition of fluoxetine enantiomers in the presence of β-cyclodextrin was based on diastereomeric complexes. The results of the multivariate calibration modeling indicated good prediction abilities. The results obtained for tablets were compared with those from chiral HPLC, and no significant differences were found by Fisher's (F) test and Student's t-test. The smallest residuals between reference or nominal values and predicted values were achieved by multivariate calibration of synchronous fluorescence spectral data. This conclusion is supported by calculated values of the figure of merit.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rupšys, P.
A system of stochastic differential equations (SDE) with mixed-effects parameters and a multivariate normal copula density function were used to develop a tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside bark diameter at breast height and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameters SDE tree height model calculated during this research were compared to the regression tree height equations. The results are implemented in the symbolic computational language MAPLE.
Cella, Laura; Liuzzi, Raffaele; Conson, Manuel; D'Avino, Vittoria; Salvatore, Marco; Pacelli, Roberto
2012-12-27
Hypothyroidism is a frequent late side effect of radiation therapy of the cervical region. The purpose of this work is to develop multivariate normal tissue complication probability (NTCP) models for radiation-induced hypothyroidism (RHT) and to compare them with already existing NTCP models for RHT. Fifty-three patients treated with sequential chemo-radiotherapy for Hodgkin's lymphoma (HL) were retrospectively reviewed for RHT events. Clinical information along with thyroid gland dose distribution parameters were collected and their correlation to RHT was analyzed by Spearman's rank correlation coefficient (Rs). A multivariate logistic regression method using resampling methods (bootstrapping) was applied to select model order and parameters for NTCP modeling. Model performance was evaluated through the area under the receiver operating characteristic curve (AUC). Models were tested against external published data on RHT and compared with other published NTCP models. If we express the thyroid volume exceeding X Gy as a percentage (Vx(%)), a two-variable NTCP model including V30(%) and gender proved to be the optimal predictive model for RHT (Rs = 0.615, p < 0.001; AUC = 0.87). Conversely, if absolute thyroid volume exceeding X Gy (Vx(cc)) was analyzed, an NTCP model based on 3 variables including V30(cc), thyroid gland volume and gender was selected as the most predictive model (Rs = 0.630, p < 0.001; AUC = 0.85). The three-variable model performs better when tested on an external cohort characterized by large inter-individual variation in thyroid volumes (AUC = 0.914, 95% CI 0.760-0.984). A comparable performance was found between our model and that proposed in the literature based on thyroid gland mean dose and volume (p = 0.264). 
The absolute volume of thyroid gland exceeding 30 Gy in combination with thyroid gland volume and gender provides an NTCP model for RHT with improved prediction capability not only within our patient population but also in an external cohort.
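The AUC used to evaluate these NTCP models can be computed directly from predicted probabilities and observed outcomes. The sketch below uses the pairwise (Mann-Whitney) definition; the function name and the example scores are hypothetical and are not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case (label 1) receives
    a higher score than a randomly chosen control (label 0); ties count 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predicted complication probabilities and observed outcomes.
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

The quadratic loop is fine for cohorts of this size (tens of patients); a rank-based formula would be preferable for large data sets.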
Girard, Jean-Michel; Deschênes, Jean-Sébastien; Tremblay, Réjean; Gagnon, Jonathan
2013-09-01
The objective of this work is to develop a quick and simple method for the in situ monitoring of sugars in biological cultures. A new technology based on Attenuated Total Reflectance-Fourier Transform Infrared (FT-IR/ATR) spectroscopy in combination with an external light guiding fiber probe was tested, first to build predictive models from solutions of pure sugars, and secondly to use those models to monitor the sugars in the complex culture medium of mixotrophic microalgae. Quantification results from the univariate model were correlated with the total dissolved solids content (R² = 0.74). A vector normalized multivariate model was used to proportionally quantify the different sugars present in the complex culture medium and showed a predictive accuracy of >90% for sugars representing >20% of the total. This method offers an alternative to conventional sugar monitoring assays and could be used at-line or on-line in commercial scale production systems. Copyright © 2013 Elsevier Ltd. All rights reserved.
Larrosa, José Manuel; Moreno-Montañés, Javier; Martinez-de-la-Casa, José María; Polo, Vicente; Velázquez-Villoria, Álvaro; Berrozpe, Clara; García-Granero, Marta
2015-10-01
The purpose of this study was to develop and validate a multivariate predictive model to detect glaucoma by using a combination of retinal nerve fiber layer (RNFL), retinal ganglion cell-inner plexiform (GCIPL), and optic disc parameters measured using spectral-domain optical coherence tomography (OCT). Five hundred eyes from 500 participants and 187 eyes of another 187 participants were included in the study and validation groups, respectively. Patients with glaucoma were classified in five groups based on visual field damage. Sensitivity and specificity of all glaucoma OCT parameters were analyzed. Receiver operating characteristic curves (ROC) and areas under the ROC (AUC) were compared. Three predictive multivariate models (quantitative, qualitative, and combined) that used a combination of the best OCT parameters were constructed. A diagnostic calculator was created using the combined multivariate model. The best AUC parameters were: inferior RNFL, average RNFL, vertical cup/disc ratio, minimal GCIPL, and inferior-temporal GCIPL. Comparisons among the parameters did not show that the GCIPL parameters were better than those of the RNFL in early and advanced glaucoma. The highest AUC was in the combined predictive model (0.937; 95% confidence interval, 0.911-0.957) and was significantly (P = 0.0001) higher than the other isolated parameters considered in early and advanced glaucoma. The validation group displayed similar results to those of the study group. Best GCIPL, RNFL, and optic disc parameters showed a similar ability to detect glaucoma. The combined predictive formula improved the glaucoma detection compared to the best isolated parameters evaluated. The diagnostic calculator obtained good classification from participants in both the study and validation groups.
Agarwal, Shivani; Jawad, Abbas F; Miller, Victoria A
2016-11-01
The current study examined how a comprehensive set of variables from multiple domains, including at the adolescent and family level, were predictive of glycemic control in adolescents with type 1 diabetes (T1D). Participants included 100 adolescents with T1D ages 10-16 yrs and their parents. Participants were enrolled in a longitudinal study about youth decision-making involvement in chronic illness management of which the baseline data were available for analysis. Bivariate associations with glycemic control (HbA1C) were tested. Hierarchical linear regression was implemented to inform the predictive model. In bivariate analyses, race, family structure, household income, insulin regimen, adolescent-reported adherence to diabetes self-management, cognitive development, adolescent responsibility for T1D management, and parent behavior during the illness management discussion were associated with HbA1c. In the multivariate model, the only significant predictors of HbA1c were race and insulin regimen, accounting for 17% of the variance. Caucasians had better glycemic control than other racial groups. Participants using pre-mixed insulin therapy and basal-bolus insulin had worse glycemic control than those on insulin pumps. This study shows that despite associations of adolescent and family-level variables with glycemic control at the bivariate level, only race and insulin regimen are predictive of glycemic control in hierarchical multivariate analyses. This model offers an alternative way to examine the relationship of demographic and psychosocial factors on glycemic control in adolescents with T1D. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Ping; Song, Heda; Wang, Hong
The blast furnace (BF) in ironmaking is a nonlinear dynamic process with complicated physical-chemical reactions, where multi-phase and multi-field coupling and large time delay occur during its operation. In BF operation, the molten iron temperature (MIT) as well as Si, P and S contents of molten iron are the most essential molten iron quality (MIQ) indices, whose measurement, modeling and control have always been important issues in the metallurgical engineering and automation field. This paper develops a novel data-driven nonlinear state space modeling approach for the prediction and control of multivariate MIQ indices by integrating hybrid modeling and control techniques. First, to improve modeling efficiency, a data-driven hybrid method combining canonical correlation analysis and correlation analysis is proposed to identify the most influential controllable variables as the modeling inputs from the many factors that affect the MIQ indices. Then, a Hammerstein model for the prediction of MIQ indices is established using the LS-SVM based nonlinear subspace identification method. Such a model is further simplified by using the piecewise cubic Hermite interpolating polynomial method to fit the complex nonlinear kernel function. Compared to the original Hammerstein model, this simplified model not only significantly reduces the computational complexity, but also has almost the same reliability and accuracy for a stable prediction of MIQ indices. Last, in order to verify the practicability of the developed model, it is applied in designing a genetic algorithm based nonlinear predictive controller for multivariate MIQ indices by directly taking the established model as a predictor. Industrial experiments show the advantages and effectiveness of the proposed approach.
Schmidt, A F; Nielen, M; Withrow, S J; Selmic, L E; Burton, J H; Klungel, O H; Groenwold, R H H; Kirpensteijn, J
2016-03-01
Canine osteosarcoma is the most common bone cancer, and an important cause of mortality and morbidity, in large purebred dogs. Previously we constructed two multivariable models to predict a dog's 5-month or 1-year mortality risk after surgical treatment for osteosarcoma. According to the 5-month model, dogs with a relatively low risk of 5-month mortality benefited most from additional chemotherapy treatment. In the present study, we externally validated these results using an independent cohort study of 794 dogs. External performance of our prediction models showed some disagreement between observed and predicted risk, mean difference: -0.11 (95% confidence interval [95% CI] -0.29; 0.08) for 5-month risk and 0.25 (95% CI 0.10; 0.40) for 1-year mortality risk. After updating the intercept, agreement improved: -0.0004 (95% CI -0.16; 0.16) and -0.002 (95% CI -0.15; 0.15). The chemotherapy by predicted mortality risk interaction (P-value = 0.01) showed that the effectiveness of chemotherapy compared to no chemotherapy was modified by 5-month mortality risk: dogs with a relatively lower risk of mortality benefited most from additional chemotherapy. Chemotherapy effectiveness on 1-year mortality was not significantly modified by predicted risk (P-value = 0.28). In conclusion, this external validation study confirmed that our multivariable risk prediction models can predict a patient's mortality risk and that dogs with a relatively lower risk of 5-month mortality seem to benefit most from chemotherapy. Copyright © 2016 Elsevier B.V. All rights reserved.
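The intercept update mentioned in this abstract can be sketched for a logistic risk model as follows. The sketch assumes the original model supplies a linear predictor per subject and searches for the single shift that makes the mean predicted risk equal the observed event rate, which is the maximum-likelihood intercept when the linear predictor is held fixed as an offset. Whether the authors used exactly this procedure is not stated; the function names and data are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear predictor to a risk in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def update_intercept(linear_predictors, outcomes, tol=1e-10):
    """Return the intercept shift that recalibrates predictions 'in the large'.

    Bisection works because the sum of predicted risks increases
    monotonically with the shift.
    """
    target = sum(outcomes)
    if not 0 < target < len(outcomes):
        raise ValueError("outcomes must contain both events and non-events")
    lo, hi = -30.0, 30.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(sigmoid(lp + mid) for lp in linear_predictors)
        if expected < target:
            lo = mid  # predictions still too low: shift upward
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical validation cohort: original linear predictors and outcomes.
lps = [-2.0, -1.0, 0.0, 1.0]
died = [0, 1, 1, 1]
shift = update_intercept(lps, died)
```

After the shift, the mean difference between observed and predicted risk is zero by construction, which is consistent with the near-zero differences reported after updating.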
Lefcheck, Jonathan S; Duffy, J Emmett
2015-11-01
The use of functional traits to explain how biodiversity affects ecosystem functioning has attracted intense interest, yet few studies have a priori altered functional diversity, especially in multitrophic communities. Here, we manipulated multivariate functional diversity of estuarine grazers and predators within multiple levels of species richness to test how species richness and functional diversity predicted ecosystem functioning in a multitrophic food web. Community functional diversity was a better predictor than species richness for the majority of ecosystem properties, based on generalized linear mixed-effects models. Combining inferences from eight traits into a single multivariate index increased prediction accuracy of these models relative to any individual trait. Structural equation modeling revealed that functional diversity of both grazers and predators was important in driving final biomass within trophic levels, with stronger effects observed for predators. We also show that different species drove different ecosystem responses, with evidence for both sampling effects and complementarity. Our study extends experimental investigations of functional trait diversity to a multilevel food web, and demonstrates that functional diversity can be more accurate and effective than species richness in predicting community biomass in a food web context.
A multivariate model of parent-adolescent relationship variables in early adolescence.
McKinney, Cliff; Renk, Kimberly
2011-08-01
Given the importance of predicting outcomes for early adolescents, this study examines a multivariate model of parent-adolescent relationship variables, including parenting, family environment, and conflict. Participants, who completed measures assessing these variables, included 710 culturally diverse 11-14-year-olds who were attending a middle school in a Southeastern state. The parents of a subset of these adolescents (i.e., 487 mother-father pairs) participated in this study as well. Correlational analyses indicate that authoritative and authoritarian parenting, family cohesion and adaptability, and conflict are significant predictors of early adolescents' internalizing and externalizing problems. Structural equation modeling analyses indicate that fathers' parenting may not predict directly externalizing problems in male and female adolescents but instead may act through conflict. More direct relationships exist when examining mothers' parenting. The impact of parenting, family environment, and conflict on early adolescents' internalizing and externalizing problems and the importance of both gender and cross-informant ratings are emphasized.
Grünhut, Marcos; Garrido, Mariano; Centurión, Maria E; Fernández Band, Beatriz S
2010-07-12
A combination of kinetic spectroscopic monitoring and multivariate curve resolution-alternating least squares (MCR-ALS) was proposed for the enzymatic determination of levodopa (LVD) and carbidopa (CBD) in pharmaceuticals. The enzymatic reaction process was carried out in a reverse stopped-flow injection system and monitored by UV-vis spectroscopy. The spectra (292-600 nm) were recorded throughout the reaction and were analyzed by multivariate curve resolution-alternating least squares. A small calibration matrix containing nine mixtures was used in the model construction. Additionally, to evaluate the prediction ability of the model, a set with six validation mixtures was used. The lack of fit obtained was 4.3%, the explained variance 99.8% and the overall prediction error 5.5%. Tablets of commercial samples were analyzed and the results were validated by pharmacopeia method (high performance liquid chromatography). No significant differences were found (alpha=0.05) between the reference values and the ones obtained with the proposed method. It is important to note that a unique chemometric model made it possible to determine both analytes simultaneously. Copyright 2010 Elsevier B.V. All rights reserved.
Hoffman, Haydn; Lee, Sunghoon I; Garst, Jordan H; Lu, Derek S; Li, Charles H; Nagasawa, Daniel T; Ghalehsari, Nima; Jahanforouz, Nima; Razaghy, Mehrdad; Espinal, Marie; Ghavamrezaii, Amir; Paak, Brian H; Wu, Irene; Sarrafzadeh, Majid; Lu, Daniel C
2015-09-01
This study introduces the use of multivariate linear regression (MLR) and support vector regression (SVR) models to predict postoperative outcomes in a cohort of patients who underwent surgery for cervical spondylotic myelopathy (CSM). Currently, predicting outcomes after surgery for CSM remains a challenge. We recruited patients who had a diagnosis of CSM and required decompressive surgery with or without fusion. Fine motor function was tested preoperatively and postoperatively with a handgrip-based tracking device that has been previously validated, yielding mean absolute accuracy (MAA) results for two tracking tasks (sinusoidal and step). All patients completed Oswestry disability index (ODI) and modified Japanese Orthopaedic Association questionnaires preoperatively and postoperatively. Preoperative data was utilized in MLR and SVR models to predict postoperative ODI. Predictions were compared to the actual ODI scores with the coefficient of determination (R(2)) and mean absolute difference (MAD). From this, 20 patients met the inclusion criteria and completed follow-up at least 3 months after surgery. With the MLR model, a combination of the preoperative ODI score, preoperative MAA (step function), and symptom duration yielded the best prediction of postoperative ODI (R(2)=0.452; MAD=0.0887; p=1.17 × 10(-3)). With the SVR model, a combination of preoperative ODI score, preoperative MAA (sinusoidal function), and symptom duration yielded the best prediction of postoperative ODI (R(2)=0.932; MAD=0.0283; p=5.73 × 10(-12)). The SVR model was more accurate than the MLR model. The SVR can be used preoperatively in risk/benefit analysis and the decision to operate. Copyright © 2015 Elsevier Ltd. All rights reserved.
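The two agreement measures named in the abstract above, the coefficient of determination and the mean absolute difference, can be sketched as follows. This is a minimal illustration, assuming R² is computed as one minus the ratio of residual to total sum of squares (the abstract does not state which definition was used); the score lists are hypothetical and are not data from the study.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_difference(actual, predicted):
    """Average of the absolute prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical postoperative ODI scores expressed as fractions (illustration only).
actual_odi = [0.10, 0.20, 0.30, 0.40]
predicted_odi = [0.12, 0.18, 0.33, 0.37]

r2 = r_squared(actual_odi, predicted_odi)
mad = mean_absolute_difference(actual_odi, predicted_odi)
```

A higher R² together with a lower MAD is the pattern the abstract reports for the SVR model relative to the MLR model.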
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sakaguchi, Kaori; Nagatsuma, Tsutomu; Reeves, Geoffrey D.
The Van Allen radiation belts surrounding the Earth are filled with MeV-energy electrons. This region poses ionizing radiation risks for spacecraft that operate within it, including those in geostationary orbit (GEO) and medium Earth orbit. In order to provide alerts of electron flux enhancements, 16 prediction models of the electron log-flux variation throughout the equatorial outer radiation belt as a function of the McIlwain L parameter were developed using the multivariate autoregressive model and Kalman filter. Measurements of omnidirectional 2.3 MeV electron flux from the Van Allen Probes mission as well as >2 MeV electrons from the GOES 15 spacecraft were used as the predictors. Furthermore, we selected model explanatory parameters from solar wind parameters, the electron log-flux at GEO, and geomagnetic indices. For the innermost region of the outer radiation belt, the electron flux is best predicted by using the Dst index as the sole input parameter. For the central to outermost regions, at L ≥ 4.8 and L ≥ 5.6, the electron flux is predicted most accurately by including also the solar wind velocity and then the dynamic pressure, respectively. The Dst index is the best overall single parameter for predicting at 3 ≤ L ≤ 6, while for the GEO flux prediction, the Kp index is better than Dst. Finally, a test calculation demonstrates that the model successfully predicts the timing and location of the flux maximum as much as 2 days in advance and that the electron flux decreases faster with time at higher L values, both model features consistent with the actually observed behavior.
Sakaguchi, Kaori; Nagatsuma, Tsutomu; Reeves, Geoffrey D.; ...
2015-12-22
The Van Allen radiation belts surrounding the Earth are filled with MeV-energy electrons. This region poses ionizing radiation risks for spacecraft that operate within it, including those in geostationary orbit (GEO) and medium Earth orbit. In order to provide alerts of electron flux enhancements, 16 prediction models of the electron log-flux variation throughout the equatorial outer radiation belt as a function of the McIlwain L parameter were developed using the multivariate autoregressive model and Kalman filter. Measurements of omnidirectional 2.3 MeV electron flux from the Van Allen Probes mission as well as >2 MeV electrons from the GOES 15 spacecraft were used as the predictors. Furthermore, we selected model explanatory parameters from solar wind parameters, the electron log-flux at GEO, and geomagnetic indices. For the innermost region of the outer radiation belt, the electron flux is best predicted by using the Dst index as the sole input parameter. For the central to outermost regions, at L ≥ 4.8 and L ≥ 5.6, the electron flux is predicted most accurately by including also the solar wind velocity and then the dynamic pressure, respectively. The Dst index is the best overall single parameter for predicting at 3 ≤ L ≤ 6, while for the GEO flux prediction, the Kp index is better than Dst. Finally, a test calculation demonstrates that the model successfully predicts the timing and location of the flux maximum as much as 2 days in advance and that the electron flux decreases faster with time at higher L values, both model features consistent with the actually observed behavior.
NASA Astrophysics Data System (ADS)
Sakaguchi, Kaori; Nagatsuma, Tsutomu; Reeves, Geoffrey D.; Spence, Harlan E.
2015-12-01
The Van Allen radiation belts surrounding the Earth are filled with MeV-energy electrons. This region poses ionizing radiation risks for spacecraft that operate within it, including those in geostationary orbit (GEO) and medium Earth orbit. To provide alerts of electron flux enhancements, 16 prediction models of the electron log-flux variation throughout the equatorial outer radiation belt as a function of the McIlwain L parameter were developed using the multivariate autoregressive model and Kalman filter. Measurements of omnidirectional 2.3 MeV electron flux from the Van Allen Probes mission as well as >2 MeV electrons from the GOES 15 spacecraft were used as the predictors. Model explanatory parameters were selected from solar wind parameters, the electron log-flux at GEO, and geomagnetic indices. For the innermost region of the outer radiation belt, the electron flux is best predicted by using the Dst index as the sole input parameter. For the central to outermost regions, at L ≧ 4.8 and L ≧ 5.6, the electron flux is predicted most accurately by including also the solar wind velocity and then the dynamic pressure, respectively. The Dst index is the best overall single parameter for predicting at 3 ≦ L ≦ 6, while for the GEO flux prediction, the KP index is better than Dst. A test calculation demonstrates that the model successfully predicts the timing and location of the flux maximum as much as 2 days in advance and that the electron flux decreases faster with time at higher L values, both model features consistent with the actually observed behavior.
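The forecasting scheme described above pairs an autoregressive model with a Kalman filter. The following is a minimal scalar sketch of that pairing, assuming a first-order autoregressive state observed with additive noise. The study itself used multivariate models with solar wind and geomagnetic inputs; the coefficient `a`, the noise variances `q` and `r`, and the observation values here are hypothetical.

```python
def kalman_ar1_step(x_est, p_est, y_obs, a, q, r):
    """One predict/update cycle for the state model x_t = a * x_{t-1} + w_t
    observed as y_t = x_t + v_t, with Var(w) = q and Var(v) = r."""
    # Predict: propagate the previous estimate and its variance one step ahead.
    x_pred = a * x_est
    p_pred = a * a * p_est + q
    # Update: blend the forecast with the new observation via the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (y_obs - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new, x_pred


def run_filter(observations, a=0.9, q=0.1, r=0.5, x0=0.0, p0=1.0):
    """Filter a sequence and collect the one-step-ahead forecasts."""
    x, p = x0, p0
    forecasts = []
    for y in observations:
        x, p, x_pred = kalman_ar1_step(x, p, y, a, q, r)
        forecasts.append(x_pred)
    return x, p, forecasts


# Hypothetical log-flux observations (illustration only).
final_state, final_variance, one_step_forecasts = run_filter([1.0, 1.2, 0.9, 1.1])
```

The one-step-ahead forecast is made before each observation is seen, which is the quantity an alerting system would compare against later measurements.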
Yilmaz, Banu; Aras, Egemen; Nacar, Sinan; Kankal, Murat
2018-05-23
The functional life of a dam is often determined by the rate of sediment delivery to its reservoir. Therefore, an accurate estimate of the sediment load in rivers with dams is essential for designing and predicting a dam's useful lifespan. The most credible method is direct measurements of sediment input, but this can be very costly and it cannot always be implemented at all gauging stations. In this study, we tested various regression models to estimate suspended sediment load (SSL) at two gauging stations on the Çoruh River in Turkey, including artificial bee colony (ABC), teaching-learning-based optimization algorithm (TLBO), and multivariate adaptive regression splines (MARS). These models were also compared with one another and with classical regression analyses (CRA). Streamflow values and previously collected data of SSL were used as model inputs with predicted SSL data as output. Two different training and testing dataset configurations were used to reinforce the model accuracy. For the MARS method, the root mean square error value was found to range between 35% and 39% for the test data at the two gauging stations, which was lower than errors for other models. Error values were even lower (7% to 15%) using another dataset. Our results indicate that simultaneous measurements of streamflow with SSL provide the most effective parameter for obtaining accurate predictive models and that MARS is the most accurate model for predicting SSL. Copyright © 2017 Elsevier B.V. All rights reserved.
Bello, Alessandra; Bianchi, Federica; Careri, Maria; Giannetto, Marco; Mori, Giovanni; Musci, Marilena
2007-11-05
A new NIR method based on multivariate calibration for determination of ethanol in industrially packed wholemeal bread was developed and validated. GC-FID was used as the reference method for determining the actual ethanol concentration of different samples of wholemeal bread with a known content of added ethanol, ranging from 0 to 3.5% (w/w). Stepwise discriminant analysis was carried out on the NIR dataset in order to reduce the number of original variables by selecting those that were able to discriminate between the samples of different ethanol concentrations. With the variables so selected, a multivariate calibration model was then obtained by multiple linear regression. The prediction power of the linear model was optimized by a new "leave one out" method, so that the number of original variables was further reduced.
Ali, Arif N; Switchenko, Jeffrey M; Kim, Sungjin; Kowalski, Jeanne; El-Deiry, Mark W; Beitler, Jonathan J
2014-11-15
The current study was conducted to develop a multifactorial statistical model to predict the specific head and neck (H&N) tumor site origin in cases of squamous cell carcinoma confined to the cervical lymph nodes ("unknown primaries"). The Surveillance, Epidemiology, and End Results (SEER) database was analyzed for patients with an H&N tumor site who were diagnosed between 2004 and 2011. The SEER patients were identified according to their H&N primary tumor site and clinically positive cervical lymph node levels at the time of presentation. The SEER patient data set was randomly divided into 2 data sets for the purposes of internal split-sample validation. The effects of cervical lymph node levels, age, race, and sex on H&N primary tumor site were examined using univariate and multivariate analyses. Multivariate logistic regression models and an associated set of nomograms were developed based on relevant factors to provide probabilities of tumor site origin. Analysis of the SEER database identified 20,011 patients with H&N disease with both site-level and lymph node-level data. Sex, race, age, and lymph node levels were associated with primary H&N tumor site (nasopharynx, hypopharynx, oropharynx, and larynx) in the multivariate models. Internal validation techniques affirmed the accuracy of these models on separate data. The incorporation of epidemiologic and lymph node data into a predictive model has the potential to provide valuable guidance to clinicians in the treatment of patients with squamous cell carcinoma confined to the cervical lymph nodes. © 2014 The Authors. Cancer published by Wiley Periodicals, Inc. on behalf of American Cancer Society.
Du, Juan; Yang, Fang; Zhang, Zhiqiang; Hu, Jingze; Xu, Qiang; Hu, Jianping; Zeng, Fanyong; Lu, Guangming; Liu, Xinfeng
2018-05-15
An accurate prediction of long term outcome after stroke is urgently required to provide early individualized neurorehabilitation. This study aimed to examine the added value of early neuroimaging measures and identify the best approaches for predicting motor outcome after stroke. This prospective study involved 34 first-ever ischemic stroke patients (time since stroke: 1-14 days) with upper limb impairment. All patients underwent baseline multimodal assessments that included clinical (age, motor impairment), neurophysiological (motor-evoked potentials, MEP) and neuroimaging (diffusion tensor imaging and motor task-based fMRI) measures, and also underwent reassessment 3 months after stroke. Bivariate analysis and multivariate linear regression models were used to predict the motor scores (Fugl-Meyer assessment, FMA) at 3 months post-stroke. With bivariate analysis, better motor outcome significantly correlated with (1) less initial motor impairment and disability, (2) less corticospinal tract injury, (3) the initial presence of MEPs, (4) stronger baseline motor fMRI activations. In multivariate analysis, incorporating neuroimaging data improved the predictive accuracy relative to only clinical and neurophysiological assessments. Baseline fMRI activation in SMA was an independent predictor of motor outcome after stroke. A multimodal model incorporating fMRI and clinical measures best predicted the motor outcome following stroke. fMRI measures obtained early after stroke provided independent prediction of long-term motor outcome.
NASA Astrophysics Data System (ADS)
Wang, Yunzhi; Qiu, Yuchen; Thai, Theresa; More, Kathleen; Ding, Kai; Liu, Hong; Zheng, Bin
2016-03-01
How to rationally identify epithelial ovarian cancer (EOC) patients who will benefit from bevacizumab or other antiangiogenic therapies is a critical issue in EOC treatments. The motivation of this study is to quantitatively measure adiposity features from CT images and investigate the feasibility of predicting potential benefit of EOC patients with or without receiving bevacizumab-based chemotherapy treatment using multivariate statistical models built based on quantitative adiposity image features. A dataset involving CT images from 59 advanced EOC patients was included. Among them, 32 patients received maintenance bevacizumab after primary chemotherapy and the remaining 27 patients did not. We developed a computer-aided detection (CAD) scheme to automatically segment subcutaneous fat areas (SFA) and visceral fat areas (VFA) and then extracted 7 adiposity-related quantitative features. Three multivariate data analysis models (linear regression, logistic regression and Cox proportional hazards regression) were applied respectively to investigate the potential association between the model-generated prediction results and the patients' progression-free survival (PFS) and overall survival (OS). The results show that using all 3 statistical models, a statistically significant association was detected between the model-generated results and both of the two clinical outcomes in the group of patients receiving maintenance bevacizumab (p<0.01), while there was no significant association for either PFS or OS in the group of patients not receiving maintenance bevacizumab. Therefore, this study demonstrated the feasibility of using statistical prediction models based on quantitative adiposity-related CT image features to generate a new clinical marker and predict the clinical outcome of EOC patients receiving maintenance bevacizumab-based chemotherapy.
Predicting clinical diagnosis in Huntington's disease: An imaging polymarker
Daws, Richard E.; Soreq, Eyal; Johnson, Eileanoir B.; Scahill, Rachael I.; Tabrizi, Sarah J.; Barker, Roger A.; Hampshire, Adam
2018-01-01
Objective: Huntington's disease (HD) gene carriers can be identified before clinical diagnosis; however, statistical models for predicting when overt motor symptoms will manifest are too imprecise to be useful at the level of the individual. Perfecting this prediction is integral to the search for disease modifying therapies. This study aimed to identify an imaging marker capable of reliably predicting real‐life clinical diagnosis in HD. Method: A multivariate machine learning approach was applied to resting‐state and structural magnetic resonance imaging scans from 19 premanifest HD gene carriers (preHD, 8 of whom developed clinical disease in the 5 years postscanning) and 21 healthy controls. A classification model was developed using cross‐group comparisons between preHD and controls, and within the preHD group in relation to “estimated” and “actual” proximity to disease onset. Imaging measures were modeled individually, and combined, and permutation modeling robustly tested classification accuracy. Results: Classification performance for preHDs versus controls was greatest when all measures were combined. The resulting polymarker predicted converters with high accuracy, including those who were not expected to manifest in that time scale based on the currently adopted statistical models. Interpretation: We propose that a holistic multivariate machine learning treatment of brain abnormalities in the premanifest phase can be used to accurately identify those patients within 5 years of developing motor features of HD, with implications for prognostication and preclinical trials. Ann Neurol 2018;83:532–543 PMID:29405351
Vedder, Moniek M; de Bekker-Grob, Esther W; Lilja, Hans G; Vickers, Andrew J; van Leenders, Geert J L H; Steyerberg, Ewout W; Roobol, Monique J
2014-12-01
Prostate-specific antigen (PSA) testing has limited accuracy for the early detection of prostate cancer (PCa). To assess the value added by percentage of free to total PSA (%fPSA), prostate cancer antigen 3 (PCA3), and a kallikrein panel (4k-panel) to the European Randomised Study of Screening for Prostate Cancer (ERSPC) multivariable prediction models: risk calculator (RC) 4, including transrectal ultrasound, and RC 4 plus digital rectal examination (4+DRE) for prescreened men. Participants were invited for rescreening between October 2007 and February 2009 within the Dutch part of the ERSPC study. Biopsies were taken in men with a PSA level ≥3.0 ng/ml or a PCA3 score ≥10. Additional analyses of the 4k-panel were done on serum samples. Outcome was defined as PCa detectable by sextant biopsy. Receiver operating characteristic curve and decision curve analyses were performed to compare the predictive capabilities of %fPSA, PCA3, 4k-panel, the ERSPC RCs, and their combinations in logistic regression models. PCa was detected in 119 of 708 men. The %fPSA did not perform better univariately or added to the RCs compared with the RCs alone. In 202 men with an elevated PSA, the 4k-panel discriminated better than PCA3 when modelled univariately (area under the curve [AUC]: 0.78 vs. 0.62; p=0.01). The multivariable models with PCA3 or the 4k-panel were equivalent (AUC: 0.80 for RC 4+DRE). In the total population, PCA3 discriminated better than the 4k-panel (univariate AUC: 0.63 vs. 0.56; p=0.05). There was no statistically significant difference between the multivariable model with PCA3 (AUC: 0.73) versus the model with the 4k-panel (AUC: 0.71; p=0.18). The multivariable model with PCA3 performed better than the reference model (0.73 vs. 0.70; p=0.02). Decision curves confirmed these patterns, although numbers were small. Both PCA3 and, to a lesser extent, a 4k-panel have added value to the DRE-based ERSPC RC in detecting PCa in prescreened men. 
We studied the added value of novel biomarkers to previously developed risk prediction models for prostate cancer. We found that inclusion of these biomarkers resulted in an increase in predictive ability. Copyright © 2014. Published by Elsevier B.V.
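The area under the ROC curve (AUC) used above to compare biomarkers equals the probability that a randomly chosen case receives a higher score than a randomly chosen non-case. A minimal sketch of that pairwise definition follows; the scores and labels are hypothetical and unrelated to the study data, and the function name is chosen here for illustration.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.
    Ties count as half a correct ranking. Labels are 1 (case) or 0 (non-case)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores for four men, two of whom had cancer on biopsy.
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which values such as 0.62 and 0.78 in the abstract should be read.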
Menon, Ramkumar; Bhat, Geeta; Saade, George R; Spratt, Heidi
2014-04-01
To develop classification models of demographic/clinical factors and biomarker data from spontaneous preterm birth in African Americans and Caucasians. Secondary analysis of biomarker data using multivariate adaptive regression splines (MARS), a supervised machine learning algorithm method. Analysis of data on 36 biomarkers from 191 women was reduced by MARS to develop predictive models for preterm birth in African Americans and Caucasians. Maternal plasma, cord plasma collected at admission for preterm or term labor and amniotic fluid at delivery. Data were partitioned into training and testing sets. Variable importance, a relative indicator (0-100%) and area under the receiver operating characteristic curve (AUC) characterized results. Multivariate adaptive regression splines generated models for combined and racially stratified biomarker data. Clinical and demographic data did not contribute to the model. Racial stratification of data produced distinct models in all three compartments. In African Americans maternal plasma samples IL-1RA, TNF-α, angiopoietin 2, TNFRI, IL-5, MIP1α, IL-1β and TGF-α modeled preterm birth (AUC train: 0.98, AUC test: 0.86). In Caucasians TNFR1, ICAM-1 and IL-1RA contributed to the model (AUC train: 0.84, AUC test: 0.68). African Americans cord plasma samples produced IL-12P70, IL-8 (AUC train: 0.82, AUC test: 0.66). Cord plasma in Caucasians modeled IGFII, PDGFBB, TGF-β1, IL-12P70, and TIMP1 (AUC train: 0.99, AUC test: 0.82). Amniotic fluid in African Americans modeled FasL, TNFRII, RANTES, KGF, IGFI (AUC train: 0.95, AUC test: 0.89) and in Caucasians, TNF-α, MCP3, TGF-β3, TNFR1 and angiopoietin 2 (AUC train: 0.94, AUC test: 0.79). Multivariate adaptive regression splines models multiple biomarkers associated with preterm birth and demonstrated racial disparity. © 2014 Nordic Federation of Societies of Obstetrics and Gynecology.
Genome-Wide Association Analysis of Adaptation Using Environmentally Predicted Traits.
van Heerwaarden, Joost; van Zanten, Martijn; Kruijer, Willem
2015-10-01
Current methods for studying the genetic basis of adaptation evaluate genetic associations with ecologically relevant traits or single environmental variables, under the implicit assumption that natural selection imposes correlations between phenotypes, environments and genotypes. In practice, observed trait and environmental data are manifestations of unknown selective forces and are only indirectly associated with adaptive genetic variation. In theory, improved estimation of these forces could enable more powerful detection of loci under selection. Here we present an approach in which we approximate adaptive variation by modeling phenotypes as a function of the environment and using the predicted trait in multivariate and univariate genome-wide association analysis (GWAS). Based on computer simulations and published flowering time data from the model plant Arabidopsis thaliana, we find that environmentally predicted traits lead to higher recovery of functional loci in multivariate GWAS and are more strongly correlated to allele frequencies at adaptive loci than individual environmental variables. Our results provide an example of the use of environmental data to obtain independent and meaningful information on adaptive genetic variation.
Paul, Christoph; Heun, Christine; Müller, Hans-Helge; Hoerauf, Hans; Feltgen, Nicolas; Wachtlin, Joachim; Kaymak, Hakan; Mennel, Stefan; Koss, Michael Janusz; Fauser, Sascha; Maier, Mathias M; Schumann, Ricarda G; Mueller, Simone; Chang, Petrus; Schmitz-Valckenberg, Steffen; Kazerounian, Sara; Szurman, Peter; Lommatzsch, Albrecht; Bertelmann, Thomas
2017-10-31
To evaluate predictive factors for the treatment success of ocriplasmin and to use these factors to generate a multivariate model to calculate the individual probability of successful treatment. Data were collected in a retrospective, multicentre cohort study. Patients with vitreomacular traction (VMT) syndrome without a full-thickness macular hole were included if they received an intravitreal injection (IVI) of ocriplasmin. Five factors (age, gender, lens status, presence of epiretinal membrane (ERM) formation and horizontal diameter of VMT) were assessed on their association with VMT resolution. A multivariable logistic regression model was employed to further analyse these factors and calculate the individual probability of successful treatment. 167 eyes of 167 patients were included. Univariate analysis revealed a significant correlation to VMT resolution for all analysed factors: age (years) (OR 0.9208; 95% CI 0.8845 to 0.9586; p<0.0001), gender (male) (OR 0.480; 95% CI 0.241 to 0.957; p=0.0371), lens status (phakic) (OR 2.042; 95% CI 1.054 to 3.958; p=0.0344), ERM formation (present) (OR 0.384; 95% CI 0.179 to 0.821; p=0.0136) and horizontal VMT diameter (µm) (OR 0.99812; 95% CI 0.99684 to 0.99941, p=0.0042). A significant multivariable logistic regression model was established with age and VMT diameter. Known predictive factors for VMT resolution after ocriplasmin IVI were confirmed in our study. We were able to combine them into a formula, ultimately allowing the calculation of an individual probability of treatment success with ocriplasmin in patients with VMT syndrome without FTHM. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
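The per-unit odds ratios reported above relate to logistic regression coefficients through the natural logarithm, and a fitted model turns a linear combination of predictors into a probability. The sketch below illustrates both steps. It uses the univariate odds ratio for age (0.9208 per year) from the abstract; the abstract does not give the coefficients or intercept of the final multivariable formula, so any intercept or coefficient passed to `logistic_probability` would be a hypothetical value.

```python
import math


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient implied by a per-unit odds ratio."""
    return math.log(odds_ratio)


def odds_ratio_over(odds_ratio_per_unit, units):
    """Odds ratio accumulated over several units of the predictor."""
    return odds_ratio_per_unit ** units


def logistic_probability(intercept, coefficients, values):
    """Probability from a logistic model: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-linear))


# Univariate odds ratio for age taken from the abstract (per year of age).
age_coefficient = log_odds_coefficient(0.9208)
# Implied change in the odds of VMT resolution for a patient ten years older.
ten_year_odds_ratio = odds_ratio_over(0.9208, 10)
```

Under this univariate estimate, a ten-year age difference multiplies the odds of resolution by roughly 0.44, which shows how a per-year ratio close to 1 can still be clinically meaningful.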
Prediction of wastewater treatment plants performance based on artificial fish school neural network
NASA Astrophysics Data System (ADS)
Zhang, Ruicheng; Li, Chong
2011-10-01
A reliable model of a wastewater treatment plant is essential as a tool for predicting its performance and as a basis for controlling the operation of the process. Such a model would help minimize operating costs and assess the stability of the environmental balance. To address the multi-variable, uncertain, and non-linear characteristics of the wastewater treatment system, an artificial fish school neural network prediction model is established based on actual operating data from the wastewater treatment system. The model overcomes several disadvantages of the conventional BP neural network. The results of the model calculations show that the predicted values match the measured values well, and that the model is effective for simulation and prediction and can be used to optimize the operating status. The prediction model provides a simple and practical aid to the operation and management of wastewater treatment plants, and has good research and practical engineering value.
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan
2014-01-01
Purpose: The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Methods and Materials: Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R2, chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Results: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R2 was satisfactory and corresponded well with the expected values. 
Conclusions: Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT. PMID:24586971
Lee, Tsair-Fwu; Chao, Pei-Ju; Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan
2014-01-01
The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3(+) xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R(2), chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R(2) was satisfactory and corresponded well with the expected values. 
Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT.
NASA Astrophysics Data System (ADS)
Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.
2017-06-01
The recent popularity of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods for generating and analyzing such data, but these almost always rely on asymptotic distributions and are consequently not adequate for small-sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based on exact distributions, this procedure may be used even with small-sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiter's combination rule to multiply imputed synthetic datasets, and an application to the 2000 Current Population Survey is discussed.
[Academic performance in first year medical students: an explanatory multivariate model].
Urrutia Aguilar, María Esther; Ortiz León, Silvia; Fouilloux Morales, Claudia; Ponce Rosas, Efrén Raúl; Guevara Guzmán, Rosalinda
2014-12-01
Current education is focused on intellectual, affective, and ethical aspects, thus acknowledging their significance in students' metacognition. Nowadays, it is known that an adequate and motivating environment together with a positive attitude towards studies is fundamental to induce learning. Medical students face multiple stressful academic, personal, and vocational situations. The objective was to identify psychosocial, vocational, and academic variables of 2010-2011 first year medical students at UNAM that may help predict their academic performance. Academic surveys of psychological and vocational factors were applied; an academic follow-up was carried out to obtain a multivariate model. The data were analyzed considering descriptive, comparative, correlative, and predictive statistics. The main variables that affect students' academic performance are related to previous knowledge and to psychological variables. The results show the significance of implementing institutional programs to support students throughout their college adaptation.
Wijsman, Robin; Dankers, Frank; Troost, Esther G C; Hoffmann, Aswin L; van der Heijden, Erik H F M; de Geus-Oei, Lioe-Fee; Bussink, Johan
2015-10-01
The majority of normal-tissue complication probability (NTCP) models for acute esophageal toxicity (AET) in advanced stage non-small cell lung cancer (AS-NSCLC) patients treated with (chemo-)radiotherapy are based on three-dimensional conformal radiotherapy (3D-CRT). Due to distinct dosimetric characteristics of intensity-modulated radiation therapy (IMRT), 3D-CRT based models need revision. We established a multivariable NTCP model for AET in 149 AS-NSCLC patients undergoing IMRT. An established model selection procedure was used to develop an NTCP model for Grade ⩾2 AET (53 patients) including clinical and esophageal dose-volume histogram parameters. The NTCP model predicted an increased risk of Grade ⩾2 AET in case of: concurrent chemoradiotherapy (CCR) [adjusted odds ratio (OR) 14.08, 95% confidence interval (CI) 4.70-42.19; p<0.001], increasing mean esophageal dose [Dmean; OR 1.12 per Gy increase, 95% CI 1.06-1.19; p<0.001], female patients (OR 3.33, 95% CI 1.36-8.17; p=0.008), and ⩾cT3 (OR 2.7, 95% CI 1.12-6.50; p=0.026). The AUC was 0.82 and the model showed good calibration. A multivariable NTCP model including CCR, Dmean, clinical tumor stage and gender predicts Grade ⩾2 AET after IMRT for AS-NSCLC. Prior to clinical introduction, the model needs validation in an independent patient cohort. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
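The odds ratios reported above can be read as coefficients of a logistic model. The following is a minimal sketch of that reading, not the authors' fitted model: the abstract does not report an intercept, so the `intercept` argument is hypothetical, and the 0/1 coding of the categorical predictors is an assumption.

```python
import math

# Adjusted odds ratios as reported in the abstract.
OR_CCR = 14.08          # concurrent chemoradiotherapy (yes vs. no)
OR_DMEAN_PER_GY = 1.12  # per 1 Gy increase in mean esophageal dose
OR_FEMALE = 3.33        # female vs. male
OR_CT3 = 2.7            # clinical tumor stage >= cT3 vs. lower


def odds_ratio_for_dose_increase(delta_gy):
    """Odds ratio for a mean-dose difference of delta_gy (log-linear in dose)."""
    return OR_DMEAN_PER_GY ** delta_gy


def ntcp(intercept, ccr, dmean_gy, female, ct3):
    """Logistic NTCP for Grade >= 2 AET.

    intercept is HYPOTHETICAL (not given in the abstract);
    ccr, female and ct3 are assumed to be coded 0 or 1.
    """
    s = (intercept
         + math.log(OR_CCR) * ccr
         + math.log(OR_DMEAN_PER_GY) * dmean_gy
         + math.log(OR_FEMALE) * female
         + math.log(OR_CT3) * ct3)
    return 1.0 / (1.0 + math.exp(-s))
```

Under this reading, a mean esophageal dose that is 10 Gy higher multiplies the odds of Grade ⩾2 AET by 1.12^10, roughly 3.1, whatever the intercept is. Absolute probabilities from `ntcp` are only illustrative until a real intercept is supplied.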
Validation of Single-Item Screening Measures for Provider Burnout in a Rural Health Care Network.
Waddimba, Anthony C; Scribani, Melissa; Nieves, Melinda A; Krupa, Nicole; May, John J; Jenkins, Paul
2016-06-01
We validated three single-item measures for emotional exhaustion (EE) and depersonalization (DP) among rural physician/nonphysician practitioners. We linked cross-sectional survey data (on provider demographics, satisfaction, resilience, and burnout) with administrative information from an integrated health care network (1 academic medical center, 6 community hospitals, 31 clinics, and 19 school-based health centers) in an eight-county underserved area of upstate New York. In total, 308 physicians and advanced-practice clinicians completed a self-administered, multi-instrument questionnaire (65.1% response rate). Significant proportions of respondents reported high EE (36.1%) and DP (9.9%). In multivariable linear mixed models, scores on EE/DP subscales of the Maslach Burnout Inventory were regressed on each single-item measure. The Physician Work-Life Study's single-item measure (classifying 32.8% of respondents as burning out/completely burned out) was correlated with EE and DP (Spearman's ρ = .72 and .41, p < .0001; Kruskal-Wallis χ(2) = 149.9 and 56.5, p < .0001, respectively). In multivariable models, it predicted high EE (but neither low EE nor low/high DP). EE/DP single items were correlated with parent subscales (Spearman's ρ = .89 and .81, p < .0001; Kruskal-Wallis χ(2) = 230.98 and 197.84, p < .0001, respectively). In multivariable models, the EE item predicted high/low EE, whereas the DP item predicted only low DP. Therefore, the three single-item measures tested varied in effectiveness as screeners for EE/DP dimensions of burnout. © The Author(s) 2015.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gunn, Andrew J., E-mail: agunn@uabmc.edu; Sheth, Rahul A.; Luber, Brandon
2017-01-15
Purpose: The purpose of this study was to evaluate the ability of various radiologic response criteria to predict patient outcomes after trans-arterial chemo-embolization with drug-eluting beads (DEB-TACE) in patients with advanced-stage (BCLC C) hepatocellular carcinoma (HCC). Materials and methods: Hospital records from 2005 to 2011 were retrospectively reviewed. Non-infiltrative lesions were measured at baseline and on follow-up scans after DEB-TACE according to various common radiologic response criteria, including guidelines of the World Health Organization (WHO), Response Evaluation Criteria in Solid Tumors (RECIST), the European Association for the Study of the Liver (EASL), and modified RECIST (mRECIST). Statistical analysis was performed to see which, if any, of the response criteria could be used as a predictor of overall survival (OS) or time-to-progression (TTP). Results: 75 patients met inclusion criteria. Median OS and TTP were 22.6 months (95 % CI 11.6–24.8) and 9.8 months (95 % CI 7.1–21.6), respectively. Univariate and multivariate Cox analyses revealed that none of the evaluated criteria had the ability to be used as a predictor for OS or TTP. Analysis of the C index in both univariate and multivariate models showed that the evaluated criteria were not accurate predictors of either OS (C-statistic range: 0.51–0.58 in the univariate model; range: 0.54–0.58 in the multivariate model) or TTP (C-statistic range: 0.55–0.59 in the univariate model; range: 0.57–0.61 in the multivariate model). Conclusion: Current response criteria are not accurate predictors of OS or TTP in patients with advanced-stage HCC after DEB-TACE.
Gunn, Andrew J; Sheth, Rahul A; Luber, Brandon; Huynh, Minh-Huy; Rachamreddy, Niranjan R; Kalva, Sanjeeva P
2017-01-01
The purpose of this study was to evaluate the ability of various radiologic response criteria to predict patient outcomes after trans-arterial chemo-embolization with drug-eluting beads (DEB-TACE) in patients with advanced-stage (BCLC C) hepatocellular carcinoma (HCC). Hospital records from 2005 to 2011 were retrospectively reviewed. Non-infiltrative lesions were measured at baseline and on follow-up scans after DEB-TACE according to various common radiologic response criteria, including guidelines of the World Health Organization (WHO), Response Evaluation Criteria in Solid Tumors (RECIST), the European Association for the Study of the Liver (EASL), and modified RECIST (mRECIST). Statistical analysis was performed to see which, if any, of the response criteria could be used as a predictor of overall survival (OS) or time-to-progression (TTP). 75 patients met inclusion criteria. Median OS and TTP were 22.6 months (95 % CI 11.6-24.8) and 9.8 months (95 % CI 7.1-21.6), respectively. Univariate and multivariate Cox analyses revealed that none of the evaluated criteria had the ability to be used as a predictor for OS or TTP. Analysis of the C index in both univariate and multivariate models showed that the evaluated criteria were not accurate predictors of either OS (C-statistic range: 0.51-0.58 in the univariate model; range: 0.54-0.58 in the multivariate model) or TTP (C-statistic range: 0.55-0.59 in the univariate model; range: 0.57-0.61 in the multivariate model). Current response criteria are not accurate predictors of OS or TTP in patients with advanced-stage HCC after DEB-TACE.
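To make the reported C-statistics easier to interpret, here is a minimal sketch of Harrell's concordance index for right-censored survival data. It is an illustration of the general statistic, not the authors' analysis code, and the function and argument names are hypothetical. A value of 0.5 corresponds to chance-level ranking, which is why ranges such as 0.51 to 0.61 indicate weak discrimination.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data.

    times[i]       : follow-up time of subject i
    events[i]      : True if the event was observed, False if censored
    risk_scores[i] : model output, higher means higher predicted risk

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's follow-up time. The pair is concordant
    when subject i also has the higher risk score; ties in the score
    count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

The double loop is quadratic in the number of subjects, which is acceptable for a cohort of 75 patients but would need a faster algorithm for large datasets.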
Feinauer, Christoph; Procaccini, Andrea; Zecchina, Riccardo; Weigt, Martin; Pagnani, Andrea
2014-01-01
In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to the one achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partners in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code. PMID:24663061
Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes
Yates, Katherine L.; Mellin, Camille; Caley, M. Julian; Radford, Ben T.; Meeuwig, Jessica J.
2016-01-01
Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. 
Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability. PMID:27333202
2014-09-01
approaches. Ecological Modelling Volume 200, Issues 1–2, 10, pp 1–19. Buhlmann, Kurt A., Thomas S.B. Akre, John B. Iverson, Deno Karapatakis, Russell A ... statistical multivariate analysis to define the current and projected future range probability for species of interest to Army land managers. A software ... 15 Figure 4. RCW omission rate and predicted area as a function of the cumulative threshold
Kou, Peng Meng; Pallassana, Narayanan; Bowden, Rebeca; Cunningham, Barry; Joy, Abraham; Kohn, Joachim; Babensee, Julia E.
2011-01-01
Dendritic cells (DCs) play a critical role in orchestrating the host responses to a wide variety of foreign antigens and are essential in maintaining immune tolerance. Distinct biomaterials have been shown to differentially affect the phenotype of DCs, which suggested that biomaterials may be used to modulate immune response towards the biologic component in combination products. The elucidation of biomaterial property-DC phenotype relationships is expected to inform rational design of immuno-modulatory biomaterials. In this study, DC response to a set of 12 polymethacrylates (pMAs) was assessed in terms of surface marker expression and cytokine profile. Principal component analysis (PCA) determined that surface carbon correlated with enhanced DC maturation, while surface oxygen was associated with an immature DC phenotype. Partial square linear regression, a multivariate modeling approach, was implemented and successfully predicted biomaterial-induced DC phenotype in terms of surface marker expression from biomaterial properties with R2prediction = 0.76. Furthermore, prediction of DC phenotype was effective based on only theoretical chemical composition of the bulk polymers with R2prediction = 0.80. These results demonstrated that immune cell response can be predicted from biomaterial properties, and computational models will expedite future biomaterial design and selection. PMID:22136715
Samad, Manar D; Ulloa, Alvaro; Wehner, Gregory J; Jing, Linyuan; Hartzel, Dustin; Good, Christopher W; Williams, Brent A; Haggerty, Christopher M; Fornwalt, Brandon K
2018-06-09
The goal of this study was to use machine learning to more accurately predict survival after echocardiography. Predicting patient outcomes (e.g., survival) following echocardiography is primarily based on ejection fraction (EF) and comorbidities. However, there may be significant predictive information within additional echocardiography-derived measurements combined with clinical electronic health record data. Mortality was studied in 171,510 unselected patients who underwent 331,317 echocardiograms in a large regional health system. We investigated the predictive performance of nonlinear machine learning models compared with that of linear logistic regression models using 3 different inputs: 1) clinical variables, including 90 cardiovascular-relevant International Classification of Diseases, Tenth Revision, codes, and age, sex, height, weight, heart rate, blood pressures, low-density lipoprotein, high-density lipoprotein, and smoking; 2) clinical variables plus physician-reported EF; and 3) clinical variables and EF, plus 57 additional echocardiographic measurements. Missing data were imputed using the multivariate imputation by chained equations (MICE) algorithm. We compared models with each other and with baseline clinical scoring systems using the mean area under the curve (AUC) over 10 cross-validation folds and across 10 survival durations (6 to 60 months). Machine learning models achieved significantly higher prediction accuracy (all AUC >0.82) than common clinical risk scores (AUC = 0.61 to 0.79), with the nonlinear random forest models outperforming logistic regression (p < 0.01). The random forest model including all echocardiographic measurements yielded the highest prediction accuracy (p < 0.01 across all models and survival durations). Only 10 variables were needed to achieve 96% of the maximum prediction accuracy, with 6 of these variables being derived from echocardiography. 
Tricuspid regurgitation velocity was more predictive of survival than LVEF. In a subset of studies with complete data for the top 10 variables, multivariate imputation by chained equations yielded slightly reduced predictive accuracies (difference in AUC of 0.003) compared with the original data. Machine learning can fully utilize large combinations of disparate input variables to predict survival after echocardiography with superior accuracy. Copyright © 2018 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.
USDA-ARS's Scientific Manuscript database
Prediction equations of energy expenditure (EE) using accelerometers and miniaturized heart rate (HR) monitors have been developed in older children and adults but not in preschool-aged children. Because the relationships between accelerometer counts (ACs), HR, and EE are confounded by growth and ma...
Huang, An-Min; Fei, Ben-Hua; Jiang, Ze-Hui; Hse, Chung-Yun
2007-09-01
Near infrared (NIR) spectroscopy is widely used as a quantitative method, and the main multivariate techniques are regression methods used to build prediction models. However, the accuracy of the analysis results is affected by many factors. In the present paper, the influence of sample roughness on the mathematical model for NIR quantitative analysis of wood density was studied. The experiments showed that if the roughness of the predicted samples was consistent with that of the calibration samples, the result was good; otherwise the error was much higher. The roughness-mixed model was more flexible and adaptable to different sample roughness, and its prediction ability was much better than that of the single-roughness model.
Prat, Chantal; Besalú, Emili; Bañeras, Lluís; Anticó, Enriqueta
2011-06-15
The volatile fraction of aqueous cork macerates of tainted and non-tainted agglomerate cork stoppers was analysed by headspace solid-phase microextraction (HS-SPME)/gas chromatography. Twenty compounds, including terpenoids, aliphatic alcohols, lignin-related compounds and others, were selected and analysed in individual corks. Cork stoppers were previously classified into six different classes according to sensory descriptions, including 2,4,6-trichloroanisole (TCA) taint and other frequent, non-characteristic odours found in cork. A multivariate analysis of the chromatographic data of the 20 selected chemical compounds using linear discriminant analysis models helped in the differentiation of the a priori made groups. The discriminant model selected five compounds as the best combination. The selected compounds appear in the model in the following order: 2,4,6-TCA, fenchyl alcohol, 1-octen-3-ol, benzyl alcohol and benzothiazole. Unfortunately, not all six a priori differentiated sensory classes were clearly discriminated in the model, probably indicating that no measurable differences exist in the chromatographic data for some categories. The predictive analyses of a refined model in which two sensory classes were fused together resulted in a good classification. Prediction rates of control (non-tainted), TCA, musty-earthy-vegetative, vegetative and chemical descriptions were 100%, 100%, 85%, 67.3% and 100%, respectively, when the modified model was used. The multivariate analysis of chromatographic data will help in the classification of stoppers and provide a perfect complement to sensory analyses. Copyright © 2010 Elsevier Ltd. All rights reserved.
Neuropsychological tests for predicting cognitive decline in older adults
Baerresen, Kimberly M; Miller, Karen J; Hanson, Eric R; Miller, Justin S; Dye, Richelin V; Hartman, Richard E; Vermeersch, David; Small, Gary W
2015-01-01
Summary. Aim: To determine neuropsychological tests likely to predict cognitive decline. Methods: A sample of nonconverters (n = 106) was compared with those who declined in cognitive status (n = 24). Significant univariate logistic regression prediction models were used to create multivariate logistic regression models to predict decline based on initial neuropsychological testing. Results: Rey–Osterrieth Complex Figure Test (RCFT) Retention predicted conversion to mild cognitive impairment (MCI) while baseline Buschke Delay predicted conversion to Alzheimer’s disease (AD). Due to group sample size differences, additional analyses were conducted using a subsample of demographically matched nonconverters. Analyses indicated RCFT Retention predicted conversion to MCI and AD, and Buschke Delay predicted conversion to AD. Conclusion: Results suggest RCFT Retention and Buschke Delay may be useful in predicting cognitive decline. PMID:26107318
NASA Astrophysics Data System (ADS)
Cannon, Alex J.
2018-01-01
Most bias correction algorithms used in climatology, for example quantile mapping, are applied to univariate time series. They neglect the dependence between different variables. Those that are multivariate often correct only limited measures of joint dependence, such as Pearson or Spearman rank correlation. Here, an image processing technique designed to transfer colour information from one image to another—the N-dimensional probability density function transform—is adapted for use as a multivariate bias correction algorithm (MBCn) for climate model projections/predictions of multiple climate variables. MBCn is a multivariate generalization of quantile mapping that transfers all aspects of an observed continuous multivariate distribution to the corresponding multivariate distribution of variables from a climate model. When applied to climate model projections, changes in quantiles of each variable between the historical and projection period are also preserved. The MBCn algorithm is demonstrated on three case studies. First, the method is applied to an image processing example with characteristics that mimic a climate projection problem. Second, MBCn is used to correct a suite of 3-hourly surface meteorological variables from the Canadian Centre for Climate Modelling and Analysis Regional Climate Model (CanRCM4) across a North American domain. Components of the Canadian Forest Fire Weather Index (FWI) System, a complicated set of multivariate indices that characterizes the risk of wildfire, are then calculated and verified against observed values. Third, MBCn is used to correct biases in the spatial dependence structure of CanRCM4 precipitation fields. Results are compared against a univariate quantile mapping algorithm, which neglects the dependence between variables, and two multivariate bias correction algorithms, each of which corrects a different form of inter-variable correlation structure. 
MBCn outperforms these alternatives, often by a large margin, particularly for annual maxima of the FWI distribution and spatiotemporal autocorrelation of precipitation fields.
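For context, the univariate baseline that MBCn generalizes can be sketched in a few lines. The code below is a minimal empirical quantile mapping for a single variable, not the MBCn algorithm itself and not the author's implementation. The function names are hypothetical, and clamping values outside the historical model range to the end points is a simplifying assumption.

```python
from bisect import bisect_right


def _fractional_rank(sorted_sample, x):
    """Position of x in sorted_sample as a fraction in [0, 1] (clamped at the ends)."""
    n = len(sorted_sample)
    if x <= sorted_sample[0]:
        return 0.0
    if x >= sorted_sample[-1]:
        return 1.0
    hi = bisect_right(sorted_sample, x)  # first index whose value exceeds x
    lo = hi - 1
    within = (x - sorted_sample[lo]) / (sorted_sample[hi] - sorted_sample[lo])
    return (lo + within) / (n - 1)


def _quantile(sorted_sample, p):
    """Empirical quantile with linear interpolation between order statistics."""
    pos = p * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    frac = pos - lo
    return sorted_sample[lo] * (1.0 - frac) + sorted_sample[hi] * frac


def quantile_map(model_hist, obs_hist, model_values):
    """Map model values onto the observed distribution, one variable at a time."""
    if len(model_hist) < 2 or len(obs_hist) < 2:
        raise ValueError("need at least two values in each historical sample")
    m = sorted(model_hist)
    o = sorted(obs_hist)
    return [_quantile(o, _fractional_rank(m, x)) for x in model_values]
```

Applying `quantile_map` separately to each variable corrects every marginal distribution but leaves the dependence between variables untouched, which is the limitation the abstract attributes to univariate methods.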
Brouckaert, D; Uyttersprot, J-S; Broeckx, W; De Beer, T
2018-03-01
Calibration transfer or standardisation aims at creating a uniform spectral response on different spectroscopic instruments or under varying conditions, without requiring a full recalibration for each situation. In the current study, this strategy is applied to construct at-line multivariate calibration models and subsequently employ them in-line in a continuous industrial production line, using the same spectrometer. Firstly, quantitative multivariate models are constructed at-line at laboratory scale for predicting the concentration of two main ingredients in hard surface cleaners. By regressing the Raman spectra of a set of small-scale calibration samples against their reference concentration values, partial least squares (PLS) models are developed to quantify the surfactant levels in the liquid detergent compositions under investigation. After evaluating the models' performance with a set of independent validation samples, a univariate slope/bias correction is applied in view of transporting these at-line calibration models to an in-line manufacturing set-up. This standardisation technique allows a fast and easy transfer of the PLS regression models, by simply correcting the model predictions on the in-line set-up, without changing the original multivariate calibration models. An extensive statistical analysis is performed in order to assess the predictive quality of the transferred regression models. Before and after transfer, the R² and RMSEP of both models are compared to evaluate whether their magnitudes are similar. T-tests are then performed to investigate whether the slope and intercept of the transferred regression line are not statistically different from 1 and 0, respectively. Furthermore, it is checked that no significant bias is present. F-tests are executed as well, for assessing the linearity of the transfer regression line and for investigating the statistical coincidence of the transfer and validation regression line. 
Finally, a paired t-test is performed to compare the original at-line model to the slope/bias corrected in-line model, using interval hypotheses. It is shown that the calibration models of Surfactant 1 and Surfactant 2 yield satisfactory in-line predictions after slope/bias correction. While Surfactant 1 passes seven out of eight statistical tests, the recommended validation parameters are 100% successful for Surfactant 2. It is hence concluded that the proposed strategy for transferring at-line calibration models to an in-line industrial environment via a univariate slope/bias correction of the predicted values offers a successful standardisation approach. Copyright © 2017 Elsevier B.V. All rights reserved.
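The univariate slope/bias correction described above can be sketched as an ordinary least-squares fit of reference values on uncorrected in-line predictions. This is a minimal illustration under that assumption; the function names are hypothetical and the abstract does not state how the authors estimated the two parameters.

```python
import math


def fit_slope_bias(predicted, reference):
    """Least-squares fit of reference = slope * predicted + bias."""
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two transfer samples")
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    if sxx == 0.0:
        raise ValueError("predictions are constant; slope is undefined")
    sxy = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    slope = sxy / sxx
    bias = mean_r - slope * mean_p
    return slope, bias


def apply_correction(predicted, slope, bias):
    """Correct new in-line predictions; the calibration model itself is unchanged."""
    return [slope * p + bias for p in predicted]


def rmsep(predicted, reference):
    """Root mean square error of prediction."""
    n = len(predicted)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)
```

In use, `fit_slope_bias` would be run once on a small set of in-line transfer samples with known reference concentrations, and `apply_correction` would then be applied to every later prediction. Comparing `rmsep` before and after correction mirrors the RMSEP comparison mentioned in the abstract.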
Assessing Multivariate Constraints to Evolution across Ten Long-Term Avian Studies
Teplitsky, Celine; Tarka, Maja; Møller, Anders P.; Nakagawa, Shinichi; Balbontín, Javier; Burke, Terry A.; Doutrelant, Claire; Gregoire, Arnaud; Hansson, Bengt; Hasselquist, Dennis; Gustafsson, Lars; de Lope, Florentino; Marzal, Alfonso; Mills, James A.; Wheelwright, Nathaniel T.; Yarrall, John W.; Charmantier, Anne
2014-01-01
Background: In a rapidly changing world, it is of fundamental importance to understand processes constraining or facilitating adaptation through microevolution. As different traits of an organism covary, genetic correlations are expected to affect evolutionary trajectories. However, only limited empirical data are available. Methodology/Principal Findings: We investigate the extent to which multivariate constraints affect the rate of adaptation, focusing on four morphological traits often shown to harbour large amounts of genetic variance and considered to be subject to limited evolutionary constraints. Our data set includes unique long-term data for seven bird species and a total of 10 populations. We estimate population-specific matrices of genetic correlations and multivariate selection coefficients to predict evolutionary responses to selection. Using Bayesian methods that facilitate the propagation of errors in estimates, we compare (1) the rate of adaptation based on predicted response to selection when including genetic correlations with predictions from models where these genetic correlations were set to zero and (2) the multivariate evolvability in the direction of current selection to the average evolvability in random directions of the phenotypic space. We show that genetic correlations on average decrease the predicted rate of adaptation by 28%. Multivariate evolvability in the direction of current selection was systematically lower than average evolvability in random directions of space. These significant reductions in the rate of adaptation and reduced evolvability were due to a general nonalignment of selection and genetic variance, notably orthogonality of directional selection with the size axis along which most (60%) of the genetic variance is found. Conclusions: These results suggest that genetic correlations can impose significant constraints on the evolution of avian morphology in wild populations. 
This could have important impacts on evolutionary dynamics and hence population persistence in the face of rapid environmental change. PMID:24608111
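The comparison described above can be illustrated with the multivariate breeder's equation, in which the predicted response is the genetic covariance matrix G multiplied by the selection gradient β. The sketch below assumes these standard quantitative-genetics formulas are what the abstract refers to. It is not the authors' Bayesian analysis, and the two-trait numbers in the usage note are invented purely for illustration.

```python
def response_to_selection(G, beta):
    """Multivariate breeder's equation: predicted change in trait means, G @ beta."""
    return [sum(g * b for g, b in zip(row, beta)) for row in G]


def without_genetic_covariances(G):
    """Copy of G with all off-diagonal entries (genetic covariances) set to zero."""
    return [[G[i][j] if i == j else 0.0 for j in range(len(G))]
            for i in range(len(G))]


def evolvability(G, beta):
    """Genetic variance available in the direction of beta: (b' G b) / (b' b)."""
    response = response_to_selection(G, beta)
    numerator = sum(b * r for b, r in zip(beta, response))
    denominator = sum(b * b for b in beta)
    if denominator == 0.0:
        raise ValueError("selection gradient must be non-zero")
    return numerator / denominator
```

With a hypothetical G of [[1.0, 0.8], [0.8, 1.0]] and selection gradient [1.0, -1.0], selection acts against the main axis of genetic variance, so evolvability in that direction is about 0.2. Setting the covariances to zero with `without_genetic_covariances` raises it to 1.0, which is the kind of contrast the study quantifies across populations.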
Hosseinpour, Mehdi; Sahebi, Sina; Zamzuri, Zamira Hasanah; Yahaya, Ahmad Shukri; Ismail, Noriszura
2018-06-01
According to crash configuration and pre-crash conditions, traffic crashes are classified into different collision types. Based on the literature, multi-vehicle crashes, such as head-on, rear-end, and angle crashes, are more frequent than single-vehicle crashes, and most often result in serious consequences. From a methodological point of view, the majority of prior studies focused on multi-vehicle collisions have employed univariate count models to estimate crash counts separately by collision type. However, univariate models fail to account for correlations which may exist between different collision types. Among others, the multivariate Poisson lognormal (MVPLN) model with spatial correlation is a promising multivariate specification because it allows not only for unobserved heterogeneity (extra-Poisson variation) and dependencies between collision types, but also for spatial correlation between adjacent sites. However, the MVPLN spatial model has rarely been applied in previous research for simultaneously modelling crash counts by collision type. Therefore, this study aims at utilizing an MVPLN spatial model to estimate crash counts for four different multi-vehicle collision types, including head-on, rear-end, angle, and sideswipe collisions. To investigate the performance of the MVPLN spatial model, a two-stage model and a univariate Poisson lognormal (UNPLN) spatial model were also developed in this study. Detailed information on roadway characteristics, traffic volume, and crash history was collected on 407 homogeneous segments from Malaysian federal roads. The results indicate that the MVPLN spatial model outperforms the competing models in terms of goodness-of-fit measures. The results also show that the inclusion of spatial heterogeneity in the multivariate model significantly improves the model fit, as indicated by the Deviance Information Criterion (DIC). 
The correlation between crash types is high and positive, implying that the occurrence of a specific collision type is highly associated with the occurrence of other crash types on the same road segment. These results support the utilization of the MVPLN spatial model when predicting crash counts by collision manner. In terms of contributing factors, the results show that distinct crash types are attributed to different subsets of explanatory variables. Copyright © 2018 Elsevier Ltd. All rights reserved.
Li, Zai-Shang; Chen, Peng; Yao, Kai; Wang, Bin; Li, Jing; Mi, Qi-Wu; Chen, Xiao-Feng; Zhao, Qi; Li, Yong-Hong; Chen, Jie-Ping; Deng, Chuang-Zhong; Ye, Yun-Lin; Zhong, Ming-Zhu; Liu, Zhuo-Wei; Qin, Zi-Ke; Lin, Xiang-Tian; Liang, Wei-Cong; Han, Hui; Zhou, Fang-Jian
2016-04-12
To determine the predictive value and feasibility of the new outcome prediction model for Chinese patients with penile squamous cell carcinoma. The 3-year disease-specific survival (DSS) was 92.3% in patients with < 8.70 mg/L CRP and 54.9% in those with elevated CRP (P < 0.001). The 3-year DSS was 86.5% in patients with a BMI < 22.6 kg/m² and 69.9% in those with a higher BMI (P = 0.025). In a multivariate analysis, pathological T stage (P < 0.001), pathological N stage (P = 0.002), BMI (P = 0.002), and CRP (P = 0.004) were independent predictors of DSS. A new scoring model was developed, consisting of BMI, CRP, and tumor T and N classification. In our study, we found that the addition of the above-mentioned parameters significantly increased the predictive accuracy of the system of the American Joint Committee on Cancer (AJCC) anatomic stage group. The accuracy of the new prediction category was verified. A total of 172 Chinese patients with penile squamous cell cancer were analyzed retrospectively between November 2005 and November 2014. Statistical data analysis was conducted using the nonparametric method. Survival analysis was performed with the log-rank test and the Cox proportional hazard model. Based on regression estimates of significant parameters in multivariate analysis, a new BMI-, CRP- and pathologic factors-based scoring model was developed to predict disease-specific outcomes. The predictive accuracy of the model was evaluated using internal and external validation. The present study demonstrated that the TNCB score group system may be a precise and easy-to-use tool for predicting outcomes in Chinese penile squamous cell carcinoma patients.
Moreira, Luiz Felipe Pompeu Prado; Ferrari, Adriana Cristina; Moraes, Tiago Bueno; Reis, Ricardo Andrade; Colnago, Luiz Alberto; Pereira, Fabíola Manhas Verbi
2016-05-19
Time-domain nuclear magnetic resonance and chemometrics were used to predict color parameters, such as lightness (L*), redness (a*), and yellowness (b*) of beef (Longissimus dorsi muscle) samples. Color quality parameters were predicted by analyzing the relaxation decays with multivariate models built using partial least-squares regression. The partial least-squares models showed low errors independent of the sample size, indicating the potential of the method. Mincing and weighing were not necessary to improve the predictive performance of the models. The reduction of transverse relaxation time (T2) measured by the Carr-Purcell-Meiboom-Gill pulse sequence in darker beef in comparison with lighter beef can be explained by the lower relaxivity of Fe2+ present in deoxymyoglobin and oxymyoglobin (red beef) relative to the higher relaxivity of Fe3+ present in metmyoglobin (brown beef). These results indicate that time-domain nuclear magnetic resonance spectroscopy can become a useful tool for quality assessment of beef cattle on the bulk of the sample and through packages, because this technique is also widely applied to measure sensorial parameters, such as flavor, juiciness and tenderness, and physicochemical parameters, cooking loss, fat and moisture content, and instrumental tenderness using Warner Bratzler shear force. Copyright © 2016 John Wiley & Sons, Ltd.
Low Social Status Markers: Do They Predict Depressive Symptoms in Adolescence?
Jackson, Benita; Goodman, Elizabeth
2011-07-01
Some markers of social disadvantage are associated robustly with depressive symptoms among adolescents: female gender and lower socioeconomic status (SES), respectively. Others are associated equivocally, notably Black v. White race/ethnicity. Few studies examine whether markers of social disadvantage by gender, SES, and race/ethnicity jointly predict self-reported depressive symptoms during adolescence; this was our goal. Secondary analyses were conducted on data from a socioeconomically diverse community-based cohort study of non-Hispanic Black and White adolescents (N = 1,263, 50.4% female). Multivariable general linear models tested if female gender, Black race/ethnicity, and lower SES (assessed by parent education and household income), and their interactions predicted greater depressive symptoms reported on the Center for Epidemiological Studies-Depression scale. Models adjusted for age and pubertal status. Univariate analyses revealed more depressive symptoms in females, Blacks, and participants with lower SES. Multivariable models showed females across both racial/ethnic groups reported greater depressive symptoms; Blacks demonstrated more depressive symptoms than did Whites but when SES was included this association disappeared. Exploratory analyses suggested Blacks gained less mental health benefit from increased SES. However there were no statistically significant interactions among gender, race/ethnicity, or SES. Taken together, we conclude that complex patterning among low social status domains within gender, race/ethnicity, and SES predicts depressive symptoms among adolescents.
García Nieto, Paulino José; González Suárez, Victor Manuel; Álvarez Antón, Juan Carlos; Mayo Bayón, Ricardo; Sirgo Blanco, José Ángel; Díaz Fernández, Ana María
2015-01-01
The aim of this study was to obtain a predictive model able to perform an early detection of central segregation severity in continuous cast steel slabs. Segregation in steel cast products is an internal defect that can be very harmful when slabs are rolled in heavy plate mills. In this research work, the central segregation was studied with success using the data mining methodology based on multivariate adaptive regression splines (MARS) technique. For this purpose, the most important physical-chemical parameters are considered. The results of the present study are two-fold. In the first place, the significance of each physical-chemical variable on the segregation is presented through the model. Second, a model for forecasting segregation is obtained. Regression with optimal hyperparameters was performed and coefficients of determination equal to 0.93 for continuity factor estimation and 0.95 for average width were obtained when the MARS technique was applied to the experimental dataset, respectively. The agreement between experimental data and the model confirmed the good performance of the latter.
Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M
2015-01-07
Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web based survey and revised during a three day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. 
To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). To encourage dissemination of the TRIPOD Statement, this article is freely accessible on the Annals of Internal Medicine Web site (www.annals.org) and will be also published in BJOG, British Journal of Cancer, British Journal of Surgery, BMC Medicine, The BMJ, Circulation, Diabetic Medicine, European Journal of Clinical Investigation, European Urology, and Journal of Clinical Epidemiology. The authors jointly hold the copyright of this article. An accompanying explanation and elaboration article is freely available only on www.annals.org; Annals of Internal Medicine holds copyright for that article. © BMJ Publishing Group Ltd 2014.
Hamilton, C A; Miller, A; Casablanca, Y; Horowitz, N S; Rungruang, B; Krivak, T C; Richard, S D; Rodriguez, N; Birrer, M J; Backes, F J; Geller, M A; Quinn, M; Goodheart, M J; Mutch, D G; Kavanagh, J J; Maxwell, G L; Bookman, M A
2018-02-01
To identify clinicopathologic factors associated with 10-year overall survival in epithelial ovarian cancer (EOC) and primary peritoneal cancer (PPC), and to develop a predictive model identifying long-term survivors. Demographic, surgical, and clinicopathologic data were abstracted from GOG 182 records. The association between clinical variables and long-term survival (LTS) (>10 years) was assessed using multivariable regression analysis. Bootstrap methods were used to develop predictive models from known prognostic clinical factors and predictive accuracy was quantified using optimism-adjusted area under the receiver operating characteristic curve (AUC). The analysis dataset included 3010 evaluable patients, of whom 195 survived greater than ten years. These patients were more likely to have better performance status, endometrioid histology, stage III (rather than stage IV) disease, absence of ascites, less extensive preoperative disease distribution, microscopic disease residual following cytoreduction (R0), and decreased complexity of surgery (p<0.01). Multivariable regression analysis revealed that lower CA-125 levels, absence of ascites, stage, and R0 were significant independent predictors of LTS. A predictive model created using these variables had an AUC=0.729, which outperformed any of the individual predictors. The absence of ascites, a low CA-125, stage, and R0 at the time of cytoreduction are factors associated with LTS when controlling for other confounders. An extensively annotated clinicopathologic prediction model for LTS fell short of clinical utility suggesting that prognostic molecular profiles are needed to better predict which patients are likely to be long-term survivors. Published by Elsevier Inc.
Hamilton, C. A.; Miller, A.; Casablanca, Y.; Horowitz, N. S.; Rungruang, B.; Krivak, T. C.; Richard, S. D.; Rodriguez, N.; Birrer, M.J.; Backes, F.J.; Geller, M.A.; Quinn, M.; Goodheart, M.J.; Mutch, D.G.; Kavanagh, J.J.; Maxwell, G. L.; Bookman, M. A.
2018-01-01
Objective To identify clinicopathologic factors associated with 10-year overall survival in epithelial ovarian cancer (EOC) and primary peritoneal cancer (PPC), and to develop a predictive model identifying long-term survivors. Methods Demographic, surgical, and clinicopathologic data were abstracted from GOG 182 records. The association between clinical variables and long-term survival (LTS) (>10 years) was assessed using multivariable regression analysis. Bootstrap methods were used to develop predictive models from known prognostic clinical factors and predictive accuracy was quantified using optimism-adjusted area under the receiver operating characteristic curve (AUC). Results The analysis dataset included 3,010 evaluable patients, of whom 195 survived greater than ten years. These patients were more likely to have better performance status, endometrioid histology, stage III (rather than stage IV) disease, absence of ascites, less extensive preoperative disease distribution, microscopic disease residual following cytoreduction (R0), and decreased complexity of surgery (p<0.01). Multivariable regression analysis revealed that lower CA-125 levels, absence of ascites, stage, and R0 were significant independent predictors of LTS. A predictive model created using these variables had an AUC=0.729, which outperformed any of the individual predictors. Conclusions The absence of ascites, a low CA-125, stage, and R0 at the time of cytoreduction are factors associated with LTS when controlling for other confounders. An extensively annotated clinicopathologic prediction model for LTS fell short of clinical utility suggesting that prognostic molecular profiles are needed to better predict which patients are likely to be long-term survivors. PMID:29195926
Predicting the activity of drugs for a group of imidazopyridine anticoccidial compounds.
Si, Hongzong; Lian, Ning; Yuan, Shuping; Fu, Aiping; Duan, Yun-Bo; Zhang, Kejun; Yao, Xiaojun
2009-10-01
Gene expression programming (GEP) is a novel machine learning technique. GEP is used to build a nonlinear quantitative structure-activity relationship model for the prediction of the IC(50) for the imidazopyridine anticoccidial compounds. This model is based on descriptors which are calculated from the molecular structure. Four descriptors are selected from the descriptors' pool by the heuristic method (HM) to build a multivariable linear model. The GEP method produced a nonlinear quantitative model with a correlation coefficient and a mean error of 0.96 and 0.24 for the training set, and 0.91 and 0.52 for the test set, respectively. It is shown that the GEP predicted results are in good agreement with experimental ones.
Study on rapid valid acidity evaluation of apple by fiber optic diffuse reflectance technique
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Fu, Xiaping; Jiang, Xuesong
2004-03-01
Some issues related to nondestructive evaluation of valid acidity in intact apples by means of the Fourier transform near infrared (FTNIR) (800-2631 nm) method were addressed. A relationship was established between the diffuse reflectance spectra recorded with a bifurcated optic fiber and the valid acidity. The data were analyzed by multivariate calibration analysis such as partial least squares (PLS) analysis and the principal component regression (PCR) technique. A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influence of data preprocessing and different spectra treatments was also investigated. Models based on smoothing spectra were slightly worse than models based on derivative spectra and the best result was obtained when the segment length was 5 and the gap size was 10. Depending on data preprocessing and multivariate calibration technique, the best prediction model had a correlation coefficient (0.871), a low RMSEP (0.0677), a low RMSEC (0.056) and a small difference between RMSEP and RMSEC by PLS analysis. The results point out the feasibility of FTNIR spectral analysis to predict the fruit valid acidity non-destructively. The ratio of data standard deviation to the root mean square error of prediction (SDR) is better to be less than 3 in calibration models, however, the results cannot meet the demand of actual application. Therefore, further study is required for better calibration and prediction.
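The error measures named in this abstract can be computed as in the following minimal sketch. It assumes the usual definitions: RMSEC is the root mean square error on the calibration samples, RMSEP is the same quantity on held-out prediction samples, and SDR is the sample standard deviation of the reference values divided by RMSEP. The acidity values are hypothetical and are not data from the study, and the calibration model itself (PLS or PCR) is not reproduced here.

```python
import math
import statistics

def rmse(observed, predicted):
    """Root mean square error between reference values and model predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def sdr(observed, predicted):
    """Standard deviation of the reference values divided by the prediction error."""
    return statistics.stdev(observed) / rmse(observed, predicted)

# Hypothetical reference acidity values and model outputs (illustration only).
calibration_ref = [3.60, 3.72, 3.81, 3.90, 4.02]
calibration_fit = [3.63, 3.70, 3.84, 3.88, 4.00]
prediction_ref = [3.65, 3.78, 3.95]
prediction_fit = [3.70, 3.74, 4.01]

rmsec = rmse(calibration_ref, calibration_fit)
rmsep = rmse(prediction_ref, prediction_fit)
print("RMSEC:", rmsec)
print("RMSEP:", rmsep)
print("RMSEP - RMSEC:", rmsep - rmsec)
print("SDR:", sdr(prediction_ref, prediction_fit))
```

Reporting RMSEC and RMSEP side by side, as the abstract does, makes it easy to see whether a model fits the calibration set much better than it predicts new samples.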
Does investor ownership of nursing homes compromise the quality of care?
Harrington, C; Woolhandler, S; Mullan, J; Carrillo, H; Himmelstein, D U
2001-09-01
Two thirds of nursing homes are investor owned. This study examined whether investor ownership affects quality. We analyzed 1998 data from state inspections of 13,693 nursing facilities. We used a multivariate model and controlled for case mix, facility characteristics, and location. Investor-owned facilities averaged 5.89 deficiencies per home, 46.5% higher than nonprofit facilities and 43.0% higher than public facilities. In multivariate analysis, investor ownership predicted 0.679 additional deficiencies per home; chain ownership predicted an additional 0.633 deficiencies. Nurse staffing was lower at investor-owned nursing homes. Investor-owned nursing homes provide worse care and less nursing care than do not-for-profit or public homes.
Takahara, Mitsuyoshi; Katakami, Naoto; Kaneto, Hideaki; Noguchi, Midori; Shimomura, Iichiro
2014-01-01
The aim of the current study was to develop a predictive model of insulin resistance using general health checkup data in Japanese employees with one or more metabolic risk factors. We used a database of 846 Japanese employees with one or more metabolic risk factors who underwent general health checkup and a 75-g oral glucose tolerance test (OGTT). Logistic regression models were developed to predict existing insulin resistance evaluated using the Matsuda index. The predictive performance of these models was assessed using the C statistic. The C statistics of body mass index (BMI), waist circumference and their combined use were 0.743, 0.732 and 0.749, with no significant differences. The multivariate backward selection model, in which BMI, the levels of plasma glucose, high-density lipoprotein (HDL) cholesterol, log-transformed triglycerides and log-transformed alanine aminotransferase and hypertension under treatment remained, had a C statistic of 0.816, with a significant difference compared to the combined use of BMI and waist circumference (p<0.01). The C statistic was not significantly reduced when the levels of log-transformed triglycerides and log-transformed alanine aminotransferase and hypertension under treatment were simultaneously excluded from the multivariate model (p=0.14). On the other hand, further exclusion of any of the remaining three variables significantly reduced the C statistic (all p<0.01). When predicting the presence of insulin resistance using general health checkup data in Japanese employees with metabolic risk factors, it is important to take into consideration the BMI and fasting plasma glucose and HDL cholesterol levels.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roeloffzen, Ellen M., E-mail: e.m.a.roeloffzen@umcutrecht.nl; Vulpen, Marco van; Battermann, Jan J.
Purpose: Acute urinary retention (AUR) after iodine-125 (I-125) prostate brachytherapy negatively influences long-term quality of life and therefore should be prevented. We aimed to develop a nomogram to preoperatively predict the risk of AUR. Methods: Using the preoperative data of 714 consecutive patients who underwent I-125 prostate brachytherapy between 2005 and 2008 at our department, we modeled the probability of AUR. Multivariate logistic regression analysis was used to assess the predictive ability of a set of pretreatment predictors and the additional value of a new risk factor (the extent of prostate protrusion into the bladder). The performance of the final model was assessed with calibration and discrimination measures. Results: Of the 714 patients, 57 patients (8.0%) developed AUR after implantation. Multivariate analysis showed that the combination of prostate volume, IPSS score, neoadjuvant hormonal treatment and the extent of prostate protrusion contribute to the prediction of AUR. The discriminative value (receiver operator characteristic area, ROC) of the basic model (including prostate volume, International Prostate Symptom Score, and neoadjuvant hormonal treatment) to predict the development of AUR was 0.70. The addition of prostate protrusion significantly increased the discriminative power of the model (ROC 0.82). Calibration of this final model was good. The nomogram showed that among patients with a low sum score (<18 points), the risk of AUR was only 0%-5%. However, in patients with a high sum score (>35 points), the risk of AUR was more than 20%. Conclusion: This nomogram is a useful tool for physicians to predict the risk of AUR after I-125 prostate brachytherapy. The nomogram can aid in individualized treatment decision-making and patient counseling.
Kamstra, J I; Dijkstra, P U; van Leeuwen, M; Roodenburg, J L N; Langendijk, J A
2015-05-01
Aims of this prospective cohort study were (1) to analyze the course of mouth opening up to 48 months post-radiotherapy (RT), (2) to assess risk factors predicting decrease in mouth opening, and (3) to develop a multivariable prediction model for change in mouth opening in a large sample of patients irradiated for head and neck cancer. Mouth opening was measured prior to RT (baseline) and at 6, 12, 18, 24, 36, and 48 months post-RT. The primary outcome variable was mouth opening. Potential risk factors were entered into a linear mixed model analysis (manual backward-stepwise elimination) to create a multivariable prediction model. The interaction terms between time and risk factors that were significantly related to mouth opening were explored. The study population consisted of 641 patients: 70.4% male, mean age at baseline 62.3 years (sd 12.5). Primary tumors were predominantly located in the oro- and nasopharynx (25.3%) and oral cavity (20.6%). Mean mouth opening at baseline was 38.7 mm (sd 10.8). Six months post-RT, mean mouth opening was smallest, 36.7 mm (sd 10.0). In the linear mixed model analysis, mouth opening was statistically predicted by the location of the tumor, natural logarithm of time post-RT in months (Ln (months)), gender, baseline mouth opening, and baseline age. All main effects interacted with Ln (months). The mean mouth opening decreased slightly over time. Mouth opening was predicted by tumor location, time, gender, baseline mouth opening, and age. The model can be used to predict mouth opening. Copyright © 2015 Elsevier Ltd. All rights reserved.
Maisonneuve, Patrick; Bagnardi, Vincenzo; Bellomi, Massimo; Spaggiari, Lorenzo; Pelosi, Giuseppe; Rampinelli, Cristiano; Bertolotti, Raffaella; Rotmensz, Nicole; Field, John K; Decensi, Andrea; Veronesi, Giulia
2011-11-01
Screening with low-dose helical computed tomography (CT) has been shown to significantly reduce lung cancer mortality but the optimal target population and time interval to subsequent screening are yet to be defined. We developed two models to stratify individual smokers according to risk of developing lung cancer. We first used the number of lung cancers detected at baseline screening CT in the 5,203 asymptomatic participants of the COSMOS trial to recalibrate the Bach model, which we propose using to select smokers for screening. Next, we incorporated lung nodule characteristics and presence of emphysema identified at baseline CT into the Bach model and proposed the resulting multivariable model to predict lung cancer risk in screened smokers after baseline CT. Age and smoking exposure were the main determinants of lung cancer risk. The recalibrated Bach model accurately predicted lung cancers detected during the first year of screening. Presence of nonsolid nodules (RR = 10.1, 95% CI = 5.57-18.5), nodule size more than 8 mm (RR = 9.89, 95% CI = 5.84-16.8), and emphysema (RR = 2.36, 95% CI = 1.59-3.49) at baseline CT were all significant predictors of subsequent lung cancers. Incorporation of these variables into the Bach model increased the predictive value of the multivariable model (c-index = 0.759, internal validation). The recalibrated Bach model seems suitable for selecting the higher risk population for recruitment for large-scale CT screening. The Bach model incorporating CT findings at baseline screening could help define the time interval to subsequent screening in individual participants. Further studies are necessary to validate these models.
Yahya, Noorazrul; Ebert, Martin A; Bulsara, Max; Kennedy, Angel; Joseph, David J; Denham, James W
2016-08-01
Most predictive models are not sufficiently validated for prospective use. We performed independent external validation of published predictive models for urinary dysfunctions following radiotherapy of the prostate. Multivariable models developed to predict atomised and generalised urinary symptoms, both acute and late, were considered for validation using a dataset representing 754 participants from the TROG 03.04-RADAR trial. Endpoints and features were harmonised to match the predictive models. The overall performance, calibration and discrimination were assessed. 14 models from four publications were validated. The discrimination of the predictive models in an independent external validation cohort, measured using the area under the receiver operating characteristic (ROC) curve, ranged from 0.473 to 0.695, generally lower than in internal validation. 4 models had ROC >0.6. Shrinkage was required for all predictive models' coefficients ranging from -0.309 (prediction probability was inverse to observed proportion) to 0.823. Predictive models which include baseline symptoms as a feature produced the highest discrimination. Two models produced a predicted probability of 0 and 1 for all patients. Predictive models vary in performance and transferability illustrating the need for improvements in model development and reporting. Several models showed reasonable potential but efforts should be increased to improve performance. Baseline symptoms should always be considered as potential features for predictive models. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.
2003-01-01
Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
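The way a fitted logistic regression turns basin characteristics into a probability can be sketched as follows. This is a minimal illustration assuming the standard logistic form P = 1 / (1 + exp(-(b0 + sum of b_i * x_i))). The predictor names, intercept and coefficients are hypothetical placeholders chosen for the example and are not the values fitted in the study.

```python
import math

def debris_flow_probability(intercept, coefficients, predictors):
    """Logistic model: convert a linear predictor into a probability in (0, 1)."""
    linear = intercept + sum(coefficients[name] * predictors[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients (illustration only, not fitted values).
example_intercept = -4.0
example_coefficients = {
    "pct_slope_over_30": 0.03,         # percent of basin steeper than 30 percent
    "pct_burned_medium_high": 0.02,    # percent burned at medium or high severity
    "storm_intensity_mm_per_hr": 0.15  # average storm intensity
}

# Hypothetical basin.
example_basin = {
    "pct_slope_over_30": 60.0,
    "pct_burned_medium_high": 45.0,
    "storm_intensity_mm_per_hr": 10.0,
}

p = debris_flow_probability(example_intercept, example_coefficients, example_basin)
print("estimated probability of a debris flow:", p)
```

Evaluating such a function for every basin is, in outline, how a probability map could be produced once a model has been entered into a GIS, as step (4) of the abstract describes.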
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bessac, Julie; Constantinescu, Emil; Anitescu, Mihai
We propose a statistical space-time model for predicting atmospheric wind speed based on deterministic numerical weather predictions and historical measurements. We consider a Gaussian multivariate space-time framework that combines multiple sources of past physical model outputs and measurements in order to produce a probabilistic wind speed forecast within the prediction window. We illustrate this strategy on wind speed forecasts during several months in 2012 for a region near the Great Lakes in the United States. The results show that the prediction is improved in the mean-squared sense relative to the numerical forecasts as well as in probabilistic scores. Moreover, the samples are shown to produce realistic wind scenarios based on sample spectra and space-time correlation structure.
Bessac, Julie; Constantinescu, Emil; Anitescu, Mihai
2018-03-01
We propose a statistical space-time model for predicting atmospheric wind speed based on deterministic numerical weather predictions and historical measurements. We consider a Gaussian multivariate space-time framework that combines multiple sources of past physical model outputs and measurements in order to produce a probabilistic wind speed forecast within the prediction window. We illustrate this strategy on wind speed forecasts during several months in 2012 for a region near the Great Lakes in the United States. The results show that the prediction is improved in the mean-squared sense relative to the numerical forecasts as well as in probabilistic scores. Moreover, the samples are shown to produce realistic wind scenarios based on sample spectra and space-time correlation structure.
Algorithms for Robust Identification and Control of Large Space Structures. Phase 1.
1988-05-14
Variate Analysis," Proc. Amer. Control Conf., San Francisco, pp. 445-451. LECTIQUE, J., Rault, A., Tessier, M., and Testud, J.L. (1978), "Multivariable...Rault, J.L. Testud, and J. Papon (1978), "Model Predictive Heuristic Control: Applications to Industrial Processes," Automatica, Vol. 14, pp. 413...Control Conference, Minneapolis, MN, June. TESTUD, J.L. (1979), "Commande Numerique Multivariable du Ballon de Recuperation de Vapeur," Adersa/Gerbios
Martin, Lisa; Watanabe, Sharon; Fainsinger, Robin; Lau, Francis; Ghosh, Sunita; Quan, Hue; Atkins, Marlis; Fassbender, Konrad; Downing, G Michael; Baracos, Vickie
2010-10-01
To determine whether elements of a standard nutritional screening assessment are independently prognostic of survival in patients with advanced cancer. A prospective nested cohort of patients with metastatic cancer was accrued from different units of a Regional Palliative Care Program. Patients completed a nutritional screen on admission. Data included age, sex, cancer site, height, weight history, dietary intake, 13 nutrition impact symptoms, and patient- and physician-reported performance status (PS). Univariate and multivariate survival analyses were conducted. Concordance statistics (c-statistics) were used to test the predictive accuracy of models based on training and validation sets; a c-statistic of 0.5 indicates the model predicts the outcome no better than chance; perfect prediction has a c-statistic of 1.0. A training set of patients in palliative home care (n = 1,164) was used to identify prognostic variables. Primary disease site, PS, short-term weight change (either gain or loss), dietary intake, and dysphagia predicted survival in multivariate analysis (P < .05). A model that separated patients only by disease site and PS yielded high c-statistics between predicted and observed survival in the training set (0.90) and validation set (0.88; n = 603). The addition of weight change, dietary intake, and dysphagia did not further improve the c-statistic of the model. The c-statistic was also not altered by substituting physician-rated palliative PS for patient-reported PS. We demonstrate a high probability of concordance between predicted and observed survival for patients in distinct palliative care settings (home care, tertiary inpatient, ambulatory outpatient) based on patient-reported information.
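The c-statistic described in this abstract can be illustrated with a small sketch. The code below is an assumption-laden illustration, not the authors' procedure: it uses a Harrell-style pairwise definition for right-censored data, and the function name `concordance_index` and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style c-statistic for right-censored survival data.

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means shorter predicted survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # the earlier death had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.5, 0.1]))  # 1.0 (perfect ranking)
print(concordance_index(times, events, [0.5, 0.5, 0.5, 0.5]))  # 0.5 (chance level)
```

A model that ranks every comparable pair correctly reaches 1.0, and an uninformative model that gives everyone the same score lands at 0.5, matching the two reference points quoted in the abstract.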
NASA Astrophysics Data System (ADS)
Attia, Khalid A. M.; Nassar, Mohammed W. I.; El-Zeiny, Mohamed B.; Serag, Ahmed
2017-01-01
For the first time, a new variable selection method based on swarm intelligence, namely the firefly algorithm, is coupled with three different multivariate calibration models, namely concentration residual augmented classical least squares, artificial neural network, and support vector regression, applied to UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of this new algorithm over the genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between the models regarding their predictive abilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration.
A predictive risk model for medical intractability in epilepsy.
Huang, Lisu; Li, Shi; He, Dake; Bao, Weiqun; Li, Ling
2014-08-01
This study aimed to investigate early predictors (6 months after diagnosis) of medical intractability in epilepsy. All children <12 years of age having two or more unprovoked seizures 24 h apart at Xinhua Hospital between 1992 and 2006 were included. Medical intractability was defined as failure, due to lack of seizure control, of more than 2 antiepileptic drugs at maximum tolerated doses, with an average of more than 1 seizure per month for 24 months and no more than 3 consecutive months of seizure freedom during this interval. Univariate and multivariate logistic regression models were performed to determine the risk factors for developing medical intractability. Receiver operating characteristic curve was applied to fit the best compounded predictive model. A total of 649 patients were identified, out of which 119 (18%) met the study definition of intractable epilepsy at 2 years after diagnosis, and the rate of intractable epilepsy in patients with idiopathic syndromes was 12%. Multivariate logistic regression analysis revealed that neurodevelopmental delay, symptomatic etiology, partial seizures, and more than 10 seizures before diagnosis were significant and independent risk factors for intractable epilepsy. The best model to predict medical intractability in epilepsy comprised neurological physical abnormality, age at onset of epilepsy under 1 year, more than 10 seizures before diagnosis, and partial epilepsy, and the area under receiver operating characteristic curve was 0.7797. This model also fitted best in patients with idiopathic syndromes. A predictive model of medically intractable epilepsy composed of only four characteristics is established. This model is comparatively accurate and simple to apply clinically. Copyright © 2014 Elsevier Inc. All rights reserved.
Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery.
Liu, Han; Wang, Lie; Zhao, Tuo
2015-08-01
We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence O(1/ϵ), where ϵ is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.
Hara, Tomohiko; Nakanishi, Hiroyuki; Nakagawa, Tohru; Komiyama, Motokiyo; Kawahara, Takashi; Manabe, Tomoko; Miyake, Mototaka; Arai, Eri; Kanai, Yae; Fujimoto, Hiroyuki
2013-10-01
Recent studies have shown an improvement in prostate cancer diagnosis with the use of 3.0-Tesla magnetic resonance imaging. We retrospectively assessed the ability of this imaging technique to predict side-specific extracapsular extension of prostate cancer. From October 2007 to August 2011, prostatectomy was carried out in 396 patients after preoperative 3.0-Tesla magnetic resonance imaging. Among these, 132 (primary sample) and 134 patients (validation sample) underwent 12-core prostate biopsy at the National Cancer Center Hospital of Tokyo, Japan, and at other institutions, respectively. In the primary dataset, univariate and multivariate analyses were carried out to predict side-specific extracapsular extension using variables determined preoperatively, including 3.0-Tesla magnetic resonance imaging findings (T2-weighted and diffusion-weighted imaging). A prediction model was then constructed and applied to the validation study sample. Multivariate analysis identified four significant independent predictors (P < 0.05), including a biopsy Gleason score of ≥8, positive 3.0-Tesla diffusion-weighted magnetic resonance imaging findings, ≥2 positive biopsy cores on each side and a maximum percentage of positive cores ≥31% on each side. The negative predictive value was 93.9% in the combination model with these four predictors, meanwhile the positive predictive value was 33.8%. Good reproducibility of these four significant predictors and the combination model was observed in the validation study sample. The side-specific extracapsular extension prediction by the biopsy Gleason score and factors associated with tumor location, including a positive 3.0-Tesla diffusion-weighted magnetic resonance imaging finding, have a high negative predictive value, but a low positive predictive value. © 2013 The Japanese Urological Association.
Song, Wan; Bang, Seok Hwan; Jeon, Hwang Gyun; Jeong, Byong Chang; Seo, Seong Il; Jeon, Seong Soo; Choi, Han Yong; Kim, Chan Kyo; Lee, Hyun Moo
2018-02-23
The objective of this study was to investigate the effect of Prostate Imaging Reporting and Data System version 2 (PI-RADSv2) on prediction of postoperative Gleason score (GS) upgrading for patients with biopsy GS 6 prostate cancer. We retrospectively reviewed 443 patients who underwent magnetic resonance imaging (MRI) and radical prostatectomy for biopsy-proven GS 6 prostate cancer between January 2011 and December 2013. Preoperative clinical variables and pathologic GS were examined, and all MRI findings were assessed with PI-RADSv2. Receiver operating characteristic curves were used to compare predictive accuracies of multivariate logistic regression models with or without PI-RADSv2. Of the total 443 patients, 297 (67.0%) experienced GS upgrading postoperatively. PI-RADSv2 scores 1 to 3 and 4 to 5 were identified in 157 (35.4%) and 286 (64.6%) patients, respectively, and the rate of GS upgrading was 54.1% and 74.1%, respectively (P < .001). In multivariate analysis, prostate-specific antigen density > 0.16 ng/mL², number of positive cores ≥ 2, maximum percentage of cancer per core > 20, and PI-RADSv2 score 4 to 5 were independent predictors influencing GS upgrading (each P < .05). When predictive accuracies of multivariate models with or without PI-RADSv2 were compared, the model including PI-RADSv2 was shown to have significantly higher accuracy (area under the curve, 0.729 vs. 0.703; P = .041). The PI-RADSv2 score is an independent predictor of postoperative GS upgrading and increases the predictive accuracy of GS upgrading. PI-RADSv2 might be used as a preoperative imaging tool to determine risk classification and to help counsel patients with regard to treatment decision and prognosis of disease. Copyright © 2018 Elsevier Inc. All rights reserved.
Multivariate Cholesky models of human female fertility patterns in the NLSY.
Rodgers, Joseph Lee; Bard, David E; Miller, Warren B
2007-03-01
Substantial evidence now exists that variables measuring or correlated with human fertility outcomes have a heritable component. In this study, we define a series of age-sequenced fertility variables, and fit multivariate models to account for underlying shared genetic and environmental sources of variance. We make predictions based on a theory developed by Udry [(1996) Biosocial models of low-fertility societies. In: Casterline, JB, Lee RD, Foote KA (eds) Fertility in the United States: new patterns, new theories. The Population Council, New York] suggesting that biological/genetic motivations can be more easily realized and measured in settings in which fertility choices are available. Udry's theory, along with principles from molecular genetics and certain tenets of life history theory, allow us to make specific predictions about biometrical patterns across age. Consistent with predictions, our results suggest that there are different sources of genetic influence on fertility variance at early compared to later ages, but that there is only one source of shared environmental influence that occurs at early ages. These patterns are suggestive of the types of gene-gene and gene-environment interactions for which we must account to better understand individual differences in fertility outcomes.
Recurrent Neural Networks for Multivariate Time Series with Missing Values.
Che, Zhengping; Purushotham, Sanjay; Cho, Kyunghyun; Sontag, David; Liu, Yan
2018-04-17
Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a., informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
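The two representations of missingness named in this abstract, masking and time interval, can be sketched as follows. This is a minimal illustration under an assumed definition: the mask is 1 where a variable is observed, and the interval is the time elapsed since that variable was last observed, accumulating across consecutive missing steps. The function name and the toy series are hypothetical, and the recurrent network itself is not shown.

```python
def masking_and_intervals(timestamps, values):
    """Build masking and time-interval arrays for a multivariate series.

    timestamps -- time of each step, in any consistent unit
    values     -- values[t][d] is the d-th variable at step t, or None if missing
    Returns (mask, delta), both shaped like `values`.
    """
    n_steps = len(values)
    n_vars = len(values[0])
    # mask[t][d] = 1 when the variable was observed at step t, else 0
    mask = [[0 if v is None else 1 for v in row] for row in values]
    # delta[t][d] = time since variable d was last observed (0 at the first step)
    delta = [[0.0] * n_vars for _ in range(n_steps)]
    for t in range(1, n_steps):
        gap = timestamps[t] - timestamps[t - 1]
        for d in range(n_vars):
            if mask[t - 1][d] == 1:
                delta[t][d] = gap                     # observed at the previous step
            else:
                delta[t][d] = gap + delta[t - 1][d]   # keep accumulating the gap
    return mask, delta


# Hypothetical single-variable series with two consecutive missing readings.
mask, delta = masking_and_intervals([0, 1, 3, 6], [[1.0], [None], [None], [2.0]])
print(mask)   # [[1], [0], [0], [1]]
print(delta)  # [[0.0], [1], [3], [6]]
```

Feeding both arrays to the model alongside the raw values lets it see not only that a reading is absent but also how stale the last observation is.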
Pastore, Francesco; Conson, Manuel; D'Avino, Vittoria; Palma, Giuseppe; Liuzzi, Raffaele; Solla, Raffaele; Farella, Antonio; Salvatore, Marco; Cella, Laura; Pacelli, Roberto
2016-01-01
Severe acute radiation-induced skin toxicity (RIST) after breast irradiation is a side effect impacting the quality of life in breast cancer (BC) patients. The aim of the present study was to develop normal tissue complication probability (NTCP) models of severe acute RIST in BC patients. We evaluated 140 consecutive BC patients undergoing conventional three-dimensional conformal radiotherapy (3D-CRT) after breast conserving surgery in a prospective study assessing acute RIST. The acute RIST was classified according to the RTOG scoring system. Dose-surface histograms (DSHs) of the body structure in the breast region were extracted as representative of skin irradiation. Patient, disease, and treatment-related characteristics were analyzed along with DSHs. NTCP modeling by Lyman-Kutcher-Burman (LKB) and by multivariate logistic regression using bootstrap resampling techniques was performed. Models were evaluated by Spearman's Rs coefficient and ROC area. By the end of radiotherapy, 139 (99%) patients developed any degree of acute RIST. G3 RIST was found in 11 of 140 (8%) patients. Mild-moderate (G1-G2) RIST was still present at 40 days after treatment in six (4%) patients. Using DSHs for LKB modeling of acute RIST severity (RTOG G3 vs. G0-2), parameter estimates were TD50=39 Gy, n=0.38 and m=0.14 [Rs = 0.25, area under the curve (AUC) = 0.77, p = 0.003]. On multivariate analysis, the most predictive model of acute RIST severity was a two-variable model including the skin receiving ≥30 Gy (S30) and psoriasis [Rs = 0.32, AUC = 0.84, p < 0.001]. Using body DSH as representative of skin dose, the LKB n parameter was consistent with a surface effect for the skin. A good prediction performance was obtained using a data-driven multivariate model including S30 and a pre-existing skin disease (psoriasis) as a clinical factor.
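For readers unfamiliar with the Lyman-Kutcher-Burman (LKB) model, the following sketch shows how the three reported parameters (TD50, n, m) map a dose-surface histogram to a complication probability. It assumes the standard LKB form, in which a generalized equivalent uniform dose is passed through a normal cumulative distribution; the function names and the example histograms are hypothetical, and only the default parameter values come from the abstract.

```python
import math


def generalized_eud(doses_gy, surface_fractions, n):
    """Generalized equivalent uniform dose of a differential histogram."""
    total = sum(surface_fractions)  # normalise so the fractions sum to 1
    weighted = sum((v / total) * d ** (1.0 / n)
                   for d, v in zip(doses_gy, surface_fractions))
    return weighted ** n


def lkb_ntcp(doses_gy, surface_fractions, td50=39.0, n=0.38, m=0.14):
    """LKB NTCP; defaults are the parameter estimates quoted in the abstract."""
    eud = generalized_eud(doses_gy, surface_fractions, n)
    t = (eud - td50) / (m * td50)
    # standard normal cumulative distribution evaluated at t
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


# Hypothetical histograms: a uniform dose equal to TD50 gives a 50% probability.
print(round(lkb_ntcp([39.0], [1.0]), 3))  # 0.5
low = lkb_ntcp([10.0, 20.0], [0.5, 0.5])
high = lkb_ntcp([30.0, 50.0], [0.5, 0.5])
print(low < high)  # True
```

With n below 1, the high-dose bins dominate the equivalent uniform dose, which is one way to read the abstract's remark that the n parameter was consistent with a surface effect.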
Multivariate methods for indoor PM10 and PM2.5 modelling in naturally ventilated schools buildings
NASA Astrophysics Data System (ADS)
Elbayoumi, Maher; Ramli, Nor Azam; Md Yusof, Noor Faizah Fitri; Yahaya, Ahmad Shukri Bin; Al Madhoun, Wesam; Ul-Saufie, Ahmed Zia
2014-09-01
In this study, concentrations of PM10, PM2.5, CO and CO2 and meteorological variables (wind speed, air temperature, and relative humidity) were used to predict the annual and seasonal indoor concentrations of PM10 and PM2.5 using multivariate statistical methods. The data were collected in twelve naturally ventilated schools in the Gaza Strip (Palestine) from October 2011 to May 2012 (academic year). Bivariate correlation analysis showed that indoor PM10 and PM2.5 were strongly and positively correlated with outdoor concentrations of PM10 and PM2.5. Further, multiple linear regression (MLR) was used for modelling, and the R2 values of the indoor models were 0.62 for PM10 and 0.84 for PM2.5. The performance indicators of the MLR models indicated that the annual models for PM10 and PM2.5 predicted better than the seasonal models. In order to reduce the number of input variables, principal component analysis (PCA) and principal component regression (PCR) were applied to the annual data. The predicted R2 values were 0.40 and 0.73 for PM10 and PM2.5, respectively. The PM10 models (MLR and PCR) tend to underestimate indoor PM10 concentrations because they do not take into account the occupants' activities, which strongly affect indoor concentrations during class hours.
Summers, Richard L; Pipke, Matt; Wegerich, Stephan; Conkright, Gary; Isom, Kristen C
2014-01-01
Background. Monitoring cardiovascular hemodynamics in the modern clinical setting is a major challenge. Increasing amounts of physiologic data must be analyzed and interpreted in the context of the individual patient's pathology and inherent biologic variability. Certain data-driven analytical methods are currently being explored for smart monitoring of data streams from patients as a first-tier automated detection system for clinical deterioration. As a prelude to human clinical trials, an empirical multivariate machine learning method called Similarity-Based Modeling (SBM) was tested in an in silico experiment using data generated with the aid of a detailed computer simulator of human physiology (Quantitative Circulatory Physiology or QCP), which contains complex control systems with realistic integrated feedback loops. Methods. SBM is a kernel-based, multivariate machine learning method that uses monitored clinical information to generate an empirical model of a patient's physiologic state. This platform allows for the use of predictive analytic techniques to identify early changes in a patient's condition that are indicative of a state of deterioration or instability. The integrity of the technique was tested through an in silico experiment using QCP in which computer simulations of a slowly evolving cardiac tamponade resulted in a progressive state of cardiovascular decompensation. Simulator outputs for the variables under consideration were generated at a 2-min data rate (0.0083 Hz) with the tamponade introduced at a point 420 minutes into the simulation sequence. The ability of the SBM predictive analytics methodology to identify clinical deterioration was compared to the thresholds used by conventional monitoring methods. Results. The SBM modeling method was found to closely track the normal physiologic variation as simulated by QCP.
With the slow development of the tamponade, the SBM model was seen to disagree with the simulated biosignals in the early stages of physiologic deterioration, while the variables were still within normal ranges. Thus, the SBM system was found to identify pathophysiologic conditions in a timeframe in which they would not have been detected in a usual clinical monitoring scenario. Conclusion. In this study the functionality of a multivariate machine learning predictive methodology that incorporates commonly monitored clinical information was tested using a computer model of human physiology. SBM and predictive analytics were able to differentiate a state of decompensation while the monitored variables were still within normal clinical ranges. This finding suggests that SBM could provide early identification of clinical deterioration using predictive analytic techniques. Keywords: predictive analytics, hemodynamic, monitoring.
Kasprowicz, Magdalena; Burzynska, Malgorzata; Melcer, Tomasz; Kübler, Andrzej
2016-01-01
To compare the performance of multivariate predictive models incorporating either the Full Outline of UnResponsiveness (FOUR) score or Glasgow Coma Score (GCS) in order to test whether substituting GCS with the FOUR score in predictive models for outcome in patients after TBI is beneficial. A total of 162 TBI patients were prospectively enrolled in the study. Stepwise logistic regression analysis was conducted to compare the prediction of (1) in-ICU mortality and (2) unfavourable outcome at 3 months post-injury using as predictors either the FOUR score or GCS along with other factors that may affect patient outcome. The areas under the ROC curves (AUCs) were used to compare the discriminant ability and predictive power of the models. Internal validation was performed with a bootstrap technique and expressed as accuracy rate (AcR). The FOUR score, age, the CT Rotterdam score, systolic ABP and being placed on a ventilator within day one (model 1: AUC: 0.906 ± 0.024; AcR: 80.3 ± 4.8%) performed equally well in predicting in-ICU mortality as the combination of GCS with the same set of predictors plus pupil reactivity (model 2: AUC: 0.913 ± 0.022; AcR: 81.1 ± 4.8%). The CT Rotterdam score, age and either the FOUR score (model 3) or GCS (model 4) predicted unfavourable outcome at 3 months post-injury equally well (AUC: 0.852 ± 0.037 vs. 0.866 ± 0.034; AcR: 72.3 ± 6.6% vs. 71.9 ± 6.6%, respectively). Adding the FOUR score or GCS at discharge from ICU to predictive models for unfavourable outcome significantly increased their performance (AUC: 0.895 ± 0.029, p = 0.05; AcR: 76.1 ± 6.5%; p < 0.004 when compared with model 3; and AUC: 0.918 ± 0.025, p < 0.05; AcR: 79.6 ± 7.2%, p < 0.009 when compared with model 4), but there was no benefit from substituting GCS with the FOUR score. Results showed that the FOUR score and GCS perform equally well in multivariate predictive modelling in TBI.
Louis R Iverson; Anantha M. Prasad; Mark W. Schwartz
2005-01-01
We predict current distribution and abundance for tree species present in eastern North America, and subsequently estimate potential suitable habitat for those species under a changed climate with 2 x CO2. We used a series of statistical models (i.e., Regression Tree Analysis (RTA), Multivariate Adaptive Regression Splines (MARS), Bagging Trees (...
Multivariate prediction of upper limb prosthesis acceptance or rejection.
Biddiss, Elaine A; Chau, Tom T
2008-07-01
To develop a model for prediction of upper limb prosthesis use or rejection. A questionnaire exploring factors in prosthesis acceptance was distributed internationally to individuals with upper limb absence through community-based support groups and rehabilitation hospitals. A total of 191 participants (59 prosthesis rejecters and 132 prosthesis wearers) were included in this study. A logistic regression model, a C5.0 decision tree, and a radial basis function neural network were developed and compared in terms of sensitivity (prediction of prosthesis rejecters), specificity (prediction of prosthesis wearers), and overall cross-validation accuracy. The logistic regression and neural network provided comparable overall accuracies of approximately 84 +/- 3%, specificity of 93%, and sensitivity of 61%. Fitting time-frame emerged as the predominant predictor. Individuals fitted within two years of birth (congenital) or six months of amputation (acquired) were 16 times more likely to continue prosthesis use. To increase rates of prosthesis acceptance, clinical directives should focus on timely, client-centred fitting strategies and the development of improved prostheses and healthcare for individuals with high-level or bilateral limb absence. Multivariate analyses are useful in determining the relative importance of the many factors involved in prosthesis acceptance and rejection.
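As a rough consistency check on the figures in this abstract, overall accuracy can be recovered from sensitivity, specificity, and the two group sizes. The sketch below is illustrative arithmetic only: it assumes the reported rates apply to the full sample of 59 rejecters and 132 wearers, which may not match how the cross-validated accuracy was actually computed, and the function name is hypothetical.

```python
def overall_accuracy(sensitivity, specificity, n_positive, n_negative):
    """Accuracy as the class-size-weighted mix of sensitivity and specificity."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Positives are prosthesis rejecters (n = 59); negatives are wearers (n = 132).
accuracy = overall_accuracy(0.61, 0.93, n_positive=59, n_negative=132)
print(round(accuracy, 3))  # 0.831
```

The result, about 83%, sits inside the reported range of approximately 84 +/- 3%, and it shows how the larger group of wearers pulls overall accuracy toward the specificity figure.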
Binder, Harald; Sauerbrei, Willi; Royston, Patrick
2013-06-15
In observational studies, many continuous or categorical covariates may be related to an outcome. Various spline-based procedures or the multivariable fractional polynomial (MFP) procedure can be used to identify important variables and functional forms for continuous covariates. This is the main aim of an explanatory model, as opposed to a model only for prediction. The type of analysis often guides the complexity of the final model. Spline-based procedures and MFP have tuning parameters for choosing the required complexity. To compare model selection approaches, we perform a simulation study in the linear regression context based on a data structure intended to reflect realistic biomedical data. We vary the sample size, variance explained and complexity parameters for model selection. We consider 15 variables. A sample size of 200 (1000) and R² = 0.2 (0.8) is the scenario with the smallest (largest) amount of information. For assessing performance, we consider prediction error, correct and incorrect inclusion of covariates, qualitative measures for judging selected functional forms and further novel criteria. From limited information, a suitable explanatory model cannot be obtained. Prediction performance from all types of models is similar. With a medium amount of information, MFP performs better than splines on several criteria. MFP better recovers simpler functions, whereas splines better recover more complex functions. For a large amount of information and no local structure, MFP and the spline procedures often select similar explanatory models. Copyright © 2012 John Wiley & Sons, Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cella, Laura, E-mail: laura.cella@cnr.it; Department of Advanced Biomedical Sciences, Federico II University School of Medicine, Naples; Liuzzi, Raffaele
Purpose: To establish a multivariate normal tissue complication probability (NTCP) model for radiation-induced asymptomatic heart valvular defects (RVD). Methods and Materials: Fifty-six patients treated with sequential chemoradiation therapy for Hodgkin lymphoma (HL) were retrospectively reviewed for RVD events. Clinical information along with whole heart, cardiac chambers, and lung dose distribution parameters was collected, and the correlations to RVD were analyzed by means of Spearman's rank correlation coefficient (Rs). For the selection of the model order and parameters for NTCP modeling, a multivariate logistic regression method using resampling techniques (bootstrapping) was applied. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC). Results: When we analyzed the whole heart, a 3-variable NTCP model including the maximum dose, whole heart volume, and lung volume was shown to be the optimal predictive model for RVD (Rs = 0.573, P<.001, AUC = 0.83). When we analyzed the cardiac chambers individually, for the left atrium and for the left ventricle, an NTCP model based on 3 variables including the percentage volume exceeding 30 Gy (V30), cardiac chamber volume, and lung volume was selected as the most predictive model (Rs = 0.539, P<.001, AUC = 0.83; and Rs = 0.557, P<.001, AUC = 0.82, respectively). The NTCP values increase as heart maximum dose or cardiac chambers V30 increase. They also increase with larger volumes of the heart or cardiac chambers and decrease when lung volume is larger. Conclusions: We propose logistic NTCP models for RVD considering not only heart irradiation dose but also the combined effects of lung and heart volumes. Our study establishes the statistical evidence of the indirect effect of lung size on radio-induced heart toxicity.
Li, Hai-Yan; Guo, Yu-Tao; Tian, Cui; Song, Chao-Qun; Mu, Yang; Li, Yang; Chen, Yun-Dai
2017-08-01
The vasovagal reflex syndrome (VVRS) is common in patients undergoing percutaneous coronary intervention (PCI). However, prediction and prevention of the risk for VVRS have not been fully achieved. This study was conducted to develop a Risk Prediction Score Model to identify the determinants of VVRS in a large Chinese population cohort receiving PCI. From the hospital electronic medical database, we identified 3550 patients who received PCI (78.0% males, mean age 60 years) in Chinese PLA General Hospital from January 1, 2000 to August 30, 2016. Multivariate analysis and receiver operating characteristic (ROC) analysis were performed. The adverse events of VVRS in the patients were significantly more frequent after the PCI procedure than before the operation (all P < 0.001). The rate of VVRS [95% confidence interval (CI)] in patients receiving PCI was 4.5% (4.1%-5.6%). Compared to the patients suffering no VVRS, incidence of VVRS involved the following factors: female gender, primary PCI, hypertension, implantation of more than two stents in the left anterior descending (LAD) artery, and the femoral puncture site. The multivariate analysis suggested that they were independent risk factors for predicting the incidence of VVRS (all P < 0.001). We developed a risk prediction score model for VVRS. ROC analysis showed that the risk prediction score model was effectively predictive of the incidence of VVRS in patients receiving PCI (c-statistic 0.76, 95% CI: 0.72-0.79, P < 0.001). There were decreased events of VVRS in the patients receiving PCI whose diastolic blood pressure dropped by more than 30 mmHg and heart rate reduced by 10 beats per minute (AUC: 0.84, 95% CI: 0.81-0.87, P < 0.001). The risk prediction score is quite efficient in predicting the incidence of VVRS in patients receiving PCI. The factors involved may include the femoral puncture site, female gender, hypertension, primary PCI, and more than 2 stents implanted in the LAD.
A simple prognostic model for overall survival in metastatic renal cell carcinoma.
Assi, Hazem I; Patenaude, Francois; Toumishey, Ethan; Ross, Laura; Abdelsalam, Mahmoud; Reiman, Tony
2016-01-01
The primary purpose of this study was to develop a simpler prognostic model to predict overall survival for patients treated for metastatic renal cell carcinoma (mRCC) by examining variables shown in the literature to be associated with survival. We conducted a retrospective analysis of patients treated for mRCC at two Canadian centres. All patients who started first-line treatment were included in the analysis. A multivariate Cox proportional hazards regression model was constructed using a stepwise procedure. Patients were assigned to risk groups depending on how many of the three risk factors from the final multivariate model they had. There were three risk factors in the final multivariate model: hemoglobin, prior nephrectomy, and time from diagnosis to treatment. Patients in the high-risk group (two or three risk factors) had a median survival of 5.9 months, while those in the intermediate-risk group (one risk factor) had a median survival of 16.2 months, and those in the low-risk group (no risk factors) had a median survival of 50.6 months. In multivariate analysis, shorter survival times were associated with hemoglobin below the lower limit of normal, absence of prior nephrectomy, and initiation of treatment within one year of diagnosis.
A simple prognostic model for overall survival in metastatic renal cell carcinoma
Assi, Hazem I.; Patenaude, Francois; Toumishey, Ethan; Ross, Laura; Abdelsalam, Mahmoud; Reiman, Tony
2016-01-01
Introduction: The primary purpose of this study was to develop a simpler prognostic model to predict overall survival for patients treated for metastatic renal cell carcinoma (mRCC) by examining variables shown in the literature to be associated with survival. Methods: We conducted a retrospective analysis of patients treated for mRCC at two Canadian centres. All patients who started first-line treatment were included in the analysis. A multivariate Cox proportional hazards regression model was constructed using a stepwise procedure. Patients were assigned to risk groups depending on how many of the three risk factors from the final multivariate model they had. Results: There were three risk factors in the final multivariate model: hemoglobin, prior nephrectomy, and time from diagnosis to treatment. Patients in the high-risk group (two or three risk factors) had a median survival of 5.9 months, while those in the intermediate-risk group (one risk factor) had a median survival of 16.2 months, and those in the low-risk group (no risk factors) had a median survival of 50.6 months. Conclusions: In multivariate analysis, shorter survival times were associated with hemoglobin below the lower limit of normal, absence of prior nephrectomy, and initiation of treatment within one year of diagnosis. PMID:27217858
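The risk grouping described in this abstract amounts to counting how many of three risk factors a patient has. The sketch below illustrates that counting rule; the function names, the hemoglobin threshold argument, and the 12-month cutoff for "within one year" are assumptions made for illustration, while the group definitions and median survival figures are those quoted in the abstract.

```python
# Median overall survival per group, in months, as quoted in the abstract.
MEDIAN_SURVIVAL_MONTHS = {"low": 50.6, "intermediate": 16.2, "high": 5.9}


def count_risk_factors(hemoglobin, hemoglobin_lower_limit,
                       prior_nephrectomy, months_diagnosis_to_treatment):
    """Count the three risk factors from the final multivariate model."""
    factors = 0
    if hemoglobin < hemoglobin_lower_limit:
        factors += 1  # hemoglobin below the lower limit of normal
    if not prior_nephrectomy:
        factors += 1  # absence of prior nephrectomy
    if months_diagnosis_to_treatment < 12:
        factors += 1  # treatment started within one year of diagnosis (assumed cutoff)
    return factors


def risk_group(n_factors):
    """Map the factor count to the low, intermediate, or high risk group."""
    if n_factors == 0:
        return "low"
    if n_factors == 1:
        return "intermediate"
    return "high"  # two or three risk factors


# Hypothetical patient: low hemoglobin, prior nephrectomy, treated at 20 months.
group = risk_group(count_risk_factors(110, 120, True, 20))
print(group, MEDIAN_SURVIVAL_MONTHS[group])  # intermediate 16.2
```

Because the score is a simple count, it can be applied at the bedside without software, which is presumably the sense in which the authors call the model simpler.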
Goode, C; LeRoy, J; Allen, D G
2007-01-01
This study reports on a multivariate analysis of the moving bed biofilm reactor (MBBR) wastewater treatment system at a Canadian pulp mill. The modelling approach involved a data overview by principal component analysis (PCA) followed by partial least squares (PLS) modelling with the objective of explaining and predicting changes in the BOD output of the reactor. Over two years of data with 87 process measurements were used to build the models. Variables were collected from the MBBR control scheme as well as upstream in the bleach plant and in digestion. To account for process dynamics, a variable lagging approach was used for variables with significant temporal correlations. It was found that wood type pulped at the mill was a significant variable governing reactor performance. Other important variables included flow parameters, faults in the temperature or pH control of the reactor, and some potential indirect indicators of biomass activity (residual nitrogen and pH out). The most predictive model was found to have an RMSEP value of 606 kgBOD/d, representing a 14.5% average error. This was a good fit, given the measurement error of the BOD test. Overall, the statistical approach was effective in describing and predicting MBBR treatment performance.
NASA Astrophysics Data System (ADS)
Yu, Liuqian; Fennel, Katja; Bertino, Laurent; Gharamti, Mohamad El; Thompson, Keith R.
2018-06-01
Effective data assimilation methods for incorporating observations into marine biogeochemical models are required to improve hindcasts, nowcasts and forecasts of the ocean's biogeochemical state. Recent assimilation efforts have shown that updating model physics alone can degrade biogeochemical fields while only updating biogeochemical variables may not improve a model's predictive skill when the physical fields are inaccurate. Here we systematically investigate whether multivariate updates of physical and biogeochemical model states are superior to only updating either physical or biogeochemical variables. We conducted a series of twin experiments in an idealized ocean channel that experiences wind-driven upwelling. The forecast model was forced with biased wind stress and perturbed biogeochemical model parameters compared to the model run representing the "truth". Taking advantage of the multivariate nature of the deterministic Ensemble Kalman Filter (DEnKF), we assimilated different combinations of synthetic physical (sea surface height, sea surface temperature and temperature profiles) and biogeochemical (surface chlorophyll and nitrate profiles) observations. We show that when biogeochemical and physical properties are highly correlated (e.g., thermocline and nutricline), multivariate updates of both are essential for improving model skill and can be accomplished by assimilating either physical (e.g., temperature profiles) or biogeochemical (e.g., nutrient profiles) observations. In our idealized domain, the improvement is largely due to a better representation of nutrient upwelling, which results in a more accurate nutrient input into the euphotic zone. In contrast, assimilating surface chlorophyll improves the model state only slightly, because surface chlorophyll contains little information about the vertical density structure. 
We also show that a degradation of the correlation between observed subsurface temperature and nutrient fields, which has been an issue in several previous assimilation studies, can be reduced by multivariate updates of physical and biogeochemical fields.
Grobman, William A.; Lai, Yinglei; Landon, Mark B.; Spong, Catherine Y.; Leveno, Kenneth J.; Rouse, Dwight J.; Varner, Michael W.; Moawad, Atef H.; Simhan, Hyagriv N.; Harper, Margaret; Wapner, Ronald J.; Sorokin, Yoram; Miodovnik, Menachem; Carpenter, Marshall; O'sullivan, Mary J.; Sibai, Baha M.; Langer, Oded; Thorp, John M.; Ramin, Susan M.; Mercer, Brian M.
2010-01-01
Objective: To construct a predictive model for vaginal birth after cesarean (VBAC) that combines factors that can be ascertained only as the pregnancy progresses with those known at initiation of prenatal care. Study design: Using multivariable modeling, we constructed a predictive model for VBAC that included patient factors known at the initial prenatal visit as well as those that only became evident as the pregnancy progressed to the admission for delivery. Results: 9616 women were analyzed. The regression equation for VBAC success included multiple factors that could not be known at the first prenatal visit. The area under the curve for this model was significantly greater (P < .001) than that of a model that included only factors available at the first prenatal visit. Conclusion: A prediction model for VBAC success that incorporates factors that can be ascertained only as the pregnancy progresses adds to the predictive accuracy of a model that uses only factors available at a first prenatal visit. PMID:19813165
Predicting introductory programming performance: A multi-institutional multivariate study
NASA Astrophysics Data System (ADS)
Bergin, Susan; Reilly, Ronan
2006-12-01
A model for predicting student performance on introductory programming modules is presented. The model uses attributes identified in a study carried out at four third-level institutions in the Republic of Ireland. Four instruments were used to collect the data and over 25 attributes were examined. A data reduction technique was applied and a logistic regression model using 10-fold stratified cross validation was developed. The model used three attributes: Leaving Certificate Mathematics result (final mathematics examination at second level), number of hours playing computer games while taking the module and programming self-esteem. Prediction success was significant with 80% of students correctly classified. The model also works well on a per-institution level. A discussion on the implications of the model is provided and future work is outlined.
Papadia, Andrea; Bellati, Filippo; Bogani, Giorgio; Ditto, Antonino; Martinelli, Fabio; Lorusso, Domenica; Donfrancesco, Cristina; Gasparri, Maria Luisa; Raspagliesi, Francesco
2015-12-01
The aim of this study was to identify clinical variables that may predict the need for adjuvant radiotherapy after neoadjuvant chemotherapy (NACT) and radical surgery in locally advanced cervical cancer patients. A retrospective series of cervical cancer patients with International Federation of Gynecology and Obstetrics (FIGO) stages IB2-IIB treated with NACT followed by radical surgery was analyzed. Clinical predictors of persistence of intermediate- and/or high-risk factors at final pathological analysis were investigated. Statistical analysis was performed using univariate and multivariate analysis and using a model based on artificial intelligence known as artificial neuronal network (ANN) analysis. Overall, 101 patients were available for the analyses. Fifty-two (51 %) patients were considered at high risk secondary to parametrial, resection margin and/or lymph node involvement. When disease was confined to the cervix, four (4 %) patients were considered at intermediate risk. At univariate analysis, FIGO grade 3, stage IIB disease at diagnosis and the presence of enlarged nodes before NACT predicted the presence of intermediate- and/or high-risk factors at final pathological analysis. At multivariate analysis, only FIGO grade 3 and tumor diameter maintained statistical significance. The specificity of ANN models in evaluating predictive variables was slightly superior to conventional multivariable models. FIGO grade, stage, tumor diameter, and histology are associated with persistence of pathological intermediate- and/or high-risk factors after NACT and radical surgery. This information is useful in counseling patients at the time of treatment planning with regard to the probability of being subjected to pelvic radiotherapy after completion of the initially planned treatment.
Brain natriuretic peptide predicts functional outcome in ischemic stroke
Rost, Natalia S; Biffi, Alessandro; Cloonan, Lisa; Chorba, John; Kelly, Peter; Greer, David; Ellinor, Patrick; Furie, Karen L
2011-01-01
Background Elevated serum levels of brain natriuretic peptide (BNP) have been associated with cardioembolic (CE) stroke and increased post-stroke mortality. We sought to determine whether BNP levels were associated with functional outcome after ischemic stroke. Methods We measured BNP in consecutive patients aged ≥18 years admitted to our Stroke Unit between 2002–2005. BNP quintiles were used for analysis. Stroke subtypes were assigned using TOAST criteria. Outcomes were measured as 6-month modified Rankin Scale score (“good outcome” = 0–2 vs. “poor”) as well as mortality. Multivariate logistic regression was used to assess association between the quintiles of BNP and outcomes. Predictive performance of BNP as compared to clinical model alone was assessed by comparing ROC curves. Results Of 569 ischemic stroke patients, 46% were female; mean age was 67.9 ± 15 years. In age- and gender-adjusted analysis, elevated BNP was associated with lower ejection fraction (p<0.0001) and left atrial dilatation (p<0.001). In multivariate analysis, elevated BNP decreased the odds of good functional outcome (OR 0.64, 95%CI 0.41–0.98) and increased the odds of death (OR 1.75, 95%CI 1.36–2.24) in these patients. Addition of BNP to multivariate models increased their predictive performance for functional outcome (p=0.013) and mortality (p<0.03) after CE stroke. Conclusions Serum BNP levels are strongly associated with CE stroke and functional outcome at 6 months after ischemic stroke. Inclusion of BNP improved prediction of mortality in patients with CE stroke. PMID:22116811
ERIC Educational Resources Information Center
Baker, Bruce D.; Richards, Craig E.
1999-01-01
Applies neural network methods for forecasting 1991-95 per-pupil expenditures in U.S. public elementary and secondary schools. Forecasting models included the National Center for Education Statistics' multivariate regression model and three neural architectures. Regarding prediction accuracy, neural network results were comparable or superior to…
LANDSCAPE METRICS THAT ARE USEFUL FOR EXPLAINING ESTUARINE ECOLOGICAL RESPONSES
We investigated whether land use/cover characteristics of watersheds associated with estuaries exhibit a strong enough signal to make landscape metrics useful for predicting estuarine ecological condition. We used multivariate logistic regression models to discriminate between su...
NASA Astrophysics Data System (ADS)
de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.
2018-04-01
A new method was developed to determine the antioxidant properties of red cabbage extract (Brassica oleracea) by mid (MID) and near (NIR) infrared spectroscopies and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9° Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for building the PLS models. This is the first time this strategy has been applied to building multivariate regression models. Reference analyses and spectral data were obtained from the diluted extracts. The properties determined were total and monomeric anthocyanins, total polyphenols, and antioxidant capacity by the ABTS (2,2-azino-bis(3-ethyl-benzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. PLS-1 regression provided more predictive models than PLS-2 regression. PLS-OPS and PLS-GA models presented excellent prediction results, with correlation coefficients higher than 0.98. However, the best models were obtained using PLS with variable selection by the OPS algorithm, and the models based on NIR spectra were considered more predictive for all properties. Thus, these models provide a simple, rapid and accurate method for determining the antioxidant properties of red cabbage extract and indicate its suitability for use in the food industry.
Fiber-optic evanescent-wave spectroscopy for fast multicomponent analysis of human blood
NASA Astrophysics Data System (ADS)
Simhi, Ronit; Gotshal, Yaron; Bunimovich, David; Katzir, Abraham; Sela, Ben-Ami
1996-07-01
A spectral analysis of human blood serum was undertaken by fiber-optic evanescent-wave spectroscopy (FEWS) by the use of a Fourier-transform infrared spectrometer. A special cell for the FEWS measurements was designed and built that incorporates an IR-transmitting silver halide fiber and a means for introducing the blood-serum sample. Further improvements in analysis were obtained by the adoption of multivariate calibration techniques that are already used in clinical chemistry. The partial least-squares algorithm was used to calculate the concentrations of cholesterol, total protein, urea, and uric acid in human blood serum. The estimated prediction errors obtained (in percent from the average value) were 6% for total protein, 15% for cholesterol, 30% for urea, and 30% for uric acid. These results were compared with another independent prediction method that used a neural-network model. This model yielded estimated prediction errors of 8.8% for total protein, 25% for cholesterol, and 21% for uric acid. Keywords: spectroscopy, fiber-optic evanescent-wave spectroscopy, Fourier-transform infrared spectrometer, blood, multivariate calibration, neural networks.
Genome-Wide Association Analysis of Adaptation Using Environmentally Predicted Traits
van Zanten, Martijn
2015-01-01
Current methods for studying the genetic basis of adaptation evaluate genetic associations with ecologically relevant traits or single environmental variables, under the implicit assumption that natural selection imposes correlations between phenotypes, environments and genotypes. In practice, observed trait and environmental data are manifestations of unknown selective forces and are only indirectly associated with adaptive genetic variation. In theory, improved estimation of these forces could enable more powerful detection of loci under selection. Here we present an approach in which we approximate adaptive variation by modeling phenotypes as a function of the environment and using the predicted trait in multivariate and univariate genome-wide association analysis (GWAS). Based on computer simulations and published flowering time data from the model plant Arabidopsis thaliana, we find that environmentally predicted traits lead to higher recovery of functional loci in multivariate GWAS and are more strongly correlated to allele frequencies at adaptive loci than individual environmental variables. Our results provide an example of the use of environmental data to obtain independent and meaningful information on adaptive genetic variation. PMID:26496492
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1 µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Coelho, Antonio Augusto Rodrigues
2016-01-01
This paper introduces the Fuzzy Logic Hypercube Interpolator (FLHI) and demonstrates applications in control of multiple-input single-output (MISO) and multiple-input multiple-output (MIMO) processes with Hammerstein nonlinearities. FLHI consists of a Takagi-Sugeno fuzzy inference system where membership functions act as kernel functions of an interpolator. Conjunction of membership functions in an unitary hypercube space enables multivariable interpolation of N-dimensions. Membership functions act as interpolation kernels, such that choice of membership functions determines interpolation characteristics, allowing FLHI to behave as a nearest-neighbor, linear, cubic, spline or Lanczos interpolator, to name a few. The proposed interpolator is presented as a solution to the modeling problem of static nonlinearities since it is capable of modeling both a function and its inverse function. Three study cases from literature are presented, a single-input single-output (SISO) system, a MISO and a MIMO system. Good results are obtained regarding performance metrics such as set-point tracking, control variation and robustness. Results demonstrate applicability of the proposed method in modeling Hammerstein nonlinearities and their inverse functions for implementation of an output compensator with Model Based Predictive Control (MBPC), in particular Dynamic Matrix Control (DMC). PMID:27657723
A hybrid PCA-CART-MARS-based prognostic approach of the remaining useful life for aircraft engines.
Sánchez Lasheras, Fernando; García Nieto, Paulino José; de Cos Juez, Francisco Javier; Mayo Bayón, Ricardo; González Suárez, Victor Manuel
2015-03-23
Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines.
A Hybrid PCA-CART-MARS-Based Prognostic Approach of the Remaining Useful Life for Aircraft Engines
Lasheras, Fernando Sánchez; Nieto, Paulino José García; de Cos Juez, Francisco Javier; Bayón, Ricardo Mayo; Suárez, Victor Manuel González
2015-01-01
Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines. PMID:25806876
Iterative procedures for space shuttle main engine performance models
NASA Technical Reports Server (NTRS)
Santi, L. Michael
1989-01-01
Performance models of the Space Shuttle Main Engine (SSME) contain iterative strategies for determining approximate solutions to nonlinear equations reflecting fundamental mass, energy, and pressure balances within engine flow systems. Both univariate and multivariate Newton-Raphson algorithms are employed in the current version of the engine Test Information Program (TIP). Computational efficiency and reliability of these procedures is examined. A modified trust region form of the multivariate Newton-Raphson method is implemented and shown to be superior for off nominal engine performance predictions. A heuristic form of Broyden's Rank One method is also tested and favorable results based on this algorithm are presented.
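As a reminder of what a multivariate Newton-Raphson iteration looks like, here is a minimal Python sketch for a system of two nonlinear equations. It is the plain method only, not the trust-region modification or the Broyden variant the abstract evaluates, and the two equations in the usage note are a toy system chosen for illustration, not SSME balance equations. All names are hypothetical.

```python
def newton_raphson_2d(f, jac, x0, tol=1e-10, max_iter=50):
    """Solve f(x, y) = (0, 0) by plain Newton-Raphson for two unknowns.

    f(x, y) returns the residual pair (f1, f2);
    jac(x, y) returns the 2x2 Jacobian as ((a, b), (c, d)).
    """
    x, y = x0
    for _ in range(max_iter):
        f1, f2 = f(x, y)
        if max(abs(f1), abs(f2)) < tol:
            return x, y
        (a, b), (c, d) = jac(x, y)
        det = a * d - b * c
        if det == 0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J * (dx, dy) = -(f1, f2) with Cramer's rule.
        dx = (-f1 * d + b * f2) / det
        dy = (-a * f2 + c * f1) / det
        x, y = x + dx, y + dy
    raise RuntimeError("Newton-Raphson did not converge")


# Toy system (illustrative only): x^2 + y^2 = 4 and x * y = 1.
def residuals(x, y):
    return x * x + y * y - 4.0, x * y - 1.0


def jacobian(x, y):
    return (2.0 * x, 2.0 * y), (y, x)


root = newton_raphson_2d(residuals, jacobian, (2.0, 0.5))
```

Each step takes the full Newton update. A trust-region form would instead limit the step length when the local linear model is unreliable, which is the kind of safeguard the abstract reports as helpful for off-nominal predictions.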
Enhanced PID vs Model Predictive Control Applied to BLDC Motor
NASA Astrophysics Data System (ADS)
Gaya, M. S.; Muhammad, Auwal; Aliyu Abdulkadir, Rabiu; Salim, S. N. S.; Madugu, I. S.; Tijjani, Aminu; Aminu Yusuf, Lukman; Dauda Umar, Ibrahim; Khairi, M. T. M.
2018-01-01
The brushless direct current (BLDC) motor is a multivariable and highly complex nonlinear system. Variation of internal parameter values with the environment or the reference signal makes the BLDC motor harder to control effectively. Advanced control strategies (such as model predictive control) often have to be integrated to meet the control requirements. Alternatively, enhancing or properly tuning a conventional algorithm can achieve the desired performance. This paper presents a performance comparison of Enhanced PID and Model Predictive Control (MPC) applied to a brushless direct current motor. The simulation results demonstrated that the PSO-PID is slightly better than the PID and MPC in tracking the trajectory of the reference signal. The proposed scheme could be a useful algorithm for the system.
Predictive modeling of EEG time series for evaluating surgery targets in epilepsy patients.
Steimer, Andreas; Müller, Michael; Schindler, Kaspar
2017-05-01
During the last 20 years, predictive modeling in epilepsy research has largely been concerned with the prediction of seizure events, whereas the inference of effective brain targets for resective surgery has received surprisingly little attention. In this exploratory pilot study, we describe a distributional clustering framework for the modeling of multivariate time series and use it to predict the effects of brain surgery in epilepsy patients. By analyzing the intracranial EEG, we demonstrate how patients who became seizure free after surgery are clearly distinguished from those who did not. More specifically, for 5 out of 7 patients who obtained seizure freedom (= Engel class I) our method predicts the specific collection of brain areas that were actually resected during surgery to yield a markedly lower posterior probability for the seizure-related clusters, when compared to the resection of random or empty collections. Conversely, for 4 out of 5 Engel class III/IV patients who still suffer from postsurgical seizures, performance of the actually resected collection is not significantly better than the performance of random or empty collections. As the number of possible collections ranges into billions and more, this is a substantial contribution to a problem that today is still solved by visual EEG inspection. Apart from epilepsy research, our clustering methodology is also of general interest for the analysis of multivariate time series and as a generative model for temporally evolving functional networks in the neurosciences and beyond. Hum Brain Mapp 38:2509-2531, 2017. © 2017 Wiley Periodicals, Inc.
N'gattia, A K; Coulibaly, D; Nzussouo, N Talla; Kadjo, H A; Chérif, D; Traoré, Y; Kouakou, B K; Kouassi, P D; Ekra, K D; Dagnan, N S; Williams, T; Tiembré, I
2016-09-13
In temperate regions, influenza epidemics occur in the winter and correlate with certain climatological parameters. In African tropical regions, the effects of climatological parameters on influenza epidemics are not well defined. This study aims to identify and model the effects of climatological parameters on seasonal influenza activity in Abidjan, Cote d'Ivoire. We studied the effects of weekly rainfall, humidity, and temperature on laboratory-confirmed influenza cases in Abidjan from 2007 to 2010. We used the Box-Jenkins method with the autoregressive integrated moving average (ARIMA) process to create models using data from 2007-2010 and to assess the predictive value of the best model on data from 2011 to 2012. The weekly number of influenza cases showed significant cross-correlation with certain prior weeks for both rainfall and relative humidity. The best fitting multivariate model (ARIMAX (2,0,0) _RF) included the number of influenza cases during 1 week and 2 weeks prior, and the rainfall during the current week and 5 weeks prior. The performance of this model showed an increase of >3% for Akaike Information Criterion (AIC) and 2.5% for Bayesian Information Criterion (BIC) compared to the reference univariate ARIMA (2,0,0). The prediction of the weekly number of influenza cases during 2011-2012 with the best fitting multivariate model (ARIMAX (2,0,0) _RF) showed that the observed values were within the 95% confidence interval of the predicted values during 97 of 104 weeks. Including rainfall improves the performance of the fitted and predictive models. The timing of influenza in Abidjan can be partially explained by rainfall influence, in a setting with little change in temperature throughout the year. These findings can help clinicians to anticipate influenza cases during the rainy season by implementing preventive measures.
NASA Astrophysics Data System (ADS)
Tewari, Jagdish; Strong, Richard; Boulas, Pierre
2017-02-01
This article summarizes the development and validation of a Fourier transform near infrared spectroscopy (FT-NIR) method for the rapid at-line prediction of active pharmaceutical ingredient (API) in a powder blend to optimize small molecule formulations. The method was used to determine the blend uniformity end-point for a pharmaceutical solid dosage formulation containing a range of API concentrations. A set of calibration spectra from samples with concentrations ranging from 1% to 15% of API (w/w) was collected at-line from 4000 to 12,500 cm⁻¹. The ability of the FT-NIR method to predict API concentration in the blend samples was validated against a reference high performance liquid chromatography (HPLC) method. The prediction efficiency of four multivariate data modeling methods, namely partial least-squares 1 (PLS1), partial least-squares 2 (PLS2), principal component regression (PCR) and artificial neural network (ANN), was compared using relevant multivariate figures of merit. The prediction ability of the regression models was cross validated against results generated with the reference HPLC method. PLS1 and ANN showed superior prediction abilities when compared to PLS2 and PCR. Based upon these results and because of its decreased complexity compared to ANN, PLS1 was selected as the best chemometric method to predict blend uniformity at-line. The FT-NIR measurement and the associated chemometric analysis were implemented in the production environment for rapid at-line determination of the end-point of the small molecule blending operation. FIGURE 1: Correlation coefficient vs rank plot. FIGURE 2: FT-NIR spectra of different steps of blend and final blend. FIGURE 3: Prediction ability of PCR. FIGURE 4: Blend uniformity prediction ability of PLS2. FIGURE 5: Prediction efficiency of blend uniformity using ANN. FIGURE 6: Comparison of prediction efficiency of chemometric models. TABLE 1: Order of addition for blending steps.
Does Investor Ownership of Nursing Homes Compromise the Quality of Care?
Harrington, Charlene; Woolhandler, Steffie; Mullan, Joseph; Carrillo, Helen; Himmelstein, David U.
2001-01-01
Objectives. Two thirds of nursing homes are investor owned. This study examined whether investor ownership affects quality. Methods. We analyzed 1998 data from state inspections of 13 693 nursing facilities. We used a multivariate model and controlled for case mix, facility characteristics, and location. Results. Investor-owned facilities averaged 5.89 deficiencies per home, 46.5% higher than nonprofit facilities and 43.0% higher than public facilities. In multivariate analysis, investor ownership predicted 0.679 additional deficiencies per home; chain ownership predicted an additional 0.633 deficiencies. Nurse staffing was lower at investor-owned nursing homes. Conclusions. Investor-owned nursing homes provide worse care and less nursing care than do not-for-profit or public homes. PMID:11527781
Predictive features of chronic kidney disease in atypical haemolytic uremic syndrome
Jamme, Matthieu; Raimbourg, Quentin; Chauveau, Dominique; Seguin, Amélie; Presne, Claire; Perez, Pierre; Gobert, Pierre; Wynckel, Alain; Provôt, François; Delmas, Yahsou; Mousson, Christiane; Servais, Aude; Vrigneaud, Laurence; Veyradier, Agnès
2017-01-01
Chronic kidney disease (CKD) is a frequent and serious complication of atypical haemolytic uremic syndrome (aHUS). We aimed to develop a simple accurate model to predict the risk of renal dysfunction in aHUS based on clinical and biological features available at hospital admission. Renal function at 1-year follow-up, based on an estimated glomerular filtration rate < 60 mL/min/1.73 m² as assessed by the Modification of Diet in Renal Disease equation, was used as an indicator of significant CKD. Prospectively collected data from a cohort of 156 aHUS patients who did not receive eculizumab were used to identify predictors of CKD. Covariates associated with renal impairment were identified by multivariate analysis. The model performance was assessed and a scoring system for clinical practice was constructed from the regression coefficients. Multivariate analyses identified three predictors of CKD: a high serum creatinine level, a high mean arterial pressure and a mildly decreased platelet count. The prognostic model had a good discriminative ability (area under the curve = 0.84). The scoring system ranged from 0 to 5, with corresponding risks of CKD ranging from 18% to 100%. This model accurately predicts development of 1-year CKD in patients with aHUS using clinical and biological features available on admission. After further validation, this model may assist in clinical decision making. PMID:28542627
Supporting inquiry learning by promoting normative understanding of multivariable causality
NASA Astrophysics Data System (ADS)
Keselman, Alla
2003-11-01
Early adolescents may lack the cognitive and metacognitive skills necessary for effective inquiry learning. In particular, they are likely to have a nonnormative mental model of multivariable causality in which effects of individual variables are neither additive nor consistent. Described here is a software-based intervention designed to facilitate students' metalevel and performance-level inquiry skills by enhancing their understanding of multivariable causality. Relative to an exploration-only group, sixth graders who practiced predicting an outcome (earthquake risk) based on multiple factors demonstrated increased attention to evidence, improved metalevel appreciation of effective strategies, and a trend toward consistent use of a controlled comparison strategy. Sixth graders who also received explicit instruction in making predictions based on multiple factors showed additional improvement in their ability to compare multiple instances as a basis for inferences and constructed the most accurate knowledge of the system. Gains were maintained in transfer tasks. The cognitive skills and metalevel understanding examined here are essential to inquiry learning.
Lee, Byeong-Ju; Kim, Hye-Youn; Lim, Sa Rang; Huang, Linfang; Choi, Hyung-Kyoon
2017-01-01
Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values. PMID:29049369
Multivariate Models for Prediction of Human Skin Sensitization Hazard.
One of the Interagency Coordinating Committee on the Validation of Alternative Methods' (ICCVAM) top priorities is the development and evaluation of non-animal approaches to identify potential skin sensitizers. The complexity of biological events necessary to produce skin sensiti...
Attia, Khalid A M; Nassar, Mohammed W I; El-Zeiny, Mohamed B; Serag, Ahmed
2017-01-05
For the first time, a new variable selection method based on swarm intelligence namely firefly algorithm is coupled with three different multivariate calibration models namely, concentration residual augmented classical least squares, artificial neural network and support vector regression in UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of using this new powerful algorithm over the well-known genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between all the models regarding their predictabilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration. Copyright © 2016 Elsevier B.V. All rights reserved.
Transforming RNA-Seq Data to Improve the Performance of Prognostic Gene Signatures
Zwiener, Isabella; Frisch, Barbara; Binder, Harald
2014-01-01
Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques. PMID:24416353
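Three of the transformations named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the `+1` pseudocount, the scaling of ranks to (0, 1), and the toy count matrix are all assumptions made here.

```python
import numpy as np

def log_transform(counts):
    """log2(count + 1); the +1 pseudocount is an assumption to handle zero counts."""
    return np.log2(np.asarray(counts, dtype=float) + 1.0)

def standardize(x):
    """Centre each gene (column) to mean 0 and scale it to unit sample variance."""
    x = np.asarray(x, dtype=float)
    sd = x.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # leave constant columns unscaled
    return (x - x.mean(axis=0)) / sd

def rank_transform(x):
    """Replace each value by its within-gene rank, scaled to (0, 1).

    Ties are broken by position; a real analysis may prefer average ranks.
    """
    x = np.asarray(x, dtype=float)
    ranks = x.argsort(axis=0).argsort(axis=0) + 1
    return ranks / (x.shape[0] + 1)

# Hypothetical counts: rows are samples, columns are two genes whose
# variances differ by several orders of magnitude.
counts = np.array([[0, 10], [5, 1000], [50, 100000], [2, 100]])

ranked = rank_transform(counts)
logged = log_transform(counts)
scaled = standardize(counts)
```

After the rank transformation both toy genes have identical columns even though their raw variances differ enormously, which is the property that removes the preference for high-variance covariates described in the abstract.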
Wang, Na-Na; Yang, Zheng-Jun; Wang, Xue; Chen, Li-Xuan; Zhao, Hong-Meng; Cao, Wen-Feng; Zhang, Bin
2018-04-25
Molecular subtype of breast cancer is associated with sentinel lymph node status. We sought to establish a mathematical prediction model that included breast cancer molecular subtype for risk of positive non-sentinel lymph nodes in breast cancer patients with sentinel lymph node metastasis and further validate the model in a separate validation cohort. We reviewed the clinicopathologic data of breast cancer patients with sentinel lymph node metastasis who underwent axillary lymph node dissection between June 16, 2014 and November 16, 2017 at our hospital. Sentinel lymph node biopsy was performed and patients with pathologically proven sentinel lymph node metastasis underwent axillary lymph node dissection. Independent risk factors for non-sentinel lymph node metastasis were assessed in a training cohort by multivariate analysis and incorporated into a mathematical prediction model. The model was further validated in a separate validation cohort, and a nomogram was developed and evaluated for diagnostic performance in predicting the risk of non-sentinel lymph node metastasis. Moreover, we assessed the performance of five different models in predicting non-sentinel lymph node metastasis in the training cohort. In total, 495 cases were eligible for the study, including 291 patients in the training cohort and 204 in the validation cohort. Non-sentinel lymph node metastasis was observed in 33.3% (97/291) of patients in the training cohort. The AUCs of the MSKCC, Tenon, MDA, Ljubljana, and Louisville models in the training cohort were 0.7613, 0.7142, 0.7076, 0.7483, and 0.671, respectively. Multivariate regression analysis indicated that tumor size (OR = 1.439; 95% CI 1.025-2.021; P = 0.036), sentinel lymph node macro-metastasis versus micro-metastasis (OR = 5.063; 95% CI 1.111-23.074; P = 0.036), the number of positive sentinel lymph nodes (OR = 2.583, 95% CI 1.714-3.892; P < 0.001), and the number of negative sentinel lymph nodes (OR = 0.686, 95% CI 0.575-0.817; P < 0.001) were independent, statistically significant predictors of non-sentinel lymph node metastasis. Furthermore, luminal B (OR = 3.311, 95% CI 1.593-6.884; P = 0.001) and HER2 overexpression (OR = 4.308, 95% CI 1.097-16.912; P = 0.036) were independent and statistically significant predictors of non-sentinel lymph node metastasis versus luminal A. A regression model based on the results of multivariate analysis was established to predict the risk of non-sentinel lymph node metastasis, which had an AUC of 0.8188. The model was validated in the validation cohort and showed excellent diagnostic performance. The mathematical prediction model that incorporates five variables including breast cancer molecular subtype demonstrates excellent diagnostic performance in assessing the risk of non-sentinel lymph node metastasis in sentinel lymph node-positive patients. The prediction model could help surgeons in evaluating the risk of non-sentinel lymph node involvement for breast cancer patients; however, the model requires further validation in prospective studies.
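The AUC values compared in this abstract can be read as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case. The sketch below computes that quantity by direct pairwise comparison. The function name and the risk scores are hypothetical and are not taken from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve as a pairwise win rate.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted risks for patients with and without the outcome.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2, 0.4]
toy_auc = auc(pos, neg)  # 10.5 winning pairs out of 12
```

The quadratic loop is chosen for readability; for large cohorts a rank-based formulation gives the same value more efficiently.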
Risk prediction for myocardial infarction via generalized functional regression models.
Ieva, Francesca; Paganoni, Anna M
2016-08-01
In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to the 118 Dispatch Center of Milan (118 being the Italian toll-free number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Then, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the Principal Components decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Branch Block). The performance of this classification method is evaluated and compared with other methods proposed in the literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tatiana G. Levitskaia; James M. Peterson; Emily L. Campbell
2013-12-01
In liquid–liquid extraction separation processes, accumulation of organic solvent degradation products is detrimental to the process robustness, and frequent solvent analysis is warranted. Our research explores the feasibility of online monitoring of the organic solvents relevant to used nuclear fuel reprocessing. This paper describes the first phase of developing a system for monitoring the tributyl phosphate (TBP)/n-dodecane solvent commonly used to separate used nuclear fuel. In this investigation, the effect of extraction of nitric acid from aqueous solutions of variable concentrations on the quantification of TBP and its major degradation product dibutylphosphoric acid (HDBP) was assessed. Fourier transform infrared (FTIR) spectroscopy was used to discriminate between HDBP and TBP in the nitric acid-containing TBP/n-dodecane solvent. Multivariate analysis of the spectral data facilitated the development of regression models for HDBP and TBP quantification in real time, enabling online implementation of the monitoring system. The predictive regression models were validated using TBP/n-dodecane solvent samples subjected to high-dose external γ-irradiation. The predictive models were translated to flow conditions using a hollow fiber FTIR probe installed in a centrifugal contactor extraction apparatus, demonstrating the applicability of the FTIR technique coupled with multivariate analysis for the online monitoring of the organic solvent degradation products.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Levitskaia, Tatiana G.; Peterson, James M.; Campbell, Emily L.
2013-11-05
In liquid-liquid extraction separation processes, accumulation of organic solvent degradation products is detrimental to the process robustness, and frequent solvent analysis is warranted. Our research explores the feasibility of online monitoring of the organic solvents relevant to used nuclear fuel reprocessing. This paper describes the first phase of developing a system for monitoring the tributyl phosphate (TBP)/n-dodecane solvent commonly used to separate used nuclear fuel. In this investigation, the effect of extraction of nitric acid from aqueous solutions of variable concentrations on the quantification of TBP and its major degradation product dibutyl phosphoric acid (HDBP) was assessed. Fourier transform infrared (FTIR) spectroscopy was used to discriminate between HDBP and TBP in the nitric acid-containing TBP/n-dodecane solvent. Multivariate analysis of the spectral data facilitated the development of regression models for HDBP and TBP quantification in real time, enabling online implementation of the monitoring system. The predictive regression models were validated using TBP/n-dodecane solvent samples subjected to high-dose external gamma irradiation. The predictive models were translated to flow conditions using a hollow fiber FTIR probe installed in a centrifugal contactor extraction apparatus, demonstrating the applicability of the FTIR technique coupled with multivariate analysis for the online monitoring of the organic solvent degradation products.
The Effect of Visual Information on the Manual Approach and Landing
NASA Technical Reports Server (NTRS)
Wewerinke, P. H.
1982-01-01
The effect of visual information, in combination with basic display information, on approach performance was investigated. A pre-experimental model analysis was performed in terms of the optimal control model. The resulting aircraft approach performance predictions were compared with the results of a moving base simulator program. The results illustrate that the model provides a meaningful description of the visual (scene) perception process involved in the complex (multi-variable, time varying) manual approach task with a useful predictive capability. The theoretical framework was shown to allow a straightforward investigation of the complex interaction of a variety of task variables.
[Statistical prediction methods in violence risk assessment and its application].
Liu, Yuan-Yuan; Hu, Jun-Mei; Yang, Min; Li, Xiao-Song
2013-06-01
Improving violence risk assessment is an urgent global problem. Statistical methods are a necessary part of risk assessment and have a marked influence on it. This study reviews prediction methods in violence risk assessment from a statistical point of view. The applications of logistic regression as an example of a multivariate statistical model, the decision tree model as an example of a data mining technique, and the neural network model as an example of artificial intelligence technology are all reviewed. This study provides data to contribute to further research on violence risk assessment.
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert M.
2013-01-01
A new regression model search algorithm was developed that may be applied to both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The algorithm is a simplified version of a more complex algorithm that was originally developed for the NASA Ames Balance Calibration Laboratory. The new algorithm performs regression model term reduction to prevent overfitting of data. It has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a regression model search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression model. Therefore, the simplified algorithm is not intended to replace the original algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new search algorithm.
Tufto, Jarle
2010-01-01
Domesticated species frequently spread their genes into populations of wild relatives through interbreeding. The domestication process often involves artificial selection for economically desirable traits. This can lead to an indirect response in unknown correlated traits and a reduction in fitness of domesticated individuals in the wild. Previous models for the effect of gene flow from domesticated species to wild relatives have assumed that evolution occurs in one dimension. Here, I develop a quantitative genetic model for the balance between migration and multivariate stabilizing selection. Different forms of correlational selection consistent with a given observed ratio between average fitness of domesticated and wild individuals offset the phenotypic means at migration-selection balance away from predictions based on simpler one-dimensional models. For almost all parameter values, correlational selection leads to a reduction in the migration load. For ridge selection, this reduction arises because the distance by which the immigrants deviate from the local optimum is in effect reduced. For realistic parameter values, however, the effect of correlational selection on the load is small, suggesting that simpler one-dimensional models may still be adequate in terms of predicting mean population fitness and viability.
Vickers, Andrew J; Cronin, Angel M; Aus, Gunnar; Pihl, Carl-Gustav; Becker, Charlotte; Pettersson, Kim; Scardino, Peter T; Hugosson, Jonas; Lilja, Hans
2008-01-01
Background. Prostate-specific antigen (PSA) is widely used to detect prostate cancer. The low positive predictive value of elevated PSA results in large numbers of unnecessary prostate biopsies. We set out to determine whether a multivariable model including four kallikrein forms (total, free, and intact PSA, and human kallikrein 2 (hK2)) could predict prostate biopsy outcome in previously unscreened men with elevated total PSA. Methods. The study cohort comprised 740 men in Göteborg, Sweden, undergoing biopsy during the first round of the European Randomized study of Screening for Prostate Cancer. We calculated the area-under-the-curve (AUC) for predicting prostate cancer at biopsy. AUCs for a model including age and PSA (the 'laboratory' model) and age, PSA and digital rectal exam (the 'clinical' model) were compared with those for models that also included additional kallikreins. Results. Addition of free and intact PSA and hK2 improved AUC from 0.68 to 0.83 and from 0.72 to 0.84, for the laboratory and clinical models respectively. Using a 20% risk of prostate cancer as the threshold for biopsy would have reduced the number of biopsies by 424 (57%) and missed only 31 out of 152 low-grade and 3 out of 40 high-grade cancers. Conclusion. Multiple kallikrein forms measured in blood can predict the result of biopsy in previously unscreened men with elevated PSA. A multivariable model can determine which men should be advised to undergo biopsy and which might be advised to continue screening, but defer biopsy until there is stronger evidence of malignancy. PMID:18611265
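The trade-off reported in the results can be checked with simple arithmetic on the figures given in the abstract. The variable names below are illustrative, and the derived percentages for missed cancers are computed here rather than quoted from the paper.

```python
# Figures as reported in the abstract.
cohort_size = 740          # men undergoing biopsy
biopsies_avoided = 424     # biopsies not performed at a 20% risk threshold
low_grade_total, low_grade_missed = 152, 31
high_grade_total, high_grade_missed = 40, 3

# Derived quantities (simple ratios, not stated in the abstract except the 57%).
pct_biopsies_avoided = 100 * biopsies_avoided / cohort_size        # about 57%
pct_low_grade_missed = 100 * low_grade_missed / low_grade_total    # about 20%
pct_high_grade_missed = 100 * high_grade_missed / high_grade_total # 7.5%
biopsies_still_done = cohort_size - biopsies_avoided               # 316

print(f"Biopsies avoided: {pct_biopsies_avoided:.0f}%")
print(f"Low-grade cancers missed: {pct_low_grade_missed:.1f}%")
print(f"High-grade cancers missed: {pct_high_grade_missed:.1f}%")
```

The first ratio reproduces the 57% stated in the abstract; the other two express the missed cancers as shares of their respective groups.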
Third molar development: measurements versus scores as age predictor.
Thevissen, P W; Fieuws, S; Willems, G
2011-10-01
Human third molar development is widely used to predict chronological age of subadult individuals with unknown or doubted age. For these predictions, classically, the radiologically observed third molar growth and maturation is registered using a staging and related scoring technique. Measures of lengths and widths of the developing wisdom tooth and its adjacent second molar can be considered as an alternative registration. The aim of this study was to verify relations between mandibular third molar developmental stages or measurements of mandibular second and third molars and age. Age-related performance of stages and measurements was compared to assess whether measurements added information to age predictions from third molar formation stage. The sample was 340 orthopantomograms (170 females, 170 males) of individuals homogeneously distributed in age between 7 and 24 years. Lower right third and second molars were staged following Gleiser and Hunt, length and width measurements were registered, and various ratios of these measurements were calculated. Univariable regression models with age as response and third molar stage, measurements and ratios of second and third molars as predictors were considered. Multivariable regression models assessed whether measurements or ratios added information to age prediction from third molar stage. Coefficients of determination (R²) and root mean squared errors (RMSE) obtained from all regression models were compared. The univariable regression model using stages as predictor yielded the most accurate age predictions (males: R² 0.85, RMSE between 0.85 and 1.22 years; females: R² 0.77, RMSE between 1.19 and 2.11 years) compared to all models including measurements and ratios. The multivariable regression models indicated that measurements and ratios added no clinically relevant information to the age prediction from third molar stage. Ratios and measurements of second and third molars are less accurate age predictors than stages of developing third molars. Copyright © 2011 Elsevier Ltd. All rights reserved.
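The two criteria used to compare the regression models, the coefficient of determination and the root mean squared error, can be computed as follows. This is a generic sketch: the function name and the toy ages are assumptions and do not come from the study.

```python
import numpy as np

def r2_and_rmse(y_true, y_pred):
    """Return (R squared, RMSE) for observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                 # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residuals ** 2))
    return r2, rmse

# Hypothetical chronological ages (years) and model predictions.
ages = [10, 14, 18, 22]
predicted = [11, 13, 19, 21]
r2, rmse = r2_and_rmse(ages, predicted)  # R squared 0.95, RMSE 1.0 year
```

RMSE is in the units of the response (years here), which makes it the more directly interpretable of the two for age estimation.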
New Methods for Estimating Seasonal Potential Climate Predictability
NASA Astrophysics Data System (ADS)
Feng, Xia
This study develops two new statistical approaches to assess the seasonal potential predictability of observed climate variables. One is the univariate analysis of covariance (ANOCOVA) model, a combination of the autoregressive (AR) model and analysis of variance (ANOVA). It has the advantage of taking into account the uncertainty of the estimated parameter due to sampling errors in the statistical test, which is often neglected in AR-based methods, and of accounting for daily autocorrelation, which is not considered in traditional ANOVA. In the ANOCOVA model, the seasonal signals arising from external forcing are tested for being identical or not, to assess whether any interannual variability that may exist is potentially predictable. The bootstrap is an attractive alternative method that requires no hypothesis model and is applicable no matter how mathematically complicated the parameter estimator is. This method builds up the empirical distribution of the interannual variance from resamples drawn with replacement from the given sample, in which the only predictability in seasonal means arises from the weather noise. These two methods are applied to temperature and water cycle components, including precipitation and evaporation, to measure the extent to which the interannual variance of seasonal means exceeds the unpredictable weather noise, and are compared with previous methods, including Leith-Shukla-Gutzler (LSG), Madden, and Katz. The potential predictability of temperature from the ANOCOVA model, bootstrap, LSG and Madden exhibits a pronounced tropical-extratropical contrast, with much larger predictability in the tropics, dominated by El Niño/Southern Oscillation (ENSO), than in higher latitudes, where strong internal variability lowers predictability. Bootstrap tends to display the highest predictability of the four methods, ANOCOVA lies in the middle, while LSG and Madden appear to generate lower predictability. Seasonal precipitation from ANOCOVA, bootstrap, and Katz, resembling that for temperature, is more predictable over the tropical regions and less predictable in the extratropics. Bootstrap and ANOCOVA are in good agreement with each other, both methods generating larger predictability than Katz. The seasonal predictability of evaporation over land bears considerable similarity to that of temperature using ANOCOVA, bootstrap, LSG and Madden. The remote SST forcing and soil moisture reveal substantial seasonality in their relations with the potentially predictable seasonal signals. For selected regions, either SST or soil moisture or both show significant relationships with predictable signals, hence providing indirect insight into the slowly varying boundary processes involved in enabling useful seasonal climate prediction. A multivariate analysis of covariance (MANOCOVA) model is established to identify distinctive predictable patterns, which are uncorrelated with each other. Generally speaking, the seasonal predictability from the multivariate model is consistent with that from ANOCOVA. Besides unveiling the spatial variability of predictability, the MANOCOVA model also reveals the temporal variability of each predictable pattern, which could be linked to periodic oscillations.
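The bootstrap idea described in this abstract, comparing the observed interannual variance of seasonal means with the variance expected from weather noise alone, can be illustrated with a deliberately simplified sketch. It is not the procedure developed in the thesis: the function name, the pooling of within-season anomalies, and the resampling of individual days (which ignores the daily autocorrelation the abstract stresses) are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_noise_test(daily, n_boot=1000, seed=0):
    """Compare observed interannual variance of seasonal means with a noise-only null.

    daily: array of shape (n_years, n_days) holding one season per row.
    Simplifying assumption: days are resampled independently, so serial
    correlation in the weather noise is NOT preserved.
    """
    rng = np.random.default_rng(seed)
    daily = np.asarray(daily, dtype=float)
    n_years, n_days = daily.shape

    observed = daily.mean(axis=1).var(ddof=1)

    # Remove each year's seasonal mean so the pool contains weather noise only.
    noise_pool = (daily - daily.mean(axis=1, keepdims=True)).ravel()

    null = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(noise_pool, size=(n_years, n_days), replace=True)
        null[b] = resample.mean(axis=1).var(ddof=1)

    # One-sided p-value with the usual +1 correction.
    p_value = (np.sum(null >= observed) + 1) / (n_boot + 1)
    return observed, null, p_value

# Synthetic example: 30 seasons of 90 days with a year-to-year signal plus noise.
rng = np.random.default_rng(42)
signal = rng.normal(0.0, 1.0, size=(30, 1))
toy_daily = signal + rng.normal(0.0, 1.0, size=(30, 90))
observed, null, p_value = bootstrap_noise_test(toy_daily)
```

Because independent resampling understates the noise variance of seasonal means when days are autocorrelated, a block or other dependence-preserving bootstrap would be needed before applying anything like this to real daily climate data.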
Achana, Felix A; Cooper, Nicola J; Bujkiewicz, Sylwia; Hubbard, Stephanie J; Kendrick, Denise; Jones, David R; Sutton, Alex J
2014-07-21
Network meta-analysis (NMA) enables simultaneous comparison of multiple treatments while preserving randomisation. When summarising evidence to inform an economic evaluation, it is important that the analysis accurately reflects the dependency structure within the data, as correlations between outcomes may have implications for estimating the net benefit associated with treatment. A multivariate NMA offers a framework for evaluating multiple treatments across multiple outcome measures while accounting for the correlation structure between outcomes. The standard NMA model is extended to multiple outcome settings in two stages. In the first stage, information is borrowed across outcomes as well as across studies through modelling the within-study and between-study correlation structure. In the second stage, we make use of the additional assumption that intervention effects are exchangeable between outcomes to predict effect estimates for all outcomes, including effect estimates on outcomes where evidence is either sparse or the treatment had not been considered by any one of the studies included in the analysis. We apply the methods to binary outcome data from a systematic review evaluating the effectiveness of nine home safety interventions on uptake of three poisoning prevention practices (safe storage of medicines, safe storage of other household products, and possession of poison centre control telephone number) in households with children. Analyses are conducted in WinBUGS using Markov Chain Monte Carlo (MCMC) simulations. Univariate and the first stage multivariate models produced broadly similar point estimates of intervention effects but the uncertainty around the multivariate estimates varied depending on the prior distribution specified for the between-study covariance structure. The second stage multivariate analyses produced more precise effect estimates while enabling intervention effects to be predicted for all outcomes, including intervention effects on outcomes not directly considered by the studies included in the analysis. Accounting for the dependency between outcomes in a multivariate meta-analysis may or may not improve the precision of effect estimates from a network meta-analysis compared to analysing each outcome separately.
NASA Astrophysics Data System (ADS)
Daftedar Abdelhadi, Raghda Mohamed
Although the Next Generation Science Standards (NGSS) present a detailed set of Science and Engineering Practices, a finer grained representation of the underlying skills is lacking in the standards document. Therefore, it has been reported that teachers are facing challenges deciphering and effectively implementing the standards, especially with regards to the Practices. This analytical study assessed the development of high school chemistry students' (N = 41) inquiry, multivariable causal reasoning skills, and metacognition as a mediator for their development. Inquiry tasks based on concepts of element properties of the periodic table as well as reaction kinetics required students to conduct controlled thought experiments, make inferences, and declare predictions of the level of the outcome variable by coordinating the effects of multiple variables. An embedded mixed methods design was utilized for depth and breadth of understanding. Various sources of data were collected including students' written artifacts, audio recordings of in-depth observational groups and interviews. Data analysis was informed by a conceptual framework formulated around the concepts of coordinating theory and evidence, metacognition, and mental models of multivariable causal reasoning. Results of the study indicated positive change towards conducting controlled experimentation, making valid inferences and justifications. Additionally, significant positive correlation between metastrategic and metacognitive competencies, and sophistication of experimental strategies, signified the central role metacognition played. Finally, lack of consistency in indicating effective variables during the multivariable prediction task pointed towards the fragile mental models of multivariable causal reasoning the students had. Implications for teacher education, science education policy as well as classroom research methods are discussed. 
Finally, recommendations for developing reform-based chemistry curricula based on the Practices are presented.
Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann
2003-01-01
Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.
Rixen, D; Raum, M; Bouillon, B; Schlosser, L E; Neugebauer, E
2001-03-01
On hospital admission, numerous variables are documented for multiple trauma patients. The value of these variables for predicting outcome is controversial. The aim was to determine, at admission, the probability of death of multiple trauma patients. Thus, a multivariate probability model was developed based on data obtained from the trauma registry of the Deutsche Gesellschaft für Unfallchirurgie (DGU). On hospital admission the DGU trauma registry collects more than 30 variables prospectively. In the first step of the analysis, variables reported in the literature as clinical predictors of outcome were selected. In a second step a univariate analysis of these variables was performed. In the third step, a multivariate logistic regression was performed for all primary variables with univariate significance in outcome prediction, and a multivariate prognostic model was developed. 2069 patients from 20 hospitals were prospectively included in the trauma registry from 01.01.1993-31.12.1997 (age 39 +/- 19 years; 70.0% males; ISS 22 +/- 13; 18.6% lethality). From more than 30 initially documented variables, the age, the GCS, the ISS, the base excess (BE) and the prothrombin time were the most important prognostic factors to predict the probability of death (P(death)). The following prognostic model was developed: P(death) = 1/(1 + e^(-[k + beta1(age) + beta2(GCS) + beta3(ISS) + beta4(BE) + beta5(prothrombin time)])), where k = -0.1551, beta1 = 0.0438 with p < 0.0001, beta2 = -0.2067 with p < 0.0001, beta3 = 0.0252 with p = 0.0071, beta4 = -0.0840 with p < 0.0001 and beta5 = -0.0359 with p < 0.0001. Each of the five variables contributed significantly to the multifactorial model. These data show that the age, GCS, ISS, base excess and prothrombin time are potentially important predictors to initially identify multiple trauma patients with a high risk of lethality. 
With the base excess and prothrombin time values, as the only variables of this multifactorial model that can be therapeutically influenced, it might be possible to better guide early and aggressive therapy.
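The logistic model quoted in this abstract can be evaluated directly from its published coefficients. The following is a minimal sketch, not the authors' implementation: the function name and argument names are hypothetical, and the abstract does not state the measurement units for base excess or prothrombin time, so any inputs used with it are assumptions.

```python
from math import exp

# Coefficients exactly as quoted in the abstract (k is the intercept).
COEFFS = {
    "k": -0.1551,
    "age": 0.0438,
    "gcs": -0.2067,
    "iss": 0.0252,
    "be": -0.0840,
    "pt": -0.0359,
}


def p_death(age, gcs, iss, base_excess, prothrombin_time):
    """Return P(death) = 1 / (1 + exp(-z)) for the quoted linear predictor z.

    Units for base_excess and prothrombin_time are NOT given in the abstract;
    callers must supply values on the scale used by the original registry.
    """
    z = (
        COEFFS["k"]
        + COEFFS["age"] * age
        + COEFFS["gcs"] * gcs
        + COEFFS["iss"] * iss
        + COEFFS["be"] * base_excess
        + COEFFS["pt"] * prothrombin_time
    )
    return 1.0 / (1.0 + exp(-z))
```

Because the coefficient for age is positive and the coefficients for GCS, base excess and prothrombin time are negative, the predicted probability rises with age and falls as those three values increase, which matches the direction of effect implied by the abstract.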
NASA Astrophysics Data System (ADS)
Manan, Norhafizah A.; Abidin, Basir
2015-02-01
Five percent of patients who underwent Percutaneous Coronary Intervention (PCI) experienced Major Adverse Cardiac Events (MACE) after the procedure. Risk prediction of MACE following PCI is therefore helpful. This work reviews such prediction models currently in use. A literature search was done on the PubMed and SCOPUS databases. Thirty articles were found, but only 4 studies were chosen based on the data used, design, and outcome of the study. Particular emphasis was given to the study design, population, sample size, modeling method, predictors, outcomes, discrimination and calibration of each model. All the models had acceptable discrimination ability (C-statistics >0.7) and good calibration (Hosmer-Lemeshow P-value >0.05). The most common model used was multivariate logistic regression and the most popular predictor was age.
Zeng, Fangfang; Li, Zhongtao; Yu, Xiaoling; Zhou, Linuo
2013-01-01
Background This study aimed to develop artificial neural network (ANN) and multivariable logistic regression (LR) prediction models for cardiovascular autonomic (CA) dysfunction in the general population, and to compare the models obtained from the two approaches. Methods and Materials We analyzed a previous dataset based on a Chinese population sample consisting of 2,092 individuals aged 30–80 years. The prediction models were derived from an exploratory set using ANN and LR analysis, and were tested in the validation set. Performances of these prediction models were then compared. Results Univariate analysis indicated that 14 risk factors showed statistically significant association with the prevalence of CA dysfunction (P<0.05). The mean area under the receiver-operating curve was 0.758 (95% CI 0.724–0.793) for LR and 0.762 (95% CI 0.732–0.793) for ANN analysis, and a noninferiority result was found (P<0.001). Similar results were found in comparisons of sensitivity, specificity, and predictive values in the prediction models between the LR and ANN analyses. Conclusion The prediction models for CA dysfunction were developed using ANN and LR. ANN and LR are two effective tools for developing prediction models based on our dataset. PMID:23940593
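Several abstracts in this listing compare models by the area under the receiver-operating curve (AUC, also reported as the C-statistic). As a minimal sketch of what that number measures, the AUC equals the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as one half. The function name and example scores below are hypothetical and are not taken from any of the studies.

```python
def auc(case_scores, control_scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Counts the fraction of (case, control) pairs in which the case has the
    higher predicted score; tied pairs contribute 0.5.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks for two cases and two controls.
example = auc([0.9, 0.3], [0.5, 0.1])
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation; this quadratic-time form is chosen for clarity, not efficiency.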
Multivariate Models for Prediction of Skin Sensitization Hazard in Humans
One of ICCVAM’s highest priorities is the development and evaluation of non-animal approaches to identify potential skin sensitizers. The complexity of biological events necessary for a substance to elicit a skin sensitization reaction suggests that no single alternative me...
Multivariate Models for Prediction of Human Skin Sensitization Hazard
One of ICCVAM’s top priorities is the development and evaluation of non-animal approaches to identify potential skin sensitizers. The complexity of biological events necessary for a substance to elicit a skin sensitization reaction suggests that no single alternative method...
Truccolo, Wilson
2017-01-01
This review presents a perspective on capturing collective dynamics in recorded neuronal ensembles based on multivariate point process models, inference of low-dimensional dynamics and coarse graining of spatiotemporal measurements. A general probabilistic framework for continuous time point processes is reviewed, with an emphasis on multivariate nonlinear Hawkes processes with exogenous inputs. A point process generalized linear model (PP-GLM) framework for the estimation of discrete time multivariate nonlinear Hawkes processes is described. The approach is illustrated with the modeling of collective dynamics in neocortical neuronal ensembles recorded in human and non-human primates, and prediction of single-neuron spiking. A complementary approach to capture collective dynamics based on low-dimensional dynamics (“order parameters”) inferred via latent state-space models with point process observations is presented. The approach is illustrated by inferring and decoding low-dimensional dynamics in primate motor cortex during naturalistic reach and grasp movements. Finally, we briefly review hypothesis tests based on conditional inference and spatiotemporal coarse graining for assessing collective dynamics in recorded neuronal ensembles. PMID:28336305
Truccolo, Wilson
2016-11-01
This review presents a perspective on capturing collective dynamics in recorded neuronal ensembles based on multivariate point process models, inference of low-dimensional dynamics and coarse graining of spatiotemporal measurements. A general probabilistic framework for continuous time point processes is reviewed, with an emphasis on multivariate nonlinear Hawkes processes with exogenous inputs. A point process generalized linear model (PP-GLM) framework for the estimation of discrete time multivariate nonlinear Hawkes processes is described. The approach is illustrated with the modeling of collective dynamics in neocortical neuronal ensembles recorded in human and non-human primates, and prediction of single-neuron spiking. A complementary approach to capture collective dynamics based on low-dimensional dynamics ("order parameters") inferred via latent state-space models with point process observations is presented. The approach is illustrated by inferring and decoding low-dimensional dynamics in primate motor cortex during naturalistic reach and grasp movements. Finally, we briefly review hypothesis tests based on conditional inference and spatiotemporal coarse graining for assessing collective dynamics in recorded neuronal ensembles. Published by Elsevier Ltd.
Nakajima, Kenichi; Nakata, Tomoaki; Matsuo, Shinro; Jacobson, Arnold F
2016-10-01
(123)I meta-iodobenzylguanidine (MIBG) imaging has been extensively used for prognostication in patients with chronic heart failure (CHF). The purpose of this study was to create mortality risk charts for short-term (2 years) and long-term (5 years) prediction of cardiac mortality. Using a pooled database of 1322 CHF patients, multivariate analysis, including (123)I-MIBG late heart-to-mediastinum ratio (HMR), left ventricular ejection fraction (LVEF), and clinical factors, was performed to determine optimal variables for the prediction of 2- and 5-year mortality risk using subsets of the patients (n = 1280 and 933, respectively). Multivariate logistic regression analysis was performed to create risk charts. Cardiac mortality was 10 and 22% for the sub-population of 2- and 5-year analyses. A four-parameter multivariate logistic regression model including age, New York Heart Association (NYHA) functional class, LVEF, and HMR was used. Annualized mortality rate was <1% in patients with NYHA Class I-II and HMR ≥ 2.0, irrespective of age and LVEF. In patients with NYHA Class III-IV, mortality rate was 4-6 times higher for HMR < 1.40 compared with HMR ≥ 2.0 in all LVEF classes. Among the subset of patients with b-type natriuretic peptide (BNP) results (n = 491 and 359 for 2- and 5-year models, respectively), the 5-year model showed incremental value of HMR in addition to BNP. Both 2- and 5-year risk prediction models with (123)I-MIBG HMR can be used to identify low-risk as well as high-risk patients, which can be effective for further risk stratification of CHF patients even when BNP is available. © The Author 2015. Published by Oxford University Press on behalf of the European Society of Cardiology.
On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang
2015-02-01
The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to that of linear model of coregionalization (LMC) cross-covariances. Different strategies have been developed to improve the MCMC mixing and invert smaller matrices in the Bayesian inference. Moreover, we compare the proposed BTMGP with existing multiple BTGP and BTMGP in test cases and a multiphase flow computer experiment in a full scale regenerator of a carbon capture unit. The use of the BTMGP with LMC cross-covariance helped to predict the computer experiments relatively better than existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with LMC cross-covariance function.
NASA Astrophysics Data System (ADS)
Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.
2014-12-01
Accurate estimation of grassland biomass at its peak productivity can provide crucial information regarding the functioning and productivity of the rangelands. Hyperspectral remote sensing has proved to be valuable for estimation of vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to the large number of correlated hyperspectral reflectance measurements. The aim of this study was to examine the prospect of above ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above ground biomass for 170 sample plots. Multivariate calibrations including partial least squares regression (PLSR), principal component regression (PCR), and least-squares support vector machine (LS-SVM) were used to estimate the above ground biomass. The prediction accuracy of the multivariate calibration methods was assessed using cross-validated R2 and RMSE. The best model performance was obtained using LS-SVM and then PLSR, both calibrated with the first derivative reflectance dataset, with R2cv = 0.88 and 0.86 and RMSEcv = 1.15 and 1.07, respectively. The weakest prediction accuracy was obtained when PCR was used (R2cv = 0.31 and RMSEcv = 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.
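The cross-validated R2 and RMSE reported above are computed from predictions made on held-out samples. The sketch below illustrates the two metrics with leave-one-out cross-validation; it deliberately substitutes a one-predictor least-squares line for PLSR, PCR and LS-SVM, so it shows only how the metrics are formed, not the study's calibration methods. All function names and the cross-validation scheme are assumptions, since the abstract does not specify them.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs, ys):
    """Predict each sample from a model fitted to all the other samples."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


def cv_metrics(ys, preds):
    """Return (R2cv, RMSEcv) from observed values and held-out predictions."""
    n = len(ys)
    mean_y = sum(ys) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot, sqrt(ss_res / n)
```

Because the predictions come from models that never saw the sample being predicted, R2cv can be much lower than the in-sample R2, and can even be negative for a poorly generalizing model.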
Kragelj, Borut
2016-03-01
To improve treatment individualization in patients with prostate cancer treated with a combination of external beam radiotherapy and high-dose-rate brachytherapy to boost the dose to the prostate (HDRB-B), the objective was to evaluate factors that may affect obstructive urination problems (OUP) after HDRB-B. The follow-up study included 88 patients consecutively treated with HDRB-B at the Institute of Oncology Ljubljana in the period 2006-2011. The observed outcome was deterioration of OUP (DOUP) during a follow-up period longer than 1 year. Univariate and multivariate analysis of the relationship between DOUP and potential risk factors (treatment factors, patients' characteristics) was carried out using binary logistic regression. A ROC curve was constructed from the predicted values and the area under the curve (AUC) was calculated to assess the performance of the multivariate model. Analysis was carried out on 71 patients who completed 3 years of follow-up. DOUP was noted in 13/71 (18.3%) of them. The multivariate analysis showed a statistically significant relationship between DOUP and anticoagulation treatment (OR 4.86, 95% C.I. limits: 1.21-19.61, p = 0.026). The minimal dose received by 90% of the urethra volume was also close to statistical significance (OR = 1.23; 95% C.I. limits: 0.98-1.07, p = 0.099). The AUC was 0.755. The study emphasized the relationship between DOUP and anticoagulation treatment, and suggested a multivariate model with fair predictive performance. This model potentially enables a reduction of DOUP after HDRB-B. It supports the belief that further research should be focused on the urethral sphincter as a critical structure for OUP.
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred
2013-01-01
A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.
Yang, D H; Su, Z Q; Chen, Y; Chen, Z B; Ding, Z N; Weng, Y Y; Li, J; Li, X; Tong, Q L; Han, Y X; Zhang, X
2016-03-08
To assess the predictive value of the albumin to globulin ratio (AGR) in evaluation of disease severity and prognosis in myasthenia gravis patients. A total of 135 myasthenia gravis (MG) patients were enrolled between February 2009 and March 2015. The AGR was measured on the first day of hospitalization and ranked from lowest to highest, and the patients were divided into three tertiles according to the AGR values: T1 (AGR <1.34), T2 (1.34≤AGR≤1.53) and T3 (AGR>1.53). The Kaplan-Meier curve was used to evaluate the prognostic value of AGR. Cox model analysis was used to evaluate the relevant factors. Multivariate logistic regression analysis was used to find the predictors of myasthenia crisis during hospitalization. The median length of hospital stay for each tertile was 21 days (15-35.5) for T1, 18 days (14-27.5) for T2, and 16 days (12-22.5) for T3 (P<0.01), and Kaplan-Meier curves showed a significant difference among the three groups. In the univariate model, serum albumin, creatinine, AGR and MGFA clinical classification were related to prognosis of myasthenia gravis. In the multivariate Cox regression analysis, the AGR (P<0.001) and MGFA clinical classification (P<0.001) were independent predictive factors of disease severity and prognosis in myasthenia gravis patients. The hazard ratios (HR) were 4.655 (95% CI: 2.355-9.202) and 0.596 (95% CI: 0.492-0.723), respectively. Multivariate logistic regression analysis showed the AGR (P<0.001) and MGFA clinical classification were related to myasthenia crisis. The AGR may represent a simple, potentially useful predictive biomarker for evaluating the disease severity and prognosis of patients with myasthenia gravis.
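The tertile grouping described in this abstract is a simple threshold rule. A minimal sketch using the cut points quoted above follows; the function name is hypothetical, and the cut points are specific to this cohort rather than general reference values.

```python
def agr_tertile(agr):
    """Assign an albumin to globulin ratio to the tertile quoted in the abstract.

    T1: AGR < 1.34, T2: 1.34 <= AGR <= 1.53, T3: AGR > 1.53.
    """
    if agr < 1.34:
        return "T1"
    if agr <= 1.53:
        return "T2"
    return "T3"
```

Both boundary values (1.34 and 1.53) fall in T2, following the inequalities as written in the abstract.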
Jiang, Yanlin; Xu, Hong; Zhang, Hao; Ou, Xunyan; Xu, Zhen; Ai, Liping; Sun, Lisha; Liu, Caigang
2017-09-22
The current management of the axilla in level 1 node-positive breast cancer patients is axillary lymph node dissection regardless of the status of the level 2 axillary lymph nodes. The goal of this study was to develop a nomogram predicting the probability of level 2 axillary lymph node metastasis (L-2-ALNM) in patients with level 1 axillary node-positive breast cancer. We reviewed the records of 974 patients with pathology-confirmed level 1 node-positive breast cancer between 2010 and 2014 at the Liaoning Cancer Hospital and Institute. The patients were randomized 1:1 and divided into a modeling group and a validation group. Clinical and pathological features of the patients were assessed with uni- and multivariate logistic regression. A nomogram based on independent predictors for the L-2-ALNM identified by multivariate logistic regression was constructed. Independent predictors of L-2-ALNM by the multivariate logistic regression analysis included tumor size, Ki-67 status, histological grade, and number of positive level 1 axillary lymph nodes. The areas under the receiver operating characteristic curve of the modeling set and the validation set were 0.828 and 0.816, respectively. The false-negative rates of the L-2-ALNM nomogram were 1.82% and 7.41% for the predicted probability cut-off points of < 6% and < 10%, respectively, when applied to the validation group. Our nomogram could help predict L-2-ALNM in patients with level 1 axillary lymph node metastasis. Patients with a low probability of L-2-ALNM could be spared level 2 axillary lymph node dissection, thereby reducing postoperative morbidity.
Pérez, Concepción; Navarro, Ana; Saldaña, María T; Wilson, Koo; Rejas, Javier
2015-03-01
The aim of the present analysis was to model the association and predictive value of pain intensity on cost and resource utilization in patients with chronic peripheral neuropathic pain (PNP) treated in routine clinical practice settings in Spain. We performed a secondary economic analysis based on data from a multicenter, observational, and prospective cost-of-illness study in patients with chronic PNP that is refractory to prior treatment. Pain intensity was measured using the Short-Form McGill Pain Questionnaire. Univariate and multivariate linear regression models were fitted to identify independent predictors of cost and health care/non-health care resource utilization. A total of 1703 patients were included in the current analysis. Pain intensity was an independent predictor of total costs ([total costs]=35.6 [pain intensity]+214.5; coefficient of determination [R(2)]=0.19, P<0.001), direct costs ([direct costs]=10.8 [pain intensity]+257.7; R=0.06, P<0.001), and indirect costs ([indirect costs]=24.8 [pain intensity]-43.4; R(2)=0.20, P<0.001) related to chronic PNP in the univariate analysis. Pain intensity remains significantly associated with total costs, direct costs, and indirect costs after adjustment by other covariates in the multivariate analysis (P<0.001). None of the other variables considered in the multivariate analysis were predictors of resource utilization. Pain intensity predicts the health care and non-health care resource utilization, and costs related to chronic PNP. Management of patients with drugs associated with a higher reduction of pain intensity may have a greater impact on the economic burden of that condition.
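The univariate regression equations quoted in this abstract can be applied directly. The sketch below encodes the three published slopes and intercepts; the function and dictionary names are hypothetical, and the abstract states neither the currency and time period of the costs nor the scale of the pain intensity score, so those are left to the caller.

```python
# (slope, intercept) pairs exactly as quoted in the abstract.
COST_MODELS = {
    "total": (35.6, 214.5),
    "direct": (10.8, 257.7),
    "indirect": (24.8, -43.4),
}


def predicted_cost(kind, pain_intensity):
    """Evaluate cost = slope * pain_intensity + intercept for one cost type."""
    slope, intercept = COST_MODELS[kind]
    return slope * pain_intensity + intercept
```

The direct and indirect slopes sum to the total slope (10.8 + 24.8 = 35.6), while the intercepts sum to 214.3 rather than 214.5, a small difference that would be consistent with rounding in separately fitted models.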
Roland, Lauren T.; Kallogjeri, Dorina; Sinks, Belinda C.; Rauch, Steven D.; Shepard, Neil T.; White, Judith A.; Goebel, Joel A.
2015-01-01
Objective Test performance of a focused dizziness questionnaire’s ability to discriminate between peripheral and non-peripheral causes of vertigo. Study Design Prospective multi-center Setting Four academic centers with experienced balance specialists Patients New dizzy patients Interventions A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Main outcomes Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and non-peripheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. Results 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and non-peripheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central and other causes were considered good as measured by c-indices of 0.75, 0.7 and 0.78, respectively. Conclusions This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from non-peripheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. 
Clinical utility of this questionnaire to guide specialty referral is discussed. PMID:26485598
Roland, Lauren T; Kallogjeri, Dorina; Sinks, Belinda C; Rauch, Steven D; Shepard, Neil T; White, Judith A; Goebel, Joel A
2015-12-01
Test performance of a focused dizziness questionnaire's ability to discriminate between peripheral and nonperipheral causes of vertigo. Prospective multicenter. Four academic centers with experienced balance specialists. New dizzy patients. A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and nonperipheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. In total, 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and nonperipheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central, and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central, and other causes was considered good as measured by c-indices of 0.75, 0.7, and 0.78, respectively. This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from nonperipheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed.
Prognosis Relevance of Serum Cytokines in Pancreatic Cancer
Alejandre, Maria José; Palomino-Morales, Rogelio J.; Prados, Jose; Aránega, Antonia; Delgado, Juan R.; Irigoyen, Antonio; Martínez-Galán, Joaquina; Ortuño, Francisco M.
2015-01-01
The overall survival of patients with pancreatic ductal adenocarcinoma is extremely low. Although gemcitabine is the standard used chemotherapy for this disease, clinical outcomes do not reflect significant improvements, not even when combined with adjuvant treatments. There is an urgent need for prognosis markers to be found. The aim of this study was to analyze the potential value of serum cytokines to find a profile that can predict the clinical outcome in patients with pancreatic cancer and to establish a practical prognosis index that significantly predicts patients' outcomes. We have conducted an extensive analysis of serum prognosis biomarkers using an antibody array comprising 507 human cytokines. Overall survival was estimated using the Kaplan-Meier method. Univariate and multivariate Cox's proportional hazard models were used to analyze prognosis factors. To determine the extent that survival could be predicted based on this index, we used the leave-one-out cross-validation model. The multivariate model showed a better performance and it could represent a novel panel of serum cytokines that correlates to poor prognosis in pancreatic cancer. B7-1/CD80, EG-VEGF/PK1, IL-29, NRG1-beta1/HRG1-beta1, and PD-ECGF expressions portend a poor prognosis for patients with pancreatic cancer and these cytokines could represent novel therapeutic targets for this disease. PMID:26346854
Whittle, Rebecca; Peat, George; Belcher, John; Collins, Gary S; Riley, Richard D
2018-05-18
Measurement error in predictor variables may threaten the validity of clinical prediction models. We sought to evaluate the possible extent of the problem. A secondary objective was to examine whether predictors are measured at the intended moment of model use. A systematic search of Medline was used to identify a sample of articles reporting the development of a clinical prediction model published in 2015. After screening according to predefined inclusion criteria, information on predictors, strategies to control for measurement error and intended moment of model use was extracted. Susceptibility to measurement error for each predictor was classified into low and high risk. Thirty-three studies were reviewed, including 151 different predictors in the final prediction models. Fifty-one (33.7%) predictors were categorised as high risk of error; however, this was not accounted for in the model development. Only 8 (24.2%) studies explicitly stated the intended moment of model use and when the predictors were measured. Reporting of measurement error and intended moment of model use is poor in prediction model studies. There is a need to identify circumstances where ignoring measurement error in prediction models is consequential and whether accounting for the error will improve the predictions. Copyright © 2018. Published by Elsevier Inc.
Application of Multivariable Model Predictive Advanced Control for a 2×310T/H CFB Boiler Unit
NASA Astrophysics Data System (ADS)
Weijie, Zhao; Zongllao, Dai; Rong, Gou; Wengan, Gong
When a CFB boiler is in automatic control, there are strong interactions between various process variables and inverse response characteristics of bed temperature control target. Conventional PID control strategy cannot deliver satisfactory control performance. Kalman filter technology is used to establish a non-linear combustion model, based on the CFB combustion characteristics of bed fuel inventory, heating values, bed lime inventory and consumption. CFB advanced combustion control utilizes multivariable model predictive control technology to optimize primary and secondary air flow, bed temperature, air flow, fuel flow and heat flux. In addition to providing advanced combustion control to 2×310t/h CFB+1×100MW extraction condensing turbine generator unit, the control also provides load allocation optimization and advanced control for main steam pressure, combustion and temperature. After the successful implementation, under 10% load change, main steam pressure varied less than ±0.07MPa, temperature less than ±1°C, bed temperature less than ±4°C, and air flow (O2) less than ±0.4%.
Preexposure Prophylaxis and Predicted Condom Use Among High-Risk Men Who Have Sex With Men
Golub, Sarit A.; Kowalczyk, William; Weinberger, Corina L.; Parsons, Jeffrey T.
2010-01-01
Objectives Preexposure prophylaxis (PREP) is an emerging HIV prevention strategy; however, many fear it may lead to neglect of traditional risk reduction practices through behavioral disinhibition or risk compensation. Methods Participants were 180 HIV-negative high-risk men who have sex with men recruited in New York City, who completed an Audio Computer Assisted Self Interview-administered survey between September 2007 and July 2009. Bivariate and multivariate logistic regression models were used to predict intention to use PREP and perceptions that PREP would decrease condom use. Results Almost 70% (n = 124) of participants reported that they would be likely to use PREP if it were at least 80% effective in preventing HIV. Of those who would use PREP, over 35% reported that they would be likely to decrease condom use while on PREP. In multivariate analyses, arousal/pleasure barriers to condom use significantly predicted likelihood of PREP use (odds ratio = 1.71, P < 0.05) and risk perception motivations for condom use significantly predicted decreased condom use on PREP (odds ratio = 2.48, P < 0.05). Discussion These data provide support for both behavioral disinhibition and risk compensation models and underscore the importance of developing behavioral interventions to accompany any wide-scale provision of PREP to high-risk populations. PMID:20512046
Predictive monitoring and diagnosis of periodic air pollution in a subway station.
Kim, YongSu; Kim, MinJung; Lim, JungJin; Kim, Jeong Tai; Yoo, ChangKyoo
2010-11-15
The purpose of this study was to develop a predictive monitoring and diagnosis system for the air pollutants in a subway system using a lifting technique with a multiway principal component analysis (MPCA) which monitors the periodic patterns of the air pollutants and diagnoses the sources of the contamination. The basic purpose of this lifting technique was to capture the multivariate and periodic characteristics of all of the indoor air samples collected during each day. These characteristics could then be used to improve the handling of strong periodic fluctuations in the air quality environment in subway systems and will allow important changes in the indoor air quality to be quickly detected. The predictive monitoring approach was applied to a real indoor air quality dataset collected by telemonitoring systems (TMS) that indicated some periodic variations in the air pollutants and multivariate relationships between the measured variables. Two monitoring models--global and seasonal--were developed to study climate change in Korea. The proposed predictive monitoring method using the lifted model resulted in fewer false alarms and missed faults due to non-stationary behavior than were experienced with the conventional methods. This method could be used to identify the contributions of various pollution sources. Copyright © 2010 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Yokota, Miyo; Berglund, Larry G.; Bathalon, Gaston P.
2012-03-01
The use of thermoregulatory models for assessing physiological responses of workers in thermally stressful situations has been increasing because of the risks and costs related to human studies. In a previous study (Yokota et al. Eur J Appl Physiol 104:297-302, 2008), the effects of anthropometric variability on predicted physiological responses to heat stress in U.S. Army male soldiers were evaluated. Five somatotypes were identified in U.S. Army male multivariate anthropometric distribution. The simulated heat responses, using a thermoregulatory model, were different between somatotypes. The present study further extends this line of research to female soldiers. Anthropometric somatotypes were identified using multivariate analysis [height, weight, percent body fat (%BF)] and the predicted physiological responses to simulated exercise and heat stress using a thermoregulatory model were evaluated. The simulated conditions included walking at ~3 mph (4.8 km/h) for 300 min and wearing battle dress uniform and body armor in a 30°C, 25% relative humidity (RH) environment without solar radiation. Five major somatotypes (tall-fat, tall-lean, average, short-lean, and short-fat), identified through multivariate analysis of anthropometric distributions, showed different tolerance levels to simulated heat stress: lean women were predicted to maintain their core temperatures (Tc) lower than short-fat or tall-fat women. The measured Tc of female subjects obtained from two heat studies (data1: 30°C, 32% RH, protective garments, ~225 w·m-2 walk for 90 min; data2: 32°C, 75% RH, hot weather battle dress uniform, ~378 ± 32 w·m-2 for 30 min walk/30 min rest cycles for 120 min) were utilized for validation. Validation results agreed with the findings in this study: fat subjects tended to have higher core temperatures than medium individuals (data2) and lean subjects maintained lower core temperatures than medium subjects (data1).
Cisler, Josh M.; Bush, Keith; James, G. Andrew; Smitherman, Sonet; Kilts, Clinton D.
2015-01-01
Posttraumatic Stress Disorder (PTSD) is characterized by intrusive recall of the traumatic memory. While numerous studies have investigated the neural processing mechanisms engaged during trauma memory recall in PTSD, these analyses have only focused on group-level contrasts that reveal little about the predictive validity of the identified brain regions. By contrast, a multivariate pattern analysis (MVPA) approach towards identifying the neural mechanisms engaged during trauma memory recall would entail testing whether a multivariate set of brain regions is reliably predictive of (i.e., discriminates) whether an individual is engaging in trauma or non-trauma memory recall. Here, we use a MVPA approach to test 1) whether trauma memory vs neutral memory recall can be predicted reliably using a multivariate set of brain regions among women with PTSD related to assaultive violence exposure (N=16), 2) the methodological parameters (e.g., spatial smoothing, number of memory recall repetitions, etc.) that optimize classification accuracy and reproducibility of the feature weight spatial maps, and 3) the correspondence between brain regions that discriminate trauma memory recall and the brain regions predicted by neurocircuitry models of PTSD. Cross-validation classification accuracy was significantly above chance for all methodological permutations tested; mean accuracy across participants was 76% for the methodological parameters selected as optimal for both efficiency and accuracy. Classification accuracy was significantly better for a voxel-wise approach relative to voxels within restricted regions-of-interest (ROIs); classification accuracy did not differ when using PTSD-related ROIs compared to randomly generated ROIs. ROI-based analyses suggested the reliable involvement of the left hippocampus in discriminating memory recall across participants and that the contribution of the left amygdala to the decision function was dependent upon PTSD symptom severity. 
These results have methodological implications for real-time fMRI neurofeedback of the trauma memory in PTSD and conceptual implications for neurocircuitry models of PTSD that attempt to explain core neural processing mechanisms mediating PTSD. PMID:26241958
Hayman, Jonathan; Phillips, Ryan; Chen, Di; Perin, Jamie; Narang, Amol K; Trieu, Janson; Radwan, Noura; Greco, Stephen; Deville, Curtiland; McNutt, Todd; Song, Daniel Y; DeWeese, Theodore L; Tran, Phuoc T
2018-06-01
Undetectable End of Radiation PSA (EOR-PSA) has been shown to predict improved survival in prostate cancer (PCa). While validating the unfavorable intermediate-risk (UIR) and favorable intermediate-risk (FIR) stratifications among Johns Hopkins PCa patients treated with radiotherapy, we examined whether EOR-PSA could further risk stratify UIR men for survival. A total of 302 IR patients were identified in the Johns Hopkins PCa database (178 UIR, 124 FIR). Kaplan-Meier curves and multivariable analysis was performed via Cox regression for biochemical recurrence free survival (bRFS), distant metastasis free survival (DMFS), and overall survival (OS), while a competing risks model was used for PCa specific survival (PCSS). Among the 235 patients with known EOR-PSA values, we then stratified by EOR-PSA and performed the aforementioned analysis. The median follow-up time was 11.5 years (138 months). UIR was predictive of worse DMFS and PCSS (P = 0.008 and P = 0.023) on multivariable analysis (MVA). Increased radiation dose was significant for improved DMFS (P = 0.016) on MVA. EOR-PSA was excluded from the models because it did not trend towards significance as a continuous or binary variable due to interaction with UIR, and we were unable to converge a multivariable model with a variable to control for this interaction. However, when stratifying by detectable versus undetectable EOR-PSA, UIR had worse DMFS and PCSS among detectable EOR-PSA patients, but not undetectable patients. UIR was significant on MVA among detectable EOR-PSA patients for DMFS (P = 0.021) and PCSS (P = 0.033), while RT dose also predicted PCSS (P = 0.013). EOR-PSA can assist in predicting DMFS and PCSS among UIR patients, suggesting a clinically meaningful time point for considering intensification of treatment in clinical trials of intermediate-risk men. © 2018 Wiley Periodicals, Inc.
Data-driven Analysis and Prediction of Arctic Sea Ice
NASA Astrophysics Data System (ADS)
Kondrashov, D. A.; Chekroun, M.; Ghil, M.; Yuan, X.; Ting, M.
2015-12-01
We present results of data-driven predictive analyses of sea ice over the main Arctic regions. Our approach relies on the Multilayer Stochastic Modeling (MSM) framework of Kondrashov, Chekroun and Ghil [Physica D, 2015] and it leads to prognostic models of sea ice concentration (SIC) anomalies on seasonal time scales. This approach is applied to monthly time series of leading principal components from the multivariate Empirical Orthogonal Function decomposition of SIC and selected climate variables over the Arctic. We evaluate the predictive skill of MSM models by performing retrospective forecasts with "no-look ahead" for up to 6 months ahead. It will be shown in particular that the memory effects included in our non-Markovian linear MSM models improve predictions of large-amplitude SIC anomalies in certain Arctic regions. Further improvements allowed by the MSM framework will adopt a nonlinear formulation, as well as alternative data-adaptive decompositions.
Ouyang, Qin; Zhao, Jiewen; Chen, Quansheng
2015-01-01
The non-sugar solids (NSS) content is one of the most important nutrition indicators of Chinese rice wine. This study proposed a rapid method for the measurement of NSS content in Chinese rice wine using near infrared (NIR) spectroscopy. We also systematically studied efficient spectral variable selection algorithms for use in modeling. A new algorithm of synergy interval partial least square with competitive adaptive reweighted sampling (Si-CARS-PLS) was proposed for modeling. The performance of the final model was back-evaluated using root mean square error of calibration (RMSEC) and correlation coefficient (Rc) in calibration set and similarly tested by root mean square error of prediction (RMSEP) and correlation coefficient (Rp) in prediction set. The optimum model by Si-CARS-PLS algorithm was achieved when 7 PLS factors and 18 variables were included, and the results were as follows: Rc=0.95 and RMSEC=1.12 in the calibration set, Rp=0.95 and RMSEP=1.22 in the prediction set. In addition, Si-CARS-PLS algorithm showed its superiority when compared with the commonly used algorithms in multivariate calibration. This work demonstrated that NIR spectroscopy technique combined with a suitable multivariate calibration algorithm has a high potential in rapid measurement of NSS content in Chinese rice wine. Copyright © 2015 Elsevier B.V. All rights reserved.
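The figures of merit quoted in the abstract above (RMSEC, RMSEP, Rc, Rp) are a root mean square error and a correlation coefficient computed on the calibration and prediction sets respectively. A minimal sketch follows, assuming the plain definition that divides by the number of samples (some chemometrics packages subtract degrees of freedom for RMSEC) and using hypothetical function names and toy values.

```python
from math import sqrt


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values.

    Applied to the calibration set this is RMSEC; applied to an
    independent prediction set it is RMSEP.
    """
    n = len(reference)
    return sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def pearson_r(reference, predicted):
    """Pearson correlation coefficient (Rc or Rp depending on the set)."""
    n = len(reference)
    mr = sum(reference) / n
    mp = sum(predicted) / n
    cov = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    var_r = sum((r - mr) ** 2 for r in reference)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / sqrt(var_r * var_p)


# Hypothetical reference and predicted values (not data from the study).
ref = [10.0, 12.0, 14.0, 16.0]
pred = [10.5, 11.5, 14.5, 15.5]
print(f"RMSE = {rmse(ref, pred):.3f}, R = {pearson_r(ref, pred):.3f}")
```

Reporting both statistics on both sets, as the abstract does, helps reveal overfitting: a calibration error far below the prediction error suggests the model has adapted to noise in the calibration samples.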
Climate variability, weather and enteric disease incidence in New Zealand: time series analysis.
Lal, Aparna; Ikeda, Takayoshi; French, Nigel; Baker, Michael G; Hales, Simon
2013-01-01
Evaluating the influence of climate variability on enteric disease incidence may improve our ability to predict how climate change may affect these diseases. To examine the associations between regional climate variability and enteric disease incidence in New Zealand. Associations between monthly climate and enteric diseases (campylobacteriosis, salmonellosis, cryptosporidiosis, giardiasis) were investigated using Seasonal Auto Regressive Integrated Moving Average (SARIMA) models. No climatic factors were significantly associated with campylobacteriosis and giardiasis, with similar predictive power for univariate and multivariate models. Cryptosporidiosis was positively associated with average temperature of the previous month (β = 0.130, SE = 0.060, p <0.01) and inversely related to the Southern Oscillation Index (SOI) two months previously (β = -0.008, SE = 0.004, p <0.05). By contrast, salmonellosis was positively associated with temperature (β = 0.110, SE = 0.020, p<0.001) of the current month and SOI of the current (β = 0.005, SE = 0.002, p<0.050) and previous month (β = 0.005, SE = 0.002, p<0.05). Forecasting accuracy of the multivariate models for cryptosporidiosis and salmonellosis were significantly higher. Although spatial heterogeneity in the observed patterns could not be assessed, these results suggest that temporally lagged relationships between climate variables and national communicable disease incidence data can contribute to disease prediction models and early warning systems.
A real-time prediction model for post-irradiation malignant cervical lymph nodes.
Lo, W-C; Cheng, P-W; Shueng, P-W; Hsieh, C-H; Chang, Y-L; Liao, L-J
2018-04-01
To establish a real-time predictive scoring model based on sonographic characteristics for identifying malignant cervical lymph nodes (LNs) in cancer patients after neck irradiation. One-hundred forty-four irradiation-treated patients underwent ultrasonography and ultrasound-guided fine-needle aspirations (USgFNAs), and the resultant data were used to construct a real-time and computerised predictive scoring model. This scoring system was further compared with our previously proposed prediction model. A predictive scoring model, 1.35 × (L axis) + 2.03 × (S axis) + 2.27 × (margin) + 1.48 × (echogenic hilum) + 3.7, was generated by stepwise multivariate logistic regression analysis. Neck LNs were considered to be malignant when the score was ≥ 7, corresponding to a sensitivity of 85.5%, specificity of 79.4%, positive predictive value (PPV) of 82.3%, negative predictive value (NPV) of 83.1%, and overall accuracy of 82.6%. When this new model and the original model were compared, the areas under the receiver operating characteristic curve (c-statistic) were 0.89 and 0.81, respectively (P < .05). A real-time sonographic predictive scoring model was constructed to provide prompt and reliable guidance for USgFNA biopsies to manage cervical LNs after neck irradiation. © 2017 John Wiley & Sons Ltd.
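The scoring rule reported in the abstract above can be written as a short function. The coefficients, the constant 3.7 and the cutoff of 7 are taken from the abstract; the coding of each sonographic feature as a 0/1 indicator is an assumption made here for illustration only, since the abstract does not state how the features are coded. The names below are hypothetical.

```python
# Coefficients, constant and cutoff as reported in the abstract.
COEFFS = {
    "l_axis": 1.35,
    "s_axis": 2.03,
    "margin": 2.27,
    "echogenic_hilum": 1.48,
}
INTERCEPT = 3.7
CUTOFF = 7.0


def ln_score(features):
    """Weighted sum of the four sonographic features plus the constant.

    ASSUMPTION: each feature is a 0/1 indicator; the abstract does not
    specify the coding, so this is illustrative only.
    """
    return INTERCEPT + sum(COEFFS[name] * features[name] for name in COEFFS)


def is_malignant(features):
    """Apply the reported decision rule: malignant when score >= 7."""
    return ln_score(features) >= CUTOFF


# Hypothetical node with two of the four indicators present.
node = {"l_axis": 0, "s_axis": 1, "margin": 1, "echogenic_hilum": 0}
print(round(ln_score(node), 2), is_malignant(node))
```

A linear score of this kind is easy to evaluate at the bedside, which is consistent with the real-time use the authors describe.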
Brandstätter, Christian; Laner, David; Prantl, Roman; Fellner, Johann
2014-12-01
Municipal solid waste landfills pose a threat on environment and human health, especially old landfills which lack facilities for collection and treatment of landfill gas and leachate. Consequently, missing information about emission flows prevent site-specific environmental risk assessments. To overcome this gap, the combination of waste sampling and analysis with statistical modeling is one option for estimating present and future emission potentials. Optimizing the tradeoff between investigation costs and reliable results requires knowledge about both: the number of samples to be taken and variables to be analyzed. This article aims to identify the optimized number of waste samples and variables in order to predict a larger set of variables. Therefore, we introduce a multivariate linear regression model and tested the applicability by usage of two case studies. Landfill A was used to set up and calibrate the model based on 50 waste samples and twelve variables. The calibrated model was applied to Landfill B including 36 waste samples and twelve variables with four predictor variables. The case study results are twofold: first, the reliable and accurate prediction of the twelve variables can be achieved with the knowledge of four predictor variables (Loi, EC, pH and Cl). For the second Landfill B, only ten full measurements would be needed for a reliable prediction of most response variables. The four predictor variables would exhibit comparably low analytical costs in comparison to the full set of measurements. This cost reduction could be used to increase the number of samples yielding an improved understanding of the spatial waste heterogeneity in landfills. Concluding, the future application of the developed model potentially improves the reliability of predicted emission potentials. The model could become a standard screening tool for old landfills if its applicability and reliability would be tested in additional case studies. Copyright © 2014 Elsevier Ltd. 
All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Farjam, R; Pramanik, P; Srinivasan, A
Purpose: Vascular injury could be a cause of hippocampal dysfunction leading to late neurocognitive decline in patients receiving brain radiotherapy (RT). Hence, our aim was to develop a multivariate interaction model for characterization of hippocampal vascular dose-response and early prediction of radiation-induced late neurocognitive impairments. Methods: 27 patients (17 males and 10 females, age 31–80 years) were enrolled in an IRB-approved prospective longitudinal study. All patients were diagnosed with a low-grade glioma or benign tumor and treated by 3-D conformal or intensity-modulated RT with a median dose of 54 Gy (50.4–59.4 Gy in 1.8-Gy fractions). Six DCE-MRI scans were performed from pre-RT to 18 months post-RT. DCE data were fitted to the modified Tofts model to obtain the transfer constant of gadolinium influx from the intravascular space into the extravascular extracellular space, Ktrans, and the fraction of blood plasma volume, Vp. The hippocampus vascular property alterations after starting RT were characterized by changes in the hippocampal mean values of μh(Ktrans)τ and μh(Vp)τ. The dose-response, Δμh(Ktrans/Vp)pre->τ, was modeled using a multivariate linear regression considering integrations of doses with age, sex, hippocampal laterality and presence of tumor/edema near a hippocampus. Finally, the early vascular dose-response in hippocampus was correlated with neurocognitive decline 6 and 18 months post-RT. Results: The μh(Ktrans) increased significantly from pre-RT to 1 month post-RT (p<0.0004). The multivariate model showed that the dose effect on Δμh(Ktrans)pre->1M post-RT interacted with sex (p<0.0007) and age (p<0.00004), with the dose-response more pronounced in older females. Also, the vascular dose-response in the left hippocampus of females was significantly correlated with memory function decline at 6 (r = −0.95, p<0.0006) and 18 (r = −0.88, p<0.02) months post-RT.
Conclusion: The hippocampal vascular response to radiation could be sex and age dependent. The early hippocampal vascular dose-response could predict late neurocognitive dysfunction. (Support: NIH-RO1NS064973)
Pellegrino Vidal, Rocío B; Allegrini, Franco; Olivieri, Alejandro C
2018-03-20
Multivariate curve resolution-alternating least-squares (MCR-ALS) is the model of choice when dealing with some non-trilinear arrays, specifically when the data are of chromatographic origin. To drive the iterative procedure to chemically interpretable solutions, the use of constraints becomes essential. In this work, both simulated and experimental data have been analyzed by MCR-ALS, applying chemically reasonable constraints, and investigating the relationship between selectivity, analytical sensitivity (γ) and root mean square error of prediction (RMSEP). As the selectivity in the instrumental modes decreases, the estimated values for γ did not fully represent the predictive model capabilities, judged from the obtained RMSEP values. Since the available sensitivity expressions have been developed by error propagation theory in unconstrained systems, there is a need of developing new expressions or analytical indicators. They should not only consider the specific profiles retrieved by MCR-ALS, but also the constraints under which the latter ones have been obtained. Copyright © 2017 Elsevier B.V. All rights reserved.
Teixeira, Kelly Sivocy Sampaio; da Cruz Fonseca, Said Gonçalves; de Moura, Luís Carlos Brigido; de Moura, Mario Luís Ribeiro; Borges, Márcia Herminia Pinheiro; Barbosa, Euzébio Guimaraes; De Lima E Moura, Túlio Flávio Accioly
2018-02-05
The World Health Organization recommends that TB treatment be administered using combination therapy. The methodologies for quantifying simultaneously associated drugs are highly complex, being costly, extremely time consuming and producing chemical residues harmful to the environment. The need to seek alternative techniques that minimize these drawbacks is widely discussed in the pharmaceutical industry. Therefore, the objective of this study was to develop and validate a multivariate calibration model in association with the near infrared spectroscopy technique (NIR) for the simultaneous determination of rifampicin, isoniazid, pyrazinamide and ethambutol. These models allow the quality control of these medicines to be optimized using simple, fast, low-cost techniques that produce no chemical waste. In the NIR-PLS method, spectra readings were acquired in the 10,000-4000 cm⁻¹ range using an infrared spectrophotometer (IRPrestige-21, Shimadzu) with a resolution of 4 cm⁻¹, 20 sweeps, under controlled temperature and humidity. For construction of the model, the central composite experimental design was employed on the program Statistica 13 (StatSoft Inc.). All spectra were treated by computational tools for multivariate analysis using partial least squares regression (PLS) on the software program Pirouette 3.11 (Infometrix, Inc.). Variable selections were performed by the QSAR modeling program. The models developed by NIR in association with multivariate analysis provided good prediction of the APIs for the external samples and were therefore validated. For the tablets, however, the slightly different quantitative compositions of excipients compared to the mixtures prepared for building the models led to results that were not statistically similar, despite having prediction errors considered acceptable in the literature. Copyright © 2017 Elsevier B.V. All rights reserved.
Annamalai, Alagappan; Harada, Megan Y; Chen, Melissa; Tran, Tram; Ko, Ara; Ley, Eric J; Nuno, Miriam; Klein, Andrew; Nissen, Nicholas; Noureddin, Mazen
2017-03-01
Critically ill cirrhotics require liver transplantation urgently, but are at high risk for perioperative mortality. The Model for End-stage Liver Disease (MELD) score, recently updated to incorporate serum sodium, estimates survival probability in patients with cirrhosis, but needs additional evaluation in the critically ill. The purpose of this study was to evaluate the predictive power of ICU admission MELD scores and identify clinical risk factors associated with increased mortality. This was a retrospective review of cirrhotic patients admitted to the ICU between January 2011 and December 2014. Patients who were discharged or underwent transplantation (survivors) were compared with those who died (nonsurvivors). Demographic characteristics, admission MELD scores, and clinical risk factors were recorded. Multivariate regression was used to identify independent predictors of mortality, and measures of model performance were assessed to determine predictive accuracy. Of 276 patients who met inclusion criteria, 153 were considered survivors and 123 were nonsurvivors. Survivor and nonsurvivor cohorts had similar demographic characteristics. Nonsurvivors had increased MELD, gastrointestinal bleeding, infection, mechanical ventilation, encephalopathy, vasopressors, dialysis, renal replacement therapy, requirement of blood products, and ICU length of stay. The MELD demonstrated low predictive power (c-statistic 0.73). Multivariate analysis identified MELD score (adjusted odds ratio [AOR] = 1.05), mechanical ventilation (AOR = 4.55), vasopressors (AOR = 3.87), and continuous renal replacement therapy (AOR = 2.43) as independent predictors of mortality, with stronger predictive accuracy (c-statistic 0.87). The MELD demonstrated relatively poor predictive accuracy in critically ill patients with cirrhosis and might not be the best indicator for prognosis in the ICU population. 
Prognostic accuracy is significantly improved when variables indicating organ support (mechanical ventilation, vasopressors, and continuous renal replacement therapy) are included in the model. Copyright © 2016. Published by Elsevier Inc.
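The c-statistic used in the abstract above to compare the MELD score with the multivariate model is the probability that a randomly chosen nonsurvivor receives a higher predicted risk than a randomly chosen survivor. A minimal sketch of that pairwise definition follows; the function name and the toy scores are hypothetical, and ties are counted as half-concordant, which is a common convention rather than something stated in the abstract.

```python
def c_statistic(scores, outcomes):
    """Concordance statistic (area under the ROC curve) for binary outcomes.

    scores   : predicted risks, higher meaning more likely to have the event
    outcomes : 1 for an event (e.g. death), 0 for no event
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and outcomes (not data from the study).
risk = [0.1, 0.4, 0.35, 0.8]
died = [0, 0, 1, 1]
print(c_statistic(risk, died))  # 3 of 4 pairs are concordant -> 0.75
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported 0.73 and 0.87 should be read.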
Predictors of regular cigarette smoking among adolescent females: Does body image matter?
Kaufman, Annette R.; Augustson, Erik M.
2013-01-01
This study examined how factors associated with body image predict regular smoking in adolescent females. Data were from the National Longitudinal Study of Adolescent Health (Add Health), a study of health-related behaviors in a nationally representative sample of adolescents in grades 7 through 12. Females in Waves I and II (n=6,956) were used for this study. Using SUDAAN to adjust for the sampling frame, univariate and multivariate analyses were performed to investigate if baseline body image factors, including perceived weight, perceived physical development, trying to lose weight, and self-esteem, were predictive of regular smoking status 1 year later. In univariate analyses, perceived weight (p<.01), perceived physical development (p<.0001), trying to lose weight (p<.05), and self-esteem (p<.0001) significantly predicted regular smoking 1 year later. In the logistic regression model, perceived physical development (p<.05), and self-esteem (p<.001) significantly predicted regular smoking. The more developed a female reported being in comparison to other females her age, the more likely she was to be a regular smoker. Lower self-esteem was predictive of regular smoking. Perceived weight and trying to lose weight failed to reach statistical significance in the multivariate model. This current study highlights the importance of perceived physical development and self-esteem when predicting regular smoking in adolescent females. Efforts to promote positive self-esteem in young females may be an important strategy when creating interventions to reduce regular cigarette smoking. PMID:18686177
Treatment Selection in Depression.
Cohen, Zachary D; DeRubeis, Robert J
2018-05-07
Mental health researchers and clinicians have long sought answers to the question "What works for whom?" The goal of precision medicine is to provide evidence-based answers to this question. Treatment selection in depression aims to help each individual receive the treatment, among the available options, that is most likely to lead to a positive outcome for them. Although patient variables that are predictive of response to treatment have been identified, this knowledge has not yet translated into real-world treatment recommendations. The Personalized Advantage Index (PAI) and related approaches combine information obtained prior to the initiation of treatment into multivariable prediction models that can generate individualized predictions to help clinicians and patients select the right treatment. With increasing availability of advanced statistical modeling approaches, as well as novel predictive variables and big data, treatment selection models promise to contribute to improved outcomes in depression.
Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.
Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei
2013-12-03
We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm(-1)); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm(-1))) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R(2) = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R(2) = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R(2) = 0.998) independent of laser excitation power variations. 
This work demonstrates that multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in challenging Raman endoscopic applications.
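The leave-one-subject-out cross-validation mentioned in the abstract above holds out all spectra from one subject at a time, so that the error estimate is not inflated by within-subject similarity. The sketch below illustrates only that splitting scheme and the RMSECV calculation; it substitutes a one-variable least-squares line for the PLS model used in the study, and the names `fit_line`, `loso_rmsecv`, and the toy records are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loso_rmsecv(records):
    """Leave-one-subject-out RMSECV.

    records: list of (subject_id, x, y) tuples. Every record of the
    held-out subject is excluded from the fit and then predicted.
    """
    subjects = sorted({subject for subject, _, _ in records})
    squared_errors = []
    for held_out in subjects:
        train = [(x, y) for s, x, y in records if s != held_out]
        test = [(x, y) for s, x, y in records if s == held_out]
        slope, intercept = fit_line([x for x, _ in train],
                                    [y for _, y in train])
        squared_errors.extend((slope * x + intercept - y) ** 2
                              for x, y in test)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy records: (subject, reference signal, laser power).
toy_records = [("a", 1.0, 3.0), ("a", 2.0, 5.0),
               ("b", 3.0, 7.0), ("b", 4.0, 9.0),
               ("c", 5.0, 11.0), ("c", 6.0, 13.0)]
toy_rmsecv = loso_rmsecv(toy_records)
```

Grouping the folds by subject rather than by spectrum is the design choice that matters here; the regression model inside the loop could be swapped for any other.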
Deconstructing multivariate decoding for the study of brain function.
Hebart, Martin N; Baker, Chris I
2017-08-04
Multivariate decoding methods were developed originally as tools to enable accurate predictions in real-world applications. The realization that these methods can also be employed to study brain function has led to their widespread adoption in the neurosciences. However, prior to the rise of multivariate decoding, the study of brain function was firmly embedded in a statistical philosophy grounded on univariate methods of data analysis. In this way, multivariate decoding for brain interpretation grew out of two established frameworks: multivariate decoding for predictions in real-world applications, and classical univariate analysis based on the study and interpretation of brain activation. We argue that this led to two confusions, one reflecting a mixture of multivariate decoding for prediction or interpretation, and the other a mixture of the conceptual and statistical philosophies underlying multivariate decoding and classical univariate analysis. Here we attempt to systematically disambiguate multivariate decoding for the study of brain function from the frameworks it grew out of. After elaborating these confusions and their consequences, we describe six, often unappreciated, differences between classical univariate analysis and multivariate decoding. We then focus on how the common interpretation of what is signal and noise changes in multivariate decoding. Finally, we use four examples to illustrate where these confusions may impact the interpretation of neuroimaging data. We conclude with a discussion of potential strategies to help resolve these confusions in interpreting multivariate decoding results, including the potential departure from multivariate decoding methods for the study of brain function. Copyright © 2017. Published by Elsevier Inc.
Green lumber grade yields from factory grade logs of three oak species
Daniel A. Yaussy
1986-01-01
Multivariate regression models were developed to predict green board foot yields for the seven common factory lumber grades processed from white, black, and chestnut oak factory grade logs. These models use the standard log measurements of grade, scaling diameter, log length, and proportion of scaling defect. Any combination of lumber grades (such as 1 Common and...
Daniel A. Yaussy
1989-01-01
Multivariate regression models were developed to predict green board-foot yields (1 board ft. = 2.360 dm³) for the standard factory lumber grades processed from black cherry (Prunus serotina Ehrh.) and red maple (Acer rubrum L.) factory grade logs sawed at band and circular sawmills. The models use log...
Real estate value prediction using multivariate regression models
NASA Astrophysics Data System (ADS)
Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav
2017-11-01
The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly with many factors; it is therefore a prime field in which to apply machine learning concepts to optimize and predict prices with high accuracy. In this paper, we present various important features to use when predicting housing prices with good accuracy. We describe regression models that use various features to obtain a lower residual sum of squares error. When using features in a regression model, some feature engineering is required for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
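Ridge regression reduces overfitting by adding a penalty on coefficient size to the residual sum of squares, which shrinks the fitted coefficients toward zero. As a minimal sketch of that effect, assuming a single feature and no intercept (a simplification, not the models used in the paper), the penalized solution has the closed form w = Σxy / (Σx² + λ). The function name `ridge_slope` and the toy numbers are hypothetical.

```python
def ridge_slope(xs, ys, penalty):
    """Single-feature, no-intercept ridge regression.

    Minimizes sum((y - w * x) ** 2) + penalty * w ** 2, whose
    closed-form solution is w = sum(x * y) / (sum(x * x) + penalty).
    """
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


# Hypothetical toy data: a price that is exactly twice the feature value.
toy_x = [1.0, 2.0, 3.0]
toy_y = [2.0, 4.0, 6.0]
unpenalized = ridge_slope(toy_x, toy_y, 0.0)   # ordinary least squares
shrunk = ridge_slope(toy_x, toy_y, 14.0)       # coefficient pulled toward zero
```

With penalty 0 the result equals the ordinary least-squares slope; larger penalties trade some bias for lower variance.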
Pathan, Sameer A; Bhutta, Zain A; Moinudheen, Jibin; Jenkins, Dominic; Silva, Ashwin D; Sharma, Yogdutt; Saleh, Warda A; Khudabakhsh, Zeenat; Irfan, Furqan B; Thomas, Stephen H
2016-01-01
Background: Standard Emergency Department (ED) operations goals include minimization of the time interval (tMD) between patients' initial ED presentation and initial physician evaluation. This study assessed factors known (or suspected) to influence tMD with a two-step goal. The first step was generation of a multivariate model identifying parameters associated with prolongation of tMD at a single study center. The second step was the use of a study center-specific multivariate tMD model as a basis for predictive marginal probability analysis; the marginal model allowed for prediction of the degree of ED operations benefit that would be affected with specific ED operations improvements. Methods: The study was conducted using one month (May 2015) of data obtained from an ED administrative database (EDAD) in an urban academic tertiary ED with an annual census of approximately 500,000; during the study month, the ED saw 39,593 cases. The EDAD data were used to generate a multivariate linear regression model assessing the various demographic and operational covariates' effects on the dependent variable tMD. Predictive marginal probability analysis was used to calculate the relative contributions of key covariates as well as demonstrate the likely tMD impact on modifying those covariates with operational improvements. Analyses were conducted with Stata 14MP, with significance defined at p < 0.05 and confidence intervals (CIs) reported at the 95% level. Results: In an acceptable linear regression model that accounted for just over half of the overall variance in tMD (adjusted r² 0.51), important contributors to tMD included shift census (p = 0.008), shift time of day (p = 0.002), and physician coverage n (p = 0.004). These strong associations remained even after adjusting for each other and other covariates. 
Marginal predictive probability analysis was used to predict the overall tMD impact (improvement from 50 to 43 minutes, p < 0.001) of consistent staffing with 22 physicians. Conclusions: The analysis identified expected variables contributing to tMD with regression demonstrating significance and effect magnitude of alterations in covariates including patient census, shift time of day, and number of physicians. Marginal analysis provided operationally useful demonstration of the need to adjust physician coverage numbers, prompting changes at the study ED. The methods used in this analysis may prove useful in other EDs wishing to analyze operations information with the goal of predicting which interventions may have the most benefit.
Pat, Lucio; Ali, Bassam; Guerrero, Armando; Córdova, Atl V.; Garduza, José P.
2016-01-01
Attenuated total reflectance-Fourier transform infrared spectrometry combined with a chemometrics model was used for determination of physicochemical properties (pH, redox potential, free acidity, electrical conductivity, moisture, total soluble solids (TSS), ash, and HMF) in honey samples. The reference values of 189 honey samples of different botanical origin were determined using Association of Official Analytical Chemists (AOAC, 1990), Codex Alimentarius (2001), and International Honey Commission (2002) methods. Multivariate calibration models were built using partial least squares (PLS) for the measurands studied. The developed models were validated using cross-validation and external validation; several statistical parameters were obtained to determine the robustness of the calibration models: optimum number of principal components (PCs), standard error of cross-validation (SECV), coefficient of determination of cross-validation (R² cal), standard error of validation (SEP), coefficient of determination for external validation (R² val), and coefficient of variation (CV). The prediction accuracy for pH, redox potential, electrical conductivity, moisture, TSS, and ash was good, while for free acidity and HMF it was poor. The results demonstrate that attenuated total reflectance-Fourier transform infrared spectrometry is a valuable, rapid, and nondestructive tool for the quantification of physicochemical properties of honey. PMID:28070445
Metabolomics biomarkers to predict acamprosate treatment response in alcohol-dependent subjects.
Hinton, David J; Vázquez, Marely Santiago; Geske, Jennifer R; Hitschfeld, Mario J; Ho, Ada M C; Karpyak, Victor M; Biernacka, Joanna M; Choi, Doo-Sup
2017-05-31
Precision medicine for alcohol use disorder (AUD) allows optimal treatment of the right patient with the right drug at the right time. Here, we generated multivariable models incorporating clinical information and serum metabolite levels to predict acamprosate treatment response. The sample of 120 patients was randomly split into a training set (n = 80) and test set (n = 40) five independent times. Treatment response was defined as complete abstinence (no alcohol consumption during 3 months of acamprosate treatment) while nonresponse was defined as any alcohol consumption during this period. In each of the five training sets, we built a predictive model using a least absolute shrinkage and selection operator (LASSO) penalized selection method and then evaluated the predictive performance of each model in the corresponding test set. The models predicted acamprosate treatment response with a mean sensitivity and specificity in the test sets of 0.83 and 0.31, respectively, suggesting our model performed well at predicting responders, but not non-responders (i.e. many non-responders were predicted to respond). Studies with larger sample sizes and additional biomarkers will expand the clinical utility of predictive algorithms for pharmaceutical response in AUD.
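The contrast the abstract above draws between high sensitivity and low specificity follows directly from how the two quantities are counted: sensitivity is the share of true responders predicted to respond, and specificity is the share of true non-responders predicted not to respond. The sketch below shows that bookkeeping on a made-up test set; the function name `sensitivity_specificity` and the labels are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels (1 = responder)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # responders caught
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # responders missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # non-responders caught
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # non-responders missed
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one responder and one non-responder")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy test set: 6 responders followed by 4 non-responders.
toy_actual = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
# A model that predicts "responder" for almost everyone.
toy_predicted = [1, 1, 1, 1, 1, 0, 1, 1, 1, 0]
toy_sens, toy_spec = sensitivity_specificity(toy_actual, toy_predicted)
```

A model that labels nearly every patient a responder scores well on sensitivity while misclassifying most non-responders, which is the pattern the abstract describes.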
Comparing theories' performance in predicting violence.
Haas, Henriette; Cusson, Maurice
2015-01-01
The stakes of choosing the best theory as a basis for violence prevention and offender rehabilitation are high. However, no single theory of violence has ever been universally accepted by a majority of established researchers. Psychiatry, psychology and sociology are each subdivided into different schools relying upon different premises. All theories can produce empirical evidence for their validity, some of them stating the opposite of each other. Calculating different models with multivariate logistic regression on a dataset of N = 21,312 observations and ninety-two influences allowed a direct comparison of the performance of operationalizations of some of the most important schools. The psychopathology model ranked as the best model in terms of predicting violence right after the comprehensive interdisciplinary model. Next came the rational choice and lifestyle model and third the differential association and learning theory model. Other models namely the control theory model, the childhood-trauma model and the social conflict and reaction model turned out to have low sensitivities for predicting violence. Nevertheless, all models produced acceptable results in predictions of a non-violent outcome. Copyright © 2015. Published by Elsevier Ltd.
Rhon, Daniel I; Teyhen, Deydre S; Shaffer, Scott W; Goffar, Stephen L; Kiesel, Kyle; Plisky, Phil P
2018-02-01
Musculoskeletal injuries are a primary source of disability in the US Military, and low back pain and lower extremity injuries account for over 44% of limited work days annually. History of prior musculoskeletal injury increases the risk for future injury. This study aims to determine the risk of injury after returning to work from a previous injury. The objective is to identify criteria that can help predict likelihood for future injury or re-injury. There will be 480 active duty soldiers recruited from across four medical centres. These will be patients who have sustained a musculoskeletal injury in the lower extremity or lumbar/thoracic spine, and have now been cleared to return back to work without any limitations. Subjects will undergo a battery of physical performance tests and fill out sociodemographic surveys. They will be followed for a year to identify any musculoskeletal injuries that occur. Prediction algorithms will be derived using regression analysis from performance and sociodemographic variables found to be significantly different between injured and non-injured subjects. Due to the high rates of injuries, injury prevention and prediction initiatives are growing. This is the first study looking at predicting re-injury rates after an initial musculoskeletal injury. In addition, multivariate prediction models appear to have more value than models based on only one variable. This approach aims to validate a multivariate model used in healthy non-injured individuals to help improve variables that best predict the ability to return to work with lower risk of injury, after a recent musculoskeletal injury. NCT02776930. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
SU-F-R-51: Radiomics in CT Perfusion Maps of Head and Neck Cancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nesteruk, M; Riesterer, O; Veit-Haibach, P
2016-06-15
Purpose: The aim of this study was to test the predictive value of radiomics features of CT perfusion (CTP) for tumor control, based on a preselection of radiomics features in a robustness study. Methods: 11 patients with head and neck cancer (HNC) and 11 patients with lung cancer were included in the robustness study to preselect stable radiomics parameters. Data from 36 HNC patients treated with definitive radiochemotherapy (median follow-up 30 months) was used to build a predictive model based on these parameters. All patients underwent pre-treatment CTP. 315 texture parameters were computed for three perfusion maps: blood volume, blood flow and mean transit time. The variability of texture parameters was tested with respect to non-standardizable perfusion computation factors (noise level and artery contouring) using intraclass correlation coefficients (ICC). The parameter with the highest ICC in the correlated group of parameters (inter-parameter Spearman correlations) was tested for its predictive value. The final model to predict tumor control was built using multivariate Cox regression analysis with backward selection of the variables. For comparison, a predictive model based on tumor volume was created. Results: Ten parameters were found to be stable in both HNC and lung cancer regarding potentially non-standardizable factors after the correction for inter-parameter correlations. In the multivariate backward selection of the variables, blood flow entropy showed a highly significant impact on tumor control (p=0.03) with concordance index (CI) of 0.76. Blood flow entropy was significantly lower in the patient group with controlled tumors at 18 months (p<0.1). The new model showed a higher concordance index compared to the tumor volume model (CI=0.68). Conclusion: The preselection of variables in the robustness study allowed building a predictive radiomics-based model of tumor control in HNC despite a small patient cohort. 
This model was found to be superior to the volume-based model. The project was supported by the KFSP Tumor Oxygenation of the University of Zurich, by a grant of the Center for Clinical Research, University and University Hospital Zurich and by a research grant from Merck (Schweiz) AG.
Predicting frequent emergency department visits among children with asthma using EHR data.
Das, Lala T; Abramson, Erika L; Stone, Anne E; Kondrich, Janienne E; Kern, Lisa M; Grinspan, Zachary M
2017-07-01
For children with asthma, emergency department (ED) visits are common, expensive, and often avoidable. Though several factors are associated with ED use (demographics, comorbidities, insurance, medications), its predictability using electronic health record (EHR) data is understudied. We used a retrospective cohort study design and EHR data from one center to examine the relationship of patient factors in 1 year (2013) and the likelihood of frequent ED use (≥2 visits) in the following year (2014), using bivariate and multivariable statistics. We applied and compared several machine-learning algorithms to predict frequent ED use, then selected a model based on accuracy, parsimony, and interpretability. We identified 2691 children. In bivariate analyses, future frequent ED use was associated with demographics, co-morbidities, insurance status, medication history, and use of healthcare resources. Machine learning algorithms had very good AUC (area under the curve) values [0.66-0.87], though fair PPV (positive predictive value) [48-70%] and poor sensitivity [16-27%]. Our final multivariable logistic regression model contained two variables: insurance status and prior ED use. For publicly insured patients, the odds of frequent ED use were 3.1 [2.2-4.5] times that of privately insured patients. Publicly insured patients with 4+ ED visits and privately insured patients with 6+ ED visits in a year had ≥50% probability of frequent ED use the following year. The model had an AUC of 0.86, PPV of 56%, and sensitivity of 23%. Among children with asthma, prior frequent ED use and insurance status strongly predict future ED use. © 2017 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Ghanate, A. D.; Kothiwale, S.; Singh, S. P.; Bertrand, Dominique; Krishna, C. Murali
2011-02-01
Cancer is now recognized as one of the major causes of morbidity and mortality. Histopathological diagnosis, the gold standard, is shown to be subjective, time consuming, prone to interobserver disagreement, and often fails to predict prognosis. Optical spectroscopic methods are being contemplated as adjuncts or alternatives to conventional cancer diagnostics. The most important aspect of these approaches is their objectivity, and multivariate statistical tools play a major role in realizing it. However, rigorous evaluation of the robustness of spectral models is a prerequisite. The utility of Raman spectroscopy in the diagnosis of cancers has been well established. Until now, the specificity and applicability of spectral models have been evaluated for specific cancer types. In this study, we have evaluated the utility of spectroscopic models representing normal and malignant tissues of the breast, cervix, colon, larynx, and oral cavity in a broader perspective, using different multivariate tests. The limit test, which was used in our earlier study, gave high sensitivity but suffered from poor specificity. The performance of other methods such as factorial discriminant analysis and partial least square discriminant analysis are at par with more complex nonlinear methods such as decision trees, but they provide very little information about the classification model. This comparative study thus demonstrates not just the efficacy of Raman spectroscopic models but also the applicability and limitations of different multivariate tools for discrimination under complex conditions such as the multicancer scenario.
Mund, Marcus; Neyer, Franz J
2016-10-01
Prior research demonstrated influences of personality traits and their development on later status of subjective health and loneliness. In the present study, we intended to extend these findings by examining mutual influences between health-related characteristics and personality traits and their development over time. German adults were assessed at two time points across 15 years (NT1 = 654, NT2 = 271; Mage at Time 1 = 24.39, SD = 3.69). Data were analyzed with multivariate structural equation models and a multivariate latent change model. Neuroticism was found to predict later levels and the development of subjective health and loneliness. While subjective health likewise predicted later levels of Neuroticism, loneliness was found to be predictive of later levels as well as the development of Neuroticism, Extraversion, and Conscientiousness. Correlated changes indicated that developing a socially more desirable personality is associated with slower declines in subjective health and slower increases in loneliness. The findings indicate that characteristics related to an individual's health are reciprocally associated with personality traits. Thus, the study adds to the understanding of the development of personality and health-related characteristics. © 2015 Wiley Periodicals, Inc.
Hendrick, C Emily; Cohen, Alison K; Deardorff, Julianna; Cance, Jessica D
2016-03-01
Lifetime educational attainment is an important predictor of health and well-being for women in the United States. In this study, we examine the roles of sociocultural factors in youth and an understudied biological life event, pubertal timing, in predicting women's lifetime educational attainment. Using data from the National Longitudinal Survey of Youth 1997 cohort (N = 3889), we conducted sequential multivariate linear regression analyses to investigate the influences of macro-level and family-level sociocultural contextual factors in youth (region of country, urbanicity, race/ethnicity, year of birth, household composition, mother's education, and mother's age at first birth) and early menarche, a marker of early pubertal development, on women's educational attainment after age 24. Pubertal timing and all sociocultural factors in youth, other than year of birth, predicted women's lifetime educational attainment in bivariate models. Family factors had the strongest associations. When family factors were added to multivariate models, geographic region in youth, and pubertal timing were no longer significant. Our findings provide additional evidence that family factors should be considered when developing comprehensive and inclusive interventions in childhood and adolescence to promote lifetime educational attainment among girls. © 2016, American School Health Association.
Mammalian cell culture monitoring using in situ spectroscopy: Is your method really optimised?
André, Silvère; Lagresle, Sylvain; Hannas, Zahia; Calvosa, Éric; Duponchel, Ludovic
2017-03-01
In recent years, as a result of the process analytical technology initiative of the US Food and Drug Administration, many different works have been carried out on direct and in situ monitoring of critical parameters for mammalian cell cultures by Raman spectroscopy and multivariate regression techniques. However, despite interesting results, it cannot be said that the proposed monitoring strategies, which will reduce errors of the regression models and thus confidence limits of the predictions, are really optimized. Hence, the aim of this article is to optimize some critical steps of spectroscopic acquisition and data treatment in order to reach a higher level of accuracy and robustness of bioprocess monitoring. In this way, we propose first an original strategy to assess the most suited Raman acquisition time for the processes involved. In a second part, we demonstrate the importance of the interbatch variability on the accuracy of the predictive models with a particular focus on the optical probes adjustment. Finally, we propose a methodology for the optimization of the spectral variables selection in order to decrease prediction errors of multivariate regressions. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 33:308-316, 2017. © 2017 American Institute of Chemical Engineers.
Abrate, Alberto; Lazzeri, Massimo; Lughezzani, Giovanni; Buffi, Nicolòmaria; Bini, Vittorio; Haese, Alexander; de la Taille, Alexandre; McNicholas, Thomas; Redorta, Joan Palou; Gadda, Giulio M; Lista, Giuliana; Kinzikeeva, Ella; Fossati, Nicola; Larcher, Alessandro; Dell'Oglio, Paolo; Mistretta, Francesco; Freschi, Massimo; Guazzoni, Giorgio
2015-04-01
To test serum prostate-specific antigen (PSA) isoform [-2]proPSA (p2PSA), p2PSA/free PSA (%p2PSA) and Prostate Health Index (PHI) accuracy in predicting prostate cancer in obese men and to test whether PHI is more accurate than PSA in predicting prostate cancer in obese patients. The analysis consisted of a nested case-control study from the pro-PSA Multicentric European Study (PROMEtheuS) project. The study is registered at http://www.controlled-trials.com/ISRCTN04707454. The primary outcome was to test sensitivity, specificity and accuracy (clinical validity) of serum p2PSA, %p2PSA and PHI, in determining prostate cancer at prostate biopsy in obese men [body mass index (BMI) ≥30 kg/m(2) ], compared with total PSA (tPSA), free PSA (fPSA) and fPSA/tPSA ratio (%fPSA). The number of avoidable prostate biopsies (clinical utility) was also assessed. Multivariable logistic regression models were complemented by predictive accuracy analysis and decision-curve analysis. Of the 965 patients, 383 (39.7%) were normal weight (BMI <25 kg/m(2) ), 440 (45.6%) were overweight (BMI 25-29.9 kg/m(2) ) and 142 (14.7%) were obese (BMI ≥30 kg/m(2) ). Among obese patients, prostate cancer was found in 65 patients (45.8%), with a higher percentage of Gleason score ≥7 diseases (67.7%). PSA, p2PSA, %p2PSA and PHI were significantly higher, and %fPSA significantly lower in patients with prostate cancer (P < 0.001). In multivariable logistic regression models, PHI significantly increased accuracy of the base multivariable model by 8.8% (P = 0.007). At a PHI threshold of 35.7, 46 (32.4%) biopsies could have been avoided. In obese patients, PHI is significantly more accurate than current tests in predicting prostate cancer. © 2014 The Authors. BJU International © 2014 BJU International.
Ouma, Paul O; Agutu, Nathan O; Snow, Robert W; Noor, Abdisalan M
2017-09-18
Precise quantification of health service utilisation is important for the estimation of disease burden and allocation of health resources. Current approaches to mapping health facility utilisation rely on spatial accessibility alone as the predictor. However, other spatially varying social, demographic and economic factors may affect the use of health services. The exclusion of these factors can lead to the inaccurate estimation of health facility utilisation. Here, we compare the accuracy of a univariate spatial model, developed only from estimated travel time, to a multivariate model that also includes relevant social, demographic and economic factors. A theoretical surface of travel time to the nearest public health facility was developed. Travel times were assigned to each child reported to have had fever in the Kenya demographic and health survey of 2014 (KDHS 2014). The relationship of child treatment seeking for fever with travel time, household and individual factors from the KDHS 2014 was determined using multilevel mixed modelling. Bayesian information criterion (BIC) and likelihood ratio tests (LRT) were carried out to measure how selected factors improve parsimony and goodness of fit of the time model. Using the mixed model, a univariate spatial model of health facility utilisation was fitted using travel time as the predictor. The mixed model was also used to compute a multivariate spatial model of utilisation, using travel time and modelled surfaces of selected household and individual factors as predictors. The univariate and multivariate spatial models were then compared using the area under the receiver operating characteristic curve (AUC) and a percent correct prediction (PCP) test. The best fitting multivariate model had travel time, household wealth index and number of children in household as the predictors. These factors reduced BIC of the time model from 4008 to 2959, a change which was confirmed by the LRT. 
Although there was a high correlation of the two modelled probability surfaces (Adj R² = 88%), the multivariate model had a better AUC than the univariate model (0.83 versus 0.73) and a better PCP (0.61 versus 0.45). Our study shows that a model that uses travel time, as well as household and individual-level socio-demographic factors, results in a more accurate estimation of use of health facilities for the treatment of childhood fever, compared to one that relies on only travel time.
Prediction versus aetiology: common pitfalls and how to avoid them.
van Diepen, Merel; Ramspek, Chava L; Jager, Kitty J; Zoccali, Carmine; Dekker, Friedo W
2017-04-01
Prediction research is a distinct field of epidemiologic research, which should be clearly separated from aetiological research. Both prediction and aetiology make use of multivariable modelling, but the underlying research aim and interpretation of results are very different. Aetiology aims at uncovering the causal effect of a specific risk factor on an outcome, adjusting for confounding factors that are selected based on pre-existing knowledge of causal relations. In contrast, prediction aims at accurately predicting the risk of an outcome using multiple predictors collectively, where the final prediction model is usually based on statistically significant, but not necessarily causal, associations in the data at hand. In both scientific and clinical practice, however, the two are often confused, resulting in poor-quality publications with limited interpretability and applicability. A major problem is the frequently encountered aetiological interpretation of prediction results, where individual variables in a prediction model are attributed causal meaning. This article stresses the differences in use and interpretation of aetiological and prediction studies, and gives examples of common pitfalls. © The Author 2017. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Churchill, Laura; Malian, Samuel J; Chesworth, Bert M; Bryant, Dianne; MacDonald, Steven J; Marsh, Jacquelyn D; Giffin, J Robert
2016-12-01
In previous studies, 50%-70% of patients referred to orthopedic surgeons for total knee replacement (TKR) were not surgical candidates at the time of initial assessment. The purpose of our study was to identify and cross-validate patient self-reported predictors of suitability for TKR and to determine the clinical utility of a predictive model to guide the timing and appropriateness of referral to a surgeon. We assessed pre-consultation patient data as well as the surgeon's findings and post-consultation recommendations. We used multivariate logistic regression to detect self-reported items that could identify suitable surgical candidates. Patients' willingness to undergo surgery, higher rating of pain, greater physical function, previous intra-articular injections and patient age were the factors predictive of patients being offered and electing to undergo TKR. The application of the model developed in our study would effectively reduce the proportion of nonsurgical referrals by 25%, while identifying the vast majority of surgical candidates (> 90%). Using patient-reported information, we can correctly predict the outcome of specialist consultation for TKR in 70% of cases. To reduce long waits for first consultation with a surgeon, it may be possible to use these items to educate and guide referring clinicians and patients to understand when specialist consultation is the next step in managing the patient with severe osteoarthritis of the knee.
Churchill, Laura; Malian, Samuel J.; Chesworth, Bert M.; Bryant, Dianne; MacDonald, Steven J.; Marsh, Jacquelyn D.; Giffin, J. Robert
2016-01-01
Background In previous studies, 50%–70% of patients referred to orthopedic surgeons for total knee replacement (TKR) were not surgical candidates at the time of initial assessment. The purpose of our study was to identify and cross-validate patient self-reported predictors of suitability for TKR and to determine the clinical utility of a predictive model to guide the timing and appropriateness of referral to a surgeon. Methods We assessed pre-consultation patient data as well as the surgeon’s findings and post-consultation recommendations. We used multivariate logistic regression to detect self-reported items that could identify suitable surgical candidates. Results Patients’ willingness to undergo surgery, higher rating of pain, greater physical function, previous intra-articular injections and patient age were the factors predictive of patients being offered and electing to undergo TKR. Conclusion The application of the model developed in our study would effectively reduce the proportion of nonsurgical referrals by 25%, while identifying the vast majority of surgical candidates (> 90%). Using patient-reported information, we can correctly predict the outcome of specialist consultation for TKR in 70% of cases. To reduce long waits for first consultation with a surgeon, it may be possible to use these items to educate and guide referring clinicians and patients to understand when specialist consultation is the next step in managing the patient with severe osteoarthritis of the knee. PMID:28234616
Diagnosing perforated appendicitis in pediatric patients: a new model.
van den Bogaard, Veerle A B; Euser, Sjoerd M; van der Ploeg, Tjeerd; de Korte, Niels; Sanders, Dave G M; de Winter, Derek; Vergroesen, Diederik; van Groningen, Krijn; de Winter, Peter
2016-03-01
Studies have investigated sensitivity and specificity of symptoms and tests for diagnosing appendicitis in children. Less is known with regard to the predictive value of these symptoms and tests with respect to the severity of appendicitis. The aim of this study was to determine the predictive value of patient's characteristics and tests for discriminating between perforated and nonperforated appendicitis in children. Pediatric patients who underwent an appendectomy at Spaarne Hospital Hoofddorp, the Netherlands, between January 1, 2009 and December 31, 2013, were included. Baseline patient's characteristics, history, physical examination, laboratory data and results of ultrasounds were collected. Univariate and multivariate logistic regressions were used to determine predictors of perforation. In total, 375 patients were included in this study of which 97 children (25.9%) had significant signs of perforation. Univariate analysis showed that age, duration of complaints, temperature, vomiting, CRP, WBC, different findings on ultrasound and the diameter of the appendix were good predictors of a perforated appendicitis. The final multivariate prediction model included temperature, CRP, clearly visible appendix and free fluids on ultrasound and diameter of the appendix and resulted in an area under the curve (AUC) of 0.91 showing sensitivity and specificity of respectively 85.2% and 81.2%. This prediction model can be used for identification of 'high-risk' children for a perforated appendicitis and might be helpful to prevent complications and longer hospitalization by bringing these children to theater earlier. Copyright © 2016 Elsevier Inc. All rights reserved.
Outcomes of Extremely Low Birth Weight Infants with Acidosis at Birth
Randolph, David A.; Nolen, Tracy L.; Ambalavanan, Namasivayam; Carlo, Waldemar A.; Peralta-Carcelen, Myriam; Das, Abhik; Bell, Edward F.; Davis, Alexis S.; Laptook, Abbot R.; Stoll, Barbara J.; Shankaran, Seetha; Higgins, Rosemary D.
2014-01-01
OBJECTIVES To test the hypothesis that acidosis at birth is associated with the combined primary outcome of death or neurodevelopmental impairment (NDI) in extremely low birth weight (ELBW) infants, and to develop a predictive model of death/NDI exploring perinatal acidosis as a predictor variable. STUDY DESIGN The study population consisted of ELBW infants born between 2002-2007 at NICHD Neonatal Research Network hospitals. Infants with cord blood gas data and documentation of either mortality prior to discharge or 18-22 month neurodevelopmental outcomes were included. Multiple logistic regression analysis was used to determine the contribution of perinatal acidosis, defined as a cord blood gas with a pH<7 or base excess (BE)<-12, to death/NDI in ELBW infants. In addition, a multivariable model predicting death/NDI was developed. RESULTS 3979 patients were identified of whom 249 had a cord gas pH<7 or BE<-12 mEq/L. 2124 patients (53%) had the primary outcome of death/NDI. After adjustment for confounding variables, pH<7 and BE<-12 mEq/L were each significantly associated with death/NDI (OR=2.5[1.6,4.2]; and OR=1.5[1.1,2.0], respectively). However, inclusion of pH or BE did not improve the ability of the multivariable model to predict death/NDI. CONCLUSIONS Perinatal acidosis is significantly associated with death/NDI in ELBW infants. Perinatal acidosis is infrequent in ELBW infants, however, and other factors are more important in predicting death/NDI. PMID:24554564
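As a worked illustration of how the reported odds ratios relate to statistical significance, the sketch below recovers an approximate Wald z statistic and two-sided p value from an odds ratio and its confidence interval. It assumes the bracketed intervals are 95% intervals built on the log-odds scale with a normal approximation; the abstract does not state this, so the results are only approximate. The function name `wald_from_ci` is introduced here for illustration.

```python
import math


def wald_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate Wald z and two-sided p from an OR and its (assumed 95%) CI."""
    log_or = math.log(odds_ratio)
    # On the log scale the CI is log(OR) +/- z_crit * SE, so its width gives the SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Odds ratios as reported in the abstract: pH < 7 and BE < -12 mEq/L.
for label, (o, lo, hi) in {"pH<7": (2.5, 1.6, 4.2), "BE<-12": (1.5, 1.1, 2.0)}.items():
    z, p = wald_from_ci(o, lo, hi)
    print(f"{label}: z = {z:.2f}, p = {p:.4f}")
```

An interval that excludes 1 corresponds to |z| above the critical value, which is consistent with both associations being described as significant.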
Parent Influences on Early Childhood Internalizing Difficulties
ERIC Educational Resources Information Center
Bayer, Jordana, K.; Sanson, Ann, V.; Hemphill, Sheryl A.
2006-01-01
Children's internalizing problems are a concerning mental health issue, due to significant prevalence and continuity over time. This study tested a multivariate model predicting young children's internalizing behaviors from parenting practices, parents' anxiety-depression and family stressors. A community sample of 2 year old children (N=112) was…
Dunn, Adam G; Surian, Didi; Leask, Julie; Dey, Aditi; Mandl, Kenneth D; Coiera, Enrico
2017-05-25
Together with access, acceptance of vaccines affects human papillomavirus (HPV) vaccine coverage, yet little is known about media's role. Our aim was to determine whether measures of information exposure derived from Twitter could be used to explain differences in coverage in the United States. We conducted an analysis of exposure to information about HPV vaccines on Twitter, derived from 273.8 million exposures to 258,418 tweets posted between 1 October 2013 and 30 October 2015. Tweets were classified by topic using machine learning methods. Proportional exposure to each topic was used to construct multivariable models for predicting state-level HPV vaccine coverage, and compared to multivariable models constructed using socioeconomic factors: poverty, education, and insurance. Outcome measures included correlations between coverage and the individual topics and socioeconomic factors; and differences in the predictive performance of the multivariable models. Topics corresponding to media controversies were most closely correlated with coverage (both positively and negatively); education and insurance were highest among socioeconomic indicators. Measures of information exposure explained 68% of the variance in one dose 2015 HPV vaccine coverage in females (males: 63%). In comparison, models based on socioeconomic factors explained 42% of the variance in females (males: 40%). Measures of information exposure derived from Twitter explained differences in coverage that were not explained by socioeconomic factors. Vaccine coverage was lower in states where safety concerns, misinformation, and conspiracies made up higher proportions of exposures, suggesting that negative representations of vaccines in the media may reflect or influence vaccine acceptance. Copyright © 2017 The Author(s). Published by Elsevier Ltd.. All rights reserved.
A multivariate test of disease risk reveals conditions leading to disease amplification.
Halliday, Fletcher W; Heckman, Robert W; Wilfahrt, Peter A; Mitchell, Charles E
2017-10-25
Theory predicts that increasing biodiversity will dilute the risk of infectious diseases under certain conditions and will amplify disease risk under others. Yet, few empirical studies demonstrate amplification. This contrast may occur because few studies have considered the multivariate nature of disease risk, which includes richness and abundance of parasites with different transmission modes. By combining a multivariate statistical model developed for biodiversity-ecosystem-multifunctionality with an extensive field manipulation of host (plant) richness, composition and resource supply to hosts, we reveal that (i) host richness alone could not explain most changes in disease risk, and (ii) shifting host composition allowed disease amplification, depending on parasite transmission mode. Specifically, as predicted from theory, the effect of host diversity on parasite abundance differed for microbes (more density-dependent transmission) and insects (more frequency-dependent transmission). Host diversity did not influence microbial parasite abundance, but nearly doubled insect parasite abundance, and this amplification effect was attributable to variation in host composition. Parasite richness was reduced by resource addition, but only in species-rich host communities. Overall, this study demonstrates that multiple drivers, related to both host community and parasite characteristics, can influence disease risk. Furthermore, it provides a framework for evaluating multivariate disease risk in other systems. © 2017 The Author(s).
Walling, Craig A; Morrissey, Michael B; Foerster, Katharina; Clutton-Brock, Tim H; Pemberton, Josephine M; Kruuk, Loeske E B
2014-12-01
Evolutionary theory predicts that genetic constraints should be widespread, but empirical support for their existence is surprisingly rare. Commonly applied univariate and bivariate approaches to detecting genetic constraints can underestimate their prevalence, with important aspects potentially tractable only within a multivariate framework. However, multivariate genetic analyses of data from natural populations are challenging because of modest sample sizes, incomplete pedigrees, and missing data. Here we present results from a study of a comprehensive set of life history traits (juvenile survival, age at first breeding, annual fecundity, and longevity) for both males and females in a wild, pedigreed, population of red deer (Cervus elaphus). We use factor analytic modeling of the genetic variance-covariance matrix (G) to reduce the dimensionality of the problem and take a multivariate approach to estimating genetic constraints. We consider a range of metrics designed to assess the effect of G on the deflection of a predicted response to selection away from the direction of fastest adaptation and on the evolvability of the traits. We found limited support for genetic constraint through genetic covariances between traits, both within sex and between sexes. We discuss these results with respect to other recent findings and to the problems of estimating these parameters for natural populations. Copyright © 2014 Walling et al.
Walling, Craig A.; Morrissey, Michael B.; Foerster, Katharina; Clutton-Brock, Tim H.; Pemberton, Josephine M.; Kruuk, Loeske E. B.
2014-01-01
Evolutionary theory predicts that genetic constraints should be widespread, but empirical support for their existence is surprisingly rare. Commonly applied univariate and bivariate approaches to detecting genetic constraints can underestimate their prevalence, with important aspects potentially tractable only within a multivariate framework. However, multivariate genetic analyses of data from natural populations are challenging because of modest sample sizes, incomplete pedigrees, and missing data. Here we present results from a study of a comprehensive set of life history traits (juvenile survival, age at first breeding, annual fecundity, and longevity) for both males and females in a wild, pedigreed, population of red deer (Cervus elaphus). We use factor analytic modeling of the genetic variance–covariance matrix (G) to reduce the dimensionality of the problem and take a multivariate approach to estimating genetic constraints. We consider a range of metrics designed to assess the effect of G on the deflection of a predicted response to selection away from the direction of fastest adaptation and on the evolvability of the traits. We found limited support for genetic constraint through genetic covariances between traits, both within sex and between sexes. We discuss these results with respect to other recent findings and to the problems of estimating these parameters for natural populations. PMID:25278555
Yang, Rongbing; Nam, Kihoon; Kim, Sung Wan; Turkson, James; Zou, Ye; Zuo, Yi Y; Haware, Rahul V; Chougule, Mahavir B
2017-01-03
The desired characteristics of nanocarriers are crucial for exploring their therapeutic potential. This investigation aimed to develop tunable, bioresponsive nanocarriers based on a newly synthesized arginine-grafted poly(cystaminebis(acrylamide)-diaminohexane) [ABP] polymeric matrix, using an L9 Taguchi factorial design, a desirability function, and a multivariate method. The selected formulation and process parameters were ABP concentration, acetone concentration, the volume ratio of acetone to ABP solution, and drug concentration. The measured nanocarrier characteristics were particle size, polydispersity index, zeta potential, and percentage drug loading. Experimental validation of the nanocarrier characteristics computed from the initially developed predictive model showed nonsignificant differences (p > 0.05). The optimized cationic nanocarrier formulation (<100 nm), obtained by multivariate modeling and loaded with hydrophilic acetaminophen, was readapted for loading of hydrophobic etoposide without significant changes (p > 0.05) except for an improved loading percentage. This is the first study focusing on ABP polymeric matrix based nanocarrier development. Nanocarrier particle size was stable in PBS 7.4 for 48 h. The increase in zeta potential at the lower pH of 6.4, compared to physiological pH, indicated possible endosomal escape capability. The glutathione-triggered release under physiological conditions indicated that the bioresponsive nanocarriers are capable of cytosol-targeted delivery of the loaded drug. In conclusion, this systematic approach provides rational evaluation and prediction of a tunable bioresponsive ABP based matrix nanocarrier, built on a limited number of selected experiments.
Barimani, Shirin; Kleinebudde, Peter
2017-10-01
A multivariate analysis method, Science-Based Calibration (SBC), was used for the first time for endpoint determination of a tablet coating process using Raman data. Two types of tablet cores, placebo and caffeine cores, received a coating suspension comprising a polyvinyl alcohol-polyethylene glycol graft-copolymer and titanium dioxide to a maximum coating thickness of 80 µm. Raman spectroscopy was used as an in-line PAT tool. The spectra were acquired every minute and correlated to the amount of applied aqueous coating suspension. SBC was compared to another well-known multivariate analysis method, Partial Least Squares-regression (PLS), and a simpler approach, Univariate Data Analysis (UVDA). All developed calibration models had coefficient of determination values (R²) higher than 0.99. The coating endpoints could be predicted with root mean square errors (RMSEP) less than 3.1% of the applied coating suspensions. Compared to PLS and UVDA, SBC proved to be an alternative multivariate calibration method with high predictive power. Copyright © 2017 Elsevier B.V. All rights reserved.
Shao, Q; Rowe, R C; York, P
2007-06-01
This study has investigated an artificial intelligence technology - model trees - as a modelling tool applied to an immediate release tablet formulation database. The modelling performance was compared with artificial neural networks that have been well established and widely applied in the pharmaceutical product formulation fields. The predictability of generated models was validated on unseen data and judged by the coefficient of determination R². Output from the model tree analyses produced multivariate linear equations which predicted tablet tensile strength, disintegration time, and drug dissolution profiles of similar quality to neural network models. However, additional and valuable knowledge hidden in the formulation database was extracted from these equations. It is concluded that, as a transparent technology, model trees are useful tools to formulators.
Muratov, Eugene; Lewis, Margaret; Fourches, Denis; Tropsha, Alexander; Cox, Wendy C
2017-04-01
Objective. To develop predictive computational models forecasting the academic performance of students in the didactic-rich portion of a doctor of pharmacy (PharmD) curriculum as admission-assisting tools. Methods. All PharmD candidates over three admission cycles were divided into two groups: those who completed the PharmD program with a GPA ≥ 3; and the remaining candidates. The Random Forest machine learning technique was used to develop a binary classification model based on 11 pre-admission parameters. Results. Robust and externally predictive models were developed that had particularly high overall accuracy of 77% for candidates with high or low academic performance. These multivariate models were highly accurate in predicting these groups compared to those obtained using undergraduate GPA and composite PCAT scores only. Conclusion. The models developed in this study can be used to improve the admission process as preliminary filters and thus quickly identify candidates who are likely to be successful in the PharmD curriculum.
Matsen, Frederick A; Russ, Stacy M; Vu, Phuong T; Hsu, Jason E; Lucas, Robert M; Comstock, Bryan A
2016-11-01
Although shoulder arthroplasties generally are effective in improving patients' comfort and function, the results are variable for reasons that are not well understood. We posed two questions: (1) What factors are associated with better 2-year outcomes after shoulder arthroplasty? (2) What are the sensitivities, specificities, and positive and negative predictive values of a multivariate predictive model for better outcome? Three hundred thirty-nine patients having a shoulder arthroplasty (hemiarthroplasty, arthroplasty for cuff tear arthropathy, ream and run arthroplasty, total shoulder or reverse total shoulder arthroplasty) between August 24, 2010 and December 31, 2012 consented to participate in this prospective study. Two patients were excluded because they were missing baseline variables. Forty-three patients were missing 2-year data. Univariate and multivariate analyses determined the relationship of baseline patient, shoulder, and surgical characteristics to a "better" outcome, defined as an improvement of at least 30% of the maximal possible improvement in the Simple Shoulder Test. The results were used to develop a predictive model, the accuracy of which was tested using a 10-fold cross-validation. After controlling for potentially relevant confounding variables, the multivariate analysis showed that the factors significantly associated with better outcomes were American Society of Anesthesiologists Class I (odds ratio [OR], 1.94; 95% CI, 1.03-3.65; p = 0.041), shoulder problem not related to work (OR, 5.36; 95% CI, 2.15-13.37; p < 0.001), lower baseline Simple Shoulder Test score (OR, 1.32; 95% CI, 1.23-1.42; p < 0.001), no prior shoulder surgery (OR, 1.79; 95% CI, 1.18-2.70; p = 0.006), humeral head not superiorly displaced on the AP radiograph (OR, 2.14; 95% CI, 1.15-4.02; p = 0.017), and glenoid type other than A1 (OR, 4.47; 95% CI, 2.24-8.94; p < 0.001). 
Neither preoperative glenoid version nor posterior decentering of the humeral head on the glenoid were associated with the outcomes. The model predictive of a better result was driven mainly by the six factors listed above. The area under the receiver operating characteristic curve generated from the cross-validated enhanced predictive model was 0.79 (generally values of 0.7 to 0.8 are considered fair and values of 0.8 to 0.9 are considered good). The false-positive fraction and the true-positive fraction depended on the cutoff probability selected (ie, the selected probability above which the prediction would be classified as a better outcome). A cutoff probability of 0.68 yielded the best performance of the model with cross-validation predictions of better outcomes for 236 patients (80%) and worse outcomes for 58 patients (20%); sensitivity of 91% (95% CI, 88%-95%); specificity of 65% (95% CI, 53%-77%); positive predictive value of 92% (95% CI, 88%-95%); and negative predictive value of 64% (95% CI, 51%-76%). We found six easy-to-determine preoperative patient and shoulder factors that were significantly associated with better outcomes of shoulder arthroplasty. A model based on these characteristics had good predictive properties for identifying patients likely to have a better outcome from shoulder arthroplasty. Future research could refine this model with larger patient populations from multiple practices. Level II, therapeutic study.
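The four accuracy measures reported at the 0.68 cutoff all derive from a single 2×2 table. The sketch below shows the arithmetic. The cell counts are hypothetical: the abstract reports only the totals (236 predicted better, 58 predicted worse) and the percentages, so the counts were chosen here as one assignment that reproduces those rounded figures and should not be read as the study's actual table.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among actual better outcomes
        "specificity": tn / (tn + fp),  # true negatives among actual worse outcomes
        "ppv": tp / (tp + fp),          # correct among predicted better outcomes
        "npv": tn / (tn + fn),          # correct among predicted worse outcomes
    }


# Hypothetical counts: tp + fp = 236 predicted better, tn + fn = 58 predicted worse.
metrics = diagnostic_metrics(tp=216, fp=20, tn=37, fn=21)
rounded = {name: round(100 * value) for name, value in metrics.items()}
print(rounded)
```

Raising the cutoff probability moves cases from the predicted-better to the predicted-worse row, which generally trades sensitivity for specificity; that is the dependence on cutoff the abstract describes.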
Characterization of Used Nuclear Fuel with Multivariate Analysis for Process Monitoring
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dayman, Kenneth J.; Coble, Jamie B.; Orton, Christopher R.
2014-01-01
The Multi-Isotope Process (MIP) Monitor combines gamma spectroscopy and multivariate analysis to detect anomalies in various process streams in a nuclear fuel reprocessing system. Measured spectra are compared to models of nominal behavior at each measurement location to detect unexpected changes in system behavior. In order to improve the accuracy and specificity of process monitoring, fuel characterization may be used to more accurately train subsequent models in a full analysis scheme. This paper presents initial development of a reactor-type classifier that is used to select a reactor-specific partial least squares model to predict fuel burnup. Nuclide activities for prototypic used fuel samples were generated in ORIGEN-ARP and used to investigate techniques to characterize used nuclear fuel in terms of reactor type (pressurized or boiling water reactor) and burnup. A variety of reactor type classification algorithms, including k-nearest neighbors, linear and quadratic discriminant analyses, and support vector machines, were evaluated to differentiate used fuel from pressurized and boiling water reactors. Then, reactor type-specific partial least squares models were developed to predict the burnup of the fuel. Using these reactor type-specific models instead of a model trained for all light water reactors improved the accuracy of burnup predictions. The developed classification and prediction models were combined and applied to a large dataset that included eight fuel assembly designs, two of which were not used in training the models, and spanned the range of the initial 235U enrichment, cooling time, and burnup values expected of future commercial used fuel for reprocessing. Error rates were consistent across the range of considered enrichment, cooling time, and burnup values. Average absolute relative errors in burnup predictions for validation data both within and outside the training space were 0.0574% and 0.0597%, respectively.
The errors seen in this work are artificially low, because the models were trained, optimized, and tested on simulated, noise-free data. However, these results indicate that the developed models may generalize well to new data and that the proposed approach constitutes a viable first step in developing a fuel characterization algorithm based on gamma spectra.
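The two-stage scheme described here (classify the reactor type, then apply a type-specific burnup model, and score predictions by average absolute relative error) can be sketched as follows. Everything concrete in the sketch is an assumption made for illustration: the two-element feature vectors, the training points, and the fixed linear functions, which merely stand in for the partial least squares models that the paper actually uses. Only the 1-nearest-neighbour classifier corresponds to one of the algorithm families named in the abstract.

```python
def nearest_label(x, training):
    """1-nearest-neighbour: return the label of the closest training point."""
    def squared_distance(item):
        features, _ = item
        return sum((a - b) ** 2 for a, b in zip(x, features))
    return min(training, key=squared_distance)[1]


# Hypothetical two-feature "spectra" labelled by reactor type.
TRAINING = [((1.0, 0.2), "PWR"), ((0.9, 0.3), "PWR"),
            ((0.2, 1.0), "BWR"), ((0.3, 0.9), "BWR")]

# Placeholder linear models standing in for reactor-specific PLS models.
BURNUP_MODELS = {
    "PWR": lambda x: 10.0 + 40.0 * x[0],
    "BWR": lambda x: 5.0 + 45.0 * x[1],
}


def predict_burnup(x):
    """Stage 1: classify reactor type. Stage 2: apply that type's burnup model."""
    reactor = nearest_label(x, TRAINING)
    return reactor, BURNUP_MODELS[reactor](x)


def mean_abs_rel_error_pct(predicted, true):
    """Average absolute relative error, expressed as a percentage."""
    return 100.0 * sum(abs(p - t) / t for p, t in zip(predicted, true)) / len(true)


print(predict_burnup((0.95, 0.25)))
print(mean_abs_rel_error_pct([101.0, 99.0], [100.0, 100.0]))
```

Routing each sample to a model trained on its own reactor type lets each regression fit a narrower range of behaviour, which is the reason the abstract gives for the improved accuracy over a single light-water-reactor model.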
Rupert, Michael G.
2003-01-01
Draft Federal regulations may require that each State develop a State Pesticide Management Plan for the herbicides atrazine, alachlor, metolachlor, and simazine. Maps were developed that the State of Colorado could use to predict the probability of detecting atrazine and desethyl-atrazine (a breakdown product of atrazine) in ground water in Colorado. These maps can be incorporated into the State Pesticide Management Plan and can help provide a sound hydrogeologic basis for atrazine management in Colorado. Maps showing the probability of detecting elevated nitrite plus nitrate as nitrogen (nitrate) concentrations in ground water in Colorado also were developed because nitrate is a contaminant of concern in many areas of Colorado. Maps showing the probability of detecting atrazine and(or) desethyl-atrazine (atrazine/DEA) at or greater than concentrations of 0.1 microgram per liter and nitrate concentrations in ground water greater than 5 milligrams per liter were developed as follows: (1) Ground-water quality data were overlaid with anthropogenic and hydrogeologic data using a geographic information system to produce a data set in which each well had corresponding data on atrazine use, fertilizer use, geology, hydrogeomorphic regions, land cover, precipitation, soils, and well construction. These data then were downloaded to a statistical software package for analysis by logistic regression. (2) Relations were observed between ground-water quality and the percentage of land-cover categories within circular regions (buffers) around wells. Several buffer sizes were evaluated; the buffer size that provided the strongest relation was selected for use in the logistic regression models. 
(3) Relations between concentrations of atrazine/DEA and nitrate in ground water and atrazine use, fertilizer use, geology, hydrogeomorphic regions, land cover, precipitation, soils, and well-construction data were evaluated, and several preliminary multivariate models with various combinations of independent variables were constructed. (4) The multivariate models that best predicted the presence of atrazine/DEA and elevated concentrations of nitrate in ground water were selected. (5) The accuracy of the multivariate models was confirmed by validating the models with an independent set of ground-water quality data. (6) The multivariate models were entered into a geographic information system and the probability maps were constructed.
Broyles, Lauren Matukaitis; Gordon, Adam J; Sereika, Susan M; Ryan, Christopher M; Erlen, Judith A
2011-10-01
Alcohol use negatively affects adherence to antiretroviral therapy (ART), thus human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS) care providers need accurate, efficient assessments of alcohol use. Using existing data from an efficacy trial of 2 cognitive-behavioral ART adherence interventions, the authors sought to determine if results on 2 common alcohol screening tests (Alcohol Use Disorders Identification Test--Consumption [AUDIT-C] and its binge-related question [AUDIT-3]) predict ART nonadherence. Twenty-seven percent of the sample (n = 308) were positive on the AUDIT-C and 34% were positive on the AUDIT-3. In multivariate analyses, AUDIT-C-positive status predicted ART nonadherence after controlling for race, age, conscientiousness, and self-efficacy (P = .036). Although AUDIT-3-positive status was associated with ART nonadherence in unadjusted analyses, this relationship was not maintained in the final multivariate model. The AUDIT-C shows potential as an indirect screening tool for both at-risk drinking and ART nonadherence, underscoring the relationship between alcohol and chronic disease management.
Predicting major element mineral/melt equilibria - A statistical approach
NASA Technical Reports Server (NTRS)
Hostetler, C. J.; Drake, M. J.
1980-01-01
Empirical equations have been developed for calculating the mole fractions of NaO0.5, MgO, AlO1.5, SiO2, KO0.5, CaO, TiO2, and FeO in a solid phase of initially unknown identity given only the composition of the coexisting silicate melt. The approach involves a linear multivariate regression analysis in which solid composition is expressed as a Taylor series expansion of the liquid compositions. An internally consistent precision of approximately 0.94 is obtained, that is, the nature of the liquidus phase in the input data set can be correctly predicted for approximately 94% of the entries. The composition of the liquidus phase may be calculated to better than 5 mol % absolute. An important feature of this 'generalized solid' model is its reversibility; that is, the dependent and independent variables in the linear multivariate regression may be inverted to permit prediction of the composition of a silicate liquid produced by equilibrium partial melting of a polymineralic source assemblage.
Predictive Utility of Brief AUDIT for HIV Antiretroviral Medication Nonadherence
Broyles, Lauren Matukaitis; Gordon, Adam J.; Sereika, Susan M.; Ryan, Christopher M.; Erlen, Judith A.
2012-01-01
Alcohol use negatively affects adherence to antiretroviral therapy (ART), thus HIV/AIDS providers need accurate, efficient assessments of alcohol use. Using existing data from an efficacy trial of two cognitive-behavioral ART adherence interventions, we sought to determine if results on two common alcohol screening tests (Alcohol Use Disorders Identification Test—Consumption (AUDIT-C) and its binge-related question (AUDIT-3)) predict ART nonadherence. Twenty seven percent of the sample (n=308) were positive on the AUDIT-C and 34% were positive on the AUDIT-3. In multivariate analyses, AUDIT-C positive status predicted ART nonadherence after controlling for race, age, conscientiousness, and self-efficacy (p=.036). While AUDIT-3 positive status was associated with ART nonadherence in unadjusted analyses, this relationship was not maintained in the final multivariate model. The AUDIT-C shows potential as an indirect screening tool for both at-risk drinking and ART nonadherence, underscoring the relationship between alcohol and chronic disease management. PMID:22014256
D'Ovidio, Valeria; Meo, Donatella; Viscido, Angelo; Bresci, Giampaolo; Vernia, Piero; Caprilli, Renzo
2011-01-01
AIM: To identify factors predicting the clinical response of ulcerative colitis patients to granulocyte-monocyte apheresis (GMA). METHODS: Sixty-nine ulcerative colitis patients (39 F, 30 M) dependent upon/refractory to steroids were treated with GMA. Steroid dependency, clinical activity index (CAI), C reactive protein (CRP) level, erythrocyte sedimentation rate (ESR), values at baseline, use of immunosuppressant, duration of disease, and age and extent of disease were considered for statistical analysis as predictive factors of clinical response. Univariate and multivariate logistic regression models were used. RESULTS: In the univariate analysis, CAI (P = 0.039) and ESR (P = 0.017) levels at baseline were singled out as predictive of clinical remission. In the multivariate analysis steroid dependency [Odds ratio (OR) = 0.390, 95% Confidence interval (CI): 0.176-0.865, Wald 5.361, P = 0.0160] and low CAI levels at baseline (4 < CAI < 7) (OR = 0.770, 95% CI: 0.425-1.394, Wald 3.747, P = 0.028) proved to be effective as factors predicting clinical response. CONCLUSION: GMA may be a valid therapeutic option for steroid-dependent ulcerative colitis patients with mild-moderate disease and its clinical efficacy seems to persist for 12 mo. PMID:21528055
Individualized Prediction of Reading Comprehension Ability Using Gray Matter Volume.
Cui, Zaixu; Su, Mengmeng; Li, Liangjie; Shu, Hua; Gong, Gaolang
2018-05-01
Reading comprehension is a crucial reading skill for learning and putatively contains 2 key components: reading decoding and linguistic comprehension. Current understanding of the neural mechanism underlying these reading comprehension components is lacking, and whether and how neuroanatomical features can be used to predict these 2 skills remain largely unexplored. In the present study, we analyzed a large sample from the Human Connectome Project (HCP) dataset and successfully built multivariate predictive models for these 2 skills using whole-brain gray matter volume features. The results showed that these models effectively captured individual differences in these 2 skills and were able to significantly predict these components of reading comprehension for unseen individuals. The strict cross-validation using the HCP cohort and another independent cohort of children demonstrated the model generalizability. The identified gray matter regions contributing to the skill prediction consisted of a wide range of regions covering the putative reading, cerebellum, and subcortical systems. Interestingly, there were gender differences in the predictive models, with the female-specific model overestimating the males' abilities. Moreover, the identified contributing gray matter regions for the female-specific and male-specific models exhibited considerable differences, supporting a gender-dependent neuroanatomical substrate for reading comprehension.
Hermes, Ilarraza-Lomelí; Marianna, García-Saldivia; Jessica, Rojano-Castillo; Carlos, Barrera-Ramírez; Rafael, Chávez-Domínguez; María Dolores, Rius-Suárez; Pedro, Iturralde
2016-10-01
Mortality due to cardiovascular disease is often associated with ventricular arrhythmias. Nowadays, patients with cardiovascular disease are increasingly encouraged to take part in physical training programs. Nevertheless, high-intensity exercise is associated with a higher risk of sudden death, even in apparently healthy people. During exercise testing (ET), health care professionals provide patients, in a controlled scenario, an intense physiological stimulus that could precipitate cardiac arrhythmia in high-risk individuals. There is still no clinical or statistical tool to predict this incidence. The aim of this study was to develop a statistical model to predict the incidence of exercise-induced potentially life-threatening ventricular arrhythmia (PLVA) during high-intensity exercise. A total of 6415 patients underwent a symptom-limited ET with a Balke ramp protocol. A multivariate logistic regression model was fitted with PLVA as the primary outcome. Incidence of PLVA was 548 cases (8.5%). In bivariate analysis, thirty-one clinical or ergometric variables were statistically associated with PLVA and were included in the regression model. In the multivariate model, 13 of these variables were found to be statistically significant. A regression model (G) with a χ² of 283.987 and p<0.001 was constructed. Significant variables included: heart failure, antiarrhythmic drugs, myocardial lower-VD, age and use of digoxin, nitrates, among others. This study allows clinicians to identify patients at risk of ventricular tachycardia or couplets during exercise, and to take preventive measures or provide appropriate supervision. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
An Applet to Estimate the IOP-Induced Stress and Strain within the Optic Nerve Head
2011-01-01
Purpose. The ability to predict the biomechanical response of the optic nerve head (ONH) to intraocular pressure (IOP) elevation holds great promise, yet remains elusive. The objective of this work was to introduce an approach to model ONH biomechanics that combines the ease of use and speed of analytical models with the flexibility and power of numerical models. Methods. Models representing a variety of ONHs were produced, and finite element (FE) techniques used to predict the stresses (forces) and strains (relative deformations) induced on each of the models by IOP elevations (up to 10 mm Hg). Multivariate regression was used to parameterize each biomechanical response as an analytical function. These functions were encoded into a Flash-based applet. Applet utility was demonstrated by investigating hypotheses concerning ONH biomechanics posited in the literature. Results. All responses were parameterized well by polynomials (R2 values between 0.985 and 0.999), demonstrating the effectiveness of our fitting approach. Previously published univariate results were reproduced with the applet in seconds. A few minutes allowed for multivariate analysis, with which it was predicted that often, but not always, larger eyes experience higher levels of stress and strain than smaller ones, even at the same IOP. Conclusions. An applet has been presented with which it is simple to make rapid estimates of IOP-related ONH biomechanics. The applet represents a step toward bringing the power of FE modeling beyond the specialized laboratory and can thus help develop more refined biomechanics-based hypotheses. The applet is available for use at www.ocularbiomechanics.com. PMID:21527378
A model of the human observer and decision maker
NASA Technical Reports Server (NTRS)
Wewerinke, P. H.
1981-01-01
The decision process is described in terms of classical sequential decision theory by considering the hypothesis that an abnormal condition has occurred by means of a generalized likelihood ratio test. For this, a sufficient statistic is provided by the innovation sequence, which is the result of the perception and information processing submodel of the human observer. On the basis of only two model parameters, the model predicts the decision speed/accuracy trade-off and various attentional characteristics. A preliminary test of the model for single variable failure detection tasks resulted in a very good fit of the experimental data. In a formal validation program, a variety of multivariable failure detection tasks was investigated and the predictive capability of the model was demonstrated.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tucker, Susan L., E-mail: sltucker@mdanderson.org; Li Minghuan; Xu Ting
2013-01-01
Purpose: To determine whether single-nucleotide polymorphisms (SNPs) in genes associated with DNA repair, cell cycle, transforming growth factor-β, tumor necrosis factor and receptor, folic acid metabolism, and angiogenesis can significantly improve the fit of the Lyman-Kutcher-Burman (LKB) normal-tissue complication probability (NTCP) model of radiation pneumonitis (RP) risk among patients with non-small cell lung cancer (NSCLC). Methods and Materials: Sixteen SNPs from 10 different genes (XRCC1, XRCC3, APEX1, MDM2, TGFβ, TNFα, TNFR, MTHFR, MTRR, and VEGF) were genotyped in 141 NSCLC patients treated with definitive radiation therapy, with or without chemotherapy. The LKB model was used to estimate the risk of severe (grade ≥3) RP as a function of mean lung dose (MLD), with SNPs and patient smoking status incorporated into the model as dose-modifying factors. Multivariate analyses were performed by adding significant factors to the MLD model in a forward stepwise procedure, with significance assessed using the likelihood-ratio test. Bootstrap analyses were used to assess the reproducibility of results under variations in the data. Results: Five SNPs were selected for inclusion in the multivariate NTCP model based on MLD alone. SNPs associated with an increased risk of severe RP were in genes for TGFβ, VEGF, TNFα, XRCC1 and APEX1. With smoking status included in the multivariate model, the SNPs significantly associated with increased risk of RP were in genes for TGFβ, VEGF, and XRCC3. Bootstrap analyses selected a median of 4 SNPs per model fit, with the 6 genes listed above selected most often. Conclusions: This study provides evidence that SNPs can significantly improve the predictive ability of the Lyman MLD model. With a small number of SNPs, it was possible to distinguish cohorts with >50% risk vs <10% risk of RP when they were exposed to high MLDs.
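As an illustration of the kind of model this abstract refers to, the sketch below evaluates a probit-form LKB NTCP as a function of mean lung dose. The abstract does not give the functional form or any fitted parameters, so several things here are assumptions: the usual probit expression NTCP = Φ((D − TD50) / (m · TD50)), the treatment of a dose-modifying factor as a simple multiplier on the dose, the function name lkb_ntcp, and every numeric value in the usage note.

```python
import math


def lkb_ntcp(mld_gy, td50_gy, m, dmf=1.0):
    """Probit-form NTCP as a function of mean lung dose (MLD).

    mld_gy  -- mean lung dose in Gy
    td50_gy -- dose giving a 50% complication probability (assumed parameter)
    m       -- slope parameter of the dose-response curve (assumed parameter)
    dmf     -- dose-modifying factor; modelled here, as an assumption,
               as a plain multiplier on the dose (1.0 means no modification)
    """
    if td50_gy <= 0 or m <= 0:
        raise ValueError("td50_gy and m must be positive")
    effective_dose = dmf * mld_gy
    t = (effective_dose - td50_gy) / (m * td50_gy)
    # Standard normal cumulative distribution function evaluated at t.
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
```

With purely illustrative values, lkb_ntcp(20.0, 30.0, 0.4) gives the risk without modification, and lkb_ntcp(20.0, 30.0, 0.4, dmf=1.2) gives a higher risk for a hypothetical factor that raises the effective dose by 20%.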
Dynamic Web Pages: Performance Impact on Web Servers.
ERIC Educational Resources Information Center
Kothari, Bhupesh; Claypool, Mark
2001-01-01
Discussion of Web servers and requests for dynamic pages focuses on experimentally measuring and analyzing the performance of the three dynamic Web page generation technologies: CGI, FastCGI, and Servlets. Develops a multivariate linear regression model and predicts Web server performance under some typical dynamic requests. (Author/LRW)
Liao, Qiuyan; Wong, Wing Sze; Fielding, Richard
2013-01-01
Background: Risk perception is a reported predictor of vaccination uptake, but which measures of risk perception best predict influenza vaccination uptake remains unclear. Methodology: During the main influenza seasons (between January and March) of 2009 (Wave 1) and 2010 (Wave 2), 505 Chinese students and employees from a Hong Kong university completed an online survey. Multivariate logistic regression models were fitted to assess how well different risk perception measures in Wave 1 predicted vaccination uptake against seasonal influenza in Wave 2. Principal Findings: The results of the multivariate logistic regression models showed that feeling at risk (β = 0.25, p = 0.021) was the better predictor compared with probability judgment, while probability judgment (β = 0.25, p = 0.029) was better than beliefs about risk in predicting subsequent influenza vaccination uptake. Beliefs about risk and feeling at risk seemed to predict the same aspect of subsequent vaccination uptake because their associations with vaccination uptake became insignificant when paired in the logistic regression model. Similarly, when the four scales for assessing probability judgment were compared in predicting vaccination uptake, the 7-point verbal scale remained a significant and stronger predictor of vaccination uptake when paired with the other three scales; the 6-point verbal scale was a significant and stronger predictor when paired with the percentage scale or the 2-point verbal scale; and the percentage scale was a significant and stronger predictor only when paired with the 2-point verbal scale. Conclusions/Significance: Beliefs about risk and feeling at risk are not well differentiated by Hong Kong Chinese people. Feeling at risk, an affective-cognitive dimension of risk perception, predicts subsequent vaccination uptake better than do probability judgments. Among the four scales for assessing risk probability judgment, the 7-point verbal scale offered the best predictive power for subsequent vaccination uptake. PMID:23894292
NASA Astrophysics Data System (ADS)
Szeląg, Bartosz; Barbusiński, Krzysztof; Studziński, Jan; Bartkiewicz, Lidia
2017-11-01
In the study, models developed using data mining methods are proposed for predicting wastewater quality indicators: biochemical and chemical oxygen demand, total suspended solids, total nitrogen and total phosphorus at the inflow to wastewater treatment plant (WWTP). The models are based on values measured in previous time steps and daily wastewater inflows. Also, independent prediction systems that can be used in case of monitoring devices malfunction are provided. Models of wastewater quality indicators were developed using MARS (multivariate adaptive regression spline) method, artificial neural networks (ANN) of the multilayer perceptron type combined with the classification model (SOM) and cascade neural networks (CNN). The lowest values of absolute and relative errors were obtained using ANN+SOM, whereas the MARS method produced the highest error values. It was shown that for the analysed WWTP it is possible to obtain continuous prediction of selected wastewater quality indicators using the two developed independent prediction systems. Such models can ensure reliable WWTP work when wastewater quality monitoring systems become inoperable, or are under maintenance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Konomi, Bledar A.; Karagiannis, Georgios; Sarkar, Avik
2014-05-16
Computer experiments (numerical simulations) are widely used in scientific research to study and predict the behavior of complex systems, which usually have responses consisting of a set of distinct outputs. High-resolution simulations are often computationally expensive and become impractical for parametric studies at different input values. To overcome these difficulties we develop a Bayesian treed multivariate Gaussian process (BTMGP) as an extension of the Bayesian treed Gaussian process (BTGP) in order to model and evaluate a multivariate process. A suitable choice of covariance function and the prior distributions facilitates the different Markov chain Monte Carlo (MCMC) movements. We utilize this model to sequentially sample the input space for the most informative values, taking into account model uncertainty and expertise gained. A simulation study demonstrates the use of the proposed method and compares it with alternative approaches. We apply the sequential sampling technique and BTMGP to model the multiphase flow in a full scale regenerator of a carbon capture unit. The application presented in this paper is an important tool for research into carbon dioxide emissions from thermal power plants.
Feng, Ji-Feng; Chen, Sheng; Yang, Xun
2017-09-08
We initially proposed a useful and novel prognostic model, named CCS [Combination of C-reactive protein (CRP) and squamous cell carcinoma antigen (SCC)], for predicting postoperative survival in patients with esophageal squamous cell carcinoma (ESCC). Two hundred and fifty-two patients with resectable ESCC were included in this retrospective study. A logistic regression was performed and yielded a logistic equation. The CCS was calculated by combining CRP and SCC. The optimal cut-off value for CCS was determined with the X-tile program. Univariate and multivariate analyses were used to evaluate the predictive factors. In addition, a novel nomogram model was constructed to predict the prognosis of patients with ESCC. In the current study, CCS was calculated as CRP + 6.33 × SCC according to the logistic equation. The optimal cut-off value was 15.8 for CCS according to the X-tile program. Kaplan-Meier analyses demonstrated that the high CCS group had significantly poorer 5-year cancer-specific survival (CSS) than the low CCS group (10.3% vs. 47.3%, P < 0.001). According to multivariate analyses, CCS (P = 0.004), but not CRP (P = 0.466) or SCC (P = 0.926), was an independent prognostic factor. A nomogram could predict CSS more accurately (Harrell's c-index: 0.70). The CCS is a useful and independent predictive factor in patients with ESCC.
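The CCS arithmetic reported in this abstract can be sketched as follows. The weight 6.33 and the cut-off 15.8 come from the abstract; the function names, the measurement units of the inputs, and the choice to treat values strictly above the cut-off as "high" are assumptions, since the abstract does not state them.

```python
# Values reported in the abstract.
CCS_SCC_WEIGHT = 6.33
CCS_CUTOFF = 15.8


def ccs(crp, scc):
    """Combined score: CRP plus 6.33 times SCC (units as used in the study, not stated here)."""
    return crp + CCS_SCC_WEIGHT * scc


def ccs_group(crp, scc):
    """Classify a patient as 'high' or 'low' CCS.

    Assumption: 'high' means strictly greater than the cut-off; the abstract
    does not say how a value exactly at 15.8 is handled.
    """
    return "high" if ccs(crp, scc) > CCS_CUTOFF else "low"
```

For hypothetical inputs crp=5.0 and scc=1.0 the score is about 11.33 and the group is "low"; with crp=10.0 and scc=1.0 the score is about 16.33 and the group is "high".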
Zekry, Dina; Herrmann, François R; Graf, Christophe E; Giannelli, Sandra; Michel, Jean-Pierre; Gold, Gabriel; Krause, Karl-Heinz
2011-01-01
The relative weight of various etiologies of dementia as predictors of long-term mortality after other risk factors have been taken into account remains unclear. We investigated the 5-year mortality risk associated with dementia in elderly people after discharge from acute care, taking into account comorbid conditions and functionality. A prospective cohort study of 444 patients (mean age: 85 years; 74% female) discharged from the acute geriatric unit of Geneva University Hospitals. On admission, each subject underwent a standardized diagnostic evaluation: demographic variables, cognitive, comorbid medical conditions and functional assessment. Patients were followed yearly by the same team. Predictors of survival at 5 years were evaluated by Cox proportional hazards models. The univariate model showed that being older and male, and having vascular and severe dementia, comorbidity and functional disability, were predictive of shorter survival. However, in the full multivariate model adjusted for age and sex, the effect of dementia type or severity completely disappeared when all the variables were added. In multivariate analysis, the best predictor was higher comorbidity score, followed by functional status (R(2) = 23%). The identification of comorbidity and functional impairment effects as predictive factors for long-term mortality independent of cognitive status may increase the accuracy of long-term discharge planning. Copyright © 2011 S. Karger AG, Basel.
Robust tumor morphometry in multispectral fluorescence microscopy
NASA Astrophysics Data System (ADS)
Tabesh, Ali; Vengrenyuk, Yevgen; Teverovskiy, Mikhail; Khan, Faisal M.; Sapir, Marina; Powell, Douglas; Mesa-Tejada, Ricardo; Donovan, Michael J.; Fernandez, Gerardo
2009-02-01
Morphological and architectural characteristics of primary tissue compartments, such as epithelial nuclei (EN) and cytoplasm, provide important cues for cancer diagnosis, prognosis, and therapeutic response prediction. We propose two feature sets for the robust quantification of these characteristics in multiplex immunofluorescence (IF) microscopy images of prostate biopsy specimens. To enable feature extraction, EN and cytoplasm regions were first segmented from the IF images. Then, feature sets consisting of the characteristics of the minimum spanning tree (MST) connecting the EN and the fractal dimension (FD) of gland boundaries were obtained from the segmented compartments. We demonstrated the utility of the proposed features in prostate cancer recurrence prediction on a multi-institution cohort of 1027 patients. Univariate analysis revealed that both FD and one of the MST features were highly effective for predicting cancer recurrence (p <= 0.0001). In multivariate analysis, an MST feature was selected for a model incorporating clinical and image features. The model achieved a concordance index (CI) of 0.73 on the validation set, which was significantly higher than the CI of 0.69 for the standard multivariate model based solely on clinical features currently used in clinical practice (p < 0.0001). The contributions of this work are twofold. First, it is the first demonstration of the utility of the proposed features in morphometric analysis of IF images. Second, this is the largest scale study of the efficacy and robustness of the proposed features in prostate cancer prognosis.
Goldrick, Stephen; Holmes, William; Bond, Nicholas J.; Lewis, Gareth; Kuiper, Marcel; Turner, Richard
2017-01-01
Product quality heterogeneities, such as trisulfide bond (TSB) formation, can be influenced by multiple interacting process parameters. Identifying their root cause is a major challenge in biopharmaceutical production. To address this issue, this paper describes the novel application of advanced multivariate data analysis (MVDA) techniques to identify the process parameters influencing TSB formation in a novel recombinant antibody–peptide fusion expressed in mammalian cell culture. The screening dataset was generated with a high-throughput (HT) micro-bioreactor system (Ambr™ 15) using a design of experiments (DoE) approach. The complex dataset was first analyzed through the development of a multiple linear regression model focusing solely on the DoE inputs, which identified the temperature, pH and initial nutrient feed day as important process parameters influencing this quality attribute. To further scrutinize the dataset, a partial least squares model was subsequently built incorporating both on-line and off-line process parameters; it enabled accurate predictions of the TSB concentration at harvest. Process parameters identified by the models to promote and suppress TSB formation were implemented on five 7 L bioreactors and the resultant TSB concentrations were comparable to the model predictions. This study demonstrates the ability of MVDA to enable predictions of the key performance drivers influencing TSB formation that are valid also upon scale-up. Biotechnol. Bioeng. 2017;114: 2222–2234. © 2017 The Authors. Biotechnology and Bioengineering Published by Wiley Periodicals, Inc. PMID:28500668
Stata Modules for Calculating Novel Predictive Performance Indices for Logistic Models.
Barkhordari, Mahnaz; Padyab, Mojgan; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza
2016-01-01
Prediction is a fundamental part of the prevention of cardiovascular diseases (CVD). The development of prediction algorithms based on multivariate regression models began several decades ago. In parallel with the development of predictive models, biomarker research has expanded on an impressive scale. The key question is how best to assess and quantify the improvement in risk prediction offered by new biomarkers or, more basically, how to assess the performance of a risk prediction model. Discrimination, calibration, and added predictive value have recently been suggested for comparing the predictive performance of models with and without novel biomarkers. A lack of user-friendly statistical software has restricted the implementation of novel model assessment methods when examining novel biomarkers. We therefore intended to develop user-friendly software that could be used by researchers with few programming skills. We have written a Stata command that is intended to help researchers obtain the cut point-free and cut point-based net reclassification improvement index (NRI) and the relative and absolute integrated discrimination improvement index (IDI) for logistic-based regression analyses. We applied the commands to real data on women participating in the Tehran Lipid and Glucose Study (TLGS) to examine whether information on family history of premature CVD, waist circumference, and fasting plasma glucose can improve the predictive performance of the Framingham "general CVD risk" algorithm. The command is addpred for logistic regression models. The Stata package provided herein can encourage the use of novel methods in examining the predictive capacity of the ever-emerging plethora of novel biomarkers.
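To make the two indices named in this abstract concrete, the following is a minimal Python sketch of the cut point-free NRI and the absolute IDI, using their commonly cited definitions. It is not the addpred Stata command and does not claim to reproduce its output; the function names and argument layout are hypothetical, and the relative IDI and cut point-based NRI are left out.

```python
from statistics import mean


def continuous_nri(p_old, p_new, outcome):
    """Cut point-free NRI.

    Net proportion of events whose predicted risk went up, plus the net
    proportion of non-events whose predicted risk went down.
    outcome holds 1 for an event and 0 for a non-event.
    """
    def net_up(pairs):
        up = sum(1 for old, new in pairs if new > old)
        down = sum(1 for old, new in pairs if new < old)
        return (up - down) / len(pairs)

    events = [(o, n) for o, n, y in zip(p_old, p_new, outcome) if y == 1]
    non_events = [(o, n) for o, n, y in zip(p_old, p_new, outcome) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    return net_up(events) - net_up(non_events)


def absolute_idi(p_old, p_new, outcome):
    """Absolute IDI: change in discrimination slope between the two models.

    The discrimination slope is the mean predicted risk among events minus
    the mean predicted risk among non-events.
    """
    def slope(p):
        in_events = [pi for pi, y in zip(p, outcome) if y == 1]
        in_non_events = [pi for pi, y in zip(p, outcome) if y == 0]
        return mean(in_events) - mean(in_non_events)

    return slope(p_new) - slope(p_old)
```

Both functions take the predicted probabilities from the model without the new marker (p_old), the model with it (p_new), and the observed outcomes, all in the same subject order.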
Magnetic resonance spectroscopy metabolite profiles predict survival in paediatric brain tumours.
Wilson, Martin; Cummins, Carole L; Macpherson, Lesley; Sun, Yu; Natarajan, Kal; Grundy, Richard G; Arvanitis, Theodoros N; Kauppinen, Risto A; Peet, Andrew C
2013-01-01
Brain tumours cause the highest mortality and morbidity rate of all childhood tumour groups and new methods are required to improve clinical management. (1)H magnetic resonance spectroscopy (MRS) allows non-invasive concentration measurements of small molecules present in tumour tissue, providing clinically useful imaging biomarkers. The primary aim of this study was to investigate whether MRS detectable molecules can predict the survival of paediatric brain tumour patients. Short echo time (30ms) single voxel (1)H MRS was performed on children attending Birmingham Children's Hospital with a suspected brain tumour and 115 patients were included in the survival analysis. Patients were followed-up for a median period of 35 months and Cox-Regression was used to establish the prognostic value of individual MRS detectable molecules. A multivariate model of survival was also investigated to improve prognostic power. Lipids and scyllo-inositol predicted poor survival whilst glutamine and N-acetyl aspartate predicted improved survival (p<0.05). A multivariate model of survival based on three MRS biomarkers predicted survival with a similar accuracy to histologic grading (p<5e-5). A negative correlation between lipids and glutamine was found, suggesting a functional link between these molecules. MRS detectable biomolecules have been identified that predict survival of paediatric brain tumour patients across a range of tumour types. The evaluation of these biomarkers in large prospective studies of specific tumour types should be undertaken. The correlation between lipids and glutamine provides new insight into paediatric brain tumour metabolism that may present novel targets for therapy. Copyright © 2012 Elsevier Ltd. All rights reserved.
Neonatal Pulmonary MRI of Bronchopulmonary Dysplasia Predicts Short-term Clinical Outcomes.
Higano, Nara S; Spielberg, David R; Fleck, Robert J; Schapiro, Andrew H; Walkup, Laura L; Hahn, Andrew D; Tkach, Jean A; Kingma, Paul S; Merhar, Stephanie L; Fain, Sean B; Woods, Jason C
2018-05-23
Bronchopulmonary dysplasia (BPD) is a serious neonatal pulmonary condition associated with premature birth, but the underlying parenchymal disease and trajectory are poorly characterized. The current NICHD/NHLBI definition of BPD severity is based on degree of prematurity and extent of oxygen requirement. However, no clear link exists between initial diagnosis and clinical outcomes. We hypothesized that magnetic resonance imaging (MRI) of structural parenchymal abnormalities will correlate with NICHD-defined BPD disease severity and predict short-term respiratory outcomes. Forty-two neonates (20 severe BPD, 6 moderate, 7 mild, 9 non-BPD controls; 40±3 weeks post-menstrual age) underwent quiet-breathing structural pulmonary MRI (ultrashort echo-time and gradient echo) in a NICU-sited, neonatal-sized 1.5T scanner, without sedation or respiratory support unless already clinically prescribed. Disease severity was scored independently by two radiologists. Mean scores were compared to clinical severity and short-term respiratory outcomes. Outcomes were predicted using univariate and multivariable models including clinical data and scores. MRI scores significantly correlated with severities and predicted respiratory support at NICU discharge (P<0.0001). In multivariable models, MRI scores were by far the strongest predictor of respiratory support duration over clinical data, including birth weight and gestational age. Notably, NICHD severity level was not predictive of discharge support. Quiet-breathing neonatal pulmonary MRI can independently assess structural abnormalities of BPD, describe disease severity, and predict short-term outcomes more accurately than any individual standard clinical measure. Importantly, this non-ionizing technique can be implemented to phenotype disease and has potential to serially assess efficacy of individualized therapies.
Health Literacy, Cognitive Abilities, and Mortality Among Elderly Persons
Wolf, Michael S.; Feinglass, Joseph; Thompson, Jason A.
2008-01-01
Background Low health literacy and low cognitive abilities both predict mortality, but no study has jointly examined these relationships. Methods We conducted a prospective cohort study of 3,260 community-dwelling adults age 65 and older. Participants were interviewed in 1997 and administered the Short Test of Functional Health Literacy in Adults and the Mini Mental Status Examination. Mortality was determined using the National Death Index through 2003. Measurements and Main Results In multivariate models with only literacy (not cognition), the adjusted hazard ratio was 1.50 (95% confidence interval [CI] 1.24–1.81) for inadequate versus adequate literacy. In multivariate models without literacy, delayed recall of 3 items and the ability to serially subtract numbers were associated with higher mortality (e.g., adjusted hazard ratios [AHR] 1.74 [95% CI 1.30–2.34] for recall of zero versus 3 items, and 1.32 [95% CI 1.09–1.60] for 0–2 vs 5 correct subtractions). In multivariate analysis with both literacy and cognition, the AHRs for the cognition items were similar, but the AHR for inadequate literacy decreased to 1.27 (95% CI 1.03–1.57). Conclusions Both health literacy and cognitive abilities independently predict mortality. Interventions to improve patient knowledge and self-management skills should consider both the reading level and cognitive demands of the materials. PMID:18330654
Lee, Ji Yeon; Ahn, Eun Hee; Kang, Sukho; Moon, Myung Jin; Jung, Sang Hee; Chang, Sung Woon; Cho, Hee Young
2018-01-01
We aimed to identify factors associated with massive post-partum bleeding in pregnancies with placenta previa and to establish a scoring model to predict post-partum severe bleeding. A retrospective cohort study was performed in 506 healthy singleton pregnancies with placenta previa from 2006 to 2016. Cases with intraoperative blood loss (≥2000 mL), packed red blood cells transfusion (≥4), uterine artery embolization, or hysterectomy were defined as massive bleeding. After performing multivariable analysis, using the adjusted odds ratios (aOR), we formulated a scoring model. Seventy-three women experienced massive post-partum bleeding (14.4%). After multivariable analysis, seven variables were associated with massive bleeding: maternal old age (≥35 years; aOR 1.79, 95% confidence interval [CI] 1.00-3.20, P = 0.049), antepartum bleeding (aOR 4.76, 95%CI 2.01-11.02, P < 0.001), non-cephalic presentation (aOR 3.41, 95%CI 1.40-8.30, P = 0.007), complete placenta previa (aOR 1.93, 95%CI 1.05-3.54, P = 0.034), anterior placenta (aOR 2.74, 95%CI 1.54-4.89, P = 0.001), multiple lacunae (≥4; aOR 2.77, 95%CI 1.54-4.99, P = 0.001), and uteroplacental hypervascularity (aOR 4.51, 95%CI 2.30-8.83, P < 0.001). We formulated a scoring model including maternal old age (<35: 0, ≥35: 1), antepartum bleeding (no: 0, yes: 2), fetal non-cephalic presentation (no: 0, yes: 2), placenta previa type (incomplete: 0, complete: 1), placenta location (posterior: 0, anterior: 1), uteroplacental hypervascularity (no: 0, yes: 2), and multiple lacunae (no: 0, yes: 1) to predict post-partum massive bleeding. According to our scoring model, a score of 5/10 had a sensitivity of 81% and a specificity of 77% for predicting massive post-partum bleeding. The area under the receiver-operator curve was 0.856 (P < 0.001). The negative predictive value was 95.9%. Our scoring model might provide useful information for prediction of massive post-partum bleeding in pregnancies with placenta previa. 
© 2017 Japan Society of Obstetrics and Gynecology.
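The additive scoring model described in this abstract can be sketched as follows. The point values are the ones listed in the abstract; the dictionary keys, the function names, and the reading of "a score of 5/10" as "flag scores of 5 or more" are assumptions made for illustration, not the authors' implementation.

```python
# Point values as listed in the abstract; the key names are illustrative only.
POINTS = {
    "age_35_or_older": 1,
    "antepartum_bleeding": 2,
    "non_cephalic_presentation": 2,
    "complete_previa": 1,
    "anterior_placenta": 1,
    "uteroplacental_hypervascularity": 2,
    "multiple_lacunae": 1,
}


def bleeding_score(findings):
    """Sum the points of every finding that is present (True)."""
    unknown = set(findings) - set(POINTS)
    if unknown:
        raise KeyError(f"unknown findings: {sorted(unknown)}")
    return sum(POINTS[name] for name, present in findings.items() if present)


def is_high_risk(score, cutoff=5):
    # Assumption: the reported cutoff of 5 out of 10 flags scores >= 5.
    return score >= cutoff


# Hypothetical patient: older than 35, antepartum bleeding, anterior placenta.
example = {
    "age_35_or_older": True,
    "antepartum_bleeding": True,
    "anterior_placenta": True,
    "multiple_lacunae": False,
}
example_score = bleeding_score(example)
print(example_score, is_high_risk(example_score))
```

The maximum attainable score is the sum of all point values, which matches the 10-point scale quoted in the abstract.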
Cogswell, Rebecca; Kobashigawa, Erin; McGlothlin, Dana; Shaw, Robin; De Marco, Teresa
2012-11-01
The Registry to Evaluate Early and Long-Term Pulmonary Arterial Hypertension (PAH) Disease Management (REVEAL) model was designed to predict 1-year survival in patients with PAH. Multivariate prediction models need to be evaluated in cohorts distinct from the derivation set to determine external validity. In addition, limited data exist on the utility of this model in the prediction of long-term survival. REVEAL model performance was assessed to predict 1-year and 5-year outcomes, defined as survival or composite survival or freedom from lung transplant, in 140 patients with PAH. The validation cohort had a higher proportion of human immunodeficiency virus (7.9% vs 1.9%, p < 0.0001), methamphetamine use (19.3% vs 4.9%, p < 0.0001), and portal hypertension PAH (16.4% vs 5.1%, p < 0.0001) compared with the development cohort. The C-index of the model to predict survival was 0.765 at 1 year and 0.712 at 5 years of follow-up. The C-index of the model to predict composite survival or freedom from lung transplant was 0.805 and 0.724 at 1 and 5 years of follow-up, respectively. Prediction by the model, however, was weakest among patients with intermediate-risk predicted survival. The REVEAL model had adequate discrimination to predict 1-year survival in this small but clinically distinct validation cohort. Although the model also had predictive ability out to 5 years, prediction was limited among patients of intermediate risk, suggesting our prediction methods can still be improved. Copyright © 2012. Published by Elsevier Inc.
Wen, Zhang; Guo, Ya; Xu, Banghao; Xiao, Kaiyin; Peng, Tao; Peng, Minhao
2016-04-01
Postoperative pancreatic fistula is still a major complication after pancreatic surgery, despite improvements in surgical technique and perioperative management. We sought to systematically review and critically assess the conduct and reporting of methods used to develop risk prediction models for predicting postoperative pancreatic fistula. We conducted a systematic search of PubMed and EMBASE databases to identify articles published before January 1, 2015, which described the development of models to predict the risk of postoperative pancreatic fistula. We extracted information on the development of each prediction model, including study design, sample size and number of events, definition of postoperative pancreatic fistula, risk predictor selection, missing data, model-building strategies, and model performance. Seven studies developing seven risk prediction models were included. In three studies (42 %), the number of events per variable was less than 10. The number of candidate risk predictors ranged from 9 to 32. Five studies (71 %) reported using univariate screening, which is not recommended when building a multivariate model, to reduce the number of risk predictors. Six risk prediction models (86 %) were developed by categorizing all continuous risk predictors. The treatment and handling of missing data were not mentioned in all studies. We found use of inappropriate methods that could undermine model development, including univariate pre-screening of variables, categorization of continuous risk predictors, and model validation. The use of inappropriate methods affects the reliability and the accuracy of the probability estimates of predicting postoperative pancreatic fistula.
NASA Astrophysics Data System (ADS)
Bonne, F.; Alamir, M.; Bonnay, P.
2017-02-01
This paper deals with multivariable constrained model predictive control for Warm Compression Stations (WCS). WCSs are subject to numerous constraints (limits on pressures, actuators) that need to be satisfied using appropriate algorithms. The strategy is to replace all the PID loops controlling the WCS with an optimally designed model-based multivariable loop. This new strategy leads to high stability and fast rejection of disturbances such as those induced by a turbine or a compressor stop, a key aspect in the case of large scale cryogenic refrigeration. The proposed control scheme can be used to achieve precise control of pressures in normal operation or to avoid reaching stopping criteria (such as excessive pressures) under high disturbances (such as the pulsed heat loads expected in the cryogenic cooling systems of future fusion reactors, for example the International Thermonuclear Experimental Reactor (ITER) or the Japan Torus-60 Super Advanced (JT-60SA) fusion experiment). The paper details the simulator used to validate this new control scheme and the associated simulation results on the SBTs WCS. This work is partially supported through the French National Research Agency (ANR), task agreement ANR-13-SEED-0005.
2014-01-01
Background Network meta-analysis (NMA) enables simultaneous comparison of multiple treatments while preserving randomisation. When summarising evidence to inform an economic evaluation, it is important that the analysis accurately reflects the dependency structure within the data, as correlations between outcomes may have implications for estimating the net benefit associated with treatment. A multivariate NMA offers a framework for evaluating multiple treatments across multiple outcome measures while accounting for the correlation structure between outcomes. Methods The standard NMA model is extended to multiple outcome settings in two stages. In the first stage, information is borrowed across outcomes as well as across studies through modelling the within-study and between-study correlation structure. In the second stage, we make use of the additional assumption that intervention effects are exchangeable between outcomes to predict effect estimates for all outcomes, including effect estimates on outcomes where evidence is either sparse or the treatment had not been considered by any one of the studies included in the analysis. We apply the methods to binary outcome data from a systematic review evaluating the effectiveness of nine home safety interventions on uptake of three poisoning prevention practices (safe storage of medicines, safe storage of other household products, and possession of poison centre control telephone number) in households with children. Analyses are conducted in WinBUGS using Markov Chain Monte Carlo (MCMC) simulations. Results Univariate and the first stage multivariate models produced broadly similar point estimates of intervention effects but the uncertainty around the multivariate estimates varied depending on the prior distribution specified for the between-study covariance structure.
The second stage multivariate analyses produced more precise effect estimates while enabling intervention effects to be predicted for all outcomes, including intervention effects on outcomes not directly considered by the studies included in the analysis. Conclusions Accounting for the dependency between outcomes in a multivariate meta-analysis may or may not improve the precision of effect estimates from a network meta-analysis compared to analysing each outcome separately. PMID:25047164
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
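The propagation of input error through a logistic regression by Latin hypercube sampling can be illustrated with a minimal sketch. Everything numeric here is hypothetical: the abstract gives no coefficients, distributions, or ranges, so the uniform ranges for the intercept, the slope (model error), and the explanatory variable (data error) are placeholders, and the helper `latin_hypercube` is written for this example rather than taken from the study.

```python
import numpy as np


def latin_hypercube(n_samples, n_dims, rng):
    """Return an (n_samples, n_dims) array in [0, 1) with one point per stratum."""
    # One uniform draw inside each of n_samples equal-width strata, per dimension.
    strata = np.arange(n_samples)[:, None]
    u = (strata + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle the strata independently in each dimension.
    for j in range(n_dims):
        u[:, j] = rng.permutation(u[:, j])
    return u


rng = np.random.default_rng(0)
u = latin_hypercube(1000, 3, rng)

# Hypothetical uniform input ranges: intercept, slope, explanatory variable.
low = np.array([-2.5, 0.8, 1.0])
high = np.array([-1.5, 1.2, 3.0])
x = low + u * (high - low)

# Logistic regression prediction for every sampled input combination.
p = 1.0 / (1.0 + np.exp(-(x[:, 0] + x[:, 1] * x[:, 2])))

# Empirical 90% interval of the predicted probability.
p_low, p_high = np.percentile(p, [5, 95])
print(p_low, p_high)
```

Stratifying each input guarantees that its whole range is covered with relatively few samples, which is the usual reason for preferring LHS over plain random sampling in this kind of analysis.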
Lastoria, Secondo; Piccirillo, Maria Carmela; Caracò, Corradina; Nasti, Guglielmo; Aloj, Luigi; Arrichiello, Cecilia; de Lutio di Castelguidone, Elisabetta; Tatangelo, Fabiana; Ottaiano, Alessandro; Iaffaioli, Rosario Vincenzo; Izzo, Francesco; Romano, Giovanni; Giordano, Pasqualina; Signoriello, Simona; Gallo, Ciro; Perrone, Francesco
2013-12-01
Markers predictive of treatment effect might be useful to improve the treatment of patients with metastatic solid tumors. Particularly, early changes in tumor metabolism measured by PET/CT with (18)F-FDG could predict the efficacy of treatment better than standard dimensional Response Evaluation Criteria In Solid Tumors (RECIST) response. We performed PET/CT evaluation before and after 1 cycle of treatment in patients with resectable liver metastases from colorectal cancer, within a phase 2 trial of preoperative FOLFIRI plus bevacizumab. For each lesion, the maximum standardized uptake value (SUV) and the total lesion glycolysis (TLG) were determined. On the basis of previous studies, a ≤ -50% change from baseline was used as a threshold for significant metabolic response for maximum SUV and, exploratively, for TLG. Standard RECIST response was assessed with CT after 3 mo of treatment. Pathologic response was assessed in patients undergoing resection. The association between metabolic and CT/RECIST and pathologic response was tested with the McNemar test; the ability to predict progression-free survival (PFS) and overall survival (OS) was tested with the Log-rank test and a multivariable Cox model. Thirty-three patients were analyzed. After treatment, there was a notable decrease of all the parameters measured by PET/CT. Early metabolic PET/CT response (either SUV- or TLG-based) had a stronger, independent and statistically significant predictive value for PFS and OS than both CT/RECIST and pathologic response at multivariate analysis, although with different degrees of statistical significance. The predictive value of CT/RECIST response was not significant at multivariate analysis. PET/CT response was significantly predictive of long-term outcomes during preoperative treatment of patients with liver metastases from colorectal cancer, and its predictive ability was higher than that of CT/RECIST response after 3 mo of treatment. 
Such findings need to be confirmed by larger prospective trials.
Dong, Chunjiao; Clarke, David B; Richards, Stephen H; Huang, Baoshan
2014-01-01
The influence of intersection features on safety has been examined extensively because intersections experience a relatively large proportion of motor vehicle conflicts and crashes. Although there are distinct differences between passenger cars and large trucks (size, operating characteristics, dimensions, and weight), modeling crash counts across vehicle types is rarely addressed. This paper develops and presents a multivariate regression model of crash frequencies by collision vehicle type using crash data for urban signalized intersections in Tennessee. In addition, the performance of univariate Poisson-lognormal (UVPLN), multivariate Poisson (MVP), and multivariate Poisson-lognormal (MVPLN) regression models in establishing the relationship between crashes, traffic factors, and geometric design of roadway intersections is investigated. Bayesian methods are used to estimate the unknown parameters of these models. The evaluation results suggest that the MVPLN model possesses most of the desirable statistical properties in developing the relationships. Compared to the UVPLN and MVP models, the MVPLN model better identifies significant factors and predicts crash frequencies. The findings suggest that traffic volume, truck percentage, lighting condition, and intersection angle significantly affect intersection safety. Important differences in car, car-truck, and truck crash frequencies with respect to various risk factors were found to exist between models. The paper provides some new or more comprehensive observations that have not been covered in previous studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
Leptospirosis in American Samoa – Estimating and Mapping Risk Using Environmental Data
Lau, Colleen L.; Clements, Archie C. A.; Skelly, Chris; Dobson, Annette J.; Smythe, Lee D.; Weinstein, Philip
2012-01-01
Background The recent emergence of leptospirosis has been linked to many environmental drivers of disease transmission. Accurate epidemiological data are lacking because of under-diagnosis, poor laboratory capacity, and inadequate surveillance. Predictive risk maps have been produced for many diseases to identify high-risk areas for infection and guide allocation of public health resources, and are particularly useful where disease surveillance is poor. To date, no predictive risk maps have been produced for leptospirosis. The objectives of this study were to estimate leptospirosis seroprevalence at geographic locations based on environmental factors, produce a predictive disease risk map for American Samoa, and assess the accuracy of the maps in predicting infection risk. Methodology and Principal Findings Data on seroprevalence and risk factors were obtained from a recent study of leptospirosis in American Samoa. Data on environmental variables were obtained from local sources, and included rainfall, altitude, vegetation, soil type, and location of backyard piggeries. Multivariable logistic regression was performed to investigate associations between seropositivity and risk factors. Using the multivariable models, seroprevalence at geographic locations was predicted based on environmental variables. Goodness of fit of models was measured using area under the curve of the receiver operating characteristic, and the percentage of cases correctly classified as seropositive. Environmental predictors of seroprevalence included living below median altitude of a village, in agricultural areas, on clay soil, and higher density of piggeries above the house. Models had acceptable goodness of fit, and correctly classified ∼84% of cases. Conclusions and Significance Environmental variables could be used to identify high-risk areas for leptospirosis. 
Environmental monitoring could potentially be a valuable strategy for leptospirosis control, and allow us to move from disease surveillance to environmental health hazard surveillance as a more cost-effective tool for directing public health interventions. PMID:22666516
Kassam, Zain; Fabersunne, Camila Cribb; Smith, Mark B.; Alm, Eric J.; Kaplan, Gilaad G.; Nguyen, Geoffrey C.; Ananthakrishnan, Ashwin N.
2016-01-01
Background Clostridium difficile infection (CDI) is a public health threat and is associated with significant mortality. However, there is a paucity of objectively derived CDI severity scoring systems to predict mortality. Aims To develop a novel CDI risk score to predict mortality, entitled Clostridium difficile Associated Risk of Death Score (CARDS). Methods We obtained data from the United States 2011 Nationwide Inpatient Sample (NIS) database. All CDI-associated hospitalizations were identified using discharge codes (ICD-9-CM, 008.45). Multivariate logistic regression was utilized to identify independent predictors of mortality. CARDS was calculated by assigning a numeric weight to each parameter based on its odds ratio in the final logistic model. Predictive properties of model discrimination were assessed using the c-statistic and validated in an independent sample using the 2010 NIS database. Results We identified 77,776 hospitalizations, yielding an estimate of 374,747 cases with an associated diagnosis of CDI in the United States, 8% of whom died in the hospital. Eight severity score predictors were identified on multivariate analysis: age, cardiopulmonary disease, malignancy, diabetes, inflammatory bowel disease, acute renal failure, liver disease and ICU admission, with weights ranging from −1 (for diabetes) to 5 (for ICU admission). The overall risk score in the cohort ranged from 0 to 18. Mortality increased significantly as CARDS increased. CDI-associated mortality was 1.2% with a CARDS of 0 compared to 100% with a CARDS of 18. The model performed equally well in our validation cohort. Conclusion CARDS is a promising simple severity score to predict mortality among those hospitalized with CDI. PMID:26849527
Minematsu, Akira; Hazaki, Kan; Harano, Akihiro; Iki, Masayuki; Fujita, Yuki; Okamoto, Nozomi; Kurumatani, Norio
2012-01-01
Screening for low bone mass is important to prevent fragility fractures in men as well as women, although men show a much lower prevalence of osteoporosis than women. The purpose of this study was to establish a screening model for low bone mineral density (BMD) using a quantitative ultrasound parameter and easily obtained objective indices for elderly Japanese men. We examined 1633 men (65-84 yr old) who were subjects of the Fujiwara-Kyo Study. Speed of sound (SOS) at the calcaneus was determined, and BMD was measured by dual-energy X-ray absorptiometry at the lumbar spine (LS), total hip (TH), and femoral neck (FN). Low BMD was defined as >1 standard deviation below the young adult mean, in accordance with World Health Organization criteria. We performed receiver operating characteristic (ROC) analysis to identify a better screening model incorporating SOS and determined the optimal cutoff value using Youden index. Prevalences of low BMD at the 3 skeletal sites were 27.8% (LS), 33.5% (TH), 48.6% (FN), and 43.3% at either LS or TH. The greatest area under the ROC curve (0.806, 95% confidence interval: 0.785-0.828) and smallest Akaike's information criterion were obtained in the multivariate model incorporating SOS, age, height, and weight for predicting low BMD at all skeletal sites. This model predicted low BMD at TH with the sensitivity of 0.726 and specificity of 0.739, whereas a similar model predicted low BMD at LS with much lower validity. We conclude that the multivariate model for TH could be used to screen for low BMD in elderly Japanese men. Copyright © 2012 The International Society for Clinical Densitometry. Published by Elsevier Inc. All rights reserved.
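Choosing a cutoff with the Youden index, as mentioned in this abstract, amounts to maximizing J = sensitivity + specificity - 1 over candidate thresholds. The sketch below shows the idea on made-up scores and labels; the function name, the toy data, and the convention that higher scores indicate the positive class are assumptions, since the abstract does not describe the computation in detail.

```python
import numpy as np


def youden_cutoff(scores, labels):
    """Return (cutoff, J, sensitivity, specificity) for the cutoff maximizing J."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best = None
    for c in np.unique(scores):
        # Assumption: a score at or above the cutoff is called positive.
        predicted = scores >= c
        sens = np.mean(predicted[labels])
        spec = np.mean(~predicted[~labels])
        j = sens + spec - 1.0
        # Strict comparison keeps the lowest cutoff when several tie.
        if best is None or j > best[1]:
            best = (float(c), float(j), float(sens), float(spec))
    return best


# Toy data, not from the study.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
best = youden_cutoff(toy_scores, toy_labels)
print(best)

# Youden index implied by the sensitivity and specificity reported for total hip.
reported_j = 0.726 + 0.739 - 1.0
print(round(reported_j, 3))
```

Applied to the values quoted for the total hip model, the same formula gives a Youden index of about 0.465.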
Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo
2015-05-12
The aim was to explore the risk factors for portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and to establish a noninvasive logistic regression prediction model. The clinical data of 234 patients hospitalized with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG, and the independent variables were screened by binary logistic analysis. Multivariate logistic regression was used for further analysis of the significant noninvasive independent variables. A logistic regression model was established and the odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of the model were evaluated using the receiver operating characteristic (ROC) curve. According to univariate logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the main factors associated with the occurrence of PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of the model for PHG was 79.1%, with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG, and the noninvasive logistic regression prediction model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
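The regression equation quoted in this abstract converts to a predicted probability through the inverse logit, P = 1 / (1 + exp(-Logit P)). The sketch below uses the published coefficients; how X1 to X4 are coded (categories, units, cutoffs) is not stated in the abstract, so the inputs in the example are placeholders and the function names are illustrative.

```python
import math

# Coefficients as printed in the abstract.
INTERCEPT = -2.667
COEF = {"x1": 2.186, "x2": -2.167, "x3": 0.725, "x4": 0.976}


def phg_logit(x1, x2, x3, x4):
    """Linear predictor; inputs must use the (unstated) coding of the original study."""
    return (INTERCEPT + COEF["x1"] * x1 + COEF["x2"] * x2
            + COEF["x3"] * x3 + COEF["x4"] * x4)


def phg_probability(x1, x2, x3, x4):
    """Inverse logit of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-phg_logit(x1, x2, x3, x4)))


# Placeholder inputs, chosen only to exercise the formula.
baseline_probability = phg_probability(0, 0, 0, 0)
all_ones_logit = phg_logit(1, 1, 1, 1)
print(baseline_probability, all_ones_logit)
```

With all inputs at zero the linear predictor reduces to the intercept, so the baseline probability depends only on the constant term.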
Cavallo, Jaime A.; Roma, Andres A.; Jasielec, Mateusz S.; Ousley, Jenny; Creamer, Jennifer; Pichert, Matthew D.; Baalman, Sara; Frisella, Margaret M.; Matthews, Brent D.
2014-01-01
Background The purpose of this study was to evaluate the associations between patient characteristics or surgical site classifications and the histologic remodeling scores of synthetic meshes biopsied from their abdominal wall repair sites in the first attempt to generate a multivariable risk prediction model of non-constructive remodeling. Methods Biopsies of the synthetic meshes were obtained from the abdominal wall repair sites of 51 patients during a subsequent abdominal re-exploration. Biopsies were stained with hematoxylin and eosin, and evaluated according to a semi-quantitative scoring system for remodeling characteristics (cell infiltration, cell types, extracellular matrix deposition, inflammation, fibrous encapsulation, and neovascularization) and a mean composite score (CR). Biopsies were also stained with Sirius Red and Fast Green, and analyzed to determine the collagen I:III ratio. Based on univariate analyses between subject clinical characteristics or surgical site classification and the histologic remodeling scores, cohort variables were selected for multivariable regression models using a threshold p value of ≤0.200. Results The model selection process for the extracellular matrix score yielded two variables: subject age at time of mesh implantation, and mesh classification (c-statistic = 0.842). For CR score, the model selection process yielded two variables: subject age at time of mesh implantation and mesh classification (r2 = 0.464). The model selection process for the collagen III area yielded a model with two variables: subject body mass index at time of mesh explantation and pack-year history (r2 = 0.244). Conclusion Host characteristics and surgical site assessments may predict degree of remodeling for synthetic meshes used to reinforce abdominal wall repair sites. 
These preliminary results constitute the first steps in generating a risk prediction model that predicts the patients and clinical circumstances for which non-constructive remodeling of an abdominal wall repair site with synthetic mesh reinforcement is most likely to occur. PMID:24442681
Prediction of road accidents: A Bayesian hierarchical approach.
Deublein, Markus; Schubert, Matthias; Adey, Bryan T; Köhler, Jochen; Faber, Michael H
2013-03-01
In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk indicating and model response variables, as well as different types of uncertainties which might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis of the observed frequencies of the model response variables, e.g. the occurrence of an accident, and observed values of the risk indicating variables, e.g. degree of road curvature. Subsequently, parameter learning is done using updating algorithms, to determine the posterior predictive probability distributions of the model response variables, conditional on the values of the risk indicating variables. The methodology is illustrated through a case study using data of the Austrian rural motorway network. In the case study, on randomly selected road segments the methodology is used to produce a model to predict the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link between two Austrian cities. 
It is shown that the proposed methodology can be used to develop models to estimate the occurrence of road accidents for any road network provided that the required data are available. Copyright © 2012 Elsevier Ltd. All rights reserved.
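The first of the three steps, gamma-updating of occurrence rates, is not spelled out in the abstract. A common reading is the conjugate gamma-Poisson update, sketched below under that assumption: a Gamma(alpha, beta) prior on the accident rate combined with an observed event count over a known exposure. The prior parameters and observations are invented for illustration.

```python
def gamma_update(alpha, beta, events, exposure):
    """Conjugate update of a Gamma(alpha, beta) prior on a Poisson rate.

    alpha is the shape and beta the rate parameter; `events` is the observed
    count and `exposure` the observation length (for example segment-years).
    """
    return alpha + events, beta + exposure


def gamma_mean(alpha, beta):
    """Mean of a Gamma distribution with shape alpha and rate beta."""
    return alpha / beta


# Hypothetical prior: mean rate 0.5 accidents per unit exposure.
prior = (2.0, 4.0)
# Hypothetical observation: 3 injury accidents over 2 units of exposure.
posterior = gamma_update(*prior, events=3, exposure=2.0)
print(posterior, gamma_mean(*posterior))
```

The posterior mean lies between the prior mean and the observed rate, with more exposure pulling it further toward the observation.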
DasPy – Open Source Multivariate Land Data Assimilation Framework with High Performance Computing
NASA Astrophysics Data System (ADS)
Han, Xujun; Li, Xin; Montzka, Carsten; Kollet, Stefan; Vereecken, Harry; Hendricks Franssen, Harrie-Jan
2015-04-01
Data assimilation has become a popular method to integrate observations from multiple sources with land surface models to improve predictions of the water and energy cycles of the soil-vegetation-atmosphere continuum. In recent years, several land data assimilation systems have been developed in different research agencies. Because of limited software availability or adaptability, these systems are not easy to apply for the purpose of multivariate land data assimilation research. Multivariate data assimilation refers to the simultaneous assimilation of observation data for multiple model state variables into a simulation model. Our main motivation was to develop an open source multivariate land data assimilation framework (DasPy), which is implemented in the Python scripting language mixed with C++ and Fortran. This system has been evaluated in several soil moisture, L-band brightness temperature and land surface temperature assimilation studies. The implementation also allows parameter estimation (soil properties and/or leaf area index) on the basis of the joint state and parameter estimation approach. LETKF (Local Ensemble Transform Kalman Filter) is implemented as the main data assimilation algorithm, and uncertainties in the data assimilation can be represented by perturbed atmospheric forcings, perturbed soil and vegetation properties and model initial conditions. The CLM4.5 (Community Land Model) was integrated as the model operator. The CMEM (Community Microwave Emission Modelling Platform), COSMIC (COsmic-ray Soil Moisture Interaction Code) and the two source formulation were integrated as observation operators for assimilation of L-band passive microwave, cosmic-ray soil moisture probe and land surface temperature measurements, respectively. DasPy is parallelized using the hybrid MPI (Message Passing Interface) and OpenMP (Open Multi-Processing) techniques.
All the input and output data flow is organized efficiently using the commonly used NetCDF file format. Online 1D and 2D visualization of data assimilation results is also implemented to facilitate the post simulation analysis. In summary, DasPy is a ready to use open source parallel multivariate land data assimilation framework.
Kona, Ravikanth; Fahmy, Raafat M; Claycamp, Gregg; Polli, James E; Martinez, Marilyn; Hoag, Stephen W
2015-02-01
The objective of this study is to use near-infrared spectroscopy (NIRS) coupled with multivariate chemometric models to monitor granule and tablet quality attributes in the formulation development and manufacturing of ciprofloxacin hydrochloride (CIP) immediate release tablets. Critical roller compaction process parameters, compression force (CFt), and formulation variables identified from our earlier studies were evaluated in more detail. Multivariate principal component analysis (PCA) and partial least squares (PLS) models were developed during the development stage and used as a control tool to predict the quality of granules and tablets. Validated models were used to monitor and control batches manufactured at different sites to assess their robustness to change. The results showed that roll pressure (RP) and CFt played a critical role in the quality of the granules and the finished product within the range tested. Replacing the binder source did not statistically influence the quality attributes of the granules and tablets. However, lubricant type significantly affected the granule size. Blend uniformity, crushing force, and disintegration time during manufacturing were predicted using validated PLS regression models with acceptable standard error of prediction (SEP) values, whereas the models resulted in higher SEP for batches obtained from a different manufacturing site. From this study, we were able to identify critical factors which could impact the quality attributes of the CIP IR tablets. In summary, we demonstrated the ability of near-infrared spectroscopy coupled with chemometrics as a powerful tool to monitor critical quality attributes (CQA) identified during formulation development.
Callén, M S; López, J M; Mastral, A M
2010-08-15
The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view, especially with the introduction of Directive 2004/107/EC and because of the carcinogenic character of this pollutant. Particulate matter less than or equal to 10 microns (PM10) was collected during a 2008-2009 sampling campaign at four locations in Spain to determine BaP concentrations experimentally by gas chromatography mass-spectrometry mass-spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R²=0.817, PRESS/SSY=0.183) included significant variables such as PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the second using BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q²(CV)=0.813, PRESS/SSY=0.187) and with the maximal external prediction for the 2001-2002 campaign (Q²(ext)=0.679 and PRESS/SSY=0.321) versus the 2001-2004 campaign (Q²(ext)=0.551, PRESS/SSY=0.449). Copyright 2010 Elsevier B.V. All rights reserved.
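The paired statistics reported in this abstract are linked by the standard relation Q² = 1 - PRESS/SSY, where PRESS is the sum of squared prediction errors and SSY the sum of squared deviations of the observations from their mean. The sketch below computes both on invented numbers and checks that the reported pairs add up to one. The function name and the toy data are assumptions, and using the mean of the validation observations as the reference for SSY is one common convention that the abstract does not confirm.

```python
import numpy as np


def q2_statistic(y_obs, y_pred):
    """Return (Q2, PRESS/SSY) for observed values and out-of-sample predictions."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    press = np.sum((y_obs - y_pred) ** 2)
    # Assumption: SSY is taken around the mean of the observations supplied.
    ssy = np.sum((y_obs - y_obs.mean()) ** 2)
    ratio = float(press / ssy)
    return 1.0 - ratio, ratio


# Toy observed and predicted concentrations, not from the study.
q2_toy, ratio_toy = q2_statistic([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(q2_toy, ratio_toy)

# Reported (statistic, PRESS/SSY) pairs from the abstract.
reported = [(0.817, 0.183), (0.813, 0.187), (0.679, 0.321), (0.551, 0.449)]
pairs_consistent = all(abs(q + r - 1.0) < 1e-9 for q, r in reported)
print(pairs_consistent)
```

All four reported pairs satisfy the relation, which suggests the abstract's PRESS/SSY values were derived in this way.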
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Xiaoli; Shi, Chunhai; Yu, Chang Yeon; ...
2017-05-19
Leaf water content is one of the most common physiological parameters limiting efficiency of photosynthesis and biomass productivity in plants including Miscanthus. Therefore, it is of great significance to determine or predict the water content quickly and non-destructively. In this study, we explored the relationship between leaf water content and diffuse reflectance spectra in Miscanthus. Three multivariate calibrations including partial least squares (PLS), least squares support vector machine regression (LSSVR), and radial basis function (RBF) neural network (NN) were developed for the models of leaf water content determination. The non-linear models including RBF_LSSVR and RBF_NN showed higher accuracy than the PLS and Lin_LSSVR models. Moreover, 75 sensitive wavelengths were identified to be closely associated with the leaf water content in Miscanthus. The RBF_LSSVR and RBF_NN models for predicting leaf water content, based on 75 characteristic wavelengths, obtained the high determination coefficients of 0.9838 and 0.9899, respectively. The results indicated the non-linear models were more accurate than the linear models using both wavelength intervals. These results demonstrated that visible and near-infrared (VIS/NIR) spectroscopy combined with RBF_LSSVR or RBF_NN is a useful, non-destructive tool for determinations of the leaf water content in Miscanthus, and thus very helpful for development of drought-resistant varieties in Miscanthus.
Wisniowski, Brendan; Barnes, Mary; Jenkins, Jason; Boyne, Nicholas; Kruger, Allan; Walker, Philip J
2011-09-01
Endovascular abdominal aortic aneurysm (AAA) repair (EVAR) has been associated with lower operative mortality and morbidity than open surgery but comparable long-term mortality and higher delayed complication and reintervention rates. Attention has therefore been directed to identifying preoperative and operative variables that influence outcomes after EVAR. Risk-prediction models, such as the EVAR Risk Assessment (ERA) model, have also been developed to help surgeons plan EVAR procedures. The aims of this study were (1) to describe outcomes of elective EVAR at the Royal Brisbane and Women's Hospital (RBWH), (2) to identify preoperative and operative variables predictive of outcomes after EVAR, and (3) to externally validate the ERA model. All elective EVAR procedures at the RBWH before July 1, 2009, were reviewed. Descriptive analyses were performed to determine the outcomes. Univariate and multivariate analyses were performed to identify preoperative and operative variables predictive of outcomes after EVAR. Binomial logistic regression analyses were used to externally validate the ERA model. Before July 1, 2009, 197 patients (172 men), who were a mean age of 72.8 years, underwent elective EVAR at the RBWH. Operative mortality was 1.0%. Survival was 81.1% at 3 years and 63.2% at 5 years. Multivariate analysis showed predictors of survival were age (P = .0126), American Society of Anesthesiologists (ASA) score (P = .0180), and chronic obstructive pulmonary disease (P = .0348) at 3 years and age (P = .0103), ASA score (P = .0006), renal failure (P = .0048), and serum creatinine (P = .0022) at 5 years. Aortic branch vessel score was predictive of initial (30-day) type II endoleak (P = .0015). AAA tortuosity was predictive of midterm type I endoleak (P = .0251). Female sex was associated with lower rates of initial clinical success (P = .0406). 
The ERA model fitted RBWH data well for early death (C statistic = .906), 3-year survival (C statistic = .735), 5-year survival (C statistic = .800), and initial type I endoleak (C statistic = .850). The outcomes of elective EVAR at the RBWH are broadly consistent with those of a nationwide Australian audit and recent randomized trials. Age and ASA score are independent predictors of midterm survival after elective EVAR. The ERA model predicts mortality-related outcomes and initial type I endoleak well for RBWH elective EVAR patients. Copyright © 2011 Society for Vascular Surgery. All rights reserved.
Fu, Xia; Liang, Xinling; Song, Li; Huang, Huigen; Wang, Jing; Chen, Yuanhan; Zhang, Li; Quan, Zilin; Shi, Wei
2014-04-01
To develop a predictive model for circuit clotting in patients with continuous renal replacement therapy (CRRT). A total of 425 cases were selected. 302 cases were used to develop a predictive model of extracorporeal circuit life span during CRRT without citrate anticoagulation in 24 h, and 123 cases were used to validate the model. The prediction formula was developed using multivariate Cox proportional-hazards regression analysis, from which a risk score was assigned. The mean survival time of the circuit was 15.0 ± 1.3 h, and the rate of circuit clotting was 66.6 % during 24 h of CRRT. Five significant variables were assigned a predicting score according to the regression coefficient: insufficient blood flow, no anticoagulation, hematocrit ≥0.37, lactic acid of arterial blood gas analysis ≤3 mmol/L and APTT < 44.2 s. The Hosmer-Lemeshow test showed no significant difference between the predicted and actual circuit clotting (R (2) = 0.232; P = 0.301). A risk score that includes the five above-mentioned variables can be used to predict the likelihood of extracorporeal circuit clotting in patients undergoing CRRT.
A probabilistic framework to infer brain functional connectivity from anatomical connections.
Deligianni, Fani; Varoquaux, Gael; Thirion, Bertrand; Robinson, Emma; Sharp, David J; Edwards, A David; Rueckert, Daniel
2011-01-01
We present a novel probabilistic framework to learn across several subjects a mapping from brain anatomical connectivity to functional connectivity, i.e. the covariance structure of brain activity. This prediction problem must be formulated as a structured-output learning task, as the predicted parameters are strongly correlated. We introduce a model selection framework based on cross-validation with a parametrization-independent loss function suitable to the manifold of covariance matrices. Our model is based on constraining the conditional independence structure of functional activity by the anatomical connectivity. Subsequently, we learn a linear predictor of a stationary multivariate autoregressive model. This natural parameterization of functional connectivity also enforces the positive-definiteness of the predicted covariance and thus matches the structure of the output space. Our results show that functional connectivity can be explained by anatomical connectivity on a rigorous statistical basis, and that a proper model of functional connectivity is essential to assess this link.
Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman
2011-01-01
This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
[Modeling in value-based medicine].
Neubauer, A S; Hirneiss, C; Kampik, A
2010-03-01
Modeling plays an important role in value-based medicine (VBM). It allows decision support by predicting potential clinical and economic consequences, frequently combining different sources of evidence. Based on relevant publications and examples focusing on ophthalmology, the key economic modeling methods are explained and definitions are given. The most frequently applied model types are decision trees, Markov models, and discrete event simulation (DES) models. Besides verifying internal validity, model validation includes comparison with other models (external validity) and, ideally, validation of the model's predictive properties. The uncertainty inherent in any modeling should be clearly stated. This is true for economic modeling in VBM as well as when using disease risk models to support clinical decisions. In economic modeling, uni- and multivariate sensitivity analyses are usually applied; the key concepts here are tornado plots and cost-effectiveness acceptability curves. Despite this uncertainty, modeling helps to make better informed decisions than would be possible without the additional information.
Harmsen, Wouter J; Ribbers, Gerard M; Slaman, Jorrit; Heijenbrok-Kal, Majanka H; Khajeh, Ladbon; van Kooten, Fop; Neggers, Sebastiaan J C M M; van den Berg-Emons, Rita J
2017-05-01
Peak oxygen uptake (VO2peak) established during progressive cardiopulmonary exercise testing (CPET) is the "gold-standard" for cardiorespiratory fitness. However, CPET measurements may be limited in patients with aneurysmal subarachnoid hemorrhage (a-SAH) by disease-related complaints, such as cardiovascular health-risks or anxiety. Furthermore, CPET with gas-exchange analyses require specialized knowledge and infrastructure with limited availability in most rehabilitation facilities. To determine whether an easy-to-administer six-minute walk test (6MWT) is a valid clinical alternative to progressive CPET in order to predict VO2peak in individuals with a-SAH. Twenty-seven patients performed the 6MWT and CPET with gas-exchange analyses on a cycle ergometer. Univariate and multivariate regression models were made to investigate the predictability of VO2peak from the six-minute walk distance (6MWD). Univariate regression showed that the 6MWD was strongly related to VO2peak (r = 0.75, p < 0.001), with an explained variance of 56% and a prediction error of 4.12 ml/kg/min, representing 18% of mean VO2peak. Adding age and sex to an extended multivariate regression model improved this relationship (r = 0.82, p < 0.001), with an explained variance of 67% and a prediction error of 3.67 ml/kg/min corresponding to 16% of mean VO2peak. The 6MWT is an easy-to-administer submaximal exercise test that can be selected to estimate cardiorespiratory fitness at an aggregated level, in groups of patients with a-SAH, which may help to evaluate interventions in a clinical or research setting. However, the relatively large prediction error does not allow for an accurate prediction in individual patients.
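The figures in this abstract can be cross-checked with two lines of arithmetic: the explained variance is the square of the correlation coefficient, and dividing each prediction error by its stated share of the mean gives the implied mean VO2peak. The short sketch below does this. The helper names are illustrative assumptions, and because the percentages in the abstract are rounded, the implied means are approximate.

```python
def explained_variance(r):
    """Share of variance explained by a correlation coefficient r."""
    return r ** 2


def implied_mean(prediction_error, relative_error):
    """Mean implied by an absolute error and its share of the mean."""
    return prediction_error / relative_error


# (label, r, prediction error in ml/kg/min, error as a share of the mean),
# all four numbers taken from the abstract
models = [
    ("6MWD only", 0.75, 4.12, 0.18),
    ("6MWD + age + sex", 0.82, 3.67, 0.16),
]

for label, r, error, share in models:
    print(
        f"{label}: r^2 = {explained_variance(r):.2f}, "
        f"implied mean VO2peak = {implied_mean(error, share):.1f} ml/kg/min"
    )
```

Both models imply a mean VO2peak of roughly 23 ml/kg/min, so the reported errors and percentages are mutually consistent.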
Drought: A comprehensive R package for drought monitoring, prediction and analysis
NASA Astrophysics Data System (ADS)
Hao, Zengchao; Hao, Fanghua; Singh, Vijay P.; Cheng, Hongguang
2015-04-01
Drought may impose serious challenges to human societies and ecosystems. Due to complicated causing effects and wide impacts, a universally accepted definition of drought does not exist. The drought indicator is commonly used to characterize drought properties such as duration or severity. Various drought indicators have been developed in the past few decades for the monitoring of a certain aspect of drought condition along with the development of multivariate drought indices for drought characterizations from multiple sources or hydro-climatic variables. Reliable drought prediction with suitable drought indicators is critical to the drought preparedness plan to reduce potential drought impacts. In addition, drought analysis to quantify the risk of drought properties would provide useful information for operation drought managements. The drought monitoring, prediction and risk analysis are important components in drought modeling and assessments. In this study, a comprehensive R package "drought" is developed to aid the drought monitoring, prediction and risk analysis (available from R-Forge and CRAN soon). The computation of a suite of univariate and multivariate drought indices that integrate drought information from various sources such as precipitation, temperature, soil moisture, and runoff is available in the drought monitoring component in the package. The drought prediction/forecasting component consists of statistical drought predictions to enhance the drought early warning for decision makings. Analysis of drought properties such as duration and severity is also provided in this package for drought risk assessments. Based on this package, a drought monitoring and prediction/forecasting system is under development as a decision supporting tool. The package will be provided freely to the public to aid the drought modeling and assessment for researchers and practitioners.
Slaughter, Laurel A; Bonfante-Mejia, Eliana; Hintz, Susan R; Dvorchik, Igor; Parikh, Nehal A
2016-01-01
Extremely-low-birth-weight (ELBW; ≤1,000 g) infants are at high risk for neurodevelopmental impairments. Conventional brain MRI at term-equivalent age is increasingly used for prediction of outcomes. However, optimal prediction models remain to be determined, especially for cognitive outcomes. The aim was to evaluate the accuracy of a data-driven MRI scoring system to predict neurodevelopmental impairments. 122 ELBW infants had a brain MRI performed at term-equivalent age. Conventional MRI findings were scored with a standardized algorithm and tested using a multivariable regression model to predict neurodevelopmental impairment, defined as one or more of the following at 18-24 months' corrected age: cerebral palsy, bilateral blindness, bilateral deafness requiring amplification, and/or cognitive/language delay. Results were compared with a commonly cited scoring system. In multivariable analyses, only moderate-to-severe gyral maturational delay was a significant predictor of overall neurodevelopmental impairment (OR: 12.6, 95% CI: 2.6, 62.0; p < 0.001). Moderate-to-severe gyral maturational delay also predicted cognitive delay, cognitive delay/death, and neurodevelopmental impairment/death. Diffuse cystic abnormality was a significant predictor of cerebral palsy (OR: 33.6, 95% CI: 4.9, 229.7; p < 0.001). These predictors exhibited high specificity (range: 94-99%) but low sensitivity (30-67%) for the above outcomes. White or gray matter scores, determined using a commonly cited scoring system, did not show significant association with neurodevelopmental impairment. In our cohort, conventional MRI at term-equivalent age exhibited high specificity in predicting neurodevelopmental outcomes. However, sensitivity was suboptimal, suggesting additional clinical factors and biomarkers are needed to enable accurate prognostication. © 2016 S. Karger AG, Basel.
Pearson, Amy C. S.; Subramanian, Arun; Schroeder, Darrell R.; Findlay, James Y.
2017-01-01
Background The surgical Apgar score (SAS) is a 10-point scale using the lowest heart rate, lowest mean arterial pressure, and estimated blood loss (EBL) during surgery to predict postoperative outcomes. The SAS has not yet been validated in liver transplantation patients, because typical blood loss usually exceeds the highest EBL category. Our primary aim was to develop a modified SAS for liver transplant (SAS-LT) by replacing the EBL parameter with volume of red cells transfused. We hypothesized that the SAS-LT would predict death or severe complication within 30 days of transplant with similar accuracy to current scoring systems. Methods A retrospective cohort of consecutive liver transplantations from July 2007 to November 2013 was used to develop the SAS-LT. The predictive ability of SAS-LT for early postoperative outcomes was compared with Model for End-stage Liver Disease, Sequential Organ Failure Assessment, and Acute Physiology and Chronic Health Evaluation III scores using multivariable logistic regression and receiver operating characteristic analysis. Results Of 628 transplants, death or serious perioperative morbidity occurred in 105 (16.7%). The SAS-LT (receiver operating characteristic area under the curve [AUC], 0.57) had similar predictive ability to Acute Physiology and Chronic Health Evaluation III, model for end-stage liver disease, and Sequential Organ Failure Assessment scores (0.57, 0.56, and 0.61, respectively). Seventy-nine (12.6%) patients were discharged from the ICU in 24 hours or less. These patients’ SAS-LT scores were significantly higher than those with a longer stay (7.0 vs 6.2, P < 0.01). The AUC on multivariable modeling remained predictive of early ICU discharge (AUC, 0.67). Conclusions The SAS-LT utilized simple intraoperative metrics to predict early morbidity and mortality after liver transplant with similar accuracy to other scoring systems at an earlier postoperative time point. PMID:29184910
Wang, S; Sun, Z; Wang, S
1996-11-01
A prospective follow-up study of 539 advanced gastric carcinoma patients after resection was undertaken between 1 January 1980 and 31 December 1989, with a follow-up rate of 95.36%. A multivariate analysis of possible factors influencing survival of these patients was performed, and predictive models of their survival rates were established with the Cox proportional hazards model. The results showed that the major significant prognostic factors influencing survival of these patients were rate and station of lymph node metastases, type of operation, hepatic metastases, size of tumor, age and location of tumor. The most important factor was the rate of lymph node metastases. The predicting value (PV) of each patient was calculated from the regression coefficients, all patients were then divided into five risk groups according to PV, and predictive models of survival rates after resection were established for each group. The goodness of fit of the estimated models was checked with fitted curves and residual plots, and the estimated models agreed with the actual situation. The results suggest that patients with advanced gastric cancer without lymph node metastases and hepatic metastases had a better prognosis after resection, and that their survival probability may be predicted with the predictive model of survival rates.
A New Multivariate Approach for Prognostics Based on Extreme Learning Machine and Fuzzy Clustering.
Javed, Kamran; Gouriveau, Rafael; Zerhouni, Noureddine
2015-12-01
Prognostics is a core process of prognostics and health management (PHM) discipline, that estimates the remaining useful life (RUL) of a degrading machinery to optimize its service delivery potential. However, machinery operates in a dynamic environment and the acquired condition monitoring data are usually noisy and subject to a high level of uncertainty/unpredictability, which complicates prognostics. The complexity further increases, when there is absence of prior knowledge about ground truth (or failure definition). For such issues, data-driven prognostics can be a valuable solution without deep understanding of system physics. This paper contributes a new data-driven prognostics approach namely, an "enhanced multivariate degradation modeling," which enables modeling degrading states of machinery without assuming a homogeneous pattern. In brief, a predictability scheme is introduced to reduce the dimensionality of the data. Following that, the proposed prognostics model is achieved by integrating two new algorithms namely, the summation wavelet-extreme learning machine and subtractive-maximum entropy fuzzy clustering to show evolution of machine degradation by simultaneous predictions and discrete state estimation. The prognostics model is equipped with a dynamic failure threshold assignment procedure to estimate RUL in a realistic manner. To validate the proposition, a case study is performed on turbofan engines data from PHM challenge 2008 (NASA), and results are compared with recent publications.
Neural network-based nonlinear model predictive control vs. linear quadratic gaussian control
Cho, C.; Vance, R.; Mardi, N.; Qian, Z.; Prisbrey, K.
1997-01-01
One problem with the application of neural networks to the multivariable control of mineral and extractive processes is determining whether and how to use them. The objective of this investigation was to compare neural network control to more conventional strategies and to determine if there are any advantages in using neural network control in terms of set-point tracking, rise time, settling time, disturbance rejection and other criteria. The procedure involved developing neural network controllers using both historical plant data and simulation models. Various control patterns were tried, including both inverse and direct neural network plant models. These were compared to state space controllers that are, by nature, linear. For grinding and leaching circuits, a nonlinear neural network-based model predictive control strategy was superior to a state space-based linear quadratic gaussian controller. The investigation pointed out the importance of incorporating state space into neural networks by making them recurrent, i.e., feeding certain output state variables into input nodes in the neural network. It was concluded that neural network controllers can have better disturbance rejection, set-point tracking, rise time, settling time and lower set-point overshoot, and it was also concluded that neural network controllers can be more reliable and easy to implement in complex, multivariable plants.
Roehl, Edwin A.; Conrads, Paul
2010-01-01
This is the second of two papers that describe how data mining can aid natural-resource managers with the difficult problem of controlling the interactions between hydrologic and man-made systems. Data mining is a new science that assists scientists in converting large databases into knowledge, and is uniquely able to leverage the large amounts of real-time, multivariate data now being collected for hydrologic systems. Part 1 gives a high-level overview of data mining, and describes several applications that have addressed major water resource issues in South Carolina. This Part 2 paper describes how various data mining methods are integrated to produce predictive models for controlling surface- and groundwater hydraulics and quality. The methods include:
- signal processing to remove noise and decompose complex signals into simpler components;
- time series clustering that optimally groups hundreds of signals into "classes" that behave similarly for data reduction and (or) divide-and-conquer problem solving;
- classification which optimally matches new data to behavioral classes;
- artificial neural networks which optimally fit multivariate data to create predictive models;
- model response surface visualization that greatly aids in understanding data and physical processes; and
- decision support systems that integrate data, models, and graphics into a single package that is easy to use.
Longobardi, F; Ventrella, A; Bianco, A; Catucci, L; Cafagna, I; Gallo, V; Mastrorilli, P; Agostiano, A
2013-12-01
In this study, non-targeted (1)H NMR fingerprinting was used in combination with multivariate statistical techniques for the classification of Italian sweet cherries based on their different geographical origins (Emilia Romagna and Puglia). As classification techniques, Soft Independent Modelling of Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Linear Discriminant Analysis (LDA) were carried out and the results were compared. For LDA, before performing a refined selection of the number/combination of variables, two different strategies for a preliminary reduction of the variable number were tested. The best average recognition and CV prediction abilities (both 100.0%) were obtained for all the LDA models, although PLS-DA also showed remarkable performances (94.6%). All the statistical models were validated by observing the prediction abilities with respect to an external set of cherry samples. The best result (94.9%) was obtained with LDA by performing a best subset selection procedure on a set of 30 principal components previously selected by a stepwise decorrelation. The metabolites that mostly contributed to the classification performances of such LDA model, were found to be malate, glucose, fructose, glutamine and succinate. Copyright © 2013 Elsevier Ltd. All rights reserved.
Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi
2018-03-13
Synchronous fluorescence spectra combined with multivariate analysis were used to predict flavonoid content in green tea rapidly and nondestructively. This paper presents a new and efficient spectral interval selection method called clustering-based partial least squares (CL-PLS), which selects informative wavelengths by combining clustering with partial least squares (PLS) regression to improve model performance on synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained, the k-means and Kohonen self-organizing map clustering algorithms were used to group the full spectra into several clusters, and a sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. The correlation coefficient (R) was used to evaluate the prediction performance of the PLS models. In addition, variable influence on projection partial least squares (VIP-PLS), selectivity ratio partial least squares (SR-PLS), interval partial least squares (iPLS) models and a full-spectrum PLS model were investigated and the results were compared. The results showed that CL-PLS gave the best result for flavonoid prediction using synchronous fluorescence spectra.
Tran, Alexandre; Matar, Maher; Steyerberg, Ewout W; Lampron, Jacinthe; Taljaard, Monica; Vaillancourt, Christian
2017-04-13
Hemorrhage is a major cause of early mortality following a traumatic injury. The progression and consequences of significant blood loss occur quickly as death from hemorrhagic shock or exsanguination often occurs within the first few hours. The mainstay of treatment therefore involves early identification of patients at risk for hemorrhagic shock in order to provide blood products and control of the bleeding source if necessary. The intended scope of this review is to identify and assess combinations of predictors informing therapeutic decision-making for clinicians during the initial trauma assessment. The primary objective of this systematic review is to identify and critically assess any existing multivariable models predicting significant traumatic hemorrhage that requires intervention, defined as a composite outcome comprising massive transfusion, surgery for hemostasis, or angiography with embolization for the purpose of external validation or updating in other study populations. If no suitable existing multivariable models are identified, the secondary objective is to identify candidate predictors to inform the development of a new prediction rule. We will search the EMBASE and MEDLINE databases for all randomized controlled trials and prospective and retrospective cohort studies developing or validating predictors of intervention for traumatic hemorrhage in adult patients 16 years of age or older. Eligible predictors must be available to the clinician during the first hour of trauma resuscitation and may be clinical, lab-based, or imaging-based. Outcomes of interest include the need for surgical intervention, angiographic embolization, or massive transfusion within the first 24 h. Data extraction will be performed independently by two reviewers. Items for extraction will be based on the CHARMS checklist. We will evaluate any existing models for relevance, quality, and the potential for external validation and updating in other populations. 
Relevance will be described in terms of appropriateness of outcomes and predictors. Quality criteria will include variable selection strategies, adequacy of sample size, handling of missing data, validation techniques, and measures of model performance. This systematic review will describe the availability of multivariable prediction models and summarize evidence regarding predictors that can be used to identify the need for intervention in patients with traumatic hemorrhage. PROSPERO CRD42017054589.
Assessing the Validity of Air Force Selection and Training Strategies.
ERIC Educational Resources Information Center
Mumford, Michael D.; And Others
A study was undertaken to develop a system for predicting the impact of adjustments in aptitude requirements on outcomes (performance) in Air Force basic resident technical training. To accomplish this, a multivariate modeling approach was used. Initially, interviews were constructed within a variety of technical training programs to specify the…
ERIC Educational Resources Information Center
Heller, Monica L.; Cassady, Jerrell C.
2017-01-01
The current study explored the differential influences that behavioral learning strategies (i.e., cognitive-metacognitive, resource management), motivational profiles, and academic anxiety appraisals have on college-level learners in two unique learning contexts. Using multivariate analysis of variance and discriminant analysis, the study first…
A Cognitive Analysis of Credit Card Acquisition and College Student Financial Development.
ERIC Educational Resources Information Center
Kidwell, Blair; Turrisi, Robert
2000-01-01
Examines cognitions relevant to credit card decision making in college-aged participants (N=304). Assesses measures of beliefs, attitudes, and behavioral alternatives toward acquiring a credit card. Identifies a multivariate model predicting college student financial development of the attitudes and behavioral tendencies of acquiring a new card.…
Effects of Self-Perceptions on Self-Learning among Teacher Education Students
ERIC Educational Resources Information Center
Liu, Shih-Hsiung
2015-01-01
This study evaluates the multivariate hypothesized model that predicts the significance of, and relationships among, various self-perception factors for being a qualified teacher and their direct and mediated effects on self-learning activities among teacher education students. A total of 248 teacher education students enrolled at an education…
USDA-ARS's Scientific Manuscript database
Free-living measurements of 24-h total energy expenditure (TEE) and activity energy expenditure (AEE) are required to better understand the metabolic, physiological, behavioral, and environmental factors affecting energy balance and contributing to the global epidemic of childhood obesity. The spec...
Hendrick, C. Emily; Cohen, Alison K.; Deardorff, Julianna
2015-01-01
BACKGROUND Lifetime educational attainment is an important predictor of health and well-being for women in the United States. In the current study, we examine the roles of socio-cultural factors in youth and an understudied biological life event, pubertal timing, in predicting women’s lifetime educational attainment. METHODS Using data from the National Longitudinal Survey of Youth 1997 cohort (N = 3889), we conducted sequential multivariate linear regression analyses to investigate the influences of macro-level and family-level socio-cultural contextual factors in youth (region of country, urbanicity, race/ethnicity, year of birth, household composition, mother’s education, mother’s age at first birth) and early menarche, a marker of early pubertal development, on women’s educational attainment after age 24. RESULTS Pubertal timing and all socio-cultural factors in youth, other than year of birth, predicted women’s lifetime educational attainment in bivariate models. Family factors had the strongest associations. When family factors were added to multivariate models, geographic region in youth and pubertal timing were no longer significant. CONCLUSION Our findings provide additional evidence that family factors should be considered when developing comprehensive and inclusive interventions in childhood and adolescence to promote lifetime educational attainment among girls. PMID:26830508
Le Strat, Yann
2017-01-01
The objective of this paper is to evaluate a panel of statistical algorithms for temporal outbreak detection. Based on a large dataset of simulated weekly surveillance time series, we performed a systematic assessment of 21 statistical algorithms, 19 implemented in the R package surveillance and two other methods. We estimated false positive rate (FPR), probability of detection (POD), probability of detection during the first week, sensitivity, specificity, negative and positive predictive values and F1-measure for each detection method. Then, to identify the factors associated with these performance measures, we ran multivariate Poisson regression models adjusted for the characteristics of the simulated time series (trend, seasonality, dispersion, outbreak sizes, etc.). The FPR ranged from 0.7% to 59.9% and the POD from 43.3% to 88.7%. Some methods had a very high specificity, up to 99.4%, but a low sensitivity. Methods with a high sensitivity (up to 79.5%) had a low specificity. All methods had a high negative predictive value, over 94%, while positive predictive values ranged from 6.5% to 68.4%. Multivariate Poisson regression models showed that performance measures were strongly influenced by the characteristics of time series. Past or current outbreak size and duration strongly influenced detection performances. PMID:28715489
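The performance measures named in the abstract above can all be derived from a 2x2 table of detection outcomes. The following is a minimal sketch under textbook definitions; the function name and the counts are hypothetical, and the paper's own definitions (for example, probability of detection counted per outbreak rather than per week) may differ.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute standard performance measures from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives. All denominators are assumed
    to be non-zero (an assumption of this sketch).
    """
    sensitivity = tp / (tp + fn)      # share of true events that raised an alarm
    specificity = tn / (tn + fp)      # share of non-events that stayed silent
    fpr = fp / (fp + tn)              # false positive rate = 1 - specificity
    ppv = tp / (tp + fp)              # positive predictive value
    npv = tn / (tn + fn)              # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fpr": fpr,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not taken from the study).
example = detection_metrics(tp=40, fp=10, tn=940, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With rare events, a method can combine a high negative predictive value with a modest positive predictive value, which is the pattern the abstract reports.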
NASA Astrophysics Data System (ADS)
Ayoko, Godwin A.; Singh, Kirpal; Balerea, Steven; Kokot, Serge
2007-03-01
Summary Physico-chemical properties of surface water and groundwater samples from some developing countries have been subjected to multivariate analyses by the non-parametric multi-criteria decision-making methods, PROMETHEE and GAIA. Complete ranking information necessary to select one source of water in preference to all others was obtained, and this enabled relationships between the physico-chemical properties and water quality to be assessed. Thus, the ranking of the quality of the water bodies was found to be strongly dependent on the total dissolved solid, phosphate, sulfate, ammonia-nitrogen, calcium, iron, chloride, magnesium, zinc, nitrate and fluoride contents of the waters. However, potassium, manganese and zinc composition showed the least influence in differentiating the water bodies. To model and predict the water quality influencing parameters, partial least squares analyses were carried out on a matrix made up of the results of water quality assessment studies carried out in Nigeria, Papua New Guinea, Egypt, Thailand and India/Pakistan. The results showed that the total dissolved solid, calcium, sulfate, sodium and chloride contents can be used to predict a wide range of physico-chemical characteristics of water. The potential implications of these observations on the financial and opportunity costs associated with elaborate water quality monitoring are discussed.
Steiner, John F.; Ho, P. Michael; Beaty, Brenda L.; Dickinson, L. Miriam; Hanratty, Rebecca; Zeng, Chan; Tavel, Heather M.; Havranek, Edward P.; Davidson, Arthur J.; Magid, David J.; Estacio, Raymond O.
2009-01-01
Background Although many studies have identified patient characteristics or chronic diseases associated with medication adherence, the clinical utility of such predictors has rarely been assessed. We attempted to develop clinical prediction rules for adherence with antihypertensive medications in two health care delivery systems. Methods and Results Retrospective cohort studies of hypertension registries in an inner-city health care delivery system (N = 17176) and a health maintenance organization (N = 94297) in Denver, Colorado. Adherence was defined by acquisition of 80% or more of antihypertensive medications. A multivariable model in the inner-city system found that adherent patients (36.3% of the total) were more likely than non-adherent patients to be older, white, married, and acculturated in US society, to have diabetes or cerebrovascular disease, not to abuse alcohol or controlled substances, and to be prescribed less than three antihypertensive medications. Although statistically significant, all multivariate odds ratios were 1.7 or less, and the model did not accurately discriminate adherent from non-adherent patients (C-statistic = 0.606). In the health maintenance organization, where 72.1% of patients were adherent, significant but weak associations existed between adherence and older age, white race, the lack of alcohol abuse, and fewer antihypertensive medications. The multivariate model again failed to accurately discriminate adherent from non-adherent individuals (C-statistic = 0.576). Conclusions Although certain socio-demographic characteristics or clinical diagnoses are statistically associated with adherence to refills of antihypertensive medications, a combination of these characteristics is not sufficiently accurate to allow clinicians to predict whether their patients will be adherent with treatment. PMID:20031876
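The C-statistic reported in the abstract above is the probability that a randomly chosen adherent patient receives a higher predicted score than a randomly chosen non-adherent patient, so a value near 0.5 means discrimination little better than chance. The following is a minimal sketch that computes it by direct pairwise comparison; the function name and the example scores are hypothetical, and this is not the authors' code.

```python
def c_statistic(scores, labels):
    """Concordance (C) statistic for a binary outcome.

    scores: predicted risk or probability for each subject.
    labels: 1 for subjects with the outcome, 0 for those without.
    Ties between a positive and a negative score count as half-concordant.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one subject in each outcome group")

    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Hypothetical scores, for illustration only.
print(c_statistic([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

The pairwise loop is quadratic in sample size; for registries of the size described, a rank-based formulation would be the practical choice.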
Stekolnikov, Alexandr A; Klimov, Pavel B
2010-09-01
We revise chiggers belonging to the minuta-species group (genus Neotrombicula Hirst, 1925) from the Palaearctic using size-free multivariate morphometrics. This approach allowed us to resolve several diagnostic problems. We show that the widely distributed Neotrombicula scrupulosa Kudryashova, 1993 forms three spatially and ecologically isolated groups different from each other in size or shape (morphometric property) only: specimens from the Caucasus are distinct from those from Asia in shape, whereas the Asian specimens from plains and mountains are different from each other in size. We developed a multivariate classification model to separate three closely related species: N. scrupulosa, N. lubrica Kudryashova, 1993 and N. minuta Schluger, 1966. This model is based on five shape variables selected from an initial 17 variables by a best subset analysis using a custom size-correction subroutine. The variable selection procedure slightly improved the predictive power of the model, suggesting that it not only removed redundancy but also reduced 'noise' in the dataset. The overall classification accuracy of this model is 96.2, 96.2 and 95.5%, as estimated by internal validation, external validation and jackknife statistics, respectively. Our analyses resulted in one new synonymy: N. dimidiata Stekolnikov, 1995 is considered to be a synonym of N. lubrica. Both N. scrupulosa and N. lubrica are recorded from new localities. A key to species of the minuta-group incorporating results from our multivariate analyses is presented.
Mercuri, A; Pagliari, M; Baxevanis, F; Fares, R; Fotaki, N
2017-02-25
In this study the selection of in vivo predictive in vitro dissolution experimental set-ups using a multivariate analysis approach, in line with the Quality by Design (QbD) principles, is explored. The dissolution variables selected using a design of experiments (DoE) were the dissolution apparatus [USP1 apparatus (basket) and USP2 apparatus (paddle)], the rotational speed of the basket/or paddle, the operator conditions (dissolution apparatus brand and operator), the volume, the pH, and the ethanol content of the dissolution medium. The dissolution profiles of two nifedipine capsules (poorly soluble compound), under conditions mimicking the intake of the capsules with i. water, ii. orange juice and iii. an alcoholic drink (orange juice and ethanol) were analysed using multiple linear regression (MLR). Optimised dissolution set-ups, generated based on the mathematical model obtained via MLR, were used to build predicted in vitro-in vivo correlations (IVIVC). IVIVC could be achieved using physiologically relevant in vitro conditions mimicking the intake of the capsules with an alcoholic drink (orange juice and ethanol). The multivariate analysis revealed that the concentration of ethanol used in the in vitro dissolution experiments (47% v/v) can be lowered to less than 20% v/v, reflecting recently found physiological conditions. Copyright © 2016 Elsevier B.V. All rights reserved.
Preoperative nomogram to predict the likelihood of complications after radical nephroureterectomy.
Raman, Jay D; Lin, Yu-Kuan; Shariat, Shahrokh F; Krabbe, Laura-Maria; Margulis, Vitaly; Arnouk, Alex; Lallas, Costas D; Trabulsi, Edouard J; Drouin, Sarah J; Rouprêt, Morgan; Bozzini, Gregory; Colin, Pierre; Peyronnet, Benoit; Bensalah, Karim; Bailey, Kari; Canes, David; Klatte, Tobias
2017-02-01
To construct a nomogram based on preoperative variables to better predict the likelihood of complications occurring within 30 days of radical nephroureterectomy (RNU). The charts of 731 patients undergoing RNU at eight academic medical centres between 2002 and 2014 were reviewed. Preoperative clinical, demographic and comorbidity indices were collected. Complications occurring within 30 days of surgery were graded using the modified Clavien-Dindo scale. Multivariate logistic regression determined the association between preoperative variables and post-RNU complications. A nomogram was created from the reduced multivariate model with internal validation using the bootstrapping technique with 200 repetitions. A total of 408 men and 323 women with a median age of 70 years and a body mass index of 27 kg/m² were included. A total of 75% of the cohort was white, 18% had an Eastern Cooperative Oncology Group (ECOG) performance status ≥2, 20% had a Charlson comorbidity index (CCI) score >5 and 50% had baseline chronic kidney disease (CKD) ≥ stage III. Overall, 279 patients (38%) experienced a complication, including 61 events (22%) with Clavien grade ≥ III. A multivariate model identified five variables associated with complications, including patient age, race, ECOG performance status, CKD stage and CCI score. A preoperative nomogram incorporating these risk factors was constructed with an area under curve of 72.2%. Using standard preoperative variables from this multi-institutional RNU experience, we constructed and validated a nomogram for predicting peri-operative complications after RNU. Such information may permit more accurate risk stratification on an individual case basis before major surgery. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.
Chalcraft, Kenneth R; Lee, Richard; Mills, Casandra; Britz-McKibbin, Philip
2009-04-01
A major obstacle in metabolomics remains the identification and quantification of a large fraction of unknown metabolites in complex biological samples when purified standards are unavailable. Herein we introduce a multivariate strategy for de novo quantification of cationic/zwitterionic metabolites using capillary electrophoresis-electrospray ionization-mass spectrometry (CE-ESI-MS) based on fundamental molecular, thermodynamic, and electrokinetic properties of an ion. Multivariate calibration was used to derive a quantitative relationship between the measured relative response factor (RRF) of polar metabolites with respect to four physicochemical properties associated with ion evaporation in ESI-MS, namely, molecular volume (MV), octanol-water distribution coefficient (log D), absolute mobility (mu(o)), and effective charge (z(eff)). Our studies revealed that a limited set of intrinsic solute properties can be used to predict the RRF of various classes of metabolites (e.g., amino acids, amines, peptides, acylcarnitines, nucleosides, etc.) with reasonable accuracy and robustness provided that an appropriate training set is validated and ion responses are normalized to an internal standard(s). The applicability of the multivariate model to quantify micromolar levels of metabolites spiked in red blood cell (RBC) lysates was also examined by CE-ESI-MS without significant matrix effects caused by involatile salts and/or major co-ion interferences. This work demonstrates the feasibility for virtual quantification of low-abundance metabolites and their isomers in real-world samples using physicochemical properties estimated by computer modeling, while providing deeper insight into the wide disparity of solute responses in ESI-MS. New strategies for predicting ionization efficiency in silico allow for rapid and semiquantitative analysis of newly discovered biomarkers and/or drug metabolites in metabolomics research when chemical standards do not exist.
Prediction of concurrent endometrial carcinoma in women with endometrial hyperplasia.
Matsuo, Koji; Ramzan, Amin A; Gualtieri, Marc R; Mhawech-Fauceglia, Paulette; Machida, Hiroko; Moeini, Aida; Dancz, Christina E; Ueda, Yutaka; Roman, Lynda D
2015-11-01
Although a fraction of endometrial hyperplasia cases have concurrent endometrial carcinoma, patient characteristics associated with concurrent malignancy are not well described. The aim of our study was to identify predictive clinico-pathologic factors for concurrent endometrial carcinoma among patients with endometrial hyperplasia. A case-control study was conducted to compare endometrial hyperplasia in both preoperative endometrial biopsy and hysterectomy specimens (n=168) and endometrial carcinoma in hysterectomy specimen but endometrial hyperplasia in preoperative endometrial biopsy (n=43). Clinico-pathologic factors were examined to identify independent risk factors of concurrent endometrial carcinoma in a multivariate logistic regression model. The most common histologic subtype in preoperative endometrial biopsy was complex hyperplasia with atypia [CAH] (n=129) followed by complex hyperplasia without atypia (n=58) and simple hyperplasia with or without atypia (n=24). The majority of endometrial carcinomas were grade 1 (86.0%) and stage I (83.7%). In multivariate analysis, age 40-59 (odds ratio [OR] 3.07, p=0.021), age ≥60 (OR 6.65, p=0.005), BMI ≥35 kg/m² (OR 2.32, p=0.029), diabetes mellitus (OR 2.51, p=0.019), and CAH (OR 9.01, p=0.042) were independent predictors of concurrent endometrial carcinoma. The risk of concurrent endometrial carcinoma rose dramatically with increasing number of risk factors identified in multivariate model (none 0%, 1 risk factor 7.0%, 2 risk factors 17.6%, 3 risk factors 35.8%, and 4 risk factors 45.5%, p<0.001). Hormonal treatment was associated with decreased risk of concurrent endometrial cancer in those with ≥3 risk factors. Older age, obesity, diabetes mellitus, and CAH are predictive of concurrent endometrial carcinoma in endometrial hyperplasia patients. Copyright © 2015 Elsevier Inc. All rights reserved.
Toda, Hiroyuki; Inoue, Takeshi; Tsunoda, Tomoya; Nakai, Yukiei; Tanichi, Masaaki; Tanaka, Teppei; Hashimoto, Naoki; Nakato, Yasuya; Nakagawa, Shin; Kitaichi, Yuji; Mitsui, Nobuyuki; Boku, Shuken; Tanabe, Hajime; Nibuya, Masashi; Yoshino, Aihide; Kusumi, Ichiro
2015-01-01
Background Previous studies have shown the interaction between heredity and childhood stress or life events on the pathogenesis of a major depressive disorder (MDD). In this study, we tested our hypothesis that childhood abuse, affective temperaments, and adult stressful life events interact and influence the diagnosis of MDD. Patients and methods A total of 170 healthy controls and 98 MDD patients were studied using the following self-administered questionnaire surveys: the Patient Health Questionnaire-9 (PHQ-9), the Life Experiences Survey, the Temperament Evaluation of the Memphis, Pisa, Paris, and San Diego Autoquestionnaire, and the Child Abuse and Trauma Scale (CATS). The data were analyzed with univariate analysis, multivariable analysis, and structural equation modeling. Results The neglect scores of the CATS indirectly predicted the diagnosis of MDD through cyclothymic and anxious temperament scores of the Temperament Evaluation of the Memphis, Pisa, Paris, and San Diego Autoquestionnaire in the structural equation modeling. Two temperaments – cyclothymic and anxious – directly predicted the diagnosis of MDD. The validity of this result was supported by the results of the stepwise multivariate logistic regression analysis as follows: three factors – neglect, cyclothymic, and anxious temperaments – were significant predictors of MDD. Neglect and the total CATS scores were also predictors of remission vs treatment-resistance in MDD patients independently of depressive symptoms. Limitations The sample size was small for the comparison between the remission and treatment-resistant groups in MDD patients in multivariable analysis. Conclusion This study suggests that childhood abuse, especially neglect, indirectly predicted the diagnosis of MDD through increased affective temperaments. The important role as a mediator of affective temperaments in the effect of childhood abuse on MDD was suggested. PMID:26316754
A pilot study of NMR-based sensory prediction of roasted coffee bean extracts.
Wei, Feifei; Furihata, Kazuo; Miyakawa, Takuya; Tanokura, Masaru
2014-01-01
Nuclear magnetic resonance (NMR) spectroscopy can be considered a kind of "magnetic tongue" for the characterisation and prediction of the tastes of foods, since it provides a wealth of information in a nondestructive and nontargeted manner. In the present study, the chemical substances in roasted coffee bean extracts that could distinguish and predict the different sensations of coffee taste were identified by the combination of NMR-based metabolomics and human sensory test and the application of the multivariate projection method of orthogonal projection to latent structures (OPLS). In addition, the tastes of commercial coffee beans were successfully predicted based on their NMR metabolite profiles using our OPLS model, suggesting that NMR-based metabolomics accompanied with multiple statistical models is convenient, fast and accurate for the sensory evaluation of coffee. Copyright © 2013 Elsevier Ltd. All rights reserved.
Guzman, L; Ortega-Hrepich, C; Polyzos, N P; Anckaert, E; Verheyen, G; Coucke, W; Devroey, P; Tournaye, H; Smitz, J; De Vos, M
2013-05-01
Which baseline patient characteristics can help assisted reproductive technology practitioners to identify patients who are suitable for in-vitro maturation (IVM) treatment? In patients with polycystic ovary syndrome (PCOS) who undergo oocyte IVM in a non-hCG-triggered system, circulating anti-Müllerian hormone (AMH), antral follicle count (AFC) and total testosterone are independently related to the number of immature oocytes and hold promise as outcome predictors to guide the patient selection process for IVM. Patient selection criteria for IVM treatment have been described in normo-ovulatory patients, although patients with PCOS constitute the major target population for IVM. With this study, we assessed the independent predictive value of clinical and endocrine parameters that are related to oocyte yield in patients with PCOS undergoing IVM. Cohort study involving 124 consecutive patients with PCOS undergoing IVM whose data were prospectively collected. Enrolment took place between January 2010 and January 2012. Only data relating to the first IVM cycle of each patient were included. Patients with PCOS underwent oocyte retrieval for IVM after minimal gonadotrophin stimulation and no hCG trigger. Correlation coefficients were calculated to investigate which parameters are related to immature oocyte yield (patient's age, BMI, baseline hormonal profile and AMH, AFC). The independence of predictive parameters was tested using multivariate linear regression analysis. Finally, multivariate receiver operating characteristic (ROC) analyses for cumulus oocyte complexes (COC) yield were performed to assess the efficiency of the prediction model to select suitable candidates for IVM. 
Using multivariate regression analysis, circulating baseline AMH, AFC and baseline total testosterone serum concentration were incorporated into a model to predict the number of COC retrieved in an IVM cycle, with unstandardized coefficients [95% confidence interval (CI)] of 0.03 (0.02-0.03) (P < 0.001), 0.012 (0.008-0.017) (P < 0.001) and 0.37 (0.18-0.57) (P < 0.001), respectively. Logistic regression analysis shows that a prediction model based on AMH and AFC, with unstandardized coefficients (95% CI) of 0.148 (0.03-0.25) (P < 0.001) and 0.034 (-0.003-0.07) (P = 0.025), respectively, is a useful patient selection tool to predict the probability to yield at least eight COCs for IVM in patients with PCOS. In this population, patients with at least eight COC available for IVM have a statistically higher number of embryos of good morphological quality (2.9 ± 2.3; 0.9 ± 0.9; P < 0.001) and cumulative ongoing pregnancy rate [30.4% (24 out of 79); 11% (5 out of 45); P = 0.01] when compared with patients with less than eight COC. ROC curve analysis showed that this prediction model has an area under the curve of 0.7864 (95% CI = 0.6997-0.8732) for the prediction of oocyte yield in IVM. The proposed model has been constructed based on a genuine IVM system, i.e. no hCG trigger was given and none of the oocytes matured in vivo. However, other variables, such as needle type, aspiration technique and whether or not hCG-triggering is used, should be considered as confounding factors. The results of this study have to be confirmed using a second independent validation sample. The proposed model could be applied to patients with PCOS after confirmation through a further validation study. This study was supported by a research grant by the Institute for the Promotion of Innovation by Science and Technology in Flanders, Project number IWT 070719.
Age-Specific Prostate Specific Antigen Cutoffs for Guiding Biopsy Decision in Chinese Population
Xu, Jianfeng; Jiang, Haowen; Ding, Qiang
2013-01-01
Background Age-specific prostate specific antigen (PSA) cutoffs for prostate biopsy have been widely used in the USA and European countries. However, the application of age-specific PSA remains poorly understood in China. Methods Between 2003 and 2012, 1,848 men over the age of 40 underwent prostate biopsy for prostate cancer (PCa) at Huashan Hospital, Shanghai, China. Clinical information and blood samples were collected prior to biopsy for each patient. Men were divided into three age groups (≤60, 61 to 80, and >80) for analyses. Digital rectal examination (DRE), transrectal ultrasound (prostate volume and nodule), total PSA (tPSA), and free PSA (fPSA) were also included in the analyses. Logistic regression was used to build the multi-variate model. Results Serum tPSA levels were age-dependent (P = 0.008), while %fPSA (P = 0.051) and PSAD (P = 0.284) were age-independent. At a specificity of 80%, the sensitivities for predicting PCa were 83%, 71% and 68% with tPSA cutoff values of 19.0 ng/mL (age ≤60), 21.0 ng/mL (age 61–80), and 23.0 ng/mL (age ≥81). Also, sensitivities at the same tPSA levels were able to reach relatively high levels (70%–88%) for predicting high-grade PCa. The areas under the receiver operating characteristic (ROC) curves (AUCs) of tPSA, %fPSA, PSAD and the multi-variate model differed across age groups. When predicting PCa, the AUCs of tPSA, %fPSA, PSAD and the multi-variate model were 0.90, 0.57, 0.93 and 0.87 respectively in men ≤60 yr; 0.82, 0.70, 0.88 and 0.86 respectively in men 61–80 yr; 0.79, 0.78, 0.87 and 0.88 respectively in men >80 yr. When predicting Gleason Score ≥7 or 8 PCa, there were no significant differences between AUCs of each variable. Conclusion Age-specific PSA cutoff values for prostate biopsy should be considered in the Chinese population. Indications for prostate biopsies (tPSA, %fPSA and PSAD) should be considered based on age in the Chinese population. PMID:23825670
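The Results above report sensitivity at a fixed specificity of 80% for each age group. One way such a cutoff could be chosen is sketched below, assuming a marker where higher values indicate disease and a test that is positive when the value is at or above the cutoff. The function name and the marker values are hypothetical and are not data from the study.

```python
def cutoff_at_specificity(controls, cases, target_specificity=0.80):
    """Find the lowest cutoff whose specificity meets the target.

    controls: marker values in subjects without disease.
    cases: marker values in subjects with disease.
    A subject tests positive when value >= cutoff.
    Returns (cutoff, specificity, sensitivity).
    """
    # Only observed values need to be tried; specificity is
    # non-decreasing as the cutoff rises, so the first hit is the
    # lowest (most sensitive) cutoff that satisfies the target.
    for cutoff in sorted(set(controls) | set(cases)):
        specificity = sum(v < cutoff for v in controls) / len(controls)
        if specificity >= target_specificity:
            sensitivity = sum(v >= cutoff for v in cases) / len(cases)
            return cutoff, specificity, sensitivity
    raise ValueError("no observed cutoff reaches the target specificity")


# Hypothetical marker values, for illustration only.
controls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cases = [6, 9, 12, 15, 20]
print(cutoff_at_specificity(controls, cases))
```

Running the same search separately within each age group would yield age-specific cutoffs of the kind the abstract describes.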
Prediction Activities at NASA's Global Modeling and Assimilation Office
NASA Technical Reports Server (NTRS)
Schubert, Siegfried
2010-01-01
The Global Modeling and Assimilation Office (GMAO) is a core NASA resource for the development and use of satellite observations through the integrating tools of models and assimilation systems. Global ocean, atmosphere and land surface models are developed as components of assimilation and forecast systems that are used for addressing the weather and climate research questions identified in NASA's science mission. In fact, the GMAO is actively engaged in addressing one of NASA's science mission's key questions concerning how well transient climate variations can be understood and predicted. At weather time scales the GMAO is developing ultra-high resolution global climate models capable of resolving high impact weather systems such as hurricanes. The ability to resolve the detailed characteristics of weather systems within a global framework greatly facilitates addressing fundamental questions concerning the link between weather and climate variability. At sub-seasonal time scales, the GMAO is engaged in research and development to improve the use of land information (especially soil moisture), and in the improved representation and initialization of various sub-seasonal atmospheric variability (such as the MJO) that evolves on time scales longer than weather and involves exchanges with both the land and ocean. The GMAO has a long history of development for advancing the seasonal-to-interannual (S-I) prediction problem using an older version of the coupled atmosphere-ocean general circulation model (AOGCM). This includes the development of an Ensemble Kalman Filter (EnKF) to facilitate the multivariate assimilation of ocean surface altimetry, and an EnKF developed for the highly inhomogeneous nature of the errors in land surface models, as well as the multivariate assimilation needed to take advantage of surface soil moisture and snow observations.
The importance of decadal variability, especially that associated with long-term droughts is well recognized by the climate community. An improved understanding of the nature of decadal variability and its predictability has important implications for efforts to assess the impacts of global change in the coming decades. In fact, the GMAO has taken on the challenge of carrying out experimental decadal predictions in support of the IPCC AR5 effort.
Predictive model for risk of cesarean section in pregnant women after induction of labor.
Hernández-Martínez, Antonio; Pascual-Pedreño, Ana I; Baño-Garnés, Ana B; Melero-Jiménez, María R; Tenías-Burillo, José M; Molina-Alarcón, Milagros
2016-03-01
To develop a predictive model for risk of cesarean section in pregnant women after induction of labor. A retrospective cohort study was conducted of 861 induced labors during 2009, 2010, and 2011 at Hospital "La Mancha-Centro" in Alcázar de San Juan, Spain. Multivariate analysis was used with binary logistic regression and areas under the ROC curves to determine predictive ability. Two predictive models were created: model A predicts the outcome at the time the woman is admitted to the hospital (before the decision on the method of induction); and model B predicts the outcome at the time the woman is definitely admitted to the labor room. The predictive factors in the final model were: maternal height, body mass index, nulliparity, Bishop score, gestational age, macrosomia, gender of fetus, and the gynecologist's overall cesarean section rate. The predictive ability of model A was 0.77 [95% confidence interval (CI) 0.73-0.80] and model B was 0.79 (95% CI 0.76-0.83). The predictive ability for pregnant women with previous cesarean section with model A was 0.79 (95% CI 0.64-0.94) and with model B was 0.80 (95% CI 0.64-0.96). For a probability of estimated cesarean section ≥80%, the models A and B presented a positive likelihood ratio (+LR) for cesarean section of 22 and 20, respectively. Also, for a likelihood of estimated cesarean section ≤10%, the models A and B presented a +LR for vaginal delivery of 13 and 6, respectively. These predictive models have a good discriminative ability, both overall and for all subgroups studied. This tool can be useful in clinical practice, especially for pregnant women with previous cesarean section and diabetes.
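The positive likelihood ratios quoted above follow the standard definition, sensitivity divided by (1 − specificity), and a likelihood ratio converts a pre-test probability into a post-test probability through the odds. The following is a minimal sketch of both steps; the function names, the sensitivity and specificity, and the pre-test probability are hypothetical, and only the likelihood ratio of 22 is taken from the abstract.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """+LR = sensitivity / (1 - specificity); requires specificity < 1."""
    return sensitivity / (1.0 - specificity)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via the odds form of Bayes' rule."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical test characteristics, for illustration only.
print(positive_likelihood_ratio(sensitivity=0.50, specificity=0.95))

# +LR of 22 as reported for model A; the 20% pre-test probability is an
# assumed value, not a figure from the study.
print(post_test_probability(pre_test_probability=0.20, likelihood_ratio=22))
```

A likelihood ratio above 10 is conventionally read as a large shift in probability, which is consistent with the authors' description of good discriminative ability at the extreme thresholds.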
Niles, Justin K; Webber, Mayris P; Liu, Xiaoxue; Zeig-Owens, Rachel; Hall, Charles B; Cohen, Hillel W; Glaser, Michelle S; Weakley, Jessica; Schwartz, Theresa M; Weiden, Michael D; Nolan, Anna; Aldrich, Thomas K; Glass, Lara; Kelly, Kerry J; Prezant, David J
2014-08-01
We investigated early post 9/11 factors that could predict rhinosinusitis healthcare utilization costs up to 11 years later in 8,079 World Trade Center-exposed rescue/recovery workers. We used bivariate and multivariate analytic techniques to investigate utilization outcomes; we also used a pyramid framework to describe rhinosinusitis healthcare groups at early (by 9/11/2005) and late (by 9/11/2012) time points. Multivariate models showed that pre-9/11/2005 chronic rhinosinusitis diagnoses and nasal symptoms predicted final year healthcare utilization outcomes more than a decade after WTC exposure. The relative proportion of workers on each pyramid level changed significantly during the study period. Diagnoses of chronic rhinosinusitis within 4 years of a major inhalation event only partially explain future healthcare utilization. Exposure intensity, early symptoms and other factors must also be considered when anticipating future healthcare needs. © 2014 Wiley Periodicals, Inc.
Niles, Justin K.; Webber, Mayris P.; Liu, Xiaoxue; Zeig-Owens, Rachel; Hall, Charles B.; Cohen, Hillel W.; Glaser, Michelle S.; Weakley, Jessica; Schwartz, Theresa M.; Weiden, Michael D.; Nolan, Anna; Aldrich, Thomas K.; Glass, Lara; Kelly, Kerry J.; Prezant, David J.
2015-01-01
Background We investigated early post 9/11 factors that could predict rhinosinusitis healthcare utilization costs up to 11 years later in 8,079 World Trade Center-exposed rescue/recovery workers. Methods We used bivariate and multivariate analytic techniques to investigate utilization outcomes; we also used a pyramid framework to describe rhinosinusitis healthcare groups at early (by 9/11/2005) and late (by 9/11/2012) time points. Results Multivariate models showed that pre-9/11/2005 chronic rhinosinusitis diagnoses and nasal symptoms predicted final year healthcare utilization outcomes more than a decade after WTC exposure. The relative proportion of workers on each pyramid level changed significantly during the study period. Conclusions Diagnoses of chronic rhinosinusitis within 4 years of a major inhalation event only partially explain future healthcare utilization. Exposure intensity, early symptoms and other factors must also be considered when anticipating future healthcare needs. PMID:24898816
Prediction of processing tomato peeling outcomes
USDA-ARS's Scientific Manuscript database
Peeling outcomes of processing tomatoes were predicted using multivariate analysis of Magnetic Resonance (MR) images. Tomatoes were obtained from a whole-peel production line. Each fruit was imaged using a 7 Tesla MR system, and a multivariate data set was created from 28 different images. After ...
Neuropsychological Testing Predicts Cerebrospinal Fluid Aβ in Mild Cognitive Impairment (MCI)
Kandel, Benjamin M.; Avants, Brian B.; Gee, James C.; Arnold, Steven E.; Wolk, David A.
2015-01-01
Background Psychometric tests predict conversion of Mild Cognitive Impairment (MCI) to probable Alzheimer's Disease (AD). Because the definition of clinical AD relies on those same psychometric tests, the ability of these tests to identify underlying AD pathology remains unclear. Objective To determine the degree to which psychometric testing predicts molecular evidence of AD amyloid pathology, as indicated by CSF Aβ1–42, in patients with MCI, as compared to neuroimaging biomarkers. Methods We identified 408 MCI subjects with CSF Aβ levels, psychometric test data, FDG-PET scans, and acceptable volumetric MR scans from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). We used psychometric tests and imaging biomarkers in univariate and multivariate models to predict Aβ status. Results The 30-minute delayed recall score of the Rey Auditory Verbal Learning Test (AVLT) was the best predictor of Aβ status among the psychometric tests, achieving an AUC of 0.67±0.02 and odds ratio of 2.5±0.4. FDG-PET was the best imaging-based biomarker (AUC 0.67±0.03, OR 3.2±1.2), followed by hippocampal volume (AUC 0.64±0.02, OR 2.4±0.3). A multivariate analysis based on the psychometric tests improved on the univariate predictors, achieving an AUC of 0.68±0.03 (OR 3.38±1.2). Adding imaging biomarkers to the multivariate analysis did not improve the AUC. Conclusion Psychometric tests perform as well as imaging biomarkers to predict presence of molecular markers of AD pathology in MCI patients and should be considered in the determination of the likelihood that MCI is due to AD. PMID:25881908
Martin, Wade H; Xian, Hong; Chandiramani, Pooja; Bainter, Emily; Klein, Andrew J P
2015-08-01
No data exist comparing outcome prediction from arm exercise vs pharmacologic myocardial perfusion imaging (MPI) stress test variables in patients unable to perform treadmill exercise. In this retrospective study, 2,173 consecutive lower extremity disabled veterans aged 65.4 ± 11.0 years (mean ± SD) underwent either pharmacologic MPI (1730 patients) or arm exercise stress tests (443 patients) with MPI (n = 253) or electrocardiography alone (n = 190) between 1997 and 2002. Cox multivariate regression models and reclassification analysis by integrated discrimination improvement (IDI) were used to characterize stress test and MPI predictors of cardiovascular mortality at ≥10-year follow-up after inclusion of significant demographic, clinical, and other variables. Cardiovascular death occurred in 561 pharmacologic MPI and 102 arm exercise participants. Multivariate-adjusted cardiovascular mortality was predicted by arm exercise resting metabolic equivalents (hazard ratio [HR] 0.52, 95% CI 0.39-0.69, P < .001), 1-minute heart rate recovery (HR 0.61, 95% CI 0.44-0.86, P < .001), and pharmacologic and arm exercise delta (peak-rest) heart rate (both P < .001). Only an abnormal arm exercise MPI prognosticated cardiovascular death by multivariate Cox analysis (HR 1.98, 95% CI 1.04-3.77, P < .05). Arm exercise MPI defect number, type, and size provided IDI over covariates for prediction of cardiovascular mortality (IDI = 0.074-0.097). Only pharmacologic defect size prognosticated cardiovascular mortality (IDI = 0.022). Arm exercise capacity, heart rate recovery, and pharmacologic and arm exercise heart rate responses are robust predictors of cardiovascular mortality. Arm exercise MPI results are equivalent and possibly superior to pharmacologic MPI for cardiovascular mortality prediction in patients unable to perform treadmill exercise. Published by Elsevier Inc.
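The integrated discrimination improvement (IDI) used above compares the predicted risks of a baseline model with those of an extended model. The sketch below follows the standard definition (gain in mean predicted risk among events minus the gain among non-events); the function name and the toy probabilities are hypothetical, and nothing here reproduces the study's Cox models.

```python
from statistics import mean


def idi(new_events, old_events, new_nonevents, old_nonevents):
    """Integrated discrimination improvement.

    Each argument is a list of predicted risks (0..1) from the new or old
    model, split by whether the subject experienced the event.
    """
    gain_in_events = mean(new_events) - mean(old_events)           # should rise
    gain_in_nonevents = mean(new_nonevents) - mean(old_nonevents)  # should fall
    return gain_in_events - gain_in_nonevents


# Hypothetical predicted risks, for illustration only.
value = idi(
    new_events=[0.8, 0.6], old_events=[0.6, 0.5],
    new_nonevents=[0.2, 0.1], old_nonevents=[0.3, 0.2],
)
print(round(value, 3))  # 0.15 - (-0.10) -> 0.25
```

A positive IDI means the added variables push predicted risks further apart between subjects with and without the event.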
Abe, Ricardo Y; Gracitelli, Carolina P B; Diniz-Filho, Alberto; Zangwill, Linda M; Weinreb, Robert N; Medeiros, Felipe A
2015-07-01
To evaluate the relationship between rates of change on frequency doubling technology (FDT) perimetry and longitudinal changes in quality of life (QoL) of glaucoma patients. Prospective observational cohort study. One hundred fifty-two subjects (127 glaucoma and 25 healthy) were followed for an average of 3.2 ± 1.1 years. All subjects were evaluated with National Eye Institute Visual Function Questionnaire (NEI VFQ-25), FDT, and standard automated perimetry (SAP). Glaucoma patients had a median of 3 NEI VFQ-25, 8 FDT, and 8 SAP tests during follow-up. Mean sensitivities of the integrated binocular visual fields were estimated for FDT and SAP and used to calculate rates of change. A joint longitudinal multivariable mixed model was used to investigate the association between change in binocular mean sensitivities and change in NEI VFQ-25 Rasch-calibrated scores. There was a statistically significant correlation between change in binocular mean sensitivity for FDT and change in NEI VFQ-25 scores during follow-up in the glaucoma group. In multivariable analysis with the confounding factors, each 1 dB/year change in binocular FDT mean sensitivity corresponded to a change of 0.8 units per year in the NEI VFQ-25 scores (P = .001). For binocular SAP mean sensitivity, each 1 dB/year change was associated with 2.4 units per year change in NEI VFQ-25 scores (P < .001). The multivariable model containing baseline and rate of change information from SAP had stronger ability to predict change in NEI VFQ-25 scores compared to the equivalent model for FDT (R² of 50% and 30%, respectively; P = .001). SAP performed significantly better than FDT in predicting change in NEI VFQ-25 scores in our population, suggesting that it may still be the preferable perimetric technique for predicting risk of disability from the disease. Copyright © 2015 Elsevier Inc. All rights reserved.
Martínez, Carlos Alberto; Khare, Kshitij; Banerjee, Arunava; Elzo, Mauricio A
2017-03-21
This study corresponds to the second part of a companion paper devoted to the development of Bayesian multiple regression models accounting for randomness of genotypes in across-population genome-wide prediction. This family of models considers heterogeneous and correlated marker effects and allelic frequencies across populations, and can consider records from non-genotyped individuals and individuals with missing genotypes in any subset of loci without the need for previous imputation, taking into account uncertainty about imputed genotypes. This paper extends this family of models by considering multivariate spike and slab conditional priors for marker allele substitution effects and contains derivations of approximate Bayes factors and fractional Bayes factors to compare models from part I and those developed here with their null versions. These null versions correspond to simpler models ignoring heterogeneity of populations, but still accounting for randomness of genotypes. For each marker locus, the spike component of the priors corresponded to a point mass at 0 in ℝ^S, where S is the number of populations, and the slab component was an S-variate Gaussian distribution; independent conditional priors were assumed. For the Gaussian components, covariance matrices were assumed to be either the same for all markers or different for each marker. For null models, the priors were simply univariate versions of these finite mixture distributions. Approximate algebraic expressions for Bayes factors and fractional Bayes factors were found using the Laplace approximation. Using the simulated datasets described in part I, these models were implemented and compared with models derived in part I using measures of predictive performance based on squared Pearson correlations, Deviance Information Criterion, Bayes factors, and fractional Bayes factors. 
The extensions presented here enlarge our family of genome-wide prediction models making it more flexible in the sense that it now offers more modeling options. Copyright © 2017 Elsevier Ltd. All rights reserved.
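The spike-and-slab prior described in this abstract can be written compactly. The expression below is a sketch under stated assumptions: the symbols \boldsymbol{\beta}_j (effects of marker j across the S populations), \pi (prior probability of the slab), and \Sigma_j (slab covariance matrix) are notation introduced here for illustration, and a zero slab mean is assumed; the abstract does not give the authors' exact parameterization.

```latex
\boldsymbol{\beta}_j \mid \pi, \Sigma_j \;\sim\;
  (1-\pi)\,\delta_{\mathbf{0}}(\boldsymbol{\beta}_j)
  \;+\; \pi\, N_S(\mathbf{0}, \Sigma_j),
\qquad \boldsymbol{\beta}_j \in \mathbb{R}^S .
```

Setting \Sigma_j = \Sigma for every marker gives the shared-covariance variant, a separate \Sigma_j per marker gives the marker-specific variant, and S = 1 reduces the mixture to the univariate form used for the null models.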
Manikandan, Narayanan; Subha, Srinivasan
2016-01-01
Software development life cycle has been characterized by destructive disconnects between activities like planning, analysis, design, and programming. Particularly software developed with prediction based results is always a big challenge for designers. Time series data forecasting like currency exchange, stock prices, and weather report are some of the areas where an extensive research is going on for the last three decades. In the initial days, the problems with financial analysis and prediction were solved by statistical models and methods. For the last two decades, a large number of Artificial Neural Networks based learning models have been proposed to solve the problems of financial data and get accurate results in prediction of the future trends and prices. This paper addressed some architectural design related issues for performance improvement through vectorising the strengths of multivariate econometric time series models and Artificial Neural Networks. It provides an adaptive approach for predicting exchange rates and it can be called hybrid methodology for predicting exchange rates. This framework is tested for finding the accuracy and performance of parallel algorithms used. PMID:26881271
Attitudes and exercise adherence: test of the Theories of Reasoned Action and Planned Behaviour.
Smith, R A; Biddle, S J
1999-04-01
Three studies of exercise adherence and attitudes are reported that tested the Theory of Reasoned Action and the Theory of Planned Behaviour. In a prospective study of adherence to a private fitness club, structural equation modelling path analysis showed that attitudinal and social normative components of the Theory of Reasoned Action accounted for 13.1% of the variance in adherence 4 months later, although only social norm significantly predicted intention. In a second study, the Theory of Planned Behaviour was used to predict both physical activity and sedentary behaviour. Path analyses showed that attitude and perceived control, but not social norm, predicted total physical activity. Physical activity was predicted from intentions and control over sedentary behaviour. Finally, an intervention study with previously sedentary adults showed that intentions to be active measured at the start and end of a 10-week intervention were associated with the planned behaviour variables. A multivariate analysis of variance revealed no significant multivariate effects for time on the planned behaviour variables measured before and after intervention. Qualitative data provided evidence that participants had a positive experience on the intervention programme and supported the role of social normative factors in the adherence process.
Sun, Xiangqing; Elston, Robert C; Barnholtz-Sloan, Jill S; Falk, Gary W; Grady, William M; Faulx, Ashley; Mittal, Sumeet K; Canto, Marcia; Shaheen, Nicholas J; Wang, Jean S; Iyer, Prasad G; Abrams, Julian A; Tian, Ye D; Willis, Joseph E; Guda, Kishore; Markowitz, Sanford D; Chandar, Apoorva; Warfe, James M; Brock, Wendy; Chak, Amitabh
2016-05-01
Barrett's esophagus is often asymptomatic and only a small portion of Barrett's esophagus patients are currently diagnosed and under surveillance. Therefore, it is important to develop risk prediction models to identify high-risk individuals with Barrett's esophagus. Familial aggregation of Barrett's esophagus and esophageal adenocarcinoma, and the increased risk of esophageal adenocarcinoma for individuals with a family history, raise the necessity of including genetic factors in the prediction model. Methods to determine risk prediction models using both risk covariates and ascertained family data are not well developed. We developed a Barrett's Esophagus Translational Research Network (BETRNet) risk prediction model from 787 singly ascertained Barrett's esophagus pedigrees and 92 multiplex Barrett's esophagus pedigrees, fitting a multivariate logistic model that incorporates family history and clinical risk factors. The eight risk factors, age, sex, education level, parental status, smoking, heartburn frequency, regurgitation frequency, and use of acid suppressant, were included in the model. The prediction accuracy was evaluated on the training dataset and an independent validation dataset of 643 multiplex Barrett's esophagus pedigrees. Our results indicate family information helps to predict Barrett's esophagus risk, and predicting in families improves both prediction calibration and discrimination accuracy. Our model can predict Barrett's esophagus risk for anyone with family members known to have, or not have, had Barrett's esophagus. It can predict risk for unrelated individuals without knowing any relatives' information. Our prediction model will shed light on effectively identifying high-risk individuals for Barrett's esophagus screening and surveillance, consequently allowing intervention at an early stage, and reducing mortality from esophageal adenocarcinoma. Cancer Epidemiol Biomarkers Prev; 25(5); 727-35. ©2016 AACR. 
NASA Astrophysics Data System (ADS)
Isingizwe Nturambirwe, J. Frédéric; Perold, Willem J.; Opara, Umezuruike L.
2016-02-01
Near infrared (NIR) spectroscopy has gained extensive use in quality evaluation. It is arguably one of the most advanced spectroscopic tools in non-destructive quality testing of foodstuffs, from measurement to data analysis and interpretation. NIR spectral data are interpreted through means often involving multivariate statistical analysis, sometimes associated with optimisation techniques for model improvement. The objective of this research was to explore the extent to which genetic algorithms (GA) can be used to enhance model development for predicting fruit quality. Apple fruits were used, and NIR spectra in the range from 12000 to 4000 cm⁻¹ were acquired on both bruised and healthy tissues, with different degrees of mechanical damage. GAs were used in combination with partial least squares (PLS) regression methods to develop bruise severity prediction models, and compared to PLS models developed using the full NIR spectrum. A classification model was developed, which clearly separated bruised from unbruised apple tissue. GAs helped improve prediction models by over 10%, in comparison with full spectrum-based models, as evaluated in terms of error of prediction (Root Mean Square Error of Cross-validation). PLS models to predict internal quality, such as sugar content and acidity, were developed and compared to the versions optimized by genetic algorithm. Overall, the results highlighted the potential of the GA method to improve the speed and accuracy of fruit quality prediction.
Noguchi, M; Kido, Y; Kubota, H; Kinjo, H; Kohama, G
1999-12-01
The records of 136 patients with N1-3 oral squamous cell carcinoma treated by surgery were investigated retrospectively, with the aim of finding out which factors were predictive of survival on multivariate analysis. Four independent factors significantly influenced survival in the following order: pN stage; T stage; histological grade; and N stage. The most significant was pN stage, the five-year survival for patients with pN0 being 91% and for patients with pN1-3 41%. A further study was carried out on the 80 patients with pN1-3 to find out their prognostic factors for survival and the independent factors identified by multivariate analysis were T stage and presence or absence of extracapsular spread to metastatic lymph nodes.
Pannu, Neesh; Hemmelgarn, Brenda R.; Austin, Peter C.; Tan, Zhi; McArthur, Eric; Manns, Braden J.; Tonelli, Marcello; Wald, Ron; Quinn, Robert R.; Ravani, Pietro; Garg, Amit X.
2017-01-01
Importance Some patients will develop chronic kidney disease after a hospitalization with acute kidney injury; however, no risk-prediction tools have been developed to identify high-risk patients requiring follow-up. Objective To derive and validate predictive models for progression of acute kidney injury to advanced chronic kidney disease. Design, Setting, and Participants Data from 2 population-based cohorts of patients with a prehospitalization estimated glomerular filtration rate (eGFR) of more than 45 mL/min/1.73 m² and who had survived hospitalization with acute kidney injury (defined by a serum creatinine increase during hospitalization > 0.3 mg/dL or > 50% of their prehospitalization baseline), were used to derive and validate multivariable prediction models. The risk models were derived from 9973 patients hospitalized in Alberta, Canada (April 2004-March 2014, with follow-up to March 2015). The risk models were externally validated with data from a cohort of 2761 patients hospitalized in Ontario, Canada (June 2004-March 2012, with follow-up to March 2013). Exposures Demographic, laboratory, and comorbidity variables measured prior to discharge. Main Outcomes and Measures Advanced chronic kidney disease was defined by a sustained reduction in eGFR less than 30 mL/min/1.73 m² for at least 3 months during the year after discharge. All participants were followed up for up to 1 year. Results The participants (mean [SD] age, 66 [15] years in the derivation and internal validation cohorts and 69 [11] years in the external validation cohort; 40%-43% women per cohort) had a mean (SD) baseline serum creatinine level of 1.0 (0.2) mg/dL and more than 20% had stage 2 or 3 acute kidney injury. Advanced chronic kidney disease developed in 408 (2.7%) of 9973 patients in the derivation cohort and 62 (2.2%) of 2761 patients in the external validation cohort. 
In the derivation cohort, 6 variables were independently associated with the outcome: older age, female sex, higher baseline serum creatinine value, albuminuria, greater severity of acute kidney injury, and higher serum creatinine value at discharge. In the external validation cohort, a multivariable model including these 6 variables had a C statistic of 0.81 (95% CI, 0.75-0.86) and improved discrimination and reclassification compared with reduced models that included age, sex, and discharge serum creatinine value alone (integrated discrimination improvement, 2.6%; 95% CI, 1.1%-4.0%; categorical net reclassification index, 13.5%; 95% CI, 1.9%-25.1%) or included age, sex, and acute kidney injury stage alone (integrated discrimination improvement, 8.0%; 95% CI, 5.1%-11.0%; categorical net reclassification index, 79.9%; 95% CI, 60.9%-98.9%). Conclusions and Relevance A multivariable model using routine laboratory data was able to predict advanced chronic kidney disease following hospitalization with acute kidney injury. The utility of this model in clinical care requires further research. PMID:29136443
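The categorical net reclassification index (NRI) reported above counts how often a new model moves subjects into a more appropriate risk category. The sketch below uses the standard definition; the function name, the category coding (higher number means higher risk), and the toy data are hypothetical and do not come from the Alberta or Ontario cohorts.

```python
def categorical_nri(old_cat, new_cat, event):
    """Categorical NRI.

    old_cat, new_cat: risk category index per subject under each model.
    event: True if the subject developed the outcome.
    Returns (net upward moves among events) + (net downward moves among non-events).
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, is_event in zip(old_cat, new_cat, event):
        if is_event:
            n_e += 1
            up_e += new > old      # correct direction for an event
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old    # correct direction for a non-event
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Hypothetical toy data: three events followed by three non-events.
old = [0, 0, 1, 1, 1, 0]
new = [1, 0, 1, 0, 0, 0]
had_event = [True, True, True, False, False, False]
print(categorical_nri(old, new, had_event))  # 1/3 + 2/3, approximately 1.0
```

The index ranges from -2 to 2, so values such as 13.5% or 79.9% are sums of two proportions rather than a single percentage of subjects.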
James, Matthew T; Pannu, Neesh; Hemmelgarn, Brenda R; Austin, Peter C; Tan, Zhi; McArthur, Eric; Manns, Braden J; Tonelli, Marcello; Wald, Ron; Quinn, Robert R; Ravani, Pietro; Garg, Amit X
2017-11-14
Some patients will develop chronic kidney disease after a hospitalization with acute kidney injury; however, no risk-prediction tools have been developed to identify high-risk patients requiring follow-up. To derive and validate predictive models for progression of acute kidney injury to advanced chronic kidney disease. Data from 2 population-based cohorts of patients with a prehospitalization estimated glomerular filtration rate (eGFR) of more than 45 mL/min/1.73 m² and who had survived hospitalization with acute kidney injury (defined by a serum creatinine increase during hospitalization > 0.3 mg/dL or > 50% of their prehospitalization baseline), were used to derive and validate multivariable prediction models. The risk models were derived from 9973 patients hospitalized in Alberta, Canada (April 2004-March 2014, with follow-up to March 2015). The risk models were externally validated with data from a cohort of 2761 patients hospitalized in Ontario, Canada (June 2004-March 2012, with follow-up to March 2013). Demographic, laboratory, and comorbidity variables measured prior to discharge. Advanced chronic kidney disease was defined by a sustained reduction in eGFR less than 30 mL/min/1.73 m² for at least 3 months during the year after discharge. All participants were followed up for up to 1 year. The participants (mean [SD] age, 66 [15] years in the derivation and internal validation cohorts and 69 [11] years in the external validation cohort; 40%-43% women per cohort) had a mean (SD) baseline serum creatinine level of 1.0 (0.2) mg/dL and more than 20% had stage 2 or 3 acute kidney injury. Advanced chronic kidney disease developed in 408 (2.7%) of 9973 patients in the derivation cohort and 62 (2.2%) of 2761 patients in the external validation cohort. 
In the derivation cohort, 6 variables were independently associated with the outcome: older age, female sex, higher baseline serum creatinine value, albuminuria, greater severity of acute kidney injury, and higher serum creatinine value at discharge. In the external validation cohort, a multivariable model including these 6 variables had a C statistic of 0.81 (95% CI, 0.75-0.86) and improved discrimination and reclassification compared with reduced models that included age, sex, and discharge serum creatinine value alone (integrated discrimination improvement, 2.6%; 95% CI, 1.1%-4.0%; categorical net reclassification index, 13.5%; 95% CI, 1.9%-25.1%) or included age, sex, and acute kidney injury stage alone (integrated discrimination improvement, 8.0%; 95% CI, 5.1%-11.0%; categorical net reclassification index, 79.9%; 95% CI, 60.9%-98.9%). A multivariable model using routine laboratory data was able to predict advanced chronic kidney disease following hospitalization with acute kidney injury. The utility of this model in clinical care requires further research.
Liu, Fei; Ye, Lanhan; Peng, Jiyu; Song, Kunlin; Shen, Tingting; Zhang, Chu; He, Yong
2018-02-27
Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² of more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice.
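Two of the quantities mentioned above are simple enough to sketch directly. The standard normal variate (SNV) transform centers and scales each spectrum by its own mean and standard deviation. The limit of detection is computed here with the common 3σ/slope convention, which is an assumption, since the abstract does not state which formula the authors used. Function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: subtract the spectrum's own mean and divide
    by its own sample standard deviation (one spectrum at a time)."""
    mu = mean(spectrum)
    sd = stdev(spectrum)
    return [(x - mu) / sd for x in spectrum]


def limit_of_detection(blank_sd, slope):
    """LOD by the 3-sigma convention: 3 * SD of the blank signal divided by
    the slope of the calibration line (assumed convention, see lead-in)."""
    return 3.0 * blank_sd / slope


# Hypothetical intensities and calibration values, for illustration only.
print(snv([1.0, 2.0, 3.0]))                     # [-1.0, 0.0, 1.0]
print(round(limit_of_detection(0.5, 0.3), 6))   # 5.0, in the concentration units of the slope
```

Because SNV is applied per spectrum, it removes shot-to-shot offset and scale differences without needing a reference sample.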
Ye, Lanhan; Song, Kunlin; Shen, Tingting
2018-01-01
Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² of more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice. PMID:29495445
NASA Astrophysics Data System (ADS)
de Campos, Luana Janaína; de Melo, Eduardo Borges
2017-08-01
In the present study, 199 compounds derived from pyrimidine, pyrimidone and pyridopyrazine carboxamides with inhibitory activity against HIV-1 integrase were modeled. Subsequently, a multivariate QSAR study was conducted with 54 molecules, using Ordered Predictors Selection (OPS) for variable selection and Partial Least Squares (PLS) for model construction. Topological, electrotopological, geometric, and molecular descriptors were used. The selected real model was robust and free from chance correlation; in addition, it demonstrated favorable internal and external statistical quality. Once statistically validated, the training model was used to predict the activity of a second data set (n = 145). The root mean square deviation (RMSD) between observed and predicted values was 0.698. Although this value is outside the usual standards, only 15 (10.34%) of the samples exhibited residuals larger than 1 log unit, a result considered acceptable. Results of Williams and Euclidean applicability domains relative to the prediction showed that the predictions did not occur by extrapolation and that the model is representative of the chemical space of the test compounds.
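The two external-validation figures quoted above, the RMSD and the share of compounds with residuals above 1 log unit, can be computed as in the following sketch. The function names and the toy activity values are hypothetical; only the 15-of-145 ratio comes from the abstract.

```python
from math import sqrt


def rmsd(observed, predicted):
    """Root mean square deviation between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(observed))


def fraction_beyond(observed, predicted, threshold=1.0):
    """Fraction of samples whose absolute residual exceeds the threshold."""
    n_large = sum(abs(o - p) > threshold for o, p in zip(observed, predicted))
    return n_large / len(observed)


# Hypothetical activities (log units), for illustration only.
obs = [5.0, 6.0, 7.0, 8.0]
pred = [5.2, 7.5, 7.0, 8.1]
print(rmsd(obs, pred))
print(fraction_beyond(obs, pred))   # one residual of 1.5 out of four -> 0.25

# Consistency check on the figure reported in the abstract: 15 of 145 samples.
print(round(100 * 15 / 145, 2))     # 10.34
```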
NASA Astrophysics Data System (ADS)
Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi
2018-04-01
Synchronous fluorescence spectra combined with multivariate analysis were used to predict flavonoid content in green tea rapidly and nondestructively. This paper presents a new and efficient spectral interval selection method called clustering-based partial least squares (CL-PLS), which selects informative wavelengths by combining the clustering concept with partial least squares (PLS) methods to improve the performance of models built on synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained, k-means and Kohonen self-organizing map clustering algorithms were used to group the full spectra into several clusters, and a sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. The correlation coefficient (R) was used to evaluate the prediction performance of the PLS models. In addition, variable influence on projection partial least squares (VIP-PLS), selectivity ratio partial least squares (SR-PLS), interval partial least squares (iPLS) models and the full-spectrum PLS model were investigated and the results were compared. The results showed that CL-PLS gave the best result for flavonoid prediction using synchronous fluorescence spectra.
Intharathirat, Rotchana; Abdul Salam, P; Kumar, S; Untong, Akarapong
2015-05-01
Accurate forecasting of municipal solid waste (MSW) generation and composition plays a key role in planning, managing, and using MSW in a sustainable way. Reliable estimates are difficult to obtain with existing models because of the limited data available in developing countries. This study aims to forecast the MSW collected in Thailand over the long term, with prediction intervals, using an optimized multivariate grey model, which is a mathematical approach. For the multivariate models, the representative factors of the residential and commercial sectors affecting the waste collected are identified, classified, and quantified based on the statistics and mathematics of grey system theory. Results show that GMC (1, 5), the grey model with convolution integral, is the most accurate, with the lowest error of 1.16% MAPE. MSW collected would increase by 1.40% per year, from 43,435-44,994 tonnes per day in 2013 to 55,177-56,735 tonnes per day in 2030. The model also indicates that population density is the most important factor affecting MSW collected, followed by urbanization, proportion of employment, and household size, respectively. This suggests that the representative factors of the commercial sector may affect MSW collected more than those of the residential sector. The results can help decision makers develop long-term waste management measures and policies. Copyright © 2015 Elsevier Ltd. All rights reserved.
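To give a feel for grey forecasting and the MAPE metric, here is a minimal sketch of the basic single-variable GM(1,1) model. This is a deliberately simpler stand-in, not the multivariate GMC(1,5) model with convolution integral used in the study; the function names and the toy series are hypothetical.

```python
from math import exp


def gm11_fit(x0):
    """Fit GM(1,1); return (a, b): development coefficient and grey input."""
    # Accumulated generating operation (running sum of the raw series).
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values: mean of consecutive accumulated values.
    z = [(x1[k] + x1[k - 1]) / 2.0 for k in range(1, len(x1))]
    y = x0[1:]
    # Ordinary least squares for the grey equation y = -a * z + b.
    n = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    slope = (n * szy - sz * sy) / (n * szz - sz * sz)
    intercept = (sy - slope * sz) / n
    return -slope, intercept


def gm11_predict(x0, a, b, steps):
    """Fitted values for the observed points plus `steps` forecasts.
    Assumes a != 0 (a perfectly flat series is not handled)."""
    def x1_hat(k):
        return (x0[0] - b / a) * exp(-a * k) + b / a
    out = [x0[0]]
    for k in range(1, len(x0) + steps):
        out.append(x1_hat(k) - x1_hat(k - 1))  # inverse accumulation
    return out


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((act - pred) / act) for act, pred in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(actual)


# Hypothetical series growing 10% per period, for illustration only.
series = [100.0, 110.0, 121.0, 133.1, 146.41]
a, b = gm11_fit(series)
fitted = gm11_predict(series, a, b, steps=2)
print(mape(series, fitted[:len(series)]))  # in-sample MAPE, well below 1%
print(fitted[len(series):])                # two out-of-sample forecasts
```

Grey models of this kind are attractive when only a short series is available, which matches the limited-data setting described in the abstract.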
Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu
2017-01-01
The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
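The recovery and repeatability figures quoted above follow standard definitions, sketched below: recovery is the measured amount as a percentage of the spiked amount, and RSD is the sample standard deviation as a percentage of the mean. Function names and values are hypothetical and are not the BTEX measurements from the study.

```python
from statistics import mean, stdev


def recovery_percent(measured, spiked):
    """Recovery: measured concentration as a percentage of the spiked one."""
    return 100.0 * measured / spiked


def rsd_percent(replicates):
    """Relative standard deviation of replicate measurements, in percent."""
    return 100.0 * stdev(replicates) / mean(replicates)


# Hypothetical replicate measurements, for illustration only.
print(recovery_percent(9.0, 10.0))     # 90.0
print(rsd_percent([9.0, 10.0, 11.0]))  # SD 1.0 over mean 10.0 -> 10.0
```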
Physiology-Based Modeling May Predict Surgical Treatment Outcome for Obstructive Sleep Apnea
Li, Yanru; Ye, Jingying; Han, Demin; Cao, Xin; Ding, Xiu; Zhang, Yuhuan; Xu, Wen; Orr, Jeremy; Jen, Rachel; Sands, Scott; Malhotra, Atul; Owens, Robert
2017-01-01
Study Objectives: To test whether the integration of both anatomical and nonanatomical parameters (ventilatory control, arousal threshold, muscle responsiveness) in a physiology-based model will improve the ability to predict outcomes after upper airway surgery for obstructive sleep apnea (OSA). Methods: In 31 patients who underwent upper airway surgery for OSA, loop gain and arousal threshold were calculated from preoperative polysomnography (PSG). Three models were compared: (1) a multiple regression based on an extensive list of PSG parameters alone; (2) a multivariate regression using PSG parameters plus PSG-derived estimates of loop gain, arousal threshold, and other trait surrogates; (3) a physiological model incorporating selected variables as surrogates of anatomical and nonanatomical traits important for OSA pathogenesis. Results: Although preoperative loop gain was positively correlated with postoperative apnea-hypopnea index (AHI) (P = .008) and arousal threshold was negatively correlated (P = .011), in both model 1 and 2, the only significant variable was preoperative AHI, which explained 42% of the variance in postoperative AHI. In contrast, the physiological model (model 3), which included AHI_REM (anatomy term), fraction of events that were hypopnea (arousal term), the ratio of AHI_REM and AHI_NREM (muscle responsiveness term), loop gain, and central/mixed apnea index (control of breathing terms), was able to explain 61% of the variance in postoperative AHI. Conclusions: Although loop gain and arousal threshold are associated with residual AHI after surgery, only preoperative AHI was predictive using multivariate regression modeling. Instead, incorporating selected surrogates of physiological traits on the basis of OSA pathophysiology created a model that has more association with actual residual AHI. Commentary: A commentary on this article appears in this issue on page 1023. 
Clinical Trial Registration: ClinicalTrials.Gov; Title: The Impact of Sleep Apnea Treatment on Physiology Traits in Chinese Patients With Obstructive Sleep Apnea; Identifier: NCT02696629; URL: https://clinicaltrials.gov/show/NCT02696629 Citation: Li Y, Ye J, Han D, Cao X, Ding X, Zhang Y, Xu W, Orr J, Jen R, Sands S, Malhotra A, Owens R. Physiology-based modeling may predict surgical treatment outcome for obstructive sleep apnea. J Clin Sleep Med. 2017;13(9):1029–1037. PMID:28818154
Payne, Courtney E; Wolfrum, Edward J
2015-01-01
Obtaining accurate chemical composition and reactivity (measures of carbohydrate release and yield) information for biomass feedstocks in a timely manner is necessary for the commercialization of biofuels. Our objective was to use near-infrared (NIR) spectroscopy and partial least squares (PLS) multivariate analysis to develop calibration models to predict the feedstock composition and the release and yield of soluble carbohydrates generated by a bench-scale dilute acid pretreatment and enzymatic hydrolysis assay. Major feedstocks included in the calibration models are corn stover, sorghum, switchgrass, perennial cool season grasses, rice straw, and miscanthus. We present individual model statistics to demonstrate model performance and validation samples to more accurately measure predictive quality of the models. The PLS-2 model for composition predicts glucan, xylan, lignin, and ash (wt%) with uncertainties similar to primary measurement methods. A PLS-2 model was developed to predict glucose and xylose release following pretreatment and enzymatic hydrolysis. An additional PLS-2 model was developed to predict glucan and xylan yield. PLS-1 models were developed to predict the sum of glucose/glucan and xylose/xylan for release and yield (grams per gram). The release and yield models have higher uncertainties than the primary methods used to develop the models. It is possible to build effective multispecies feedstock models for composition, as well as carbohydrate release and yield. The model for composition is useful for predicting glucan, xylan, lignin, and ash with good uncertainties. The release and yield models have higher uncertainties; however, these models are useful for rapidly screening sample populations to identify unusual samples.
Sun, Gang; Hoff, Steven J; Zelle, Brian C; Nelson, Minda A
2008-12-01
It is vital to forecast gas and particulate matter concentrations and emission rates (GPCER) from livestock production facilities to assess the impact of airborne pollutants on human health, the ecological environment, and global warming. Modeling source air quality is a complex process because of abundant nonlinear interactions between GPCER and other factors. The objective of this study was to introduce statistical methods and a radial basis function (RBF) neural network to predict daily source air quality in Iowa swine deep-pit finishing buildings. The results show that four variables (outdoor and indoor temperature, animal units, and ventilation rates) were identified as relatively important model inputs using statistical methods. Principal component analysis further showed that only two factors, the environment factor and the animal factor, explained more than 94% of the total variability. Introducing fewer, uncorrelated variables to the neural network would reduce the complexity of the model structure, minimize computation cost, and eliminate model overfitting problems. The RBF network predictions were in good agreement with the actual measurements, with correlation coefficients between 0.741 and 0.995 and very low values of systemic performance indexes for all the models. These results indicate that the RBF network could be trained to model these highly nonlinear relationships. Thus, RBF neural network technology combined with multivariate statistical methods is a promising tool for modeling air pollutant emissions.
Mendoza-Carranza, Manuel; Ejarque, Elisabet; Nagelkerke, Leopold A J
2018-01-01
Tropical small-scale fisheries typically yield complex multivariate data because of their diverse fishing techniques and highly diverse species composition. In this paper we used, for the first time, a supervised Self-Organizing Map (xyf-SOM) to recognize and understand the internal heterogeneity of a tropical marine small-scale fishery, using as a model the fishery fleet of San Pedro port, Tabasco, Mexico. We used multivariate data from commercial logbooks, including the following four factors: fish species (47), gear types (bottom longline, vertical line + shark longline, and vertical line), season (cold, warm), and inter-annual variation (2007-2012). The size of the xyf-SOM, a fundamental characteristic for improving its predictive quality, was optimized for the minimum distance between objects and the maximum prediction rate. The xyf-SOM successfully classified individual fishing trips in relation to the four factors included in the model. Prediction percentages were high (80-100%) for bottom longline and vertical line + shark longline, but lower prediction values were obtained for the vertical line fishery (51-74%). A confusion matrix indicated that classification errors occurred within the same fishing gear. Prediction rates were validated by generating confidence intervals using the bootstrap. The xyf-SOM showed that not all fishing trips targeted the most abundant species and that catch rates were not symmetrically distributed around the mean. The species composition was also not homogeneous among fishing trips. Despite the complexity of the data, the xyf-SOM proved to be an excellent tool for identifying trends in complex scenarios, emphasizing the diverse and complex patterns that characterize tropical small-scale fishery fleets.
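The bootstrap validation of prediction rates mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which bootstrap variant was used, so the percentile method, the function name `bootstrap_ci`, and the toy data below are all assumptions made for illustration only.

```python
import random


def bootstrap_ci(correct_flags, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a prediction rate (share of 1s)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(correct_flags)
    # Resample the trips with replacement and record each resample's rate.
    rates = sorted(
        sum(rng.choices(correct_flags, k=n)) / n for _ in range(n_resamples)
    )
    lower = rates[int(alpha / 2 * n_resamples)]
    upper = rates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical data: 1 = trip classified correctly, 0 = misclassified.
flags = [1] * 40 + [0] * 10  # observed prediction rate of 0.80

low, high = bootstrap_ci(flags)
print(f"Prediction rate 0.80, 95% bootstrap interval: {low:.2f} to {high:.2f}")
```

Each resample draws as many trips as the original sample, with replacement; the interval is read off the sorted resampled rates.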
Bogani, Giorgio; Cromi, Antonella; Serati, Maurizio; Uccella, Stefano; Donato, Violante Di; Casarin, Jvan; Naro, Edoardo Di; Ghezzi, Fabio
2017-06-01
To identify factors predicting recurrence in vulvar cancer patients undergoing surgical treatment. We retrospectively evaluated data of consecutive patients with squamous cell vulvar cancer treated between January 1, 1990 and December 31, 2013. Basic descriptive statistics and multivariable analysis were used to design models predicting outcomes. Five-year disease-free survival (DFS) and overall survival (OS) were analyzed using the Cox model. The study included 101 patients affected by vulvar cancer: 64 (63%) stage I, 12 (12%) stage II, 20 (20%) stage III, and 5 (5%) stage IV. After a mean (SD) follow-up of 37.6 (22.1) months, 21 (21%) recurrences occurred. Local, regional, and distant failures were recorded in 14 (14%), 6 (6%), and 3 (3%) patients, respectively. Five-year DFS and OS were 77% and 82%, respectively. On multivariate analysis, only stromal invasion >2 mm (hazard ratio: 4.9 [95% confidence interval, 1.17-21.1]; P=0.04) and extracapsular lymph node involvement (hazard ratio: 9.0 [95% confidence interval, 1.17-69.5]; P=0.03) correlated with worse DFS, whereas no factor independently correlated with OS. Looking at factors influencing local and regional failure, we observed that stromal invasion >2 mm was the only factor predicting local recurrence, whereas lymph node extracapsular involvement predicted regional recurrence. Stromal invasion >2 mm and lymph node extracapsular spread are the most important factors predicting local and regional failure, respectively. Studies evaluating the effectiveness of adjuvant treatment in high-risk patients are warranted.
Crispin, Alexander; Strahwald, Brigitte; Cheney, Catherine; Mansmann, Ulrich
2018-06-04
Quality control, benchmarking, and pay for performance (P4P) require valid indicators and statistical models allowing adjustment for differences in risk profiles of the patient populations of the respective institutions. Using hospital remuneration data for measuring quality and modelling patient risks has been criticized by clinicians. Here we explore the potential of prediction models for 30- and 90-day mortality after colorectal cancer surgery based on routine data. Full census of a major statutory health insurer. Surgical departments throughout the Federal Republic of Germany. 4283 and 4124 insurants with major surgery for treatment of colorectal cancer during 2013 and 2014, respectively. Age, sex, primary and secondary diagnoses as well as tumor locations as recorded in the hospital remuneration data according to §301 SGB V. 30- and 90-day mortality. Elixhauser comorbidities, Charlson conditions, and Charlson scores were generated from the ICD-10 diagnoses. Multivariable prediction models were developed using a penalized logistic regression approach (logistic ridge regression) in a derivation set (patients treated in 2013). Calibration and discrimination of the models were assessed in an internal validation sample (patients treated in 2014) using calibration curves, Brier scores, receiver operating characteristic curves (ROC curves) and the areas under the ROC curves (AUC). 30- and 90-day mortality rates in the learning-sample were 5.7 and 8.4%, respectively. The corresponding values in the validation sample were 5.9% and once more 8.4%. Models based on Elixhauser comorbidities exhibited the highest discriminatory power with AUC values of 0.804 (95% CI: 0.776-0.832) and 0.805 (95% CI: 0.782-0.828) for 30- and 90-day mortality. The Brier scores for these models were 0.050 (95% CI: 0.044-0.056) and 0.067 (95% CI: 0.060-0.074) and similar to the models based on Charlson conditions.
Regardless of the model, low predicted probabilities were well calibrated, while higher predicted values tended to be overestimates. The reasonable results regarding discrimination and calibration notwithstanding, models based on hospital remuneration data may not be helpful for P4P. Routine data do not offer information regarding a wide range of quality indicators more useful than mortality. As an alternative, models based on clinical registries may allow a wider, more valid perspective. © Georg Thieme Verlag KG Stuttgart · New York.
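The two headline metrics in this abstract, the Brier score (overall accuracy of predicted probabilities) and the AUC (discrimination), can be sketched in a few lines. This is a minimal illustration under assumed toy data, not the authors' evaluation code; the function names and the eight hypothetical patients are invented for the example.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the chance that a random event outranks a random non-event.

    Ties between an event and a non-event count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died within 30 days, 0 = survived.
outcomes = [0, 0, 1, 0, 1, 0, 0, 1]
predicted = [0.05, 0.10, 0.60, 0.20, 0.30, 0.02, 0.40, 0.80]

print(f"Brier score: {brier_score(outcomes, predicted):.3f}")
print(f"AUC: {roc_auc(outcomes, predicted):.3f}")
```

A lower Brier score is better, and it depends on the event rate, so scores are comparable only between models evaluated on the same sample. The pairwise AUC shown here is quadratic in sample size, which is acceptable for a sketch but not for large cohorts.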
Interpolation of the Radial Velocity Data from Coastal HF Radars
2013-01-01
practical applications and may help to solve many environmental problems caused by human activity. References [1] Alvera-Azcarate A., A. Barth, M. Rixen...surface temperature, Ocean Modelling, 9, 325-346. [2] Alvera-Azcarate, A., A. Barth, J.-M. Beckers, and R. H. Weisber, 2007: Multivariate...predictions from the global Navy Coastal Ocean Model (NCOM) during 1998-2001, J. Atmos. Oceanic Technol., 21(12), 1876-1894. [4] Barth, A., Alvera
Confronting uncertainty in flood damage predictions
NASA Astrophysics Data System (ADS)
Schröter, Kai; Kreibich, Heidi; Vogel, Kristin; Merz, Bruno
2015-04-01
Reliable flood damage models are a prerequisite for the practical usefulness of the model results. Oftentimes, traditional uni-variate damage models as for instance depth-damage curves fail to reproduce the variability of observed flood damage. Innovative multi-variate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks. For model evaluation we use empirical damage data which are available from computer aided telephone interviews that were respectively compiled after the floods in 2002, 2005 and 2006, in the Elbe and Danube catchments in Germany. We carry out a split sample test by sub-setting the damage records. One sub-set is used to derive the models and the remaining records are used to evaluate the predictive performance of the model. Further we stratify the sample according to catchments which allows studying model performance in a spatial transfer context. Flood damage estimation is carried out on the scale of the individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error) as well as in terms of reliability which is represented by the proportion of the number of observations that fall within the 95-quantile and 5-quantile predictive interval. The reliability of the probabilistic predictions within validation runs decreases only slightly and achieves a very good coverage of observations within the predictive interval. Probabilistic models provide quantitative information about prediction uncertainty which is crucial to assess the reliability of model predictions and improves the usefulness of model results.
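The three evaluation criteria named in this abstract (mean bias, mean absolute error, and the share of observations falling inside the 5%-95% predictive interval) can be written out as a short sketch. The function names and the five hypothetical buildings below are assumptions for illustration; the study's own data and code are not reproduced here.

```python
def mean_bias(observed, predicted):
    """Systematic deviation: average of (prediction - observation)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def mean_absolute_error(observed, predicted):
    """Precision: average absolute deviation of prediction from observation."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def interval_coverage(observed, lower, upper):
    """Reliability: share of observations inside the predictive interval."""
    inside = sum(lo <= o <= hi for o, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)


# Hypothetical relative damage (0 to 1) for five buildings.
observed = [0.10, 0.25, 0.05, 0.40, 0.15]
median_pred = [0.12, 0.20, 0.08, 0.30, 0.15]
q05 = [0.02, 0.05, 0.01, 0.10, 0.05]  # 5% quantile of each prediction
q95 = [0.30, 0.45, 0.20, 0.35, 0.40]  # 95% quantile of each prediction

print(f"Mean bias: {mean_bias(observed, median_pred):+.3f}")
print(f"Mean absolute error: {mean_absolute_error(observed, median_pred):.3f}")
print(f"Interval coverage: {interval_coverage(observed, q05, q95):.2f}")
```

For a well-calibrated probabilistic model, coverage of the 5%-95% interval should be close to 0.90; values far below that indicate overconfident predictions.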
Predicting volumes in four Hawaii hardwoods...first multivariate equations developed
David A. Sharpnack
1966-01-01
Multivariate regression equations were developed for predicting board-foot (Int. 1/4-inch log rule) and cubic-foot volumes in each 8.15-foot section of trees of four Hawaii hardwood species. The species are koa (Acacia koa), ohia (Metrosideros polymorpha), robusta eucalyptus (Eucalyptus robusta), and...
Selecting an Informative/Discriminating Multivariate Response for Inverse Prediction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thomas, Edward V.; Lewis, John. R.; Anderson-Cook, Christine Michaela
The inverse prediction is important in a variety of scientific and engineering applications, such as to predict properties/characteristics of an object by using multiple measurements obtained from it. Inverse prediction can be accomplished by inverting parameterized forward models that relate the measurements (responses) to the properties/characteristics of interest. Sometimes forward models are computational/science based; but often, forward models are empirically based response surface models, obtained by using the results of controlled experimentation. For empirical models, it is important that the experiments provide a sound basis to develop accurate forward models in terms of the properties/characteristics (factors). And while nature dictates the causal relationships between factors and responses, experimenters can control the complexity, accuracy, and precision of forward models constructed via selection of factors, factor levels, and the set of trials that are performed. Recognition of the uncertainty in the estimated forward models leads to an errors-in-variables approach for inverse prediction. The forward models (estimated by experiments or science based) can also be used to analyze how well candidate responses complement one another for inverse prediction over the range of the factor space of interest. Furthermore, one may find that some responses are complementary, redundant, or noninformative. Simple analysis and examples illustrate how an informative and discriminating subset of responses could be selected among candidates in cases where the number of responses that can be acquired during inverse prediction is limited by difficulty, expense, and/or availability of material.
Selecting an Informative/Discriminating Multivariate Response for Inverse Prediction
Thomas, Edward V.; Lewis, John. R.; Anderson-Cook, Christine Michaela; ...
2017-07-01
The inverse prediction is important in a variety of scientific and engineering applications, such as to predict properties/characteristics of an object by using multiple measurements obtained from it. Inverse prediction can be accomplished by inverting parameterized forward models that relate the measurements (responses) to the properties/characteristics of interest. Sometimes forward models are computational/science based; but often, forward models are empirically based response surface models, obtained by using the results of controlled experimentation. For empirical models, it is important that the experiments provide a sound basis to develop accurate forward models in terms of the properties/characteristics (factors). And while nature dictates the causal relationships between factors and responses, experimenters can control the complexity, accuracy, and precision of forward models constructed via selection of factors, factor levels, and the set of trials that are performed. Recognition of the uncertainty in the estimated forward models leads to an errors-in-variables approach for inverse prediction. The forward models (estimated by experiments or science based) can also be used to analyze how well candidate responses complement one another for inverse prediction over the range of the factor space of interest. Furthermore, one may find that some responses are complementary, redundant, or noninformative. Simple analysis and examples illustrate how an informative and discriminating subset of responses could be selected among candidates in cases where the number of responses that can be acquired during inverse prediction is limited by difficulty, expense, and/or availability of material.
Urquhart, Andrew G.; Hassett, Afton L.; Tsodikov, Alex; Hallstrom, Brian R.; Wood, Nathan I.; Williams, David A.; Clauw, Daniel J.
2015-01-01
Objective: While psychosocial factors have been associated with poorer outcomes after knee and hip arthroplasty, we hypothesized that augmented pain perception, as occurs in conditions such as fibromyalgia, may account for decreased responsiveness to primary knee and hip arthroplasty. Methods: A prospective, observational cohort study was conducted. Preoperative phenotyping was conducted using validated questionnaires to assess pain, function, depression, anxiety, and catastrophizing. Participants also completed the 2011 fibromyalgia survey questionnaire, which addresses the widespread body pain and comorbid symptoms associated with characteristics of fibromyalgia. Results: Of the 665 participants, 464 were retained 6 months after surgery. Since individuals who met criteria for being classified as having fibromyalgia were expected to respond less favorably, all primary analyses excluded these individuals (6% of the cohort). In the multivariate linear regression model predicting change in knee/hip pain (primary outcome), a higher fibromyalgia survey score was independently predictive of less improvement in pain (estimate −0.25, SE 0.044; P < 0.00001). Lower baseline joint pain scores and knee (versus hip) arthroplasty were also predictive of less improvement (R2 = 0.58). The same covariates were predictive in the multivariate logistic regression model for change in knee/hip pain, with a 17.8% increase in the odds of failure to meet the threshold of 50% improvement for every 1‐point increase in fibromyalgia survey score (P = 0.00032). The fibromyalgia survey score was also independently predictive of change in overall pain and patient global impression of change. Conclusion: Our findings indicate that the fibromyalgia survey score is a robust predictor of poorer arthroplasty outcomes, even among individuals whose score falls well below the threshold for the categorical diagnosis of fibromyalgia. PMID:25772388
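The "17.8% increase in the odds ... for every 1-point increase" reported above corresponds to an odds ratio of 1.178 per point. The sketch below converts that figure to the implied logistic regression coefficient and shows how odds compound over several points. It assumes the per-point odds ratio is constant across the score range, as a logistic model does; the 5-point comparison is an arithmetic illustration only, not a result reported in the abstract, and the function names are invented for the example.

```python
import math


def log_odds_coefficient(percent_increase_per_unit):
    """Logistic coefficient implied by a percent increase in odds per unit."""
    return math.log(1 + percent_increase_per_unit / 100)


def odds_multiplier(percent_increase_per_unit, units):
    """Factor by which the odds change for a difference of `units` points."""
    return (1 + percent_increase_per_unit / 100) ** units


beta = log_odds_coefficient(17.8)
print(f"Implied logistic coefficient per point: {beta:.4f}")
print(f"Odds multiplier for a 5-point higher score: {odds_multiplier(17.8, 5):.2f}")
```

Note that odds multiply rather than add: five points do not raise the odds by 5 x 17.8%, because each point scales the odds by 1.178 again.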
Self-consistent core-pedestal transport simulations with neural network accelerated models
Meneghini, Orso; Smith, Sterling P.; Snyder, Philip B.; ...
2017-07-12
Fusion whole device modeling simulations require comprehensive models that are simultaneously physically accurate, fast, robust, and predictive. In this paper we describe the development of two neural-network (NN) based models as a means to perform a non-linear multivariate regression of theory-based models for the core turbulent transport fluxes, and the pedestal structure. Specifically, we find that a NN-based approach can be used to consistently reproduce the results of the TGLF and EPED1 theory-based models over a broad range of plasma regimes, and with a computational speedup of several orders of magnitude. These models are then integrated into a predictive workflow that allows prediction with self-consistent core-pedestal coupling of the kinetic profiles within the last closed flux surface of the plasma. Finally, the NN paradigm is capable of breaking the speed-accuracy trade-off that is expected of traditional numerical physics models, and can provide the missing link towards self-consistent coupled core-pedestal whole device modeling simulations that are physically accurate and yet take only seconds to run.
Self-consistent core-pedestal transport simulations with neural network accelerated models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meneghini, Orso; Smith, Sterling P.; Snyder, Philip B.
Fusion whole device modeling simulations require comprehensive models that are simultaneously physically accurate, fast, robust, and predictive. In this paper we describe the development of two neural-network (NN) based models as a means to perform a non-linear multivariate regression of theory-based models for the core turbulent transport fluxes, and the pedestal structure. Specifically, we find that a NN-based approach can be used to consistently reproduce the results of the TGLF and EPED1 theory-based models over a broad range of plasma regimes, and with a computational speedup of several orders of magnitude. These models are then integrated into a predictive workflow that allows prediction with self-consistent core-pedestal coupling of the kinetic profiles within the last closed flux surface of the plasma. Finally, the NN paradigm is capable of breaking the speed-accuracy trade-off that is expected of traditional numerical physics models, and can provide the missing link towards self-consistent coupled core-pedestal whole device modeling simulations that are physically accurate and yet take only seconds to run.
Self-consistent core-pedestal transport simulations with neural network accelerated models
NASA Astrophysics Data System (ADS)
Meneghini, O.; Smith, S. P.; Snyder, P. B.; Staebler, G. M.; Candy, J.; Belli, E.; Lao, L.; Kostuk, M.; Luce, T.; Luda, T.; Park, J. M.; Poli, F.
2017-08-01
Fusion whole device modeling simulations require comprehensive models that are simultaneously physically accurate, fast, robust, and predictive. In this paper we describe the development of two neural-network (NN) based models as a means to perform a non-linear multivariate regression of theory-based models for the core turbulent transport fluxes, and the pedestal structure. Specifically, we find that a NN-based approach can be used to consistently reproduce the results of the TGLF and EPED1 theory-based models over a broad range of plasma regimes, and with a computational speedup of several orders of magnitude. These models are then integrated into a predictive workflow that allows prediction with self-consistent core-pedestal coupling of the kinetic profiles within the last closed flux surface of the plasma. The NN paradigm is capable of breaking the speed-accuracy trade-off that is expected of traditional numerical physics models, and can provide the missing link towards self-consistent coupled core-pedestal whole device modeling simulations that are physically accurate and yet take only seconds to run.