Sample records for multivariate regression techniques

  1. Linear regression analysis and its application to multivariate chromatographic calibration for the quantitative analysis of two-component mixtures.

    PubMed

    Dinç, Erdal; Ozdemir, Abdil

    2005-01-01

    A multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures of enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The algorithm is based on linear regression equations constructed from the relationship between concentration and peak area at a set of five wavelengths. The algorithm of this calibration model, which has simple mathematical content, is briefly described. This approach is a powerful mathematical tool for optimum chromatographic multivariate calibration and for eliminating fluctuations arising from instrumental and experimental conditions. The calibration reduces the multivariate linear regression functions to a univariate data set. The model was validated by analyzing various synthetic binary mixtures and by using the standard addition technique. The developed calibration technique was applied to the analysis of real pharmaceutical tablets containing EA and HCT. The results were compared with those obtained by the classical HPLC method. The proposed multivariate chromatographic calibration was observed to give better results than classical HPLC.

  2. Electricity Consumption in the Industrial Sector of Jordan: Application of Multivariate Linear Regression and Adaptive Neuro-Fuzzy Techniques

    NASA Astrophysics Data System (ADS)

    Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.

    2009-08-01

    In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
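
    The comparison criterion named in this abstract, the square root of the average squared error, can be sketched in a few lines of Python. This is a minimal illustration only: the function name `rmse` and all numbers are hypothetical and are not taken from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical consumption values (arbitrary units), not data from the study.
observed = [10.0, 12.0, 15.0, 11.0]
model_a = [9.0, 13.0, 14.0, 12.0]
model_b = [10.5, 12.5, 14.5, 11.5]

print(rmse(observed, model_a))
print(rmse(observed, model_b))
```

    The model with the smaller value fits the held-out observations more closely under this criterion.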

  3. Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM]

    ERIC Educational Resources Information Center

    Warner, Rebecca M.

    2007-01-01

    This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…

  4. Computation of nonlinear least squares estimator and maximum likelihood using principles in matrix calculus

    NASA Astrophysics Data System (ADS)

    Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.

    2017-11-01

    This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE) and a linear pseudo model for a nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However, the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to get a linear pseudo model for a nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for a nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et al. used empirical process techniques to study the asymptotics of the LSE (least-squares estimation) for the fitting of a nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation

  5. Characterizing multivariate decoding models based on correlated EEG spectral features

    PubMed Central

    McFarland, Dennis J.

    2013-01-01

    Objective: Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods: Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results: The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features, interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions: Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance: While multivariate decoding algorithms are very useful for prediction, their utility for interpretation may be limited when predictors are correlated. PMID:23466267
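
    The interpretive problem with correlated predictors can be illustrated with a constructed toy case. The sketch below assumes two nearly collinear predictors (`x1`, `x2`) and a target that depends on `x1` only; the data, the helper `centered_sums`, and all variable names are hypothetical and unrelated to the EEG features in the study.

```python
def centered_sums(u, v):
    """Sum of products of deviations from the mean for two sequences."""
    mu = sum(u) / len(u)
    mv = sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

# Hypothetical toy data: x2 is a slightly perturbed copy of x1, and the
# target depends on x1 only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]
y = [2.0 * v for v in x1]

s11, s22, s12 = centered_sums(x1, x1), centered_sums(x2, x2), centered_sums(x1, x2)
s1y, s2y = centered_sums(x1, y), centered_sums(x2, y)

# Univariate slopes: each predictor fitted on its own.
uni_b1 = s1y / s11
uni_b2 = s2y / s22

# Two-predictor least squares weights (closed form for the centered problem).
det = s11 * s22 - s12 ** 2
multi_b1 = (s22 * s1y - s12 * s2y) / det
multi_b2 = (s11 * s2y - s12 * s1y) / det

correlation = s12 / (s11 * s22) ** 0.5
print(correlation, uni_b1, uni_b2, multi_b1, multi_b2)
```

    In this constructed case the joint fit places essentially all of the weight on `x1` and none on `x2`, although `x2` on its own predicts `y` almost as well. Reading the joint weights as a measure of each predictor's relevance would therefore be misleading, which is why a comparison with univariate statistics is useful.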

  6. Validation of cross-sectional time series and multivariate adaptive regression splines models for the prediction of energy expenditure in children and adolescents using doubly labeled water

    USDA-ARS's Scientific Manuscript database

    Accurate, nonintrusive, and inexpensive techniques are needed to measure energy expenditure (EE) in free-living populations. Our primary aim in this study was to validate cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on observable participant cha...

  7. Characterizing multivariate decoding models based on correlated EEG spectral features.

    PubMed

    McFarland, Dennis J

    2013-07-01

    Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  8. Newer classification and regression tree techniques: Bagging and Random Forests for ecological prediction

    Treesearch

    Anantha M. Prasad; Louis R. Iverson; Andy Liaw

    2006-01-01

    We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.

  9. Stock price forecasting for companies listed on Tehran stock exchange using multivariate adaptive regression splines model and semi-parametric splines technique

    NASA Astrophysics Data System (ADS)

    Rounaghi, Mohammad Mahdi; Abbaszadeh, Mohammad Reza; Arashi, Mohammad

    2015-11-01

    One of the most important topics of interest to investors is stock price changes. Investors whose goals are long term are sensitive to stock price and its changes and react to them. In this regard, we used the multivariate adaptive regression splines (MARS) model and a semi-parametric splines technique for predicting stock price in this study. The MARS model, as a nonparametric method, is an adaptive method for regression and is suited to problems with high dimensions and several variables. A semi-parametric splines technique was also used in this study. Smoothing splines is a nonparametric regression method. In this study, we used 40 variables (30 accounting variables and 10 economic variables) for predicting stock price using the MARS model and the semi-parametric splines technique. After investigating the models, we selected 4 accounting variables (book value per share, predicted earnings per share, P/E ratio and risk) as influencing variables for predicting stock price using the MARS model. After fitting the semi-parametric splines technique, only 4 accounting variables (dividends, net EPS, EPS forecast and P/E ratio) were selected as variables effective in forecasting stock prices.

  10. Local polynomial estimation of heteroscedasticity in a multivariate linear regression model and its applications in economics.

    PubMed

    Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan

    2012-01-01

    Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
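
    The general two-stage idea (estimate the variance function nonparametrically, then refit with weights) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: it uses one predictor, a local-constant (degree 0) Gaussian kernel smoother in place of a higher-order local polynomial, a fixed bandwidth, and simulated data. The names `weighted_line_fit` and `kernel_smooth` are hypothetical.

```python
import math
import random

def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def kernel_smooth(x, values, bandwidth):
    """Local-constant Gaussian kernel smoother evaluated at each x."""
    out = []
    for x0 in x:
        k = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
        out.append(sum(ki * vi for ki, vi in zip(k, values)) / sum(k))
    return out

# Simulated data (assumption): noise standard deviation grows with x.
random.seed(0)
n = 200
x = [10.0 * i / (n - 1) for i in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 0.2 + 0.3 * xi) for xi in x]

# Stage 1: ordinary least squares (equal weights), then smooth the squared
# residuals to estimate the variance function without assuming its form.
a0, b0 = weighted_line_fit(x, y, [1.0] * n)
sq_resid = [(yi - a0 - b0 * xi) ** 2 for xi, yi in zip(x, y)]
var_hat = kernel_smooth(x, sq_resid, bandwidth=1.0)

# Stage 2: refit with weights inversely proportional to the estimated variance.
a1, b1 = weighted_line_fit(x, y, [1.0 / v for v in var_hat])
print(a1, b1)
```

    The bandwidth of 1.0 is an arbitrary choice for the sketch; in practice it would be selected from the data.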

  11. Multivariate statistical analysis: Principles and applications to coorbital streams of meteorite falls

    NASA Technical Reports Server (NTRS)

    Wolf, S. F.; Lipschutz, M. E.

    1993-01-01

    Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 to 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that by a totally different criterion, labile trace element contents (hence thermal histories), 13 Cluster 1 meteorites are distinguishable from 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.

  12. Diagnostic tools for nearest neighbors techniques when used with satellite imagery

    Treesearch

    Ronald E. McRoberts

    2009-01-01

    Nearest neighbors techniques are non-parametric approaches to multivariate prediction that are useful for predicting both continuous and categorical forest attribute variables. Although some assumptions underlying nearest neighbor techniques are common to other prediction techniques such as regression, other assumptions are unique to nearest neighbor techniques....

  13. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  14. Quantitative monitoring of sucrose, reducing sugar and total sugar dynamics for phenotyping of water-deficit stress tolerance in rice through spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini

    2018-03-01

    In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in the near infrared (NIR) range, viz. ratio spectral index (RSI) and normalised difference spectral indices (NDSI), sensitive to sucrose, reducing sugar and total sugar content were identified and subsequently calibrated and validated. The RSI and NDSI models had R² values of 0.65, 0.71 and 0.67 and RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively, for the validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be MARS, ANN and MARS, respectively, with RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress, as this technique is fast, economic, and noninvasive.
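
    The abstract does not define its indices or the RPD statistic, so the sketch below assumes the commonly used forms: RSI as a ratio of reflectances at two bands, NDSI as their normalised difference, and RPD as the standard deviation of the reference values divided by the root mean squared error of prediction. The function names and the numbers are hypothetical.

```python
import math
import statistics

def ratio_spectral_index(r_a, r_b):
    """RSI: reflectance at band a divided by reflectance at band b."""
    return r_a / r_b

def normalised_difference_spectral_index(r_a, r_b):
    """NDSI: (R_a - R_b) / (R_a + R_b)."""
    return (r_a - r_b) / (r_a + r_b)

def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of reference values / RMSE."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return statistics.stdev(observed) / math.sqrt(mse)

# Hypothetical reflectances at two bands and hypothetical validation values.
print(ratio_spectral_index(0.6, 0.2))
print(normalised_difference_spectral_index(0.6, 0.2))
print(rpd([2.0, 4.0, 6.0, 8.0], [3.0, 3.0, 7.0, 7.0]))
```

    A larger RPD means the prediction error is small relative to the natural spread of the reference measurements.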

  15. Biostatistics Series Module 10: Brief Overview of Multivariate Methods.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2017-01-01

    Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
    The calculation-intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with wider availability and increasing sophistication of statistical software, and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.

  16. A diagnostic analysis of the VVP single-doppler retrieval technique

    NASA Technical Reports Server (NTRS)

    Boccippio, Dennis J.

    1995-01-01

    A diagnostic analysis of the VVP (volume velocity processing) retrieval method is presented, with emphasis on understanding the technique as a linear, multivariate regression. Similarities and differences to the velocity-azimuth display and extended velocity-azimuth display retrieval techniques are discussed, using this framework. Conventional regression diagnostics are then employed to quantitatively determine situations in which the VVP technique is likely to fail. An algorithm for preparation and analysis of a robust VVP retrieval is developed and applied to synthetic and actual datasets with high temporal and spatial resolution. A fundamental (but quantifiable) limitation to some forms of VVP analysis is inadequate sampling dispersion in the n space of the multivariate regression, manifest as a collinearity between the basis functions of some fitted parameters. Such collinearity may be present either in the definition of these basis functions or in their realization in a given sampling configuration. This nonorthogonality may cause numerical instability, variance inflation (decrease in robustness), and increased sensitivity to bias from neglected wind components. It is shown that these effects prevent the application of VVP to small azimuthal sectors of data. The behavior of the VVP regression is further diagnosed over a wide range of sampling constraints, and reasonable sector limits are established.
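
    The collinearity that arises in narrow azimuthal sectors can be illustrated with a deliberately reduced toy case. The sketch assumes only two basis functions, a constant and cos(azimuth), rather than the full VVP parameter set, and uses the condition number of their scaled 2x2 Gram matrix as the diagnostic. The function name `gram_condition_number` and the sampling choices are hypothetical.

```python
import math

def gram_condition_number(azimuths_deg):
    """Condition number of the 2x2 Gram matrix of the unit-scaled columns
    [1, cos(azimuth)]; large values signal near-collinearity."""
    cosines = [math.cos(math.radians(a)) for a in azimuths_deg]
    n = len(cosines)
    # Cosine of the angle between the constant column and the cos column.
    c = sum(cosines) / math.sqrt(n * sum(v * v for v in cosines))
    c = min(abs(c), 1.0 - 1e-15)  # guard against division by zero
    # Eigenvalues of [[1, c], [c, 1]] are 1 + c and 1 - c.
    return (1.0 + c) / (1.0 - c)

full_circle = [float(a) for a in range(0, 360)]      # 1-degree sampling, 360 degrees
narrow_sector = [float(a) for a in range(-15, 16)]   # 30-degree sector

cond_full = gram_condition_number(full_circle)
cond_sector = gram_condition_number(narrow_sector)
print(cond_full, cond_sector)
```

    Over the full circle the two functions are orthogonal and the condition number is close to 1. Over the narrow sector cos(azimuth) is almost constant, so it is nearly indistinguishable from the constant term and the condition number becomes very large, which is the kind of instability the abstract attributes to small azimuthal sectors.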

  17. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
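
    A minimal sketch of multiple linear regression, assuming ordinary least squares solved through the normal equations with plain Gaussian elimination. The helper names and the noise-free toy data are hypothetical; the confidence intervals and diagnostics discussed in the article are not computed here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_multiple_linear_regression(predictors, outcome):
    """Least squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0] + list(row) for row in predictors]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical noise-free data generated from y = 1 + 2*x1 - 3*x2.
predictors = [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (3.0, 1.0), (4.0, 3.0)]
outcome = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in predictors]
coefficients = fit_multiple_linear_regression(predictors, outcome)
print(coefficients)
```

    The normal equations are used here for transparency; they become numerically fragile when predictors are strongly correlated, which is one reason the article stresses multicollinearity.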

  18. Transition from a multiport technique to a single-port technique for lung cancer surgery: is lymph node dissection inferior using the single-port technique?†.

    PubMed

    Liu, Chia-Chuan; Shih, Chih-Shiun; Pennarun, Nicolas; Cheng, Chih-Tao

    2016-01-01

    The feasibility and radicalism of lymph node dissection for lung cancer surgery by a single-port technique has frequently been challenged. We performed a retrospective cohort study to investigate this issue. Two chest surgeons initiated multiple-port thoracoscopic surgery in a 180-bed cancer centre in 2005 and shifted to a single-port technique gradually after 2010. Data, including demographic and clinical information, from 389 patients receiving multiport thoracoscopic lobectomy or segmentectomy and 149 consecutive patients undergoing either single-port lobectomy or segmentectomy for primary non-small-cell lung cancer were retrieved and entered for statistical analysis by multivariable linear regression models and Box-Cox transformed multivariable analysis. The mean number of total dissected lymph nodes in the lobectomy group was 28.5 ± 11.7 for the single-port group versus 25.2 ± 11.3 for the multiport group; the mean number of total dissected lymph nodes in the segmentectomy group was 19.5 ± 10.8 for the single-port group versus 17.9 ± 10.3 for the multiport group. In linear multivariable and after Box-Cox transformed multivariable analyses, the single-port approach was still associated with a higher total number of dissected lymph nodes. The total number of dissected lymph nodes for primary lung cancer surgery by single-port video-assisted thoracoscopic surgery (VATS) was higher than by multiport VATS in univariable, multivariable linear regression and Box-Cox transformed multivariable analyses. This study confirmed that highly effective lymph node dissection could be achieved through single-port VATS in our setting. © The Author 2015. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery. All rights reserved.
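
    For reference, the Box-Cox transformation mentioned above has a simple closed form: (x^lambda - 1) / lambda for lambda not equal to 0, and ln(x) for lambda equal to 0, defined for positive x. The sketch below is illustrative only; the function name `box_cox` and the lambda values are hypothetical, and the abstract does not report which lambda was used.

```python
import math

def box_cox(values, lam):
    """Box-Cox power transformation of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

# Hypothetical counts, not data from the study.
counts = [4.0, 9.0, 16.0, 25.0]
print(box_cox(counts, 0.5))  # square-root-like transform
print(box_cox(counts, 0))    # natural logarithm
```

    In a regression setting the transform is applied to the outcome before fitting, so that the residuals are closer to normal with constant variance.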

  19. PM10 modeling in the Oviedo urban area (Northern Spain) by using multivariate adaptive regression splines

    NASA Astrophysics Data System (ADS)

    Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza

    2014-10-01

    The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. 
    Finally, on the basis of these numerical calculations, using the multivariate adaptive regression splines (MARS) technique, the conclusions of this research work are presented.
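
    MARS builds its models from hinge functions of the form max(0, x - t). The sketch below is a single-knot toy version under stated assumptions: it tries a few candidate knots and keeps the one with the smallest squared error for a model with an intercept and one hinge term. It omits the forward and backward passes, interactions, and pruning of the full method, and the data and names (`hinge`, `fit_line`) are hypothetical.

```python
def hinge(x, knot):
    """MARS-style hinge basis function max(0, x - knot)."""
    return max(0.0, x - knot)

def fit_line(h, y):
    """Least squares fit of y = intercept + slope*h; returns (intercept, slope, sse)."""
    n = len(h)
    mh, my = sum(h) / n, sum(y) / n
    shh = sum((a - mh) ** 2 for a in h)
    shy = sum((a - mh) * (b - my) for a, b in zip(h, y))
    slope = shy / shh
    intercept = my - slope * mh
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(h, y))
    return intercept, slope, sse

# Hypothetical data: flat up to x = 3, then rising with slope 2.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0 * max(0.0, v - 3.0) for v in x]

best = None
for knot in [1.0, 2.0, 3.0, 4.0, 5.0]:
    intercept, slope, sse = fit_line([hinge(v, knot) for v in x], y)
    if best is None or sse < best[3]:
        best = (knot, intercept, slope, sse)

best_knot, best_intercept, best_slope, best_sse = best
print(best_knot, best_intercept, best_slope, best_sse)
```

    The fitted model is a piecewise-linear function that can be read directly from the knot and slope, which is the sense in which such models are described as simple and easy to interpret.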

  20. Statistical Evaluation of Time Series Analysis Techniques

    NASA Technical Reports Server (NTRS)

    Benignus, V. A.

    1973-01-01

    The performance of a modified version of NASA's multivariate spectrum analysis program is discussed. A multiple regression model was used to make the revisions. Performance improvements were documented and compared to the standard fast Fourier transform by Monte Carlo techniques.

  1. MANCOVA for one way classification with homogeneity of regression coefficient vectors

    NASA Astrophysics Data System (ADS)

    Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.

    2017-11-01

    MANOVA and MANCOVA are extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector-valued observations. The assumption of a Gaussian distribution is replaced with the multivariate Gaussian distribution for the data vectors and residual terms in the statistical models of these techniques. The objective of MANCOVA is to determine if there are statistically reliable mean differences that can be demonstrated between groups later modifying the newly created variable. When random assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, the MANCOVA technique is extended to a larger number of covariates, and homogeneity of regression coefficient vectors is also tested.

  2. Quantitative analysis of binary polymorphs mixtures of fusidic acid by diffuse reflectance FTIR spectroscopy, diffuse reflectance FT-NIR spectroscopy, Raman spectroscopy and multivariate calibration.

    PubMed

    Guo, Canyong; Luo, Xuefang; Zhou, Xiaohua; Shi, Beijia; Wang, Juanjuan; Zhao, Jinqi; Zhang, Xiaoxia

    2017-06-05

    Vibrational spectroscopic techniques such as infrared, near-infrared and Raman spectroscopy have become popular in detecting and quantifying polymorphism of pharmaceutics since they are fast and non-destructive. This study assessed the ability of three vibrational spectroscopy techniques combined with multivariate analysis to quantify a low-content undesired polymorph within a binary polymorphic mixture. Partial least squares (PLS) regression and support vector machine (SVM) regression were employed to build quantitative models. Fusidic acid, a steroidal antibiotic, was used as the model compound. It was found that PLS regression performed slightly better than SVM regression in all three spectroscopic techniques. Root mean square errors of prediction (RMSEP) ranged from 0.48% to 1.17% for diffuse reflectance FTIR spectroscopy, from 1.60% to 1.93% for diffuse reflectance FT-NIR spectroscopy and from 1.62% to 2.31% for Raman spectroscopy. The results indicate that diffuse reflectance FTIR spectroscopy offers significant advantages in providing accurate measurement of polymorphic content in the fusidic acid binary mixtures, while Raman spectroscopy is the least accurate technique for quantitative analysis of polymorphs. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. Improving Prediction Accuracy for WSN Data Reduction by Applying Multivariate Spatio-Temporal Correlation

    PubMed Central

    Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman

    2011-01-01

    This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we are the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626

  4. Multivariate Bias Correction Procedures for Improving Water Quality Predictions from the SWAT Model

    NASA Astrophysics Data System (ADS)

    Arumugam, S.; Libera, D.

    2017-12-01

    Water quality observations are usually not available on a continuous basis for longer than 1-2 years at a time over a decadal period given the labor requirements, making calibrating and validating mechanistic models difficult. Further, any physical model predictions inherently have bias (i.e., under/over estimation) and require post-simulation techniques to preserve the long-term mean monthly attributes. This study suggests a multivariate bias-correction technique and compares it to a common technique in improving the performance of the SWAT model in predicting daily streamflow and TN loads across the southeast based on split-sample validation. The approach is a dimension reduction technique, canonical correlation analysis (CCA), that regresses the observed multivariate attributes with the SWAT model simulated values. The common approach is a regression-based technique that uses an ordinary least squares regression to adjust model values. The observed cross-correlation between loadings and streamflow is better preserved when using canonical correlation while simultaneously reducing individual biases. Additionally, canonical correlation analysis does a better job in preserving the observed joint likelihood of observed streamflow and loadings. These procedures were applied to 3 watersheds chosen from the Water Quality Network in the Southeast Region; specifically, watersheds with sufficiently large drainage areas and number of observed data points. The performance of these two approaches is compared for the observed period and over a multi-decadal period using loading estimates from the USGS LOADEST model. Lastly, the CCA technique is applied in a forecasting sense by using 1-month ahead forecasts of P & T from ECHAM4.5 as forcings in the SWAT model. Skill in using the SWAT model for forecasting loadings and streamflow at the monthly and seasonal timescale is also discussed.

  5. The role of chemometrics in single and sequential extraction assays: a review. Part II. Cluster analysis, multiple linear regression, mixture resolution, experimental design and other techniques.

    PubMed

    Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo

    2011-03-04

    Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. Copyright © 2010 Elsevier B.V. All rights reserved.

  6. Quality Reporting of Multivariable Regression Models in Observational Studies: Review of a Representative Sample of Articles Published in Biomedical Journals.

    PubMed

    Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M

    2016-05-01

    Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. Review of a representative sample of articles indexed in MEDLINE (n = 428) with observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimate, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature.

  7. Multivariate time series analysis of neuroscience data: some challenges and opportunities.

    PubMed

    Pourahmadi, Mohsen; Noorbaloochi, Siamak

    2016-04-01

    Neuroimaging data may be viewed as high-dimensional multivariate time series, and analyzed using techniques from regression analysis, time series analysis and spatiotemporal analysis. We discuss issues related to data quality, model specification, estimation, interpretation, dimensionality and causality. Some recent research areas addressing aspects of some recurring challenges are introduced. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Cole-Cole, linear and multivariate modeling of capacitance data for on-line monitoring of biomass.

    PubMed

    Dabros, Michal; Dennewald, Danielle; Currie, David J; Lee, Mark H; Todd, Robert W; Marison, Ian W; von Stockar, Urs

    2009-02-01

    This work evaluates three techniques of calibrating capacitance (dielectric) spectrometers used for on-line monitoring of biomass: modeling of cell properties using the theoretical Cole-Cole equation, linear regression of dual-frequency capacitance measurements on biomass concentration, and multivariate (PLS) modeling of scanning dielectric spectra. The performance and robustness of each technique are assessed during a sequence of validation batches in two experimental settings of differing signal noise. In more noisy conditions, the Cole-Cole model had significantly higher biomass concentration prediction errors than the linear and multivariate models. The PLS model was the most robust in handling signal noise. In less noisy conditions, the three models performed similarly. Estimates of the mean cell size were additionally made using the Cole-Cole and PLS models, the latter technique giving more satisfactory results.

  9. [Use of multiple regression models in observational studies (1970-2013) and requirements of the STROBE guidelines in Spanish scientific journals].

    PubMed

    Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M

    In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines in Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models such as logistic, linear, Cox and Poisson regression are increasingly used in published observational studies, both at the international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Publicado por Elsevier España, S.L.U. All rights reserved.

  10. Multivariate decoding of brain images using ordinal regression.

    PubMed

    Doyle, O M; Ashburner, J; Zelaya, F O; Williams, S C R; Mehta, M A; Marquand, A F

    2013-11-01

    Neuroimaging data are increasingly being used to predict potential outcomes or groupings, such as clinical severity, drug dose response, and transitional illness states. In these examples, the variable (target) we want to predict is ordinal in nature. Conventional classification schemes assume that the targets are nominal and hence ignore their ranked nature, whereas parametric and/or non-parametric regression models enforce a metric notion of distance between classes. Here, we propose a novel, alternative multivariate approach that overcomes these limitations - whole brain probabilistic ordinal regression using a Gaussian process framework. We applied this technique to two data sets of pharmacological neuroimaging data from healthy volunteers. The first study was designed to investigate the effect of ketamine on brain activity and its subsequent modulation with two compounds - lamotrigine and risperidone. The second study investigates the effect of scopolamine on cerebral blood flow and its modulation using donepezil. We compared ordinal regression to multi-class classification schemes and metric regression. Considering the modulation of ketamine with lamotrigine, we found that ordinal regression significantly outperformed multi-class classification and metric regression in terms of accuracy and mean absolute error. However, for risperidone ordinal regression significantly outperformed metric regression but performed similarly to multi-class classification both in terms of accuracy and mean absolute error. For the scopolamine data set, ordinal regression was found to outperform both multi-class and metric regression techniques considering the regional cerebral blood flow in the anterior cingulate cortex. Ordinal regression was thus the only method that performed well in all cases. Our results indicate the potential of an ordinal regression approach for neuroimaging data while providing a fully probabilistic framework with elegant approaches for model selection. Copyright © 2013. Published by Elsevier Inc.
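
    The two evaluation measures named in this abstract, accuracy and mean absolute error, respond differently to the ranked nature of ordinal targets. The sketch below is a hypothetical illustration only: the rank coding 0 < 1 < 2 and the prediction vectors are invented, and it does not reproduce the Gaussian process ordinal regression itself.

```python
def accuracy(y_true, y_pred):
    """Fraction of exactly correct labels; ignores how far off a miss is."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average rank distance between true and predicted labels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical ordinal targets coded as ranks (e.g. three dose levels).
y_true = [0, 1, 2, 2, 1, 0]
near_miss = [0, 1, 1, 2, 2, 0]  # two errors, each off by one rank
far_miss = [0, 1, 0, 2, 1, 2]   # two errors, each off by two ranks

acc_near, acc_far = accuracy(y_true, near_miss), accuracy(y_true, far_miss)
mae_near = mean_absolute_error(y_true, near_miss)
mae_far = mean_absolute_error(y_true, far_miss)
```

    Both prediction vectors have the same accuracy, but the mean absolute error is larger when the misses are further from the true rank. Note that computing the error on integer rank codes assumes equal spacing between classes, which is itself a modelling choice.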

  11. Nonlinear multivariate and time series analysis by neural network methods

    NASA Astrophysics Data System (ADS)

    Hsieh, William W.

    2004-03-01

    Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.
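
    As a small illustration of the linear PCA step in the hierarchy described above, the leading principal component can be approximated by power iteration on the sample covariance matrix. This is a generic textbook sketch, not the neural network method of the paper; the function names and the toy data are hypothetical.

```python
from math import sqrt

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
            for a in range(p)]

def leading_component(cov, n_iter=100):
    """Approximate the dominant eigenvector of cov by power iteration.

    Assumes the start vector is not orthogonal to that eigenvector and that
    cov is not the zero matrix; a robust implementation would check both.
    """
    p = len(cov)
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical data lying exactly on the line y = x.
data = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
pc1 = leading_component(covariance_matrix(data))
```

    For data on a straight line, one linear component captures all the variance. The abstract's point is that curved or oscillatory structure is not captured this way and tends to be scattered across several linear modes.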

  12. A New Predictive Model of Centerline Segregation in Continuous Cast Steel Slabs by Using Multivariate Adaptive Regression Splines Approach

    PubMed Central

    García Nieto, Paulino José; González Suárez, Victor Manuel; Álvarez Antón, Juan Carlos; Mayo Bayón, Ricardo; Sirgo Blanco, José Ángel; Díaz Fernández, Ana María

    2015-01-01

    The aim of this study was to obtain a predictive model able to perform an early detection of central segregation severity in continuous cast steel slabs. Segregation in steel cast products is an internal defect that can be very harmful when slabs are rolled in heavy plate mills. In this research work, the central segregation was studied with success using the data mining methodology based on multivariate adaptive regression splines (MARS) technique. For this purpose, the most important physical-chemical parameters are considered. The results of the present study are two-fold. In the first place, the significance of each physical-chemical variable on the segregation is presented through the model. Second, a model for forecasting segregation is obtained. Regression with optimal hyperparameters was performed, and coefficients of determination of 0.93 for continuity factor estimation and 0.95 for average width were obtained when the MARS technique was applied to the experimental dataset. The agreement between experimental data and the model confirmed the good performance of the latter.
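
    MARS models are built from hinge basis functions of the form max(0, x - c) and max(0, c - x), where c is a knot. The sketch below only evaluates such a model; the coefficients and the knot are hypothetical, and the forward/backward term selection that MARS performs during fitting is not shown.

```python
def hinge(x, knot, direction=1):
    """MARS basis function: max(0, x - knot) if direction is 1,
    max(0, knot - x) if direction is -1."""
    return max(0.0, direction * (x - knot))

def mars_predict(x, intercept, terms):
    """Evaluate intercept + sum(coef * hinge) for a one-variable model.

    terms is a list of (coef, knot, direction) tuples.
    """
    return intercept + sum(coef * hinge(x, knot, direction)
                           for coef, knot, direction in terms)

# Hypothetical fitted model with a mirrored pair of hinges at knot x = 3:
# slope 2.0 to the right of the knot, slope 0.5 to the left of it.
terms = [(2.0, 3.0, 1), (-0.5, 3.0, -1)]
right_of_knot = mars_predict(5.0, 1.0, terms)
left_of_knot = mars_predict(1.0, 1.0, terms)
at_knot = mars_predict(3.0, 1.0, terms)
```

    The result is a piecewise-linear response whose slope changes at the knot, which is how a MARS model can follow nonlinear relationships between process parameters and a response.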

  13. Membrane Introduction Mass Spectrometry Combined with an Orthogonal Partial-Least Squares Calibration Model for Mixture Analysis.

    PubMed

    Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu

    2017-01-01

    The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
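
    The recovery and relative standard deviation (RSD) figures quoted in this abstract are usually computed as shown below. This is a sketch using the common textbook definitions; the paper's exact protocol is not given in the abstract, and the sample values are hypothetical.

```python
from statistics import mean, stdev

def recovery_percent(measured, spiked):
    """Recovery: measured concentration as a percentage of the known amount."""
    return 100.0 * measured / spiked

def rsd_percent(replicates):
    """Relative standard deviation: sample standard deviation over the mean."""
    return 100.0 * stdev(replicates) / mean(replicates)

# Hypothetical values: 9.0 units found for 10.0 units added,
# and three replicate measurements of one sample.
recovery = recovery_percent(9.0, 10.0)
repeatability_rsd = rsd_percent([9.0, 10.0, 11.0])
```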

  14. Analysis of Forest Foliage Using a Multivariate Mixture Model

    NASA Technical Reports Server (NTRS)

    Hlavka, C. A.; Peterson, David L.; Johnson, L. F.; Ganapol, B.

    1997-01-01

    Data with wet chemical measurements and near infrared spectra of ground leaf samples were analyzed to test a multivariate regression technique for estimating component spectra which is based on a linear mixture model for absorbance. The resulting unmixed spectra for carbohydrates, lignin, and protein resemble the spectra of extracted plant starches, cellulose, lignin, and protein. The unmixed protein spectrum has prominent absorption spectra at wavelengths which have been associated with nitrogen bonds.

  15. Inference for multivariate regression model based on multiply imputed synthetic data generated via posterior predictive sampling

    NASA Astrophysics Data System (ADS)

    Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.

    2017-06-01

    The recent popularity of the use of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods of generating and analyzing such data, but almost always relying on asymptotic distributions and in consequence not being adequate for small sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based on exact distributions, this procedure may even be used in small sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiter's combination rule to multiply imputed synthetic datasets, and an application to the 2000 Current Population Survey is discussed.

  16. Multivariate Analysis of Seismic Field Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Alam, M. Kathleen

    1999-06-01

    This report includes the details of the model building procedure and prediction of seismic field data. Principal Components Regression, a multivariate analysis technique, was used to model seismic data collected as two pieces of equipment were cycled on and off. Models built that included only the two pieces of equipment of interest had trouble predicting data containing signals not included in the model. Evidence for poor predictions came from the prediction curves as well as spectral F-ratio plots. Once the extraneous signals were included in the model, predictions improved dramatically. While Principal Components Regression performed well for the present data sets, the present data analysis suggests further work will be needed to develop more robust modeling methods as the data become more complex.

  17. Error Covariance Penalized Regression: A novel multivariate model combining penalized regression with multivariate error structure.

    PubMed

    Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C

    2018-06-29

    A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Modeling and managing risk early in software development

    NASA Technical Reports Server (NTRS)

    Briand, Lionel C.; Thomas, William M.; Hetmanski, Christopher J.

    1993-01-01

    In order to improve the quality of the software development process, we need to be able to build empirical multivariate models based on data collectable early in the software process. These models need to be both useful for prediction and easy to interpret, so that remedial actions may be taken in order to control and optimize the development process. We present an automated modeling technique which can be used as an alternative to regression techniques. We show how it can be used to facilitate the identification and aid the interpretation of the significant trends which characterize 'high risk' components in several Ada systems. Finally, we evaluate the effectiveness of our technique based on a comparison with logistic regression based models.

  19. LASSO NTCP predictors for the incidence of xerostomia in patients with head and neck squamous cell carcinoma and nasopharyngeal carcinoma

    PubMed Central

    Lee, Tsair-Fwu; Liou, Ming-Hsiang; Huang, Yu-Jie; Chao, Pei-Ju; Ting, Hui-Min; Lee, Hsiao-Yi

    2014-01-01

    To predict the incidence of moderate-to-severe patient-reported xerostomia among head and neck squamous cell carcinoma (HNSCC) and nasopharyngeal carcinoma (NPC) patients treated with intensity-modulated radiotherapy (IMRT). Multivariable normal tissue complication probability (NTCP) models were developed by using quality of life questionnaire datasets from 152 patients with HNSCC and 84 patients with NPC. The primary endpoint was defined as moderate-to-severe xerostomia after IMRT. The number of predictive factors for a multivariable logistic regression model was determined using the least absolute shrinkage and selection operator (LASSO) with bootstrapping technique. Four predictive models were achieved by LASSO with the smallest number of factors while preserving predictive value with higher AUC performance. For all models, the dosimetric factors for the mean dose given to the contralateral and ipsilateral parotid gland were selected as the most significant predictors. These were followed by different clinical and socio-economic factors, namely age, financial status, T stage, and education, depending on the model. The predicted incidence of xerostomia for HNSCC and NPC patients can be improved by using multivariable logistic regression models with LASSO technique. The predictive model developed in HNSCC cannot be generalized to NPC cohort treated with IMRT without validation and vice versa. PMID:25163814
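
    LASSO selects factors because its L1 penalty can shrink coefficients exactly to zero. The sketch below shows that mechanism with cyclic coordinate descent. It is deliberately simplified: it uses a squared-error loss rather than the logistic loss of the study, omits the bootstrapping step, and the function names, penalty value and toy data are all hypothetical.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; return exactly 0.0 inside [-lam, lam]."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k]
                                     for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta

# Hypothetical data: y depends strongly on feature 0 and weakly on feature 1.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.1, 1.9, -1.9, -2.1]
beta = lasso_coordinate_descent(X, y, lam=0.5)
```

    With this penalty the weak feature's coefficient becomes exactly zero while the strong one is shrunk but retained, which is the behaviour that lets LASSO reduce a candidate list to a small number of predictive factors.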

  20. Hierarchical Bayesian spatial models for predicting multiple forest variables using waveform LiDAR, hyperspectral imagery, and large inventory datasets

    USGS Publications Warehouse

    Finley, Andrew O.; Banerjee, Sudipto; Cook, Bruce D.; Bradford, John B.

    2013-01-01

    In this paper we detail a multivariate spatial regression model that couples LiDAR, hyperspectral and forest inventory data to predict forest outcome variables at a high spatial resolution. The proposed model is used to analyze forest inventory data collected on the US Forest Service Penobscot Experimental Forest (PEF), ME, USA. In addition to helping meet the regression model's assumptions, results from the PEF analysis suggest that the addition of multivariate spatial random effects improves model fit and predictive ability, compared with two commonly applied modeling approaches. This improvement results from explicitly modeling the covariation among forest outcome variables and spatial dependence among observations through the random effects. Direct application of such multivariate models to even moderately large datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. We apply a spatial dimension reduction technique to help overcome this computational hurdle without sacrificing richness in modeling.

  1. Assessing the sensitivity and robustness of prediction models for apple firmness using spectral scattering technique

    USDA-ARS's Scientific Manuscript database

    Spectral scattering is useful for nondestructive sensing of fruit firmness. Prediction models, however, are typically built using multivariate statistical methods such as partial least squares regression (PLSR), whose performance generally depends on the characteristics of the data. The aim of this ...

  2. Salting-out assisted liquid-liquid extraction and partial least squares regression to assay low molecular weight polycyclic aromatic hydrocarbons leached from soils and sediments

    NASA Astrophysics Data System (ADS)

    Bressan, Lucas P.; do Nascimento, Paulo Cícero; Schmidt, Marcella E. P.; Faccin, Henrique; de Machado, Leandro Carvalho; Bohrer, Denise

    2017-02-01

    A novel method was developed to determine low molecular weight polycyclic aromatic hydrocarbons in aqueous leachates from soils and sediments using a salting-out assisted liquid-liquid extraction, synchronous fluorescence spectrometry and a multivariate calibration technique. Several experimental parameters were controlled and the optimum conditions were: sodium carbonate as the salting-out agent at concentration of 2 mol L⁻¹, 3 mL of acetonitrile as extraction solvent, 6 mL of aqueous leachate, vortexing for 5 min and centrifuging at 4000 rpm for 5 min. The partial least squares calibration was optimized to the lowest values of root mean squared error and five latent variables were chosen for each of the targeted compounds. The regression coefficients for the true versus predicted concentrations were higher than 0.99. Figures of merit for the multivariate method were calculated, namely sensitivity, multivariate detection limit and multivariate quantification limit. The selectivity was also evaluated and other polycyclic aromatic hydrocarbons did not interfere in the analysis. Likewise, high performance liquid chromatography was used as a comparative methodology, and the regression analysis between the methods showed no statistical difference (t-test). The proposed methodology was applied to soils and sediments of a Brazilian river and the recoveries ranged from 74.3% to 105.8%. Overall, the proposed methodology was suitable for the targeted compounds, showing that the extraction method can be applied to spectrofluorometric analysis and that the multivariate calibration is also suitable for these compounds in leachates from real samples.

  3. Estuarial fingerprinting through multidimensional fluorescence and multivariate analysis.

    PubMed

    Hall, Gregory J; Clow, Kerin E; Kenny, Jonathan E

    2005-10-01

    As part of a strategy for preventing the introduction of aquatic nuisance species (ANS) to U.S. estuaries, ballast water exchange (BWE) regulations have been imposed. Enforcing these regulations requires a reliable method for determining the port of origin of water in the ballast tanks of ships entering U.S. waters. This study shows that a three-dimensional fluorescence fingerprinting technique, excitation emission matrix (EEM) spectroscopy, holds great promise as a ballast water analysis tool. In our technique, EEMs are analyzed by multivariate classification and curve resolution methods, such as N-way partial least squares regression-discriminant analysis (NPLS-DA) and parallel factor analysis (PARAFAC). We demonstrate that classification techniques can be used to discriminate among sampling sites less than 10 miles apart, encompassing Boston Harbor and two tributaries in the Mystic River Watershed. To our knowledge, this work is the first to use multivariate analysis to classify water as to location of origin. Furthermore, it is shown that curve resolution can show seasonal features within the multidimensional fluorescence data sets, which correlate with difficulty in classification.

  4. Music and Suicidality: A Quantitative Review and Extension

    ERIC Educational Resources Information Center

    Stack, Steven; Lester, David; Rosenberg, Jonathan S.

    2012-01-01

    This article provides the first quantitative review of the literature on music and suicidality. Multivariate logistic regression techniques are applied to 90 findings from 21 studies. Investigations employing ecological data on suicide completions are 19.2 times more apt than other studies to report a link between music and suicide. More recent…

  5. Sexual Orientation, Weight Concerns, and Eating-Disordered Behaviors in Adolescent Girls and Boys.

    ERIC Educational Resources Information Center

    Austin, S. Bryn; Ziyadeh, Najat; Kahn, Jessica A.; Camargo, Carlos A.; Colditz, Graham A.; Field, Alison E.

    2004-01-01

    Objective: To examine sexual orientation group differences in eating disorder symptoms in adolescent girls and boys. Method: Cross-sectional associations were examined using multivariate regression techniques using data gathered in 1999 from 10,583 adolescents in the Growing Up Today Study, a cohort of children of women participating in the…

  6. Time Poverty Thresholds and Rates for the US Population

    ERIC Educational Resources Information Center

    Kalenkoski, Charlene M.; Hamrick, Karen S.; Andrews, Margaret

    2011-01-01

    Time constraints, like money constraints, affect Americans' well-being. This paper defines what it means to be time poor based on the concepts of necessary and committed time and presents time poverty thresholds and rates for the US population and certain subgroups. Multivariate regression techniques are used to identify the key variables…

  7. Applications of modern statistical methods to analysis of data in physical science

    NASA Astrophysics Data System (ADS)

    Wicker, James Eric

    Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, several shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.

  8. Multivariate analysis of nystatin and metronidazole in a semi-solid matrix by means of diffuse reflectance NIR spectroscopy and PLS regression.

    PubMed

    Baratieri, Sabrina C; Barbosa, Juliana M; Freitas, Matheus P; Martins, José A

    2006-01-23

    A multivariate method of analysis of nystatin and metronidazole in a semi-solid matrix, based on diffuse reflectance NIR measurements and partial least squares regression, is reported. The product, a vaginal cream used in antifungal and antibacterial treatment, is usually analyzed quantitatively through microbiological tests (nystatin) and an HPLC technique (metronidazole), according to pharmacopeial procedures. However, near infrared spectroscopy has proved to be a valuable tool for content determination, given the rapidity and scope of the method. In the present study, it was successfully applied in the prediction of nystatin (even in low concentrations, ca. 0.3-0.4%, w/w, which is around 100,000 IU/5g) and metronidazole contents, as demonstrated by some figures of merit, namely linearity, precision (mean and repeatability) and accuracy.

  9. Fresh Biomass Estimation in Heterogeneous Grassland Using Hyperspectral Measurements and Multivariate Statistical Analysis

    NASA Astrophysics Data System (ADS)

    Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.

    2014-12-01

    Accurate estimation of grassland biomass at peak productivity can provide crucial information regarding the functioning and productivity of the rangelands. Hyperspectral remote sensing has proved to be valuable for estimation of vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to the large number of correlated hyperspectral reflectance measurements. The aim of this study was to examine the prospect of above ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above ground biomass for 170 sample plots. Multivariate calibrations including partial least squares regression (PLSR), principal component regression (PCR), and Least-Squares Support Vector Machine (LS-SVM) were used to estimate the above ground biomass. The prediction accuracy of the multivariate calibration methods was assessed using cross-validated R2 and RMSE. The best model performance was obtained using LS-SVM and then PLSR, both calibrated with the first derivative reflectance dataset, with R2cv = 0.88 & 0.86 and RMSEcv = 1.15 & 1.07, respectively. The weakest prediction accuracy was obtained when PCR was used (R2cv = 0.31 and RMSEcv = 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.
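
    Cross-validated R2 and RMSE of the kind reported above can be computed as in the following sketch. Several assumptions are made here that the abstract does not confirm: leave-one-out splitting, a single-predictor linear model standing in for PLSR/PCR/LS-SVM, and the convention R2cv = 1 - PRESS / SS_total. The data are hypothetical.

```python
from math import sqrt
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def loo_cv_metrics(x, y):
    """Leave-one-out cross-validated R2 and RMSE for the line model."""
    preds = []
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(intercept + slope * x[i])
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    y_bar = mean(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - press / ss_total, sqrt(press / len(y))

# Hypothetical spectral index (x) and biomass (y) with a little noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.9, 5.2, 6.8, 9.1, 11.0]
r2_cv, rmse_cv = loo_cv_metrics(x, y)
```

    Because every prediction is made for a sample the model has not seen, these figures are less optimistic than the calibration R2 and RMSE, which makes them a fairer basis for comparing methods.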

  10. Comprehensive ripeness-index for prediction of ripening level in mangoes by multivariate modelling of ripening behaviour

    NASA Astrophysics Data System (ADS)

    Eyarkai Nambi, Vijayaram; Thangavel, Kuladaisamy; Manickavasagan, Annamalai; Shahir, Sultan

    2017-01-01

    Prediction of ripeness level in climacteric fruits is essential for post-harvest handling. An index capable of predicting ripening level with minimum inputs would be highly beneficial to the handlers, processors and researchers in fruit industry. A study was conducted with Indian mango cultivars to develop a ripeness index and associated model. Changes in physicochemical, colour and textural properties were measured throughout the ripening period and the period was classified into five stages (unripe, early ripe, partially ripe, ripe and over ripe). Multivariate regression techniques like partial least square regression, principal component regression and multi linear regression were compared and evaluated for its prediction. Multi linear regression model with 12 parameters was found more suitable in ripening prediction. Scientific variable reduction method was adopted to simplify the developed model. Better prediction was achieved with either 2 or 3 variables (total soluble solids, colour and acidity). Cross validation was done to increase the robustness and it was found that proposed ripening index was more effective in prediction of ripening stages. Three-variable model would be suitable for commercial applications where reasonable accuracies are sufficient. However, 12-variable model can be used to obtain more precise results in research and development applications.

  11. Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

    PubMed

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-04-21

    In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.

  12. A multivariate regression model for detection of fumonisins content in maize from near infrared spectra.

    PubMed

    Giacomo, Della Riccia; Stefania, Del Zotto

    2013-12-15

    Fumonisins are mycotoxins produced by Fusarium species that commonly live in maize. Whereas fungi damage plants, fumonisins cause disease both to cattle breedings and human beings. Law limits set fumonisins tolerable daily intake with respect to several maize based feed and food. Chemical techniques assure the most reliable and accurate measurements, but they are expensive and time consuming. A method based on Near Infrared spectroscopy and multivariate statistical regression is described as a simpler, cheaper and faster alternative. We apply Partial Least Squares with full cross validation. Two models are described, having high correlation of calibration (0.995, 0.998) and of validation (0.908, 0.909), respectively. Description of observed phenomenon is accurate and overfitting is avoided. Screening of contaminated maize with respect to European legal limit of 4 mg kg(-1) should be assured. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. Validated univariate and multivariate spectrophotometric methods for the determination of pharmaceuticals mixture in complex wastewater

    NASA Astrophysics Data System (ADS)

    Riad, Safaa M.; Salem, Hesham; Elbalkiny, Heba T.; Khattab, Fatma I.

    2015-04-01

    Five accurate, precise, and sensitive univariate and multivariate spectrophotometric methods were developed for the simultaneous determination of a ternary mixture containing Trimethoprim (TMP), Sulphamethoxazole (SMZ) and Oxytetracycline (OTC) in wastewater samples collected from different sites, either production wastewater or livestock wastewater, after solid phase extraction using OASIS HLB cartridges. In the univariate methods, OTC was determined at its λmax 355.7 nm (0D), while TMP and SMZ were determined by three different univariate methods. Method (A) is based on the successive spectrophotometric resolution technique (SSRT). The technique starts with the ratio subtraction method followed by the ratio difference method for determination of TMP and SMZ. Method (B) is the successive derivative ratio technique (SDR). Method (C) is mean centering of the ratio spectra (MCR). The developed multivariate methods are principal component regression (PCR) and partial least squares (PLS). The specificity of the developed methods is investigated by analyzing laboratory prepared mixtures containing different ratios of the three drugs. The obtained results are statistically compared with those obtained by the official methods, showing no significant difference with respect to accuracy and precision at p = 0.05.

  14. Validated univariate and multivariate spectrophotometric methods for the determination of pharmaceuticals mixture in complex wastewater.

    PubMed

    Riad, Safaa M; Salem, Hesham; Elbalkiny, Heba T; Khattab, Fatma I

    2015-04-05

    Five accurate, precise, and sensitive univariate and multivariate spectrophotometric methods were developed for the simultaneous determination of a ternary mixture containing Trimethoprim (TMP), Sulphamethoxazole (SMZ) and Oxytetracycline (OTC) in wastewater samples collected from different sites, either production wastewater or livestock wastewater, after solid phase extraction using OASIS HLB cartridges. In the univariate methods, OTC was determined at its λmax 355.7 nm (0D), while TMP and SMZ were determined by three different univariate methods. Method (A) is based on the successive spectrophotometric resolution technique (SSRT). The technique starts with the ratio subtraction method followed by the ratio difference method for determination of TMP and SMZ. Method (B) is the successive derivative ratio technique (SDR). Method (C) is mean centering of the ratio spectra (MCR). The developed multivariate methods are principal component regression (PCR) and partial least squares (PLS). The specificity of the developed methods is investigated by analyzing laboratory prepared mixtures containing different ratios of the three drugs. The obtained results are statistically compared with those obtained by the official methods, showing no significant difference with respect to accuracy and precision at p = 0.05. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. Estuarine Sediment Deposition during Wetland Restoration: A GIS and Remote Sensing Modeling Approach

    NASA Technical Reports Server (NTRS)

    Newcomer, Michelle; Kuss, Amber; Kentron, Tyler; Remar, Alex; Choksi, Vivek; Skiles, J. W.

    2011-01-01

    Restoration of the industrial salt flats in the San Francisco Bay, California is an ongoing wetland rehabilitation project. Remote sensing maps of suspended sediment concentration, and other GIS predictor variables were used to model sediment deposition within these recently restored ponds. Suspended sediment concentrations were calibrated to reflectance values from Landsat TM 5 and ASTER using three statistical techniques -- linear regression, multivariate regression, and an Artificial Neural Network (ANN), to map suspended sediment concentrations. Multivariate and ANN regressions using ASTER proved to be the most accurate methods, yielding r2 values of 0.88 and 0.87, respectively. Predictor variables such as sediment grain size and tidal frequency were used in the Marsh Sedimentation (MARSED) model for predicting deposition rates for three years. MARSED results for a fully restored pond show a root mean square deviation (RMSD) of 66.8 mm (<1) between modeled and field observations. This model was further applied to a pond breached in November 2010 and indicated that the recently breached pond will reach equilibrium levels after 60 months of tidal inundation.

  16. Simple linear and multivariate regression models.

    PubMed

    Rodríguez del Águila, M M; Benítez-Parejo, N

    2011-01-01

    In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.

  17. Solving large mixed linear models using preconditioned conjugate gradient iteration.

    PubMed

    Strandén, I; Lidauer, M

    1999-12-01

    Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were reordered into three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third of that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20 and 435% more time to solve the univariate and multivariate animal models, respectively. Computations of the second best iteration on data took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.
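
    For readers unfamiliar with the solver named in this record, the sketch below shows a textbook preconditioned conjugate gradient iteration with a Jacobi (diagonal) preconditioner. It is only an illustration under simplifying assumptions: the coefficient matrix is small, dense, symmetric positive definite and held in memory, whereas the record describes iteration on data with a three-step matrix-vector product, which is not reproduced here. The function name `pcg` and the example system are hypothetical.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite matrix A using
    conjugate gradient with a Jacobi (diagonal) preconditioner."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    m_inv = [1.0 / A[i][i] for i in range(n)]    # inverse of diag(A)
    z = [m_inv[i] * r[i] for i in range(n)]      # preconditioned residual
    p = list(z)                                  # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:               # stop once the residual is small
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Hypothetical 2 x 2 system; the exact solution is [1/11, 7/11].
solution = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution)
```

    The only place the matrix enters the loop is the product `matvec(p)`, which is why reorganising that multiplication, as the record describes, can dominate the time per iteration.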

  18. Cider fermentation process monitoring by Vis-NIR sensor system and chemometrics.

    PubMed

    Villar, Alberto; Vadillo, Julen; Santos, Jose I; Gorritxategi, Eneko; Mabe, Jon; Arnaiz, Aitor; Fernández, Luis A

    2017-04-15

    Optimization of a multivariate calibration process has been undertaken for a Visible-Near Infrared (400-1100nm) sensor system, applied in the monitoring of the fermentation process of the cider produced in the Basque Country (Spain). The main parameters that were monitored included alcoholic proof, l-lactic acid content, glucose+fructose and acetic acid content. The multivariate calibration was carried out using a combination of different variable selection techniques and the most suitable pre-processing strategies were selected based on the spectra characteristics obtained by the sensor system. The variable selection techniques studied in this work include Martens Uncertainty test, interval Partial Least Square Regression (iPLS) and Genetic Algorithm (GA). This procedure arises from the need to improve the calibration models prediction ability for cider monitoring. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. A strategy for simultaneous determination of fatty acid composition, fatty acid position, and position-specific isotope contents in triacylglycerol matrices by 13C-NMR.

    PubMed

    Merchak, Noelle; Silvestre, Virginie; Loquet, Denis; Rizk, Toufic; Akoka, Serge; Bejjani, Joseph

    2017-01-01

    Triacylglycerols, which are quasi-universal components of food matrices, consist of complex mixtures of molecules. Their site-specific 13C content, their fatty acid profile, and their position on the glycerol moiety may significantly vary with the geographical, botanical, or animal origin of the sample. Such variables are valuable tracers for food authentication issues. The main objective of this work was to develop a new method based on a rapid and precise 13C-NMR spectroscopy (using a polarization transfer technique) coupled with multivariate linear regression analyses in order to quantify the whole set of individual fatty acids within triacylglycerols. In this respect, olive oil samples were analyzed by means of both adiabatic 13C-INEPT sequence and gas chromatography (GC). For each fatty acid within the studied matrix and for squalene as well, a multivariate prediction model was constructed using the deconvoluted peak areas of 13C-INEPT spectra as predictors, and the data obtained by GC as response variables. This 13C-NMR-based strategy, tested on olive oil, could serve as an alternative to the gas chromatographic quantification of individual fatty acids in other matrices, while providing additional compositional and isotopic information. Graphical abstract: A strategy based on the multivariate linear regression of variables obtained by a rapid 13C-NMR technique was developed for the quantification of individual fatty acids within triacylglycerol matrices. The conceived strategy was tested on olive oil.

  20. Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.

    PubMed

    Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei

    2013-12-03

    We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm(-1)); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm(-1))) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R(2) = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R(2) = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R(2) = 0.998) independent of laser excitation power variations. 
This work demonstrates that multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in challenging Raman endoscopic applications.

  1. Non-Gaussian spatiotemporal simulation of multisite daily precipitation: downscaling framework

    NASA Astrophysics Data System (ADS)

    Ben Alaya, M. A.; Ouarda, T. B. M. J.; Chebana, F.

    2018-01-01

    Probabilistic regression approaches for downscaling daily precipitation are very useful. They provide the whole conditional distribution at each forecast step to better represent the temporal variability. The question addressed in this paper is: how to simulate spatiotemporal characteristics of multisite daily precipitation from probabilistic regression models? Recent publications point out the complexity of multisite properties of daily precipitation and highlight the need for using a non-Gaussian flexible tool. This work proposes a reasonable compromise between simplicity and flexibility avoiding model misspecification. A suitable nonparametric bootstrapping (NB) technique is adopted. A downscaling model which merges a vector generalized linear model (VGLM as a probabilistic regression tool) and the proposed bootstrapping technique is introduced to simulate realistic multisite precipitation series. The model is applied to data sets from the southern part of the province of Quebec, Canada. It is shown that the model is capable of reproducing both at-site properties and the spatial structure of daily precipitations. Results indicate the superiority of the proposed NB technique, over a multivariate autoregressive Gaussian framework (i.e. Gaussian copula).

  2. Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew; Park, Trevor

    2017-01-01

    A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…

  3. Multivariate Regression Analysis and Slaughter Livestock,

    DTIC Science & Technology

    AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY

  4. MULTIVARIATE ANALYSIS OF DRINKING BEHAVIOUR IN A RURAL POPULATION

    PubMed Central

    Mathrubootham, N.; Bashyam, V.S.P.; Shahjahan

    1997-01-01

    This study was carried out to determine the drinking pattern in a rural population, using multivariate techniques. 386 current users identified in a community were assessed with regard to their drinking behaviours using a structured interview. For purposes of the study the questions were condensed into 46 meaningful variables. In bivariate analysis, 14 variables including dependent variables such as dependence, MAST & CAGE (measuring alcoholic status), Q.F. Index and troubled drinking were found to be significant. Using these variables, further multivariate analyses such as ANOVA, correlation, regression analysis and factor analysis were carried out using both SPSS PC+ and an HCL Magnum mainframe computer with the FOCUS package and UNIX systems. Results revealed that a number of factors such as drinking style, duration of drinking, pattern of abuse, Q.F. Index and various problems influenced drinking, and some of them set up a vicious circle. Factor analysis revealed mainly 3 factors: abuse, dependence and social drinking. Dependence could be divided into low/moderate dependence. The implications and practical applications of these tests are also discussed. PMID:21584077

  5. Comparison of Logistic Regression and Random Forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy)

    NASA Astrophysics Data System (ADS)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-11-01

    The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly types such as earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high resolution aerial colour orthophoto; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview on existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curve, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. 
Land use and wildfire variables were found to have a strong control on the occurrence of very rapid shallow landslides.
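
    Step vii) above evaluates the models with the ROC curve, AUC and contingency tables. The minimal sketch below shows how those two summaries can be computed from predicted susceptibility scores. It assumes binary labels (1 = landslide source cell, 0 = stable cell); the function names, the 0.5 threshold and the four example cells are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive scores higher than a randomly chosen
    negative (ties count one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def contingency_table(labels, scores, threshold=0.5):
    """Counts (tp, fp, tn, fn) after thresholding the scores."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Hypothetical observed labels and model scores for four mapping units.
observed = [1, 1, 0, 0]
susceptibility = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(observed, susceptibility))            # 0.75
print(contingency_table(observed, susceptibility))  # (1, 1, 1, 1)
```

    The AUC does not depend on any threshold, whereas the contingency table does, so the two are usually reported together.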

  6. Remote sensing estimation of the total phosphorus concentration in a large lake using band combinations and regional multivariate statistical modeling techniques.

    PubMed

    Gao, Yongnian; Gao, Junfeng; Yin, Hongbin; Liu, Chuansheng; Xia, Ting; Wang, Jing; Huang, Qi

    2015-03-15

    Remote sensing has been widely used for water quality monitoring, but most of these monitoring studies have only focused on a few water quality variables, such as chlorophyll-a, turbidity, and total suspended solids, which have typically been considered optically active variables. Remote sensing presents a challenge in estimating the phosphorus concentration in water. The total phosphorus (TP) in lakes has been estimated from remotely sensed observations, primarily using the simple individual band ratio or their natural logarithm and the statistical regression method based on the field TP data and the spectral reflectance. In this study, we investigated the possibility of establishing a spatial modeling scheme to estimate the TP concentration of a large lake from multi-spectral satellite imagery using band combinations and regional multivariate statistical modeling techniques, and we tested the applicability of the spatial modeling scheme. The results showed that HJ-1A CCD multi-spectral satellite imagery can be used to estimate the TP concentration in a lake. The correlation and regression analysis showed a highly significant positive relationship between the TP concentration and certain remotely sensed combination variables. The proposed modeling scheme had a higher accuracy for the TP concentration estimation in the large lake compared with the traditional individual band ratio method and the whole-lake scale regression-modeling scheme. The TP concentration values showed a clear spatial variability and were high in western Lake Chaohu and relatively low in eastern Lake Chaohu. The northernmost portion, the northeastern coastal zone and the southeastern portion of western Lake Chaohu had the highest TP concentrations, and the other regions had the lowest TP concentration values, except for the coastal zone of eastern Lake Chaohu. 
These results strongly suggested that the proposed modeling scheme, i.e., the band combinations and the regional multivariate statistical modeling techniques, demonstrated advantages for estimating the TP concentration in a large lake and had a strong potential for universal application for the TP concentration estimation in large lake waters worldwide. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Evaluation of the Risk Factors for a Rotator Cuff Retear After Repair Surgery.

    PubMed

    Lee, Yeong Seok; Jeong, Jeung Yeol; Park, Chan-Deok; Kang, Seung Gyoon; Yoo, Jae Chul

    2017-07-01

    A retear is a significant clinical problem after rotator cuff repair. However, no study has evaluated the retear rate with regard to the extent of footprint coverage. To evaluate the preoperative and intraoperative factors for a retear after rotator cuff repair, and to confirm the relationship with the extent of footprint coverage. Cohort study; Level of evidence, 3. Data were retrospectively collected from 693 patients who underwent arthroscopic rotator cuff repair between January 2006 and December 2014. All repairs were classified into 4 types of completeness of repair according to the amount of footprint coverage at the end of surgery. All patients underwent magnetic resonance imaging (MRI) after a mean postoperative duration of 5.4 months. Preoperative demographic data, functional scores, range of motion, and global fatty degeneration on preoperative MRI and intraoperative variables including the tear size, completeness of rotator cuff repair, concomitant subscapularis repair, number of suture anchors used, repair technique (single-row or transosseous-equivalent double-row repair), and surgical duration were evaluated. Furthermore, the factors associated with failure using the single-row technique and transosseous-equivalent double-row technique were analyzed separately. The retear rate was 7.22%. Univariate analysis revealed that rotator cuff retears were affected by age; the presence of inflammatory arthritis; the completeness of rotator cuff repair; the initial tear size; the number of suture anchors; mean operative time; functional visual analog scale scores; Simple Shoulder Test findings; American Shoulder and Elbow Surgeons scores; and fatty degeneration of the supraspinatus, infraspinatus, and subscapularis. Multivariate logistic regression analysis revealed patient age, initial tear size, and fatty degeneration of the supraspinatus as independent risk factors for a rotator cuff retear. 
Multivariate logistic regression analysis of the single-row group revealed patient age and fatty degeneration of the supraspinatus as independent risk factors for a rotator cuff retear. Multivariate logistic regression analysis of the transosseous-equivalent double-row group revealed a frozen shoulder as an independent risk factor for a rotator cuff retear. Our results suggest that patient age, initial tear size, and fatty degeneration of the supraspinatus are independent risk factors for a rotator cuff retear, whereas the completeness of rotator cuff repair based on the extent of footprint coverage and repair technique are not.

  8. Real estate value prediction using multivariate regression models

    NASA Astrophysics Data System (ADS)

    Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav

    2017-11-01

    The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly with many factors; it is therefore a prime field in which to apply machine learning concepts to optimize and predict prices with high accuracy. In this paper, we present various important features to use when predicting housing prices with good accuracy. We describe regression models that use various features to achieve a lower residual sum of squares error. Using features in a regression model requires some feature engineering for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
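
    The role of ridge regression described above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes the standard ridge objective (residual sum of squares plus `lam` times the sum of squared coefficients), solves the penalised normal equations directly, and leaves the intercept unpenalised by centring the data. The function names and example values are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Return (intercept, coefs) minimising RSS + lam * sum(coef ** 2).
    Features and target are centred so the intercept is not penalised."""
    n, p = len(X), len(X[0])
    mean_x = [sum(row[j] for row in X) / n for j in range(p)]
    mean_y = sum(y) / n
    Xc = [[row[j] - mean_x[j] for j in range(p)] for row in X]
    yc = [v - mean_y for v in y]
    # Penalised normal equations: (Xc^T Xc + lam * I) coefs = Xc^T yc
    gram = [[sum(Xc[i][j] * Xc[i][k] for i in range(n))
             + (lam if j == k else 0.0) for k in range(p)] for j in range(p)]
    rhs = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    coefs = solve(gram, rhs)
    intercept = mean_y - sum(c * m for c, m in zip(coefs, mean_x))
    return intercept, coefs


# Hypothetical polynomial features [x, x**2] for x = 0, 1, 2, 3.
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 4.0], [3.0, 9.0]]
y = [1.0, 6.0, 17.0, 34.0]            # generated from 1 + 2x + 3x**2
print(ridge_fit(X, y, 0.0))           # plain least squares
print(ridge_fit(X, y, 5.0))           # coefficients shrunk by the penalty
```

    With `lam = 0` the fit reduces to ordinary least squares; increasing `lam` shrinks the coefficients toward zero, trading a slightly higher training error for lower variance.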

  9. Accurate predictions of iron redox state in silicate glasses: A multivariate approach using X-ray absorption spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dyar, M. Darby; McCanta, Molly; Breves, Elly

    2016-03-01

    Pre-edge features in the K absorption edge of X-ray absorption spectra are commonly used to predict Fe3+ valence state in silicate glasses. However, this study shows that using the entire spectral region from the pre-edge into the extended X-ray absorption fine-structure region provides more accurate results when combined with multivariate analysis techniques. The least absolute shrinkage and selection operator (lasso) regression technique yields %Fe3+ values that are accurate to ±3.6% absolute when the full spectral region is employed. This method can be used across a broad range of glass compositions, is easily automated, and is demonstrated to yield accurate results from different synchrotrons. It will enable future studies involving X-ray mapping of redox gradients on standard thin sections at 1 × 1 μm pixel sizes.
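
    The lasso named in this record combines shrinkage with variable selection, which is what makes it usable on a full spectrum with many correlated channels. The sketch below is a generic cyclic coordinate descent solver, not the authors' code. It assumes the objective (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1, centred non-constant columns and a centred response (so no intercept); the function names and the tiny data set are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for
    (1 / (2n)) * ||y - X w||^2 + lam * ||w||_1.
    Assumes centred, non-constant columns of X and a centred y."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            rho /= n
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w


# Hypothetical centred design with two orthogonal channels.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.1, 1.9, -1.9, -2.1]           # equals 2.0 * channel 1 + 0.1 * channel 2
print(lasso_cd(X, y, lam=0.5))       # weak second channel is set to zero
```

    In this toy case the strong channel is kept with a shrunken coefficient while the weak one is removed entirely, which is the selection behaviour that distinguishes the lasso from ridge regression.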

  10. Accurate predictions of iron redox state in silicate glasses: A multivariate approach using X-ray absorption spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dyar, M. Darby; McCanta, Molly; Breves, Elly

    2016-03-01

    Pre-edge features in the K absorption edge of X-ray absorption spectra are commonly used to predict Fe3+ valence state in silicate glasses. However, this study shows that using the entire spectral region from the pre-edge into the extended X-ray absorption fine-structure region provides more accurate results when combined with multivariate analysis techniques. The least absolute shrinkage and selection operator (lasso) regression technique yields %Fe3+ values that are accurate to ±3.6% absolute when the full spectral region is employed. This method can be used across a broad range of glass compositions, is easily automated, and is demonstrated to yield accurate results from different synchrotrons. It will enable future studies involving X-ray mapping of redox gradients on standard thin sections at 1 × 1 μm pixel sizes.

  11. An assessment on the use of bivariate, multivariate and soft computing techniques for collapse susceptibility in GIS environ

    NASA Astrophysics Data System (ADS)

    Yilmaz, Işik; Marschalko, Marian; Bednarik, Martin

    2013-04-01

    The paper presented herein compares and discusses the use of bivariate, multivariate and soft computing techniques for collapse susceptibility modelling. Conditional probability (CP), logistic regression (LR) and artificial neural networks (ANN) models representing the bivariate, multivariate and soft computing techniques were used in GIS based collapse susceptibility mapping in an area from Sivas basin (Turkey). Collapse-related factors, directly or indirectly related to the causes of collapse occurrence, such as distance from faults, slope angle and aspect, topographical elevation, distance from drainage, topographic wetness index (TWI), stream power index (SPI), Normalized Difference Vegetation Index (NDVI) by means of vegetation cover, distance from roads and settlements were used in the collapse susceptibility analyses. In the last stage of the analyses, collapse susceptibility maps were produced from the models, and they were then compared by means of their validations. Although Area Under Curve (AUC) values obtained from all three models showed that the map obtained from the soft computing (ANN) model appears more accurate than those from the other models, the accuracies of all three models can be regarded as relatively similar. The results also showed that conditional probability is an essential method in the preparation of collapse susceptibility maps and is highly compatible with GIS operating features.
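
    The AUC comparison described in this abstract rests on a simple quantity: the probability that a randomly chosen collapse cell receives a higher susceptibility score than a randomly chosen non-collapse cell. A minimal sketch under that standard definition follows; the function name and the toy labels and scores are hypothetical and are not data from the paper.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5.

    labels: 1 for a cell with an observed collapse, 0 otherwise.
    scores: susceptibility value assigned to each cell by a model.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative cell")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical validation cells and scores from one susceptibility model.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(auc_from_scores(toy_labels, toy_scores))
```

    Running the same function on the score maps of several models gives directly comparable values, where 0.5 corresponds to random ranking and 1.0 to perfect separation.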

  12. Remote-sensing data processing with the multivariate regression analysis method for iron mineral resource potential mapping: a case study in the Sarvian area, central Iran

    NASA Astrophysics Data System (ADS)

    Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran

    2018-03-01

    This paper uses multivariate regression to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran, as a method of mineral prospectivity mapping (MPM). The main target of this paper is to apply multivariate regression analysis (as an MPM method) to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the results of the probability value (p value), coefficient of determination value (R²) and adjusted determination coefficient (R²adj), the second regression model (which consisted of multiple UIVs) fitted better than other models. The accuracy of the model was confirmed by iron outcrops map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
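
    For reference, the two goodness-of-fit measures named in this abstract have the following standard textbook definitions. The symbols are assumptions introduced here and do not come from the paper: n is the number of observations, p the number of predictors (for example, image bands), y_i the observed values, \hat{y}_i the fitted values and \bar{y} the mean of the observations.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right) \frac{n - 1}{n - p - 1}
```

    Because the adjusted coefficient penalises each additional predictor, it is the more appropriate of the two for comparing a single-predictor model with one built from multiple predictors.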

  13. Correlative and multivariate analysis of increased radon concentration in underground laboratory.

    PubMed

    Maletić, Dimitrije M; Udovičić, Vladimir I; Banjanac, Radomir M; Joković, Dejan R; Dragić, Aleksandar L; Veselinović, Nikola B; Filipović, Jelena

    2014-11-01

    The results of analysis using correlative and multivariate methods, as developed for data analysis in high-energy physics and implemented in the Toolkit for Multivariate Analysis software package, of the relations of the variation of increased radon concentration with climate variables in a shallow underground laboratory are presented. Multivariate regression analysis identified a number of multivariate methods which can give a good evaluation of increased radon concentrations based on climate variables. The use of the multivariate regression methods will enable the investigation of the relations of specific climate variables with increased radon concentrations by analysis of regression methods resulting in 'mapped' underlying functional behaviour of radon concentrations depending on a wide spectrum of climate variables. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  14. Application of Fluorescence Spectrometry With Multivariate Calibration to the Enantiomeric Recognition of Fluoxetine in Pharmaceutical Preparations.

    PubMed

    Poláček, Roman; Májek, Pavel; Hroboňová, Katarína; Sádecká, Jana

    2016-04-01

    Fluoxetine is the most prescribed antidepressant chiral drug worldwide. Its enantiomers have a different duration of serotonin inhibition. A novel simple and rapid method for determination of the enantiomeric composition of fluoxetine in pharmaceutical pills is presented. Specifically, emission, excitation, and synchronous fluorescence techniques were employed to obtain the spectral data, which with multivariate calibration methods, namely, principal component regression (PCR) and partial least square (PLS), were investigated. The chiral recognition of fluoxetine enantiomers in the presence of β-cyclodextrin was based on diastereomeric complexes. The results of the multivariate calibration modeling indicated good prediction abilities. The obtained results for tablets were compared with those from chiral HPLC and no significant differences are shown by Fisher's (F) test and Student's t-test. The smallest residuals between reference or nominal values and predicted values were achieved by multivariate calibration of synchronous fluorescence spectral data. This conclusion is supported by calculated values of the figure of merit.

  15. Determination of enantiomeric composition of ibuprofen in pharmaceutical formulations by partial least-squares regression of strongly overlapped chromatographic profiles.

    PubMed

    Grisales, Jaiver Osorio; Arancibia, Juan A; Castells, Cecilia B; Olivieri, Alejandro C

    2012-12-01

    In this report, we demonstrate how chiral liquid chromatography combined with multivariate chemometric techniques, specifically unfolded-partial least-squares regression (U-PLS), provides a powerful analytical methodology. Using U-PLS, strongly overlapped enantiomer profiles in a sample could be successfully processed and enantiomeric purity could be accurately determined without requiring baseline enantioresolution between peaks. The samples were partially enantioseparated with a permethyl-β-cyclodextrin chiral column under reversed-phase conditions. Signals detected with a diode-array detector within a wavelength range from 198 to 241 nm were recorded, and the data were processed by a second-order multivariate algorithm to decrease detection limits. The R-(-)-enantiomer of ibuprofen in tablet formulation samples could be determined at the level of 0.5 mg L⁻¹ in the presence of 99.9% of the S-(+)-enantiomorph with relative prediction error within ±3%. Copyright © 2012 Elsevier B.V. All rights reserved.

  16. Modelling daily dissolved oxygen concentration using least square support vector machine, multivariate adaptive regression splines and M5 model tree

    NASA Astrophysics Data System (ADS)

    Heddam, Salim; Kisi, Ozgur

    2018-04-01

    In the present study, three types of artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T) are applied for modeling daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality variables data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The water quality data selected consisted of daily measured of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI cfs), are used as inputs to the LSSVM, MARS and M5T models. The three models were applied for each station separately and compared to each other. According to the results obtained, it was found that: (i) the DO concentration could be successfully estimated using the three models and (ii) the best model among all others differs from one station to another.

  17. Application of multivariate chemometric techniques for simultaneous determination of five parameters of cottonseed oil by single bounce attenuated total reflectance Fourier transform infrared spectroscopy.

    PubMed

    Talpur, M Younis; Kara, Huseyin; Sherazi, S T H; Ayyildiz, H Filiz; Topkafa, Mustafa; Arslan, Fatma Nur; Naz, Saba; Durmaz, Fatih; Sirajuddin

    2014-11-01

    Single bounce attenuated total reflectance (SB-ATR) Fourier transform infrared (FTIR) spectroscopy in conjunction with chemometrics was used for accurate determination of free fatty acid (FFA), peroxide value (PV), iodine value (IV), conjugated diene (CD) and conjugated triene (CT) of cottonseed oil (CSO) during potato chips frying. Partial least square (PLS), stepwise multiple linear regression (SMLR), principal component regression (PCR) and simple Beer's law (SBL) were applied to develop the calibrations for simultaneous evaluation of five stated parameters of cottonseed oil (CSO) during frying of French frozen potato chips at 170°C. Good regression coefficients (R²) were achieved for FFA, PV, IV, CD and CT with value of >0.992 by PLS, SMLR, PCR, and SBL. Root mean square error of prediction (RMSEP) was found to be less than 1.95% for all determinations. The results of the study indicated that SB-ATR FTIR in combination with multivariate chemometrics could be used for accurate and simultaneous determination of different parameters during the frying process without using any toxic organic solvent. Copyright © 2014 Elsevier B.V. All rights reserved.

  18. Mammalian cell culture monitoring using in situ spectroscopy: Is your method really optimised?

    PubMed

    André, Silvère; Lagresle, Sylvain; Hannas, Zahia; Calvosa, Éric; Duponchel, Ludovic

    2017-03-01

    In recent years, as a result of the process analytical technology initiative of the US Food and Drug Administration, many different works have been carried out on direct and in situ monitoring of critical parameters for mammalian cell cultures by Raman spectroscopy and multivariate regression techniques. However, despite interesting results, it cannot be said that the proposed monitoring strategies, which will reduce errors of the regression models and thus confidence limits of the predictions, are really optimized. Hence, the aim of this article is to optimize some critical steps of spectroscopic acquisition and data treatment in order to reach a higher level of accuracy and robustness of bioprocess monitoring. In this way, we propose first an original strategy to assess the most suited Raman acquisition time for the processes involved. In a second part, we demonstrate the importance of the interbatch variability on the accuracy of the predictive models with a particular focus on the optical probes adjustment. Finally, we propose a methodology for the optimization of the spectral variables selection in order to decrease prediction errors of multivariate regressions. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 33:308-316, 2017. © 2017 American Institute of Chemical Engineers.

  19. A systematic review of the relationship factor between women and health professionals within the multivariant analysis of maternal satisfaction.

    PubMed

    Macpherson, Ignacio; Roqué-Sánchez, María V; Legget Bn, Finola O; Fuertes, Ferran; Segarra, Ignacio

    2016-10-01

    Personalised support provided to women by health professionals is one of the prime factors in attaining women's satisfaction during pregnancy and childbirth. However, the multifactorial nature of 'satisfaction' makes it difficult to assess. Statistical multivariate analysis may be an effective technique to obtain in-depth quantitative evidence of the importance of this factor and its interaction with the other factors involved. This technique allows us to estimate the importance of overall satisfaction in its context and suggest actions for healthcare services. This is a systematic review of studies that quantitatively measure the personal relationship between women and healthcare professionals (gynecologists, obstetricians, nurses, midwives, etc.) regarding maternity care satisfaction. The literature search focused on studies carried out between 1970 and 2014 that used multivariate analyses and included the woman-caregiver relationship as a factor of their analysis. Twenty-four studies which applied various multivariate analysis tools to different periods of maternity care (antenatal, perinatal, post partum) were selected. The studies included discrete scale scores and questionnaires from women with low-risk pregnancies. The "personal relationship" factor appeared under various names: care received, personalised treatment, professional support, amongst others. The most common multivariate techniques used to assess the percentage of variance explained and the odds ratio of each factor were principal component analysis and logistic regression. The data, variables and factor analysis suggest that continuous, personalised care provided by the usual midwife and delivered within a family or a specialised setting generates the highest level of satisfaction. In addition, these factors foster the woman's psychological and physiological recovery, often surpassing clinical action (e.g. medicalization and hospital organization) and/or physiological determinants (e.g. pain, pathologies, etc.). Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Catalog of Air Force Weather Technical Documents, 1941-2006

    DTIC Science & Technology

    2006-05-19

    radiosondes in current use in USA. Elementary discussion of statistical terms and concepts used for expressing accuracy or error is discussed. AWS TR 105...Techniques, Appendix B: Vorticity—An Elementary Discussion of the Concept, August 1956, 27pp. Formerly AWSM 105– 50/1A. Provides the necessary back...steps involved in ordinary multiple linear regression. Conditional probability is calculated using transnormalized variables in the multivariate normal

  1. Adjustment of geochemical background by robust multivariate statistics

    USGS Publications Warehouse

    Zhou, D.

    1985-01-01

    Conventional analyses of exploration geochemical data assume that the background is a constant or slowly changing value, equivalent to a plane or a smoothly curved surface. However, it is better to regard the geochemical background as a rugged surface, varying with changes in geology and environment. This rugged surface can be estimated from observed geological, geochemical and environmental properties by using multivariate statistics. A method of background adjustment was developed and applied to groundwater and stream sediment reconnaissance data collected from the Hot Springs Quadrangle, South Dakota, as part of the National Uranium Resource Evaluation (NURE) program. Source-rock lithology appears to be a dominant factor controlling the chemical composition of groundwater or stream sediments. The most efficacious adjustment procedure is to regress uranium concentration on selected geochemical and environmental variables for each lithologic unit, and then to delineate anomalies by a common threshold set as a multiple of the standard deviation of the combined residuals. Robust versions of regression and RQ-mode principal components analysis techniques were used rather than ordinary techniques to guard against distortion caused by outliers. Anomalies delineated by this background adjustment procedure correspond with uranium prospects much better than do anomalies delineated by conventional procedures. The procedure should be applicable to geochemical exploration at different scales for other metals. © 1985.
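
    The adjustment procedure described in this abstract (regress per lithologic unit, pool the residuals, apply one common threshold) can be sketched as below. This is a simplified illustration and not the author's method: it uses ordinary least squares with a single predictor where the study used robust multivariate regression, and the function names, the multiplier `k` and the sample layout are assumptions.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def flag_anomalies(samples, k=2.0):
    """Return indexes of samples whose residual exceeds k pooled standard deviations.

    samples: list of (lithologic_unit, predictor_value, uranium_value) tuples.
    """
    by_unit = {}
    for idx, (unit, x, u) in enumerate(samples):
        by_unit.setdefault(unit, []).append((idx, x, u))
    residuals = {}
    for rows in by_unit.values():
        # A separate background model is fitted for each lithologic unit.
        slope, intercept = fit_line([r[1] for r in rows], [r[2] for r in rows])
        for idx, x, u in rows:
            residuals[idx] = u - (intercept + slope * x)
    # One common threshold from the combined residuals of all units.
    threshold = k * pstdev(list(residuals.values()))
    return sorted(i for i, r in residuals.items() if r > threshold)


# Hypothetical samples: unit "A" follows its trend, unit "B" has one high value.
toy_samples = [("A", float(x), float(x)) for x in range(1, 6)]
toy_samples += [("B", 1.0, 10.0), ("B", 2.0, 10.0), ("B", 3.0, 10.0),
                ("B", 4.0, 10.0), ("B", 5.0, 20.0)]
print(flag_anomalies(toy_samples, k=1.5))
```

    In the toy data the single high value in unit "B" also drags the fitted line toward itself, which shrinks its own residual. That sensitivity of ordinary least squares is a plausible reason for preferring the robust estimators mentioned in the abstract.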

  2. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression

    USDA-ARS's Scientific Manuscript database

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...

  3. Partial Least Squares Regression Can Aid in Detecting Differential Abundance of Multiple Features in Sets of Metagenomic Samples

    PubMed Central

    Libiger, Ondrej; Schork, Nicholas J.

    2015-01-01

    It is now feasible to examine the composition and diversity of microbial communities (i.e., “microbiomes”) that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology “Metastats” across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions obtained on a small to moderate number of samples. PMID:26734061

  4. Multivariate Time Series Forecasting of Crude Palm Oil Price Using Machine Learning Techniques

    NASA Astrophysics Data System (ADS)

    Kanchymalay, Kasturi; Salim, N.; Sukprasert, Anupong; Krishnan, Ramesh; Raba'ah Hashim, Ummi

    2017-08-01

    The aim of this paper was to study the correlation between crude palm oil (CPO) price, selected vegetable oil prices (such as soybean oil, coconut oil, olive oil, rapeseed oil and sunflower oil), crude oil and the monthly exchange rate. Comparative analysis was then performed on CPO price forecasting results using machine learning techniques. Monthly CPO prices, selected vegetable oil prices, crude oil prices and monthly exchange rate data from January 1987 to February 2017 were utilized. Preliminary analysis showed a positive and high correlation between the CPO price and soybean oil price and also between CPO price and crude oil price. Experiments were conducted using multi-layer perceptron, support vector regression and Holt-Winters exponential smoothing techniques. The results were assessed by using criteria of root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and direction of accuracy (DA). Among these three techniques, support vector regression (SVR) with the sequential minimal optimization (SMO) algorithm showed relatively better results compared to the multi-layer perceptron and Holt-Winters exponential smoothing methods.
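
    The four assessment criteria listed in this abstract can be sketched as plain functions. The RMSE, MAE and MAPE formulas are the standard ones. The directional measure is only one common definition (whether the forecast moves in the same direction as the series relative to the previous actual value) and is an assumption, since the abstract does not define it. The function names and toy prices are hypothetical.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean square error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def directional_accuracy(actual, predicted):
    """Share of steps where forecast and series move the same way from actual[t-1]."""
    hits = 0
    for t in range(1, len(actual)):
        actual_move = actual[t] - actual[t - 1]
        predicted_move = predicted[t] - actual[t - 1]
        if actual_move * predicted_move > 0:
            hits += 1
    return hits / (len(actual) - 1)


# Hypothetical monthly prices and forecasts.
toy_actual = [100.0, 110.0, 105.0, 120.0]
toy_forecast = [100.0, 108.0, 112.0, 118.0]
print(rmse(toy_actual, toy_forecast), mae(toy_actual, toy_forecast))
print(mape(toy_actual, toy_forecast), directional_accuracy(toy_actual, toy_forecast))
```

    RMSE, MAE and MAPE measure the size of the errors, while the directional measure only asks whether the forecast gets the sign of the price move right, so a model can rank differently on the two kinds of criteria.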

  5. Comparison of Xenon-Enhanced Area-Detector CT and Krypton Ventilation SPECT/CT for Assessment of Pulmonary Functional Loss and Disease Severity in Smokers.

    PubMed

    Ohno, Yoshiharu; Fujisawa, Yasuko; Takenaka, Daisuke; Kaminaga, Shigeo; Seki, Shinichiro; Sugihara, Naoki; Yoshikawa, Takeshi

    2018-02-01

    The objective of this study was to compare the capability of xenon-enhanced area-detector CT (ADCT) performed with a subtraction technique and coregistered 81mKr-ventilation SPECT/CT for the assessment of pulmonary functional loss and disease severity in smokers. Forty-six consecutive smokers (32 men and 14 women; mean age, 67.0 years) underwent prospective unenhanced and xenon-enhanced ADCT, 81mKr-ventilation SPECT/CT, and pulmonary function tests. Disease severity was evaluated according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) classification. CT-based functional lung volume (FLV), the percentage of wall area to total airway area (WA%), and ventilated FLV on xenon-enhanced ADCT and SPECT/CT were calculated for each smoker. All indexes were correlated with percentage of forced expiratory volume in 1 second (%FEV1) using step-wise regression analyses, and univariate and multivariate logistic regression analyses were performed. In addition, the diagnostic accuracy of the proposed model was compared with that of each radiologic index by means of McNemar analysis. Multivariate logistic regression showed that %FEV1 was significantly affected (r = 0.77, r² = 0.59) by two factors: the first factor, ventilated FLV on xenon-enhanced ADCT (p < 0.0001); and the second factor, WA% (p = 0.004). Univariate logistic regression analyses indicated that all indexes significantly affected GOLD classification (p < 0.05). Multivariate logistic regression analyses revealed that ventilated FLV on xenon-enhanced ADCT and CT-based FLV significantly influenced GOLD classification (p < 0.0001). The diagnostic accuracy of the proposed model was significantly higher than that of ventilated FLV on SPECT/CT (p = 0.03) and WA% (p = 0.008). Xenon-enhanced ADCT is more effective than 81mKr-ventilation SPECT/CT for the assessment of pulmonary functional loss and disease severity.

  6. Controlled pattern imputation for sensitivity analysis of longitudinal binary and ordinal outcomes with nonignorable dropout.

    PubMed

    Tang, Yongqiang

    2018-04-30

    The controlled imputation method refers to a class of pattern mixture models that have been commonly used as sensitivity analyses of longitudinal clinical trials with nonignorable dropout in recent years. These pattern mixture models assume that participants in the experimental arm after dropout have similar response profiles to the control participants or have worse outcomes than otherwise similar participants who remain on the experimental treatment. In spite of its popularity, the controlled imputation has not been formally developed for longitudinal binary and ordinal outcomes partially due to the lack of a natural multivariate distribution for such endpoints. In this paper, we propose 2 approaches for implementing the controlled imputation for binary and ordinal data based respectively on the sequential logistic regression and the multivariate probit model. Efficient Markov chain Monte Carlo algorithms are developed for missing data imputation by using the monotone data augmentation technique for the sequential logistic regression and a parameter-expanded monotone data augmentation scheme for the multivariate probit model. We assess the performance of the proposed procedures by simulation and the analysis of a schizophrenia clinical trial and compare them with the fully conditional specification, last observation carried forward, and baseline observation carried forward imputation methods. Copyright © 2018 John Wiley & Sons, Ltd.

  7. Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery.

    PubMed

    Liu, Han; Wang, Lie; Zhao, Tuo

    2015-08-01

    We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence O(1/ϵ), where ϵ is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.

  8. Transforming RNA-Seq data to improve the performance of prognostic gene signatures.

    PubMed

    Zwiener, Isabella; Frisch, Barbara; Binder, Harald

    2014-01-01

    Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance-dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.

  9. Transforming RNA-Seq Data to Improve the Performance of Prognostic Gene Signatures

    PubMed Central

    Zwiener, Isabella; Frisch, Barbara; Binder, Harald

    2014-01-01

    Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance-dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques. PMID:24416353
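
    Three of the covariate transformations named in this abstract can be sketched for a single gene's counts as follows. The details are assumptions, since the abstract does not give them: the log transform uses a pseudocount of 1, the rank transform assigns average ranks to ties, and the function names and toy counts are hypothetical.

```python
from math import log
from statistics import mean, pstdev


def standardize(values):
    """Centre to mean 0 and scale to unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values] if s > 0 else [0.0 for _ in values]


def log_transform(values, pseudocount=1.0):
    """Natural log after adding a pseudocount so that zero counts stay finite."""
    return [log(v + pseudocount) for v in values]


def rank_transform(values):
    """Replace each value by its rank (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


# Hypothetical read counts for one gene across four patients.
toy_counts = [0, 5, 5, 1000]
print(standardize(toy_counts))
print(log_transform(toy_counts))
print(rank_transform(toy_counts))
```

    On the toy counts the extreme value 1000 still dominates after standardization, is compressed by the log, and becomes just the largest rank under the rank transform, which illustrates why rank-based transformations are insensitive to skewness and extreme values.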

  10. Using Dual Regression to Investigate Network Shape and Amplitude in Functional Connectivity Analyses

    PubMed Central

    Nickerson, Lisa D.; Smith, Stephen M.; Öngür, Döst; Beckmann, Christian F.

    2017-01-01

    Independent Component Analysis (ICA) is one of the most popular techniques for the analysis of resting state FMRI data because it has several advantageous properties when compared with other techniques. Most notably, in contrast to a conventional seed-based correlation analysis, it is model-free and multivariate, thus switching the focus from evaluating the functional connectivity of single brain regions identified a priori to evaluating brain connectivity in terms of all brain resting state networks (RSNs) that simultaneously engage in oscillatory activity. Furthermore, typical seed-based analysis characterizes RSNs in terms of spatially distributed patterns of correlation (typically by means of simple Pearson's coefficients) and thereby confounds together amplitude information of oscillatory activity and noise. ICA and other regression techniques, on the other hand, retain magnitude information and therefore can be sensitive to both changes in the spatially distributed nature of correlations (differences in the spatial pattern or “shape”) as well as the amplitude of the network activity. Furthermore, motion can mimic amplitude effects so it is crucial to use a technique that retains such information to ensure that connectivity differences are accurately localized. In this work, we investigate the dual regression approach that is frequently applied with group ICA to assess group differences in resting state functional connectivity of brain networks. We show how ignoring amplitude effects and how excessive motion corrupts connectivity maps and results in spurious connectivity differences. We also show how to implement the dual regression to retain amplitude information and how to use dual regression outputs to identify potential motion effects. Two key findings are that using a technique that retains magnitude information, e.g., dual regression, and using strict motion criteria are crucial for controlling both network amplitude and motion-related amplitude effects, respectively, in resting state connectivity analyses. We illustrate these concepts using realistic simulated resting state FMRI data and in vivo data acquired in healthy subjects and patients with bipolar disorder and schizophrenia. PMID:28348512

  11. Longitudinal assessment of treatment effects on pulmonary ventilation using 1H/3He MRI multivariate templates

    NASA Astrophysics Data System (ADS)

    Tustison, Nicholas J.; Contrella, Benjamin; Altes, Talissa A.; Avants, Brian B.; de Lange, Eduard E.; Mugler, John P.

    2013-03-01

    The utility of pulmonary functional imaging techniques, such as hyperpolarized 3He MRI, has encouraged their inclusion in research studies for longitudinal assessment of disease progression and the study of treatment effects. We present methodology for performing voxelwise statistical analysis of ventilation maps derived from hyperpolarized 3He MRI which incorporates multivariate template construction using simultaneous acquisition of 1H and 3He images. Additional processing steps include intensity normalization, bias correction, 4-D longitudinal segmentation, and generation of expected ventilation maps prior to voxelwise regression analysis. Analysis is demonstrated on a cohort of eight individuals with diagnosed cystic fibrosis (CF) undergoing treatment imaged five times every two weeks with a prescribed treatment schedule.

  12. Numerically accurate computational techniques for optimal estimator analyses of multi-parameter models

    NASA Astrophysics Data System (ADS)

    Berger, Lukas; Kleinheinz, Konstantin; Attili, Antonio; Bisetti, Fabrizio; Pitsch, Heinz; Mueller, Michael E.

    2018-05-01

    Modelling unclosed terms in partial differential equations typically involves two steps: First, a set of known quantities needs to be specified as input parameters for a model, and second, a specific functional form needs to be defined to model the unclosed terms by the input parameters. Both steps involve a certain modelling error, with the former known as the irreducible error and the latter referred to as the functional error. Typically, only the total modelling error, which is the sum of functional and irreducible error, is assessed, but the concept of the optimal estimator enables the separate analysis of the total and the irreducible errors, yielding a systematic modelling error decomposition. In this work, attention is paid to the techniques themselves required for the practical computation of irreducible errors. Typically, histograms are used for optimal estimator analyses, but this technique is found to add a non-negligible spurious contribution to the irreducible error if models with multiple input parameters are assessed. Thus, the error decomposition of an optimal estimator analysis becomes inaccurate, and misleading conclusions concerning modelling errors may be drawn. In this work, numerically accurate techniques for optimal estimator analyses are identified and a suitable evaluation of irreducible errors is presented. Four different computational techniques are considered: a histogram technique, artificial neural networks, multivariate adaptive regression splines, and an additive model based on a kernel method. For multiple input parameter models, only artificial neural networks and multivariate adaptive regression splines are found to yield satisfactorily accurate results. Beyond a certain number of input parameters, the assessment of models in an optimal estimator analysis even becomes practically infeasible if histograms are used. 
The optimal estimator analysis in this paper is applied to modelling the filtered soot intermittency in large eddy simulations using a dataset of a direct numerical simulation of a non-premixed sooting turbulent flame.

  13. Usage of multivariate geostatistics in interpolation processes for meteorological precipitation maps

    NASA Astrophysics Data System (ADS)

    Gundogdu, Ismail Bulent

    2017-01-01

    Long-term meteorological data are very important both for the evaluation of meteorological events and for the analysis of their effects on the environment. Prediction maps which are constructed by different interpolation techniques often provide explanatory information. Conventional techniques, such as surface spline fitting, global and local polynomial models, and inverse distance weighting may not be adequate. Multivariate geostatistical methods can be more significant, especially when studying secondary variables, because secondary variables might directly affect the precision of prediction. In this study, the mean annual and mean monthly precipitations from 1984 to 2014 for 268 meteorological stations in Turkey have been used to construct country-wide maps. Besides linear regression, the inverse square distance and ordinary co-Kriging (OCK) have been used and compared to each other. Also elevation, slope, and aspect data for each station have been taken into account as secondary variables, whose use has reduced errors by up to a factor of three. OCK gave the smallest errors (1.002 cm) when aspect was included.

  14. Visual Analysis of North Atlantic Hurricane Trends Using Parallel Coordinates and Statistical Techniques

    DTIC Science & Technology

    2008-07-07

    analyzing multivariate data sets. The system was developed using the Java Development Kit (JDK) version 1.5; and it yields interactive performance on a... script and captures output from the MATLAB’s “regress” and “stepwisefit” utilities that perform simple and stepwise regression, respectively. The MATLAB...Statistical Association, vol. 85, no. 411, pp. 664–675, 1990. [9] H. Hauser, F. Ledermann, and H. Doleisch, “ Angular brushing of extended parallel coordinates

  15. Load cell having strain gauges of arbitrary location

    DOEpatents

    Spletzer, Barry [Albuquerque, NM]

    2007-03-13

    A load cell utilizes a plurality of strain gauges mounted upon the load cell body such that there are six independent load-strain relations. Load is determined by applying the inverse of a load-strain sensitivity matrix to a measured strain vector. The sensitivity matrix is determined by performing a multivariate regression technique on a set of known loads correlated to the resulting strains. Temperature compensation is achieved by configuring the strain gauges as co-located orthogonal pairs.
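
    The calibration and measurement steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: it assumes a linear model strain = S · load with six load components, and the function names `fit_sensitivity` and `estimate_load` are hypothetical. The sensitivity matrix S is fitted by ordinary least squares from known loads and their measured strains, and a load is then recovered with the pseudo-inverse of S, which equals the ordinary inverse when there are exactly six independent gauges.

```python
import numpy as np


def fit_sensitivity(loads, strains):
    """Least-squares estimate of S in strain = S @ load.

    loads:   (N, 6) array of known calibration loads
    strains: (N, G) array of strains measured on G gauges
    Returns S with shape (G, 6).
    """
    # Solve loads @ S.T ~ strains for S.T, one column per gauge.
    s_t, *_ = np.linalg.lstsq(loads, strains, rcond=None)
    return s_t.T


def estimate_load(sensitivity, strain):
    """Recover the six load components from one measured strain vector."""
    # The pseudo-inverse reduces to the plain inverse for a square, full-rank S.
    return np.linalg.pinv(sensitivity) @ strain
```

    At least six calibration loads spanning all six load directions are needed for the fit to be well determined.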

  16. Strain Gauge Balance Uncertainty Analysis at NASA Langley: A Technical Review

    NASA Technical Reports Server (NTRS)

    Tripp, John S.

    1999-01-01

    This paper describes a method to determine the uncertainties of measured forces and moments from multi-component force balances used in wind tunnel tests. A multivariate regression technique is first employed to estimate the uncertainties of the six balance sensitivities and 156 interaction coefficients derived from established balance calibration procedures. These uncertainties are then employed to calculate the uncertainties of force-moment values computed from observed balance output readings obtained during tests. Confidence and prediction intervals are obtained for each computed force and moment as functions of the actual measurands. Techniques are discussed for separate estimation of balance bias and precision uncertainties.

  17. Application of Machine-Learning Models to Predict Tacrolimus Stable Dose in Renal Transplant Recipients

    NASA Astrophysics Data System (ADS)

    Tang, Jie; Liu, Rong; Zhang, Yue-Li; Liu, Mou-Ze; Hu, Yong-Fang; Shao, Ming-Jie; Zhu, Li-Jun; Xin, Hua-Wen; Feng, Gui-Wen; Shang, Wen-Jun; Meng, Xiang-Guang; Zhang, Li-Rong; Ming, Ying-Zi; Zhang, Wei

    2017-02-01

    Tacrolimus has a narrow therapeutic window and considerable variability in clinical use. Our goal was to compare the performance of multiple linear regression (MLR) and eight machine learning techniques in pharmacogenetic algorithm-based prediction of tacrolimus stable dose (TSD) in a large Chinese cohort. A total of 1,045 renal transplant patients were recruited, 80% of whom were randomly selected as the “derivation cohort” to develop the dose-prediction algorithm, while the remaining 20% constituted the “validation cohort” to test the final selected algorithm. MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied and their performances were compared in this work. Among all the machine learning models, RT performed best in both derivation [0.71 (0.67-0.76)] and validation cohorts [0.73 (0.63-0.82)]. In addition, the ideal rate of RT was 4% higher than that of MLR. To our knowledge, this is the first study to use machine learning models to predict TSD, which will further facilitate personalized medicine in tacrolimus administration in the future.

  18. Alternatives for using multivariate regression to adjust prospective payment rates

    PubMed Central

    Sheingold, Steven H.

    1990-01-01

    Multivariate regression analysis has been used in structuring three of the adjustments to Medicare's prospective payment rates. Because the indirect-teaching adjustment, the disproportionate-share adjustment, and the adjustment for large cities are responsible for distributing approximately $3 billion in payments each year, the specification of regression models for these adjustments is of critical importance. In this article, the application of regression for adjusting Medicare's prospective rates is discussed, and the implications that differing specifications could have for these adjustments are demonstrated. PMID:10113271

  19. Comparison of Nine Statistical Model Based Warfarin Pharmacogenetic Dosing Algorithms Using the Racially Diverse International Warfarin Pharmacogenetic Consortium Cohort Database

    PubMed Central

    Liu, Rong; Li, Xi; Zhang, Wei; Zhou, Hong-Hao

    2015-01-01

    Objective: Multiple linear regression (MLR) and machine learning techniques in pharmacogenetic algorithm-based warfarin dosing have been reported. However, performances of these algorithms in racially diverse groups have never been objectively evaluated and compared. In this literature-based study, we compared the performances of eight machine learning techniques with those of MLR in a large, racially diverse cohort. Methods: MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied in warfarin dose algorithms in a cohort from the International Warfarin Pharmacogenetics Consortium database. Covariates obtained by stepwise regression from 80% of randomly selected patients were used to develop algorithms. To compare the performances of these algorithms, the mean percentage of patients whose predicted dose fell within 20% of the actual dose (mean percentage within 20%) and the mean absolute error (MAE) were calculated in the remaining 20% of patients. The performances of these techniques in different races, as well as across therapeutic warfarin dose ranges, were compared. Robust results were obtained after 100 rounds of resampling. Results: BART, MARS and SVR were statistically indistinguishable and significantly outperformed all the other approaches in the whole cohort (MAE: 8.84–8.96 mg/week, mean percentage within 20%: 45.88%–46.35%). In the White population, MARS and BART showed higher mean percentage within 20% and lower mean MAE than those of MLR (all p values < 0.05). In the Asian population, SVR, BART, MARS and LAR performed the same as MLR. MLR and LAR performed best in the Black population. 
When patients were grouped in terms of warfarin dose range, all machine learning techniques except ANN and LAR showed significantly higher mean percentage within 20%, and lower MAE (all p values < 0.05) than MLR in the low- and high-dose ranges. Conclusion: Overall, the machine learning-based techniques BART, MARS and SVR performed better than MLR in warfarin pharmacogenetic dosing. Algorithm performance differs among races. Moreover, machine learning-based algorithms tended to perform better than MLR in the low- and high-dose ranges. PMID:26305568
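
    The two evaluation metrics named in this abstract can be computed as in the sketch below. It is an illustration only: the function names and the four example doses are made up for this example and are not data from the study, and counting a prediction that lies exactly on the 20% boundary as "within" is an assumption.

```python
import numpy as np


def mean_absolute_error(actual, predicted):
    """Mean absolute difference between predicted and actual doses."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual)))


def percent_within(actual, predicted, tolerance=0.20):
    """Percentage of patients whose predicted dose is within
    `tolerance` (as a fraction) of the actual dose."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    within = np.abs(predicted - actual) <= tolerance * actual
    return float(100.0 * np.mean(within))


# Made-up weekly doses in mg/week, for illustration only.
actual_dose = [35.0, 28.0, 50.0, 21.0]
predicted_dose = [30.0, 30.0, 61.0, 21.0]
print(mean_absolute_error(actual_dose, predicted_dose))  # 4.5
print(percent_within(actual_dose, predicted_dose))       # 75.0
```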

  20. Principal component analysis-based pattern analysis of dose-volume histograms and influence on rectal toxicity.

    PubMed

    Söhn, Matthias; Alber, Markus; Yan, Di

    2007-09-01

    The variability of dose-volume histogram (DVH) shapes in a patient population can be quantified using principal component analysis (PCA). We applied this to rectal DVHs of prostate cancer patients and investigated the correlation of the PCA parameters with late bleeding. PCA was applied to the rectal wall DVHs of 262 patients, who had been treated with a four-field box, conformal adaptive radiotherapy technique. The correlated changes in the DVH pattern were revealed as "eigenmodes," which were ordered by their importance to represent data set variability. Each DVH is uniquely characterized by its principal components (PCs). The correlation of the first three PCs and chronic rectal bleeding of Grade 2 or greater was investigated with uni- and multivariate logistic regression analyses. Rectal wall DVHs in four-field conformal RT can primarily be represented by the first two or three PCs, which describe approximately 94% or 96% of the DVH shape variability, respectively. The first eigenmode models the total irradiated rectal volume; thus, PC1 correlates to the mean dose. Mode 2 describes the interpatient differences of the relative rectal volume in the two- or four-field overlap region. Mode 3 reveals correlations of volumes with intermediate doses (approximately 40-45 Gy) and volumes with doses >70 Gy; thus, PC3 is associated with the maximal dose. According to univariate logistic regression analysis, only PC2 correlated significantly with toxicity. However, multivariate logistic regression analysis with the first two or three PCs revealed an increased probability of bleeding for DVHs with more than one large PC. PCA can reveal the correlation structure of DVHs for a patient population as imposed by the treatment technique and provide information about its relationship to toxicity. It proves useful for augmenting normal tissue complication probability modeling approaches.
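
    The decomposition of a set of curves into eigenmodes and per-patient principal components can be sketched as below. This is a generic PCA via singular value decomposition, assuming each DVH is sampled on a common dose grid and stored as one row of a matrix; the function name `pca_modes` and the return layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def pca_modes(curves, n_components):
    """PCA of a set of sampled curves.

    curves: (P, D) array, one row per patient, one column per dose bin
    Returns (mean_curve, modes, scores, explained_fraction) where
    modes is (n_components, D) and scores is (P, n_components).
    """
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Rows of vt are the eigenmodes, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values ** 2
    explained_fraction = variance / variance.sum()
    modes = vt[:n_components]
    # Each patient's principal components are projections onto the modes.
    scores = centered @ modes.T
    return mean_curve, modes, scores, explained_fraction[:n_components]
```

    A curve is approximated as `mean_curve + scores[i] @ modes`, and the scores can then serve as covariates in a subsequent regression model.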

  1. Risk prediction for myocardial infarction via generalized functional regression models.

    PubMed

    Ieva, Francesca; Paganoni, Anna M

    2016-08-01

    In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to 118 Dispatch Center of Milan (the Italian toll-free number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Thus, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the Principal Components decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Branch Block). The performance of this classification method is evaluated and compared with other methods proposed in literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.

  2. Fourier Transform Infrared Spectroscopy and Multivariate Analysis for Online Monitoring of Dibutyl Phosphate Degradation Product in Tributyl Phosphate/n-Dodecane/Nitric Acid Solvent

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tatiana G. Levitskaia; James M. Peterson; Emily L. Campbell

    2013-12-01

    In liquid–liquid extraction separation processes, accumulation of organic solvent degradation products is detrimental to the process robustness, and frequent solvent analysis is warranted. Our research explores the feasibility of online monitoring of the organic solvents relevant to used nuclear fuel reprocessing. This paper describes the first phase of developing a system for monitoring the tributyl phosphate (TBP)/n-dodecane solvent commonly used to separate used nuclear fuel. In this investigation, the effect of extraction of nitric acid from aqueous solutions of variable concentrations on the quantification of TBP and its major degradation product dibutylphosphoric acid (HDBP) was assessed. Fourier transform infrared (FTIR) spectroscopy was used to discriminate between HDBP and TBP in the nitric acid-containing TBP/n-dodecane solvent. Multivariate analysis of the spectral data facilitated the development of regression models for HDBP and TBP quantification in real time, enabling online implementation of the monitoring system. The predictive regression models were validated using TBP/n-dodecane solvent samples subjected to high-dose external γ-irradiation. The predictive models were translated to flow conditions using a hollow fiber FTIR probe installed in a centrifugal contactor extraction apparatus, demonstrating the applicability of the FTIR technique coupled with multivariate analysis for the online monitoring of the organic solvent degradation products.

  3. Fourier Transform Infrared Spectroscopy and Multivariate Analysis for Online Monitoring of Dibutyl Phosphate Degradation Product in Tributyl Phosphate /n-Dodecane/Nitric Acid Solvent

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Levitskaia, Tatiana G.; Peterson, James M.; Campbell, Emily L.

    2013-11-05

    In liquid-liquid extraction separation processes, accumulation of organic solvent degradation products is detrimental to the process robustness, and frequent solvent analysis is warranted. Our research explores the feasibility of online monitoring of the organic solvents relevant to used nuclear fuel reprocessing. This paper describes the first phase of developing a system for monitoring the tributyl phosphate (TBP)/n-dodecane solvent commonly used to separate used nuclear fuel. In this investigation, the effect of extraction of nitric acid from aqueous solutions of variable concentrations on the quantification of TBP and its major degradation product dibutyl phosphoric acid (HDBP) was assessed. Fourier transform infrared (FTIR) spectroscopy was used to discriminate between HDBP and TBP in the nitric acid-containing TBP/n-dodecane solvent. Multivariate analysis of the spectral data facilitated the development of regression models for HDBP and TBP quantification in real time, enabling online implementation of the monitoring system. The predictive regression models were validated using TBP/n-dodecane solvent samples subjected to high-dose external gamma irradiation. The predictive models were translated to flow conditions using a hollow fiber FTIR probe installed in a centrifugal contactor extraction apparatus, demonstrating the applicability of the FTIR technique coupled with multivariate analysis for the online monitoring of the organic solvent degradation products.

  4. Force required for correcting the deformity of pectus carinatum and related multivariate analysis.

    PubMed

    Chen, Chenghao; Zeng, Qi; Li, Zhongzhi; Zhang, Na; Yu, Jie

    2017-12-24

    To measure the force required for correcting pectus carinatum to the desired position and investigate the correlations of the required force with patients' gender, age, deformity type, severity and body mass index (BMI). A total of 125 patients with pectus carinatum were enrolled in the study from August 2013 to August 2016. Their gender, age, deformity type, severity and BMI were recorded. A chest wall compressor was used to measure the force required for correcting the chest wall deformity. Multivariate linear regression was used for data analysis. Among the 125 patients, 112 were males and 13 were females. Their mean age was 13.7±1.5 years old, mean Haller index was 2.1±0.2, and mean BMI was 17.4±1.8 kg/m². Multivariate linear regression analysis showed that the desirable force for correcting chest wall deformity was not correlated with gender and deformity type, but positively correlated with age and BMI and negatively correlated with Haller index. The desirable force measured for correcting chest wall deformities of patients with pectus carinatum positively correlates with age and BMI and negatively correlates with Haller index. The study provides valuable information for future improvement of implanted bar, bar fixation technique, and personalized surgery. Retrospective study. Level 3-4. Copyright © 2018. Published by Elsevier Inc.

  5. Variation of Water Quality Parameters with Siltation Depth for River Ichamati Along International Border with Bangladesh Using Multivariate Statistical Techniques

    NASA Astrophysics Data System (ADS)

    Roy, P. K.; Pal, S.; Banerjee, G.; Biswas Roy, M.; Ray, D.; Majumder, A.

    2014-12-01

    Rivers are considered one of the main sources of freshwater all over the world. Hence, analysis and maintenance of this water resource are globally considered a matter of major concern. This paper deals with the assessment of surface water quality of the Ichamati river using multivariate statistical techniques. Eight distinct surface water quality observation stations were located and samples were collected. For the samples collected, statistical techniques were applied to the physico-chemical parameters and depth of siltation. In this paper, cluster analysis is done to determine the relations between surface water quality and siltation depth of river Ichamati. Multiple regressions and mathematical equation modeling have been done to characterize surface water quality of Ichamati river on the basis of physico-chemical parameters. It was found that surface water quality of the downstream river was different from the water quality of the upstream. The analysis of the water quality parameters of the Ichamati river clearly indicates a high pollution load on the river water, which can be attributed to agricultural discharge, tidal effects and soil erosion. The results further reveal that with the increase in depth of siltation, water quality degraded.

  6. Distributed Monitoring of the R(sup 2) Statistic for Linear Regression

    NASA Technical Reports Server (NTRS)

    Bhaduri, Kanishka; Das, Kamalika; Giannella, Chris R.

    2011-01-01

    The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more dependent target variables. This problem becomes challenging for large scale data in a distributed computing environment when only a subset of instances is available at individual nodes and the local data changes frequently. Data centralization and periodic model recomputation can add high overhead to tasks like anomaly detection in such dynamic settings. Therefore, the goal is to develop techniques for monitoring and updating the model over the union of all nodes' data in a communication-efficient fashion. Correctness guarantees on such techniques are also often highly desirable, especially in safety-critical application scenarios. In this paper we develop DReMo, a distributed algorithm with very low resource overhead, for monitoring the quality of a regression model in terms of its coefficient of determination (R2 statistic). When the nodes collectively determine that R2 has dropped below a fixed threshold, the linear regression model is recomputed via a network-wide convergecast and the updated model is broadcast back to all nodes. We show empirically, using both synthetic and real data, that our proposed method is highly communication-efficient and scalable, and also provide theoretical guarantees on correctness.
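
    The reason R2 lends itself to distributed monitoring is that it can be assembled from small additive summaries computed at each node. The sketch below shows only that arithmetic; it is not the DReMo algorithm, which the abstract describes as avoiding exactly this kind of repeated centralization. The function names `local_summary`, `global_r2` and `needs_refit` are hypothetical, and a single target variable with a shared coefficient vector `beta` is assumed.

```python
import numpy as np


def local_summary(X, y, beta):
    """Four numbers a node would need to share for a fixed model `beta`:
    count, sum of y, sum of y squared, and sum of squared residuals."""
    residual = y - X @ beta
    return np.array([len(y), y.sum(), (y ** 2).sum(), (residual ** 2).sum()])


def global_r2(summaries):
    """Combine per-node summaries into the R2 over the union of all data."""
    n, sum_y, sum_y2, sse = np.sum(summaries, axis=0)
    # Total sum of squares around the global mean of y.
    sst = sum_y2 - sum_y ** 2 / n
    return 1.0 - sse / sst


def needs_refit(summaries, threshold):
    """True when the combined R2 has dropped below the chosen threshold."""
    return bool(global_r2(summaries) < threshold)
```

    Because the summaries add across nodes, the combined value equals the R2 that would be obtained by pooling all rows in one place.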

  7. Predictors of unsuccessful outcome in cemented femoral revisions using bone impaction grafting; Cox regression analysis of 208 cases.

    PubMed

    Te Stroet, Martijn A J; Rijnen, Wim H C; Gardeniers, Jean W M; Schreurs, B Willem; Hannink, Gerjon

    2016-09-29

    Despite improvements in the technique of femoral impaction bone grafting, reconstruction failures still can occur. Therefore, the aim of our study was to determine risk factors for the endpoint re-revision for any reason. We used prospectively collected demographic, clinical and surgical data of all 202 patients who underwent 208 femoral revisions using the X-change Femoral Revision System (Stryker-Howmedica), fresh-frozen morcellised allograft and a cemented polished Exeter stem in our department from 1991 to 2007. Univariable and multivariable Cox regression analyses were performed to identify potential factors associated with re-revision. The mean follow-up was 10.6 (5-21) years. The cumulative re-revision rate was 6.3% (13/208). After univariable selection, sex, age, body mass index (BMI), American Society of Anesthesiologists (ASA) classification, type of removed femoral component, and mesh used for reconstruction were included in multivariable regression analysis. In the multivariable analysis, BMI was the only factor that was significantly associated with the risk of re-revision after bone impaction grafting (BMI ≥30 vs. BMI <30, HR = 6.54 [95% CI 1.89-22.65]; p = 0.003). BMI was the only factor associated with the risk of re-revision for any reason. Besides BMI, other factors, such as Endoklinik score and the type of removed femoral component, can also provide guidance in the process of preclinical decision making. With the knowledge obtained from this study, preoperative patient selection, informed consent, and treatment protocols can be better adjusted to the individual patient who needs to undergo a femoral revision with impaction bone grafting.

  8. The importance of extent of choroid plexus cauterization in addition to endoscopic third ventriculostomy for infantile hydrocephalus: a retrospective North American observational study using propensity score-adjusted analysis.

    PubMed

    Fallah, Aria; Weil, Alexander G; Juraschka, Kyle; Ibrahim, George M; Wang, Anthony C; Crevier, Louis; Tseng, Chi-Hong; Kulkarni, Abhaya V; Ragheb, John; Bhatia, Sanjiv

    2017-12-01

    OBJECTIVE Combined endoscopic third ventriculostomy (ETV) and choroid plexus cauterization (CPC), referred to as ETV/CPC, is being investigated to increase the rate of shunt independence in infants with hydrocephalus. The degree of CPC necessary to achieve improved rates of shunt independence is currently unknown. METHODS Using data from a single-center, retrospective, observational cohort study involving patients who underwent ETV/CPC for treatment of infantile hydrocephalus, comparative statistical analyses were performed to detect a difference in need for subsequent CSF diversion procedure in patients undergoing partial CPC (describes unilateral CPC or bilateral CPC that only extended from the foramen of Monro [FM] to the atrium on one side) or subtotal CPC (describes CPC extending from the FM to the posterior temporal horn bilaterally) using a rigid neuroendoscope. Propensity scores for extent of CPC were calculated using age and etiology. Propensity scores were used to perform 1) case-matching comparisons and 2) Cox multivariable regression, adjusting for propensity score in the unmatched cohort. Cox multivariable regression adjusting for age and etiology, but not propensity score, was also performed as a third statistical technique. RESULTS Eighty-four patients who underwent ETV/CPC had sufficient data to be included in the analysis. Subtotal CPC was performed in 58 patients (69%) and partial CPC in 26 (31%). The ETV/CPC success rates at 6 and 12 months, respectively, were 49% and 41% for patients undergoing subtotal CPC and 35% and 31% for those undergoing partial CPC. Cox multivariate regression in a 48-patient cohort case-matched by propensity score demonstrated no added effect of increased extent of CPC on ETV/CPC survival (HR 0.868, 95% CI 0.422-1.789, p = 0.702). Cox multivariate regression including all patients, with adjustment for propensity score, demonstrated no effect of extent of CPC on ETV/CPC survival (HR 0.845, 95% CI 0.462-1.548, p = 0.586). 
Cox multivariate regression including all patients, with adjustment for age and etiology, but not propensity score, demonstrated no effect of extent of CPC on ETV/CPC survival (HR 0.908, 95% CI 0.495-1.664, p = 0.755). CONCLUSIONS Using multiple comparative statistical analyses, no difference in need for subsequent CSF diversion procedure was detected between patients in this cohort who underwent partial versus subtotal CPC. Further investigation regarding whether there is truly no difference between partial versus subtotal extent of CPC in larger patient populations and whether further gain in CPC success can be achieved with complete CPC is warranted.

  9. Multivariate Formation Pressure Prediction with Seismic-derived Petrophysical Properties from Prestack AVO inversion and Poststack Seismic Motion Inversion

    NASA Astrophysics Data System (ADS)

    Yu, H.; Gu, H.

    2017-12-01

    A novel multivariate seismic formation pressure prediction methodology is presented, which incorporates high-resolution seismic velocity data from prestack AVO inversion, and petrophysical data (porosity and shale volume) derived from poststack seismic motion inversion. In contrast to traditional seismic formation prediction methods, the proposed methodology is based on a multivariate pressure prediction model and utilizes a trace-by-trace multivariate regression analysis on seismic-derived petrophysical properties to calibrate model parameters in order to make accurate predictions with higher resolution in both vertical and lateral directions. With prestack time migration velocity as initial velocity model, an AVO inversion was first applied to prestack dataset to obtain high-resolution seismic velocity with higher frequency that is to be used as the velocity input for seismic pressure prediction, and the density dataset to calculate accurate Overburden Pressure (OBP). Seismic Motion Inversion (SMI) is an inversion technique based on Markov Chain Monte Carlo simulation. Both structural variability and similarity of seismic waveform are used to incorporate well log data to characterize the variability of the property to be obtained. In this research, porosity and shale volume are first interpreted on well logs, and then combined with poststack seismic data using SMI to build porosity and shale volume datasets for seismic pressure prediction. A multivariate effective stress model is used to convert velocity, porosity and shale volume datasets to effective stress. After a thorough study of the regional stratigraphic and sedimentary characteristics, a regional normally compacted interval model is built, and then the coefficients in the multivariate prediction model are determined in a trace-by-trace multivariate regression analysis on the petrophysical data. 
The coefficients are used to convert velocity, porosity and shale volume datasets to effective stress and then to calculate formation pressure with OBP. Application of the proposed methodology to a research area in East China Sea has proved that the method can bridge the gap between seismic and well log pressure prediction and give predicted pressure values close to pressure measurements from well testing.

  10. The use of logistic regression to enhance risk assessment and decision making by mental health administrators.

    PubMed

    Menditto, Anthony A; Linhorst, Donald M; Coleman, James C; Beck, Niels C

    2006-04-01

    Development of policies and procedures to contend with the risks presented by elopement, aggression, and suicidal behaviors is a long-standing challenge for mental health administrators. Guidance in making such judgments can be obtained through the use of a multivariate statistical technique known as logistic regression. This procedure can be used to develop a predictive equation that is mathematically formulated to use the best combination of predictors, rather than considering just one factor at a time. This paper presents an overview of logistic regression and its utility in mental health administrative decision making. A case example of its application is presented using data on elopements from Missouri's long-term state psychiatric hospitals. Ultimately, the use of statistical prediction analyses tempered with differential qualitative weighting of classification errors can augment decision-making processes in a manner that provides guidance and flexibility while wrestling with the complex problem of risk assessment and decision making.
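
    A minimal sketch of what such a predictive equation looks like, and of how unequal error costs can shift the decision cut-off. The coefficients, predictors, and cost values below are hypothetical and are not taken from the paper; the cut-off rule is the usual expected-cost threshold, assumed here as one way to weight classification errors.

```python
import math


def predicted_probability(intercept, coefficients, predictors):
    """Logistic-regression probability of the event for one case."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-linear))


def decision_threshold(cost_false_positive, cost_false_negative):
    """Probability cut-off that minimises expected misclassification cost."""
    return cost_false_positive / (cost_false_positive + cost_false_negative)


# Hypothetical coefficients for two made-up predictors (not from the paper).
intercept = -2.0
coefficients = [0.8, 1.5]
case = [1.0, 0.0]

p = predicted_probability(intercept, coefficients, case)

# Assumption: a missed event is treated as four times as costly as a false alarm.
cutoff = decision_threshold(cost_false_positive=1.0, cost_false_negative=4.0)
flagged = p >= cutoff
```

    Raising the cost of a missed event lowers the cut-off, so more cases are flagged for review.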

  11. Multi-Site and Multi-Variables Statistical Downscaling Technique in the Monsoon Dominated Region of Pakistan

    NASA Astrophysics Data System (ADS)

    Khan, Firdos; Pilz, Jürgen

    2016-04-01

    South Asia is under the severe impacts of changing climate and global warming. The last two decades showed that climate change or global warming is happening and the first decade of the 21st century is considered the warmest decade on record over Pakistan, with temperature reaching 53 °C in 2010. Consequently, the spatio-temporal distribution and intensity of precipitation is badly affected and causes floods, cyclones and hurricanes in the region which further have impacts on agriculture, water, health etc. To cope with the situation, it is important to conduct impact assessment studies and take adaptation and mitigation remedies. For impact assessment studies, we need climate variables at higher resolution. Downscaling techniques are used to produce climate variables at higher resolution; these techniques are broadly divided into two types, statistical downscaling and dynamical downscaling. The target location of this study is the monsoon dominated region of Pakistan. One reason for choosing this area is that the contribution of monsoon rains in this area is more than 80 % of the total rainfall. This study evaluates a statistical downscaling technique which can be then used for downscaling climatic variables. Two statistical techniques i.e. quantile regression and copula modeling are combined in order to produce realistic results for climate variables in the area under study. To reduce the dimension of input data and deal with multicollinearity problems, empirical orthogonal functions will be used. Advantages of this new method are: (1) it is more robust to outliers as compared to ordinary least squares estimates and other estimation methods based on central tendency and dispersion measures; (2) it preserves the dependence among variables and among sites and (3) it can be used to combine different types of distributions. 
This is important in our case because we are dealing with climatic variables having different distributions over different meteorological stations. The proposed model will be validated by using the (National Centers for Environmental Prediction / National Center for Atmospheric Research) NCEP/NCAR predictors for the period of 1960-1990 and validated for 1990-2000. To investigate the efficiency of the proposed model, it will be compared with the multivariate multiple regression model and with dynamical downscaling climate models by using different climate indices that describe the frequency, intensity and duration of the variables of interest. KEY WORDS: Climate change, Copula, Monsoon, Quantile regression, Spatio-temporal distribution.
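
    The robustness claim in advantage (1) can be illustrated with the quantile (pinball) loss that quantile regression minimises. This is a sketch on made-up rainfall values with a constant-only model, not the authors' downscaling procedure; the function names are illustrative.

```python
def pinball_loss(observations, prediction, tau):
    """Average quantile (pinball) loss of a constant prediction."""
    total = 0.0
    for y in observations:
        residual = y - prediction
        total += max(tau * residual, (tau - 1.0) * residual)
    return total / len(observations)


def best_constant(observations, tau):
    """Data value that minimises the pinball loss (a sample tau-quantile)."""
    return min(observations, key=lambda c: pinball_loss(observations, c, tau))


# Made-up rainfall values with one extreme event.
rain = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(rain, tau=0.5)  # minimiser of the tau = 0.5 loss
mean_fit = sum(rain) / len(rain)           # minimiser of squared error
```

    The single extreme value pulls the least-squares fit (the mean) far from the bulk of the data, while the median fit stays among the typical values.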

  12. Effect of Contact Damage on the Strength of Ceramic Materials.

    DTIC Science & Technology

    1982-10-01

    variables that are important to erosion, and a multivariate, linear regression analysis is used to fit the data to the dimensional analysis. The...of Equations 7 and 8 by a multivariable regression analysis (room temperature data) Exponent Regression Standard error Computed coefficient of...1980) 593. WEAVER, Proc. Brit. Ceram. Soc. 22 (1973) 125. 39. P. W. BRIDGMAN, "Dimensional Analysis", (Yale 18. R. W. RICE, S. W. FREIMAN and P. F

  13. Assessing risk factors for periodontitis using regression

    NASA Astrophysics Data System (ADS)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using regression, linear and logistic models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
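
    For reference, an odds ratio and its 95% confidence interval are commonly obtained from a logistic regression coefficient and its standard error as sketched below. The Wald interval is assumed here; the abstract does not say which interval the authors used, and the coefficient and standard error are hypothetical.

```python
import math


def odds_ratio_with_ci(coefficient, standard_error, z=1.96):
    """Odds ratio and Wald confidence interval from a logistic coefficient."""
    lower = math.exp(coefficient - z * standard_error)
    upper = math.exp(coefficient + z * standard_error)
    return math.exp(coefficient), lower, upper


# Hypothetical coefficient and standard error for a binary risk factor.
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(0.693, 0.25)
```

    An interval that excludes 1 corresponds to a coefficient that differs significantly from zero at the matching level.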

  14. Partial Least Squares with Structured Output for Modelling the Metabolomics Data Obtained from Complex Experimental Designs: A Study into the Y-Block Coding.

    PubMed

    Xu, Yun; Muhamadali, Howbeer; Sayqal, Ali; Dixon, Neil; Goodacre, Royston

    2016-10-28

    Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a "pure" regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding.

  15. Assessing the independent contribution of maternal educational expectations to children's educational attainment in early adulthood: a propensity score matching analysis.

    PubMed

    Pingault, Jean Baptiste; Côté, Sylvana M; Petitclerc, Amélie; Vitaro, Frank; Tremblay, Richard E

    2015-01-01

    Parental educational expectations have been associated with children's educational attainment in a number of long-term longitudinal studies, but whether this relationship is causal has long been debated. The aims of this prospective study were twofold: 1) test whether low maternal educational expectations contributed to failure to graduate from high school; and 2) compare the results obtained using different strategies for accounting for confounding variables (i.e. multivariate regression and propensity score matching). The study sample included 1,279 participants from the Quebec Longitudinal Study of Kindergarten Children. Maternal educational expectations were assessed when the participants were aged 12 years. High school graduation—measuring educational attainment—was determined through the Quebec Ministry of Education when the participants were aged 22-23 years. Findings show that when using the most common statistical approach (i.e. multivariate regressions to adjust for a restricted set of potential confounders) the contribution of low maternal educational expectations to failure to graduate from high school was statistically significant. However, when using propensity score matching, the contribution of maternal expectations was reduced and remained statistically significant only for males. The results of this study are consistent with the possibility that the contribution of parental expectations to educational attainment is overestimated in the available literature. This may be explained by the use of a restricted range of potential confounding variables as well as the dearth of studies using appropriate statistical techniques and study designs in order to minimize confounding. 
Each of these techniques and designs, including propensity score matching, has its strengths and limitations: A more comprehensive understanding of the causal role of parental expectations will stem from a convergence of findings from studies using different techniques and designs.
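
    The matching step that distinguishes propensity score matching from regression adjustment can be sketched as follows. This assumes the propensity scores have already been estimated, and it uses one-to-one nearest-neighbour matching with replacement and a caliper, which is only one of several matching schemes and not necessarily the one used in the study. All data values are made up.

```python
def att_nearest_neighbor(treated, controls, caliper):
    """Average outcome difference between treated units and matched controls.

    Each unit is a (propensity_score, outcome) pair. Controls are matched
    with replacement; treated units without a control inside the caliper
    are dropped. Returns None when nothing can be matched.
    """
    differences = []
    for score, outcome in treated:
        match = min(controls, key=lambda unit: abs(unit[0] - score))
        if abs(match[0] - score) <= caliper:
            differences.append(outcome - match[1])
    if not differences:
        return None
    return sum(differences) / len(differences)


# Made-up units: (propensity score, outcome coded 1 = event, 0 = no event).
treated = [(0.8, 1.0), (0.3, 0.0)]
controls = [(0.75, 0.0), (0.35, 0.0), (0.1, 1.0)]

att = att_nearest_neighbor(treated, controls, caliper=0.1)
```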

  16. MANOVA vs nonlinear mixed effects modeling: The comparison of growth patterns of female and male quail

    NASA Astrophysics Data System (ADS)

    Gürcan, Eser Kemal

    2017-04-01

    The most commonly used methods for analyzing time-dependent data are multivariate analysis of variance (MANOVA) and nonlinear regression models. The aim of this study was to compare some MANOVA techniques and nonlinear mixed modeling approach for investigation of growth differentiation in female and male Japanese quail. Weekly individual body weight data of 352 male and 335 female quail from hatch to 8 weeks of age were used to perform analyses. It is possible to say that when all the analyses are evaluated, the nonlinear mixed modeling is superior to the other techniques because it also reveals the individual variation. In addition, the profile analysis also provides important information.

  17. Artificial neural network, genetic algorithm, and logistic regression applications for predicting renal colic in emergency settings.

    PubMed

    Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay

    2009-06-03

    Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models like an artificial neural network (ANN) and genetic algorithm (GA) may also be useful to interpret medical data. The purpose of this study was to perform artificial intelligence models on a medical data sheet and compare to logistic regression. ANN, GA, and logistic regression analysis were carried out on a data sheet of a previously published article regarding patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules in predicting urinary stones. Rule 1 consisted of being male, pain not spreading to back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was found to be 0.867, 0.720, 0.715, and 0.713 for all applications, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and to constitute clinical decision rules. They may be an alternative to conventional multivariate analysis applications used in biostatistics.
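
    The likelihood ratios quoted above follow directly from sensitivity and specificity. The sketch below recomputes them from the rounded percentages in the abstract; because the inputs are rounded, the recomputed negative likelihood ratios can differ from the reported ones in the last digit.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from proportions in [0, 1]."""
    positive = sensitivity / (1.0 - specificity)
    negative = (1.0 - sensitivity) / specificity
    return positive, negative


# Sensitivity and specificity (as proportions) quoted in the abstract.
reported = {
    "ANN": (0.949, 0.784),
    "GA rule 1": (0.676, 0.7647),
    "GA rule 2": (0.568, 0.863),
    "Logistic regression": (0.955, 0.471),
}

ratios = {name: likelihood_ratios(se, sp) for name, (se, sp) in reported.items()}
```

    Rounded to one decimal place, the recomputed positive likelihood ratios are 4.4, 2.9, 4.1, and 1.8, matching the reported values.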

  18. Variable Selection in Logistic Regression.

    DTIC Science & Technology

    1987-06-01

    Z. D. Bai, P. R. Krishnaiah and L. C. Zhao. Center for Multivariate Analysis, University of Pittsburgh. Contract or grant number F49620-85-C-0008.

  19. The Multivariate Temporal Response Function (mTRF) Toolbox: A MATLAB Toolbox for Relating Neural Signals to Continuous Stimuli.

    PubMed

    Crosse, Michael J; Di Liberto, Giovanni M; Bednar, Adam; Lalor, Edmund C

    2016-01-01

    Understanding how brains process sensory signals in natural environments is one of the key goals of twenty-first century neuroscience. While brain imaging and invasive electrophysiology will play key roles in this endeavor, there is also an important role to be played by noninvasive, macroscopic techniques with high temporal resolution such as electro- and magnetoencephalography. But challenges exist in determining how best to analyze such complex, time-varying neural responses to complex, time-varying and multivariate natural sensory stimuli. There has been a long history of applying system identification techniques to relate the firing activity of neurons to complex sensory stimuli and such techniques are now seeing increased application to EEG and MEG data. One particular example involves fitting a filter, often referred to as a temporal response function, that describes a mapping between some feature(s) of a sensory stimulus and the neural response. Here, we first briefly review the history of these system identification approaches and describe a specific technique for deriving temporal response functions known as regularized linear regression. We then introduce a new open-source toolbox for performing this analysis. We describe how it can be used to derive (multivariate) temporal response functions describing a mapping between stimulus and response in both directions. We also explain the importance of regularizing the analysis and how this regularization can be optimized for a particular dataset. We then outline specifically how the toolbox implements these analyses and provide several examples of the types of results that the toolbox can produce. Finally, we consider some of the limitations of the toolbox and opportunities for future development and application.
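
    The idea of estimating a temporal response function by regularized linear regression can be sketched in a few lines. This is not the toolbox's MATLAB implementation: it assumes plain ridge regularization, a single stimulus feature, a single response channel, and non-negative lags only, and every name in it is illustrative. The data are synthetic.

```python
def lagged_design(stimulus, n_lags):
    """Rows hold the stimulus at lags 0..n_lags-1 (zero-padded at the start)."""
    return [[stimulus[t - k] if t - k >= 0 else 0.0 for k in range(n_lags)]
            for t in range(len(stimulus))]


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def ridge_trf(stimulus, response, n_lags, lam):
    """Forward-model weights w = (X'X + lam*I)^-1 X'y for one channel."""
    X = lagged_design(stimulus, n_lags)
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(n_lags)] for i in range(n_lags)]
    xty = [sum(row[i] * y for row, y in zip(X, response)) for i in range(n_lags)]
    return solve(xtx, xty)


# Synthetic check: a made-up 3-lag kernel generates a noise-free response.
stimulus = [1.0, 0.0, 0.0, 2.0, 0.0, -1.0, 0.0, 0.0, 3.0, 0.0, 0.0]
kernel = [0.5, -0.25, 0.1]
response = [sum(w * x for w, x in zip(kernel, row))
            for row in lagged_design(stimulus, len(kernel))]

weights_light = ridge_trf(stimulus, response, n_lags=3, lam=1e-9)
weights_heavy = ridge_trf(stimulus, response, n_lags=3, lam=50.0)
```

    With almost no regularization the kernel is recovered from the noise-free data, while a large penalty shrinks the weights toward zero. On real, noisy recordings the penalty would normally be chosen by cross-validation.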

  20. The Multivariate Temporal Response Function (mTRF) Toolbox: A MATLAB Toolbox for Relating Neural Signals to Continuous Stimuli

    PubMed Central

    Crosse, Michael J.; Di Liberto, Giovanni M.; Bednar, Adam; Lalor, Edmund C.

    2016-01-01

    Understanding how brains process sensory signals in natural environments is one of the key goals of twenty-first century neuroscience. While brain imaging and invasive electrophysiology will play key roles in this endeavor, there is also an important role to be played by noninvasive, macroscopic techniques with high temporal resolution such as electro- and magnetoencephalography. But challenges exist in determining how best to analyze such complex, time-varying neural responses to complex, time-varying and multivariate natural sensory stimuli. There has been a long history of applying system identification techniques to relate the firing activity of neurons to complex sensory stimuli and such techniques are now seeing increased application to EEG and MEG data. One particular example involves fitting a filter—often referred to as a temporal response function—that describes a mapping between some feature(s) of a sensory stimulus and the neural response. Here, we first briefly review the history of these system identification approaches and describe a specific technique for deriving temporal response functions known as regularized linear regression. We then introduce a new open-source toolbox for performing this analysis. We describe how it can be used to derive (multivariate) temporal response functions describing a mapping between stimulus and response in both directions. We also explain the importance of regularizing the analysis and how this regularization can be optimized for a particular dataset. We then outline specifically how the toolbox implements these analyses and provide several examples of the types of results that the toolbox can produce. Finally, we consider some of the limitations of the toolbox and opportunities for future development and application. PMID:27965557

  1. Comparing lagged linear correlation, lagged regression, Granger causality, and vector autoregression for uncovering associations in EHR data.

    PubMed

    Levine, Matthew E; Albers, David J; Hripcsak, George

    2016-01-01

    Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.

  2. Detection and discrimination of microorganisms on various substrates with quantum cascade laser spectroscopy

    NASA Astrophysics Data System (ADS)

    Padilla-Jiménez, Amira C.; Ortiz-Rivera, William; Rios-Velazquez, Carlos; Vazquez-Ayala, Iris; Hernández-Rivera, Samuel P.

    2014-06-01

    Investigations focusing on devising rapid and accurate methods for developing signatures for microorganisms that could be used as biological warfare agents' detection, identification, and discrimination have recently increased significantly. Quantum cascade laser (QCL)-based spectroscopic systems have revolutionized many areas of defense and security including this area of research. In this contribution, infrared spectroscopy detection based on QCL was used to obtain the mid-infrared (MIR) spectral signatures of Bacillus thuringiensis, Escherichia coli, and Staphylococcus epidermidis. These bacteria were used as microorganisms that simulate biothreats (biosimulants) very truthfully. The experiments were conducted in reflection mode with biosimulants deposited on various substrates including cardboard, glass, travel bags, wood, and stainless steel. Chemometrics multivariate statistical routines, such as principal component analysis regression and partial least squares coupled to discriminant analysis, were used to analyze the MIR spectra. Overall, the investigated infrared vibrational techniques were useful for detecting target microorganisms on the studied substrates, and the multivariate data analysis techniques proved to be very efficient for classifying the bacteria and discriminating them in the presence of highly IR-interfering media.

  3. Comparative study between derivative spectrophotometry and multivariate calibration as analytical tools applied for the simultaneous quantitation of Amlodipine, Valsartan and Hydrochlorothiazide

    NASA Astrophysics Data System (ADS)

    Darwish, Hany W.; Hassan, Said A.; Salem, Maissa Y.; El-Zeany, Badr A.

    2013-09-01

    Four simple, accurate and specific methods were developed and validated for the simultaneous estimation of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in commercial tablets. The derivative spectrophotometric methods include Derivative Ratio Zero Crossing (DRZC) and Double Divisor Ratio Spectra-Derivative Spectrophotometry (DDRS-DS) methods, while the multivariate calibrations used are Principal Component Regression (PCR) and Partial Least Squares (PLSs). The proposed methods were applied successfully in the determination of the drugs in laboratory-prepared mixtures and in commercial pharmaceutical preparations. The validity of the proposed methods was assessed using the standard addition technique. The linearity of the proposed methods is investigated in the range of 2-32, 4-44 and 2-20 μg/mL for AML, VAL and HCT, respectively.

  4. The Association Between Unintended Births and Poor Child Development in India: Evidence from a Longitudinal Study.

    PubMed

    Singh, Abhishek; Upadhyay, Ashish Kumar; Singh, Ashish; Kumar, Kaushalendra

    2017-03-01

    Evidence on the association between unintended births and poor child development in developing countries is limited. We used data from three waves of the Young Lives study on childhood poverty conducted in Andhra Pradesh in 2002, 2006-07, and 2009 to examine the association between unintended births and poor child development in India. Multivariable linear regression models were used to examine the association between unintended births and four indicators of child development: height-for-age Z-score (HAZ), Peabody Picture Vocabulary Test (PPVT) score, Mathematics Achievement Test (MAT) score, and Early Grade Reading Assessment (EGRA) test score. The Propensity Score Matching (PSM) technique was also used to analyze data. Children who were reported as unintended at birth had significantly lower HAZ, PPVT, and EGRA scores compared with those who were reported as intended. PSM results support the findings from the multivariable linear regressions. Our findings provide evidence on the association between unintended births and poor child development in India. There may be a need to reposition family planning within India's reproductive and child health care programs. Future studies must take into account the unobserved heterogeneity that our study could not address fully. © 2017 The Population Council, Inc.

  5. Combining fibre optic Raman spectroscopy and tactile resonance measurement for tissue characterization

    NASA Astrophysics Data System (ADS)

    Candefjord, Stefan; Nyberg, Morgan; Jalkanen, Ville; Ramser, Kerstin; Lindahl, Olof A.

    2010-12-01

    Tissue characterization is fundamental for identification of pathological conditions. Raman spectroscopy (RS) and tactile resonance measurement (TRM) are two promising techniques that measure biochemical content and stiffness, respectively. They have potential to complement the gold standard, histological analysis. By combining RS and TRM, complementary information about tissue content can be obtained and specific drawbacks can be avoided. The aim of this study was to develop a multivariate approach to compare RS and TRM information. The approach was evaluated on measurements at the same points on porcine abdominal tissue. The measurement points were divided into five groups by multivariate analysis of the RS data. A regression analysis was performed and receiver operating characteristic (ROC) curves were used to compare the RS and TRM data. TRM identified one group efficiently (area under ROC curve 0.99). The RS data showed that the proportion of saturated fat was high in this group. The regression analysis showed that stiffness was mainly determined by the amount of fat and its composition. We concluded that RS provided additional, important information for tissue identification that was not provided by TRM alone. The results are promising for development of a method combining RS and TRM for intraoperative tissue characterization.

  6. Study of cyanotoxins presence from experimental cyanobacteria concentrations using a new data mining methodology based on multivariate adaptive regression splines in Trasona reservoir (Northern Spain).

    PubMed

    Garcia Nieto, P J; Sánchez Lasheras, F; de Cos Juez, F J; Alonso Fernández, J R

    2011-11-15

    There is an increasing need to describe cyanobacteria blooms since some cyanobacteria produce toxins, termed cyanotoxins. These can be toxic and dangerous to humans as well as other animals and life in general. It must be remarked that cyanobacteria reproduce explosively under certain conditions. This results in algae blooms, which can become harmful to other species if the cyanobacteria involved produce cyanotoxins. In this research work, the evolution of cyanotoxins in Trasona reservoir (Principality of Asturias, Northern Spain) was studied with success using the data mining methodology based on multivariate adaptive regression splines (MARS) technique. The results of the present study are two-fold. On one hand, the importance of the different kinds of cyanobacteria for the presence of cyanotoxins in the reservoir is presented through the MARS model and on the other hand a predictive model able to forecast the possible presence of cyanotoxins in the short term was obtained. The agreement of the MARS model with experimental data confirmed its good performance. Finally, conclusions of this innovative research are presented. Copyright © 2011 Elsevier B.V. All rights reserved.

  7. A matrix-based method of moments for fitting the multivariate random effects model for meta-analysis and meta-regression

    PubMed Central

    Jackson, Dan; White, Ian R; Riley, Richard D

    2013-01-01

    Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
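
    For orientation, the univariate DerSimonian and Laird estimator that the new method reduces to in one dimension can be sketched as below. The multivariate matrix-based version described in the paper is not reproduced here, and the effect sizes and within-study variances are made up.

```python
def dersimonian_laird(effects, variances):
    """Univariate method-of-moments estimate of the between-study variance."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    scale = total_w - sum(w * w for w in weights) / total_w
    # Truncate at zero so the variance estimate is never negative.
    return max(0.0, (q - (len(effects) - 1)) / scale)


def random_effects_mean(effects, variances, tau2):
    """Pooled estimate with weights 1 / (within-study variance + tau2)."""
    weights = [1.0 / (v + tau2) for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


# Made-up study effects with equal within-study variances.
effects = [0.1, 0.3, 0.5]
variances = [0.01, 0.01, 0.01]

tau2 = dersimonian_laird(effects, variances)
pooled = random_effects_mean(effects, variances, tau2)
```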

  8. Multivariate functions for predicting the sorption of 2,4,6-trinitrotoluene (TNT) and 1,3,5-trinitro-1,3,5-tricyclohexane (RDX) among taxonomically distinct soils.

    PubMed

    Katseanes, Chelsea K; Chappell, Mark A; Hopkins, Bryan G; Durham, Brian D; Price, Cynthia L; Porter, Beth E; Miller, Lesley F

    2016-11-01

    After nearly a century of use in numerous munition platforms, TNT and RDX contamination has turned up largely in the environment due to ammunition manufacturing or as part of releases from low-order detonations during training activities. Although the basic knowledge governing the environmental fate of TNT and RDX is known, accurate predictions of TNT and RDX persistence in soil remain elusive, particularly given the universal heterogeneity of pedomorphic soil types. In this work, we proposed a new solution for modeling the sorption and persistence of these munition constituents as multivariate mathematical functions correlating soil attribute data over a variety of taxonomically distinct soil types to contaminant behavior, instead of a single constant or parameter of a specific absolute value. To test this idea, we conducted experiments measuring the sorption of TNT and RDX on taxonomically different soil types that were extensively physically and chemically characterized. Statistical decomposition of the log-transformed and auto-scaled soil characterization data using the dimension-reduction technique PCA (principal component analysis) revealed a strong latent structure based in the multiple pairwise correlations among the soil properties. TNT and RDX sorption partitioning coefficients (KD-TNT and KD-RDX) were regressed against this latent structure using partial least squares regression (PLSR), generating 3-factor multivariate linear functions. Here, PLSR models predicted KD-TNT and KD-RDX values based on attributes contributing to endogenous alkaline/calcareous and soil fertility criteria, respectively, exhibited among the different soil types. We hypothesized that the latent structure arising from the strong covariance of the full multivariate geochemical matrix describing taxonomically distinguished soil types may provide the means for potentially predicting complex phenomena in soils. 
The development of predictive multivariate models tuned to a local soil's taxonomic designation would have direct benefit to military range managers seeking to anticipate the environmental risks of training activities on impact sites. Published by Elsevier Ltd.

  9. SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.

    PubMed

    Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman

    2017-03-01

    We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).

  10. Computer-delivered interventions for reducing alcohol consumption: meta-analysis and meta-regression using behaviour change techniques and theory.

    PubMed

    Black, Nicola; Mullan, Barbara; Sharpe, Louise

    2016-09-01

    The current aim was to examine the effectiveness of behaviour change techniques (BCTs), theory and other characteristics in increasing the effectiveness of computer-delivered interventions (CDIs) to reduce alcohol consumption. Included were randomised studies with a primary aim of reducing alcohol consumption, which compared self-directed CDIs to assessment-only control groups. CDIs were coded for the use of 42 BCTs from an alcohol-specific taxonomy, the use of theory according to a theory coding scheme and general characteristics such as length of the CDI. Effectiveness of CDIs was assessed using random-effects meta-analysis and the association between the moderators and effect size was assessed using univariate and multivariate meta-regression. Ninety-three CDIs were included in at least one analysis and produced small, significant effects on five outcomes (d+ = 0.07-0.15). Larger effects occurred with some personal contact, provision of normative information or feedback on performance, prompting commitment or goal review, the social norms approach and in samples with more women. Smaller effects occurred when information on the consequences of alcohol consumption was provided. These findings can be used to inform both intervention and theory development. Intervention developers should focus on including specific, effective techniques, rather than many techniques or more elaborate approaches.

  11. Combined data preprocessing and multivariate statistical analysis characterizes fed-batch culture of mouse hybridoma cells for rational medium design.

    PubMed

    Selvarasu, Suresh; Kim, Do Yun; Karimi, Iftekhar A; Lee, Dong-Yup

    2010-10-01

    We present an integrated framework for characterizing fed-batch cultures of mouse hybridoma cells producing monoclonal antibody (mAb). This framework systematically combines data preprocessing, elemental balancing and statistical analysis techniques. Initially, specific rates of cell growth, glucose/amino acid consumptions and mAb/metabolite productions were calculated via curve fitting using logistic equations, with subsequent elemental balancing of the preprocessed data indicating the presence of experimental measurement errors. Multivariate statistical analysis was then employed to understand physiological characteristics of the cellular system. The results from principal component analysis (PCA) revealed three major clusters of amino acids with similar trends in their consumption profiles: (i) arginine, threonine and serine, (ii) glycine, tyrosine, phenylalanine, methionine, histidine and asparagine, and (iii) lysine, valine and isoleucine. Further analysis using partial least squares (PLS) regression identified key amino acids which were positively or negatively correlated with the cell growth, mAb production and the generation of lactate and ammonia. Based on these results, the optimal concentrations of key amino acids in the feed medium can be inferred, potentially leading to an increase in cell viability and productivity, as well as a decrease in toxic waste production. The study demonstrated how the current methodological framework using multivariate statistical analysis techniques can serve as a potential tool for deriving rational medium design strategies. Copyright © 2010 Elsevier B.V. All rights reserved.
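
    To illustrate how PLS regression can flag predictors as positively or negatively associated with a response, here is a minimal single-component PLS1 sketch. It is not the authors' implementation; the function name and the toy data are hypothetical, and a real analysis would normally use several components and scaled variables.

```python
import math


def pls1_one_component(X, y):
    """Return regression coefficients of a one-component PLS1 fit.

    X is a list of rows (predictors), y a list of responses.
    Both are mean-centered internally; no scaling is applied.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between X and y
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("predictors have zero covariance with the response")
    w = [v / norm for v in w]
    # Scores: projection of each sample onto the weight vector
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the response on the scores
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return [wj * q for wj in w]


# Hypothetical data: the first predictor rises with the response, the second falls
X_demo = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]
y_demo = [2.0, 4.0, 6.0]
for name, b in zip(["predictor_1", "predictor_2"], pls1_one_component(X_demo, y_demo)):
    print(name, "positive" if b > 0 else "negative", round(b, 3))
```

    The sign of each coefficient gives the direction of the association; its magnitude is only comparable across predictors when they share a common scale.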

  12. Digital soil classification and elemental mapping using imaging Vis-NIR spectroscopy: How to explicitly quantify stagnic properties of a Luvisol under Norway spruce

    NASA Astrophysics Data System (ADS)

    Kriegs, Stefanie; Buddenbaum, Henning; Rogge, Derek; Steffens, Markus

    2015-04-01

    Laboratory imaging Vis-NIR spectroscopy of soil profiles is a novel technique in soil science that can determine quantity and quality of various chemical soil properties with a hitherto unreached spatial resolution in undisturbed soil profiles. We have applied this technique to soil cores in order to get quantitative proof of redoximorphic processes under two different tree species and to prove tree-soil interactions at the microscale. Due to the imaging capabilities of Vis-NIR spectroscopy a spatially explicit understanding of soil processes and properties can be achieved. Spatial heterogeneity of the soil profile can be taken into account. We took six 30 cm long rectangular soil columns of adjacent Luvisols derived from Quaternary aeolian sediments (Loess) in a forest soil near Freising/Bavaria using stainless steel boxes (100×100×300 mm). Three profiles were sampled under Norway spruce and three under European beech. A hyperspectral camera (VNIR, 400-1000 nm in 160 spectral bands) with spatial resolution of 63×63 µm² per pixel was used for data acquisition. Reference samples were taken at representative spots and analysed for organic carbon (OC) quantity and quality with a CN elemental analyser and for iron oxides (Fe) content using dithionite extraction followed by ICP-OES measurement. We compared two supervised classification algorithms, Spectral Angle Mapper and Maximum Likelihood, using different sets of training areas and spectral libraries. As established in chemometrics we used multivariate analysis such as partial least-squares regression (PLSR) in addition to multivariate adaptive regression splines (MARS) to correlate chemical data with Vis-NIR spectra. As a result elemental mapping of Fe and OC within the soil core at high spatial resolution has been achieved. The regression model was validated by a new set of reference samples for chemical analysis. Digital soil classification easily visualizes soil properties within the soil profiles. By combining both techniques, detailed soil maps, elemental balances and a deeper understanding of soil forming processes at the microscale become feasible for complete soil profiles.

  13. Using Multivariate Regression Model with Least Absolute Shrinkage and Selection Operator (LASSO) to Predict the Incidence of Xerostomia after Intensity-Modulated Radiotherapy for Head and Neck Cancer

    PubMed Central

    Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan

    2014-01-01

    Purpose: The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Methods and Materials: Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R², chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Results: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R² was satisfactory and corresponded well with the expected values. Conclusions: Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT. PMID:24586971
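
    The way an L1 (LASSO) penalty selects factors in a logistic model can be sketched with a small proximal-gradient loop. This is an illustration under stated assumptions, not the study's procedure: the solver (iterative soft-thresholding), the step size, the function names and the toy data are all hypothetical, and the bootstrapping step described in the abstract is omitted.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_logistic(X, y, lam, step=0.1, iters=2000):
    """Fit an L1-penalized logistic regression; the intercept is not penalized."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        probs = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row))) for row in X]
        resid = [pi - yi for pi, yi in zip(probs, y)]
        grad0 = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b0 -= step * grad0
        # gradient step followed by soft-thresholding drives weak factors to exactly zero
        beta = [soft_threshold(bj - step * gj, step * lam) for bj, gj in zip(beta, grad)]
    return b0, beta


# Hypothetical standardized factors (columns) and binary outcomes
X_toy = [[-2.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [2.0, -1.0], [-1.5, 1.0], [1.5, -1.0]]
y_toy = [0, 0, 1, 1, 0, 1]
for lam in (0.05, 0.3, 10.0):
    _, coefs = lasso_logistic(X_toy, y_toy, lam)
    print(lam, [round(c, 3) for c in coefs])
```

    Increasing `lam` removes more factors from the model; repeating the fit on bootstrap resamples and counting how often each factor survives is one common way to judge selection stability.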

  14. Using multivariate regression model with least absolute shrinkage and selection operator (LASSO) to predict the incidence of Xerostomia after intensity-modulated radiotherapy for head and neck cancer.

    PubMed

    Lee, Tsair-Fwu; Chao, Pei-Ju; Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan

    2014-01-01

    The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3(+) xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R(2), chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R(2) was satisfactory and corresponded well with the expected values. 
Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT.

  15. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    PubMed

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  16. Multivariate regression model for predicting lumber grade volumes of northern red oak sawlogs

    Treesearch

    Daniel A. Yaussy; Robert L. Brisbin

    1983-01-01

    A multivariate regression model was developed to predict green board-foot yields for the seven common factory lumber grades processed from northern red oak (Quercus rubra L.) factory grade logs. The model uses the standard log measurements of grade, scaling diameter, length, and percent defect. It was validated with an independent data set. The model...

  17. Predictive and mechanistic multivariate linear regression models for reaction development

    PubMed Central

    Santiago, Celine B.; Guo, Jing-Yao

    2018-01-01

    Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711

  18. Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs

    Treesearch

    Andrew F. Howard; Daniel A. Yaussy

    1986-01-01

    A multivariate regression model was developed to predict green board-foot yields for the common grades of factory lumber processed from yellow birch factory-grade logs. The model incorporates the standard log measurements of scaling diameter, length, proportion of scalable defects, and the assigned USDA Forest Service log grade. Differences in yields between band and...

  19. G/SPLINES: A hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's genetic algorithm

    NASA Technical Reports Server (NTRS)

    Rogers, David

    1991-01-01

    G/SPLINES are a hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's Genetic Algorithm. In this hybrid, the incremental search is replaced by a genetic search. The G/SPLINE algorithm exhibits performance comparable to that of the MARS algorithm, requires fewer least squares computations, and allows significantly larger problems to be considered.

  20. Analytical framework for reconstructing heterogeneous environmental variables from mammal community structure.

    PubMed

    Louys, Julien; Meloro, Carlo; Elton, Sarah; Ditchfield, Peter; Bishop, Laura C

    2015-01-01

    We test the performance of two models that use mammalian communities to reconstruct multivariate palaeoenvironments. While both models exploit the correlation between mammal communities (defined in terms of functional groups) and arboreal heterogeneity, the first uses a multiple multivariate regression of community structure and arboreal heterogeneity, while the second uses a linear regression of the principal components of each ecospace. The success of these methods means the palaeoenvironment of a particular locality can be reconstructed in terms of the proportions of heavy, moderate, light, and absent tree canopy cover. The linear regression is less biased, and more precisely and accurately reconstructs heavy tree canopy cover than the multiple multivariate model. However, the multiple multivariate model performs better than the linear regression for all other canopy cover categories. Both models consistently perform better than randomly generated reconstructions. We apply both models to the palaeocommunity of the Upper Laetolil Beds, Tanzania. Our reconstructions indicate that there was very little heavy tree cover at this site (likely less than 10%), with the palaeo-landscape instead comprising a mixture of light and absent tree cover. These reconstructions help resolve the previous conflicting palaeoecological reconstructions made for this site. Copyright © 2014 Elsevier Ltd. All rights reserved.

  1. Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

    NASA Astrophysics Data System (ADS)

    Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

    2016-03-01

    Over the past decade, computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations, facial, vocal, gestural, physiological, and central nervous signals. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method named higher-order multivariable polynomial regression to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method is able to obtain efficient correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide certain indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states.
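
    A higher-order multivariable polynomial regression can be sketched as ordinary least squares on monomial features. The code below is a generic illustration, not the authors' model: the function names are hypothetical, the data are synthetic rather than skin-conductance measurements, and the normal equations are solved directly, which is adequate only for small, well-conditioned problems.

```python
from itertools import combinations_with_replacement


def poly_features(row, degree):
    """All monomials of the inputs up to the given total degree, constant first."""
    feats = [1.0]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(row)), d):
            term = 1.0
            for j in combo:
                term *= row[j]
            feats.append(term)
    return feats


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_poly(X, y, degree):
    """Least-squares coefficients via the normal equations (Phi^T Phi) c = Phi^T y."""
    Phi = [poly_features(r, degree) for r in X]
    p = len(Phi[0])
    A = [[sum(ph[i] * ph[j] for ph in Phi) for j in range(p)] for i in range(p)]
    b = [sum(ph[i] * yi for ph, yi in zip(Phi, y)) for i in range(p)]
    return solve(A, b)


def predict(coef, row, degree):
    return sum(c * f for c, f in zip(coef, poly_features(row, degree)))


# Synthetic example: two inputs, response with an interaction term
X_syn = [[a, b] for a in (0.0, 1.0, 2.0) for b in (0.0, 1.0, 2.0)]
y_syn = [1 + 2 * a + 3 * a * b for a, b in X_syn]
coef = fit_poly(X_syn, y_syn, degree=2)
print([round(c, 6) for c in coef])
```

    The number of monomials grows quickly with the degree and the number of inputs, so higher orders need correspondingly more samples to avoid overfitting.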

  2. Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

    PubMed Central

    Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

    2016-01-01

    Over the past decade, computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations, facial, vocal, gestural, physiological, and central nervous signals. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method named higher-order multivariable polynomial regression to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method is able to obtain efficient correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide certain indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states. PMID:26996254

  3. [Predicting the probability of development and progression of primary open angle glaucoma by regression modeling].

    PubMed

    Likhvantseva, V G; Sokolov, V A; Levanova, O N; Kovelenova, I V

    2018-01-01

    Prediction of the clinical course of primary open-angle glaucoma (POAG) is one of the main directions in solving the problem of vision loss prevention and stabilization of the pathological process. Simple statistical methods of correlation analysis show the extent of each risk factor's impact, but do not indicate the total impact of these factors in personalized combinations. The relationships between the risk factors are subject to correlation and regression analysis. The regression equation represents the dependence of the mathematical expectation of the resulting sign on the combination of factor signs. The aim was to develop a technique for predicting the probability of development and progression of primary open-angle glaucoma based on a personalized combination of risk factors by linear multivariate regression analysis. The study included 66 patients (23 female and 43 male; 132 eyes) with newly diagnosed primary open-angle glaucoma. The control group consisted of 14 patients (8 male and 6 female). Standard ophthalmic examination was supplemented with biochemical study of lacrimal fluid. Concentration of matrix metalloproteinase MMP-2 and MMP-9 in tear fluid in both eyes was determined using 'sandwich' enzyme-linked immunosorbent assay (ELISA) method. The study resulted in the development of regression equations and step-by-step multivariate logistic models that can help calculate the risk of development and progression of POAG. Those models are based on expert evaluation of clinical and instrumental indicators of hydrodynamic disturbances (coefficient of outflow ease - C, volume of intraocular fluid secretion - F, fluctuation of intraocular pressure), as well as personalized morphometric parameters of the retina (central retinal thickness in the macular area) and concentration of MMP-2 and MMP-9 in the tear film. The newly developed regression equations are highly informative and can be a reliable tool for studying the influence vector and assessing the pathogenic potential of the independent risk factors in specific personalized combinations.

  4. Regression Models For Multivariate Count Data

    PubMed Central

    Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

    2016-01-01

    Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data. PMID:28348500

  5. Regression Models For Multivariate Count Data.

    PubMed

    Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

    2017-01-01

    Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.

  6. Proton radius from electron scattering data

    NASA Astrophysics Data System (ADS)

    Higinbotham, Douglas W.; Kabir, Al Amin; Lin, Vincent; Meekins, David; Norum, Blaine; Sawatzky, Brad

    2016-05-01

    Background: The proton charge radius extracted from recent muonic hydrogen Lamb shift measurements is significantly smaller than that extracted from atomic hydrogen and electron scattering measurements. The discrepancy has become known as the proton radius puzzle. Purpose: In an attempt to understand the discrepancy, we review high-precision electron scattering results from Mainz, Jefferson Lab, Saskatoon, and Stanford. Methods: We make use of stepwise regression techniques using the F test as well as the Akaike information criterion to systematically determine the predictive variables to use for a given set and range of electron scattering data as well as to provide multivariate error estimates. Results: Starting with the precision, low four-momentum transfer (Q^2) data from Mainz (1980) and Saskatoon (1974), we find that a stepwise regression of the Maclaurin series using the F test as well as the Akaike information criterion justify using a linear extrapolation which yields a value for the proton radius that is consistent with the result obtained from muonic hydrogen measurements. Applying the same Maclaurin series and statistical criteria to the 2014 Rosenbluth results on G_E from Mainz, we again find that the stepwise regression tends to favor a radius consistent with the muonic hydrogen radius but produces results that are extremely sensitive to the range of data included in the fit. Making use of the high-Q^2 data on G_E to select functions which extrapolate to high Q^2, we find that a Padé (N = M = 1) statistical model works remarkably well, as does a dipole function with a 0.84 fm radius, G_E(Q^2) = (1 + Q^2/0.66 GeV^2)^(-2). Conclusions: Rigorous applications of stepwise regression techniques and multivariate error estimates result in the extraction of a proton charge radius that is consistent with the muonic hydrogen result of 0.84 fm; either from linear extrapolation of the extremely-low-Q^2 data or by use of the Padé approximant for extrapolation using a larger range of data. Thus, based on a purely statistical analysis of electron scattering data, we conclude that the electron scattering results and the muonic hydrogen results are consistent. It is the atomic hydrogen results that are the outliers.
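
    The link between the quoted dipole parameter (0.66 GeV^2) and the 0.84 fm radius can be checked with a short calculation. The sketch assumes the standard definition of the charge radius from the slope of the form factor at zero momentum transfer, r^2 = -6 dG_E/dQ^2 at Q^2 = 0, which the abstract does not spell out; the function names are hypothetical.

```python
import math

HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV*fm, converts GeV^-1 to fm


def dipole_ge(q2, lam2=0.66):
    """Dipole form factor G_E(Q^2) = (1 + Q^2/lam2)^(-2), with Q^2 and lam2 in GeV^2."""
    return (1.0 + q2 / lam2) ** -2


def radius_from_slope(slope_per_gev2):
    """Charge radius in fm from the slope dG_E/dQ^2 at Q^2 = 0 (in GeV^-2)."""
    return math.sqrt(-6.0 * slope_per_gev2) * HBARC_GEV_FM


# Analytic slope of the dipole at Q^2 = 0 is -2/lam2
r_analytic = radius_from_slope(-2.0 / 0.66)

# Cross-check with a forward finite difference of the form factor
h = 1e-6
r_numeric = radius_from_slope((dipole_ge(h) - dipole_ge(0.0)) / h)

print(f"analytic: {r_analytic:.4f} fm, finite difference: {r_numeric:.4f} fm")
```

    A linear extrapolation of low-Q^2 data estimates the same slope directly from the fitted first-order coefficient of the Maclaurin series.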

  7. What Is the Role of Apelin regarding Cardiovascular Risk and Progression of Renal Disease in Type 2 Diabetic Patients with Diabetic Nephropathy?

    PubMed Central

    Fragoso, André; Silva, Claudia; Viegas, Carla; Tavares, Nelson; Guilherme, Patrícia; Santos, Nélio; Rato, Fátima; Camacho, Ana; Cavaco, Cidália; Pereira, Victor; Faísca, Marilia; Ataíde, João; Jesus, Ilídio; Neves, Pedro

    2013-01-01

    Aims. To evaluate the association of different apelin levels with cardiovascular mortality, hospitalization, renal function, and cardiovascular risk factors in type 2 diabetic patients with mild to moderate CKD. Methods. An observational, prospective study involving 150 patients divided into groups according to baseline apelin levels: 1 ≤ 98 pg/mL, 2 = 98–328 pg/mL, and 3 ≥ 329 pg/mL. Baseline characteristics were analyzed and compared. Multivariate Cox regression was used to find out predictors of cardiovascular mortality, and multivariate logistic regression was used to find out predictors of hospitalization and disease progression. Simple linear regressions and Pearson correlations were used to investigate correlations between apelin and renal disease and cardiovascular risk factors. Results. Patients' survival at 83 months in groups 1, 2, and 3 was 39%, 40%, and 71.2%, respectively (P = 0.046). Apelin, age, and eGFR were independent predictors of mortality, and apelin, creatinine, eGFR, resistin, and visfatin were independent predictors of hospitalization. Apelin levels were negatively correlated with cardiovascular risk factors and positively correlated with eGFR. Patients with lower apelin levels were more likely to start a depurative technique. Conclusions. Apelin levels might have a significant clinical use as a marker/predictor of cardiovascular mortality and hospitalization or even as a therapeutic agent for CKD patients with cardiovascular disease. PMID:24089668

  8. Comparative study between derivative spectrophotometry and multivariate calibration as analytical tools applied for the simultaneous quantitation of Amlodipine, Valsartan and Hydrochlorothiazide.

    PubMed

    Darwish, Hany W; Hassan, Said A; Salem, Maissa Y; El-Zeany, Badr A

    2013-09-01

    Four simple, accurate and specific methods were developed and validated for the simultaneous estimation of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in commercial tablets. The derivative spectrophotometric methods include Derivative Ratio Zero Crossing (DRZC) and Double Divisor Ratio Spectra-Derivative Spectrophotometry (DDRS-DS) methods, while the multivariate calibrations used are Principal Component Regression (PCR) and Partial Least Squares (PLSs). The proposed methods were applied successfully in the determination of the drugs in laboratory-prepared mixtures and in commercial pharmaceutical preparations. The validity of the proposed methods was assessed using the standard addition technique. The linearity of the proposed methods is investigated in the range of 2-32, 4-44 and 2-20 μg/mL for AML, VAL and HCT, respectively. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Optical scatterometry of quarter-micron patterns using neural regression

    NASA Astrophysics Data System (ADS)

    Bischoff, Joerg; Bauer, Joachim J.; Haak, Ulrich; Hutschenreuther, Lutz; Truckenbrodt, Horst

    1998-06-01

    With shrinking dimensions and increasing chip areas, a rapid and non-destructive full wafer characterization after every patterning cycle is an inevitable necessity. In former publications it was shown that Optical Scatterometry (OS) has the potential to push the attainable feature limits of optical techniques from 0.8 to 0.5 microns for imaging methods down to 0.1 micron and below. Thus the demands of future metrology can be met. Basically being a nonimaging method, OS combines light scatter (or diffraction) measurements with modern data analysis schemes to solve the inverse scatter issue. For very fine patterns with lambda-to-pitch ratios greater than one, the specularly reflected light versus the incidence angle is recorded. Usually, the data analysis comprises two steps -- a training cycle connected to a rigorous forward modeling, and the prediction itself. Until now, two data analysis schemes have usually been applied -- the multivariate regression based Partial Least Squares method (PLS) and a look-up-table technique which is also referred to as Minimum Mean Square Error approach (MMSE). Both methods are afflicted with serious drawbacks. On the one hand, the prediction accuracy of multivariate regression schemes degrades with larger parameter ranges due to the linearization properties of the method. On the other hand, look-up-table methods are rather time consuming during prediction thus prolonging the processing time and reducing the throughput. An alternate method is an Artificial Neural Network (ANN) based regression which combines the advantages of multivariate regression and MMSE. Due to the versatility of a neural network, not only can its structure be adapted more properly to the scatter problem, but also the nonlinearity of the neuronal transfer functions mimics the nonlinear behavior of optical diffraction processes more adequately. In spite of these pleasant properties, the prediction speed of ANN regression is comparable with that of the PLS-method. In this paper, the viability and performance of ANN-regression will be demonstrated with the example of sub-quarter-micron resist metrology. To this end, 0.25 micrometer line/space patterns have been printed in positive photoresist by means of DUV projection lithography. In order to evaluate the total metrology chain from light scatter measurement through data analysis, a thorough modeling has been performed. Assuming a trapezoidal shape of the developed resist profile, a training data set was generated by means of the Rigorous Coupled Wave Approach (RCWA). After training the model, a second data set was computed and deteriorated by Gaussian noise to imitate real measuring conditions. Then, these data have been fed into the models established before resulting in a Standard Error of Prediction (SEP) which corresponds to the measuring accuracy. Even with only little effort put into the design of a back-propagation network, the ANN is clearly superior to the PLS-method. Depending on whether a network with one or two hidden layers was used, accuracy gains between 2 and 5 can be achieved compared with PLS regression. Furthermore, the ANN is less noise sensitive, for there is only a doubling of the SEP at 5% noise for ANN whereas for PLS the accuracy degrades rapidly with increasing noise. The accuracy gain also depends on the light polarization and on the measured parameters. Finally, these results have been proven experimentally, where the OS-results are in good accordance with the profiles obtained from cross-sectioning micrographs.

  10. Assessing the response of area burned to changing climate in western boreal North America using a Multivariate Adaptive Regression Splines (MARS) approach

    Treesearch

    Michael S. Balshi; A. David McGuire; Paul Duffy; Mike Flannigan; John Walsh; Jerry Melillo

    2009-01-01

    We developed temporally and spatially explicit relationships between air temperature and fuel moisture codes derived from the Canadian Fire Weather Index System to estimate annual area burned at 2.5° (latitude × longitude) resolution using a Multivariate Adaptive Regression Spline (MARS) approach across Alaska and Canada. Burned area was...

  11. Predictors of condom use and refusal among the population of Free State province in South Africa

    PubMed Central

    2012-01-01

    Background: This study investigated the extent and predictors of condom use and condom refusal in the Free State province in South Africa. Methods: Through a household survey conducted in the Free State province of South Africa, 5,837 adults were interviewed. Univariate and multivariate survey logistic regressions and classification trees (CT) were used for analysing two response variables ‘ever used condom’ and ‘ever refused condom’. Results: Eighty-three per cent of the respondents had ever used condoms, of which 38% always used them; 61% used them during the last sexual intercourse and 9% had ever refused to use them. The univariate logistic regression models and CT analysis indicated that a strong predictor of condom use was its perceived need. In the CT analysis, this variable was followed in importance by ‘knowledge of correct use of condom’, condom availability, young age, being single and higher education. ‘Perceived need’ for condoms did not remain significant in the multivariate analysis after controlling for other variables. The strongest predictor of condom refusal, as shown by the CT, was shame associated with condoms followed by the presence of sexual risk behaviour, knowing one’s HIV status, older age and lacking knowledge of condoms (i.e., ability to prevent sexually transmitted diseases and pregnancy, availability, correct and consistent use and existence of female condoms). In the multivariate logistic regression, age was not significant for condom refusal while affordability and perceived need were additional significant variables. Conclusions: The use of complementary modelling techniques such as CT in addition to logistic regressions adds to a better understanding of condom use and refusal. Further improvement in correct and consistent use of condoms will require targeted interventions. 
In addition to existing social marketing campaigns, tailored approaches should focus on establishing the perceived need for condom-use and improving skills for correct use. They should also incorporate interventions to reduce the shame associated with condoms and individual counselling of those likely to refuse condoms. PMID:22639964

  12. Importance of Preserving Cross-correlation in developing Statistically Downscaled Climate Forcings and in estimating Land-surface Fluxes and States

    NASA Astrophysics Data System (ADS)

    Das Bhowmik, R.; Arumugam, S.

    2015-12-01

    Multivariate downscaling techniques exhibited superiority over univariate regression schemes in terms of preserving cross-correlations between multiple variables (precipitation and temperature) from GCMs. This study focuses on two aspects: (a) developing an analytical solution for estimating biases in cross-correlations from univariate downscaling approaches and (b) quantifying the uncertainty in land-surface states and fluxes due to biases in cross-correlations in downscaled climate forcings. Both these aspects are evaluated using climate forcings available from both historical climate simulations and CMIP5 hindcasts over the entire US. The analytical solution basically relates the univariate regression parameters, the coefficient of determination of the regression and the covariance ratio between GCM and downscaled values. The analytical solutions are compared with the downscaled univariate forcings by choosing the desired p-value (Type-1 error) in preserving the observed cross-correlation. For quantifying the impacts of biases in cross-correlation on estimating streamflow and groundwater, we corrupt the downscaled climate forcings with different cross-correlation structures.

  13. Adulteration of Argentinean milk fats with animal fats: Detection by fatty acids analysis and multivariate regression techniques.

    PubMed

    Rebechi, S R; Vélez, M A; Vaira, S; Perotti, M C

    2016-02-01

    The aims of the present study were to test the accuracy of the fatty acid ratios established by the Argentinean Legislation to detect adulterations of milk fat with animal fats and to propose a regression model suitable to evaluate these adulterations. For this purpose, 70 milk fat, 10 tallow and 7 lard fat samples were collected and analyzed by gas chromatography. Data was utilized to simulate arithmetically adulterated milk fat samples at 0%, 2%, 5%, 10% and 15%, for both animal fats. The fatty acids ratios failed to distinguish adulterated milk fats containing less than 15% of tallow or lard. For each adulterant, Multiple Linear Regression (MLR) was applied, and a model was chosen and validated. For that, calibration and validation matrices were constructed employing genuine and adulterated milk fat samples. The models were able to detect adulterations of milk fat at levels greater than 10% for tallow and 5% for lard. Copyright © 2015 Elsevier Ltd. All rights reserved.
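
    The arithmetic simulation of adulterated samples mentioned above can be illustrated with a short sketch. It assumes that an adulterated profile is a linear mass-fraction blend of the genuine milk fat profile and the adulterant profile, which is one plausible reading of "simulate arithmetically"; the fatty acid values below are invented for illustration and are not data from the study.

```python
def adulterate(milk_profile, adulterant_profile, fraction):
    """Blend two fatty acid profiles (g/100 g fat) at a given adulterant mass fraction."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return {
        fatty_acid: (1.0 - fraction) * milk_profile[fatty_acid]
        + fraction * adulterant_profile[fatty_acid]
        for fatty_acid in milk_profile
    }

# Hypothetical compositions, chosen only to make the arithmetic visible.
milk = {"C4:0": 3.5, "C16:0": 30.0, "C18:1": 22.0}
tallow = {"C4:0": 0.0, "C16:0": 26.0, "C18:1": 38.0}

# Adulteration levels named in the abstract: 0%, 2%, 5%, 10% and 15%.
levels = [0.0, 0.02, 0.05, 0.10, 0.15]
simulated = {level: adulterate(milk, tallow, level) for level in levels}
```

    Each simulated profile could then serve as a row of a calibration or validation matrix, with the adulteration level as the response to be regressed.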

  14. The Covariance Adjustment Approaches for Combining Incomparable Cox Regressions Caused by Unbalanced Covariates Adjustment: A Multivariate Meta-Analysis Study.

    PubMed

    Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi

    2015-01-01

    Univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least squares (MGLS) method as a multivariate meta-analysis approach. We evaluated the efficiency of four new approaches including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC) on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazards model coefficients in a simulation study. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to the EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. This study highlights advantages of MGLS meta-analysis over the UM approach. The results suggested the use of the MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients.

  15. Multivariate Boosting for Integrative Analysis of High-Dimensional Cancer Genomic Data

    PubMed Central

    Xiong, Lie; Kuan, Pei-Fen; Tian, Jianan; Keles, Sunduz; Wang, Sijian

    2015-01-01

    In this paper, we propose a novel multivariate component-wise boosting method for fitting multivariate response regression models under the high-dimension, low sample size setting. Our method is motivated by modeling the association among different biological molecules based on multiple types of high-dimensional genomic data. Particularly, we are interested in two applications: studying the influence of DNA copy number alterations on RNA transcript levels and investigating the association between DNA methylation and gene expression. For this purpose, we model the dependence of the RNA expression levels on DNA copy number alterations and the dependence of gene expression on DNA methylation through multivariate regression models and utilize boosting-type method to handle the high dimensionality as well as model the possible nonlinear associations. The performance of the proposed method is demonstrated through simulation studies. Finally, our multivariate boosting method is applied to two breast cancer studies. PMID:26609213

  16. Grade of hypospadias is the only factor predicting for re-intervention after primary hypospadias repair: a multivariate analysis from a cohort of 474 patients.

    PubMed

    Spinoit, Anne-Françoise; Poelaert, Filip; Van Praet, Charles; Groen, Luitzen-Albert; Van Laecke, Erik; Hoebeke, Piet

    2015-04-01

    There is an ongoing quest on how to minimize complications in hypospadias surgery. There is however a lack of high-quality data on the following parameters that might influence the outcome of primary hypospadias repair: age at initial surgery, the type of suture material, the initial technique, and the type of hypospadias. The objective of this study was to identify independent predictors for re-intervention in primary hypospadias repair. We retrospectively analyzed our database of 474 children undergoing primary hypospadias surgery. Univariate and multivariate logistic regression was performed to identify variables associated with re-intervention. A p-value <0.05 was considered statistically significant and therefore considered as a prognostic factor for re-intervention. Distal penile hypospadias was reported in 77.2% (n = 366), midpenile in 11.4% (n = 54) and proximal in 11.4% (n = 54) of children. Initial repair was based on an incised plate technique in 39.9% (n = 189), meatal advancement in 36.0% (n = 171), an onlay flap in 17.3% (n = 82) and other or combined techniques in 5.3% (n = 25). Re-intervention was required in 114 patients (24.1%), of which 54 re-interventions (47.4%) were performed within the first year post-surgery, 17 (14.9%) in the second year and 43 (37.7%) later than 2 years after initial surgery. The reason for the first re-intervention was fistula in 52 patients (46.4%), meatal stenosis in 32 (28.6%), cosmesis in 35 (31.3%) and other in 14 (12.5%). The median time for re-intervention was 14 months after surgery [range 0-114]. Significant predictors for re-intervention on univariate logistic regression (polyglactin suture material versus poliglecaprone, proximal hypospadias, lower age at operation and other than meatal advancement repair) were put in a multivariate logistic regression model. Of all significant variables, only proximal hypospadias remained an independent predictor for re-intervention (OR 3.27; p = 0.012). 
The grade of hypospadias remains according to our retrospective analysis the only objective independent predicting factor for re-intervention in hypospadias surgery. This finding is rather obvious for everyone operating hypospadias. Curiously midpenile hypospadias cases were doing slightly better than distal hypospadias in terms of re-intervention rates. Our study however has also some shortcomings. First of all, data was gathered retrospectively and follow-up time was ill-balanced for several variables. We tried to correct this by applying sensitivity analysis, but possible associations between some variables and re-intervention might still be obscured by this. Standard questionnaires to analyze surgical outcome were not available. Therefore, we focused our analysis on re-intervention rate as this is a hard and clinically relevant end point. This retrospective analysis of a large hypospadias database with long-term follow-up indicates that the long-lasting debate about factors influencing the reoperation rate in hypospadias surgery might be futile: in experienced hands, the only variable that independently predicts for re-intervention is the severity of hypospadias, the only factor we cannot modify. This retrospective multivariate analysis of a large hypospadias database with long-term follow-up suggests that the only significant independent predictive factor for re-intervention is proximal hypospadias. In our series, technique did not influence the re-intervention rate. Copyright © 2015 Journal of Pediatric Urology Company. Published by Elsevier Ltd. All rights reserved.

  17. Estimation of soil cation exchange capacity using Genetic Expression Programming (GEP) and Multivariate Adaptive Regression Splines (MARS)

    NASA Astrophysics Data System (ADS)

    Emamgolizadeh, S.; Bateni, S. M.; Shahsavani, D.; Ashrafi, T.; Ghorbani, H.

    2015-10-01

    The soil cation exchange capacity (CEC) is one of the main soil chemical properties, which is required in various fields such as environmental and agricultural engineering as well as soil science. In situ measurement of CEC is time consuming and costly. Hence, numerous studies have used traditional regression-based techniques to estimate CEC from more easily measurable soil parameters (e.g., soil texture, organic matter (OM), and pH). However, these models may not be able to adequately capture the complex and highly nonlinear relationship between CEC and its influential soil variables. In this study, Genetic Expression Programming (GEP) and Multivariate Adaptive Regression Splines (MARS) were employed to estimate CEC from more readily measurable soil physical and chemical variables (e.g., OM, clay, and pH) by developing functional relations. The GEP- and MARS-based functional relations were tested at two field sites in Iran. Results showed that GEP and MARS can provide reliable estimates of CEC. Also, it was found that the MARS model (with root-mean-square-error (RMSE) of 0.318 Cmol+ kg-1 and correlation coefficient (R2) of 0.864) generated slightly better results than the GEP model (with RMSE of 0.270 Cmol+ kg-1 and R2 of 0.807). The performance of GEP and MARS models was compared with two existing approaches, namely artificial neural network (ANN) and multiple linear regression (MLR). The comparison indicated that MARS and GEP outperformed the MLR model, but they did not perform as well as ANN. Finally, a sensitivity analysis was conducted to determine the most and the least influential variables affecting CEC. It was found that OM and pH have the most and least significant effect on CEC, respectively.
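
    The two goodness-of-fit measures quoted above can be computed as in the sketch below. It is a generic illustration, not the authors' code: R2 is taken here as the coefficient of determination (1 minus residual over total sum of squares), whereas the abstract calls it a correlation coefficient and may mean the squared Pearson correlation; the CEC values are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical CEC measurements and model estimates (same units for both).
observed_cec = [10.0, 12.0, 14.0, 16.0]
predicted_cec = [10.5, 11.5, 14.5, 15.5]

model_rmse = rmse(observed_cec, predicted_cec)
model_r2 = r_squared(observed_cec, predicted_cec)
```

    A lower RMSE and a higher R2 both indicate a closer fit, so when the two measures rank models differently it is worth checking which one a "better results" claim rests on.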

  18. The PIT-trap-A "model-free" bootstrap procedure for inference about regression models with discrete, multivariate responses.

    PubMed

    Warton, David I; Thibaut, Loïc; Wang, Yi Alice

    2017-01-01

    Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping); common examples include logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
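
    A rough sketch of the resampling idea, as it can be read from the abstract, is given below. It is an interpretation and not the authors' implementation: it assumes Poisson marginals with already fitted means, uses the usual randomized PIT for discrete data (a uniform draw between F(y-1) and F(y)), resamples whole rows of PIT-residuals, and maps them back through each observation's own fitted quantile function. All function names and the count data are hypothetical.

```python
import math
import random

def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu); zero for negative k."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)

def poisson_quantile(u, mu, max_k=10000):
    """Smallest k with P(Y <= k) >= u (capped at max_k as a safety net)."""
    k = 0
    term = math.exp(-mu)
    total = term
    while total < u and k < max_k:
        k += 1
        term *= mu / k
        total += term
    return k

def pit_residual(y, mu, rng):
    """Randomized PIT-residual: uniform draw between F(y - 1) and F(y)."""
    lower = poisson_cdf(y - 1, mu)
    upper = poisson_cdf(y, mu)
    return lower + rng.random() * (upper - lower)

def pit_trap_sample(counts, fitted_means, rng):
    """Resample rows of PIT-residuals, then invert with each row's own fitted means."""
    residual_rows = [
        [pit_residual(y, mu, rng) for y, mu in zip(y_row, mu_row)]
        for y_row, mu_row in zip(counts, fitted_means)
    ]
    n = len(residual_rows)
    resampled = [residual_rows[rng.randrange(n)] for _ in range(n)]
    return [
        [poisson_quantile(u, mu) for u, mu in zip(u_row, mu_row)]
        for u_row, mu_row in zip(resampled, fitted_means)
    ]

# Hypothetical abundance counts (rows = sites, columns = species) and fitted means.
counts = [[0, 3], [2, 5], [1, 1], [4, 8]]
fitted_means = [[0.8, 2.5], [1.9, 5.2], [1.1, 1.4], [3.6, 7.5]]
bootstrap_counts = pit_trap_sample(counts, fitted_means, random.Random(0))
```

    Keeping each row of residuals intact is what carries the between-response correlation into the bootstrap sample without an explicit correlation model.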

  19. The PIT-trap—A “model-free” bootstrap procedure for inference about regression models with discrete, multivariate responses

    PubMed Central

    Thibaut, Loïc; Wang, Yi Alice

    2017-01-01

    Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)—common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of “model-free bootstrap”, adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods. PMID:28738071

  20. Finding structure in data using multivariate tree boosting

    PubMed Central

    Miller, Patrick J.; Lubke, Gitta H.; McArtor, Daniel B.; Bergeman, C. S.

    2016-01-01

    Technology and collaboration enable dramatic increases in the size of psychological and psychiatric data collections, but finding structure in these large data sets with many collected variables is challenging. Decision tree ensembles such as random forests (Strobl, Malley, & Tutz, 2009) are a useful tool for finding structure, but are difficult to interpret with multiple outcome variables which are often of interest in psychology. To find and interpret structure in data sets with multiple outcomes and many predictors (possibly exceeding the sample size), we introduce a multivariate extension to a decision tree ensemble method called gradient boosted regression trees (Friedman, 2001). Our extension, multivariate tree boosting, is a method for nonparametric regression that is useful for identifying important predictors, detecting predictors with nonlinear effects and interactions without specification of such effects, and for identifying predictors that cause two or more outcome variables to covary. We provide the R package ‘mvtboost’ to estimate, tune, and interpret the resulting model, which extends the implementation of univariate boosting in the R package ‘gbm’ (Ridgeway et al., 2015) to continuous, multivariate outcomes. To illustrate the approach, we analyze predictors of psychological well-being (Ryff & Keyes, 1995). Simulations verify that our approach identifies predictors with nonlinear effects and achieves high prediction accuracy, exceeding or matching the performance of (penalized) multivariate multiple regression and multivariate decision trees over a wide range of conditions. PMID:27918183

  1. Selecting minimum dataset soil variables using PLSR as a regressive multivariate method

    NASA Astrophysics Data System (ADS)

    Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.

    2017-04-01

    Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used; however, supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least squares regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data were derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. 
In addition, variable importance for projection (VIP) statistics were used to quantitatively assess the predictors most relevant for response variable estimation and then for variable selection (Andersen and Bro, 2010). PCA and SDA returned TOC and RFC as influential variables both on the set of chemical and physical data analyzed separately as well as on the whole dataset (Stellacci et al., 2016). Highly weighted variables in PCA were also TEC, followed by K, and AC, followed by Pmac and BD, in the first PC (41.2% of total variance); Olsen P and HA-FA in the second PC (12.6%), Ca in the third (10.6%) component. Variables enabling maximum discrimination among treatments for SDA were WEOC, on the whole dataset, humic substances, followed by Olsen P, EC and clay, in the separate data analyses. The highest PLS-VIP statistics were recorded for Olsen P and Pmac, followed by TOC, TEC, pH and Mg for chemical variables and clay, RFC and AC for the physical variables. Results show that different methods may provide different ranking of the selected variables and the presence of a response variable, in regressive techniques, may affect variable selection. Further investigation with different response variables and with multi-year datasets would make it possible to better define advantages and limits of single or combined approaches. Acknowledgment The work was supported by the projects "BIOTILLAGE, approcci innovative per il miglioramento delle performances ambientali e produttive dei sistemi cerealicoli no-tillage", financed by PSR-Basilicata 2007-2013, and "DESERT, Low-cost water desalination and sensor technology compact module" financed by ERANET-WATERWORKS 2014. References Andersen C.M. and Bro R., 2010. Variable selection in regression - a tutorial. Journal of Chemometrics, 24:728-737. Armenise et al., 2013. Developing a soil quality index to compare soil fitness for agricultural use under different managements in the mediterranean environment. 
Soil and Tillage Research, 130:91-98. de Paul Obade et al., 2016. A standardized soil quality index for diverse field conditions. Sci. Total Env. 541:424-434. Pulido Moncada et al., 2014. Data-driven analysis of soil quality indicators using limited data. Geoderma, 235:271-278. Stellacci et al., 2016. Comparison of different multivariate methods to select key soil variables for soil quality indices computation. XLV Congress of the Italian Society of Agronomy (SIA), Sassari, 20-22 September 2016.

  2. Use of partial least squares regression for the multivariate calibration of hazardous air pollutants in open-path FT-IR spectrometry

    NASA Astrophysics Data System (ADS)

    Hart, Brian K.; Griffiths, Peter R.

    1998-06-01

    Partial least squares (PLS) regression has been evaluated as a robust calibration technique for over 100 hazardous air pollutants (HAPs) measured by open path Fourier transform infrared (OP/FT-IR) spectrometry. PLS has the advantage over the current recommended calibration method of classical least squares (CLS), in that it can look at the whole useable spectrum (700-1300 cm-1, 2000-2150 cm-1, and 2400-3000 cm-1), and detect several analytes simultaneously. Up to one hundred HAPs synthetically added to OP/FT-IR backgrounds have been simultaneously calibrated and detected using PLS. PLS also has the advantage in requiring less preprocessing of spectra than that which is required in CLS calibration schemes, allowing PLS to provide user independent real-time analysis of OP/FT-IR spectra.

  3. Wavelet analysis for the study of the relations among soil radon anomalies, volcanic and seismic events: the case of Mt. Etna (Italy)

    NASA Astrophysics Data System (ADS)

    Ferrera, Elisabetta; Giammanco, Salvatore; Cannata, Andrea; Montalto, Placido

    2013-04-01

    From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol® probe located on the upper NE flank of Mt. Etna volcano, close either to the Piano Provenzana fault or to the NE-Rift. Seismic and volcanological data have been analyzed together with radon data. We also analyzed air and soil temperature, barometric pressure, snow and rain fall data. In order to find possible correlations among the above parameters, and hence to reveal possible anomalies in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R2 = 0.31). When using 100-day time windows, the R2 values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, similar to multivariate linear regression analysis, the summer period is characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the time series acquired. This approach makes it possible to study the relations among different signals either in time or frequency domain. It confirmed the results of the previous methods, but also made it possible to recognize correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitations, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Our work suggests that in order to make an accurate analysis of the relations among distinct signals it is necessary to use different techniques that give complementary analytical information. 
In particular, the wavelet analysis proved to be very effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.
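
    The moving-window analysis described above can be illustrated in a simplified form. The sketch below is a deliberate reduction, stated as an assumption: it uses a single environmental predictor and the squared Pearson correlation in each window, rather than the multivariate regression R2 reported in the abstract, and the two short series are invented.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def windowed_r2(x, y, window):
    """Squared correlation in every window of the given length, stepping by one sample."""
    return [
        pearson_r(x[i:i + window], y[i:i + window]) ** 2
        for i in range(len(x) - window + 1)
    ]

# Hypothetical daily series: radon follows temperature in the first half only.
soil_temperature = [float(t) for t in range(12)]
radon = [2.0 * t + 1.0 for t in soil_temperature[:6]] + [5.0, 9.0, 4.0, 8.0, 3.0, 7.0]
r2_by_window = windowed_r2(soil_temperature, radon, window=4)
```

    With real data the window would span about 100 daily samples, and a rise of the windowed value in one season would correspond to the summer maxima mentioned in the abstract.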

  4. Regional vertical total electron content (VTEC) modeling together with satellite and receiver differential code biases (DCBs) using semi-parametric multivariate adaptive regression B-splines (SP-BMARS)

    NASA Astrophysics Data System (ADS)

    Durmaz, Murat; Karslioglu, Mahmut Onur

    2015-04-01

    There are various global and regional methods that have been proposed for the modeling of ionospheric vertical total electron content (VTEC). Global distribution of VTEC is usually modeled by spherical harmonic expansions, while tensor products of compactly supported univariate B-splines can be used for regional modeling. In these empirical parametric models, the coefficients of the basis functions as well as differential code biases (DCBs) of satellites and receivers can be treated as unknown parameters which can be estimated from geometry-free linear combinations of global positioning system observables. In this work we propose a new semi-parametric multivariate adaptive regression B-splines (SP-BMARS) method for the regional modeling of VTEC together with satellite and receiver DCBs, where the parametric part of the model is related to the DCBs as fixed parameters and the non-parametric part adaptively models the spatio-temporal distribution of VTEC. The latter is based on multivariate adaptive regression B-splines which is a non-parametric modeling technique making use of compactly supported B-spline basis functions that are generated from the observations automatically. This algorithm takes advantage of an adaptive scale-by-scale model building strategy that searches for best-fitting B-splines to the data at each scale. The VTEC maps generated from the proposed method are compared numerically and visually with the global ionosphere maps (GIMs) which are provided by the Center for Orbit Determination in Europe (CODE). The VTEC values from SP-BMARS and CODE GIMs are also compared with VTEC values obtained through calibration using local ionospheric model. The estimated satellite and receiver DCBs from the SP-BMARS model are compared with the CODE distributed DCBs. The results show that the SP-BMARS algorithm can be used to estimate satellite and receiver DCBs while adaptively and flexibly modeling the daily regional VTEC.

  5. Is Heart Rate Variability Better Than Routine Vital Signs for Prehospital Identification of Major Hemorrhage

    DTIC Science & Technology

    2015-01-01

    different PRBC transfusion volumes. We performed multivariate regression analysis using HRV metrics and routine vital signs to test the hypothesis that...study sponsors did not have any role in the study design, data collection, analysis and interpretation of data, report writing, or the decision to...primary outcome was hemorrhagic injury plus different PRBC transfusion volumes. We performed multivariate regression analysis using HRV metrics and

  6. Use of chemometrics to compare NIR and HPLC for the simultaneous determination of drug levels in fixed-dose combination tablets employed in tuberculosis treatment.

    PubMed

    Teixeira, Kelly Sivocy Sampaio; da Cruz Fonseca, Said Gonçalves; de Moura, Luís Carlos Brigido; de Moura, Mario Luís Ribeiro; Borges, Márcia Herminia Pinheiro; Barbosa, Euzébio Guimaraes; De Lima E Moura, Túlio Flávio Accioly

    2018-02-05

    The World Health Organization recommends that TB treatment be administered using combination therapy. The methodologies for quantifying simultaneously associated drugs are highly complex, being costly, extremely time consuming and producing chemical residues harmful to the environment. The need to seek alternative techniques that minimize these drawbacks is widely discussed in the pharmaceutical industry. Therefore, the objective of this study was to develop and validate a multivariate calibration model in association with the near infrared spectroscopy technique (NIR) for the simultaneous determination of rifampicin, isoniazid, pyrazinamide and ethambutol. These models allow the quality control of these medicines to be optimized using simple, fast, low-cost techniques that produce no chemical waste. In the NIR-PLS method, spectra readings were acquired in the 10,000-4000 cm-1 range using an infrared spectrophotometer (IRPrestige-21, Shimadzu) with a resolution of 4 cm-1, 20 sweeps, under controlled temperature and humidity. For construction of the model, the central composite experimental design was employed on the program Statistica 13 (StatSoft Inc.). All spectra were treated by computational tools for multivariate analysis using partial least squares regression (PLS) on the software program Pirouette 3.11 (Infometrix, Inc.). Variable selections were performed by the QSAR modeling program. The models developed by NIR in association with multivariate analysis provided good prediction of the APIs for the external samples and were therefore validated. For the tablets, however, the slightly different quantitative compositions of excipients compared to the mixtures prepared for building the models led to results that were not statistically similar, despite having prediction errors considered acceptable in the literature. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. A multivariate geostatistical methodology to delineate areas of potential interest for future sedimentary gold exploration.

    PubMed

    Goovaerts, P; Albuquerque, Teresa; Antunes, Margarida

    2016-11-01

    This paper describes a multivariate geostatistical methodology to delineate areas of potential interest for future sedimentary gold exploration, with an application to an abandoned sedimentary gold mining region in Portugal. The main challenge was the existence of only a dozen gold measurements confined to the grounds of the old gold mines, which precluded the application of traditional interpolation techniques, such as cokriging. The analysis could, however, capitalize on 376 stream sediment samples that were analyzed for twenty-two elements. Gold (Au) was first predicted at all 376 locations using linear regression (R² = 0.798) and four metals (Fe, As, Sn and W), which are known to be mostly associated with the local gold's paragenesis. One hundred realizations of the spatial distribution of gold content were generated using sequential indicator simulation and a soft indicator coding of regression estimates, to supplement the hard indicator coding of gold measurements. Each simulated map then underwent a local cluster analysis to identify significant aggregates of low or high values. The one hundred classified maps were processed to derive the most likely classification of each simulated node and the associated probability of occurrence. Examining the distribution of the hot-spots and cold-spots reveals a clear enrichment in Au along the Erges River downstream from the old sedimentary mineralization.

  8. Calibration transfer of a Raman spectroscopic quantification method for the assessment of liquid detergent compositions from at-line laboratory to in-line industrial scale.

    PubMed

    Brouckaert, D; Uyttersprot, J-S; Broeckx, W; De Beer, T

    2018-03-01

    Calibration transfer or standardisation aims at creating a uniform spectral response on different spectroscopic instruments or under varying conditions, without requiring a full recalibration for each situation. In the current study, this strategy is applied to construct at-line multivariate calibration models and consequently employ them in-line in a continuous industrial production line, using the same spectrometer. Firstly, quantitative multivariate models are constructed at-line at laboratory scale for predicting the concentration of two main ingredients in hard surface cleaners. By regressing the Raman spectra of a set of small-scale calibration samples against their reference concentration values, partial least squares (PLS) models are developed to quantify the surfactant levels in the liquid detergent compositions under investigation. After evaluating the models' performance with a set of independent validation samples, a univariate slope/bias correction is applied in view of transporting these at-line calibration models to an in-line manufacturing set-up. This standardisation technique allows a fast and easy transfer of the PLS regression models, by simply correcting the model predictions on the in-line set-up, without adjusting anything to the original multivariate calibration models. An extensive statistical analysis is performed in order to assess the predictive quality of the transferred regression models. Before and after transfer, the R² and RMSEP of both models are compared to evaluate whether their magnitudes are similar. T-tests are then performed to investigate whether the slope and intercept of the transferred regression line are not statistically different from 1 and 0, respectively. Furthermore, it is inspected whether no significant bias can be noted. F-tests are executed as well, for assessing the linearity of the transfer regression line and for investigating the statistical coincidence of the transfer and validation regression line.
Finally, a paired t-test is performed to compare the original at-line model to the slope/bias corrected in-line model, using interval hypotheses. It is shown that the calibration models of Surfactant 1 and Surfactant 2 yield satisfactory in-line predictions after slope/bias correction. While Surfactant 1 passes seven out of eight statistical tests, the recommended validation parameters are 100% successful for Surfactant 2. It is hence concluded that the proposed strategy for transferring at-line calibration models to an in-line industrial environment via a univariate slope/bias correction of the predicted values offers a successful standardisation approach. Copyright © 2017 Elsevier B.V. All rights reserved.
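
    The univariate slope/bias correction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the correction is an ordinary least-squares line fitted between in-line model predictions and reference values, and the function names and the four data points are hypothetical.

```python
from statistics import mean

def fit_slope_bias(predicted, reference):
    """Fit reference ~ slope * predicted + bias by ordinary least squares."""
    p_bar, r_bar = mean(predicted), mean(reference)
    sxx = sum((p - p_bar) ** 2 for p in predicted)
    sxy = sum((p - p_bar) * (r - r_bar) for p, r in zip(predicted, reference))
    slope = sxy / sxx
    bias = r_bar - slope * p_bar
    return slope, bias

def apply_correction(predictions, slope, bias):
    """Correct model outputs only; the calibration model itself is untouched."""
    return [slope * p + bias for p in predictions]

# Hypothetical in-line predictions and matching reference concentrations.
inline_pred = [1.0, 2.0, 3.0, 4.0]
reference = [2.5, 4.5, 6.5, 8.5]

slope, bias = fit_slope_bias(inline_pred, reference)
corrected = apply_correction(inline_pred, slope, bias)
print(slope, bias, corrected)
```

    Only the predictions are rescaled, which is why the original multivariate model does not need to be refitted after transfer.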

  9. [Statistical prediction methods in violence risk assessment and its application].

    PubMed

    Liu, Yuan-Yuan; Hu, Jun-Mei; Yang, Min; Li, Xiao-Song

    2013-06-01

    Improving violence risk assessment is an urgent global problem. Statistical methods are a necessary part of risk assessment and have a marked influence on it. This study reviews the prediction methods used in violence risk assessment from a statistical point of view. The applications of logistic regression as an example of a multivariate statistical model, the decision tree model as an example of a data mining technique, and the neural network model as an example of artificial intelligence technology are all reviewed. This study provides data to contribute to further research on violence risk assessment.

  10. Long-Term Athletic Development in Youth Alpine Ski Racing: The Effect of Physical Fitness, Ski Racing Technique, Anthropometrics and Biological Maturity Status on Injuries

    PubMed Central

    Müller, Lisa; Hildebrandt, Carolin; Müller, Erich; Fink, Christian; Raschner, Christian

    2017-01-01

    Alpine ski racing is known to be a sport with a high risk of injuries. Because most studies have focused mainly on top-level athletes and on traumatic injuries, limited research exists about injury risk factors among youth ski racers. The aim of this study was to determine the intrinsic risk factors (anthropometrics, biological maturity, physical fitness, racing technique) for injury among youth alpine ski racers. Study participants were 81 youth ski racers attending a ski boarding school (50 males, 31 females; 9–14 years). A prospective longitudinal cohort design was used to monitor sports-related risk factors over two seasons and traumatic (TI) and overuse injuries (OI). At the beginning of the study, anthropometric characteristics (body height, body weight, sitting height, body mass index); biological maturity [status age at peak height velocity (APHV)]; physical performance parameters related to jump coordination, maximal leg and core strength, explosive and reactive strength, balance and endurance; and ski racing technique were assessed. Z score transformations normalized the age groups. Multivariate binary logistic regression (dependent variable: injury yes/no) and multivariate linear regression analyses (dependent variable: injury severity in total days of absence from training) were calculated. T-tests and multivariate analyses of variance were used to reveal differences between injured and non-injured athletes and between injury severity groups. The level of significance was set to p < 0.05. Relatively low rates of injuries were reported for both traumatic (0.63 TI/athlete) and overuse injuries (0.21 OI/athlete). Athletes with higher body weight, body height, and sitting height; lower APHV values; better core flexion strength; smaller core flexion:extension strength ratio; shorter drop jump contact time; and higher drop jump reactive strength index were at a lower injury risk or more vulnerable for fewer days of absence from training. 
However, significant differences between injured and non-injured athletes were only observed with respect to the drop jump reactive strength index. Regular documentation of anthropometric characteristics, biological maturity and physical fitness parameters is crucial to help to prevent injury in youth ski racing. The present findings suggest that neuromuscular training should be incorporated into the training regimen of youth ski racers to prevent injuries. PMID:28912731
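
    The within-age-group Z score transformation mentioned in this abstract can be illustrated with a short sketch. It assumes each value is standardised against the mean and sample standard deviation of its own age group; the group labels and jump heights are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def zscores_by_group(values, groups):
    """Standardise each value against the mean and SD of its own group."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    stats = {g: (mean(vs), stdev(vs)) for g, vs in by_group.items()}
    return [(v - stats[g][0]) / stats[g][1] for v, g in zip(values, groups)]

# Hypothetical jump heights (cm) for two age groups.
heights = [20, 22, 24, 30, 34, 38]
age_groups = ["U10", "U10", "U10", "U14", "U14", "U14"]

z = zscores_by_group(heights, age_groups)
print(z)
```

    After the transformation, athletes from different age groups are expressed on a common scale relative to their peers.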

  11. Long-Term Athletic Development in Youth Alpine Ski Racing: The Effect of Physical Fitness, Ski Racing Technique, Anthropometrics and Biological Maturity Status on Injuries.

    PubMed

    Müller, Lisa; Hildebrandt, Carolin; Müller, Erich; Fink, Christian; Raschner, Christian

    2017-01-01

    Alpine ski racing is known to be a sport with a high risk of injuries. Because most studies have focused mainly on top-level athletes and on traumatic injuries, limited research exists about injury risk factors among youth ski racers. The aim of this study was to determine the intrinsic risk factors (anthropometrics, biological maturity, physical fitness, racing technique) for injury among youth alpine ski racers. Study participants were 81 youth ski racers attending a ski boarding school (50 males, 31 females; 9-14 years). A prospective longitudinal cohort design was used to monitor sports-related risk factors over two seasons and traumatic (TI) and overuse injuries (OI). At the beginning of the study, anthropometric characteristics (body height, body weight, sitting height, body mass index); biological maturity [status age at peak height velocity (APHV)]; physical performance parameters related to jump coordination, maximal leg and core strength, explosive and reactive strength, balance and endurance; and ski racing technique were assessed. Z score transformations normalized the age groups. Multivariate binary logistic regression (dependent variable: injury yes/no) and multivariate linear regression analyses (dependent variable: injury severity in total days of absence from training) were calculated. T-tests and multivariate analyses of variance were used to reveal differences between injured and non-injured athletes and between injury severity groups. The level of significance was set to p < 0.05. Relatively low rates of injuries were reported for both traumatic (0.63 TI/athlete) and overuse injuries (0.21 OI/athlete). Athletes with higher body weight, body height, and sitting height; lower APHV values; better core flexion strength; smaller core flexion:extension strength ratio; shorter drop jump contact time; and higher drop jump reactive strength index were at a lower injury risk or more vulnerable for fewer days of absence from training.
However, significant differences between injured and non-injured athletes were only observed with respect to the drop jump reactive strength index. Regular documentation of anthropometric characteristics, biological maturity and physical fitness parameters is crucial to help to prevent injury in youth ski racing. The present findings suggest that neuromuscular training should be incorporated into the training regimen of youth ski racers to prevent injuries.

  12. Comparison of partial least squares and lasso regression techniques as applied to laser-induced breakdown spectroscopy of geological samples

    NASA Astrophysics Data System (ADS)

    Dyar, M. D.; Carmosino, M. L.; Breves, E. A.; Ozanne, M. V.; Clegg, S. M.; Wiens, R. C.

    2012-04-01

    A remote laser-induced breakdown spectrometer (LIBS) designed to simulate the ChemCam instrument on the Mars Science Laboratory Rover Curiosity was used to probe 100 geologic samples at a 9-m standoff distance. ChemCam consists of an integrated remote LIBS instrument that will probe samples up to 7 m from the mast of the rover and a remote micro-imager (RMI) that will record context images. The elemental compositions of 100 igneous and highly-metamorphosed rocks are determined with LIBS using three variations of multivariate analysis, with a goal of improving the analytical accuracy. Two forms of partial least squares (PLS) regression are employed with finely-tuned parameters: PLS-1 regresses a single response variable (elemental concentration) against the observation variables (spectra, or intensity at each of 6144 spectrometer channels), while PLS-2 simultaneously regresses multiple response variables (concentrations of the ten major elements in rocks) against the observation predictor variables, taking advantage of natural correlations between elements. Those results are contrasted with those from the multivariate regression technique of the least absolute shrinkage and selection operator (lasso), which is a penalized shrunken regression method that selects the specific channels for each element that explain the most variance in the concentration of that element. To make this comparison, we use results of cross-validation and of held-out testing, and employ unscaled and uncentered spectral intensity data because all of the input variables are already in the same units. Results demonstrate that the lasso, PLS-1, and PLS-2 all yield comparable results in terms of accuracy for this dataset. However, the interpretability of these methods differs greatly in terms of fundamental understanding of LIBS emissions. 
PLS techniques generate principal components, linear combinations of intensities at any number of spectrometer channels, which explain as much variance in the response variables as possible while avoiding multicollinearity between principal components. When the selected number of principal components is projected back into the original feature space of the spectra, 6144 correlation coefficients are generated, a small fraction of which are mathematically significant to the regression. In contrast, the lasso models require only a small number (< 24) of non-zero correlation coefficients (β values) to determine the concentration of each of the ten major elements. Causality between the positively-correlated emission lines chosen by the lasso and the elemental concentration was examined. In general, the higher the lasso coefficient (β), the greater the likelihood that the selected line results from an emission of that element. Emission lines with negative β values should arise from elements that are anti-correlated with the element being predicted. For elements except Fe, Al, Ti, and P, the lasso-selected wavelength with the highest β value corresponds to the element being predicted, e.g. 559.8 nm for neutral Ca. However, the specific lines chosen by the lasso with positive β values are not always those from the element being predicted. Other wavelengths and the elements that most strongly correlate with them to predict concentration are obviously related to known geochemical correlations or close overlap of emission lines, while others must result from matrix effects. Use of the lasso technique thus directly informs our understanding of the underlying physical processes that give rise to LIBS emissions by determining which lines can best represent concentration, and which lines from other elements are causing matrix effects.
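
    The way the lasso drives most channel coefficients to exactly zero can be sketched with cyclic coordinate descent and soft-thresholding. This is a toy illustration under stated assumptions, not the procedure used in the study: the objective is taken as (1/(2n))·||y − Xβ||² + λ·||β||₁ with no intercept (the abstract notes the data were uncentered), the names `soft_threshold` and `lasso_cd` are hypothetical, and the three-column design matrix is chosen with orthogonal columns so the result can be checked by hand.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values inside [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta

# Toy data: three "channels" with orthogonal columns; y depends on channel 1 only.
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [3, -3, 3, -3]

beta = lasso_cd(X, y, lam=0.5)
print(beta)
```

    The unpenalised coefficient for channel 1 would be 3; the penalty shrinks it by λ and leaves the uninformative channels at exactly zero, which is the sparsity the abstract contrasts with the dense PLS coefficient vector.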

  13. Third molar development: measurements versus scores as age predictor.

    PubMed

    Thevissen, P W; Fieuws, S; Willems, G

    2011-10-01

    Human third molar development is widely used to predict chronological age of subadult individuals with unknown or doubted age. For these predictions, classically, the radiologically observed third molar growth and maturation is registered using a staging and related scoring technique. Measures of lengths and widths of the developing wisdom tooth and its adjacent second molar can be considered as an alternative registration. The aim of this study was to verify relations between mandibular third molar developmental stages or measurements of mandibular second molar and third molars and age. Age related performance of stages and measurements were compared to assess if measurements added information to age predictions from third molar formation stage. The sample was 340 orthopantomograms (170 females, 170 males) of individuals homogeneously distributed in age between 7 and 24 years. Mandibular lower right third and second molars were staged following Gleiser and Hunt, length and width measurements were registered, and various ratios of these measurements were calculated. Univariable regression models with age as response and third molar stage, measurements and ratios of second and third molars as predictors, were considered. Multivariable regression models assessed if measurements or ratios added information to age prediction from third molar stage. Coefficients of determination (R²) and root mean squared errors (RMSE) obtained from all regression models were compared. The univariable regression model using stages as predictor yielded the most accurate age predictions (males: R² 0.85, RMSE between 0.85 and 1.22 years; females: R² 0.77, RMSE between 1.19 and 2.11 years) compared to all models including measurements and ratios. The multivariable regression models indicated that measurements and ratios added no clinically relevant information to the age prediction from third molar stage.
Ratios and measurements of second and third molars are less accurate age predictors than stages of developing third molars. Copyright © 2011 Elsevier Ltd. All rights reserved.
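
    The two criteria used above to compare regression models, the coefficient of determination (R²) and the root mean squared error (RMSE), can be computed as in the sketch below. The ages and predictions are hypothetical values chosen for easy hand-checking; they are not data from the study.

```python
from math import sqrt

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot: share of variance explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the response (years)."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical chronological ages and model-predicted ages (years).
age_true = [10, 12, 14, 16]
age_pred = [11, 12, 14, 15]

r2 = r_squared(age_true, age_pred)
err = rmse(age_true, age_pred)
print(r2, err)
```

    R² is unitless, whereas RMSE is expressed in years, so the two are usually reported together as in the abstract.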

  14. Study on rapid valid acidity evaluation of apple by fiber optic diffuse reflectance technique

    NASA Astrophysics Data System (ADS)

    Liu, Yande; Ying, Yibin; Fu, Xiaping; Jiang, Xuesong

    2004-03-01

    Some issues related to nondestructive evaluation of valid acidity in intact apples by means of the Fourier transform near infrared (FTNIR) (800-2631 nm) method were addressed. A relationship was established between the diffuse reflectance spectra recorded with a bifurcated optic fiber and the valid acidity. The data were analyzed by multivariate calibration techniques such as partial least squares (PLS) analysis and principal component regression (PCR). A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influence of data preprocessing and different spectra treatments was also investigated. Models based on smoothing spectra were slightly worse than models based on derivative spectra and the best result was obtained when the segment length was 5 and the gap size was 10. Depending on data preprocessing and multivariate calibration technique, the best prediction model, obtained by PLS analysis, had a correlation coefficient of 0.871, a low RMSEP (0.0677), a low RMSEC (0.056) and a small difference between RMSEP and RMSEC. The results point to the feasibility of FTNIR spectral analysis to predict the fruit valid acidity non-destructively. The ratio of data standard deviation to the root mean square error of prediction (SDR) is better to be less than 3 in calibration models; however, the results cannot meet the demand of actual application. Therefore, further study is required for better calibration and prediction.
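
    The SDR quoted in this abstract is the standard deviation of the reference data divided by the RMSEP. The sketch below shows only the arithmetic; it assumes the sample standard deviation is used, and the reference and predicted values are hypothetical.

```python
from math import sqrt
from statistics import stdev

def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    n = len(reference)
    return sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def sdr(reference, predicted):
    """Ratio of the reference standard deviation to the RMSEP."""
    return stdev(reference) / rmsep(reference, predicted)

# Hypothetical reference and predicted acidity values for five samples.
ref = [3.0, 3.5, 4.0, 4.5, 5.0]
pred = [3.1, 3.4, 4.1, 4.4, 5.1]

ratio = sdr(ref, pred)
print(rmsep(ref, pred), ratio)
```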

  15. Assessing the Independent Contribution of Maternal Educational Expectations to Children’s Educational Attainment in Early Adulthood: A Propensity Score Matching Analysis

    PubMed Central

    Pingault, Jean Baptiste; Côté, Sylvana M.; Petitclerc, Amélie; Vitaro, Frank; Tremblay, Richard E.

    2015-01-01

    Background: Parental educational expectations have been associated with children’s educational attainment in a number of long-term longitudinal studies, but whether this relationship is causal has long been debated. The aims of this prospective study were twofold: 1) test whether low maternal educational expectations contributed to failure to graduate from high school; and 2) compare the results obtained using different strategies for accounting for confounding variables (i.e. multivariate regression and propensity score matching). Methodology/Principal Findings: The study sample included 1,279 participants from the Quebec Longitudinal Study of Kindergarten Children. Maternal educational expectations were assessed when the participants were aged 12 years. High school graduation – measuring educational attainment – was determined through the Quebec Ministry of Education when the participants were aged 22–23 years. Findings show that when using the most common statistical approach (i.e. multivariate regressions to adjust for a restricted set of potential confounders) the contribution of low maternal educational expectations to failure to graduate from high school was statistically significant. However, when using propensity score matching, the contribution of maternal expectations was reduced and remained statistically significant only for males. Conclusions/Significance: The results of this study are consistent with the possibility that the contribution of parental expectations to educational attainment is overestimated in the available literature. This may be explained by the use of a restricted range of potential confounding variables as well as the dearth of studies using appropriate statistical techniques and study designs in order to minimize confounding.
Each of these techniques and designs, including propensity score matching, has its strengths and limitations: A more comprehensive understanding of the causal role of parental expectations will stem from a convergence of findings from studies using different techniques and designs. PMID:25803867
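
    Propensity score matching, which this abstract compares with regression adjustment, pairs each exposed participant with an unexposed participant who has a similar estimated probability of exposure. The sketch below assumes the propensity scores have already been estimated and uses greedy 1:1 nearest-neighbour matching without replacement and with a caliper. The abstract does not state which matching algorithm the study used, and all identifiers and scores here are hypothetical.

```python
def match_nearest(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    treated, controls: dicts mapping participant id -> propensity score.
    A control is used at most once; pairs further apart than the caliper
    are discarded, so some treated participants may remain unmatched.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical propensity scores (probability of low maternal expectations).
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
controls = {"c1": 0.28, "c2": 0.60, "c3": 0.40, "c4": 0.10}

pairs = match_nearest(treated, controls)
print(pairs)
```

    Participant t3 has no control within the caliper and is dropped, which illustrates how matching restricts the comparison to participants with comparable covariate profiles.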

  16. Radiographic failure and rates of re-operation after acromioclavicular joint reconstruction: a comparison of surgical techniques.

    PubMed

    Spencer, H T; Hsu, L; Sodl, J; Arianjam, A; Yian, E H

    2016-04-01

    The aim of this study was to compare radiographic failure and re-operation rates of anatomical coracoclavicular (CC) ligament reconstruction techniques with non-anatomical techniques after chronic high-grade acromioclavicular (AC) joint injuries. We reviewed chronic AC joint reconstructions within a region-wide healthcare system to identify surgical technique, complications, radiographic failure and re-operations. Procedures fell into four categories: (1) modified Weaver-Dunn, (2) allograft fixed through coracoid and clavicular tunnels, (3) allograft loop coracoclavicular fixation, and (4) combined allograft loop and synthetic cortical button fixation. Among 167 patients (mean age 38.1 years, standard deviation (SD) 14.7) treated at least four weeks after injury, 154 had post-operative radiographs available for analysis. Radiographic failure occurred in 33/154 cases (21.4%), with the lowest rate in Technique 4 (2/42, 4.8%; p = 0.001). Half the failures occurred by six weeks, and the Kaplan-Meier survivorship at 24 months was 94.4% (95% confidence interval (CI) 79.6 to 98.6) for Technique 4 and 69.9% (95% CI 59.4 to 78.3) for the other techniques when combined. In multivariable survival analysis, Technique 4 had better survival than other techniques (Hazard Ratio 0.162, 95% CI 0.039 to 0.068, p = 0.013). Among 155 patients with a minimum of six months post-operative insurance coverage, re-operation occurred in 9.7% (15 patients). However, in multivariable logistic regression, Technique 4 did not reach a statistically significant lower risk for re-operation (odds ratio 0.254, 95% CI 0.05 to 1.3, p = 0.11). In this retrospective series, anatomical CC ligament reconstruction using combined synthetic cortical button and allograft loop fixation had the lowest rate of radiographic failure.
©2016 The British Editorial Society of Bone & Joint Surgery.
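
    The Kaplan-Meier survivorship figures reported above come from the product-limit estimator, which can be sketched as follows. The follow-up times and event indicators are hypothetical and serve only to show the calculation; confidence intervals are not computed.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient; events: 1 = failure observed,
    0 = censored. Returns (time, survival) at each observed failure time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up in months; 1 = radiographic failure, 0 = censored.
months = [2, 3, 3, 5, 8]
failed = [1, 0, 1, 0, 1]

curve = kaplan_meier(months, failed)
print(curve)
```

    Censored patients contribute to the number at risk until their follow-up ends, which is why the estimate differs from a simple proportion of failures.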

  17. Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes

    PubMed Central

    2013-01-01

    Motivation: Multivariate quantitative traits arise naturally in recent neuroimaging genetics studies, in which both structural and functional variability of the human brain is measured non-invasively through techniques such as magnetic resonance imaging (MRI). There is growing interest in detecting genetic variants associated with such multivariate traits, especially in genome-wide studies. Random forests (RFs) classifiers, which are ensembles of decision trees, are amongst the best performing machine learning algorithms and have been successfully employed for the prioritisation of genetic variants in case-control studies. RFs can also be applied to produce gene rankings in association studies with multivariate quantitative traits, and to estimate genetic similarities measures that are predictive of the trait. However, in studies involving hundreds of thousands of SNPs and high-dimensional traits, a very large ensemble of trees must be inferred from the data in order to obtain reliable rankings, which makes the application of these algorithms computationally prohibitive. Results: We have developed a parallel version of the RF algorithm for regression and genetic similarity learning tasks in large-scale population genetic association studies involving multivariate traits, called PaRFR (Parallel Random Forest Regression). Our implementation takes advantage of the MapReduce programming model and is deployed on Hadoop, an open-source software framework that supports data-intensive distributed applications. Notable speed-ups are obtained by introducing a distance-based criterion for node splitting in the tree estimation process. PaRFR has been applied to a genome-wide association study on Alzheimer's disease (AD) in which the quantitative trait consists of a high-dimensional neuroimaging phenotype describing longitudinal changes in the human brain structure.
PaRFR provides a ranking of SNPs associated to this trait, and produces pair-wise measures of genetic proximity that can be directly compared to pair-wise measures of phenotypic proximity. Several known AD-related variants have been identified, including APOE4 and TOMM40. We also present experimental evidence supporting the hypothesis of a linear relationship between the number of top-ranked mutated states, or frequent mutation patterns, and an indicator of disease severity. Availability: The Java codes are freely available at http://www2.imperial.ac.uk/~gmontana. PMID:24564704

  18. Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes.

    PubMed

    Wang, Yue; Goh, Wilson; Wong, Limsoon; Montana, Giovanni

    2013-01-01

    Multivariate quantitative traits arise naturally in recent neuroimaging genetics studies, in which both structural and functional variability of the human brain is measured non-invasively through techniques such as magnetic resonance imaging (MRI). There is growing interest in detecting genetic variants associated with such multivariate traits, especially in genome-wide studies. Random forests (RFs) classifiers, which are ensembles of decision trees, are amongst the best performing machine learning algorithms and have been successfully employed for the prioritisation of genetic variants in case-control studies. RFs can also be applied to produce gene rankings in association studies with multivariate quantitative traits, and to estimate genetic similarities measures that are predictive of the trait. However, in studies involving hundreds of thousands of SNPs and high-dimensional traits, a very large ensemble of trees must be inferred from the data in order to obtain reliable rankings, which makes the application of these algorithms computationally prohibitive. We have developed a parallel version of the RF algorithm for regression and genetic similarity learning tasks in large-scale population genetic association studies involving multivariate traits, called PaRFR (Parallel Random Forest Regression). Our implementation takes advantage of the MapReduce programming model and is deployed on Hadoop, an open-source software framework that supports data-intensive distributed applications. Notable speed-ups are obtained by introducing a distance-based criterion for node splitting in the tree estimation process. PaRFR has been applied to a genome-wide association study on Alzheimer's disease (AD) in which the quantitative trait consists of a high-dimensional neuroimaging phenotype describing longitudinal changes in the human brain structure. 
PaRFR provides a ranking of SNPs associated to this trait, and produces pair-wise measures of genetic proximity that can be directly compared to pair-wise measures of phenotypic proximity. Several known AD-related variants have been identified, including APOE4 and TOMM40. We also present experimental evidence supporting the hypothesis of a linear relationship between the number of top-ranked mutated states, or frequent mutation patterns, and an indicator of disease severity. The Java codes are freely available at http://www2.imperial.ac.uk/~gmontana.

  19. SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression

    PubMed Central

    Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.

    2014-01-01

    The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as the Hotelling’s T² test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM outperforms other state-of-the-art methods. PMID:26527844

  20. A hybrid PCA-CART-MARS-based prognostic approach of the remaining useful life for aircraft engines.

    PubMed

    Sánchez Lasheras, Fernando; García Nieto, Paulino José; de Cos Juez, Francisco Javier; Mayo Bayón, Ricardo; González Suárez, Victor Manuel

    2015-03-23

    Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines.

  1. A Hybrid PCA-CART-MARS-Based Prognostic Approach of the Remaining Useful Life for Aircraft Engines

    PubMed Central

    Lasheras, Fernando Sánchez; Nieto, Paulino José García; de Cos Juez, Francisco Javier; Bayón, Ricardo Mayo; Suárez, Victor Manuel González

    2015-01-01

    Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines. PMID:25806876

  2. Multivariate analysis on unilateral cleft lip and palate treatment outcome by EUROCRAN index: A retrospective study.

    PubMed

    Yew, Ching Ching; Alam, Mohammad Khursheed; Rahman, Shaifulizan Abdul

    2016-10-01

    This study evaluates the dental arch relationship and palatal morphology of unilateral cleft lip and palate patients using the EUROCRAN index, and assesses the factors that affect them using multivariate statistical analysis. A total of one hundred and seven patients aged five to twelve years with non-syndromic unilateral cleft lip and palate were included in the study. These patients had received cheiloplasty and one-stage palatoplasty surgery but had yet to receive an alveolar bone grafting procedure. Five assessors trained in the use of the EUROCRAN index underwent a calibration exercise and ranked the dental arch relationships and palatal morphology of the patients' study models. For intra-rater agreement, the examiners scored the models twice, with a two-week interval between sessions. Patient factors collected included gender; site, type and family history of unilateral cleft lip and palate; absence of the lateral incisor on the cleft side; and the cheiloplasty and palatoplasty techniques used. Associations between the various factors and dental arch relationships were assessed using logistic regression analysis. Dental arch relationships among unilateral cleft lip and palate patients in the local population scored relatively worse than those reported in other parts of the world. Crude logistic regression analysis did not demonstrate any significant associations of the various socio-demographic factors or the cheiloplasty and palatoplasty techniques used with the dental arch relationship outcome. This study has limitations that might have affected the results, for example having multiple operators performing the surgeries and the inability to assess the influence of underlying genetically predisposed craniofacial variability. These may have a substantial influence on the treatment outcome. The factors that can affect unilateral cleft lip and palate treatment outcome are multifactorial in nature and remain controversial in general. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  3. Impact of preoperative calculation of nephron volume loss on future of partial nephrectomy techniques; planning a strategic roadmap for improving functional preservation and securing oncological safety.

    PubMed

    Rha, Koon H; Abdel Raheem, Ali; Park, Sung Y; Kim, Kwang H; Kim, Hyung J; Koo, Kyo C; Choi, Young D; Jung, Byung H; Lee, Sang K; Lee, Won K; Krishnan, Jayram; Shin, Tae Y; Cho, Jin-Seon

    2017-11-01

    To assess the correlation of the resected and ischaemic volume (RAIV), which is a preoperatively calculated volume of nephron loss, with the amount of postoperative renal function (PRF) decline after minimally invasive partial nephrectomy (PN) in a multi-institutional dataset. We identified 348 patients from March 2005 to December 2013 at six institutions. Data on all cases of laparoscopic (n = 85) and robot-assisted PN (n = 263) performed were retrospectively gathered. Univariable and multivariable linear regression analyses were used to identify the associations between various time points of PRF and the RAIV, as a continuous variable. The mean (sd) RAIV was 24.2 (29.2) cm³. The mean preoperative estimated glomerular filtration rate (eGFR) was 91.0 mL/min/1.73 m², and the mean eGFRs at postoperative day 1 and at 6 and 36 months after PN were 76.8, 80.2 and 87.7 mL/min/1.73 m², respectively. In multivariable linear regression analysis, the amount of decline in PRF at follow-up was significantly correlated with the RAIV (β 0.261, 0.165, 0.260 at postoperative day 1, 6 and 36 months after PN, respectively). This study has the limitation of its retrospective nature. Preoperatively calculated RAIV significantly correlates with the amount of decline in PRF during long-term follow-up. The RAIV could lead our research to the level of prediction of the amount of PRF decline after PN and thus would be appropriate for assessing the technical advantages of emerging techniques. © 2017 The Authors BJU International © 2017 BJU International Published by John Wiley & Sons Ltd.

  4. Risk Factors of Catheter-Related Thrombosis (CRT) in Cancer Patients: A Patient-Level Data (IPD) Meta-Analysis of Clinical Trials and Prospective Studies

    PubMed Central

    Saber, W.; Moua, T.; Williams, E. C.; Verso, M.; Agnelli, G.; Couban, S.; Young, A.; De Cicco, M.; Biffi, R.; van Rooden, C. J.; Huisman, M. V.; Fagnani, D.; Cimminiello, C.; Moia, M.; Magagnoli, M.; Povoski, S. P.; Malak, S. F.; Lee, A. Y.

    2010-01-01

    Background Knowledge of independent, baseline risk factors of catheter-related thrombosis (CRT) may help select adult cancer patients at high risk to receive thromboprophylaxis. Objectives We conducted a meta-analysis of individual patient-level data to identify these baseline risk factors. Patients/Methods MEDLINE, EMBASE, CINAHL, CENTRAL, DARE, Grey literature databases were searched in all languages from 1995-2008. Prospective studies and randomized controlled trials (RCTs) were eligible. Studies were included if original patient-level data were provided by the investigators and if CRT was objectively confirmed with valid imaging. Multivariate logistic regression analysis of 17 prespecified baseline characteristics was conducted. Adjusted odds ratios (OR) and 95% confidence intervals (CI) were estimated. Results A total sample of 5636 subjects from 5 RCTs and 7 prospective studies was included in the analysis. Among these subjects, 425 CRT events were observed. In multivariate logistic regression, the use of implanted ports as compared with peripherally implanted central venous catheters (PICC), decreased CRT risk (OR = 0.43; 95% CI, 0.23-0.80), whereas past history of deep vein thrombosis (DVT) (OR = 2.03; 95% CI, 1.05-3.92), subclavian venipuncture insertion technique (OR = 2.16; 95% CI, 1.07-4.34), and improper catheter tip location (OR = 1.92; 95% CI, 1.22-3.02), increased CRT risk. Conclusions CRT risk is increased with using PICC catheters, previous history of DVT, subclavian venipuncture insertion technique and improper positioning of the catheter tip. These factors may be useful for risk stratifying patients to select those for thromboprophylaxis. Prospective studies are needed to validate these findings. PMID:21040443
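
    The adjusted odds ratios above are reported with 95% confidence intervals, which are symmetric on the log scale. As a minimal sketch, assuming the intervals are Wald intervals built with the usual 1.96 normal quantile (the abstract does not say how they were computed), the log-odds coefficient, its standard error and a z statistic can be recovered from the reported figures. The function name log_scale_summary is hypothetical.

```python
import math


def log_scale_summary(odds_ratio, ci_lower, ci_upper, z_crit=1.96):
    """Recover (beta, standard error, Wald z) from a reported OR and its CI.

    Assumes a Wald interval: exp(beta +/- z_crit * se). This is an
    assumption for illustration, not something stated in the abstract.
    """
    beta = math.log(odds_ratio)
    # The CI width on the log scale equals 2 * z_crit * se.
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    return beta, se, beta / se


# Reported values for implanted ports versus PICC: OR = 0.43 (0.23-0.80).
beta, se, z = log_scale_summary(0.43, 0.23, 0.80)
print(f"beta={beta:.3f}, se={se:.3f}, z={z:.2f}")
```

    Because published values are rounded to two decimals, the recovered standard error is approximate.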

  5. Wavelet analysis techniques applied to removing varying spectroscopic background in calibration model for pear sugar content

    NASA Astrophysics Data System (ADS)

    Liu, Yande; Ying, Yibin; Lu, Huishan; Fu, Xiaping

    2005-11-01

    A new method is proposed to eliminate the varying background and noise simultaneously for multivariate calibration of Fourier transform near infrared (FT-NIR) spectral signals. An ideal spectrum signal prototype was constructed based on the FT-NIR spectrum of fruit sugar content measurement. The performances of wavelet based threshold de-noising approaches via different combinations of wavelet base functions were compared. Three families of wavelet base function (Daubechies, Symlets and Coiflets) were applied to estimate the performance of those wavelet bases and threshold selection rules by a series of experiments. The experimental results show that the best de-noising performance is reached via the combinations of Daubechies 4 or Symlet 4 wavelet base function. Based on the optimization parameter, wavelet regression models for sugar content of pear were also developed and resulted in a smaller prediction error than a traditional Partial Least Squares Regression (PLSR) model.

  6. Simultaneous Force Regression and Movement Classification of Fingers via Surface EMG within a Unified Bayesian Framework.

    PubMed

    Baldacchino, Tara; Jacobs, William R; Anderson, Sean R; Worden, Keith; Rowson, Jennifer

    2018-01-01

    This contribution presents a novel methodology for myoelectric-based control using surface electromyographic (sEMG) signals recorded during finger movements. A multivariate Bayesian mixture of experts (MoE) model is introduced which provides a powerful method for modeling force regression at the fingertips, while also performing finger movement classification as a by-product of the modeling algorithm. Bayesian inference of the model allows uncertainties to be naturally incorporated into the model structure. This method is tested using data from the publicly released NinaPro database which consists of sEMG recordings for 6 degree-of-freedom force activations for 40 intact subjects. The results demonstrate that the MoE model achieves similar performance compared to the benchmark set by the authors of NinaPro for finger force regression. Additionally, inherent to the Bayesian framework is the inclusion of uncertainty in the model parameters, naturally providing confidence bounds on the force regression predictions. Furthermore, the integrated clustering step allows a detailed investigation into classification of the finger movements, without incurring any extra computational effort. Subsequently, a systematic approach to assessing the importance of the number of electrodes needed for accurate control is performed via sensitivity analysis techniques. A slight degradation in regression performance is observed for a reduced number of electrodes, while classification performance is unaffected.

  7. EXTENDING MULTIVARIATE DISTANCE MATRIX REGRESSION WITH AN EFFECT SIZE MEASURE AND THE ASYMPTOTIC NULL DISTRIBUTION OF THE TEST STATISTIC

    PubMed Central

    McArtor, Daniel B.; Lubke, Gitta H.; Bergeman, C. S.

    2017-01-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains. PMID:27738957

  8. Extending multivariate distance matrix regression with an effect size measure and the asymptotic null distribution of the test statistic.

    PubMed

    McArtor, Daniel B; Lubke, Gitta H; Bergeman, C S

    2017-12-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains.

  9. Logistic models--an odd(s) kind of regression.

    PubMed

    Jupiter, Daniel C

    2013-01-01

    The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  10. Retro-regression--another important multivariate regression improvement.

    PubMed

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.

  11. Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

    NASA Technical Reports Server (NTRS)

    MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

    2005-01-01

    Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.

  12. Assessing Principal Component Regression Prediction of Neurochemicals Detected with Fast-Scan Cyclic Voltammetry

    PubMed Central

    2011-01-01

    Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook’s distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards. PMID:21966586

  13. Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

    NASA Astrophysics Data System (ADS)

    Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

    2018-05-01

    The Bayesian method can be used to estimate the parameters of a multivariate multiple regression model. It involves two distributions, the prior and the posterior, and the posterior distribution is influenced by the choice of prior. Jeffreys’ prior is a non-informative prior distribution, used when no information about the parameters is available. The non-informative Jeffreys’ prior is combined with the sample information to give the posterior distribution, which is then used to estimate the parameters. The purpose of this research is to estimate the parameters of the multivariate regression model using the Bayesian method with the non-informative Jeffreys’ prior distribution. Parameter estimates of β and Σ are obtained as the expected values of the marginal posterior distributions, which are multivariate normal for β and inverse Wishart for Σ. However, calculating the expected values involves integrals that are difficult to evaluate. An approximation is therefore needed, obtained by generating random samples from the posterior distribution of each parameter using the Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.

  14. Field applications of stand-off sensing using visible/NIR multivariate optical computing

    NASA Astrophysics Data System (ADS)

    Eastwood, DeLyle; Soyemi, Olusola O.; Karunamuni, Jeevanandra; Zhang, Lixia; Li, Hongli; Myrick, Michael L.

    2001-02-01

    A novel multivariate visible/NIR optical computing approach applicable to standoff sensing will be demonstrated with porphyrin mixtures as examples. The ultimate goal is to develop environmental or counter-terrorism sensors for chemicals such as organophosphorus (OP) pesticides or chemical warfare simulants in the near infrared spectral region. The mathematical operation that characterizes prediction of properties via regression from optical spectra is a calculation of inner products between the spectrum and the pre-determined regression vector. The result is scaled appropriately and offset to correspond to the basis from which the regression vector is derived. The process involves collecting spectroscopic data and synthesizing a multivariate vector using a pattern recognition method. Then, an interference coating is designed that reproduces the pattern of the multivariate vector in its transmission or reflection spectrum, and appropriate interference filters are fabricated. High and low refractive index materials such as Nb2O5 and SiO2 are excellent choices for the visible and near infrared regions. The proof of concept has now been established for this system in the visible and will later be extended to chemicals such as OP compounds in the near and mid-infrared.
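
    The prediction step described above (an inner product of the spectrum with a pre-determined regression vector, followed by scaling and an offset) can be sketched in a few lines. This is only an illustration of the arithmetic that the optical filter is meant to perform; the function name predict_property and all numeric values are hypothetical and are not taken from the paper.

```python
def predict_property(spectrum, regression_vector, scale=1.0, offset=0.0):
    """Return scale * <spectrum, regression_vector> + offset."""
    if len(spectrum) != len(regression_vector):
        raise ValueError("spectrum and regression vector must have equal length")
    # Inner product over wavelength channels.
    inner = sum(s * b for s, b in zip(spectrum, regression_vector))
    return scale * inner + offset


# Hypothetical three-channel spectrum and regression vector.
spectrum = [0.2, 0.5, 0.1]
regression_vector = [1.0, -2.0, 4.0]
prediction = predict_property(spectrum, regression_vector, scale=2.0, offset=1.0)
print(round(prediction, 6))
```

    In the optical implementation the regression vector is encoded in the transmission or reflection spectrum of the interference filter, so the inner product is formed by the detector rather than in software.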

  15. Assessing principal component regression prediction of neurochemicals detected with fast-scan cyclic voltammetry.

    PubMed

    Keithley, Richard B; Wightman, R Mark

    2011-06-07

    Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook's distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards.

  16. A refined method for multivariate meta-analysis and meta-regression.

    PubMed

    Jackson, Daniel; Riley, Richard D

    2014-02-20

    Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.
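
    The abstract does not state the scaling factor itself. As a hedged sketch of the univariate case only, the code below assumes the scaling takes the Hartung-Knapp form, in which the conventional random-effects standard error is multiplied by the square root of a weighted residual mean square. The function name refined_random_effects, the input values and the supplied between-study variance tau2 are all hypothetical.

```python
import math


def refined_random_effects(effects, variances, tau2):
    """Pooled effect with conventional and scaled standard errors.

    Assumes a Hartung-Knapp style scaling (an assumption, not quoted
    from the paper). tau2 is a between-study variance estimate that is
    supplied by the caller rather than estimated here.
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total_w = sum(weights)
    mu = sum(w * y for w, y in zip(weights, effects)) / total_w
    k = len(effects)
    # Weighted residual mean square with k - 1 degrees of freedom.
    q = sum(w * (y - mu) ** 2 for w, y in zip(weights, effects)) / (k - 1)
    se_conventional = math.sqrt(1.0 / total_w)
    se_refined = math.sqrt(q) * se_conventional
    return mu, se_conventional, se_refined


# Hypothetical study effects, within-study variances and tau2.
mu, se_conv, se_ref = refined_random_effects(
    [0.1, 0.3, 0.5], [0.04, 0.04, 0.04], tau2=0.01
)
print(f"mu={mu:.3f}, conventional se={se_conv:.4f}, refined se={se_ref:.4f}")
```

    Under this assumption the interval for the average effect would normally be built from a t distribution with k - 1 degrees of freedom rather than a normal quantile.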

  17. Impact of extent of resection and recurrent surgery on clinical outcome and overall survival in a consecutive series of 170 patients for glioblastoma in intraoperative high field magnetic resonance imaging.

    PubMed

    Coburger, Jan; Wirtz, Christian R; König, Ralph W

    2017-06-01

    For patients with glioblastoma (GBM), few unselected data exist that reflect the current standard adjuvant treatment and contemporary surgical techniques such as intraoperative MRI (iMRI). The aim of this study is to assess the impact of extent of resection (EoR) and recurrent surgery on survival and outcome. We assessed a consecutive unselected series of 170 surgeries for GBM (2008-2014) performed with iMRI. All patients received adjuvant radio-chemotherapy. Overall survival (OS), progression-free survival (PFS), complications and new permanent neurological deficits (nPND) were assessed. Uni- and multivariate Cox regression models were calculated. Mean follow-up was 40 months. GTR was intended in 82% and achieved in 77% of these cases. A nPND was found in 7% of patients. In multivariate Cox regression, GTR (HR: 0.6, P<0.024) and absence of MGMT methylation (HR: 1.6, P<0.042) were significantly associated with PFS. We found no difference in PFS after primary surgery and recurrent surgery. Concerning OS, in multivariate assessment an unmethylated MGMT promoter (HR: 2.0, P<0.01) and presence of a complication (HR: 1.7, P<0.06) were negative prognosticators. Only GTR was significantly beneficial for OS (HR: 0.4, P<0.028) compared to a failed GTR and a STR. Repeated surgery for recurrent disease was positively associated with OS (HR: 0.6, P<0.06). Surgery in a contemporary setup using iMRI, brain mapping and modern adjuvant treatment has a higher OS and lower complication rates than previously published. A maximum but safe resection should be the goal of surgery since a perioperative complication significantly decreases OS. Recurrent surgery has a beneficial effect on OS without an increase of complications.

  18. Correlation of porous and functional properties of food materials by NMR relaxometry and multivariate analysis.

    PubMed

    Haiduc, Adrian Marius; van Duynhoven, John

    2005-02-01

    The porous properties of food materials are known to determine important macroscopic parameters such as water-holding capacity and texture. In conventional approaches, understanding is built from a long process of establishing macrostructure-property relations in a rational manner. Only recently, multivariate approaches were introduced for the same purpose. The model systems used here are oil-in-water emulsions, stabilised by protein, and form complex structures, consisting of fat droplets dispersed in a porous protein phase. NMR time-domain decay curves were recorded for emulsions with varied levels of fat, protein and water. Hardness, dry matter content and water drainage were determined by classical means and analysed for correlation with the NMR data with multivariate techniques. Partial least squares can calibrate and predict these properties directly from the continuous NMR exponential decays and yields regression coefficients higher than 82%. However, the calibration coefficients themselves belong to the continuous exponential domain and do little to explain the connection between NMR data and emulsion properties. Transformation of the NMR decays into a discrete domain with non-negative least squares permits the use of multilinear regression (MLR) on the resulting amplitudes as predictors and hardness or water drainage as responses. The MLR coefficients show that hardness is highly correlated with the components that have T2 distributions of about 20 and 200 ms whereas water drainage is correlated with components that have T2 distributions around 400 and 1800 ms. These T2 distributions very likely correlate with water populations present in pores with different sizes and/or wall mobility. The results for the emulsions studied demonstrate that NMR time-domain decays can be employed to predict properties and to provide insight in the underlying microstructural features.

  19. Metabolic phenotyping of urine for discriminating alcohol-dependent from social drinkers and alcohol-naive subjects.

    PubMed

    Mostafa, Hamza; Amin, Arwa M; Teh, Chin-Hoe; Murugaiyah, Vikneswaran; Arif, Nor Hayati; Ibrahim, Baharudin

    2016-12-01

    Alcohol dependence (AD) is a ravaging public health and social problem. AD diagnosis depends on questionnaires and some biomarkers which, however, lack specificity and sensitivity, often leading to a less precise diagnosis and delayed treatment. This represents a great burden, not only on AD individuals but also on their families. Metabolomics using nuclear magnetic resonance spectroscopy (NMR) can provide new techniques for the identification of novel biomarkers of AD. These putative biomarkers can facilitate early diagnosis of AD. The aim was to identify novel biomarkers able to discriminate between alcohol-dependent persons, non-AD alcohol drinkers and controls using metabolomics. Urine samples were collected from 30 alcohol-dependent persons who had not yet started AD treatment, 54 social drinkers and 60 controls, and were then analysed using NMR. Data analysis was done using multivariate analysis including principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA), followed by univariate and multivariate logistic regression to develop the discriminatory model. Reproducibility was assessed using the intraclass correlation coefficient (ICC). The OPLS-DA revealed significant discrimination between AD and other groups with sensitivity 86.21%, specificity 97.25% and accuracy 94.93%. Six biomarkers were significantly associated with AD in the multivariate logistic regression model. These biomarkers were cis-aconitic acid, citric acid, alanine, lactic acid, 1,2-propanediol and 2-hydroxyisovaleric acid. The reproducibility of all biomarkers was excellent (0.81-1.0). This study revealed that metabolomics analysis of urine using NMR identified novel AD biomarkers which can discriminate AD from social drinkers and controls with high accuracy. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  20. Information transfer and information modification to identify the structure of cardiovascular and cardiorespiratory networks.

    PubMed

    Faes, Luca; Nollo, Giandomenico; Krohova, Jana; Czippelova, Barbora; Turianikova, Zuzana; Javorka, Michal

    2017-07-01

    To fully elucidate the complex physiological mechanisms underlying the short-term autonomic regulation of heart period (H), systolic and diastolic arterial pressure (S, D) and respiratory (R) variability, the joint dynamics of these variables need to be explored using multivariate time series analysis. This study proposes the utilization of information-theoretic measures to measure causal interactions between nodes of the cardiovascular/cardiorespiratory network and to assess the nature (synergistic or redundant) of these directed interactions. Indexes of information transfer and information modification are extracted from the H, S, D and R series measured from healthy subjects in a resting state and during postural stress. Computations are performed in the framework of multivariate linear regression, using bootstrap techniques to assess on a single-subject basis the statistical significance of each measure and of its transitions across conditions. We find patterns of information transfer and modification which are related to specific cardiovascular and cardiorespiratory mechanisms in resting conditions and to their modification induced by the orthostatic stress.

  1. Proton radius from electron scattering data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Higinbotham, Douglas W.; Kabir, Al Amin; Lin, Vincent

    Background: The proton charge radius extracted from recent muonic hydrogen Lamb shift measurements is significantly smaller than that extracted from atomic hydrogen and electron scattering measurements. The discrepancy has become known as the proton radius puzzle. Purpose: In an attempt to understand the discrepancy, we review high-precision electron scattering results from Mainz, Jefferson Lab, Saskatoon and Stanford. Methods: We make use of stepwise regression techniques using the F-test as well as the Akaike information criterion to systematically determine the predictive variables to use for a given set and range of electron scattering data as well as to provide multivariate error estimates. Results: Starting with the precision, low four-momentum transfer ($Q^2$) data from Mainz (1980) and Saskatoon (1974), we find that a stepwise regression of the Maclaurin series using the F-test as well as the Akaike information criterion justify using a linear extrapolation which yields a value for the proton radius that is consistent with the result obtained from muonic hydrogen measurements. Applying the same Maclaurin series and statistical criteria to the 2014 Rosenbluth results on $G_E$ from Mainz, we again find that the stepwise regression tends to favor a radius consistent with the muonic hydrogen radius but produces results that are extremely sensitive to the range of data included in the fit. Making use of the high-$Q^2$ data on $G_E$ to select functions which extrapolate to high $Q^2$, we find that a Padé (N = M = 1) statistical model works remarkably well, as does a dipole function with a 0.84 fm radius, $G_E(Q^2) = (1 + Q^2/0.66\,\mathrm{GeV}^2)^{-2}$. Conclusions: Rigorous applications of stepwise regression techniques and multivariate error estimates result in the extraction of a proton charge radius that is consistent with the muonic hydrogen result of 0.84 fm; either from linear extrapolation of the extreme low-$Q^2$ data or by use of the Padé approximant for extrapolation using a larger range of data. Thus, based on a purely statistical analysis of electron scattering data, we conclude that the electron scattering result and the muonic hydrogen result are consistent. Lastly, it is the atomic hydrogen results that are the outliers.
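
    The link between the dipole function quoted in the abstract and the 0.84 fm radius can be checked with a short calculation. The sketch below assumes the standard definition of the charge radius from the slope of the form factor at zero momentum transfer, r^2 = -6 dG_E/dQ^2 at Q^2 = 0, which for a dipole with scale 0.66 GeV^2 gives r^2 = 12/0.66 GeV^-2. The function names and the conversion constant hbar*c = 0.1973 GeV fm are introduced here for illustration and do not come from the record.

```python
import math

# Conversion constant hbar*c in GeV*fm (assumed here to convert GeV^-1 to fm).
HBAR_C_GEV_FM = 0.1973269804


def dipole_form_factor(q2_gev2, lambda2_gev2=0.66):
    """Dipole G_E(Q^2) = (1 + Q^2 / Lambda^2)^(-2), with Q^2 in GeV^2."""
    return (1.0 + q2_gev2 / lambda2_gev2) ** -2


def dipole_radius_fm(lambda2_gev2=0.66):
    """Analytic radius: r^2 = -6 dG_E/dQ^2 at Q^2 = 0 equals 12 / Lambda^2."""
    return math.sqrt(12.0 / lambda2_gev2) * HBAR_C_GEV_FM


def numerical_radius_fm(step=1e-6, lambda2_gev2=0.66):
    """Same radius from a forward-difference slope of G_E at Q^2 = 0."""
    slope = (dipole_form_factor(step, lambda2_gev2) - 1.0) / step
    return math.sqrt(-6.0 * slope) * HBAR_C_GEV_FM


radius = dipole_radius_fm()
print(f"dipole radius: {radius:.3f} fm")
```

    Under these assumptions the analytic and finite-difference values agree and round to 0.84 fm, matching the radius stated for the dipole function.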

  2. A comparison of selected parametric and imputation methods for estimating snag density and snag quality attributes

    USGS Publications Warehouse

    Eskelson, Bianca N.I.; Hagar, Joan; Temesgen, Hailemariam

    2012-01-01

    Snags (standing dead trees) are an essential structural component of forests. Because wildlife use of snags depends on size and decay stage, snag density estimation without any information about snag quality attributes is of little value for wildlife management decision makers. Little work has been done to develop models that allow multivariate estimation of snag density by snag quality class. Using climate, topography, Landsat TM data, stand age and forest type collected for 2356 forested Forest Inventory and Analysis plots in western Washington and western Oregon, we evaluated two multivariate techniques for their abilities to estimate density of snags by three decay classes. The density of live trees and snags in three decay classes (D1: recently dead, little decay; D2: decay, without top, some branches and bark missing; D3: extensive decay, missing bark and most branches) with diameter at breast height (DBH) ≥ 12.7 cm was estimated using a nonparametric random forest nearest neighbor imputation technique (RF) and a parametric two-stage model (QPORD), for which the number of trees per hectare was estimated with a quasi-Poisson model in the first stage and the probability of belonging to a tree status class (live, D1, D2, D3) was estimated with an ordinal regression model in the second stage. The presence of large snags with DBH ≥ 50 cm was predicted using a logistic regression and RF imputation. Because of the more homogeneous conditions on private forest lands, snag density by decay class was predicted with higher accuracies on private forest lands than on public lands, while presence of large snags was more accurately predicted on public lands, owing to the higher prevalence of large snags on public lands. RF outperformed the QPORD model in terms of percent accurate predictions, while QPORD provided smaller root mean square errors in predicting snag density by decay class. The logistic regression model achieved more accurate presence/absence classification of large snags than the RF imputation approach. Adjusting the decision threshold to account for unequal size for presence and absence classes is more straightforward for the logistic regression than for the RF imputation approach. Overall, model accuracies were poor in this study, which can be attributed to the poor predictive quality of the explanatory variables and the large range of forest types and geographic conditions observed in the data.

  3. Proton radius from electron scattering data

    DOE PAGES

    Higinbotham, Douglas W.; Kabir, Al Amin; Lin, Vincent; ...

    2016-05-31

    Background: The proton charge radius extracted from recent muonic hydrogen Lamb shift measurements is significantly smaller than that extracted from atomic hydrogen and electron scattering measurements. The discrepancy has become known as the proton radius puzzle. Purpose: In an attempt to understand the discrepancy, we review high-precision electron scattering results from Mainz, Jefferson Lab, Saskatoon and Stanford. Methods: We make use of stepwise regression techniques using the F-test as well as the Akaike information criterion to systematically determine the predictive variables to use for a given set and range of electron scattering data as well as to provide multivariate error estimates. Results: Starting with the precision, low four-momentum transfer ($Q^2$) data from Mainz (1980) and Saskatoon (1974), we find that a stepwise regression of the Maclaurin series using the F-test as well as the Akaike information criterion justify using a linear extrapolation which yields a value for the proton radius that is consistent with the result obtained from muonic hydrogen measurements. Applying the same Maclaurin series and statistical criteria to the 2014 Rosenbluth results on $G_E$ from Mainz, we again find that the stepwise regression tends to favor a radius consistent with the muonic hydrogen radius but produces results that are extremely sensitive to the range of data included in the fit. Making use of the high-$Q^2$ data on $G_E$ to select functions which extrapolate to high $Q^2$, we find that a Padé (N = M = 1) statistical model works remarkably well, as does a dipole function with a 0.84 fm radius, $G_E(Q^2) = (1 + Q^2/0.66\,\mathrm{GeV}^2)^{-2}$. Conclusions: Rigorous applications of stepwise regression techniques and multivariate error estimates result in the extraction of a proton charge radius that is consistent with the muonic hydrogen result of 0.84 fm; either from linear extrapolation of the extreme low-$Q^2$ data or by use of the Padé approximant for extrapolation using a larger range of data. Thus, based on a purely statistical analysis of electron scattering data, we conclude that the electron scattering result and the muonic hydrogen result are consistent. Lastly, it is the atomic hydrogen results that are the outliers.

  4. Carbon financial markets: A time-frequency analysis of CO2 prices

    NASA Astrophysics Data System (ADS)

    Sousa, Rita; Aguiar-Conraria, Luís; Soares, Maria Joana

    2014-11-01

    We characterize the interrelation of CO2 prices with energy prices (electricity, gas and coal), and with economic activity. Previous studies have relied on time-domain techniques, such as Vector Auto-Regressions. In this study, we use multivariate wavelet analysis, which operates in the time-frequency domain. Wavelet analysis provides convenient tools to distinguish relations at particular frequencies and at particular time horizons. Our empirical approach has the potential to identify relations getting stronger and then disappearing over specific time intervals and frequencies. We are able to examine the coherency of these variables and lead-lag relations at different frequencies for the time periods in focus.

  5. Travel Demand Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Southworth, Frank; Garrow, Dr. Laurie

    This chapter describes the principal types of both passenger and freight demand models in use today, providing a brief history of model development supported by references to a number of popular texts on the subject, and directing the reader to papers covering some of the more recent technical developments in the area. Over the past half century a variety of methods have been used to estimate and forecast travel demands, drawing concepts from economic/utility maximization theory, transportation system optimization and spatial interaction theory, using and often combining solution techniques as varied as Box-Jenkins methods, non-linear multivariate regression, non-linear mathematical programming, and agent-based microsimulation.

  6. Comparative multivariate analyses of transient otoacoustic emissions and distorsion products in normal and impaired hearing.

    PubMed

    Stamate, Mirela Cristina; Todor, Nicolae; Cosgarea, Marcel

    2015-01-01

    The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has long been studied. Both transient otoacoustic emissions and distortion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine the performance of transient otoacoustic emissions and distortion products in identifying normal and impaired hearing, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, and otoacoustic emission tests. We chose to use logistic regression as a multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristic curve analysis. We used the logistic score to generate receiver operating characteristic curves and to estimate the areas under the curves in order to compare different multivariate analyses. We compared the performance of each otoacoustic emission (transient, distortion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve, confirming the performance of the otoacoustic emissions. Each otoacoustic emission test presented high values of area under the curve, suggesting that implementing a multivariate approach to evaluate the performances of each otoacoustic emission test would serve to increase the accuracy in identifying the normal and impaired ears. We encountered the highest area under the curve value for the combined multivariate analysis, suggesting that both otoacoustic emission tests should be used in assessing hearing status. Our multivariate analyses revealed that age is a constant predictor factor of the auditory status for both ears, but the presence of tinnitus was the most important predictor for the hearing level, only for the left ear. Age presented similar coefficients, but tinnitus coefficients, by their high value, produced the highest variations of the logistic scores, only for the left ear group, thus increasing the risk of hearing loss. We did not find gender differences between ears for any otoacoustic emission tests, but studies still debate this question as the results are contradictory. Neither gender nor environmental origin had any predictive value for the hearing status, according to the results of our study. Like any other audiological test, using otoacoustic emissions to identify hearing loss is not without error. Even when applying multivariate analysis, perfect test performance is never achieved. Although most studies demonstrated the benefit of using the multivariate analysis, it has not been incorporated into clinical decisions, perhaps because of the idiosyncratic nature of multivariate solutions or because of the lack of validation studies.
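
    The logistic score and the area under the receiver operating characteristic curve described in this abstract can be illustrated with a small sketch. Everything numeric below is hypothetical: the two predictors (age in decades and a tinnitus flag), the coefficients, the intercept and the six subjects are invented for illustration and are not taken from the study. The area under the curve is computed by pairwise comparison of scores, which is one standard way to obtain it and not necessarily the procedure the authors used.

```python
import numpy as np


def logistic_score(features, coefficients, intercept):
    """Combine the inputs, weighted by coefficients, into a score in (0, 1)."""
    linear = intercept + features @ coefficients
    return 1.0 / (1.0 + np.exp(-linear))


def roc_auc(scores, labels):
    """Area under the ROC curve: fraction of (impaired, normal) pairs in which
    the impaired subject has the higher score; ties count one half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


# Hypothetical subjects: columns are [age in decades, tinnitus present (0/1)].
X = np.array([[3.0, 0], [4.0, 0], [5.5, 1], [6.0, 0], [7.0, 1], [2.5, 0]])
y = np.array([0, 1, 1, 0, 1, 0])  # 1 = impaired hearing by the gold standard

beta = np.array([0.8, 1.5])  # hypothetical coefficients
b0 = -4.0                    # hypothetical intercept

scores = logistic_score(X, beta, b0)
auc = roc_auc(scores, y)
print(f"AUC = {auc:.3f}")
```

    In this toy data set one impaired subject scores below one normal subject, so the area under the curve falls short of 1, in line with the abstract's remark that perfect test performance is never achieved.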

  7. Comparative multivariate analyses of transient otoacoustic emissions and distorsion products in normal and impaired hearing

    PubMed Central

    STAMATE, MIRELA CRISTINA; TODOR, NICOLAE; COSGAREA, MARCEL

    2015-01-01

    Background and aim: The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has long been studied. Both transient otoacoustic emissions and distortion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine the performance of transient otoacoustic emissions and distortion products in identifying normal and impaired hearing, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. Methods: The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, and otoacoustic emission tests. We chose to use logistic regression as a multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristic curve analysis. We used the logistic score to generate receiver operating characteristic curves and to estimate the areas under the curves in order to compare different multivariate analyses. Results: We compared the performance of each otoacoustic emission (transient, distortion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve, confirming the performance of the otoacoustic emissions. Each otoacoustic emission test presented high values of area under the curve, suggesting that implementing a multivariate approach to evaluate the performances of each otoacoustic emission test would serve to increase the accuracy in identifying the normal and impaired ears. We encountered the highest area under the curve value for the combined multivariate analysis, suggesting that both otoacoustic emission tests should be used in assessing hearing status. Our multivariate analyses revealed that age is a constant predictor factor of the auditory status for both ears, but the presence of tinnitus was the most important predictor for the hearing level, only for the left ear. Age presented similar coefficients, but tinnitus coefficients, by their high value, produced the highest variations of the logistic scores, only for the left ear group, thus increasing the risk of hearing loss. We did not find gender differences between ears for any otoacoustic emission tests, but studies still debate this question as the results are contradictory. Neither gender nor environmental origin had any predictive value for the hearing status, according to the results of our study. Conclusion: Like any other audiological test, using otoacoustic emissions to identify hearing loss is not without error. Even when applying multivariate analysis, perfect test performance is never achieved. Although most studies demonstrated the benefit of using the multivariate analysis, it has not been incorporated into clinical decisions, perhaps because of the idiosyncratic nature of multivariate solutions or because of the lack of validation studies. PMID:26733749

  8. Dynamic connectivity regression: Determining state-related changes in brain connectivity

    PubMed Central

    Cribben, Ivor; Haraldsdottir, Ragnheidur; Atlas, Lauren Y.; Wager, Tor D.; Lindquist, Martin A.

    2014-01-01

    Most statistical analyses of fMRI data assume that the nature, timing and duration of the psychological processes being studied are known. However, often it is hard to specify this information a priori. In this work we introduce a data-driven technique for partitioning the experimental time course into distinct temporal intervals with different multivariate functional connectivity patterns between a set of regions of interest (ROIs). The technique, called Dynamic Connectivity Regression (DCR), detects temporal change points in functional connectivity and estimates a graph, or set of relationships between ROIs, for data in the temporal partition that falls between pairs of change points. Hence, DCR allows for estimation of both the time of change in connectivity and the connectivity graph for each partition, without requiring prior knowledge of the nature of the experimental design. Permutation and bootstrapping methods are used to perform inference on the change points. The method is applied to various simulated data sets as well as to an fMRI data set from a study (N=26) of a state anxiety induction using a socially evaluative threat challenge. The results illustrate the method’s ability to observe how the networks between different brain regions changed with subjects’ emotional state. PMID:22484408

  9. A comparison of model-based imputation methods for handling missing predictor values in a linear regression model: A simulation study

    NASA Astrophysics Data System (ADS)

    Hasan, Haliza; Ahmad, Sanizah; Osman, Balkish Mohd; Sapri, Shamsiah; Othman, Nadirah

    2017-08-01

    In regression analysis, missing covariate data has been a common problem. Many researchers use ad hoc methods to overcome this problem due to the ease of implementation. However, these methods require assumptions about the data that rarely hold in practice. Model-based methods such as Maximum Likelihood (ML) using the expectation maximization (EM) algorithm and Multiple Imputation (MI) are more promising when dealing with difficulties caused by missing data. Then again, inappropriate methods of missing value imputation can lead to serious bias that severely affects the parameter estimates. The main objective of this study is to provide a better understanding of missing data concepts that can assist the researcher in selecting appropriate missing data imputation methods. A simulation study was performed to assess the effects of different missing data techniques on the performance of a regression model. The covariate data were generated using an underlying multivariate normal distribution and the dependent variable was generated as a combination of explanatory variables. Missing values in the covariates were simulated using a mechanism called missing at random (MAR). Four levels of missingness (10%, 20%, 30% and 40%) were imposed. ML and MI techniques available within SAS software were investigated. A linear regression model was fitted and the model performance measures, MSE and R-squared, were obtained. Results of the analysis showed that MI is superior in handling missing data, with the highest R-squared and lowest MSE, when the percentage of missingness is less than 30%. Neither method is able to handle levels of missingness larger than 30%.

  10. Functional Relationships and Regression Analysis.

    ERIC Educational Resources Information Center

    Preece, Peter F. W.

    1978-01-01

    Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…

  11. Value of Information Analysis for Time-lapse Seismic Data by Simulation-Regression

    NASA Astrophysics Data System (ADS)

    Dutta, G.; Mukerji, T.; Eidsvik, J.

    2016-12-01

    A novel method to estimate the Value of Information (VOI) of time-lapse seismic data in the context of reservoir development is proposed. VOI is a decision analytic metric quantifying the incremental value that would be created by collecting information prior to making a decision under uncertainty. The VOI has to be computed before collecting the information and can be used to justify its collection. Previous work on estimating the VOI of geophysical data has involved explicit approximation of the posterior distribution of reservoir properties given the data and then evaluating the prospect values for that posterior distribution of reservoir properties. Here, we propose to directly estimate the prospect values given the data by building a statistical relationship between them using regression. Various regression techniques such as Partial Least Squares Regression (PLSR), Multivariate Adaptive Regression Splines (MARS) and k-Nearest Neighbors (k-NN) are used to estimate the VOI, and the results compared. For a univariate Gaussian case, the VOI obtained from simulation-regression has been shown to be close to the analytical solution. Estimating VOI by simulation-regression is much less computationally expensive since the posterior distribution of reservoir properties given each possible dataset need not be modeled and the prospect values need not be evaluated for each such posterior distribution of reservoir properties. This method is flexible, since it does not require rigid model specification of posterior but rather fits conditional expectations non-parametrically from samples of values and data.
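
    The simulation-regression idea and its univariate Gaussian check can be sketched in a few lines. The sketch makes several assumptions that are not in the record: a single prospect with prior value v ~ N(mu, sigma^2), one datum d = v + noise with noise standard deviation tau, a decision rule that develops the prospect only when its expected value is positive, and ordinary linear regression standing in for the PLSR, MARS and k-NN regressors named in the abstract (for jointly Gaussian v and d the conditional mean is exactly linear). All function names are illustrative.

```python
import math

import numpy as np


def voi_simulation_regression(mu, sigma, tau, n_samples=200_000, seed=0):
    """Estimate VOI by regressing simulated values on simulated data."""
    rng = np.random.default_rng(seed)
    value = rng.normal(mu, sigma, n_samples)        # simulated prospect values
    data = value + rng.normal(0.0, tau, n_samples)  # simulated noisy data
    # Fit E[value | data] with a straight line (stand-in regressor).
    slope, intercept = np.polyfit(data, value, 1)
    conditional_mean = intercept + slope * data
    # With information: decide per dataset; without: decide once on the prior.
    posterior_value = float(np.mean(np.maximum(conditional_mean, 0.0)))
    prior_value = max(float(np.mean(value)), 0.0)
    return posterior_value - prior_value


def voi_analytical(mu, sigma, tau):
    """Closed-form VOI for the same Gaussian setup."""
    # The posterior mean is Gaussian with mean mu and standard deviation s.
    s = sigma**2 / math.sqrt(sigma**2 + tau**2)
    z = mu / s
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return mu * cdf + s * pdf - max(mu, 0.0)


estimate = voi_simulation_regression(mu=0.5, sigma=1.0, tau=1.0)
exact = voi_analytical(mu=0.5, sigma=1.0, tau=1.0)
print(f"simulation-regression: {estimate:.3f}, analytical: {exact:.3f}")
```

    The posterior distribution is never modeled explicitly here; only the conditional expectation is fitted from samples, which is the computational saving the abstract describes.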

  12. Data analysis techniques

    NASA Technical Reports Server (NTRS)

    Park, Steve

    1990-01-01

    A large and diverse number of computational techniques are routinely used to process and analyze remotely sensed data. These techniques include: univariate statistics; multivariate statistics; principal component analysis; pattern recognition and classification; other multivariate techniques; geometric correction; registration and resampling; radiometric correction; enhancement; restoration; Fourier analysis; and filtering. Each of these techniques will be considered, in order.

  13. A multivariate geostatistical methodology to delineate areas of potential interest for future sedimentary gold exploration

    PubMed Central

    Goovaerts, P.; Albuquerque, Teresa; Antunes, Margarida

    2015-01-01

    This paper describes a multivariate geostatistical methodology to delineate areas of potential interest for future sedimentary gold exploration, with an application to an abandoned sedimentary gold mining region in Portugal. The main challenge was the existence of only a dozen gold measurements confined to the grounds of the old gold mines, which precluded the application of traditional interpolation techniques, such as cokriging. The analysis could, however, capitalize on 376 stream sediment samples that were analyzed for twenty-two elements. Gold (Au) was first predicted at all 376 locations using linear regression (R² = 0.798) and four metals (Fe, As, Sn and W), which are known to be mostly associated with the local gold’s paragenesis. One hundred realizations of the spatial distribution of gold content were generated using sequential indicator simulation and a soft indicator coding of regression estimates, to supplement the hard indicator coding of gold measurements. Each simulated map then underwent a local cluster analysis to identify significant aggregates of low or high values. The one hundred classified maps were processed to derive the most likely classification of each simulated node and the associated probability of occurrence. Examining the distribution of the hot-spots and cold-spots reveals a clear enrichment in Au along the Erges River downstream from the old sedimentary mineralization. PMID:27777638

  14. Timescale dependence of environmental controls on methane efflux from Poyang Hu, China

    NASA Astrophysics Data System (ADS)

    Liu, Lixiang; Xu, Ming; Li, Renqiang; Shao, Rui

    2017-04-01

    Lakes are an important natural source of CH4 to the atmosphere. However, the multi-seasonal CH4 efflux from lakes has been rarely studied. In this study, the CH4 efflux from Poyang Hu, the largest freshwater lake in China, was measured monthly over a 4-year period by using the floating-chamber technique. The mean annual CH4 efflux throughout the 4 years was 0.54 mmol m⁻² day⁻¹, ranging from 0.47 to 0.60 mmol m⁻² day⁻¹. The CH4 efflux had a high seasonal variation with an average summer (June to August) efflux of 1.34 mmol m⁻² day⁻¹ and winter (December to February) efflux of merely 0.18 mmol m⁻² day⁻¹. The efflux showed no apparent diel pattern, although most of the peak effluxes appeared in the late morning, from 10:00 to 12:00 CST (GMT+8). Multivariate stepwise regression on a seasonal scale showed that environmental factors, such as sediment temperature, sediment total nitrogen content, dissolved oxygen, and total phosphorus content in the water, mainly regulated the CH4 efflux. However, the CH4 efflux only showed a strong positive linear correlation with wind speed within 1 day on a bihourly scale in the multivariate regression analyses but almost no correlation with wind speed on diurnal and seasonal scales.

  15. Arsenic health risk assessment in drinking water and source apportionment using multivariate statistical techniques in Kohistan region, northern Pakistan.

    PubMed

    Muhammad, Said; Tahir Shah, M; Khan, Sardar

    2010-10-01

    The present study was conducted in Kohistan region, where mafic and ultramafic rocks (Kohistan island arc and Indus suture zone) and metasedimentary rocks (Indian plate) are exposed. Water samples were collected from the springs, streams and Indus river and analyzed for physical parameters, anions, cations and arsenic (As³⁺, As⁵⁺ and total arsenic). The water quality in Kohistan region was evaluated by comparing the physico-chemical parameters with permissible limits set by the Pakistan Environmental Protection Agency and the World Health Organization. Most of the studied parameters were found within their respective permissible limits. However, in some samples, the iron and arsenic concentrations exceeded their permissible limits. For health risk assessment of arsenic, the average daily dose, hazard quotient (HQ) and cancer risk were calculated by using statistical formulas. The values of HQ were found >1 in the samples collected from Jabba, Dubair, while HQ values were <1 in the rest of the samples. This level of contamination should have low chronic risk and medium cancer risk when compared with US EPA guidelines. Furthermore, the inter-dependence of physico-chemical parameters and pollution load was also calculated by using multivariate statistical techniques like one-way ANOVA, correlation analysis, regression analysis, cluster analysis and principal component analysis. Copyright © 2010 Elsevier Ltd. All rights reserved.
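
    The abstract does not state the formulas behind the average daily dose, hazard quotient and cancer risk. The sketch below uses the ingestion-pathway forms that are commonly applied in drinking-water risk assessments: dose = concentration x intake / body weight, HQ = dose / reference dose, and cancer risk = dose x slope factor. The default intake (2 L/day), body weight (70 kg), reference dose (3.0e-4 mg/kg/day) and slope factor (1.5 per mg/kg/day) are assumptions chosen for illustration and may differ from the values used in the study, as is the example concentration.

```python
def average_daily_dose(conc_mg_per_l, intake_l_per_day=2.0, body_weight_kg=70.0):
    """Average daily dose in mg/kg/day from drinking-water ingestion.
    Intake and body weight defaults are assumed values."""
    return conc_mg_per_l * intake_l_per_day / body_weight_kg


def hazard_quotient(dose_mg_per_kg_day, reference_dose=3.0e-4):
    """Non-cancer hazard quotient; HQ > 1 flags potential chronic risk.
    The reference dose default is an assumed value."""
    return dose_mg_per_kg_day / reference_dose


def cancer_risk(dose_mg_per_kg_day, slope_factor=1.5):
    """Incremental lifetime cancer risk; the slope factor is an assumed value."""
    return dose_mg_per_kg_day * slope_factor


# Illustrative input only: 0.010 mg/L of total arsenic.
dose = average_daily_dose(0.010)
hq = hazard_quotient(dose)
risk = cancer_risk(dose)
print(f"ADD = {dose:.2e} mg/kg/day, HQ = {hq:.2f}, cancer risk = {risk:.1e}")
```

    With these assumed parameters the example concentration gives an HQ just below 1, which shows how samples only slightly above that concentration would cross the HQ > 1 threshold mentioned in the abstract.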

  16. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    PubMed

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  17. Proton magnetic resonance spectroscopy for assessment of human body composition.

    PubMed

    Kamba, M; Kimura, K; Koda, M; Ogawa, T

    2001-02-01

    The usefulness of magnetic resonance spectroscopy (MRS)-based techniques for assessment of human body composition has not been established. We compared a proton MRS-based technique with the total body water (TBW) method to determine the usefulness of the former technique for assessment of human body composition. Proton magnetic resonance spectra of the chest to abdomen, abdomen to pelvis, and pelvis to thigh regions were obtained from 16 volunteers by using single, free induction decay measurement with a clinical magnetic resonance system operating at 1.5 T. The MRS-derived metabolite ratio was determined as the ratio of fat methyl and methylene proton resonance to water proton resonance. The peak areas for the chest to abdomen and the pelvis to thigh regions were normalized to an external reference (approximately 2200 g benzene) and a weighted average of the MRS-derived metabolite ratios for the 2 positions was calculated. TBW for each subject was determined by the deuterium oxide dilution technique. The MRS-derived metabolite ratios were significantly correlated with the ratio of body fat to lean body mass estimated by TBW. The MRS-derived metabolite ratio for the abdomen to pelvis region correlated best with the ratio of body fat to lean body mass on simple regression analyses (r = 0.918). The MRS-derived metabolite ratio for the abdomen to pelvis region and that for the pelvis to thigh region were selected for a multivariate regression model (R = 0.947, adjusted R² = 0.881). This MRS-based technique is sufficiently accurate for assessment of human body composition.

  18. Analysis techniques for multivariate root loci. [a tool in linear control systems]

    NASA Technical Reports Server (NTRS)

    Thompson, P. M.; Stein, G.; Laub, A. J.

    1980-01-01

    Analysis and techniques are developed for the multivariable root locus and the multivariable optimal root locus. The generalized eigenvalue problem is used to compute angles and sensitivities for both types of loci, and an algorithm is presented that determines the asymptotic properties of the optimal root locus.

  19. Spectral Mining for Discriminating Blood Origins in the Presence of Substrate Interference via Attenuated Total Reflection Fourier Transform Infrared Spectroscopy: Postmortem or Antemortem Blood?

    PubMed

    Takamura, Ayari; Watanabe, Ken; Akutsu, Tomoko; Ikegaya, Hiroshi; Ozawa, Takeaki

    2017-09-19

    Often in criminal investigations, discrimination of types of body fluid evidence is crucially important to ascertain how a crime was committed. Compared to current methods using biochemical techniques, vibrational spectroscopic approaches can provide versatile applicability to identify various body fluid types without sample invasion. However, their applicability is limited to pure body fluid samples because important signals from body fluids incorporated in a substrate are affected strongly by interference from substrate signals. Herein, we describe a novel approach to recover body fluid signals that are embedded in strong substrate interferences using attenuated total reflection Fourier transform infrared (ATR FT-IR) spectroscopy and an innovative multivariate spectral processing. This technique supported detection of covert features of body fluid signals, and then identified origins of body fluid stains on substrates. We discriminated between ATR FT-IR spectra of postmortem blood (PB) and those of antemortem blood (AB) by creating a multivariate statistics model. From ATR FT-IR spectra of PB and AB stains on interfering substrates (polyester, cotton, and denim), blood-originated signals were extracted by a weighted linear regression approach we developed originally using principal components of both blood and substrate spectra. The blood-originated signals were finally classified by the discriminant model, demonstrating high discriminant accuracy. The present method can identify body fluid evidence independently of the substrate type, which is expected to promote the application of vibrational spectroscopic techniques in forensic body fluid analysis.

  20. Flood-frequency prediction methods for unregulated streams of Tennessee, 2000

    USGS Publications Warehouse

    Law, George S.; Tasker, Gary D.

    2003-01-01

    Up-to-date flood-frequency prediction methods for unregulated, ungaged rivers and streams of Tennessee have been developed. Prediction methods include the regional-regression method and the newer region-of-influence method. The prediction methods were developed using stream-gage records from unregulated streams draining basins having from 1 percent to about 30 percent total impervious area. These methods, however, should not be used in heavily developed or storm-sewered basins with impervious areas greater than 10 percent. The methods can be used to estimate 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence-interval floods of most unregulated rural streams in Tennessee. A computer application was developed that automates the calculation of flood frequency for unregulated, ungaged rivers and streams of Tennessee. Regional-regression equations were derived by using both single-variable and multivariable regional-regression analysis. Contributing drainage area is the explanatory variable used in the single-variable equations. Contributing drainage area, main-channel slope, and a climate factor are the explanatory variables used in the multivariable equations. Deleted-residual standard error for the single-variable equations ranged from 32 to 65 percent. Deleted-residual standard error for the multivariable equations ranged from 31 to 63 percent. These equations are included in the computer application to allow easy comparison of results produced by the different methods. The region-of-influence method calculates multivariable regression equations for each ungaged site and recurrence interval using basin characteristics from 60 similar sites selected from the study area. Explanatory variables that may be used in regression equations computed by the region-of-influence method include contributing drainage area, main-channel slope, a climate factor, and a physiographic-region factor. Deleted-residual standard error for the region-of-influence method tended to be only slightly smaller than those for the regional-regression method and ranged from 27 to 62 percent.

  1. Multivariate logistic regression analysis of postoperative complications and risk model establishment of gastrectomy for gastric cancer: A single-center cohort report.

    PubMed

    Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing

    2016-01-01

    Reporting of surgical complications is common, but few reports provide information about severity or estimate risk factors for complications, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors related to postoperative complications according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis; 11 significant variables that entered the multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation p = exp(ΣBiXi) / (1 + exp(ΣBiXi)), a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: p = 1/(1 + e^(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo complication severity grading system and logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, can estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
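    The risk equation quoted in this abstract can be evaluated directly. The following is a minimal sketch rather than the authors' implementation: the intercept and coefficients are copied from the abstract, while the function name, the 0/1 coding of X1 to X11 and the two example patients are assumptions made purely for illustration (the abstract does not state how each predictor is coded or which risk factor corresponds to which Xi).

```python
import math

# Intercept and coefficients copied from the equation in the abstract.
# The exponent is written there as 4.810 - sum(b_i * X_i), so each b_i is
# stored as a positive number and subtracted.
INTERCEPT = 4.810
COEFFS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
          0.316, 0.305, 0.278, 0.255, 0.138]


def predicted_risk(x):
    """Return p = 1 / (1 + e^(4.810 - sum(b_i * x_i))) for 11 predictor values."""
    if len(x) != len(COEFFS):
        raise ValueError("expected 11 predictor values")
    exponent = INTERCEPT - sum(b * xi for b, xi in zip(COEFFS, x))
    return 1.0 / (1.0 + math.exp(exponent))


# Hypothetical patients; the 0/1 coding is an assumption, not from the abstract.
no_risk_factors = [0] * 11
all_risk_factors = [1] * 11
print("no risk factors: ", round(predicted_risk(no_risk_factors), 4))
print("all risk factors:", round(predicted_risk(all_risk_factors), 4))
```

    Under that assumed coding, the predicted probability rises monotonically as more predictors are set to 1, because every coefficient enters the exponent with a negative sign.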

  2. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  3. Experiments to Determine Whether Recursive Partitioning (CART) or an Artificial Neural Network Overcomes Theoretical Limitations of Cox Proportional Hazards Regression

    NASA Technical Reports Server (NTRS)

    Kattan, Michael W.; Hess, Kenneth R.

    1998-01-01

    New computationally intensive tools for medical survival analyses include recursive partitioning (also called CART) and artificial neural networks. A challenge that remains is to better understand the behavior of these techniques in an effort to know when they will be effective tools. Theoretically they may overcome limitations of the traditional multivariable survival technique, the Cox proportional hazards regression model. Experiments were designed to test whether the new tools would, in practice, overcome these limitations. Two datasets in which theory suggests CART and the neural network should outperform the Cox model were selected. The first was a published leukemia dataset manipulated to have a strong interaction that CART should detect. The second was a published cirrhosis dataset with pronounced nonlinear effects that a neural network should fit. Repeated sampling of 50 training and testing subsets was applied to each technique. The concordance index C was calculated as a measure of predictive accuracy by each technique on the testing dataset. In the interaction dataset, CART outperformed Cox (P < 0.05) with a C improvement of 0.1 (95% CI, 0.08 to 0.12). In the nonlinear dataset, the neural network outperformed the Cox model (P < 0.05), but by a very slight amount (0.015). As predicted by theory, CART and the neural network were able to overcome limitations of the Cox model. Experiments like these are important to increase our understanding of when one of these new techniques will outperform the standard Cox model. Further research is necessary to predict which technique will do best a priori and to assess the magnitude of superiority.
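    The concordance index C used here as the accuracy measure can be illustrated with a short sketch. It assumes the usual Harrell-style definition for right-censored data (the abstract does not spell out the exact variant used), and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of usable pairs in which the higher-risk subject fails first.

    times:        observed follow-up times
    events:       1 if the event was observed, 0 if censored
    risk_scores:  higher score = higher predicted risk
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when subject i has an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical toy data: five subjects, two of them censored.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.7, 0.3, 0.1]
print(concordance_index(times, events, risk))
```

    A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a sense of scale for the 0.1 and 0.015 improvements reported in the abstract.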

  4. Causal diagrams and multivariate analysis II: precision work.

    PubMed

    Jupiter, Daniel C

    2014-01-01

    In this Investigators' Corner, I continue my discussion of when and why we researchers should include variables in multivariate regression. My examination focuses on studies comparing treatment groups and situations for which we can either exclude variables from multivariate analyses or include them for reasons of precision. Copyright © 2014 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  5. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis

    PubMed Central

    Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using the partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (standardized β = -0.29 to -0.19; p<0.001). Scatter plots of the scores of the first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than ordinary least squares regression and provided the effect measure of the covariates with greater accuracy, as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when there are multiple outcome measurements in the same study, when the covariates are correlated, or when both situations exist in a dataset. PMID:29261760

  6. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis.

    PubMed

    Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using the partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (standardized β = -0.29 to -0.19; p<0.001). Scatter plots of the scores of the first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than ordinary least squares regression and provided the effect measure of the covariates with greater accuracy, as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when there are multiple outcome measurements in the same study, when the covariates are correlated, or when both situations exist in a dataset.

  7. Relations among soil radon, environmental parameters, volcanic and seismic events at Mt. Etna (Italy)

    NASA Astrophysics Data System (ADS)

    Giammanco, S.; Ferrera, E.; Cannata, A.; Montalto, P.; Neri, M.

    2013-12-01

    From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol probe located on the upper NE flank of Mt. Etna volcano (Italy), close both to the Piano Provenzana fault and to the NE-Rift. Seismic, volcanological and radon data were analysed together with data on environmental parameters, such as air and soil temperature, barometric pressure, snowfall and rainfall. In order to find possible correlations among the above parameters, and hence to reveal possible anomalous trends in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R² = 0.31). When using 100-day time windows, the R² values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, similar to multivariate linear regression analysis, the summer period was characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the time series acquired. This approach made it possible to study the relations among different signals in either the time or the frequency domain. It confirmed the results of the previous methods, and also revealed correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitation, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Using the above analysis, two periods were recognized when radon variations were significantly correlated with marked soil temperature changes and also with local seismic or volcanic activity. This made it possible to produce two different physical models of soil gas transport that explain the observed anomalies. Our work suggests that in order to make an accurate analysis of the relations among different signals it is necessary to use different techniques that give complementary analytical information. In particular, the wavelet analysis proved to be the most effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.
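    The windowed R² analysis described in this abstract can be sketched as follows. This is a simplification and not the authors' procedure: it uses a single predictor (so R² reduces to the squared Pearson correlation) instead of the full multivariate regression, steps the window one sample at a time, and runs on invented numbers; all names are hypothetical.

```python
def r_squared(x, y):
    """R^2 of a one-predictor least-squares fit (squared Pearson correlation)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series explains nothing
    return (sxy * sxy) / (sxx * syy)


def rolling_r_squared(x, y, window):
    """R^2 in each window of `window` consecutive samples, stepping by one."""
    return [r_squared(x[s:s + window], y[s:s + window])
            for s in range(len(x) - window + 1)]


# Invented series: the response tracks the predictor for the first five
# samples and is unrelated to it afterwards.
soil_temperature = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
radon_activity = [1, 3, 5, 7, 9, 3, 9, 1, 8, 2]
for start, value in enumerate(rolling_r_squared(soil_temperature, radon_activity, 5)):
    print(start, round(value, 3))
```

    Plotting the windowed values against time is one way to see the kind of seasonal variation in explanatory power that the abstract reports.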

  8. Why are we regressing?

    PubMed

    Jupiter, Daniel C

    2012-01-01

    In this first of a series of statistical methodology commentaries for the clinician, we discuss the use of multivariate linear regression. Copyright © 2012 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  9. Prehospital helicopter transport and survival of patients with traumatic brain injury.

    PubMed

    Bekelis, Kimon; Missios, Symeon; Mackenzie, Todd A

    2015-03-01

    To investigate the association of helicopter transport with survival of patients with traumatic brain injury (TBI), in comparison with ground emergency medical services (EMS). Helicopter utilization and its effect on the outcomes of TBI remain controversial. We performed a retrospective cohort study involving patients with TBI who were registered in the National Trauma Data Bank between 2009 and 2011. Regression techniques with propensity score matching were used to investigate the association of helicopter transport with survival of patients with TBI, in comparison with ground EMS. During the study period, there were 209,529 patients with TBI who were registered in the National Trauma Data Bank and met the inclusion criteria. Of these patients, 35,334 were transported via helicopters and 174,195 via ground EMS. For patients transported to level I trauma centers, 2797 deaths (12%) were recorded after helicopter transport and 8161 (7.8%) after ground EMS. Multivariable logistic regression analysis demonstrated an association of helicopter transport with increased survival [OR (odds ratio), 1.95; 95% confidence interval (CI), 1.81-2.10; absolute risk reduction (ARR), 6.37%]. This persisted after propensity score matching (OR, 1.88; 95% CI, 1.74-2.03; ARR, 5.93%). For patients transported to level II trauma centers, 1282 deaths (10.6%) were recorded after helicopter transport and 5097 (7.3%) after ground EMS. Multivariable logistic regression analysis demonstrated an association of helicopter transport with increased survival (OR, 1.81; 95% CI, 1.64-2.00; ARR 5.17%). This again persisted after propensity score matching (OR, 1.73; 95% CI, 1.55-1.94; ARR, 4.69). Helicopter transport of patients with TBI to level I and II trauma centers was associated with improved survival, in comparison with ground EMS.

  10. Prehospital Helicopter Transport and Survival of Patients With Traumatic Brain Injury

    PubMed Central

    Mackenzie, Todd A.

    2015-01-01

    Objective To investigate the association of helicopter transport with survival of patients with traumatic brain injury (TBI), in comparison with ground emergency medical services (EMS). Background Helicopter utilization and its effect on the outcomes of TBI remain controversial. Methods We performed a retrospective cohort study involving patients with TBI who were registered in the National Trauma Data Bank between 2009 and 2011. Regression techniques with propensity score matching were used to investigate the association of helicopter transport with survival of patients with TBI, in comparison with ground EMS. Results During the study period, there were 209,529 patients with TBI who were registered in the National Trauma Data Bank and met the inclusion criteria. Of these patients, 35,334 were transported via helicopters and 174,195 via ground EMS. For patients transported to level I trauma centers, 2797 deaths (12%) were recorded after helicopter transport and 8161 (7.8%) after ground EMS. Multivariable logistic regression analysis demonstrated an association of helicopter transport with increased survival [OR (odds ratio), 1.95; 95% confidence interval (CI), 1.81–2.10; absolute risk reduction (ARR), 6.37%]. This persisted after propensity score matching (OR, 1.88; 95% CI, 1.74–2.03; ARR, 5.93%). For patients transported to level II trauma centers, 1282 deaths (10.6%) were recorded after helicopter transport and 5097 (7.3%) after ground EMS. Multivariable logistic regression analysis demonstrated an association of helicopter transport with increased survival (OR, 1.81; 95% CI, 1.64–2.00; ARR 5.17%). This again persisted after propensity score matching (OR, 1.73; 95% CI, 1.55–1.94; ARR, 4.69). Conclusions Helicopter transport of patients with TBI to level I and II trauma centers was associated with improved survival, in comparison with ground EMS. PMID:24743624

  11. Preoperative predictive model of recovery of urinary continence after radical prostatectomy.

    PubMed

    Matsushita, Kazuhito; Kent, Matthew T; Vickers, Andrew J; von Bodman, Christian; Bernstein, Melanie; Touijer, Karim A; Coleman, Jonathan A; Laudone, Vincent T; Scardino, Peter T; Eastham, James A; Akin, Oguz; Sandhu, Jaspreet S

    2015-10-01

    To build a predictive model of urinary continence recovery after radical prostatectomy (RP) that incorporates magnetic resonance imaging (MRI) parameters and clinical data. We conducted a retrospective review of data from 2,849 patients who underwent pelvic staging MRI before RP from November 2001 to June 2010. We used logistic regression to evaluate the association between each MRI variable and continence at 6 or 12 months, adjusting for age, body mass index (BMI) and American Society of Anesthesiologists (ASA) score, and then used multivariable logistic regression to create our model. A nomogram was constructed using the multivariable logistic regression models. In all, 68% (1,742/2,559) and 82% (2,205/2,689) regained function at 6 and 12 months, respectively. In the base model, age, BMI and ASA score were significant predictors of continence at 6 or 12 months on univariate analysis (P < 0.005). Among the preoperative MRI measurements, membranous urethral length, which showed great significance, was incorporated into the base model to create the full model. For continence recovery at 6 months, the addition of membranous urethral length increased the area under the curve (AUC) to 0.664 for the validation set, an increase of 0.064 over the base model. For continence recovery at 12 months, the AUC was 0.674, an increase of 0.085 over the base model. Using our model, the likelihood of continence recovery increases with membranous urethral length and decreases with age, BMI and ASA score. This model could be used for patient counselling and for the identification of patients at high risk for urinary incontinence in whom to study changes in operative technique that improve urinary function after RP. © 2015 The Authors BJU International © 2015 BJU International Published by John Wiley & Sons Ltd.

  12. A retrospective study: Multivariate logistic regression analysis of the outcomes after pressure sores reconstruction with fasciocutaneous, myocutaneous, and perforator flaps.

    PubMed

    Chiu, Yu-Jen; Liao, Wen-Chieh; Wang, Tien-Hsiang; Shih, Yu-Chung; Ma, Hsu; Lin, Chih-Hsun; Wu, Szu-Hsien; Perng, Cherng-Kang

    2017-08-01

    Despite significant advances in medical care and surgical techniques, pressure sore reconstruction is still prone to elevated rates of complication and recurrence. We conducted a retrospective study to investigate not only complication and recurrence rates following pressure sore reconstruction but also preoperative risk stratification. This study included 181 ulcers that underwent flap operations between January 2002 and December 2013. We fitted a multivariable logistic regression model, which offers a regression-based method accounting for the within-patient correlation of the success or failure of each flap. The overall complication and recurrence rates for all flaps were 46.4% and 16.0%, respectively, with a mean follow-up period of 55.4 ± 38.0 months. No statistically significant differences in complication and recurrence rates were observed among the three reconstruction methods. In subsequent analysis, albumin ≤3.0 g/dl and paraplegia were significantly associated with higher postoperative complication rates. The anatomic factor, ischial wound location, significantly trended toward the development of ulcer recurrence. In the fasciocutaneous group, paraplegia was significantly correlated with higher complication and recurrence rates. In the musculocutaneous flap group, no variables were significantly correlated with complication and recurrence rates. In the free-style perforator group, ischial wound location and malnourished status correlated with significantly higher complication rates; ischial wound location also correlated with a significantly higher recurrence rate. Ultimately, our review of a noteworthy cohort with lengthy follow-up helped identify and confirm certain risk factors that can facilitate a more informed and thoughtful pre- and postoperative decision-making process for patients with pressure ulcers. Copyright © 2017 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.

  13. Are conventional statistical techniques exhaustive for defining metal background concentrations in harbour sediments? A case study: The Coastal Area of Bari (Southeast Italy).

    PubMed

    Mali, Matilda; Dell'Anna, Maria Michela; Mastrorilli, Piero; Damiani, Leonardo; Ungaro, Nicola; Belviso, Claudia; Fiore, Saverio

    2015-11-01

    Sediment contamination by metals poses significant risks to coastal ecosystems and is considered to be problematic for dredging operations. The determination of the background values of metal and metalloid distribution based on site-specific variability is fundamental in assessing pollution levels in harbour sediments. The novelty of the present work consists of addressing the scope and limitation of analysing port sediments through the use of conventional statistical techniques (such as: linear regression analysis, construction of cumulative frequency curves and the iterative 2σ technique), that are commonly employed for assessing Regional Geochemical Background (RGB) values in coastal sediments. This study ascertained that although the tout court use of such techniques in determining the RGB values in harbour sediments seems appropriate (the chemical-physical parameters of port sediments fit well with statistical equations), it should nevertheless be avoided because it may be misleading and can mask key aspects of the study area that can only be revealed by further investigations, such as mineralogical and multivariate statistical analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. Fast detection and visualization of minced lamb meat adulteration using NIR hyperspectral imaging and multivariate image analysis.

    PubMed

    Kamruzzaman, Mohammed; Sun, Da-Wen; ElMasry, Gamal; Allen, Paul

    2013-01-15

    Many studies have been carried out to develop non-destructive technologies for predicting meat adulteration, but non-destructive detection and quantification of adulteration in minced lamb meat has not yet been attempted. The main goal of this study was to develop and optimize a rapid analytical technique based on near-infrared (NIR) hyperspectral imaging to detect the level of adulteration in minced lamb. Initial investigation was carried out using principal component analysis (PCA) to identify the most likely adulterant in minced lamb. Minced lamb meat samples were then adulterated with minced pork in the range 2-40% (w/w) at approximately 2% increments. Spectral data were used to develop a partial least squares regression (PLSR) model to predict the level of adulteration in minced lamb. A good prediction model was obtained using the whole spectral range (910-1700 nm), with a cross-validated coefficient of determination (R²cv) of 0.99 and a root-mean-square error estimated by cross validation (RMSECV) of 1.37%. Four important wavelengths (940, 1067, 1144 and 1217 nm) were selected using weighted regression coefficients (Bw), and a multiple linear regression (MLR) model was then established using these wavelengths to predict adulteration. The MLR model resulted in an R²cv of 0.98 and an RMSECV of 1.45%. The developed MLR model was then applied to each pixel in the image to obtain prediction maps that visualize the distribution of adulteration in the tested samples. The results demonstrated that laborious and time-consuming traditional analytical techniques could be replaced by spectral data in order to provide a rapid, low-cost and non-destructive testing technique for adulterant detection in minced lamb meat. Copyright © 2012 Elsevier B.V. All rights reserved.
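    The RMSECV figure reported for the MLR model can be illustrated with a small sketch. The abstract does not say which cross-validation scheme was used, so leave-one-out is assumed here; the helper names are hypothetical, the data are invented, and only two predictor columns are used to keep the example short (the code itself accepts any number of columns, such as the four selected wavelengths).

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_mlr(X, y):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, x):
    """Evaluate intercept + sum(b_i * x_i) for one sample."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def rmsecv_loo(X, y):
    """Root-mean-square error of leave-one-out cross-validated predictions."""
    sq = 0.0
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        sq += (predict(coef, X[i]) - y[i]) ** 2
    return math.sqrt(sq / len(y))


# Invented absorbances at two wavelengths and adulteration levels (% w/w).
X = [[0.10, 0.50], [0.20, 0.30], [0.40, 0.90],
     [0.50, 0.20], [0.70, 0.60], [0.90, 0.10]]
y = [4.1, 6.2, 9.8, 12.1, 15.9, 20.2]
print("RMSECV:", round(rmsecv_loo(X, y), 3))
```

    Because each sample is predicted by a model that never saw it, RMSECV is usually somewhat larger than the error of a model fitted and evaluated on the same data.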

  15. Neuroanatomical morphometric characterization of sex differences in youth using statistical learning.

    PubMed

    Sepehrband, Farshid; Lynch, Kirsten M; Cabeen, Ryan P; Gonzalez-Zacarias, Clio; Zhao, Lu; D'Arcy, Mike; Kesselman, Carl; Herting, Megan M; Dinov, Ivo D; Toga, Arthur W; Clark, Kristi A

    2018-05-15

    Exploring neuroanatomical sex differences using a multivariate statistical learning approach can yield insights that cannot be derived with univariate analysis. While gross differences in total brain volume are well-established, uncovering the more subtle, regional sex-related differences in neuroanatomy requires a multivariate approach that can accurately model spatial complexity as well as the interactions between neuroanatomical features. Here, we developed a multivariate statistical learning model using a support vector machine (SVM) classifier to predict sex from MRI-derived regional neuroanatomical features from a single-site study of 967 healthy youth from the Philadelphia Neurodevelopmental Cohort (PNC). Then, we validated the multivariate model on an independent dataset of 682 healthy youth from the multi-site Pediatric Imaging, Neurocognition and Genetics (PING) cohort study. The trained model exhibited an 83% cross-validated prediction accuracy, and correctly predicted the sex of 77% of the subjects from the independent multi-site dataset. Results showed that cortical thickness of the middle occipital lobes and the angular gyri are major predictors of sex. Results also demonstrated the inferential benefits of going beyond classical regression approaches to capture the interactions among brain features in order to better characterize sex differences in male and female youths. We also identified specific cortical morphological measures and parcellation techniques, such as cortical thickness as derived from the Destrieux atlas, that are better able to discriminate between males and females in comparison to other brain atlases (Desikan-Killiany, Brodmann and subcortical atlases). Copyright © 2018 Elsevier Inc. All rights reserved.

  16. Weighing of risk factors for penetrating keratoplasty graft failure: application of Risk Score System.

    PubMed

    Tourkmani, Abdo Karim; Sánchez-Huerta, Valeria; De Wit, Guillermo; Martínez, Jaime D; Mingo, David; Mahillo-Fernández, Ignacio; Jiménez-Alfaro, Ignacio

    2017-01-01

    To analyze the relationship between the score obtained in the Risk Score System (RSS) proposed by Hicks et al with penetrating keratoplasty (PKP) graft failure at 1y postoperatively and among each factor in the RSS with the risk of PKP graft failure using univariate and multivariate analysis. The retrospective cohort study had 152 PKPs from 152 patients. Eighteen cases were excluded from our study due to primary failure (10 cases), incomplete medical notes (5 cases) and follow-up less than 1y (3 cases). We included 134 PKPs from 134 patients stratified by preoperative risk score. Spearman coefficient was calculated for the relationship between the score obtained and risk of failure at 1y. Univariate and multivariate analysis were calculated for the impact of every single risk factor included in the RSS over graft failure at 1y. Spearman coefficient showed statistically significant correlation between the score in the RSS and graft failure ( P <0.05). Multivariate logistic regression analysis showed no statistically significant relationship ( P >0.05) between diagnosis and lens status with graft failure. The relationship between the other risk factors studied and graft failure was significant ( P <0.05), although the results for previous grafts and graft failure was unreliable. None of our patients had previous blood transfusion, thus, it had no impact. After the application of multivariate analysis techniques, some risk factors do not show the expected impact over graft failure at 1y.

  17. Weighing of risk factors for penetrating keratoplasty graft failure: application of Risk Score System

    PubMed Central

    Tourkmani, Abdo Karim; Sánchez-Huerta, Valeria; De Wit, Guillermo; Martínez, Jaime D.; Mingo, David; Mahillo-Fernández, Ignacio; Jiménez-Alfaro, Ignacio

    2017-01-01

    AIM To analyze the relationship between the score obtained in the Risk Score System (RSS) proposed by Hicks et al with penetrating keratoplasty (PKP) graft failure at 1y postoperatively and among each factor in the RSS with the risk of PKP graft failure using univariate and multivariate analysis. METHODS The retrospective cohort study had 152 PKPs from 152 patients. Eighteen cases were excluded from our study due to primary failure (10 cases), incomplete medical notes (5 cases) and follow-up less than 1y (3 cases). We included 134 PKPs from 134 patients stratified by preoperative risk score. Spearman coefficient was calculated for the relationship between the score obtained and risk of failure at 1y. Univariate and multivariate analysis were calculated for the impact of every single risk factor included in the RSS over graft failure at 1y. RESULTS Spearman coefficient showed statistically significant correlation between the score in the RSS and graft failure (P<0.05). Multivariate logistic regression analysis showed no statistically significant relationship (P>0.05) between diagnosis and lens status with graft failure. The relationship between the other risk factors studied and graft failure was significant (P<0.05), although the results for previous grafts and graft failure was unreliable. None of our patients had previous blood transfusion, thus, it had no impact. CONCLUSION After the application of multivariate analysis techniques, some risk factors do not show the expected impact over graft failure at 1y. PMID:28393027

  18. Comparison and validation of statistical methods for predicting power outage durations in the event of hurricanes.

    PubMed

    Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M

    2011-12-01

    This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate additive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.

  19. A refined method for multivariate meta-analysis and meta-regression

    PubMed Central

    Jackson, Daniel; Riley, Richard D

    2014-01-01

    Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects’ standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:23996351

  20. Multivariate meta-analysis for non-linear and other multi-parameter associations

    PubMed Central

    Gasparrini, A; Armstrong, B; Kenward, M G

    2012-01-01

    In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043

  1. A new improved study of cyanotoxins presence from experimental cyanobacteria concentrations in the Trasona reservoir (Northern Spain) using the MARS technique.

    PubMed

    García Nieto, P J; Alonso Fernández, J R; Sánchez Lasheras, F; de Cos Juez, F J; Díaz Muñiz, C

    2012-07-15

    Cyanotoxins, a kind of poisonous substances produced by cyanobacteria, are responsible for health risks in drinking and recreational water uses. The aim of this study is to improve our previous and successful work about cyanotoxins prediction from some experimental cyanobacteria concentrations in the Trasona reservoir (Asturias, Northern Spain) using the multivariate adaptive regression splines (MARS) technique at a local scale. In fact, this new improvement consists of using not only biological variables, but also the physical-chemical ones. As a result, the coefficient of determination has improved from 0.84 to 0.94, that is to say, more accurate predictive calculations and a better approximation to the real problem were obtained. Finally the agreement of the MARS model with experimental data confirmed the good performance. Copyright © 2012 Elsevier B.V. All rights reserved.

  2. Pyrogenic carbon distribution in mineral topsoils of the northeastern United States

    USGS Publications Warehouse

    Jauss, Verena; Sullivan, Patrick J.; Sanderman, Jonathan; Smith, David; Lehmann, Johannes

    2017-01-01

    Due to its slow turnover rates in soil, pyrogenic carbon (PyC) is considered an important C pool and relevant to climate change processes. Therefore, the amounts of soil PyC were compared to environmental covariates over an area of 327,757 km² in the northeastern United States in order to understand the controls on PyC distribution over large areas. Topsoil (defined as the soil A horizon, after removal of any organic horizons) samples were collected at 165 field sites in a generalised random tessellation stratified design that corresponded to approximately 1 site per 1600 km² and PyC was estimated from diffuse reflectance mid-infrared spectroscopy measurements using a partial least-squares regression analysis in conjunction with a large database of PyC measurements based on a solid-state ¹³C nuclear magnetic resonance spectroscopy technique. Three spatial models were applied to the data in order to relate critical environmental covariates to the changes in spatial density of PyC over the landscape. Regional mean density estimates of PyC were 11.0 g kg⁻¹ (0.84 Gg km⁻²) for Ordinary Kriging, 25.8 g kg⁻¹ (12.2 Gg km⁻²) for Multivariate Linear Regression, and 26.1 g kg⁻¹ (12.4 Gg km⁻²) for Bayesian Regression Kriging. Akaike Information Criterion (AIC) indicated that the Multivariate Linear Regression model performed best (AIC = 842.6; n = 165) compared to Ordinary Kriging (AIC = 982.4) and Bayesian Regression Kriging (AIC = 979.2). Soil PyC concentrations correlated well with total soil sulphur (P < 0.001; n = 165), plant tissue lignin (P = 0.003), and drainage class (P = 0.008). This suggests the opportunity of including related environmental parameters in the spatial assessment of PyC in soils. Better estimates of the contribution of PyC to the global carbon cycle will thus also require more accurate assessments of these covariates.
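
    The model ranking in this abstract follows from the three reported AIC values alone. The sketch below is illustrative only: it computes AIC differences and Akaike weights, neither of which is reported in the abstract, and the function name `akaike_summary` is hypothetical. It assumes all three models were fitted to the same 165 observations, which is what makes their AIC values comparable.

```python
import math


def akaike_summary(aic_by_model):
    """Return {model: (delta_aic, akaike_weight)} for a dict of AIC values."""
    best = min(aic_by_model.values())
    # Difference of each model's AIC from the lowest (best) AIC.
    deltas = {name: aic - best for name, aic in aic_by_model.items()}
    # Relative likelihood of each model: exp(-delta / 2).
    rel = {name: math.exp(-d / 2.0) for name, d in deltas.items()}
    total = sum(rel.values())
    return {name: (deltas[name], rel[name] / total) for name in aic_by_model}


# AIC values as reported in the abstract (n = 165).
reported = {
    "Multivariate Linear Regression": 842.6,
    "Ordinary Kriging": 982.4,
    "Bayesian Regression Kriging": 979.2,
}

summary = akaike_summary(reported)
best_model = min(reported, key=reported.get)

for name, (delta, weight) in summary.items():
    print(f"{name}: delta AIC = {delta:.1f}, Akaike weight = {weight:.3g}")
print("Lowest AIC:", best_model)
```

    A lower AIC indicates a better trade-off between fit and complexity, so the model with a difference of zero is the preferred one; a gap of well over 100 units, as here, leaves essentially all of the Akaike weight on that model.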

  3. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    PubMed

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R²=0.58 vs. 0.55) or a cross-validation procedure (R²=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  4. Access disparities to Magnet hospitals for patients undergoing neurosurgical operations

    PubMed Central

    Missios, Symeon; Bekelis, Kimon

    2017-01-01

    Background Centers of excellence focusing on quality improvement have demonstrated superior outcomes for a variety of surgical interventions. We investigated the presence of access disparities to hospitals recognized by the Magnet Recognition Program of the American Nurses Credentialing Center (ANCC) for patients undergoing neurosurgical operations. Methods We performed a cohort study of all neurosurgery patients who were registered in the New York Statewide Planning and Research Cooperative System (SPARCS) database from 2009–2013. We examined the association of African-American race and lack of insurance with Magnet status hospitalization for neurosurgical procedures. A mixed effects propensity adjusted multivariable regression analysis was used to control for confounding. Results During the study period, 190,535 neurosurgical patients met the inclusion criteria. Using a multivariable logistic regression, we demonstrate that African-Americans had lower admission rates to Magnet institutions (OR 0.62; 95% CI, 0.58–0.67). This persisted in a mixed effects logistic regression model (OR 0.77; 95% CI, 0.70–0.83) to adjust for clustering at the patient county level, and a propensity score adjusted logistic regression model (OR 0.75; 95% CI, 0.69–0.82). Additionally, lack of insurance was associated with lower admission rates to Magnet institutions (OR 0.71; 95% CI, 0.68–0.73), in a multivariable logistic regression model. This persisted in a mixed effects logistic regression model (OR 0.72; 95% CI, 0.69–0.74), and a propensity score adjusted logistic regression model (OR 0.72; 95% CI, 0.69–0.75). Conclusions Using a comprehensive all-payer cohort of neurosurgery patients in New York State we identified an association of African-American race and lack of insurance with lower rates of admission to Magnet hospitals. PMID:28684152

  5. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest prediction accuracy (90%), whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.

  6. Serum dehydroepiandrosterone sulphate, psychosocial factors and musculoskeletal pain in workers.

    PubMed

    Marinelli, A; Prodi, A; Pesel, G; Ronchese, F; Bovenzi, M; Negro, C; Larese Filon, F

    2017-12-30

    The serum level of dehydroepiandrosterone sulphate (DHEA-S) has been suggested as a biological marker of stress. To assess the association between serum DHEA-S, psychosocial factors and musculoskeletal (MS) pain in university workers. The study population included voluntary workers at the scientific departments of the University of Trieste (Italy) who underwent periodical health surveillance from January 2011 to June 2012. DHEA-S level was analysed in serum. The assessment tools included the General Health Questionnaire (GHQ) and a modified Nordic musculoskeletal symptoms questionnaire. The relation between DHEA-S, individual characteristics, pain perception and psychological factors was assessed by means of multivariable linear regression analysis. There were 189 study participants. The study population was characterized by high reward and low effort. Pain perception in the neck, shoulder, upper limbs, upper back and lower back was reported by 42, 32, 19, 29 and 43% of people, respectively. In multivariable regression analysis, gender, age and pain perception in the shoulder and upper limbs were significantly related to serum DHEA-S. Effort and overcommitment were related to shoulder and neck pain but not to DHEA-S. The GHQ score was associated with pain perception in different body sites and inversely to DHEA-S but significance was lost in multivariable regression analysis. DHEA-S was associated with age, gender and perception of MS pain, while effort-reward imbalance dimensions and GHQ score failed to reach the statistical significance in multivariable regression analysis. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  7. Independent Prognostic Factors for Acute Organophosphorus Pesticide Poisoning.

    PubMed

    Tang, Weidong; Ruan, Feng; Chen, Qi; Chen, Suping; Shao, Xuebo; Gao, Jianbo; Zhang, Mao

    2016-07-01

    Acute organophosphorus pesticide poisoning (AOPP) is becoming a significant problem and a potential cause of human mortality because of the abuse of organophosphate compounds. This study aims to determine the independent prognostic factors of AOPP by using multivariate logistic regression analysis. The clinical data for 71 subjects with AOPP admitted to our hospital were retrospectively analyzed. This information included the Acute Physiology and Chronic Health Evaluation II (APACHE II) scores, 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, admission blood cholinesterase levels, 6-h post-admission blood cholinesterase levels, cholinesterase activity, blood pH, and other factors. Univariate analysis and multivariate logistic regression analyses were conducted to identify all prognostic factors and independent prognostic factors, respectively. A receiver operating characteristic curve was plotted to analyze the testing power of independent prognostic factors. Twelve of 71 subjects died. Admission blood lactate levels, 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, blood pH, and APACHE II scores were identified as prognostic factors for AOPP according to the univariate analysis, whereas only 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, and blood pH were independent prognostic factors identified by multivariate logistic regression analysis. The receiver operating characteristic analysis suggested that post-admission 6-h lactate clearance rates were of moderate diagnostic value. High 6-h post-admission blood lactate levels, low blood pH, and low post-admission 6-h lactate clearance rates were independent prognostic factors identified by multivariate logistic regression analysis. Copyright © 2016 by Daedalus Enterprises.

  8. Comparative forensic soil analysis of New Jersey state parks using a combination of simple techniques with multivariate statistics.

    PubMed

    Bonetti, Jennifer; Quarino, Lawrence

    2014-05-01

    This study has shown that the combination of simple techniques with the use of multivariate statistics offers the potential for the comparative analysis of soil samples. Five samples were obtained from each of twelve state parks across New Jersey in both the summer and fall seasons. Each sample was examined using particle-size distribution, pH analysis in both water and 1 M CaCl2, and a loss on ignition technique. Data from each of the techniques were combined, and principal component analysis (PCA) and canonical discriminant analysis (CDA) were used for multivariate data transformation. Samples from different locations could be visually differentiated from one another using these multivariate plots. Hold-one-out cross-validation analysis showed error rates as low as 3.33%. Ten blind study samples were analyzed resulting in no misclassifications using Mahalanobis distance calculations and visual examinations of multivariate plots. Seasonal variation was minimal between corresponding samples, suggesting potential success in forensic applications. © 2014 American Academy of Forensic Sciences.

  9. Optical characteristics of fine and coarse particulates at Grand Canyon, Arizona

    NASA Astrophysics Data System (ADS)

    Malm, William C.; Johnson, Christopher E.

    The relationship between airborne particulate matter and atmospheric light extinction was examined using the multivariate techniques of principal component analysis and multiple linear regression on data gathered at the Grand Canyon, Arizona, from December 1979 to November 1981. Results showed that, on the average, fine sulfates were most strongly associated with light attenuation in the atmosphere. Other fine mass (nitrates, organics, soot and carbonaceous material) and coarse mass (primarily windblown dust) were much less associated with atmospheric extinction. Fine sulfate mass at the Grand Canyon was responsible for 63% of atmospheric light extinction while other fine mass and coarse mass were responsible for 17 and 20% of atmospheric extinction, respectively.

  10. Comparative effectiveness research in cancer with observational data.

    PubMed

    Giordano, Sharon H

    2015-01-01

    Observational studies are increasingly being used for comparative effectiveness research. These studies can have the greatest impact when randomized trials are not feasible or when randomized studies have not included the population or outcomes of interest. However, careful attention must be paid to study design to minimize the likelihood of selection biases. Analytic techniques, such as multivariable regression modeling, propensity score analysis, and instrumental variable analysis, can also be used to help address confounding. Oncology has many existing large and clinically rich observational databases that can be used for comparative effectiveness research. With careful study design, observational studies can produce valid results to assess the benefits and harms of a treatment or intervention in representative real-world populations.

  11. Digital controllers for VTOL aircraft

    NASA Technical Reports Server (NTRS)

    Stengel, R. F.; Broussard, J. R.; Berry, P. W.

    1976-01-01

    Using linear-optimal estimation and control techniques, digital-adaptive control laws have been designed for a tandem-rotor helicopter which is equipped for fully automatic flight in terminal area operations. Two distinct discrete-time control laws are designed to interface with velocity-command and attitude-command guidance logic, and each incorporates proportional-integral compensation for non-zero-set-point regulation, as well as reduced-order Kalman filters for sensor blending and noise rejection. Adaptation to flight condition is achieved with a novel gain-scheduling method based on correlation and regression analysis. The linear-optimal design approach is found to be a valuable tool in the development of practical multivariable control laws for vehicles which evidence significant coupling and insufficient natural stability.

  12. Impact of different variables on the outcome of patients with clinically confined prostate carcinoma: prediction of pathologic stage and biochemical failure using an artificial neural network.

    PubMed

    Ziada, A M; Lisle, T C; Snow, P B; Levine, R F; Miller, G; Crawford, E D

    2001-04-15

    The advent of advanced computing techniques has provided the opportunity to analyze clinical data using artificial intelligence techniques. This study was designed to determine whether a neural network could be developed using preoperative prognostic indicators to predict the pathologic stage and time of biochemical failure for patients who undergo radical prostatectomy. The preoperative information included TNM stage, prostate size, prostate specific antigen (PSA) level, biopsy results (Gleason score and percentage of positive biopsy), as well as patient age. All 309 patients underwent radical prostatectomy at the University of Colorado Health Sciences Center. The data from all patients were used to train a multilayer perceptron artificial neural network. The failure rate was defined as a rise in the PSA level > 0.2 ng/mL. The biochemical failure rate in the data base used was 14.2%. Univariate and multivariate analyses were performed to validate the results. The neural network statistics for the validation set showed a sensitivity and specificity of 79% and 81%, respectively, for the prediction of pathologic stage with an overall accuracy of 80% compared with an overall accuracy of 67% using the multivariate regression analysis. The sensitivity and specificity for the prediction of failure were 67% and 85%, respectively, demonstrating a high confidence in predicting failure. The overall accuracy rates for the artificial neural network and the multivariate analysis were similar. Neural networks can offer a convenient vehicle for clinicians to assess the preoperative risk of disease progression for patients who are about to undergo radical prostatectomy. Continued investigation of this approach with larger data sets seems warranted. Copyright 2001 American Cancer Society.
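
    The sensitivity, specificity and overall accuracy quoted in this abstract are all derived from a 2x2 confusion matrix. The sketch below shows the arithmetic under loudly stated assumptions: the abstract does not report the underlying counts, so the counts used here are hypothetical and were chosen only so that the output matches the quoted percentages for pathologic stage; the function name `classification_metrics` is likewise an illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# HYPOTHETICAL counts (not from the study): 100 positives and 100 negatives.
sens, spec, acc = classification_metrics(tp=79, fp=19, tn=81, fn=21)

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}, accuracy = {acc:.0%}")
```

    Overall accuracy is a prevalence-weighted average of sensitivity and specificity, so with unequal class sizes the same sensitivity and specificity would give a different accuracy.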

  13. Noninvasive and fast measurement of blood glucose in vivo by near infrared (NIR) spectroscopy

    NASA Astrophysics Data System (ADS)

    Jintao, Xue; Liming, Ye; Yufei, Liu; Chunyan, Li; Han, Chen

    2017-05-01

    This research aimed to develop a method for noninvasive and fast blood glucose assay in vivo. Near-infrared (NIR) spectroscopy, a more promising technique compared to other methods, was investigated in rats with diabetes and normal rats. Calibration models were generated by two different multivariate strategies: partial least squares (PLS) as a linear regression method and artificial neural networks (ANN) as a non-linear regression method. The PLS model was optimized individually by considering spectral range, spectral pretreatment methods and number of model factors, while the ANN model was studied individually by selecting spectral pretreatment methods, parameters of network topology, number of hidden neurons, and number of epochs. The results of the validation showed the two models were robust, accurate and repeatable. Compared to the ANN model, the performance of the PLS model was much better, with lower root mean square error of validation (RMSEP) of 0.419 and higher correlation coefficients (R) of 96.22%.

  14. Epidemiological characteristics of reported sporadic and outbreak cases of E. coli O157 in people from Alberta, Canada (2000-2002): methodological challenges of comparing clustered to unclustered data.

    PubMed

    Pearl, D L; Louie, M; Chui, L; Doré, K; Grimsrud, K M; Martin, S W; Michel, P; Svenson, L W; McEwen, S A

    2008-04-01

    Using multivariable models, we compared whether there were significant differences between reported outbreak and sporadic cases in terms of their sex, age, and mode and site of disease transmission. We also determined the potential role of administrative, temporal, and spatial factors within these models. We compared a variety of approaches to account for clustering of cases in outbreaks including weighted logistic regression, random effects models, generalized estimating equations, robust variance estimates, and the random selection of one case from each outbreak. Age and mode of transmission were the only epidemiologically and statistically significant covariates in our final models using the above approaches. Weighting observations in a logistic regression model by the inverse of their outbreak size appeared to be a relatively robust and valid means for modelling these data. Some analytical techniques, designed to account for clustering, had difficulty converging or producing realistic measures of association.
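
    The inverse-outbreak-size weighting mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the case list is hypothetical, the function name `inverse_cluster_size_weights` is an assumption, and sporadic cases are assumed to be marked with `None` and to keep a weight of 1. The resulting weights would then be passed as observation weights to whatever logistic regression routine is in use.

```python
from collections import Counter


def inverse_cluster_size_weights(outbreak_ids):
    """Weight each case by 1 / (number of cases sharing its outbreak id).

    Sporadic cases are represented by None and receive weight 1.0.
    """
    sizes = Counter(oid for oid in outbreak_ids if oid is not None)
    return [1.0 if oid is None else 1.0 / sizes[oid] for oid in outbreak_ids]


# HYPOTHETICAL data: two sporadic cases, outbreak "A" with three cases,
# outbreak "B" with two cases.
cases = [None, "A", "A", "A", None, "B", "B"]
weights = inverse_cluster_size_weights(cases)

print(weights)
print("total weight:", sum(weights))
```

    With these weights every outbreak contributes a total weight of one, the same as a single sporadic case, so large outbreaks do not dominate the fitted coefficients.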

  15. Design of multivariable feedback control systems via spectral assignment. [as applied to aircraft flight control]

    NASA Technical Reports Server (NTRS)

    Liberty, S. R.; Mielke, R. R.; Tung, L. J.

    1981-01-01

    Applied research in the area of spectral assignment in multivariable systems is reported. A frequency domain technique for determining the set of all stabilizing controllers for a single feedback loop multivariable system is described. It is shown that decoupling and tracking are achievable using this procedure. The technique is illustrated with a simple example.

  16. Predicting volumes in four Hawaii hardwoods...first multivariate equations developed

    Treesearch

    David A. Sharpnack

    1966-01-01

    Multivariate regression equations were developed for predicting board-foot (Int. 1/4-inch log rule) and cubic-foot volumes in each 8.15-foot section of trees of four Hawaii hardwood species. The species are koa (Acacia koa), ohia (Metrosideros polymorpha), robusta eucalyptus (Eucalyptus robusta), and...

  17. A Multivariate Test of the Bott Hypothesis in an Urban Irish Setting

    ERIC Educational Resources Information Center

    Gordon, Michael; Downing, Helen

    1978-01-01

    Using a sample of 686 married Irish women in Cork City the Bott hypothesis was tested, and the results of a multivariate regression analysis revealed that neither network connectedness nor the strength of the respondent's emotional ties to the network had any explanatory power. (Author)

  18. Multivariate analysis of fMRI time series: classification and regression of brain responses using machine learning.

    PubMed

    Formisano, Elia; De Martino, Federico; Valente, Giancarlo

    2008-09-01

    Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.

  19. Body Fat Percentage Prediction Using Intelligent Hybrid Approaches

    PubMed Central

    Shao, Yuehjen E.

    2014-01-01

    Excess body fat often leads to obesity. Obesity is typically associated with serious medical diseases, such as cancer, heart disease, and diabetes. Accordingly, knowing the body fat is an extremely important issue since it affects everyone's health. Although there are several ways to measure the body fat percentage (BFP), the accurate methods are often associated with hassle and/or high costs. Traditional single-stage approaches may use certain body measurements or explanatory variables to predict the BFP. Diverging from existing approaches, this study proposes new intelligent hybrid approaches to obtain fewer explanatory variables, and the proposed forecasting models are able to effectively predict the BFP. The proposed hybrid models consist of multiple regression (MR), artificial neural network (ANN), multivariate adaptive regression splines (MARS), and support vector regression (SVR) techniques. The first stage of the modeling includes the use of MR and MARS to obtain fewer but more important sets of explanatory variables. In the second stage, the remaining important variables serve as inputs for the other forecasting methods. A real dataset was used to demonstrate the development of the proposed hybrid models. The prediction results revealed that the proposed hybrid schemes outperformed the typical, single-stage forecasting models. PMID:24723804

  20. Television Viewing and Its Association with Sedentary Behaviors, Self-Rated Health and Academic Performance among Secondary School Students in Peru.

    PubMed

    Sharma, Bimala; Cosme Chavez, Rosemary; Jeong, Ae Suk; Nam, Eun Woo

    2017-04-05

    The study assessed television viewing >2 h a day and its association with sedentary behaviors, self-rated health, and academic performance among secondary school adolescents. A cross-sectional survey was conducted among randomly selected students in Lima in 2015. We measured self-reported responses of students using a standard questionnaire, and conducted in-depth interviews with 10 parents and 10 teachers. Chi-square test, correlation and multivariate logistic regression analysis were performed among 1234 students, and thematic analysis technique was used for qualitative information. A total of 23.1% adolescents reported watching television >2 h a day. Qualitative findings also show that adolescents spend most of their leisure time watching television, playing video games or using the Internet. Television viewing had a significant positive correlation with video game use in males and older adolescents, with Internet use in both sexes, and a negative correlation with self-rated health and academic performance in females. Multivariate logistic regression analysis shows that television viewing >2 h a day, independent of physical activity was associated with video games use >2 h a day, Internet use >2 h a day, poor/fair self-rated health and poor self-reported academic performance. Television viewing time and sex had a significant interaction effect on both video game use >2 h a day and Internet use >2 h a day. Reducing television viewing time may be an effective strategy for improving health and academic performance in adolescents.

  1. Television Viewing and Its Association with Sedentary Behaviors, Self-Rated Health and Academic Performance among Secondary School Students in Peru

    PubMed Central

    Sharma, Bimala; Cosme Chavez, Rosemary; Jeong, Ae Suk; Nam, Eun Woo

    2017-01-01

    The study assessed television viewing >2 h a day and its association with sedentary behaviors, self-rated health, and academic performance among secondary school adolescents. A cross-sectional survey was conducted among randomly selected students in Lima in 2015. We measured self-reported responses of students using a standard questionnaire, and conducted in-depth interviews with 10 parents and 10 teachers. Chi-square test, correlation and multivariate logistic regression analysis were performed among 1234 students, and thematic analysis technique was used for qualitative information. A total of 23.1% adolescents reported watching television >2 h a day. Qualitative findings also show that adolescents spend most of their leisure time watching television, playing video games or using the Internet. Television viewing had a significant positive correlation with video game use in males and older adolescents, with Internet use in both sexes, and a negative correlation with self-rated health and academic performance in females. Multivariate logistic regression analysis shows that television viewing >2 h a day, independent of physical activity was associated with video games use >2 h a day, Internet use >2 h a day, poor/fair self-rated health and poor self-reported academic performance. Television viewing time and sex had a significant interaction effect on both video game use >2 h a day and Internet use >2 h a day. Reducing television viewing time may be an effective strategy for improving health and academic performance in adolescents. PMID:28379202

  2. Detection and quantification of adulteration in sandalwood oil through near infrared spectroscopy.

    PubMed

    Kuriakose, Saji; Thankappan, Xavier; Joe, Hubert; Venkataraman, Venkateswaran

    2010-10-01

    The confirmation of authenticity of essential oils and the detection of adulteration are problems of increasing importance in the perfumes, pharmaceutical, flavor and fragrance industries. This is especially true for 'value added' products like sandalwood oil. A methodical study is conducted here to demonstrate the potential use of Near Infrared (NIR) spectroscopy along with multivariate calibration models like principal component regression (PCR) and partial least square regression (PLSR) as rapid analytical techniques for the qualitative and quantitative determination of adulterants in sandalwood oil. After suitable pre-processing of the NIR raw spectral data, the models are built-up by cross-validation. The lowest Root Mean Square Error of Cross-Validation and Calibration (RMSECV and RMSEC % v/v) are used as a decision supporting system to fix the optimal number of factors. The coefficient of determination (R(2)) and the Root Mean Square Error of Prediction (RMSEP % v/v) in the prediction sets are used as the evaluation parameters (R(2) = 0.9999 and RMSEP = 0.01355). The overall result leads to the conclusion that NIR spectroscopy with chemometric techniques could be successfully used as a rapid, simple, instant and non-destructive method for the detection of adulterants, even 1% of the low-grade oils, in the high quality form of sandalwood oil.

  3. Reduction of time-resolved space-based CCD photometry developed for MOST Fabry Imaging data*

    NASA Astrophysics Data System (ADS)

    Reegen, P.; Kallinger, T.; Frast, D.; Gruberbauer, M.; Huber, D.; Matthews, J. M.; Punz, D.; Schraml, S.; Weiss, W. W.; Kuschnig, R.; Moffat, A. F. J.; Walker, G. A. H.; Guenther, D. B.; Rucinski, S. M.; Sasselov, D.

    2006-04-01

    The MOST (Microvariability and Oscillations of Stars) satellite obtains ultraprecise photometry from space with high sampling rates and duty cycles. Astronomical photometry or imaging missions in low Earth orbits, like MOST, are especially sensitive to scattered light from Earthshine, and all these missions have a common need to extract target information from voluminous data cubes. They consist of upwards of hundreds of thousands of two-dimensional CCD frames (or subrasters) containing from hundreds to millions of pixels each, where the target information, superposed on background and instrumental effects, is contained only in a subset of pixels (Fabry Images, defocused images, mini-spectra). We describe a novel reduction technique for such data cubes: resolving linear correlations of target and background pixel intensities. This step-wise multiple linear regression removes only those target variations which are also detected in the background. The advantage of regression analysis versus background subtraction is the appropriate scaling, taking into account that the amount of contamination may differ from pixel to pixel. The multivariate solution for all pairs of target/background pixels is minimally invasive of the raw photometry while being very effective in reducing contamination due to, e.g. stray light. The technique is tested and demonstrated with both simulated oscillation signals and real MOST photometry.
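
    The core idea of the record above, scaling each background pixel by a fitted coefficient instead of subtracting it directly, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses plain (not step-wise) multiple linear regression, and the function name `decorrelate`, the synthetic light curve, and all numeric values are hypothetical.

```python
import numpy as np

def decorrelate(target, background):
    """Remove the part of `target` that is linearly explained by `background`.

    target:     (n_frames,) intensity of one target pixel over time
    background: (n_frames, n_bg) intensities of background pixels over time
    """
    # Center the series so the fit describes variations, not mean levels.
    b = background - background.mean(axis=0)
    t = target - target.mean()
    # One least-squares scaling coefficient per background pixel.
    coef, *_ = np.linalg.lstsq(b, t, rcond=None)
    # Subtract only the background-correlated variation; keep the mean level.
    return target - b @ coef, coef

# Synthetic light curve (all values hypothetical).
rng = np.random.default_rng(0)
n = 2000
u = np.linspace(0.0, 1.0, n)
stray = np.sin(40 * np.pi * u)            # stray-light modulation
signal = 0.05 * np.sin(7 * np.pi * u)     # stellar variation to preserve
background = np.column_stack([stray, 0.5 * stray]) + 0.01 * rng.standard_normal((n, 2))
target = 100.0 + signal + 0.8 * stray + 0.01 * rng.standard_normal(n)

cleaned, coef = decorrelate(target, background)
print("scatter before:", np.std(target - 100.0 - signal))
print("scatter after: ", np.std(cleaned - 100.0 - signal))
```

    Because the coefficients are fitted per target pixel, the amount of contamination removed can differ from pixel to pixel, which is the advantage over plain background subtraction that the abstract describes.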

  4. Multivariate classification of small order watersheds in the Quabbin Reservoir Basin, Massachusetts

    USGS Publications Warehouse

    Lent, R.M.; Waldron, M.C.; Rader, J.C.

    1998-01-01

    A multivariate approach was used to analyze hydrologic, geologic, geographic, and water-chemistry data from small order watersheds in the Quabbin Reservoir Basin in central Massachusetts. Eighty three small order watersheds were delineated and landscape attributes defining hydrologic, geologic, and geographic features of the watersheds were compiled from geographic information system data layers. Principal components analysis was used to evaluate 11 chemical constituents collected bi-weekly for 1 year at 15 surface-water stations in order to subdivide the basin into subbasins comprised of watersheds with similar water quality characteristics. Three principal components accounted for about 90 percent of the variance in water chemistry data. The principal components were defined as a biogeochemical variable related to wetland density, an acid-neutralization variable, and a road-salt variable related to density of primary roads. Three subbasins were identified. Analysis of variance and multiple comparisons of means were used to identify significant differences in stream water chemistry and landscape attributes among subbasins. All stream water constituents were significantly different among subbasins. Multiple regression techniques were used to relate stream water chemistry to landscape attributes. Important differences in landscape attributes were related to wetlands, slope, and soil type.

  5. Understanding adaptive gait in lower-limb amputees: insights from multivariate analyses

    PubMed Central

    2013-01-01

    Background: In this paper we use multivariate statistical techniques to gain insights into how adaptive gait involving obstacle crossing is regulated in lower-limb amputees compared to able-bodied controls, with the aim of identifying underlying characteristics that differ between the two groups and consequently highlighting gait deficits in the amputees. Methods: Eight unilateral trans-tibial amputees and twelve able-bodied controls completed adaptive gait trials involving negotiating various height obstacles, with amputees leading with their prosthetic limb. Spatiotemporal variables that are regularly used to quantify how gait is adapted when crossing obstacles were determined and subsequently analysed using multivariate statistical techniques. Results and discussion: There were fundamental differences in the adaptive gait between the two groups. Compared to controls, amputees had a reduced approach velocity, reduced foot placement distance before and after the obstacle and reduced foot clearance over it, and reduced lead-limb knee flexion during the step following crossing. Logistic regression analysis highlighted the variables that best distinguished between the gait of the two groups and multiple regression analysis (with approach velocity as a controlling factor) helped identify what gait adaptations were driving the differences seen in these variables. Getting closer to the obstacle before crossing it appeared to be a strategy to ensure the heel of the lead-limb foot passed over the obstacle prior to the foot being lowered to the ground. Despite adopting such a heel clearance strategy, the lead-foot was positioned closer to the obstacle following crossing, which was likely a result of a desire to attain a limb/foot angle and orientation at instant of landing that minimised loads on the residuum (as evidenced by the reduced lead-limb knee flexion during the step following crossing). These changes in foot placement meant the foot was in a different part of swing at point of crossing and this explains why foot clearance was considerably reduced in amputees. Conclusions: These results highlight that trans-tibial amputees use quite different gait adaptations to cross obstacles compared with controls (at least when leading with their prosthetic limb), indicating they are governed by different constraints, seemingly related to how they land on/load their prosthesis after crossing the obstacle. PMID:23958032

  6. Chemical studies of H chondrites. 6: Antarctic/non-Antarctic compositional differences revisited

    NASA Astrophysics Data System (ADS)

    Wolf, Stephen F.; Lipschutz, Michael E.

    1995-02-01

    We report data for the trace elements Au, Co, Sb, Ga, Rb, Ag, Se, Cs, Te, Zn, Cd, Bi, Tl, and In (ordered by putative volatility during nebular condensation and accretion) determined by radiochemical neutron activation analysis of 14 additional H5 and H6 chondrite falls. Data for the 10 most volatile elements (Rb to In) treated by the multivariate techniques of linear discriminant analysis and logistic regression in these and 44 other falls are compared with those of 59 H4-6 chondrites from Antarctica. Various populations are tested by the multivariate techniques, using the previously developed method of randomization-simulation to assess significance levels. An earlier conclusion, based on fewer examples, that H4-6 chondrite falls are compositionally distinguishable from the Antarctic suite is verified by the additional data. This distinctiveness is highly significant because of the presence of samples from Victoria Land in the Antarctic population, which differ compositionally from falls beyond any reasonable doubt. However, it cannot be proven unequivocally that falls and Antarctic samples from Queen Maud Land are compositionally distinguishable. Trivial causes (e.g., analyst bias, weathering) cannot explain the Victoria Land (Antarctic)/non-Antarctic compositional difference for paradigmatic H4-6 chondrites. This seems to reflect a time-dependent variation of near-Earth meteoroid source regions differing in average thermal history.

  7. Chemical studies of H chondrites. 6: Antarctic/non-Antarctic compositional differences revisited

    NASA Technical Reports Server (NTRS)

    Wolf, Stephen F.; Lipschutz, Michael E.

    1995-01-01

    We report data for the trace elements Au, Co, Sb, Ga, Rb, Ag, Se, Cs, Te, Zn, Cd, Bi, Tl, and In (ordered by putative volatility during nebular condensation and accretion) determined by radiochemical neutron activation analysis of 14 additional H5 and H6 chondrite falls. Data for the 10 most volatile elements (Rb to In) treated by the multivariate techniques of linear discriminant analysis and logistic regression in these and 44 other falls are compared with those of 59 H4-6 chondrites from Antarctica. Various populations are tested by the multivariate techniques, using the previously developed method of randomization-simulation to assess significance levels. An earlier conclusion, based on fewer examples, that H4-6 chondrite falls are compositionally distinguishable from the Antarctic suite is verified by the additional data. This distinctiveness is highly significant because of the presence of samples from Victoria Land in the Antarctic population, which differ compositionally from falls beyond any reasonable doubt. However, it cannot be proven unequivocally that falls and Antarctic samples from Queen Maud Land are compositionally distinguishable. Trivial causes (e.g., analyst bias, weathering) cannot explain the Victoria Land (Antarctic)/non-Antarctic compositional difference for paradigmatic H4-6 chondrites. This seems to reflect a time-dependent variation of near-Earth meteoroid source regions differing in average thermal history.

  8. Multivariate research in areas of phosphorus cast-iron brake shoes manufacturing using the statistical analysis and the multiple regression equations

    NASA Astrophysics Data System (ADS)

    Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.

    2017-05-01

    The braking system is one of the most important and complex subsystems of railway vehicles, especially where safety is concerned. Installing efficient, safe brakes on modern railway vehicles is therefore essential. Attention is currently devoted to problems connected with the use of high-performance brake materials and their impact on the thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, because the interaction between the brake block and the wheel produces complex thermo-mechanical phenomena. In this work, the investigated subjects are cast-iron brake shoes, which are still widely used on freight wagons. The cast-iron brake shoes, with lamellar graphite and a high phosphorus content (0.8-1.1%), therefore need special investigation. To establish the optimal condition for the cast-iron brake shoes, we proposed a mathematical modelling study using statistical analysis and multiple regression equations. Multivariate research is important in cast-iron brake shoe manufacturing because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulty comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. To establish the multiple correlation between the hardness of the cast-iron brake shoes and the elements of their chemical composition, several types of regression equation models have been proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representations of the regression equations in order to explain the interaction of the variables and locate the optimal level of each variable for maximal response. The regression coefficients, dispersion, and correlation coefficients were calculated using Matlab.

  9. Near and mid infrared spectroscopy and multivariate data analysis in studies of oxidation of edible oils.

    PubMed

    Wójcicki, Krzysztof; Khmelinskii, Igor; Sikorski, Marek; Sikorska, Ewa

    2015-11-15

    Infrared spectroscopic techniques and chemometric methods were used to study oxidation of olive, sunflower and rapeseed oils. Accelerated oxidative degradation of oils at 60°C was monitored using peroxide values and FT-MIR ATR and FT-NIR transmittance spectroscopy. Principal component analysis (PCA) facilitated visualization and interpretation of spectral changes occurring during oxidation. Multivariate curve resolution (MCR) method found three spectral components in the NIR and MIR spectral matrix, corresponding to the oxidation products, and saturated and unsaturated structures. Good quantitative relation was found between peroxide value and contribution of oxidation products evaluated using MCR--based on NIR (R(2) = 0.890), MIR (R(2) = 0.707) and combined NIR and MIR (R(2) = 0.747) data. Calibration models for prediction peroxide value established using partial least squares (PLS) regression were characterized for MIR (R(2) = 0.701, RPD = 1.7), NIR (R(2) = 0.970, RPD = 5.3), and combined NIR and MIR data (R(2) = 0.954, RPD = 3.1). Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. Use of hearing protection and perceptions of noise exposure and hearing loss among construction workers.

    PubMed

    Lusk, S L; Kerr, M J; Kauffman, S A

    1998-07-01

    The purpose of this study was to describe construction workers' use of hearing protection devices (HPDs) and determine their perceptions of noise exposure and hearing loss. Operating engineers, carpenters, and plumbers/pipe fitters in the Midwest (n = 400) completed a written questionnaire regarding their use of HPDs and their perceptions of noise exposure and hearing loss. Subjects were recruited through their trade union groups. Mean reported use of HPDs and mean perceived noise exposure were compared across trade groups. Bivariate and multivariate analysis techniques were used to assess relationships between use of HPDs and trade category, education, age, years of employment, noise exposure, and hearing loss. Bivariate analyses identified significant differences in mean use of HPDs by age, years of employment, and trade group. Multivariate logistic regression assessing the independent effects of these variables found significant differences only by trade group. Results indicate a need for significant improvement in all three trade groups' use of HPDs, and suggest a need to consider use and exposure levels, demographics, and trade group membership in designing hearing conservation programs.

  11. Development of Pattern Recognition Techniques for the Evaluation of Toxicant Impacts to Multispecies Systems

    DTIC Science & Technology

    1993-06-18

    ...rule rather than the exception. In the Standardized Aquatic Microcosm and the Mixed Flask Culture (MFC) microcosms, multivariate analysis and clustering methods...experiments using two microcosm protocols. We use nonmetric clustering, a multivariate pattern recognition technique developed by Matthews and Hearne (1991

  12. Impact of robotic technique and surgical volume on the cost of radical prostatectomy.

    PubMed

    Hyams, Elias S; Mullins, Jeffrey K; Pierorazio, Phillip M; Partin, Alan W; Allaf, Mohamad E; Matlaga, Brian R

    2013-03-01

    Our present understanding of the effect of robotic surgery and surgical volume on the cost of radical prostatectomy (RP) is limited. Given the increasing pressures placed on healthcare resource utilization, such determinations of healthcare value are becoming increasingly important. Therefore, we performed a study to define the effect of robotic technology and surgical volume on the cost of RP. The state of Maryland mandates that all acute-care hospitals report encounter-level and hospital discharge data to the Health Service Cost Review Commission (HSCRC). The HSCRC was queried for men undergoing RP between 2008 and 2011 (the period during which robot-assisted laparoscopic radical prostatectomy [RALRP] was coded separately). High-volume hospitals were defined as >60 cases per year, and high-volume surgeons were defined as >40 cases per year. Multivariate regression analysis was performed to evaluate whether robotic technique and high surgical volume impacted the cost of RP. There were 1499 patients who underwent RALRP and 2565 who underwent radical retropubic prostatectomy (RRP) during the study period. The total cost for RALRP was higher than for RRP ($14,000 vs 10,100; P<0.001) based primarily on operating room charges and supply charges. Multivariate regression demonstrated that RALRP was associated with a significantly higher cost (β coeff 4.1; P<0.001), even within high-volume hospitals (β coeff 3.3; P<0.001). High-volume surgeons and high-volume hospitals, however, were associated with a significantly lower cost for RP overall. High surgeon volume was associated with lower cost for RALRP and RRP, while high institutional volume was associated with lower cost for RALRP only. High surgical volume was associated with lower cost of RP. Even at high surgical volume, however, the cost of RALRP still exceeded that of RRP. 
As robotic surgery has come to dominate the healthcare marketplace, strategies to increase the role of high-volume providers may be needed to improve the cost-effectiveness of prostate cancer surgical therapy.

  13. Soil salt content estimation in the Yellow River delta with satellite hyperspectral data

    USGS Publications Warehouse

    Weng, Yongling; Gong, Peng; Zhu, Zhi-Liang

    2008-01-01

    Soil salinization is one of the most common land degradation processes and is a severe environmental hazard. The primary objective of this study is to investigate the potential of predicting salt content in soils with hyperspectral data acquired with EO-1 Hyperion. Both partial least-squares regression (PLSR) and conventional multiple linear regression (MLR), such as stepwise regression (SWR), were tested as the prediction model. PLSR is commonly used to overcome the problem caused by high-dimensional and correlated predictors. Chemical analysis of 95 samples collected from the top layer of soils in the Yellow River delta area shows that salt content was high on average, and the dominant chemicals in the saline soil were NaCl and MgCl2. Multivariate models were established between soil contents and hyperspectral data. Our results indicate that the PLSR technique with laboratory spectral data has a strong prediction capacity. Spectral bands at 1487-1527, 1971-1991, 2032-2092, and 2163-2355 nm possessed large absolute values of regression coefficients, with the largest coefficient at 2203 nm. We obtained a root mean squared error (RMSE) for calibration (with 61 samples) of RMSEC = 0.753 (R2 = 0.893) and a root mean squared error for validation (with 30 samples) of RMSEV = 0.574. The prediction model was applied on a pixel-by-pixel basis to a Hyperion reflectance image to yield a quantitative surface distribution map of soil salt content. The result was validated successfully from 38 sampling points. We obtained an RMSE estimate of 1.037 (R2 = 0.784) for the soil salt content map derived by the PLSR model. The salinity map derived from the SWR model shows that the predicted value is higher than the true value. These results demonstrate that the PLSR method is a more suitable technique than stepwise regression for quantitative estimation of soil salt content in a large area. © 2008 CASI.

  14. Hyperspectral imaging using a color camera and its application for pathogen detection

    NASA Astrophysics Data System (ADS)

    Yoon, Seung-Chul; Shin, Tae-Sung; Heitschmidt, Gerald W.; Lawrence, Kurt C.; Park, Bosoon; Gamble, Gary

    2015-02-01

    This paper reports the results of a feasibility study for the development of a hyperspectral image recovery (reconstruction) technique using a RGB color camera and regression analysis in order to detect and classify colonies of foodborne pathogens. The target bacterial pathogens were the six representative non-O157 Shiga-toxin producing Escherichia coli (STEC) serogroups (O26, O45, O103, O111, O121, and O145) grown in Petri dishes of Rainbow agar. The purpose of the feasibility study was to evaluate whether a DSLR camera (Nikon D700) could be used to predict hyperspectral images in the wavelength range from 400 to 1,000 nm and even to predict the types of pathogens using a hyperspectral STEC classification algorithm that was previously developed. Unlike many other studies using color charts with known and noise-free spectra for training reconstruction models, this work used hyperspectral and color images, separately measured by a hyperspectral imaging spectrometer and the DSLR color camera. The color images were calibrated (i.e. normalized) to relative reflectance, subsampled and spatially registered to match with counterpart pixels in hyperspectral images that were also calibrated to relative reflectance. Polynomial multivariate least-squares regression (PMLR) was previously developed with simulated color images. In this study, partial least squares regression (PLSR) was also evaluated as a spectral recovery technique to minimize multicollinearity and overfitting. The two spectral recovery models (PMLR and PLSR) and their parameters were evaluated by cross-validation. The QR decomposition was used to find a numerically more stable solution of the regression equation. The preliminary results showed that PLSR was more effective especially with higher order polynomial regressions than PMLR. The best classification accuracy measured with an independent test set was about 90%. 
The results suggest the potential of cost-effective color imaging using hyperspectral image classification algorithms for rapidly differentiating pathogens in agar plates.
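
    The regression step described in the record above, mapping polynomial terms of RGB values to spectral bands and solving the least-squares problem through a QR decomposition, can be sketched as follows. This is a hedged illustration only: the second-order expansion, the helper names `poly_features` and `fit_qr`, the five-band output, and the synthetic data are all assumptions, not the authors' PMLR implementation.

```python
import numpy as np

def poly_features(rgb):
    """Second-order polynomial expansion of (n, 3) RGB values, with intercept."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.column_stack([np.ones(len(rgb)), r, g, b,
                            r * r, g * g, b * b, r * g, r * b, g * b])

def fit_qr(X, Y):
    """Solve min ||X W - Y|| using the thin QR factorization X = Q R."""
    Q, R = np.linalg.qr(X)               # Q: (n, p), R: (p, p) upper triangular
    # Solving R W = Q^T Y avoids forming X^T X, whose condition number
    # is the square of that of X.
    return np.linalg.solve(R, Q.T @ Y)

# Synthetic stand-in for registered color/hyperspectral pixel pairs.
rng = np.random.default_rng(0)
rgb = rng.random((500, 3))                       # hypothetical calibrated RGB pixels
X = poly_features(rgb)
W_true = rng.standard_normal((10, 5))            # 10 polynomial terms -> 5 bands
spectra = X @ W_true + 1e-3 * rng.standard_normal((500, 5))

W_hat = fit_qr(X, spectra)
rmse = np.sqrt(np.mean((X @ W_hat - spectra) ** 2))
print("training RMSE:", rmse)
```

    Polynomial terms of the same channel are strongly correlated, which is why a numerically stable solver matters here and why the abstract also evaluates PLSR against multicollinearity.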

  15. Rex fortran 4 system for combinatorial screening or conventional analysis of multivariate regressions

    Treesearch

    L.R. Grosenbaugh

    1967-01-01

    Describes an expansible computerized system that provides data needed in regression or covariance analysis of as many as 50 variables, 8 of which may be dependent. Alternatively, it can screen variously generated combinations of independent variables to find the regression with the smallest mean-squared-residual, which will be fitted if desired. The user can easily...

  16. Multivariate logistic regression for predicting total culturable virus presence at the intake of a potable-water treatment plant: novel application of the atypical coliform/total coliform ratio.

    PubMed

    Black, L E; Brion, G M; Freitas, S J

    2007-06-01

    Predicting the presence of enteric viruses in surface waters is a complex modeling problem. Multiple water quality parameters that indicate the presence of human fecal material, the load of fecal material, and the amount of time fecal material has been in the environment are needed. This paper presents the results of a multiyear study of raw-water quality at the inlet of a potable-water plant that related 17 physical, chemical, and biological indices to the presence of enteric viruses as indicated by cytopathic changes in cell cultures. It was found that several simple, multivariate logistic regression models that could reliably identify observations of the presence or absence of total culturable virus could be fitted. The best models developed combined a fecal age indicator (the atypical coliform [AC]/total coliform [TC] ratio), the detectable presence of a human-associated sterol (epicoprostanol) to indicate the fecal source, and one of several fecal load indicators (the levels of Giardia species cysts, coliform bacteria, and coprostanol). The best fit to the data was found when the AC/TC ratio, the presence of epicoprostanol, and the density of fecal coliform bacteria were input into a simple, multivariate logistic regression equation, resulting in 84.5% and 78.6% accuracies for the identification of the presence and absence of total culturable virus, respectively. The AC/TC ratio was the most influential input variable in all of the models generated, but producing the best prediction required additional input related to the fecal source and the fecal load. The potential for replacing microbial indicators of fecal load with levels of coprostanol was proposed and evaluated by multivariate logistic regression modeling for the presence and absence of virus.
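
    A multivariate logistic regression of the kind described in the record above can be sketched with Newton's method (iteratively reweighted least squares). The sketch is illustrative only: the three predictors are synthetic stand-ins for a fecal age, source, and load indicator, and the coefficients, sample size, and function name `fit_logistic` are hypothetical rather than taken from the study.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Fit P(y=1|x) = sigmoid(b0 + x.b) by Newton's method (IRLS)."""
    A = np.column_stack([np.ones(len(X)), X])    # add intercept column
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))      # current fitted probabilities
        w = p * (1.0 - p)                        # IRLS weights
        grad = A.T @ (y - p)                     # gradient of the log-likelihood
        hess = A.T @ (A * w[:, None]) + ridge * np.eye(A.shape[1])
        beta += np.linalg.solve(hess, grad)      # Newton step
    return beta

# Synthetic data: three standardized indicators (hypothetical).
rng = np.random.default_rng(0)
n = 2000
X = rng.standard_normal((n, 3))
beta_true = np.array([-0.5, 2.0, 1.0, -1.5])
p_true = 1.0 / (1.0 + np.exp(-(beta_true[0] + X @ beta_true[1:])))
y = (rng.random(n) < p_true).astype(float)

beta_hat = fit_logistic(X, y)
pred = (beta_hat[0] + X @ beta_hat[1:]) > 0.0    # classify at probability 0.5
sensitivity = np.mean(pred[y == 1])              # accuracy for "present" cases
specificity = np.mean(~pred[y == 0])             # accuracy for "absent" cases
print("sensitivity:", sensitivity, "specificity:", specificity)
```

    Reporting the accuracy for presence and absence separately, as the abstract does, shows whether a model errs more toward false alarms or toward missed detections.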

  17. Enhanced ID Pit Sizing Using Multivariate Regression Algorithm

    NASA Astrophysics Data System (ADS)

    Krzywosz, Kenji

    2007-03-01

    EPRI is funding a program to enhance and improve the reliability of inside diameter (ID) pit sizing for balance-of-plant heat exchangers, such as condensers and component cooling water heat exchangers. More traditional approaches to ID pit sizing involve the use of frequency-specific amplitude or phase angles. The enhanced multivariate regression algorithm for ID pit depth sizing incorporates three simultaneous input parameters of frequency, amplitude, and phase angle. A set of calibration data sets consisting of machined pits of various rounded and elongated shapes and depths was acquired in the frequency range of 100 kHz to 1 MHz for stainless steel tubing having nominal wall thickness of 0.028 inch. To add noise to the acquired data set, each test sample was rotated and test data acquired at 3, 6, 9, and 12 o'clock positions. The ID pit depths were estimated using second-order and fourth-order regression functions by relying on normalized amplitude and phase angle information from multiple frequencies. Due to unique damage morphology associated with the microbiologically-influenced ID pits, it was necessary to modify the elongated calibration standard-based algorithms by relying on the algorithm developed solely from the destructive sectioning results. This paper presents the use of the transformed multivariate regression algorithm to estimate ID pit depths and compares the results with the traditional univariate phase angle analysis. Both estimates were then compared with the destructive sectioning results.

  18. Determination of boiling point of petrochemicals by gas chromatography-mass spectrometry and multivariate regression analysis of structural activity relationship.

    PubMed

    Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A

    2014-08-01

    Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of the traditional simulated distillation method of BP determination. The developed PLS1 regression correctly predicted analyte BP in the D3710 and MA VHP calibration gas mix samples, with root-mean-square-%-relative-errors (RMS%RE) of 6.4% and 10.8%, respectively. In contrast, the overall RMS%RE values of 32.9% and 40.4% obtained for BP determination in D3710 and MA VHP, respectively, using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using multivariate regression analysis (MRA), demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can potentially be used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.

  19. A regression-kriging model for estimation of rainfall in the Laohahe basin

    NASA Astrophysics Data System (ADS)

    Wang, Hong; Ren, Li L.; Liu, Gao H.

    2009-10-01

    This paper presents a multivariate geostatistical algorithm called regression-kriging (RK) for predicting the spatial distribution of rainfall by incorporating five topographic/geographic factors of latitude, longitude, altitude, slope and aspect. The technique is illustrated using rainfall data collected at 52 rain gauges from the Laohahe basin in northeast China during 1986-2005. Rainfall data from 44 stations were selected for modeling and the remaining 8 stations were used for model validation. To eliminate multicollinearity, the five explanatory factors were first transformed using factor analysis with three Principal Components (PCs) extracted. The rainfall data were then fitted using step-wise regression and residuals interpolated using SK. The regression coefficients were estimated by generalized least squares (GLS), which takes the spatial heteroskedasticity between rainfall and PCs into account. Finally, the rainfall prediction based on RK was compared with that predicted from ordinary kriging (OK) and ordinary least squares (OLS) multiple regression (MR). Because correlated topographic factors are taken into account, RK improves the efficiency of predictions. RK achieved a lower relative root mean square error (RMSE) (44.67%) than MR (49.23%) and OK (73.60%) and a lower bias than MR and OK (23.82 versus 30.89 and 32.15 mm) for annual rainfall. It is much more effective for the wet season than for the dry season. RK is suitable for estimation of rainfall in areas where there are no stations nearby and where topography has a major influence on rainfall.

  20. Chemiluminescence-based multivariate sensing of local equivalence ratios in premixed atmospheric methane-air flames

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.

    Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least square regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%; whereas, the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions ( > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.

  1. A survey of variable selection methods in two Chinese epidemiology journals

    PubMed Central

    2010-01-01

    Background Although much has been written on developing better procedures for variable selection, there is little research on how it is practiced in actual studies. This review surveys the variable selection methods reported in two high-ranking Chinese epidemiology journals. Methods Articles published in 2004, 2006, and 2008 in the Chinese Journal of Epidemiology and the Chinese Journal of Preventive Medicine were reviewed. Five categories of methods were identified whereby variables were selected using: A - bivariate analyses; B - multivariable analysis; e.g. stepwise or individual significance testing of model coefficients; C - first bivariate analyses, followed by multivariable analysis; D - bivariate analyses or multivariable analysis; and E - other criteria like prior knowledge or personal judgment. Results Among the 287 articles that reported using variable selection methods, 6%, 26%, 30%, 21%, and 17% were in categories A through E, respectively. One hundred sixty-three studies selected variables using bivariate analyses, 80% (130/163) via multiple significance testing at the 5% alpha-level. Of the 219 multivariable analyses, 97 (44%) used stepwise procedures, 89 (41%) tested individual regression coefficients, but 33 (15%) did not mention how variables were selected. Sixty percent (58/97) of the stepwise routines also did not specify the algorithm and/or significance levels. Conclusions The variable selection methods reported in the two journals were limited in variety, and details were often missing. Many studies still relied on problematic techniques like stepwise procedures and/or multiple testing of bivariate associations at the 0.05 alpha-level. These deficiencies should be rectified to safeguard the scientific validity of articles published in Chinese epidemiology journals. PMID:20920252

  2. NIR and Py-mbms coupled with multivariate data analysis as a high-throughput biomass characterization technique: a review

    PubMed Central

    Xiao, Li; Wei, Hui; Himmel, Michael E.; Jameel, Hasan; Kelley, Stephen S.

    2014-01-01

    Optimizing the use of lignocellulosic biomass as the feedstock for renewable energy production is currently being developed globally. Biomass is a complex mixture of cellulose, hemicelluloses, lignins, extractives, and proteins; as well as inorganic salts. Cell wall compositional analysis for biomass characterization is laborious and time consuming. In order to characterize biomass fast and efficiently, several high through-put technologies have been successfully developed. Among them, near infrared spectroscopy (NIR) and pyrolysis-molecular beam mass spectrometry (Py-mbms) are complementary tools and capable of evaluating a large number of raw or modified biomass in a short period of time. NIR shows vibrations associated with specific chemical structures whereas Py-mbms depicts the full range of fragments from the decomposition of biomass. Both NIR vibrations and Py-mbms peaks are assigned to possible chemical functional groups and molecular structures. They provide complementary information of chemical insight of biomaterials. However, it is challenging to interpret the informative results because of the large amount of overlapping bands or decomposition fragments contained in the spectra. In order to improve the efficiency of data analysis, multivariate analysis tools have been adapted to define the significant correlations among data variables, so that the large number of bands/peaks could be replaced by a small number of reconstructed variables representing original variation. Reconstructed data variables are used for sample comparison (principal component analysis) and for building regression models (partial least square regression) between biomass chemical structures and properties of interests. In this review, the important biomass chemical structures measured by NIR and Py-mbms are summarized. The advantages and disadvantages of conventional data analysis methods and multivariate data analysis methods are introduced, compared and evaluated. 
This review aims to serve as a guide for choosing the most effective data analysis methods for NIR and Py-mbms characterization of biomass. PMID:25147552

  3. Common side closure type, but not stapler brand or oversewing, influences side-to-side anastomotic leak rates.

    PubMed

    Fleetwood, V A; Gross, K N; Alex, G C; Cortina, C S; Smolevitz, J B; Sarvepalli, S; Bakhsh, S R; Poirier, J; Myers, J A; Singer, M A; Orkin, B A

    2017-03-01

    Anastomotic leak (AL) increases costs and cancer recurrence. Studies show decreased AL with side-to-side stapled anastomosis (SSA), but none identify risk factors within SSAs. We hypothesized that stapler characteristics and closure technique of the common enterotomy affect AL rates. Retrospective review of bowel SSAs was performed. Data included stapler brand, staple line oversewing, and closure method (handsewn, HC; linear stapler [Barcelona technique], BT; transverse stapler, TX). Primary endpoint was AL. Statistical analysis included Fisher's test and logistic regression. 463 patients were identified, 58.5% BT, 21.2% HC, and 20.3% TX. Covidien staplers comprised 74.9%, Ethicon 18.1%. There were no differences between stapler types (Covidien 5.8%, Ethicon 6.0%). However, AL rates varied by common side closure (BT 3.7% vs. TX 10.6%, p = 0.017), remaining significant on multivariate analysis. Closure method of the common side impacts AL rates. Barcelona technique has fewer leaks than transverse stapled closure. Further prospective evaluation is recommended. Copyright © 2017. Published by Elsevier Inc.

  4. Using foreground/background analysis to determine leaf and canopy chemistry

    NASA Technical Reports Server (NTRS)

    Pinzon, J. E.; Ustin, S. L.; Hart, Q. J.; Jacquemoud, S.; Smith, M. O.

    1995-01-01

    Spectral Mixture Analysis (SMA) has become a well-established procedure for analyzing imaging spectrometry data; however, the technique is relatively insensitive to minor sources of spectral variation (e.g., discriminating stressed from unstressed vegetation and variations in canopy chemistry). Other statistical approaches have been tried, e.g., stepwise multiple linear regression (SMLR) analysis to predict canopy chemistry. Grossman et al. reported that SMLR is sensitive to measurement error and that the prediction of minor chemical components is not independent of patterns observed in more dominant spectral components like water. Further, they observed that the relationships were strongly dependent on the mode of expressing reflectance (R, -log R) and whether chemistry was expressed on a weight (g/g) or area basis (g/sq m). Thus, alternative multivariate techniques need to be examined. Smith et al. reported a revised SMA that they termed Foreground/Background Analysis (FBA) that permits directing the analysis along any axis of variance by identifying vectors through the n-dimensional spectral volume orthonormal to each other. Here, we report an application of the FBA technique for the detection of canopy chemistry using a modified form of the analysis.

  5. Multivariate random-parameters zero-inflated negative binomial regression model: an application to estimate crash frequencies at intersections.

    PubMed

    Dong, Chunjiao; Clarke, David B; Yan, Xuedong; Khattak, Asad; Huang, Baoshan

    2014-09-01

    Crash data are collected through police reports and integrated with road inventory data for further analysis. Integrated police reports and inventory data yield correlated multivariate data for roadway entities (e.g., segments or intersections). Analysis of such data reveals important relationships that can help focus on high-risk situations and develop safety countermeasures. To understand relationships between crash frequencies and associated variables, while taking full advantage of the available data, multivariate random-parameters models are appropriate since they can simultaneously consider the correlation among the specific crash types and account for unobserved heterogeneity. However, a key issue that arises with correlated multivariate data is that the number of crash-free samples increases, as crash counts have many categories. In this paper, we describe a multivariate random-parameters zero-inflated negative binomial (MRZINB) regression model for jointly modeling crash counts. The full Bayesian method is employed to estimate the model parameters. Crash frequencies at urban signalized intersections in Tennessee are analyzed. The paper investigates the performance of MZINB and MRZINB regression models in establishing the relationship between crash frequencies, pavement conditions, traffic factors, and geometric design features of roadway intersections. Compared to the MZINB model, the MRZINB model identifies additional statistically significant factors and provides better goodness of fit in developing the relationships. The empirical results show that MRZINB model possesses most of the desirable statistical properties in terms of its ability to accommodate unobserved heterogeneity and excess zero counts in correlated data. Notably, in the random-parameters MZINB model, the estimated parameters vary significantly across intersections for different crash types. Copyright © 2014 Elsevier Ltd. All rights reserved.

  6. Learning investment indicators through data extension

    NASA Astrophysics Data System (ADS)

    Dvořák, Marek

    2017-07-01

    Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this univariate time series to a multivariate representation. This method makes use of sliding windows to calculate several dozen new variables using simple statistical tools like first and second moments as well as more complicated statistics, like auto-regression coefficients and residual analysis, followed by an optional quadratic transformation that was further used for data extension. These were used as explanatory variables in a regularized logistic LASSO regression which tried to estimate the Buy-Sell Index (BSI) from real stock market data.
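
    The data-extension step described above can be sketched as follows. This is only an illustration under stated assumptions: the window width, the choice of three features (mean, variance, lag-1 autocorrelation), the function names, and the price series are all hypothetical, and the paper's actual feature set is much larger.

```python
import math
from statistics import mean, pvariance

def log_differences(prices):
    """Return log(p[t] / p[t-1]) for each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def lag1_autocorr(window):
    """Lag-1 sample autocorrelation; 0.0 when the window is constant."""
    m = mean(window)
    denom = sum((x - m) ** 2 for x in window)
    if denom == 0:
        return 0.0
    num = sum((a - m) * (b - m) for a, b in zip(window, window[1:]))
    return num / denom

def window_features(series, width):
    """One feature row (mean, variance, lag-1 autocorrelation) per sliding window."""
    rows = []
    for start in range(len(series) - width + 1):
        w = series[start:start + width]
        rows.append((mean(w), pvariance(w), lag1_autocorr(w)))
    return rows

# Hypothetical closing prices, for illustration only.
prices = [100.0, 101.0, 99.5, 102.0, 103.5, 103.0, 104.2]
returns = log_differences(prices)
features = window_features(returns, width=4)
```

    Each row of `features` would then serve as one observation of explanatory variables for a classifier such as the regularized logistic regression mentioned in the abstract.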

  7. Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis.

    PubMed

    Liu, Fei; Ye, Lanhan; Peng, Jiyu; Song, Kunlin; Shen, Tingting; Zhang, Chu; He, Yong

    2018-02-27

    Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice.

  8. Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis

    PubMed Central

    Ye, Lanhan; Song, Kunlin; Shen, Tingting

    2018-01-01

    Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R2 more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc2 and Rp2 reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice. PMID:29495445

  9. The microbiological profile and presence of bloodstream infection influence mortality rates in necrotizing fasciitis

    PubMed Central

    2011-01-01

    Introduction Necrotizing fasciitis (NF) is a life threatening infectious disease with a high mortality rate. We carried out a microbiological characterization of the causative pathogens. We investigated the correlation of mortality in NF with bloodstream infection and with the presence of co-morbidities. Methods In this retrospective study, we analyzed 323 patients who presented with necrotizing fasciitis at two different institutions. Bloodstream infection (BSI) was defined as a positive blood culture result. The patients were categorized as survivors and non-survivors. Eleven clinically important variables which were statistically significant by univariate analysis were selected for multivariate regression analysis and a stepwise logistic regression model was developed to determine the association between BSI and mortality. Results Univariate logistic regression analysis showed that patients with hypotension, heart disease, liver disease, presence of Vibrio spp. in wound cultures, presence of fungus in wound cultures, and presence of Streptococcus group A, Aeromonas spp. or Vibrio spp. in blood cultures, had a significantly higher risk of in-hospital mortality. Our multivariate logistic regression analysis showed a higher risk of mortality in patients with pre-existing conditions like hypotension, heart disease, and liver disease. Multivariate logistic regression analysis also showed that presence of Vibrio spp. in wound cultures, and presence of Streptococcus Group A in blood cultures were associated with a high risk of mortality while debridement ≥ 3 was associated with improved survival. Conclusions Mortality in patients with necrotizing fasciitis was significantly associated with the presence of Vibrio in wound cultures and Streptococcus group A in blood cultures. PMID:21693053

  10. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
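
    To illustrate how a multiple logistic regression turns basin characteristics into a modeled probability, here is a minimal sketch fitted by plain gradient ascent. It is not the statistics package or the model used in the study: the two predictors (fraction of steep slope, fraction burned), the data, the learning rate, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(coef, row):
    """Modeled probability for one row; coef[0] is the intercept."""
    return sigmoid(coef[0] + sum(c * x for c, x in zip(coef[1:], row)))

def log_likelihood(coef, rows, y):
    """Bernoulli log-likelihood of the observed outcomes."""
    total = 0.0
    for row, yi in zip(rows, y):
        p = predict(coef, row)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total

def fit_logistic(rows, y, rate=0.5, steps=2000):
    """Gradient ascent on the mean log-likelihood, starting from zeros."""
    coef = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(coef)
        for row, yi in zip(rows, y):
            err = yi - predict(coef, row)
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        coef = [c + rate * g / n for c, g in zip(coef, grad)]
    return coef

# Hypothetical basins: (fraction steeper than 30 percent, fraction burned).
rows = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.6), (0.5, 0.5),
        (0.5, 0.5), (0.6, 0.4), (0.7, 0.8), (0.8, 0.9)]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = debris flow occurred

coef = fit_logistic(rows, y)
p_new = predict(coef, (0.9, 0.9))  # probability for an unseen basin
```

    Repeating such a fit over every subset of candidate predictors and comparing the resulting models is one way to carry out the all-combinations search the abstract describes, although the criterion the authors used to rank models is not stated here.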

  11. The quantitative structure-insecticidal activity relationships from plant derived compounds against chikungunya and zika Aedes aegypti (Diptera:Culicidae) vector.

    PubMed

    Saavedra, Laura M; Romanelli, Gustavo P; Rozo, Ciro E; Duchowicz, Pablo R

    2018-01-01

    The insecticidal activity of a series of 62 plant derived molecules against the chikungunya, dengue and zika vector, the Aedes aegypti (Diptera:Culicidae) mosquito, is subjected to a Quantitative Structure-Activity Relationships (QSAR) analysis. The Replacement Method (RM) variable subset selection technique based on Multivariable Linear Regression (MLR) proves to be successful for exploring 4885 molecular descriptors calculated with Dragon 6. The predictive capability of the obtained models is confirmed through an external test set of compounds, Leave-One-Out (LOO) cross-validation and Y-Randomization. The present study constitutes a first necessary computational step for designing less toxic insecticides. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Urban aerosols harbor diverse and dynamic bacterial populations

    PubMed Central

    Brodie, Eoin L.; DeSantis, Todd Z.; Parker, Jordan P. Moberg; Zubietta, Ingrid X.; Piceno, Yvette M.; Andersen, Gary L.

    2007-01-01

    Considering the importance of its potential implications for human health, agricultural productivity, and ecosystem stability, surprisingly little is known regarding the composition or dynamics of the atmosphere's microbial inhabitants. Using a custom high-density DNA microarray, we detected and monitored bacterial populations in two U.S. cities over 17 weeks. These urban aerosols contained at least 1,800 diverse bacterial types, a richness approaching that of some soil bacterial communities. We also reveal the consistent presence of bacterial families with pathogenic members including environmental relatives of select agents of bioterrorism significance. Finally, using multivariate regression techniques, we demonstrate that temporal and meteorological influences can be stronger factors than location in shaping the biological composition of the air we breathe. PMID:17182744

  13. [Influence of sample surface roughness on mathematical model of NIR quantitative analysis of wood density].

    PubMed

    Huang, An-Min; Fei, Ben-Hua; Jiang, Ze-Hui; Hse, Chung-Yun

    2007-09-01

    Near infrared spectroscopy is widely used as a quantitative method, and the main multivariate techniques consist of regression methods used to build prediction models; however, the accuracy of the analysis results is affected by many factors. In the present paper, the influence of different sample roughness on the mathematical model of NIR quantitative analysis of wood density was studied. The result of experiments showed that if the roughness of predicted samples was consistent with that of calibrated samples, the result was good, otherwise the error would be much higher. The roughness-mixed model was more flexible and adaptable to different sample roughness. The prediction ability of the roughness-mixed model was much better than that of the single-roughness model.

  14. Improved estimation of PM2.5 using Lagrangian satellite-measured aerosol optical depth

    NASA Astrophysics Data System (ADS)

    Olivas Saunders, Rolando

    Suspended particulate matter (aerosols) with aerodynamic diameters less than 2.5 μm (PM2.5) has negative effects on human health, plays an important role in climate change and also causes the corrosion of structures by acid deposition. Accurate estimates of PM2.5 concentrations are thus relevant in air quality, epidemiology, cloud microphysics and climate forcing studies. Aerosol optical depth (AOD) retrieved by the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite instrument has been used as an empirical predictor to estimate ground-level concentrations of PM2.5. These estimates usually have large uncertainties and errors. The main objective of this work is to assess the value of using upwind (Lagrangian) MODIS-AOD as predictors in empirical models of PM2.5. The upwind locations of the Lagrangian AOD were estimated using modeled backward air trajectories. Since the specification of an arrival elevation is somewhat arbitrary, trajectories were calculated to arrive at four different elevations at ten measurement sites within the continental United States. A systematic examination revealed trajectory model calculations to be sensitive to starting elevation. With a 500 m difference in starting elevation, the 48-hr mean horizontal separation of trajectory endpoints was 326 km. When the difference in starting elevation was doubled and tripled to 1000 m and 1500 m, the mean horizontal separation of trajectory endpoints approximately doubled and tripled to 627 km and 886 km, respectively. A seasonal dependence of this sensitivity was also found: the smallest mean horizontal separation of trajectory endpoints was exhibited during the summer and the largest separations during the winter. A daily average AOD product was generated and coupled to the trajectory model in order to determine AOD values upwind of the measurement sites during the period 2003-2007. 
Empirical models that included in situ AOD and upwind AOD as predictors of PM2.5 were generated by multivariate linear regressions using the least squares method. The multivariate models showed improved performance over the single variable regression (PM2.5 and in situ AOD) models. The statistical significance of the improvement of the multivariate models over the single variable regression models was tested using the extra sum of squares principle. In many cases, even when the R-squared was high for the multivariate models, the improvement over the single models was not statistically significant. The R-squared of these multivariate models varied with respect to seasons, with the best performance occurring during the summer months. A set of seasonal categorical variables was included in the regressions to exploit this variability. The multivariate regression models that included these categorical seasonal variables performed better than the models that didn't account for seasonal variability. Furthermore, 71% of these regressions exhibited improvement over the single variable models that was statistically significant at a 95% confidence level.
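
    The extra sum of squares principle compares a reduced model with a full model that nests it, using F = ((SSE_reduced - SSE_full) / q) / (SSE_full / (n - p_full)), where q is the number of added predictors and p_full counts the full model's coefficients including the intercept. A minimal sketch follows, using ordinary least squares via the normal equations; the data values and function names are hypothetical and are not taken from the thesis.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def sse(columns, y):
    """Fit y on an intercept plus the predictor columns; return the residual SS."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def extra_ss_f(reduced_cols, full_cols, y):
    """F statistic for the predictors in full_cols that reduced_cols lacks."""
    sse_r, sse_f = sse(reduced_cols, y), sse(full_cols, y)
    q = len(full_cols) - len(reduced_cols)
    df_full = len(y) - (len(full_cols) + 1)
    return ((sse_r - sse_f) / q) / (sse_f / df_full)

# Hypothetical daily values, for illustration only.
aod_local = [0.10, 0.15, 0.22, 0.30, 0.35, 0.41, 0.48, 0.55]
aod_upwind = [0.20, 0.12, 0.30, 0.25, 0.45, 0.38, 0.60, 0.50]
pm25 = [6.1, 7.0, 9.8, 10.9, 14.2, 14.0, 18.5, 17.9]

f_stat = extra_ss_f([aod_local], [aod_local, aod_upwind], pm25)
```

    The statistic would be compared with the critical value of an F distribution with q and n - p_full degrees of freedom at the chosen level (5% for the 95% confidence level quoted above); that lookup is left out because it needs a statistics library or table.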

  15. Multivariate regression model for partitioning tree volume of white oak into round-product classes

    Treesearch

    Daniel A. Yaussy; David L. Sonderman

    1984-01-01

    Describes the development of multivariate equations that predict the expected cubic volume of four round-product classes from independent variables composed of individual tree-quality characteristics. Although the model has limited application at this time, it does demonstrate the feasibility of partitioning total tree cubic volume into round-product classes based on...

  16. When and why a colonoscopist should discontinue colonoscopy by himself?

    PubMed

    Gan, Tao; Yang, Jin-Lin; Wu, Jun-Chao; Wang, Yi-Ping; Yang, Li

    2015-07-07

    To investigate when and why a colonoscopist should discontinue incomplete colonoscopy by himself. In this cross-sectional study, 517 difficult colonoscope insertions (Grade C, Kudo's difficulty classification) screened from 37800 colonoscopy insertions were collected from April 2004 to June 2014 by three 4th-level (Kudo's classification) colonoscopists. The following common factors for the incomplete insertion were excluded: structural obstruction of the colon or rectum, insufficient colon cleansing, discontinuation due to patient's discomfort or pain, severe colon disease with a perforation risk (e.g., severe ischemic colonopathy). All the excluded patients were re-scheduled if permission was obtained from the patients whose intubation had failed. If the repeat intubations were still a failure because of the difficult operative techniques, those patients were also included in this study. The patient's age, sex, anesthesia and colonoscope type were recorded before colonoscopy. During the colonoscopic examination, the influencing factors of fixation, tortuosity, laxity and redundancy of the colon were assessed, and the insertion time (> 10 min or ≤ 10 min) was recorded. The insertion time was analyzed by t-test, and other factors were analyzed by univariate and multivariate logistic regression. Three hundred and twenty-two (62.3%) of the 517 insertions were complete in the colonoscope insertion into the ileocecum, but 195 (37.7%) failed in the insertion. Fixation, tortuosity, laxity or redundancy occurred during the colonoscopic examination. 
Multivariate logistic regression analysis revealed that fixation (OR = 0.06, 95%CI: 0.03-0.16, P < 0.001) and tortuosity (OR = 0.04, 95%CI: 0.02-0.08, P < 0.001) were significantly related to the insertion into the ileocecum in the left hemicolon; multivariate logistic regression analysis also revealed that fixation (OR = 0.16, 95%CI: 0.06-0.39, P < 0.001), tortuosity (OR = 0.23, 95%CI: 0.13-0.43, P < 0.001), redundancy (OR = 0.12, 95%CI: 0.05-0.26, P < 0.001) and sex (OR = 0.35, 95%CI: 0.20-0.63, P < 0.001) were significantly related to the insertion into the ileocecum in the right hemicolon. Prolonged insertion time (> 10 min) was an unfavorable factor for the insertion into the ileocecum. Colonoscopy should be discontinued if freedom of the colonoscope body's insertion and rotation is completely lost, and the insertion time is prolonged over 30 min.

  17. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol lowering drugs

    PubMed Central

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin

    2013-01-01

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
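
    For reference, the standard one-parameter Box-Cox transformation is (y^λ - 1) / λ for λ ≠ 0 and log(y) for λ = 0, defined for positive y. The sketch below applies a separate λ to each response, as the abstract describes; the λ values and lipid measurements are hypothetical, and in the paper the transformation parameters are estimated within the Bayesian model rather than fixed in advance.

```python
import math

def box_cox(y, lam):
    """Standard Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

# Hypothetical per-response transformation parameters, for illustration only.
lambdas = {"LDL-C": 0.5, "HDL-C": 0.0, "TG": -0.5}

# Hypothetical measurements (mg/dL) for one patient.
patient = {"LDL-C": 130.0, "HDL-C": 45.0, "TG": 180.0}

transformed = {name: box_cox(value, lambdas[name])
               for name, value in patient.items()}
```

    With λ = 1 the transform only shifts the data by one unit, while smaller values of λ compress the right tail more strongly, which is why the transformation can help with skewed responses such as triglycerides.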

  18. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol-lowering drugs.

    PubMed

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin

    2013-10-15

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Markov chain Monte Carlo sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.

  19. Application of multivariate statistical techniques in microbial ecology

    PubMed Central

    Paliy, O.; Shankar, V.

    2016-01-01

    Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological datasets. The effect has been especially noticeable in the field of microbial ecology, where new experimental approaches have provided in-depth assessments of the composition, functions, and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces large amounts of data, powerful statistical techniques of multivariate analysis are well suited to analyze and interpret these datasets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular dataset. In this review we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive, and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and dataset structure. PMID:26786791

  20. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert M.

    2013-01-01

    A new regression model search algorithm was developed that may be applied to both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The algorithm is a simplified version of a more complex algorithm that was originally developed for the NASA Ames Balance Calibration Laboratory. The new algorithm performs regression model term reduction to prevent overfitting of data. It has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a regression model search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression model. Therefore, the simplified algorithm is not intended to replace the original algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new search algorithm.

  1. Regional trends in short-duration precipitation extremes: a flexible multivariate monotone quantile regression approach

    NASA Astrophysics Data System (ADS)

    Cannon, Alex

    2017-04-01

    Estimating historical trends in short-duration rainfall extremes at regional and local scales is challenging due to low signal-to-noise ratios and the limited availability of homogenized observational data. In addition to being of scientific interest, trends in rainfall extremes are of practical importance, as their presence calls into question the stationarity assumptions that underpin traditional engineering and infrastructure design practice. Even with these fundamental challenges, increasingly complex questions are being asked about time series of extremes. For instance, users may not only want to know whether or not rainfall extremes have changed over time, they may also want information on the modulation of trends by large-scale climate modes or on the nonstationarity of trends (e.g., identifying hiatus periods or periods of accelerating positive trends). Efforts have thus been devoted to the development and application of more robust and powerful statistical estimators for regional and local scale trends. While a standard nonparametric method like the regional Mann-Kendall test, which tests for the presence of monotonic trends (i.e., strictly non-decreasing or non-increasing changes), makes fewer assumptions than parametric methods and pools information from stations within a region, it is not designed to visualize detected trends, include information from covariates, or answer questions about the rate of change in trends. As a remedy, monotone quantile regression (MQR) has been developed as a nonparametric alternative that can be used to estimate a common monotonic trend in extremes at multiple stations. Quantile regression makes efficient use of data by directly estimating conditional quantiles based on information from all rainfall data in a region, i.e., without having to precompute the sample quantiles. The MQR method is also flexible and can be used to visualize and analyze the nonlinearity of the detected trend. However, it is fundamentally a univariate technique, and cannot incorporate information from additional covariates, for example ENSO state or physiographic controls on extreme rainfall within a region. Here, the univariate MQR model is extended to allow the use of multiple covariates. Multivariate monotone quantile regression (MMQR) is based on a single hidden-layer feedforward network with the quantile regression error function and partial monotonicity constraints. The MMQR model is demonstrated via Monte Carlo simulations and the estimation and visualization of regional trends in moderate rainfall extremes based on homogenized sub-daily precipitation data at stations in Canada.
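
    The "quantile regression error function" mentioned above is commonly the check (pinball) loss. The sketch below assumes that form and uses it in the simplest possible model, a single constant, to show that minimising it yields a sample quantile. The neural network and the monotonicity constraints of MMQR are not reproduced, and the rainfall values are made up for illustration.

```python
def pinball_loss(y, q, tau):
    """Check-function loss for one observation y and predicted quantile q."""
    r = y - q
    return tau * r if r >= 0 else (tau - 1.0) * r

def mean_pinball(ys, q, tau):
    """Average check loss of a constant prediction q over all observations."""
    return sum(pinball_loss(y, q, tau) for y in ys) / len(ys)

def best_constant_quantile(ys, tau):
    """Search the observed values for the constant that minimises the loss."""
    return min(sorted(ys), key=lambda q: mean_pinball(ys, q, tau))

# Hypothetical annual-maximum rainfall values in mm (illustration only).
rain = [12.0, 15.0, 9.0, 30.0, 18.0, 22.0, 11.0, 40.0, 16.0, 14.0, 25.0]
q90 = best_constant_quantile(rain, 0.9)
```

    Under-predictions are weighted by tau and over-predictions by 1 - tau, so for tau = 0.9 the minimiser is pushed toward the upper tail; a regression model replaces the constant with a function of the covariates.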

  2. Application of multivariable search techniques to the optimization of airfoils in a low speed nonlinear inviscid flow field

    NASA Technical Reports Server (NTRS)

    Hague, D. S.; Merz, A. W.

    1975-01-01

    Multivariable search techniques are applied to a particular class of airfoil optimization problems. These are the maximization of lift and the minimization of disturbance pressure magnitude in an inviscid nonlinear flow field. A variety of multivariable search techniques contained in an existing nonlinear optimization code, AESOP, are applied to this design problem. These techniques include elementary single parameter perturbation methods, organized search such as steepest-descent, quadratic, and Davidon methods, randomized procedures, and a generalized search acceleration technique. Airfoil design variables are seven in number and define perturbations to the profile of an existing NACA airfoil. The relative efficiency of the techniques are compared. It is shown that elementary one parameter at a time and random techniques compare favorably with organized searches in the class of problems considered. It is also shown that significant reductions in disturbance pressure magnitude can be made while retaining reasonable lift coefficient values at low free stream Mach numbers.

  3. Multivariable PID controller design tuning using bat algorithm for activated sludge process

    NASA Astrophysics Data System (ADS)

    Atikah Nor’Azlan, Nur; Asmiza Selamat, Nur; Mat Yahya, Nafrizuan

    2018-04-01

    This project concerns the design of a multivariable PID (MPID) controller for a multi-input multi-output system using four multivariable PID tuning methods: Davison, Penttinen-Koivo, Maciejowski and a proposed combined method. The aim of the study is to investigate the performance of a selected optimization technique, the Bat Algorithm (BA), in tuning the parameters of the MPID controller. All MPID-BA tuning results are compared and analyzed, and the best MPID-BA method is then chosen based on system performance in terms of transient response.

  4. Application of a voltammetric electronic tongue and near infrared spectroscopy for a rapid umami taste assessment.

    PubMed

    Bagnasco, Lucia; Cosulich, M Elisabetta; Speranza, Giovanna; Medini, Luca; Oliveri, Paolo; Lanteri, Silvia

    2014-08-15

    The relationships between a sensory attribute and analytical measurements, performed by electronic tongue (ET) and near-infrared spectroscopy (NIRS), were investigated in order to develop a rapid method for the assessment of umami taste. Commercially available umami products and some amino acids were submitted to sensory analysis. Results were analysed in comparison with the outcomes of analytical measurements. Multivariate exploratory analysis was performed by principal component analysis (PCA). Calibration models for prediction of the umami taste on the basis of ET and NIR signals were obtained using partial least squares (PLS) regression. Different approaches for merging data from the two analytical instruments were considered. Both techniques were shown to provide information related to umami taste. In particular, ET signals showed the higher correlation with the umami attribute. Data fusion was found to be slightly beneficial, though not so significantly as to justify the coupled use of the two analytical techniques. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Regionalisation of low flow frequency curves for the Peninsular Malaysia

    NASA Astrophysics Data System (ADS)

    Mamun, Abdullah A.; Hashim, Alias; Daoud, Jamal I.

    2010-02-01

    Summary: Regional maps and equations for the magnitude and frequency of 1, 7 and 30-day low flows were derived and are presented in this paper. The river gauging stations of neighbouring catchments that produced similar low flow frequency curves were grouped together. As such, Peninsular Malaysia was divided into seven low flow regions. Regional equations were developed using the multivariate regression technique. An empirical relationship was developed for mean annual minimum flow as a function of catchment area, mean annual rainfall and mean annual evaporation. The regional equations exhibited a good coefficient of determination (R² > 0.90). Three low flow frequency curves showing the low, mean and high limits for each region were proposed based on a graphical best-fit technique. Knowing the catchment area, mean annual rainfall and evaporation in the region, design low flows of different durations can be easily estimated for ungauged catchments. This procedure is expected to overcome the problem of data unavailability in estimating low flows in Peninsular Malaysia.
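
    The abstract does not state the functional form of the regional equations. As a sketch only, the code below assumes a power law, Q = c * A^b1 * P^b2 * E^b3, which becomes a linear multivariate regression after taking logarithms, and fits it by ordinary least squares. The catchment values, the generating coefficients and all function names are synthetic assumptions, not data or results from the paper.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def r_squared(rows, y, beta):
    """Coefficient of determination of the fitted linear model."""
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic catchments: (area km^2, rainfall mm, evaporation mm). Not real data.
catchments = [(50, 2000, 1200), (120, 2400, 1300), (300, 1800, 1250),
              (800, 2600, 1400), (1500, 2200, 1350), (40, 3000, 1100)]
# Synthetic flows generated from an assumed law Q = 1e-6 * A * P^2 / E.
flows = [1e-6 * a * p ** 2.0 / e for a, p, e in catchments]

X = [[1.0, math.log(a), math.log(p), math.log(e)] for a, p, e in catchments]
log_flows = [math.log(q) for q in flows]
beta = fit_ols(X, log_flows)          # [log c, b1, b2, b3]
fit_quality = r_squared(X, log_flows, beta)
```

    Because the synthetic flows carry no noise, the fit recovers the generating exponents almost exactly; with gauged data the same steps would yield the regional coefficients and an R² that can be compared against a threshold such as the 0.90 reported above.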

  6. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data

    PubMed Central

    Hepworth, Philip J.; Nefedov, Alexey V.; Muchnik, Ilya B.; Morgan, Kenton L.

    2012-01-01

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide. PMID:22319115
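
    The accuracy figure above is an area under the receiver operating characteristic curve (AUC). A small sketch of how that quantity can be computed from labels and classifier scores follows, using its pairwise-ranking interpretation. The labels and scores are hypothetical, and the support vector machine itself is not reproduced.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs the scores rank correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5          # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical flocks: 1 = high hock burn prevalence, with classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)        # 3 of the 4 pairs are ranked correctly
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so 0.78 on unseen data means a randomly chosen high-prevalence flock outscores a randomly chosen low-prevalence flock about 78% of the time.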

  7. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    PubMed

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.

  8. A spline-based regression parameter set for creating customized DARTEL MRI brain templates from infancy to old age.

    PubMed

    Wilke, Marko

    2018-02-01

    This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1-75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php.

  9. [Influences of environmental factors and interaction of several chemokines gene-environmental on systemic lupus erythematosus].

    PubMed

    Ye, Dong-qing; Hu, Yi-song; Li, Xiang-pei; Huang, Fen; Yang, Shi-gui; Hao, Jia-hu; Yin, Jing; Zhang, Guo-qing; Liu, Hui-hui

    2004-11-01

    To explore the impact of environmental factors, daily lifestyle, psycho-social factors and the interactions between environmental factors and chemokine genes on systemic lupus erythematosus (SLE). A case-control study was carried out and environmental factors for SLE were analyzed by univariate and multivariate unconditional logistic regression. Interactions between environmental factors and chemokine polymorphisms contributing to SLE were also analyzed with a logistic regression model. Nineteen factors were associated with SLE in univariate unconditional logistic regression. However, in multivariate unconditional logistic regression only five factors were associated with the disease: drinking well water (OR=0.099) was a protective factor for SLE, and multiple drug allergy (OR=8.174), over-exposure to sunshine (OR=18.339), taking antibiotics (OR=9.630) and oral contraceptives were risk factors for SLE. The unconditional logistic regression model showed an interaction between eating irritating food and the -2518MCP-1G/G genotype (OR=4.387). No interaction among environmental factors contributing to SLE was found in this study. Many environmental factors were related to SLE, and there was an interaction between the -2518MCP-1G/G genotype and eating irritating food.

  10. Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data.

    PubMed

    Abram, Samantha V; Helwig, Nathaniel E; Moodie, Craig A; DeYoung, Colin G; MacDonald, Angus W; Waller, Niels G

    2016-01-01

    Recent advances in fMRI research highlight the use of multivariate methods for examining whole-brain connectivity. Complementary data-driven methods are needed for determining the subset of predictors related to individual differences. Although commonly used for this purpose, ordinary least squares (OLS) regression may not be ideal due to multi-collinearity and over-fitting issues. Penalized regression is a promising and underutilized alternative to OLS regression. In this paper, we propose a nonparametric bootstrap quantile (QNT) approach for variable selection with neuroimaging data. We use real and simulated data, as well as annotated R code, to demonstrate the benefits of our proposed method. Our results illustrate the practical potential of our proposed bootstrap QNT approach. Our real data example demonstrates how our method can be used to relate individual differences in neural network connectivity with an externalizing personality measure. Also, our simulation results reveal that the QNT method is effective under a variety of data conditions. Penalized regression yields more stable estimates and sparser models than OLS regression in situations with large numbers of highly correlated neural predictors. Our results demonstrate that penalized regression is a promising method for examining associations between neural predictors and clinically relevant traits or behaviors. These findings have important implications for the growing field of functional connectivity research, where multivariate methods produce numerous, highly correlated brain networks.
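
    The abstract does not spell out the bootstrap quantile (QNT) procedure, and the authors' annotated code is in R. The Python sketch below is therefore only one plausible reading, stated as an assumption: refit a penalized (lasso) regression on bootstrap resamples and keep the predictors whose percentile interval of coefficients excludes zero. The data are synthetic, and the penalty, number of draws and quantile levels are arbitrary illustrative choices.

```python
import random

def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty, clipping at zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso(X, y, penalty, sweeps=20):
    """Lasso by cyclic coordinate descent (no intercept).

    Minimises (1 / 2n) * RSS + penalty * sum(|beta_j|).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0    # covariance of column j with the partial residual
            norm = 0.0   # squared length of column j
            for row, yi in zip(X, y):
                partial = yi - sum(beta[k] * row[k] for k in range(p) if k != j)
                rho += row[j] * partial
                norm += row[j] ** 2
            beta[j] = soft_threshold(rho / n, penalty) / (norm / n)
    return beta

def bootstrap_select(X, y, penalty, draws=100, lower=0.025, upper=0.975, seed=0):
    """Return indices whose bootstrap percentile interval excludes zero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    samples = [[] for _ in range(p)]
    for _ in range(draws):
        idx = [rng.randrange(n) for _ in range(n)]   # resample rows with replacement
        beta = lasso([X[i] for i in idx], [y[i] for i in idx], penalty)
        for j in range(p):
            samples[j].append(beta[j])
    selected = []
    for j in range(p):
        s = sorted(samples[j])
        lo = s[int(lower * (draws - 1))]
        hi = s[int(upper * (draws - 1))]
        if lo > 0.0 or hi < 0.0:
            selected.append(j)
    return selected

# Synthetic data: only predictor 0 drives the outcome (illustration only).
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [2.0 * row[0] + data_rng.gauss(0, 1) for row in X]
selected = bootstrap_select(X, y, penalty=0.1)
```

    Resampling addresses the instability the abstract attributes to highly correlated predictors: a variable is retained only if its penalized coefficient keeps the same sign across most resamples, rather than on the strength of a single fit.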

  11. Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data

    PubMed Central

    Abram, Samantha V.; Helwig, Nathaniel E.; Moodie, Craig A.; DeYoung, Colin G.; MacDonald, Angus W.; Waller, Niels G.

    2016-01-01

    Recent advances in fMRI research highlight the use of multivariate methods for examining whole-brain connectivity. Complementary data-driven methods are needed for determining the subset of predictors related to individual differences. Although commonly used for this purpose, ordinary least squares (OLS) regression may not be ideal due to multi-collinearity and over-fitting issues. Penalized regression is a promising and underutilized alternative to OLS regression. In this paper, we propose a nonparametric bootstrap quantile (QNT) approach for variable selection with neuroimaging data. We use real and simulated data, as well as annotated R code, to demonstrate the benefits of our proposed method. Our results illustrate the practical potential of our proposed bootstrap QNT approach. Our real data example demonstrates how our method can be used to relate individual differences in neural network connectivity with an externalizing personality measure. Also, our simulation results reveal that the QNT method is effective under a variety of data conditions. Penalized regression yields more stable estimates and sparser models than OLS regression in situations with large numbers of highly correlated neural predictors. Our results demonstrate that penalized regression is a promising method for examining associations between neural predictors and clinically relevant traits or behaviors. These findings have important implications for the growing field of functional connectivity research, where multivariate methods produce numerous, highly correlated brain networks. PMID:27516732

  12. Binge eating disorder and depressive symptoms among females of child-bearing age: the Korea Nurses' Health Study.

    PubMed

    Kim, O; Kim, M S; Kim, J; Lee, J E; Jung, H

    2018-01-17

    Most studies regarding the relationship between binge eating disorder (BED) and depression have targeted obese populations. However, nurses, particularly female nurses, are an occupational group that faces these issues for various reasons, including high stress and shift work. This study investigated the prevalence of BED and the correlation between BED and severity of self-reported depressive symptoms among female nurses in South Korea. Participants were 7,267 female nurses, of whom 502 had symptoms of BED. Using the propensity score matching (PSM) technique, 502 nurses with BED and 502 without BED were included in the analyses. Data were analyzed using descriptive statistics, Spearman's correlation, and multivariable ordinal logistic regression analysis. The prevalence of binge eating disorder among the nurses was 6.90%, and 81.3% of nurses displayed some level of depressive symptoms. Multivariable ordinal logistic regression analysis revealed that age (40 years old and older), alcohol consumption (frequent drinkers), self-rated health, sleep problems, and stress were associated with self-reported depression symptoms. Overall, after adjusting for confounders, nurses with BED had 1.80 times the risk (95% CI = [1.41-2.30]; p-value < 0.001) of experiencing a greater severity of self-reported depression symptoms. Korean female nurses showed a higher prevalence of both binge eating disorder and depressive symptoms, and the association between the two was confirmed in this study. Therefore, hospital management and health policy makers should take note and agree both to screen nurses for these problems and to provide organized and systematic assistance.
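
    The abstract does not say which matching algorithm was used. As one common variant, the sketch below assumes greedy 1:1 nearest-neighbour matching on an already estimated propensity score, without replacement and with an optional caliper. The subject identifiers and scores are hypothetical.

```python
def match_nearest(treated, controls, caliper=None):
    """Greedy 1:1 nearest-neighbour matching on a propensity score.

    treated and controls map a subject id to its estimated propensity score.
    Each control is used at most once.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue                 # no acceptable control for this subject
        pairs.append((t_id, c_id))
        del available[c_id]          # matching without replacement
    return pairs

# Hypothetical scores; in practice they would come from a fitted logistic model.
with_bed = {"n1": 0.30, "n2": 0.62, "n3": 0.90}
without_bed = {"c1": 0.28, "c2": 0.35, "c3": 0.60, "c4": 0.10}
pairs = match_nearest(with_bed, without_bed, caliper=0.05)
```

    Here the third subject stays unmatched because no remaining control lies within the caliper. In a study that retains every case, as with the 502 matched pairs above, the control pool is large enough for each case to find a partner.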

  13. Reconstruction of the Irradiated Breast: A National Claims-Based Assessment of Postoperative Morbidity.

    PubMed

    Chetta, Matthew D; Aliu, Oluseyi; Zhong, Lin; Sears, Erika D; Waljee, Jennifer F; Chung, Kevin C; Momoh, Adeyiza O

    2017-04-01

    Implant-based reconstruction rates have risen among irradiation-treated breast cancer patients in the United States. This study aims to assess the morbidity associated with various breast reconstruction techniques in irradiated patients. From the MarketScan Commercial Claims and Encounters database, the authors selected breast cancer patients who had undergone mastectomy, irradiation, and breast reconstruction from 2009 to 2012. Demographic and clinical treatment data, including data on the timing of irradiation relative to breast reconstruction were recorded. Complications and failures after implant and autologous reconstruction were also recorded. A multivariable logistic regression model was developed with postoperative complications as the dependent variable and patient demographic and clinical variables as independent variables. Four thousand seven hundred eighty-one irradiated patients who met the inclusion criteria were selected. A majority of the patients [n = 3846 (80 percent)] underwent reconstruction with implants. Overall complication rates were 45.3 percent and 30.8 percent for patients with implant and autologous reconstruction, respectively. Failure of reconstruction occurred in 29.4 percent of patients with implant reconstruction compared with 4.3 percent of patients with autologous reconstruction. In multivariable logistic regression, irradiated patients with implant reconstruction had two times the odds of having any complication and 11 times the odds of failure relative to patients with autologous reconstruction. Implant-based breast reconstruction in the irradiated patient, although popular, is associated with significant morbidity. Failures of reconstruction with implants in these patients approach 30 percent in the short term, suggesting a need for careful shared decision-making, with full disclosure of the potential morbidity. Therapeutic, III.

  14. Outcomes following Kidney transplantation in IgA nephropathy: a UNOS/OPTN analysis.

    PubMed

    Kadiyala, Aditya; Mathew, Anna T; Sachdeva, Mala; Sison, Cristina P; Shah, Hitesh H; Fishbane, Steven; Jhaveri, Kenar D

    2015-10-01

    This study updates assessment of post-transplant outcomes in IgAN patients in the modern era of immunosuppression. Using UNOS/OPTN data, patients ≥18 yr of age with first kidney transplant (1/1/1999 to 12/31/2008) were analyzed. Multivariable Cox regression models and propensity score-based matching techniques were used to estimate hazard ratios (HRs) for death-censored allograft survival (DCGS) and patient survival in IgAN compared to non-IgAN. Results of multivariable regression were stratified by donor type (living vs. deceased). A total of 107,747 recipients were included (4,589 with IgAN and 103,158 with non-IgAN). Adjusted HR for DCGS showed no significant difference between IgAN and non-IgAN. IgAN had higher patient survival compared to non-IgAN (HR 0.54, 95% CI 0.47-0.62, p < 0.0001 for deceased donors; HR 0.42, 95% CI 0.33-0.54, p < 0.0001 for living donors). Propensity score-matched analysis was similar, with no significant difference in DCGS between matched groups and higher patient survival in IgAN patients compared to non-IgAN group (HR 0.54, 95% CI 0.47-0.63; p-value < 0.0001). IgAN patients with first kidney transplant have superior patient survival and similar graft survival compared to non-IgAN recipients. Results can be used in prognostication and informed decision-making about kidney transplantation in patients with IgAN. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  15. The association between Type D personality and the metabolic syndrome: a cross-sectional study in a University-based outpatient lipid clinic

    PubMed Central

    2011-01-01

    Background: Type D personality has been associated in the past with increased cardiovascular mortality among patients with established coronary heart disease. Very few studies have investigated the association of type D personality with traditional cardiovascular risk factors. In this study, we assessed the association between type D personality and the metabolic syndrome. Findings: New consecutive patients referred to an outpatient lipid clinic for evaluation of possible metabolic syndrome were eligible for inclusion in the study. The metabolic syndrome was defined according to the International Diabetes Federation (IDF) diagnostic criteria. Type D personality was assessed with the DS-14 scale. Multivariate regression techniques were used to investigate the association between personality and the metabolic syndrome, adjusting for a number of medical and psychiatric confounders. Three hundred and fifty-nine persons were screened, of whom 206 met the diagnostic criteria for the metabolic syndrome ("cases") and 153 did not ("control group"). The prevalence of type D personality was significantly higher in the cases than in the control group (44% versus 15% respectively, p < 0.001). In multivariate logistic regression analysis the presence of Type D personality was significantly associated with metabolic syndrome independently of other clinical factors, anxiety and depressive symptoms (odds ratio 3.47; 95% Confidence Interval: 1.90 - 6.33). Conclusions: Type D personality was independently associated with the metabolic syndrome in this cross-sectional study. The potential implications of this finding, especially from a clinical or preventive perspective, should be examined in future research. PMID:21466680
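
    An odds ratio from logistic regression is the exponential of a coefficient, and its Wald confidence interval is symmetric on the log scale. The sketch below uses only the figures reported above (OR 3.47, 95% CI 1.90 to 6.33) to recover the implied coefficient and standard error. It assumes a Wald-type interval with z = 1.96, which the abstract does not state, and the function names are illustrative.

```python
import math

def log_scale_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and its standard error from OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se

def or_with_ci(beta, se, z=1.96):
    """Turn a logistic coefficient and standard error into OR and Wald interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Figures reported in the abstract above.
beta, se = log_scale_summary(3.47, 1.90, 6.33)
# For a Wald interval the point estimate is the geometric mean of the bounds.
midpoint_or = math.sqrt(1.90 * 6.33)
rebuilt = or_with_ci(beta, se)
```

    The geometric mean of the reported bounds lands very close to the reported odds ratio, which is consistent with the Wald assumption and is a quick sanity check for transcription errors in published intervals.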

  16. Meta-Analytic Structural Equation Modeling (MASEM): Comparison of the Multivariate Methods

    ERIC Educational Resources Information Center

    Zhang, Ying

    2011-01-01

    Meta-analytic Structural Equation Modeling (MASEM) has drawn interest from many researchers recently. In doing MASEM, researchers usually first synthesize correlation matrices across studies using meta-analysis techniques and then analyze the pooled correlation matrix using structural equation modeling techniques. Several multivariate methods of…

  17. Investigation of the Sensitivity of Transmission Raman Spectroscopy for Polymorph Detection in Pharmaceutical Tablets.

    PubMed

    Feng, Hanzhou; Bondi, Robert W; Anderson, Carl A; Drennen, James K; Igne, Benoît

    2017-08-01

    Polymorph detection is critical for ensuring pharmaceutical product quality in drug substances exhibiting polymorphism. Conventional analytical techniques such as X-ray powder diffraction and solid-state nuclear magnetic resonance are utilized primarily for characterizing the presence and identity of specific polymorphs in a sample. These techniques have encountered challenges in analyzing the constitution of polymorphs in the presence of other components commonly found in pharmaceutical dosage forms. Laborious sample preparation procedures are usually required to achieve satisfactory data interpretability. There is a need for alternative techniques capable of probing pharmaceutical dosage forms rapidly and nondestructively, which is dictated by the practical requirements of applications such as quality monitoring on production lines or when quantifying product shelf lifetime. The sensitivity of transmission Raman spectroscopy for detecting polymorphs in final tablet cores was investigated in this work. Carbamazepine was chosen as a model drug: polymorph form III is the commercial form, whereas form I is an undesired polymorph that requires effective detection. The concentration of form I in a direct compression tablet formulation containing 20% w/w of carbamazepine, 74% w/w of fillers (mannitol and microcrystalline cellulose), and 6% w/w of croscarmellose sodium, silicon dioxide, and magnesium stearate was estimated using transmission Raman spectroscopy. Quantitative models were generated and optimized using multivariate regression and data preprocessing. Prediction uncertainty was estimated for each validation sample by accounting for all the main variables contributing to the prediction. Multivariate detection limits were calculated based on statistical hypothesis testing. The transmission Raman spectroscopic model had an absolute prediction error of 0.241% w/w for the independent validation set. The method detection limit was estimated at 1.31% w/w. The results demonstrated that transmission Raman spectroscopy is a sensitive tool for polymorph detection in pharmaceutical tablets.

  18. Predictors of Peritonitis and the Impact of Peritonitis on Clinical Outcomes of Continuous Ambulatory Peritoneal Dialysis Patients in Taiwan—10 Years’ Experience in a Single Center

    PubMed Central

    Hsieh, Yao-Peng; Chang, Chia-Chu; Wen, Yao-Ko; Chiu, Ping-Fang; Yang, Yu

    2014-01-01

    ♦ Objective: Peritoneal dialysis (PD) has become more prevalent as a treatment modality for end-stage renal disease, and peritonitis remains one of its most devastating complications. The aim of the present investigation was to examine the frequency and predictors of peritonitis and the impact of peritonitis on clinical outcomes. ♦ Methods: Our retrospective observational cohort study enrolled 391 patients who had been treated with continuous ambulatory PD (CAPD) for at least 90 days. Relevant demographic, biochemical, and clinical data were collected for an analysis of CAPD-associated peritonitis, technique failure, drop-out from PD, and patient mortality. ♦ Results: The peritonitis rate was 0.196 episodes per patient-year. Older age (>65 years) was the only identified risk factor associated with peritonitis. A multivariate Cox regression model demonstrated that technique failure occurred more often in patients experiencing peritonitis than in those free of peritonitis (p < 0.001). Kaplan-Meier analysis revealed that the group experiencing peritonitis tended to survive longer than the group that was peritonitis-free (p = 0.11). After multivariate adjustment, the survival advantage reached significance (hazard ratio: 0.64; 95% confidence interval: 0.46 to 0.89; p = 0.006). Compared with the peritonitis-free group, the group experiencing peritonitis also had more drop-out from PD (p = 0.03). ♦ Conclusions: The peritonitis rate was relatively low in the present investigation. Elderly patients were at higher risk of peritonitis episodes. Peritonitis independently predicted technique failure, in agreement with other reports. However, contrary to previous studies, all-cause mortality was better in patients experiencing peritonitis than in those free of peritonitis. The underlying mechanisms of this presumptive “peritonitis paradox” remain to be clarified. PMID:24084840

  19. Predictive equations for the estimation of body size in seals and sea lions (Carnivora: Pinnipedia)

    PubMed Central

    Churchill, Morgan; Clementz, Mark T; Kohno, Naoki

    2014-01-01

    Body size plays an important role in pinniped ecology and life history. However, body size data is often absent for historical, archaeological, and fossil specimens. To estimate the body size of pinnipeds (seals, sea lions, and walruses) for today and the past, we used 14 commonly preserved cranial measurements to develop sets of single variable and multivariate predictive equations for pinniped body mass and total length. Principal components analysis (PCA) was used to test whether separate family specific regressions were more appropriate than single predictive equations for Pinnipedia. The influence of phylogeny was tested with phylogenetic independent contrasts (PIC). The accuracy of these regressions was then assessed using a combination of coefficient of determination, percent prediction error, and standard error of estimation. Three different methods of multivariate analysis were examined: bidirectional stepwise model selection using Akaike information criteria; all-subsets model selection using Bayesian information criteria (BIC); and partial least squares regression. The PCA showed clear discrimination between Otariidae (fur seals and sea lions) and Phocidae (earless seals) for the 14 measurements, indicating the need for family-specific regression equations. The PIC analysis found that phylogeny had a minor influence on relationship between morphological variables and body size. The regressions for total length were more accurate than those for body mass, and equations specific to Otariidae were more accurate than those for Phocidae. Of the three multivariate methods, the all-subsets approach required the fewest number of variables to estimate body size accurately. We then used the single variable predictive equations and the all-subsets approach to estimate the body size of two recently extinct pinniped taxa, the Caribbean monk seal (Monachus tropicalis) and the Japanese sea lion (Zalophus japonicus). 
Body size estimates using single variable regressions generally under- or overestimated body size; however, the all-subsets regression produced body size estimates that were close to the historically recorded body length for these two species. This indicates that the all-subsets regression equations developed in this study can estimate body size accurately. PMID:24916814

  20. [Correlation between gaseous exchange rate, body temperature, and mitochondrial protein content in the liver of mice].

    PubMed

    Muradian, Kh K; Utko, N O; Mozzhukhina, T H; Pishel', I M; Litoshenko, O Ia; Bezrukov, V V; Fraĭfel'd, V E

    2002-01-01

    Correlation and regression relationships between gaseous exchange, thermoregulation, and mitochondrial protein content were analyzed in mice using two- and three-dimensional statistics. Pairwise linear methods of analysis did not reveal any significant correlation between the parameters studied. However, a correlation became evident with three-dimensional and non-linear plotting, for which the coefficients of multivariable correlation reached and even exceeded 0.7-0.8. Calculations based on partial differentiation of the multivariable regression equations lead to the conclusion that, at certain values of VO2, VCO2, and body temperature, negative relations between the systems of gaseous exchange and thermoregulation become dominant.

  1. Hybrid ABC Optimized MARS-Based Modeling of the Milling Tool Wear from Milling Run Experimental Data

    PubMed Central

    García Nieto, Paulino José; García-Gonzalo, Esperanza; Ordóñez Galán, Celestino; Bernardo Sánchez, Antonio

    2016-01-01

    Milling cutters, the cutting tools used in milling machines to perform milling operations, are prone to wear and subsequent failure. In this paper, a practical new hybrid model to predict the milling tool wear in a regular cut, as well as entry cut and exit cut, of a milling tool is proposed. The model was based on the optimization tool termed artificial bee colony (ABC) in combination with the multivariate adaptive regression splines (MARS) technique. This optimization mechanism involved the parameter setting in the MARS training procedure, which significantly influences the regression accuracy. Therefore, an ABC–MARS-based model was successfully used here to predict the milling tool flank wear (output variable) as a function of the following input variables: the time duration of experiment, depth of cut, feed, type of material, etc. Regression with optimal hyperparameters was performed and a determination coefficient of 0.94 was obtained. The ABC–MARS-based model's goodness of fit to experimental data confirmed the good performance of this model. This new model also allowed us to ascertain the most influential parameters on the milling tool flank wear with a view to proposing improvements to milling machines. Finally, conclusions of this study are presented. PMID:28787882
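
    For readers unfamiliar with MARS, the sketch below shows the paired hinge basis functions that a MARS model adds together. It is not the authors' ABC-MARS model: the input name `cut_time`, the knot at 10.0, and all coefficients in `toy_mars_model` are hypothetical values chosen only for illustration.

```python
def hinge_pos(x: float, knot: float) -> float:
    """Right hinge max(0, x - knot): zero up to the knot, linear after it."""
    return max(0.0, x - knot)


def hinge_neg(x: float, knot: float) -> float:
    """Left hinge max(0, knot - x): linear up to the knot, zero after it."""
    return max(0.0, knot - x)


def toy_mars_model(cut_time: float) -> float:
    """Hypothetical one-input MARS-style model: intercept plus weighted hinges.

    The knot (10.0) and the weights are made up; a fitted MARS model would
    select knots and weights from the training data.
    """
    return (0.1
            + 0.02 * hinge_pos(cut_time, 10.0)
            - 0.005 * hinge_neg(cut_time, 10.0))


if __name__ == "__main__":
    for t in (0.0, 10.0, 20.0):
        print(t, round(toy_mars_model(t), 4))
```

    The result is a piecewise-linear curve whose slope changes at the knot. In a hybrid scheme of the kind described above, an outer optimizer would tune the MARS training settings rather than these weights directly.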

  2. Hybrid ABC Optimized MARS-Based Modeling of the Milling Tool Wear from Milling Run Experimental Data.

    PubMed

    García Nieto, Paulino José; García-Gonzalo, Esperanza; Ordóñez Galán, Celestino; Bernardo Sánchez, Antonio

    2016-01-28

    Milling cutters, the cutting tools used in milling machines to perform milling operations, are prone to wear and subsequent failure. In this paper, a practical new hybrid model to predict the milling tool wear in a regular cut, as well as entry cut and exit cut, of a milling tool is proposed. The model was based on the optimization tool termed artificial bee colony (ABC) in combination with the multivariate adaptive regression splines (MARS) technique. This optimization mechanism involved the parameter setting in the MARS training procedure, which significantly influences the regression accuracy. Therefore, an ABC-MARS-based model was successfully used here to predict the milling tool flank wear (output variable) as a function of the following input variables: the time duration of experiment, depth of cut, feed, type of material, etc. Regression with optimal hyperparameters was performed and a determination coefficient of 0.94 was obtained. The ABC-MARS-based model's goodness of fit to experimental data confirmed the good performance of this model. This new model also allowed us to ascertain the most influential parameters on the milling tool flank wear with a view to proposing improvements to milling machines. Finally, conclusions of this study are presented.

  3. Regional flow duration curves: Geostatistical techniques versus multivariate regression

    USGS Publications Warehouse

    Pugliese, Alessio; Farmer, William H.; Castellarin, Attilio; Archfield, Stacey A.; Vogel, Richard M.

    2016-01-01

    A period-of-record flow duration curve (FDC) represents the relationship between the magnitude and frequency of daily streamflows. Prediction of FDCs is of great importance for locations characterized by sparse or missing streamflow observations. We present a detailed comparison of two methods which are capable of predicting an FDC at ungauged basins: (1) an adaptation of the geostatistical method, Top-kriging, employing a linear weighted average of dimensionless empirical FDCs, standardised with a reference streamflow value; and (2) regional multiple linear regression of streamflow quantiles, perhaps the most common method for the prediction of FDCs at ungauged sites. In particular, Top-kriging relies on a metric for expressing the similarity between catchments computed as the negative deviation of the FDC from a reference streamflow value, which we termed total negative deviation (TND). Comparisons of these two methods are made in 182 largely unregulated river catchments in the southeastern U.S. using a three-fold cross-validation algorithm. Our results reveal that the two methods perform similarly throughout flow-regimes, with average Nash-Sutcliffe efficiencies of 0.566 and 0.662 (0.883 and 0.829 on log-transformed quantiles) for the geostatistical and the linear regression models, respectively. Differences in the reproduction of FDCs occurred mostly for low flows with exceedance probability (i.e. duration) above 0.98.
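
    As a reminder of how the Nash-Sutcliffe efficiency quoted above is usually defined, here is a minimal sketch using the standard formula. The function name `nash_sutcliffe` and the sample values are illustrative assumptions and are not taken from the study.

```python
from statistics import fmean


def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    Raises ZeroDivisionError if all observed values are identical.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = fmean(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]                 # made-up streamflow quantiles
    print(nash_sutcliffe(obs, obs))            # 1.0: perfect prediction
    print(nash_sutcliffe(obs, [2.5] * 4))      # 0.0: no better than the mean
```

    An efficiency on log-transformed quantiles can be obtained by applying `math.log` to both series before the call, which gives low flows more weight.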

  4. Estimation of Subpixel Snow-Covered Area by Nonparametric Regression Splines

    NASA Astrophysics Data System (ADS)

    Kuter, S.; Akyürek, Z.; Weber, G.-W.

    2016-10-01

    Measurement of the areal extent of snow cover with high accuracy plays an important role in hydrological and climate modeling. Remotely-sensed data acquired by earth-observing satellites offer great advantages for timely monitoring of snow cover. However, the main obstacle is the tradeoff between temporal and spatial resolution of satellite imageries. Soft or subpixel classification of low or moderate resolution satellite images is a preferred technique to overcome this problem. The most frequently employed snow cover fraction methods applied on Moderate Resolution Imaging Spectroradiometer (MODIS) data have evolved from spectral unmixing and empirical Normalized Difference Snow Index (NDSI) methods to latest machine learning-based artificial neural networks (ANNs). This study demonstrates the implementation of subpixel snow-covered area estimation based on the state-of-the-art nonparametric spline regression method, namely, Multivariate Adaptive Regression Splines (MARS). MARS models were trained by using MODIS top of atmospheric reflectance values of bands 1-7 as predictor variables. Reference percentage snow cover maps were generated from higher spatial resolution Landsat ETM+ binary snow cover maps. A multilayer feed-forward ANN with one hidden layer trained with backpropagation was also employed to estimate the percentage snow-covered area on the same data set. The results indicated that the developed MARS model performed better than th

  5. A High-Dimensional, Multivariate Copula Approach to Modeling Multivariate Agricultural Price Relationships and Tail Dependencies

    Treesearch

    Xuan Chi; Barry Goodwin

    2012-01-01

    Spatial and temporal relationships among agricultural prices have been an important topic of applied research for many years. Such research is used to investigate the performance of markets and to examine linkages up and down the marketing chain. This research has empirically evaluated price linkages by using correlation and regression models and, later, linear and...

  6. Factors affecting plant species composition of hedgerows: relative importance and hierarchy

    NASA Astrophysics Data System (ADS)

    Deckers, Bart; Hermy, Martin; Muys, Bart

    2004-07-01

    Although there has been a clear quantitative and qualitative decline in traditional hedgerow network landscapes during the last century, hedgerows are crucial for the conservation of rural biodiversity, functioning as an important habitat, refuge and corridor for numerous species. To safeguard this conservation function, insight into the basic organizing principles of hedgerow plant communities is needed. The vegetation composition of 511 individual hedgerows situated within an ancient hedgerow network landscape in Flanders, Belgium was recorded, in combination with a wide range of explanatory variables, including a selection of spatial variables. Non-parametric statistics in combination with multivariate data analysis techniques were used to study the effect of individual explanatory variables. Next, variables were grouped in five distinct subsets and the relative importance of these variable groups was assessed by two related variation partitioning techniques, partial regression and partial canonical correspondence analysis, taking into account explicitly the existence of intercorrelations between variables of different factor groups. Most explanatory variables significantly affected hedgerow species richness and composition. Multivariate analysis showed that, besides adjacent land use, hedgerow management, soil conditions, hedgerow type and origin, the role of other factors such as hedge dimensions, intactness, etc., could certainly not be neglected. Furthermore, both methods revealed the same overall ranking of the five distinct factor groups. Besides a predominant impact of abiotic environmental conditions, it was found that management variables and structural aspects have a relatively larger influence on the distribution of plant species in hedgerows than their historical background or spatial configuration.

  7. Sparse Multivariate Autoregressive Modeling for Mild Cognitive Impairment Classification

    PubMed Central

    Li, Yang; Wee, Chong-Yaw; Jie, Biao; Peng, Ziwen

    2014-01-01

    Brain connectivity networks derived from functional magnetic resonance imaging (fMRI) are becoming increasingly prevalent in research related to cognitive and perceptual processes. The capability to detect causal or effective connectivity is highly desirable for understanding the cooperative nature of brain networks, particularly when the ultimate goal is to obtain good performance of control-patient classification with biologically meaningful interpretations. Understanding directed functional interactions between brain regions via brain connectivity networks is a challenging task. Since many genetic and biomedical networks are intrinsically sparse, incorporating a sparsity property into connectivity modeling can make the derived models more biologically plausible. Accordingly, we propose an effective connectivity modeling of resting-state fMRI data based on the multivariate autoregressive (MAR) modeling technique, which is widely used to characterize temporal information of dynamic systems. This MAR modeling technique allows for the identification of effective connectivity using the Granger causality concept and reduces spurious causal connectivity in the assessment of directed functional interaction from fMRI data. A forward orthogonal least squares (OLS) regression algorithm is further used to construct a sparse MAR model. By applying the proposed modeling to mild cognitive impairment (MCI) classification, we identify several of the most discriminative regions, including the middle cingulate gyrus, posterior cingulate gyrus, lingual gyrus and caudate regions, in line with results reported in previous findings. A relatively high classification accuracy of 91.89 % is also achieved, with an increment of 5.4 % compared to the fully-connected, non-directional Pearson-correlation-based functional connectivity approach. PMID:24595922
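
    The Granger causality idea behind a MAR model can be illustrated with a toy first-order, two-signal example. This is only a sketch under stated assumptions: it uses plain least squares rather than the forward orthogonal least squares algorithm named in the abstract, the signals are simulated rather than fMRI data, and the names `fit_two_regressors` and `granger_ratio` are hypothetical.

```python
import random


def fit_two_regressors(u, v, target):
    """Least-squares fit target ~ a*u + b*v (no intercept), normal equations."""
    uu = sum(x * x for x in u)
    vv = sum(x * x for x in v)
    uv = sum(x * y for x, y in zip(u, v))
    ut = sum(x * t for x, t in zip(u, target))
    vt = sum(x * t for x, t in zip(v, target))
    det = uu * vv - uv * uv
    return (ut * vv - vt * uv) / det, (vt * uu - ut * uv) / det


def granger_ratio(source, target):
    """Residual sum of squares of the own-lag model divided by that of the
    model that also uses the lagged source. Values well above 1 suggest that
    the past of `source` helps to predict `target`."""
    own_lag, src_lag, now = target[:-1], source[:-1], target[1:]
    # Restricted model: target[t] ~ a * target[t-1]
    a_r = sum(p * n for p, n in zip(own_lag, now)) / sum(p * p for p in own_lag)
    rss_restricted = sum((n - a_r * p) ** 2 for p, n in zip(own_lag, now))
    # Full model: target[t] ~ a * target[t-1] + b * source[t-1]
    a_f, b_f = fit_two_regressors(own_lag, src_lag, now)
    rss_full = sum((n - a_f * p - b_f * s) ** 2
                   for p, s, n in zip(own_lag, src_lag, now))
    return rss_restricted / rss_full


# Simulated signals: y is driven by the previous value of x, not vice versa.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0.0, 0.1) for t in range(1, 500)]

print("x -> y ratio:", granger_ratio(x, y))
print("y -> x ratio:", granger_ratio(y, x))
```

    In this simulation the ratio for the direction x to y should be far larger than for y to x, which is the asymmetry that a directed connectivity estimate relies on. A sparse MAR model would additionally keep only the lagged terms that contribute most.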

  8. Utility of an Abbreviated Dizziness Questionnaire to Differentiate between Causes of Vertigo and Guide Appropriate Referral: A Multicenter Prospective Blinded Study

    PubMed Central

    Roland, Lauren T.; Kallogjeri, Dorina; Sinks, Belinda C.; Rauch, Steven D.; Shepard, Neil T.; White, Judith A.; Goebel, Joel A.

    2015-01-01

    Objective: Test performance of a focused dizziness questionnaire’s ability to discriminate between peripheral and non-peripheral causes of vertigo. Study Design: Prospective multi-center. Setting: Four academic centers with experienced balance specialists. Patients: New dizzy patients. Interventions: A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Main outcomes: Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and non-peripheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. Results: 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and non-peripheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central and other causes was considered good as measured by c-indices of 0.75, 0.7 and 0.78, respectively. Conclusions: This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from non-peripheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed. PMID:26485598
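
    For a binary outcome, the c-index used above is the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it by direct pair counting. The function name `c_index` and the scores and labels are made-up illustrations, not study data.

```python
def c_index(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical predicted probabilities and true diagnoses (1 = condition present).
    predicted = [0.9, 0.8, 0.4, 0.3]
    truth = [1, 0, 1, 0]
    print(c_index(predicted, truth))
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, which is acceptable for a cohort of a few hundred patients.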

  9. Utility of an Abbreviated Dizziness Questionnaire to Differentiate Between Causes of Vertigo and Guide Appropriate Referral: A Multicenter Prospective Blinded Study.

    PubMed

    Roland, Lauren T; Kallogjeri, Dorina; Sinks, Belinda C; Rauch, Steven D; Shepard, Neil T; White, Judith A; Goebel, Joel A

    2015-12-01

    Test performance of a focused dizziness questionnaire's ability to discriminate between peripheral and nonperipheral causes of vertigo. Prospective multicenter. Four academic centers with experienced balance specialists. New dizzy patients. A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and nonperipheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. In total, 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and nonperipheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central, and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central, and other causes was considered good as measured by c-indices of 0.75, 0.7, and 0.78, respectively. This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from nonperipheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed.

  10. Scope of partial least-squares regression applied to the enantiomeric composition determination of ketoprofen from strongly overlapped chromatographic profiles.

    PubMed

    Padró, Juan M; Osorio-Grisales, Jaiver; Arancibia, Juan A; Olivieri, Alejandro C; Castells, Cecilia B

    2015-07-01

    Valuable quantitative information could be obtained from strongly overlapped chromatographic profiles of two enantiomers by using proper chemometric methods. Complete separation profiles where the peaks are fully resolved are difficult to achieve in chiral separation methods, and this becomes a particularly severe problem in case that the analyst needs to measure the chiral purity, i.e., when one of the enantiomers is present in the sample in very low concentrations. In this report, we explore the scope of a multivariate chemometric technique based on unfolded partial least-squares regression, as a mathematical tool to solve this quite frequent difficulty. This technique was applied to obtain quantitative results from partially overlapped chromatographic profiles of R- and S-ketoprofen, with different values of enantioresolution factors (from 0.81 down to less than 0.2 resolution units), and also at several different S:R enantiomeric ratios. Enantiomeric purity below 1% was determined with excellent precision even from almost completely overlapped signals. All these assays were tested on the most demanding condition, i.e., when the minor peak elutes immediately after the main peak. The results were validated using univariate calibration of completely resolved profiles and the method applied to the determination of enantiomeric purity of commercial pharmaceuticals. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Near infrared spectrometric technique for testing fruit quality: optimisation of regression models using genetic algorithms

    NASA Astrophysics Data System (ADS)

    Isingizwe Nturambirwe, J. Frédéric; Perold, Willem J.; Opara, Umezuruike L.

    2016-02-01

    Near infrared (NIR) spectroscopy has gained extensive use in quality evaluation. It is arguably one of the most advanced spectroscopic tools in non-destructive quality testing of foodstuffs, from measurement to data analysis and interpretation. NIR spectral data are interpreted through means often involving multivariate statistical analysis, sometimes associated with optimisation techniques for model improvement. The objective of this research was to explore the extent to which genetic algorithms (GA) can be used to enhance model development for predicting fruit quality. Apple fruits were used, and NIR spectra in the range from 12000 to 4000 cm⁻¹ were acquired on both bruised and healthy tissues, with different degrees of mechanical damage. GAs were used in combination with partial least squares (PLS) regression methods to develop bruise severity prediction models, and compared to PLS models developed using the full NIR spectrum. A classification model was developed, which clearly separated bruised from unbruised apple tissue. GAs helped improve prediction models by over 10%, in comparison with full spectrum-based models, as evaluated in terms of error of prediction (Root Mean Square Error of Cross-validation). PLS models to predict internal quality, such as sugar content and acidity, were developed and compared to the versions optimized by genetic algorithm. Overall, the results highlighted the potential of the GA method to improve the speed and accuracy of fruit quality prediction.
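
    The Root Mean Square Error of Cross-validation mentioned above can be sketched as follows. As an assumption made to keep the example self-contained, a one-predictor least-squares line stands in for the PLS model, leave-one-out is used as the cross-validation scheme, and the data are invented.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmsecv_leave_one_out(xs, ys):
    """Refit without each sample in turn, predict it, and pool the errors."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        prediction = slope * xs[i] + intercept
        squared_errors.append((ys[i] - prediction) ** 2)
    return sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    absorbance = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical predictor values
    severity = [3.1, 4.9, 7.2, 8.8, 11.0]       # hypothetical reference values
    print(rmsecv_leave_one_out(absorbance, severity))
```

    Comparing two models then reduces to comparing their RMSECV values on the same samples. A relative improvement can be written as (RMSECV_full - RMSECV_selected) / RMSECV_full.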

  12. The Effect of the Multivariate Box-Cox Transformation on the Power of MANOVA.

    ERIC Educational Resources Information Center

    Kirisci, Levent; Hsu, Tse-Chi

    Most of the multivariate statistical techniques rely on the assumption of multivariate normality. The effects of non-normality on multivariate tests are assumed to be negligible when variance-covariance matrices and sample sizes are equal. Therefore, in practice, investigators do not usually attempt to remove non-normality. In this simulation…

  13. Experimental variability and data pre-processing as factors affecting the discrimination power of some chemometric approaches (PCA, CA and a new algorithm based on linear regression) applied to (+/-)ESI/MS and RPLC/UV data: Application on green tea extracts.

    PubMed

    Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A

    2016-08-01

    The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis -PCA- and Cluster Analysis -CA-) as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied on pairs of very large experimental data series successfully retain information resulting from high frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data basis for multivariate analysis methods, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression analysis produced information equivalent to results obtained through application of PCA and CA. Copyright © 2016 Elsevier B.V. All rights reserved.
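
    The regression-based comparison described above reduces each pair of data series to three numbers: slope, intercept, and correlation coefficient. The following is a minimal sketch of that reduction, not the authors' algorithm. The function name `compare_profiles` and the short intensity series are hypothetical.

```python
from math import sqrt


def compare_profiles(series_a, series_b):
    """Regress series_b on series_a; return (slope, intercept, correlation)."""
    n = len(series_a)
    mean_a = sum(series_a) / n
    mean_b = sum(series_b) / n
    saa = sum((a - mean_a) ** 2 for a in series_a)
    sbb = sum((b - mean_b) ** 2 for b in series_b)
    sab = sum((a - mean_a) * (b - mean_b)
              for a, b in zip(series_a, series_b))
    slope = sab / saa
    intercept = mean_b - slope * mean_a
    correlation = sab / sqrt(saa * sbb)
    return slope, intercept, correlation


if __name__ == "__main__":
    reference = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0, 0.0]    # made-up peak profile
    rescaled = [2.0 * v + 1.0 for v in reference]      # same shape, new scale
    print(compare_profiles(reference, reference))
    print(compare_profiles(reference, rescaled))
```

    Two identical profiles give a slope near 1, an intercept near 0, and a correlation near 1, while a change of scale or baseline shifts the slope or intercept but leaves the correlation high. Repeating the comparison over replicate measurements would yield a cloud of such triples for each sample type.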

  14. Evaluation of Meteorite Amino Acid Analysis Data Using Multivariate Techniques

    NASA Technical Reports Server (NTRS)

    McDonald, G.; Storrie-Lombardi, M.; Nealson, K.

    1999-01-01

    The amino acid distributions in the Murchison carbonaceous chondrite, Mars meteorite ALH84001, and ice from the Allan Hills region of Antarctica are shown, using a multivariate technique known as Principal Component Analysis (PCA), to be statistically distinct from the average amino acid composition of 101 terrestrial protein superfamilies.

  15. A Comparison of Conventional Linear Regression Methods and Neural Networks for Forecasting Educational Spending.

    ERIC Educational Resources Information Center

    Baker, Bruce D.; Richards, Craig E.

    1999-01-01

    Applies neural network methods for forecasting 1991-95 per-pupil expenditures in U.S. public elementary and secondary schools. Forecasting models included the National Center for Education Statistics' multivariate regression model and three neural architectures. Regarding prediction accuracy, neural network results were comparable or superior to…

  16. "Let Me Count the Ways:" Fostering Reasons for Living among Low-Income, Suicidal, African American Women

    ERIC Educational Resources Information Center

    West, Lindsey M.; Davis, Telsie A.; Thompson, Martie P.; Kaslow, Nadine J.

    2011-01-01

    Protective factors for fostering reasons for living were examined among low-income, suicidal, African American women. Bivariate logistic regressions revealed that higher levels of optimism, spiritual well-being, and family social support predicted reasons for living. Multivariate logistic regressions indicated that spiritual well-being showed…

  17. Predicting exposure-response associations of ambient particulate matter with mortality in 73 Chinese cities.

    PubMed

    Madaniyazi, Lina; Guo, Yuming; Chen, Renjie; Kan, Haidong; Tong, Shilu

    2016-01-01

    Estimating the burden of mortality associated with particulates requires knowledge of exposure-response associations. However, the evidence on exposure-response associations is limited in many cities, especially in developing countries. In this study, we predicted associations of particulates smaller than 10 μm in aerodynamic diameter (PM10) with mortality in 73 Chinese cities. A meta-regression model was used to test and quantify which city-specific characteristics contributed significantly to the heterogeneity of PM10-mortality associations for 16 Chinese cities. Then, those city-specific characteristics with statistically significant regression coefficients were treated as independent variables to build multivariate meta-regression models. The model with the best fit was used to predict PM10-mortality associations in 73 Chinese cities in 2010. Mean temperature, PM10 concentration and green space per capita could best explain the heterogeneity in PM10-mortality associations. Based on city-specific characteristics, we were able to develop multivariate meta-regression models to predict associations between air pollutants and health outcomes reasonably well. Copyright © 2015 Elsevier Ltd. All rights reserved.

  18. Barriers to health-care and psychological distress among mothers living with HIV in Quebec (Canada).

    PubMed

    Blais, Martin; Fernet, Mylène; Proulx-Boucher, Karène; Lebouché, Bertrand; Rodrigue, Carl; Lapointe, Normand; Otis, Joanne; Samson, Johanne

    2015-01-01

    Health-care providers play a major role in providing good quality care and in preventing psychological distress among mothers living with HIV (MLHIV). The objectives of this study are to explore the impact of health-care services and satisfaction with care providers on psychological distress in MLHIV. One hundred MLHIV were recruited from community and clinical settings in the province of Quebec (Canada). Prevalence estimation of clinical psychological distress and univariate and multivariable logistic regression models were performed to predict clinical psychological distress. Forty-five percent of the participants reported clinical psychological distress. In the multivariable regression, the following variables were significantly associated with psychological distress while controlling for sociodemographic variables: resilience, quality of communication with the care providers, resources, and HIV disclosure concerns. The multivariate results support the key role of personal, structural, and medical resources in understanding psychological distress among MLHIV. Interventions that can support the psychological health of MLHIV are discussed.

  19. Healthcare Expenditures Associated with Depression Among Individuals with Osteoarthritis: Post-Regression Linear Decomposition Approach.

    PubMed

    Agarwal, Parul; Sambamoorthi, Usha

    2015-12-01

    Depression is common among individuals with osteoarthritis and leads to increased healthcare burden. The objective of this study was to examine excess total healthcare expenditures associated with depression among individuals with osteoarthritis in the US. Adults with self-reported osteoarthritis (n = 1881) were identified using data from the 2010 Medical Expenditure Panel Survey (MEPS). Among those with osteoarthritis, chi-square tests and ordinary least square regressions (OLS) were used to examine differences in healthcare expenditures between those with and without depression. Post-regression linear decomposition technique was used to estimate the relative contribution of different constructs of the Anderson's behavioral model, i.e., predisposing, enabling, need, personal healthcare practices, and external environment factors, to the excess expenditures associated with depression among individuals with osteoarthritis. All analysis accounted for the complex survey design of MEPS. Depression coexisted among 20.6 % of adults with osteoarthritis. The average total healthcare expenditures were $13,684 among adults with depression compared to $9284 among those without depression. Multivariable OLS regression revealed that adults with depression had 38.8 % higher healthcare expenditures (p < 0.001) compared to those without depression. Post-regression linear decomposition analysis indicated that 50 % of differences in expenditures among adults with and without depression can be explained by differences in need factors. Among individuals with coexisting osteoarthritis and depression, excess healthcare expenditures associated with depression were mainly due to comorbid anxiety, chronic conditions and poor health status. These expenditures may potentially be reduced by providing timely intervention for need factors or by providing care under a collaborative care model.

  20. Using Time Series Analysis to Predict Cardiac Arrest in a PICU.

    PubMed

    Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P

    2015-11-01

    To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
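    The figures in this abstract can be cross-checked with a few lines of arithmetic: five feature sets crossed with four algorithms give the 20 models, and the ratio of the two error rates gives the reported factor of 3.7. The feature-set labels below are placeholders, since the abstract names only two of the five combinations.

```python
from itertools import product

# Placeholder labels: the abstract does not list all five combinations.
feature_sets = [f"feature_set_{i}" for i in range(1, 6)]
algorithms = ["linear regression", "decision tree",
              "neural network", "support vector machine"]

# One model per (feature set, algorithm) pair.
model_grid = list(product(feature_sets, algorithms))

reference_accuracy = 0.78  # multivariate data, regression algorithm
best_accuracy = 0.94       # multivariate + trend data, support vector machine

# Misclassification rate is one minus accuracy.
error_ratio = (1 - reference_accuracy) / (1 - best_accuracy)
print(len(model_grid), round(error_ratio, 1))
```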

  1. Physical Function in Older Men With Hyperkyphosis

    PubMed Central

    Harrison, Stephanie L.; Fink, Howard A.; Marshall, Lynn M.; Orwoll, Eric; Barrett-Connor, Elizabeth; Cawthon, Peggy M.; Kado, Deborah M.

    2015-01-01

    Background. Age-related hyperkyphosis has been associated with poor physical function and is a well-established predictor of adverse health outcomes in older women, but its impact on health in older men is less well understood. Methods. We conducted a cross-sectional study to evaluate the association of hyperkyphosis and physical function in 2,363 men, aged 71–98 (M = 79) from the Osteoporotic Fractures in Men Study. Kyphosis was measured using the Rancho Bernardo Study block method. Measurements of grip strength and lower extremity function, including gait speed over 6 m, narrow walk (measure of dynamic balance), repeated chair stands ability and time, and lower extremity power (Nottingham Power Rig) were included separately as primary outcomes. We investigated associations of kyphosis and each outcome in age-adjusted and multivariable linear or logistic regression models, controlling for age, clinic, education, race, bone mineral density, height, weight, diabetes, and physical activity. Results. In multivariate linear regression, we observed a dose-related response of worse scores on each lower extremity physical function test as number of blocks increased, p for trend ≤.001. Using a cutoff of ≥4 blocks, 20% (N = 469) of men were characterized with hyperkyphosis. In multivariate logistic regression, men with hyperkyphosis had increased odds (range 1.5–1.8) of being in the worst quartile of performing lower extremity physical function tasks (p < .001 for each outcome). Kyphosis was not associated with grip strength in any multivariate analysis. Conclusions. Hyperkyphosis is associated with impaired lower extremity physical function in older men. Further studies are needed to determine the direction of causality. PMID:25431353

  2. Multivariate functional response regression, with application to fluorescence spectroscopy in a cervical pre-cancer study.

    PubMed

    Zhu, Hongxiao; Morris, Jeffrey S; Wei, Fengrong; Cox, Dennis D

    2017-07-01

    Many scientific studies measure different types of high-dimensional signals or images from the same subject, producing multivariate functional data. These functional measurements carry different types of information about the scientific process, and a joint analysis that integrates information across them may provide new insights into the underlying mechanism for the phenomenon under study. Motivated by fluorescence spectroscopy data in a cervical pre-cancer study, a multivariate functional response regression model is proposed, which treats multivariate functional observations as responses and a common set of covariates as predictors. This novel modeling framework simultaneously accounts for correlations between functional variables and potential multi-level structures in data that are induced by experimental design. The model is fitted by performing a two-stage linear transformation: a basis expansion to each functional variable followed by principal component analysis for the concatenated basis coefficients. This transformation effectively reduces the intra- and inter-function correlations and facilitates fast and convenient calculation. A fully Bayesian approach is adopted to sample the model parameters in the transformed space, and posterior inference is performed after inverse-transforming the regression coefficients back to the original data domain. The proposed approach produces functional tests that flag local regions on the functional effects, while controlling the overall experiment-wise error rate or false discovery rate. It also enables functional discriminant analysis through posterior predictive calculation. Analysis of the fluorescence spectroscopy data reveals local regions with differential expressions across the pre-cancer and normal samples. These regions may serve as biomarkers for prognosis and disease assessment.

  3. Primary closure after carotid endarterectomy is not inferior to other closure techniques.

    PubMed

    Avgerinos, Efthymios D; Chaer, Rabih A; Naddaf, Abdallah; El-Shazly, Omar M; Marone, Luke; Makaroun, Michel S

    2016-09-01

    Primary closure after carotid endarterectomy (CEA) has been much maligned as an inferior technique with worse outcomes than in patch closure. Our purpose was to compare perioperative and long-term results of different CEA closure techniques in a large institutional experience. A consecutive cohort of CEAs between January 1, 2000, and December 31, 2010, was retrospectively analyzed. Closure technique was used to divide patients into three groups: primary longitudinal arteriotomy closure (PRC), patch closure (PAC), and eversion closure (EVC). End points were perioperative events, long-term strokes, and restenosis ≥70%. Multivariate regression models were used to assess the effect of baseline predictors. There were 1737 CEA cases (bilateral, 143; mean age, 71.4 ± 9.3 years; 56.2% men; 35.3% symptomatic) performed during the study period with a mean clinical follow-up of 49.8 ± 36.4 months (range, 0-155 months). More men had primary closure, but other demographic and baseline symptoms were similar between groups. Half the patients had PAC, with the rest evenly distributed between PRC and EVC. The rate of nerve injury was 2.7%, the rate of reintervention for hematoma was 1.5%, and the length of hospital stay was 2.4 ± 3.0 days, with no significant differences among groups. The combined stroke and death rate was 2.5% overall and 3.9% and 1.7% in the symptomatic and asymptomatic cohort, respectively. Stroke and death rates were similar between groups: PRC, 11 (2.7%); PAC, 19 (2.2%); EVC, 13 (2.9%). Multivariate analysis showed baseline symptomatic disease (odds ratio, 2.4; P = .007) and heart failure (odds ratio, 3.1; P = .003) as predictors of perioperative stroke and death, but not the type of closure. Cox regression analysis demonstrated, among other risk factors, no statin use (hazard ratio, 2.1; P = .008) as a predictor of ipsilateral stroke and severe (glomerular filtration rate <30 mL/min/1.73 m²) renal insufficiency (hazard ratio, 2.6; P = .032) as the only predictor of restenosis ≥70%. Type of closure did not have any predictive value. In our study, baseline risk factors and statin use, but not the type of closure, affect perioperative and long-term outcomes after CEA. Copyright © 2016 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.

  4. Performance evaluation of a hybrid-passive landfill leachate treatment system using multivariate statistical techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wallace, Jack; Champagne, Pascale; Monnier, Anne-Charlotte

    Highlights: • Performance of a hybrid passive landfill leachate treatment system was evaluated. • 33 Water chemistry parameters were sampled for 21 months and statistically analyzed. • Parameters were strongly linked and explained most (>40%) of the variation in data. • Alkalinity, ammonia, COD, heavy metals, and iron were criteria for performance. • Eight other parameters were key in modeling system dynamics and criteria. - Abstract: A pilot-scale hybrid-passive treatment system operated at the Merrick Landfill in North Bay, Ontario, Canada, treats municipal landfill leachate and provides for subsequent natural attenuation. Collected leachate is directed to a hybrid-passive treatment system, followed by controlled release to a natural attenuation zone before entering the nearby Little Sturgeon River. The study presents a comprehensive evaluation of the performance of the system using multivariate statistical techniques to determine the interactions between parameters, major pollutants in the leachate, and the biological and chemical processes occurring in the system. Five parameters (ammonia, alkalinity, chemical oxygen demand (COD), “heavy” metals of interest, with atomic weights above calcium, and iron) were set as criteria for the evaluation of system performance based on their toxicity to aquatic ecosystems and importance in treatment with respect to discharge regulations. System data for a full range of water quality parameters over a 21-month period were analyzed using principal components analysis (PCA), as well as principal components (PC) and partial least squares (PLS) regressions. PCA indicated a high degree of association for most parameters with the first PC, which explained a high percentage (>40%) of the variation in the data, suggesting strong statistical relationships among most of the parameters in the system. Regression analyses identified 8 parameters (set as independent variables) that were most frequently retained for modeling the five criteria parameters (set as dependent variables), on a statistically significant level: conductivity, dissolved oxygen (DO), nitrite (NO₂⁻), organic nitrogen (N), oxidation reduction potential (ORP), pH, sulfate and total volatile solids (TVS). The criteria parameters and the significant explanatory parameters were most important in modeling the dynamics of the passive treatment system during the study period. Such techniques and procedures were found to be highly valuable and could be applied to other sites to determine parameters of interest in similar naturalized engineered systems.

  5. Logistic regression analysis of factors associated with avascular necrosis of the femoral head following femoral neck fractures in middle-aged and elderly patients.

    PubMed

    Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua

    2013-03-01

    Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2 %). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95 % CI = 3.076-219.747). The middle-aged and elderly have less incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.
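    Two of the reported figures can be checked for internal consistency. Ratio estimates from logistic models usually carry confidence intervals that are symmetric on the log scale, so the point estimate should equal the geometric mean of the interval bounds. The sketch below assumes a Wald-type 95 % interval with z = 1.96, which the abstract does not state; note also that a logistic model yields an odds ratio, although the abstract calls it a risk ratio.

```python
import math

lower, upper = 3.076, 219.747  # 95 % CI bounds reported in the abstract
z = 1.96                       # assumed normal quantile for a 95 % interval

# Geometric mean of the bounds; should be close to the reported 26.00.
point_estimate = math.sqrt(lower * upper)

# Implied standard error of the log ratio under the Wald assumption.
se_log_ratio = (math.log(upper) - math.log(lower)) / (2 * z)

# 15 cases of avascular necrosis among 99 patients.
incidence_percent = 15 / 99 * 100

print(f"{point_estimate:.2f} {se_log_ratio:.2f} {incidence_percent:.1f}")
```

    The wide interval reflects the small number of events (15), so the size of the estimate should be read with caution.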

  6. Multivariate adaptive regression splines analysis to predict biomarkers of spontaneous preterm birth.

    PubMed

    Menon, Ramkumar; Bhat, Geeta; Saade, George R; Spratt, Heidi

    2014-04-01

    To develop classification models of demographic/clinical factors and biomarker data from spontaneous preterm birth in African Americans and Caucasians. Secondary analysis of biomarker data using multivariate adaptive regression splines (MARS), a supervised machine learning algorithm method. Analysis of data on 36 biomarkers from 191 women was reduced by MARS to develop predictive models for preterm birth in African Americans and Caucasians. Maternal plasma, cord plasma collected at admission for preterm or term labor and amniotic fluid at delivery. Data were partitioned into training and testing sets. Variable importance, a relative indicator (0-100%) and area under the receiver operating characteristic curve (AUC) characterized results. Multivariate adaptive regression splines generated models for combined and racially stratified biomarker data. Clinical and demographic data did not contribute to the model. Racial stratification of data produced distinct models in all three compartments. In African Americans maternal plasma samples IL-1RA, TNF-α, angiopoietin 2, TNFRI, IL-5, MIP1α, IL-1β and TGF-α modeled preterm birth (AUC train: 0.98, AUC test: 0.86). In Caucasians TNFR1, ICAM-1 and IL-1RA contributed to the model (AUC train: 0.84, AUC test: 0.68). African Americans cord plasma samples produced IL-12P70, IL-8 (AUC train: 0.82, AUC test: 0.66). Cord plasma in Caucasians modeled IGFII, PDGFBB, TGF-β1, IL-12P70, and TIMP1 (AUC train: 0.99, AUC test: 0.82). Amniotic fluid in African Americans modeled FasL, TNFRII, RANTES, KGF, IGFI (AUC train: 0.95, AUC test: 0.89) and in Caucasians, TNF-α, MCP3, TGF-β3, TNFR1 and angiopoietin 2 (AUC train: 0.94, AUC test: 0.79). Multivariate adaptive regression splines models multiple biomarkers associated with preterm birth and demonstrated racial disparity. © 2014 Nordic Federation of Societies of Obstetrics and Gynecology.

  7. Comparative evaluation of the powder and compression properties of various grades and brands of microcrystalline cellulose by multivariate methods.

    PubMed

    Haware, Rahul V; Bauer-Brandl, Annette; Tho, Ingunn

    2010-01-01

    The present work challenges a newly developed approach to tablet formulation development by using chemically identical materials (grades and brands of microcrystalline cellulose). Tablet properties with respect to process and formulation parameters (e.g. compression speed, added lubricant and Emcompress fractions) were evaluated by 2³ factorial designs. Tablets of constant true volume were prepared on a compaction simulator at constant pressure (approx. 100 MPa). The highly repeatable and accurate force-displacement data obtained was evaluated by simple 'in-die' Heckel method and work descriptors. Relationships and interactions between formulation, process and tablet parameters were identified and quantified by multivariate analysis techniques; principal component analysis (PCA) and partial least square regressions (PLS). The method proved to be able to distinguish between different grades of MCC and even between two different brands of the same grade (Avicel PH 101 and Vivapur 101). One example of interaction was studied in more detail by mixed level design: The interaction effect of lubricant and Emcompress on elastic recovery of Avicel PH 102 was demonstrated to be complex and non-linear using the development tool under investigation.

  8. Investigating the Moisture Content of Polyamide 6 by Raman-Microscopy and Multivariate Data Analysis

    NASA Astrophysics Data System (ADS)

    Lechner, Tobias; Noack, Kristina; Thöne, Manuel; Amend, Philipp; Schmidt, Michael; Will, Stefan

    Thermal malleability of thermoplastics results in a high product diversity in various industry sectors. However, industrial applications require a constant and high component quality. Hence, material processing such as laser welding has to consider that, e.g., the moisture content of thermoplastics influences the mechanical properties such as the tensile strength. Moreover, water evaporates during laser welding and can form pores and defects. Thus, there is a large need for non-invasive material inspection before processing. To that end, we developed a methodology based on Raman-microscopy and multivariate data analysis (MVD) to determine the moisture content of polyamide (MCP). Further, the impact of the MCP on the mechanical properties was verified. For samples with a defined variation of the MCP, xyz-Raman-scans were carried out and analysed using MVD. For reference purposes, the samples were weighed and tensile tests were performed. An evaluation by means of partial least squares regression analysis (PLSR) resulted in a prediction of the MCP with a correlation coefficient >98%. Consequently, Raman-microscopy shows large potential for developing new techniques for inspection and quality control of plastics before processing. Dedicated to Professor Alfred Leipertz on the occasion of his 70th birthday.

  9. Utility and Clinical Profile of Dexmedetomidine in Pediatric Cardiac Catheterization Procedures: A Matched Controlled Analysis.

    PubMed

    Riveros, Ricardo; Makarova, Natalya; Riveros-Perez, Efrain; Chodavarapu, Praneeta; Saasouh, Wael; Yılmaz, Hüseyin Oğuz; Cuko, Evis; Babazade, Rovnat; Kimatian, Stephen; Turan, Alparslan

    2017-12-01

    Dexmedetomidine is increasingly used in children undergoing cardiac catheterization procedures. We compared the percentage of surgical time with hemodynamic instability and the incidence of postoperative agitation between pediatric cardiac catheterization patients who received dexmedetomidine infusion and those who did not. We matched 653 pediatric patients scheduled for cardiac catheterization. Two separate multivariable linear mixed models were used to assess the association between dexmedetomidine use and intraoperative blood pressure and heart rate instability. A multivariate logistic regression was used for the relationship between dexmedetomidine and postoperative agitation. No difference between the study groups was found in the duration of MAP (P = .867) or heart rate (HR) instabilities (P = .224). The relationship between dexmedetomidine use and the duration of negative hemodynamic effects does not depend on any of the considered CHD types (all P > .001) or intervention (P = .453 for MAP and P = .023 for HR). No difference in postoperative agitation was found between the study groups (P = .590). Our study demonstrated no benefit in using dexmedetomidine infusion compared with other general anesthesia techniques to maintain hemodynamic stability or decrease agitation in pediatric patients undergoing cardiac catheterization procedures.

  10. Multivariate evoked response detection based on the spectral F-test.

    PubMed

    Rocha, Paulo Fábio F; Felix, Leonardo B; Miranda de Sá, Antonio Mauricio F L; Mendes, Eduardo M A M

    2016-05-01

    Objective response detection techniques, such as magnitude square coherence, component synchrony measure, and the spectral F-test, have been used to automate the detection of evoked responses. The performance of these detectors depends on both the signal-to-noise ratio (SNR) and the length of the electroencephalogram (EEG) signal. Recently, multivariate detectors were developed to increase the detection rate even in the case of a low signal-to-noise ratio or of short data records originated from EEG signals. In this context, an extension to the multivariate case of the spectral F-test detector is proposed. The performance of this technique is assessed using Monte Carlo. As an example, EEG data from 12 subjects during photic stimulation is used to demonstrate the usefulness of the proposed detector. The multivariate method showed detection rates consistently higher than those ones when only one signal was used. It is shown that the response detection in EEG signals with the multivariate technique was statistically significant if two or more EEG derivations were used. Copyright © 2016 Elsevier B.V. All rights reserved.

  11. Validity and reliability of dental age estimation of teeth root translucency based on digital luminance determination.

    PubMed

    Ramsthaler, Frank; Kettner, Mattias; Verhoff, Marcel A

    2014-01-01

    In forensic anthropological casework, estimating age-at-death is key to profiling unknown skeletal remains. The aim of this study was to examine the reliability of a new, simple, fast, and inexpensive digital odontological method for age-at-death estimation. The method is based on the original Lamendin method, which is a widely used technique in the repertoire of odontological aging methods in forensic anthropology. We examined 129 single root teeth employing a digital camera and imaging software for the measurement of the luminance of the teeth's translucent root zone. Variability in luminance detection was evaluated using statistical technical error of measurement analysis. The method revealed stable values largely unrelated to observer experience, whereas requisite formulas proved to be camera-specific and should therefore be generated for an individual recording setting based on samples of known chronological age. Multiple regression analysis showed a highly significant influence of the coefficients of the variables "arithmetic mean" and "standard deviation" of luminance for the regression formula. For the use of this primer multivariate equation for age-at-death estimation in casework, a standard error of the estimate of 6.51 years was calculated. Step-by-step reduction of the number of embedded variables to linear regression analysis employing the best contributor "arithmetic mean" of luminance yielded a regression equation with a standard error of 6.72 years (p < 0.001). The results of this study not only support the premise of root translucency as an age-related phenomenon, but also demonstrate that translucency reflects a number of other influencing factors in addition to age. This new digital measuring technique of the zone of dental root luminance can broaden the array of methods available for estimating chronological age, and furthermore facilitate measurement and age classification due to its low dependence on observer experience.

  12. Predicting introductory programming performance: A multi-institutional multivariate study

    NASA Astrophysics Data System (ADS)

    Bergin, Susan; Reilly, Ronan

    2006-12-01

    A model for predicting student performance on introductory programming modules is presented. The model uses attributes identified in a study carried out at four third-level institutions in the Republic of Ireland. Four instruments were used to collect the data and over 25 attributes were examined. A data reduction technique was applied and a logistic regression model using 10-fold stratified cross validation was developed. The model used three attributes: Leaving Certificate Mathematics result (final mathematics examination at second level), number of hours playing computer games while taking the module and programming self-esteem. Prediction success was significant with 80% of students correctly classified. The model also works well on a per-institution level. A discussion on the implications of the model is provided and future work is outlined.
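    Stratified k-fold cross validation, as used for the logistic regression model above, keeps the pass/fail proportions roughly equal in every fold. The sketch below shows only the fold assignment, using the standard library; the function name, the seed and the 60/40 class split are illustrative assumptions, not details from the study.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign sample indices to folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out to the folds in turn, like cards.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Hypothetical outcome labels: 1 = passed the module, 0 = failed.
labels = [1] * 60 + [0] * 40
folds = stratified_folds(labels)

for test_fold in folds:
    held_out = set(test_fold)
    train = [i for i in range(len(labels)) if i not in held_out]
    # A model would be fitted on `train` and scored on `test_fold` here.
```

    With these labels every fold holds six passes and four fails, so each held-out fold mirrors the overall class balance.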

  13. F100 Multivariable Control Synthesis Program. Computer Implementation of the F100 Multivariable Control Algorithm

    NASA Technical Reports Server (NTRS)

    Soeder, J. F.

    1983-01-01

    As turbofan engines become more complex, the development of controls necessitate the use of multivariable control techniques. A control developed for the F100-PW-100(3) turbofan engine by using linear quadratic regulator theory and other modern multivariable control synthesis techniques is described. The assembly language implementation of this control on an SEL 810B minicomputer is described. This implementation was then evaluated by using a real-time hybrid simulation of the engine. The control software was modified to run with a real engine. These modifications, in the form of sensor and actuator failure checks and control executive sequencing, are discussed. Finally recommendations for control software implementations are presented.

  14. Random sample consensus combined with partial least squares regression (RANSAC-PLS) for microbial metabolomics data mining and phenotype improvement.

    PubMed

    Teoh, Shao Thing; Kitamura, Miki; Nakayama, Yasumune; Putri, Sastia; Mukai, Yukio; Fukusaki, Eiichiro

    2016-08-01

    In recent years, the advent of high-throughput omics technology has made possible a new class of strain engineering approaches, based on identification of possible gene targets for phenotype improvement from omic-level comparison of different strains or growth conditions. Metabolomics, with its focus on the omic level closest to the phenotype, lends itself naturally to this semi-rational methodology. When a quantitative phenotype such as growth rate under stress is considered, regression modeling using multivariate techniques such as partial least squares (PLS) is often used to identify metabolites correlated with the target phenotype. However, linear modeling techniques such as PLS require a consistent metabolite-phenotype trend across the samples, which may not be the case when outliers or multiple conflicting trends are present in the data. To address this, we proposed a data-mining strategy that utilizes random sample consensus (RANSAC) to select subsets of samples with consistent trends for construction of better regression models. By applying a combination of RANSAC and PLS (RANSAC-PLS) to a dataset from a previous study (gas chromatography/mass spectrometry metabolomics data and 1-butanol tolerance of 19 yeast mutant strains), new metabolites were indicated to be correlated with tolerance within certain subsets of the samples. The relevance of these metabolites to 1-butanol tolerance were then validated from single-deletion strains of corresponding metabolic genes. The results showed that RANSAC-PLS is a promising strategy to identify unique metabolites that provide additional hints for phenotype improvement, which could not be detected by traditional PLS modeling using the entire dataset. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
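    The consensus-selection idea can be illustrated with a small sketch. To stay within the standard library, ordinary least squares on a single predictor stands in for PLS, so this is not the authors' RANSAC-PLS implementation; the trial count, sample size, tolerance and data are all hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)  # assumes the x values differ
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def ransac_consensus(xs, ys, n_trials=200, sample_size=3,
                     tolerance=0.5, seed=0):
    """Return indices of the largest subset that follows one linear trend."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_trials):
        sample = rng.sample(range(len(xs)), sample_size)
        slope, intercept = fit_line([xs[i] for i in sample],
                                    [ys[i] for i in sample])
        inliers = [i for i in range(len(xs))
                   if abs(ys[i] - (slope * xs[i] + intercept)) <= tolerance]
        if len(inliers) > len(best):
            best = inliers
    return best


# Hypothetical data: eight samples on y = 2x + 1, two conflicting samples.
xs = [float(i) for i in range(10)]
ys = [2 * x + 1 for x in xs[:8]] + [0.0, -5.0]

consensus = ransac_consensus(xs, ys)
# Refit on the consensus subset only.
slope, intercept = fit_line([xs[i] for i in consensus],
                            [ys[i] for i in consensus])
print(consensus, slope, intercept)
```

    Fitting all ten points at once would be distorted by the two conflicting samples, whereas the consensus subset recovers the trend of the remaining eight.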

  15. An Investigation of Multivariate Adaptive Regression Splines for Modeling and Analysis of Univariate and Semi-Multivariate Time Series Systems

    DTIC Science & Technology

    1991-09-01

    However, there is no guarantee that this would work; for instance if the data were generated by an ARCH model (Tong, 1990 pp. 116-117) then a simple...Hill, R., Griffiths, W., Lutkepohl, H., and Lee, T., Introduction to the Theory and Practice of Econometrics, 2nd ed., Wiley, 1985. Kendall, M., Stuart

  16. Application of multivariate statistical techniques in microbial ecology.

    PubMed

    Paliy, O; Shankar, V

    2016-03-01

    Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological data sets. In particular, noticeable effect has been attained in the field of microbial ecology, where new experimental approaches provided in-depth assessments of the composition, functions and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces a large amount of data, powerful statistical techniques of multivariate analysis are well suited to analyse and interpret these data sets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular data set. In this review, we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and data set structure. © 2016 John Wiley & Sons Ltd.

  17. A tool for classifying individuals with chronic back pain: using multivariate pattern analysis with functional magnetic resonance imaging data.

    PubMed

    Callan, Daniel; Mills, Lloyd; Nott, Connie; England, Robert; England, Shaun

    2014-01-01

    Chronic pain is one of the most prevalent health problems in the world today, yet neurological markers, critical to diagnosis of chronic pain, are still largely unknown. The ability to objectively identify individuals with chronic pain using functional magnetic resonance imaging (fMRI) data is important for the advancement of diagnosis, treatment, and theoretical knowledge of brain processes associated with chronic pain. The purpose of our research is to investigate specific neurological markers that could be used to diagnose individuals experiencing chronic pain by using multivariate pattern analysis with fMRI data. We hypothesize that individuals with chronic pain have different patterns of brain activity in response to induced pain. This pattern can be used to classify the presence or absence of chronic pain. The fMRI experiment consisted of alternating 14 seconds of painful electric stimulation (applied to the lower back) with 14 seconds of rest. We analyzed contrast fMRI images in stimulation versus rest in pain-related brain regions to distinguish between the groups of participants: 1) chronic pain and 2) normal controls. We employed supervised machine learning techniques, specifically sparse logistic regression, to train a classifier based on these contrast images using a leave-one-out cross-validation procedure. We correctly classified 92.3% of the chronic pain group (N = 13) and 92.3% of the normal control group (N = 13) by recognizing multivariate patterns of activity in the somatosensory and inferior parietal cortex. This technique demonstrates that differences in the pattern of brain activity to induced pain can be used as a neurological marker to distinguish between individuals with and without chronic pain. Medical, legal and business professionals have recognized the importance of this research topic and of developing objective measures of chronic pain. This method of data analysis was very successful in correctly classifying each of the two groups.
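
    The leave-one-out procedure mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: a nearest-centroid rule stands in for sparse logistic regression, and the function names, feature values, and labels are invented. As a side note, 92.3% of a group of 13 is consistent with 12 of 13 correct (12/13 ≈ 0.923), which is an arithmetic inference rather than a count stated in the abstract.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose training mean is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, report per-class accuracy."""
    hits = {label: 0 for label in set(y)}
    totals = {label: 0 for label in set(y)}
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        pred = nearest_centroid_predict(train_X, train_y, X[i])
        totals[y[i]] += 1
        hits[y[i]] += int(pred == y[i])
    return {label: hits[label] / totals[label] for label in totals}

# Made-up two-feature data; real inputs would be voxel-wise contrast values.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]]
y = ["control"] * 3 + ["pain"] * 3
accuracy = leave_one_out_accuracy(X, y)
```

    Reporting accuracy per class, as here, mirrors how the abstract gives separate figures for the chronic pain and control groups.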

  18. A Tool for Classifying Individuals with Chronic Back Pain: Using Multivariate Pattern Analysis with Functional Magnetic Resonance Imaging Data

    PubMed Central

    Callan, Daniel; Mills, Lloyd; Nott, Connie; England, Robert; England, Shaun

    2014-01-01

    Chronic pain is one of the most prevalent health problems in the world today, yet neurological markers, critical to diagnosis of chronic pain, are still largely unknown. The ability to objectively identify individuals with chronic pain using functional magnetic resonance imaging (fMRI) data is important for the advancement of diagnosis, treatment, and theoretical knowledge of brain processes associated with chronic pain. The purpose of our research is to investigate specific neurological markers that could be used to diagnose individuals experiencing chronic pain by using multivariate pattern analysis with fMRI data. We hypothesize that individuals with chronic pain have different patterns of brain activity in response to induced pain. This pattern can be used to classify the presence or absence of chronic pain. The fMRI experiment consisted of alternating 14 seconds of painful electric stimulation (applied to the lower back) with 14 seconds of rest. We analyzed contrast fMRI images in stimulation versus rest in pain-related brain regions to distinguish between the groups of participants: 1) chronic pain and 2) normal controls. We employed supervised machine learning techniques, specifically sparse logistic regression, to train a classifier based on these contrast images using a leave-one-out cross-validation procedure. We correctly classified 92.3% of the chronic pain group (N = 13) and 92.3% of the normal control group (N = 13) by recognizing multivariate patterns of activity in the somatosensory and inferior parietal cortex. This technique demonstrates that differences in the pattern of brain activity to induced pain can be used as a neurological marker to distinguish between individuals with and without chronic pain. Medical, legal and business professionals have recognized the importance of this research topic and of developing objective measures of chronic pain. This method of data analysis was very successful in correctly classifying each of the two groups. PMID:24905072

  19. Procedural and longer-term outcomes of wire- versus device-based antegrade dissection and re-entry techniques for the percutaneous revascularization of coronary chronic total occlusions.

    PubMed

    Azzalini, Lorenzo; Dautov, Rustem; Brilakis, Emmanouil S; Ojeda, Soledad; Benincasa, Susanna; Bellini, Barbara; Karatasakis, Aris; Chavarría, Jorge; Rangan, Bavana V; Pan, Manuel; Carlino, Mauro; Colombo, Antonio; Rinfret, Stéphane

    2017-03-15

    There are few data regarding the procedural and follow-up outcomes of different antegrade dissection/re-entry (ADR) techniques for chronic total occlusion (CTO) percutaneous coronary intervention (PCI). We compiled a multicenter registry of consecutive patients undergoing ADR-based CTO PCI at four high-volume specialized institutions. Patients were divided according to the specific ADR technique used: subintimal tracking and re-entry (STAR), limited antegrade subintimal tracking (LAST), or device-based with the CrossBoss/Stingray system (Boston Scientific, Marlborough, MA). Major adverse cardiac events (MACE: cardiac death, target-vessel myocardial infarction and target-vessel revascularization) on follow-up were the main outcome of this study. Independent predictors of MACE were sought with Cox regression analysis. A total of 223 patients were included (STAR n=39, LAST n=68, CrossBoss/Stingray n=116). Baseline characteristics were similar across groups. Technical and procedural success was lower with STAR (59% and 59%), as compared with LAST (96% and 96%) and CrossBoss/Stingray (89% and 87%; p<0.001 for both). At 24-month follow-up, MACE rates were higher in STAR (15.4%) and LAST (17.5%), as compared with device-based ADR with CrossBoss/Stingray (4.3%, p=0.02), driven by TVR (7.7% vs. 15.5% vs. 3.1%, respectively; p=0.02). Multivariable Cox regression analysis identified wire-based ADR (STAR and LAST) and total stent length as independent predictors of MACE. In this multicenter cohort of patients undergoing CTO PCI with ADR techniques, STAR had lower success rates, as compared with the CrossBoss/Stingray system and LAST. The CrossBoss/Stingray system was independently associated with lower risk of MACE on follow-up, as compared with wire-based ADR techniques. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  20. Simulating Multivariate Nonnormal Data Using an Iterative Algorithm

    ERIC Educational Resources Information Center

    Ruscio, John; Kaczetow, Walter

    2008-01-01

    Simulating multivariate nonnormal data with specified correlation matrices is difficult. One especially popular method is Vale and Maurelli's (1983) extension of Fleishman's (1978) polynomial transformation technique to multivariate applications. This requires the specification of distributional moments and the calculation of an intermediate…

  1. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    NASA Astrophysics Data System (ADS)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by a Data Acquisition and Evaluation System. The authors coupled the collected TBM drive data with available information on rock mass properties; the data were cleansed, completed with secondary variables, and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were developed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable, and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
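
    To make the CART side of this comparison concrete, the sketch below finds the single split of one explanatory variable that minimises the summed squared error of the two resulting groups, which is the basic step a regression tree repeats recursively. It is a minimal illustration, not the authors' model; the function names and the toy values (labelled as a hypothetical operational variable and penetration rate) are assumptions.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, left_mean, right_mean) for the SSE-minimising split on x."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, sum(left) / len(left), sum(right) / len(right))
    return best[1:]

# Made-up values: a hypothetical operational variable vs. penetration rate.
operational_variable = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
penetration_rate = [5.0, 5.0, 5.0, 9.0, 9.0, 9.0]
threshold, low_mean, high_mean = best_split(operational_variable, penetration_rate)
```

    A step-like relation of this kind is captured exactly by one split, whereas a single linear term would only approximate it, which is one reason the two methods can complement each other.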

  2. Effects of Social Class and School Conditions on Educational Enrollment and Achievement of Boys and Girls in Rural Viet Nam

    ERIC Educational Resources Information Center

    Nguyen, Phuong L.

    2006-01-01

    This study examines the effects of parental SES, school quality, and community factors on children's enrollment and achievement in rural areas in Viet Nam, using logistic regression and ordered logistic regression. Multivariate analysis reveals significant differences in educational enrollment and outcomes by level of household expenditures and…

  3. A hybrid artificial neural network as a software sensor for optimal control of a wastewater treatment process.

    PubMed

    Choi, D J; Park, H

    2001-11-01

    For control and automation of biological treatment processes, lack of reliable on-line sensors to measure water quality parameters is one of the most important problems to overcome. Many parameters cannot be measured directly with on-line sensors. The accuracy of existing hardware sensors is also not sufficient and maintenance problems such as electrode fouling often cause trouble. This paper deals with the development of software sensor techniques that estimate the target water quality parameter from other parameters using the correlation between water quality parameters. We focus our attention on the preprocessing of noisy data and the selection of the best model feasible to the situation. Problems of existing approaches are also discussed. We propose a hybrid neural network as a software sensor inferring wastewater quality parameter. Multivariate regression, artificial neural networks (ANN), and a hybrid technique that combines principal component analysis as a preprocessing stage are applied to data from industrial wastewater processes. The hybrid ANN technique shows an enhancement of prediction capability and reduces the overfitting problem of neural networks. The result shows that the hybrid ANN technique can be used to extract information from noisy data and to describe the nonlinearity of complex wastewater treatment processes.

  4. Procedures for using signals from one sensor as substitutes for signals of another

    NASA Technical Reports Server (NTRS)

    Suits, G.; Malila, W.; Weller, T.

    1988-01-01

    Long-term monitoring of surface conditions may require a transfer from using data from one satellite sensor to data from a different sensor having different spectral characteristics. Two general procedures for spectral signal substitution are described in this paper, a principal-components procedure and a complete multivariate regression procedure. They are evaluated through a simulation study of five satellite sensors (MSS, TM, AVHRR, CZCS, and HRV). For illustration, they are compared to another recently described procedure for relating AVHRR and MSS signals. The multivariate regression procedure is shown to be best. TM can accurately emulate the other sensors, but they, on the other hand, have difficulty in accurately emulating its shortwave infrared bands (TM5 and TM7).
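
    The multivariate regression procedure described here amounts to predicting one sensor's band signal from several bands of another sensor. The following sketch fits such a relation by least squares through the normal equations. It is illustrative only: the helper names, the two-band setup, and the numbers are assumptions, and the paper's actual procedure and band data are not reproduced.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)

# Made-up signals: two bands of a source sensor and one band of a target sensor,
# generated exactly as target = 0.5 + 0.3*band1 + 0.6*band2.
source_bands = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
target_band = [0.5 + 0.3 * b1 + 0.6 * b2 for b1, b2 in source_bands]
coefficients = fit_ols(source_bands, target_band)
```

    With noise-free toy data the fitted coefficients recover the generating values; with real sensor data the residual error would indicate how well one sensor can emulate the other.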

  5. Non-proportional odds multivariate logistic regression of ordinal family data.

    PubMed

    Zaloumis, Sophie G; Scurrah, Katrina J; Harrap, Stephen B; Ellis, Justine A; Gurrin, Lyle C

    2015-03-01

    Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non-proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non-proportional odds multivariate logistic regression model and take a simulation-based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
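
    For readers unfamiliar with the distinction, one common cumulative-logit parameterisation of the two models is sketched below. The notation (cut-points \alpha_j, covariates \mathbf{x}, J ordered categories) is an assumption for illustration; the paper's multivariate model additionally specifies a correlation structure for family members, which is not shown.

```latex
% Proportional odds: a single slope vector is shared by all cut-points j
\operatorname{logit} P(Y \le j \mid \mathbf{x}) = \alpha_j - \mathbf{x}^{\top}\boldsymbol{\beta},
\qquad j = 1, \dots, J-1

% Non-proportional odds: each cut-point j has its own slope vector
\operatorname{logit} P(Y \le j \mid \mathbf{x}) = \alpha_j - \mathbf{x}^{\top}\boldsymbol{\beta}_j,
\qquad j = 1, \dots, J-1
```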

  6. Genetic parameters for growth characteristics of free-range chickens under univariate random regression models.

    PubMed

    Rovadoscki, Gregori A; Petrini, Juliana; Ramirez-Diaz, Johanna; Pertile, Simone F N; Pertille, Fábio; Salvian, Mayara; Iung, Laiza H S; Rodriguez, Mary Ana P; Zampar, Aline; Gaya, Leila G; Carvalho, Rachel S B; Coelho, Antonio A D; Savino, Vicente J M; Coutinho, Luiz L; Mourão, Gerson B

    2016-09-01

    Repeated measures from the same individual have been analyzed by using repeatability and finite dimension models under univariate or multivariate analyses. However, in the last decade, the use of random regression models for genetic studies with longitudinal data has become more common. Thus, the aim of this research was to estimate genetic parameters for body weight of four experimental chicken lines by using univariate random regression models. Body weight data from hatching to 84 days of age (n = 34,730) from four experimental free-range chicken lines (7P, Caipirão da ESALQ, Caipirinha da ESALQ and Carijó Barbado) were used. The analysis model included the fixed effects of contemporary group (gender and rearing system), fixed regression coefficients for age at measurement, and random regression coefficients for permanent environmental effects and additive genetic effects. Heterogeneous variances for residual effects were considered, and one residual variance was assigned for each of six subclasses of age at measurement. Random regression curves were modeled by using Legendre polynomials of the second and third orders, with the best model chosen based on the Akaike Information Criterion, Bayesian Information Criterion, and restricted maximum likelihood. Multivariate analyses under the same animal mixed model were also performed for the validation of the random regression models. The Legendre polynomials of second order were better for describing the growth curves of the lines studied. Moderate to high heritabilities (h² = 0.15 to 0.98) were estimated for body weight between one and 84 days of age, suggesting that selection for body weight at all ages can be used as a selection criterion. Genetic correlations among body weight records obtained through multivariate analyses ranged from 0.18 to 0.96, 0.12 to 0.89, 0.06 to 0.96, and 0.28 to 0.96 in 7P, Caipirão da ESALQ, Caipirinha da ESALQ, and Carijó Barbado chicken lines, respectively. Results indicate that genetic gain for body weight can be achieved by selection. Also, selection for body weight at 42 days of age can be maintained as a selection criterion. © 2016 Poultry Science Association Inc.
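
    As an illustration of the regressors such a model uses, the sketch below maps an age in days onto [-1, 1] and evaluates Legendre polynomials up to a chosen order with Bonnet's recursion. The function names are hypothetical, the 0 to 84 day range is taken from the abstract, and the unnormalised form of the polynomials is an assumption (animal breeding software often uses a normalised variant).

```python
def standardize_age(age, min_age=0.0, max_age=84.0):
    """Map an age in days linearly onto the interval [-1, 1]."""
    return -1.0 + 2.0 * (age - min_age) / (max_age - min_age)

def legendre_basis(x, order):
    """Return [P_0(x), ..., P_order(x)] using Bonnet's recursion.

    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
    """
    values = [1.0, x]
    for n in range(1, order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]

# Second-order basis (the order preferred in the abstract) at three ages.
basis_at_hatch = legendre_basis(standardize_age(0.0), 2)
basis_at_42_days = legendre_basis(standardize_age(42.0), 2)
basis_at_84_days = legendre_basis(standardize_age(84.0), 2)
```

    Each animal's random regression coefficients multiply these basis values, so a second-order fit describes its growth trajectory with three coefficients per random effect.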

  7. Multivariate generalized hidden Markov regression models with random covariates: Physical exercise in an elderly population.

    PubMed

    Punzo, Antonio; Ingrassia, Salvatore; Maruotti, Antonello

    2018-04-22

    A time-varying latent variable model is proposed to jointly analyze multivariate mixed-support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which is the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. HMRMFCs are inadequate for applications in which a clustering structure can be identified in the distribution of the covariates, as the clustering is independent from the covariates distribution. Here, hidden Markov regression models with random covariates are introduced by explicitly specifying state-specific distributions for the covariates, with the aim of improving the recovering of the clusters in the data with respect to a fixed covariates paradigm. The hidden Markov regression models with random covariates class is defined focusing on the exponential family, in a generalized linear model framework. Model identifiability conditions are sketched, an expectation-maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through simulation experiments and compared with those of HMRMFCs. The method is applied to physical activity data. Copyright © 2018 John Wiley & Sons, Ltd.

  8. A novel strategy for forensic age prediction by DNA methylation and support vector regression model

    PubMed Central

    Xu, Cheng; Qu, Hongzhu; Wang, Guangyu; Xie, Bingbing; Shi, Yi; Yang, Yaran; Zhao, Zhao; Hu, Lan; Fang, Xiangdong; Yan, Jiangwei; Feng, Lei

    2015-01-01

    High deviations resulting from prediction model, gender and population difference have limited age estimation application of DNA methylation markers. Here we identified 2,957 novel age-associated DNA methylation sites (P < 0.01 and R² > 0.5) in blood of eight pairs of Chinese Han female monozygotic twins. Among them, nine novel sites (false discovery rate < 0.01), along with three other reported sites, were further validated in 49 unrelated female volunteers with ages of 20–80 years by Sequenom Massarray. A total of 95 CpGs were covered in the PCR products and 11 of them were used to build the age prediction models. After comparing four different models, including multivariate linear regression, multivariate nonlinear regression, back propagation neural network and support vector regression (SVR), SVR was identified as the most robust model with the least mean absolute deviation from real chronological age (2.8 years) and an average accuracy of 4.7 years predicted by only six loci from the 11 loci, as well as a lower cross-validated error than the linear regression model. Our novel strategy provides an accurate measurement that is highly useful in estimating the individual age in forensic practice as well as in tracking the aging process in other related applications. PMID:26635134
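
    The model comparison criterion named in this abstract, mean absolute deviation from chronological age, is simple to compute. The sketch below is a hedged illustration: the function name is hypothetical and every age and prediction in it is invented, so the resulting scores say nothing about the study's actual models.

```python
def mean_absolute_deviation(predicted, actual):
    """Average of |predicted - actual| over paired values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Made-up chronological ages and made-up predictions from two illustrative models.
actual_ages = [25.0, 40.0, 61.0, 78.0]
predictions = {
    "linear (illustrative)": [29.0, 35.0, 66.0, 72.0],
    "svr (illustrative)": [27.0, 38.0, 63.0, 76.0],
}
scores = {name: mean_absolute_deviation(pred, actual_ages)
          for name, pred in predictions.items()}
best_model = min(scores, key=scores.get)
```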

  9. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred

    2013-01-01

    A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.

  10. Arthroscopic modified Mason-Allen technique for large U- or L-shaped rotator cuff tears.

    PubMed

    Jung, Sung-Weon; Kim, Dong-Hee; Kang, Seung-Hoon; Lee, Ji-Heon

    2017-07-01

    While a conventional single- or double-row repair technique could be applied for repair of C-shaped tears, a different surgical strategy should be considered for repair of U- or L-shaped tears because they typically have complex patterns with anterior, posterior, or both mobile leaves. This study was performed to examine the outcomes of the modified Mason-Allen technique for footprint restoration in the treatment of large U- or L-shaped rotator cuff tears. Thirty-two patients who underwent an arthroscopic modified Mason-Allen technique for large U- or L-shaped rotator cuff tears between January 2012 and December 2013 were included in this study. Margin convergence was first performed to reduce the tear gap and tension, and then, an arthroscopic Mason-Allen technique was performed to restore the rotator cuff footprint in a side-to-end repair fashion. All patients were evaluated preoperatively and for a minimum of 2 years of follow-up with a visual analog scale (VAS) for pain, Constant score, and ultrasonography. There was significant improvement in all VAS and Constant scores compared with the preoperative values (P < 0.001). Functional results by Constant scores included 9 cases that were classified as excellent, 11 cases as good, 8 cases as fair, and 2 cases as poor. Binary logistic regression analysis revealed that heavy work, pseudoparalysis, joint space narrowing, fatty degeneration of the SST and IST, and a positive tangent sign correlated significantly with functional outcomes. Multivariable logistic regression analysis revealed that only fatty degeneration of the SST was a risk factor for fair/poor clinical outcomes. Complications occurred in 5 of the 32 patients (15.6%), and the reoperation rate due to complications was 6.3% (2 of 32 patients). An arthroscopic modified Mason-Allen technique was sufficient to restore the footprint of the rotator cuff in our data. Overall satisfactory results were achieved in most patients, with the exception of those with severe fatty degeneration. An arthroscopic modified Mason-Allen technique could be an effective and reliable alternative for patients with large U- or L-shaped rotator cuff tears. Case Series, Therapeutic Level IV.

  11. The Dirichlet-Multinomial Model for Multivariate Randomized Response Data and Small Samples

    ERIC Educational Resources Information Center

    Avetisyan, Marianna; Fox, Jean-Paul

    2012-01-01

    In survey sampling the randomized response (RR) technique can be used to obtain truthful answers to sensitive questions. Although the individual answers are masked due to the RR technique, individual (sensitive) response rates can be estimated when observing multivariate response data. The beta-binomial model for binary RR data will be generalized…

  12. All-Possible-Subsets for MANOVA and Factorial MANOVAs: Less than a Weekend Project

    ERIC Educational Resources Information Center

    Nimon, Kim; Zientek, Linda Reichwein; Kraha, Amanda

    2016-01-01

    Multivariate techniques are increasingly popular as researchers attempt to accurately model a complex world. MANOVA is a multivariate technique used to investigate the dimensions along which groups differ, and how these dimensions may be used to predict group membership. A concern in a MANOVA analysis is to determine if a smaller subset of…

  13. Multivariate mixed linear model analysis of longitudinal data: an information-rich statistical technique for analyzing disease resistance data

    USDA-ARS's Scientific Manuscript database

    The mixed linear model (MLM) is currently among the most advanced and flexible statistical modeling techniques and its use in tackling problems in plant pathology has begun surfacing in the literature. The longitudinal MLM is a multivariate extension that handles repeatedly measured data, such as r...

  14. Functional Path Analysis as a Multivariate Technique in Developing a Theory of Participation in Adult Education.

    ERIC Educational Resources Information Center

    Martin, James L.

    This paper reports on attempts by the author to construct a theoretical framework of adult education participation using a theory development process and the corresponding multivariate statistical techniques. Two problems are identified: the lack of theoretical framework in studying problems, and the limiting of statistical analysis to univariate…

  15. TG study of the Li0.4Fe2.4Zn0.2O4 ferrite synthesis

    NASA Astrophysics Data System (ADS)

    Lysenko, E. N.; Nikolaev, E. V.; Surzhikov, A. P.

    2016-02-01

    In this paper, the kinetic analysis of Li-Zn ferrite synthesis was performed using the thermogravimetry (TG) method through the simultaneous application of non-linear regression to several measurements run at different heating rates (multivariate non-linear regression). Using TG curves obtained for the four heating rates and the Netzsch Thermokinetics software package, the kinetic models with minimal adjustable parameters were selected to quantitatively describe the reaction of Li-Zn ferrite synthesis. It was shown that the experimental TG curves clearly suggest a two-step process for the ferrite synthesis, and therefore a model-fitting kinetic analysis based on multivariate non-linear regressions was conducted. The complex reaction was described by a two-step reaction scheme consisting of sequential reaction steps. It was established that the best results were obtained using the Jander three-dimensional diffusion model for the first step and the Ginstling-Brounshtein model for the second step. The kinetic parameters for the lithium-zinc ferrite synthesis reaction were found and discussed.
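
    For reference, the two diffusion models named in this abstract have the following standard integral forms, where \alpha is the degree of conversion. These are the textbook isothermal expressions for the Jander (also transliterated as Yander) and Ginstling-Brounshtein models, given as background under that assumption; the exact parameterisation used by the software in the study is not stated in the abstract. Under a constant heating rate the rate constant varies with temperature through the Arrhenius relation.

```latex
% Jander three-dimensional diffusion model
g_{\mathrm{J}}(\alpha) = \left[ 1 - (1 - \alpha)^{1/3} \right]^{2} = k t

% Ginstling-Brounshtein diffusion model
g_{\mathrm{GB}}(\alpha) = 1 - \tfrac{2}{3}\alpha - (1 - \alpha)^{2/3} = k t

% Arrhenius temperature dependence of the rate constant
k(T) = A \exp\!\left( -\frac{E_a}{R T} \right)
```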

  16. Structural brain connectivity and cognitive ability differences: A multivariate distance matrix regression analysis.

    PubMed

    Ponsoda, Vicente; Martínez, Kenia; Pineda-Pardo, José A; Abad, Francisco J; Olea, Julio; Román, Francisco J; Barbey, Aron K; Colom, Roberto

    2017-02-01

    Neuroimaging research involves analyses of huge amounts of biological data that might or might not be related to cognition. This relationship is usually approached using univariate methods, and, therefore, correction methods are mandatory for reducing false positives. Nevertheless, the probability of false negatives is also increased. Multivariate frameworks have been proposed for helping to alleviate this balance. Here we apply multivariate distance matrix regression for the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals regarding their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges better predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that a widespread but limited number of regions in the human brain supports high-level cognitive ability differences. Hum Brain Mapp 38:803-816, 2017. © 2016 Wiley Periodicals, Inc.
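
    The figure of 3,321 connections follows from counting unordered pairs among 82 regions (82 × 81 / 2). The sketch below checks that count and shows one way a between-subject distance could be computed from two symmetric connectivity matrices. The function names are hypothetical, the Euclidean metric is an assumption (the abstract does not name the distance used), and the 3 × 3 matrices are made up.

```python
import math

def n_undirected_edges(n_regions):
    """Number of unordered region pairs, i.e. n choose 2."""
    return n_regions * (n_regions - 1) // 2

def edge_vector(matrix):
    """Flatten the upper triangle (excluding the diagonal) of a symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def connectivity_distance(matrix_a, matrix_b):
    """Euclidean distance between two subjects' edge vectors (metric is an assumption)."""
    va, vb = edge_vector(matrix_a), edge_vector(matrix_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(va, vb)))

# Made-up connectivity matrices for two subjects with three regions each.
subject_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
subject_b = [[0, 1, 2], [1, 0, 7], [2, 7, 0]]
total_edges = n_undirected_edges(82)
distance_ab = connectivity_distance(subject_a, subject_b)
```

    Computing this distance for every pair of individuals yields the distance matrix that a distance-based regression then relates to the cognitive scores.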

  17. Multivariate analysis of cytokine profiles in pregnancy complications.

    PubMed

    Azizieh, Fawaz; Dingle, Kamaludin; Raghupathy, Raj; Johnson, Kjell; VanderPlas, Jacob; Ansari, Ali

    2018-03-01

    The immunoregulation to tolerate the semiallogeneic fetus during pregnancy includes a harmonious dynamic balance between anti- and pro-inflammatory cytokines. Several earlier studies reported significantly different levels and/or ratios of several cytokines in complicated pregnancy as compared to normal pregnancy. However, as cytokines operate in networks with potentially complex interactions, it is also interesting to compare groups with multi-cytokine data sets, with multivariate analysis. Such analysis will further examine how great the differences are, and which cytokines are more different than others. Various multivariate statistical tools, such as Cramer test, classification and regression trees, partial least squares regression figures, 2-dimensional Kolmogorov-Smirnov test, principal component analysis and gap statistic, were used to compare cytokine data of normal vs anomalous groups of different pregnancy complications. Multivariate analysis assisted in examining if the groups were different, how strongly they differed, in what ways they differed and further reported evidence for subgroups in 1 group (pregnancy-induced hypertension), possibly indicating multiple causes for the complication. This work contributes to a better understanding of cytokine interactions and may have important implications on targeting cytokine balance modulation or design of future medications or interventions that best direct management or prevention from an immunological approach. © 2018 The Authors. American Journal of Reproductive Immunology Published by John Wiley & Sons Ltd.

  18. [PROGNOSTIC MODELS IN MODERN MANAGEMENT OF VULVAR CANCER].

    PubMed

    Tsvetkov, Ch; Gorchev, G; Tomov, S; Nikolova, M; Genchev, G

    2016-01-01

    The aim of the research was to evaluate and analyse prognosis and prognostic factors in patients with squamous cell vulvar carcinoma after primary surgery with an individual approach applied during the course of treatment. In the period between January 2000 and July 2010, 113 patients with squamous cell carcinoma of the vulva were diagnosed and operated on at the Gynecologic Oncology Clinic of Medical University, Pleven. All the patients were monitored at the same clinic. An individual approach was applied to each patient and, whenever it was possible, more conservative operative techniques were applied. The probable clinicopathological characteristics influencing the overall survival and recurrence-free survival were analyzed. Univariate statistical analysis and Cox regression analysis were made in order to evaluate the characteristics that were statistically significant for overall survival and survival without recurrence. A multivariate logistic regression analysis (Forward Wald procedure) was applied to evaluate the combined influence of the significant factors. While performing the multivariate analysis, the synergistic effect of the independent prognostic factors of both kinds of survival was also evaluated. Approaching each patient individually, we applied the following operative techniques: 1. Deep total radical vulvectomy with separate incisions for lymph dissection (LD) or without dissection--68 (60.18%) patients. 2. En-bloc vulvectomy with bilateral LD without vulva reconstruction--10 (8.85%). 3. Modified radical vulvectomy (hemivulvectomy, partial vulvectomy)--25 (22.02%). 4. Wide local excision--3 (2.65%). 5. Simple (total/partial) vulvectomy--5 (4.43%) patients. 6. En-bloc resection with reconstruction--2 (1.77%). After a thorough analysis of the overall survival and recurrence-free survival, we concluded that relapse occurrence and FIGO clinical stage were independent prognostic factors for overall survival, and that the independent prognostic factors for recurrence-free survival were metastatic inguinal nodes (unilateral or bilateral), tumor size (above or below 3 cm) and lymphovascular space invasion. On the basis of these results we created two prognostic models: 1. A prognostic model of overall survival. 2. A prognostic model for survival without recurrence. Following the surgical staging of the disease, we were able to gather and analyse important clinicopathological indexes, which gave us the opportunity to form prognostic groups for overall survival and recurrence-free survival.

  19. Landslide susceptibility assessment in Lianhua County (China): A comparison between a random forest data mining technique and bivariate and multivariate statistical models

    NASA Astrophysics Data System (ADS)

    Hong, Haoyuan; Pourghasemi, Hamid Reza; Pourtaghi, Zohre Sadat

    2016-04-01

    Landslides are an important natural hazard that causes a great amount of damage around the world every year, especially during the rainy season. The Lianhua area is located in the middle of China's southern mountainous area, west of Jiangxi Province, and is known to be an area prone to landslides. The aim of this study was to evaluate and compare landslide susceptibility maps produced using the random forest (RF) data mining technique with those produced by bivariate (evidential belief function and frequency ratio) and multivariate (logistic regression) statistical models for Lianhua County, China. First, a landslide inventory map was prepared using aerial photograph interpretation, satellite images, and extensive field surveys. In total, 163 landslide events were recognized in the study area, with 114 landslides (70%) used for training and 49 landslides (30%) used for validation. Next, the landslide conditioning factors (slope angle, altitude, slope aspect, topographic wetness index (TWI), slope-length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, distance to roads, annual precipitation, land use, normalized difference vegetation index (NDVI), and lithology) were derived from the spatial database. Finally, the landslide susceptibility maps of Lianhua County were generated in ArcGIS 10.1 based on the random forest (RF), evidential belief function (EBF), frequency ratio (FR), and logistic regression (LR) approaches and were validated using a receiver operating characteristic (ROC) curve. The ROC plot assessment results showed that for landslide susceptibility maps produced using the EBF, FR, LR, and RF models, the area under the curve (AUC) values were 0.8122, 0.8134, 0.7751, and 0.7172, respectively. Therefore, we can conclude that all four models have an AUC of more than 0.70 and can be used in landslide susceptibility mapping in the study area; meanwhile, the EBF and FR models had the best performance for Lianhua County, China. Thus, the resultant susceptibility maps will be useful for land use planning and hazard mitigation aims.
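
    The ROC-based validation described in this record comes down to computing one AUC per model from its susceptibility scores on the validation set. The following is a minimal sketch, not the authors' workflow: the function name roc_auc and the sample scores and labels are hypothetical, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than anything specific to ArcGIS.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical susceptibility scores for six validation cells
# (label 1 = landslide observed, 0 = no landslide). Illustrative data only.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
example_auc = roc_auc(example_scores, example_labels)  # 8 of 9 pairs ranked correctly
```

    Under this reading, an AUC of 0.81 means a randomly chosen landslide cell receives a higher score than a randomly chosen non-landslide cell about 81% of the time.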

  20. Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.

    PubMed

    Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V

    2007-01-01

    The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques make it possible i) to determine natural groups or clusters of control strategies with similar behaviour, ii) to find and interpret hidden, complex and causal relational features in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.

  1. Clinical management provided by board-certificated physiatrists in early rehabilitation is a significant determinant of functional improvement in acute stroke patients: a retrospective analysis of Japan rehabilitation database.

    PubMed

    Kinoshita, Shoji; Kakuda, Wataru; Momosaki, Ryo; Yamada, Naoki; Sugawara, Hidekazu; Watanabe, Shu; Abo, Masahiro

    2015-05-01

    Early rehabilitation for acute stroke patients is widely recommended. We tested the hypothesis that clinical outcome of stroke patients who receive early rehabilitation managed by board-certificated physiatrists (BCP) is generally better than that provided by other medical specialties. Data of stroke patients who underwent early rehabilitation in 19 acute hospitals between January 2005 and December 2013 were collected from the Japan Rehabilitation Database and analyzed retrospectively. Multivariate linear regression analysis using the generalized estimating equations method was performed to assess the association between Functional Independence Measure (FIM) effectiveness and management provided by BCP in early rehabilitation. In addition, multivariate logistic regression analysis was also performed to assess the impact of management provided by BCP in the acute phase on discharge destination. After setting the inclusion criteria, data of 3838 stroke patients were eligible for analysis. BCP provided early rehabilitation in 814 patients (21.2%). Both the duration of daily exercise time and the frequency of regular conferencing were significantly higher for patients managed by BCP than by other specialties. Although the mortality rate was not different, multivariate regression analysis showed that FIM effectiveness correlated significantly and positively with the management provided by BCP (coefficient, .035; 95% confidence interval [CI], .012-.059; P < .005). In addition, multivariate logistic analysis identified clinical management by BCP as a significant determinant of home discharge (odds ratio, 1.24; 95% CI, 1.08-1.44; P < .005). Our retrospective cohort study demonstrated that clinical management provided by BCP in early rehabilitation can lead to functional recovery of acute stroke. Copyright © 2015 National Stroke Association. Published by Elsevier Inc. All rights reserved.

  2. Physical function in older men with hyperkyphosis.

    PubMed

    Katzman, Wendy B; Harrison, Stephanie L; Fink, Howard A; Marshall, Lynn M; Orwoll, Eric; Barrett-Connor, Elizabeth; Cawthon, Peggy M; Kado, Deborah M

    2015-05-01

    Age-related hyperkyphosis has been associated with poor physical function and is a well-established predictor of adverse health outcomes in older women, but its impact on health in older men is less well understood. We conducted a cross-sectional study to evaluate the association of hyperkyphosis and physical function in 2,363 men, aged 71-98 (M = 79) from the Osteoporotic Fractures in Men Study. Kyphosis was measured using the Rancho Bernardo Study block method. Measurements of grip strength and lower extremity function, including gait speed over 6 m, narrow walk (measure of dynamic balance), repeated chair stands ability and time, and lower extremity power (Nottingham Power Rig) were included separately as primary outcomes. We investigated associations of kyphosis and each outcome in age-adjusted and multivariable linear or logistic regression models, controlling for age, clinic, education, race, bone mineral density, height, weight, diabetes, and physical activity. In multivariate linear regression, we observed a dose-related response of worse scores on each lower extremity physical function test as number of blocks increased, p for trend ≤.001. Using a cutoff of ≥4 blocks, 20% (N = 469) of men were characterized with hyperkyphosis. In multivariate logistic regression, men with hyperkyphosis had increased odds (range 1.5-1.8) of being in the worst quartile of performing lower extremity physical function tasks (p < .001 for each outcome). Kyphosis was not associated with grip strength in any multivariate analysis. Hyperkyphosis is associated with impaired lower extremity physical function in older men. Further studies are needed to determine the direction of causality. © The Author 2014. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  3. Primary fascial closure with mesh reinforcement is superior to bridged mesh repair for abdominal wall reconstruction.

    PubMed

    Booth, Justin H; Garvey, Patrick B; Baumann, Donald P; Selber, Jesse C; Nguyen, Alexander T; Clemens, Mark W; Liu, Jun; Butler, Charles E

    2013-12-01

    Many surgeons believe that primary fascial closure with mesh reinforcement should be the goal of abdominal wall reconstruction (AWR), yet others have reported acceptable outcomes when mesh is used to bridge the fascial edges. It has not been clearly shown how the outcomes for these techniques differ. We hypothesized that bridged repairs result in higher hernia recurrence rates than mesh-reinforced repairs that achieve fascial coaptation. We retrospectively reviewed prospectively collected data from consecutive patients with 1 year or more of follow-up, who underwent midline AWR between 2000 and 2011 at a single center. We compared surgical outcomes between patients with bridged and mesh-reinforced fascial repairs. The primary outcomes measure was hernia recurrence. Multivariate logistic regression analysis was used to identify factors predictive of or protective for complications. We included 222 patients (195 mesh-reinforced and 27 bridged repairs) with a mean follow-up of 31.1 ± 14.2 months. The bridged repairs were associated with a significantly higher risk of hernia recurrence (56% vs 8%; hazard ratio [HR] 9.5; p < 0.001) and a higher overall complication rate (74% vs 32%; odds ratio [OR] 3.9; p < 0.001). The interval to recurrence was more than 9 times shorter in the bridged group (HR 9.5; p < 0.001). Multivariate Cox proportional hazard regression analysis identified bridged repair and defect width > 15 cm to be independent predictors of hernia recurrence (HR 7.3; p < 0.001 and HR 2.5; p = 0.028, respectively). Mesh-reinforced AWRs with primary fascial coaptation resulted in fewer hernia recurrences and fewer overall complications than bridged repairs. Surgeons should make every effort to achieve primary fascial coaptation to reduce complications. Published by Elsevier Inc.

  4. Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes.

    PubMed

    Yates, Katherine L; Mellin, Camille; Caley, M Julian; Radford, Ben T; Meeuwig, Jessica J

    2016-01-01

    Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. 
Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability.

  5. Comparison of perioperative outcomes between open and robotic radical cystectomy: a population based analysis.

    PubMed

    Nazzani, Sebastiano; Mazzone, Elio; Preisser, Felix; Bandini, Marco; Tian, Zhe; Marchioni, Michele; Ratti, Dario; Motta, Gloria; Zorn, Kevin Christopher; Briganti, Alberto; Shariat, Shahrokh F; Montanari, Emanuele; Carmignani, Luca; Karakiewicz, Pierre I

    2018-05-30

    Radical cystectomy represents the standard of care for muscle invasive bladder cancer (MIBC). Due to its novelty, the use of robotic radical cystectomy (RARC) is still under debate. We examined intraoperative and postoperative morbidity and mortality as well as impact on length of stay (LOS) and total hospital charges (THCGs) of RARC compared to open radical cystectomy (ORC). Within the National Inpatient Sample (NIS) (2008-2013), we identified patients with non-metastatic bladder cancer treated with either ORC or RARC. We relied on inverse probability of treatment weighting (IPTW) to reduce the effect of inherent differences between ORC vs. RARC. Multivariable logistic regression (MLR) and multivariable Poisson regression models (MPR) were used. Of all 10 027 patients, 12.6% underwent RARC. Between 2008 and 2013, RARC rates increased from 0.8 to 20.4% [Estimated annual percentage change (EAPC): +26.5%, CI: +11.1 to +48.3; p=0.035] and RARC THCGs decreased from 45 981 to 31 749 United States Dollars (EAPC: -6.8%, CI: -9.6 to -3.9; p=0.01). In MLR models RARC resulted in lower rates of overall complications (OR: 0.6; p <0.001) and transfusions (OR: 0.44; p <0.001). In MPR models, RARC was associated with shorter LOS [relative risk (RR): 0.91; p <0.001]. Finally, higher THCGs (OR: 1.09; p <0.001) were recorded for RARC. Data are retrospective and no tumor characteristics were available. RARC is related to lower rates of overall complications and transfusion rates. In consequence, RARC is a safe and feasible technique in select muscle invasive bladder cancer patients. Moreover, RARC is associated with shorter LOS albeit higher THCGs.

  6. Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes

    PubMed Central

    Yates, Katherine L.; Mellin, Camille; Caley, M. Julian; Radford, Ben T.; Meeuwig, Jessica J.

    2016-01-01

    Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. 
Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability. PMID:27333202

  7. Household food insecurity and mental distress among pregnant women in Southwestern Ethiopia: a cross sectional study design.

    PubMed

    Jebena, Mulusew G; Taha, Mohammed; Nakajima, Motohiro; Lemieux, Andrine; Lemessa, Fikre; Hoffman, Richard; Tesfaye, Markos; Belachew, Tefera; Workineh, Netsanet; Kebede, Esayas; Gemechu, Teklu; Tariku, Yinebeb; Segni, Hailemariam; Kolsteren, Patrick; al'Absi, Mustafa

    2015-10-08

    There are compelling theoretical and empirical reasons that link household food insecurity to mental distress in the setting where both problems are common. However, little is known about their association during pregnancy in Ethiopia. A cross-sectional study was conducted to examine the association of household food insecurity with mental distress during pregnancy. Six hundred and forty-two pregnant women were recruited from 11 health centers and one hospital. Probability proportional to size (PPS) and consecutive sampling techniques were employed to recruit study subjects until the desired sample size was obtained. The Self Reporting Questionnaire (SRQ-20) was used to measure mental distress and a 9-item Household Food Insecurity Access Scale was used to measure food security status. Descriptive and inferential statistics were computed accordingly. Multivariate logistic regression was used to estimate the effect of food insecurity on mental distress. Fifty-eight of the respondents (9%) were moderately food insecure and 144 of the respondents (22.4%) had mental distress. Food insecurity was also associated with mental distress. Pregnant women living in food insecure households were 4 times more likely to have mental distress than their counterparts (COR = 3.77, 95% CI: 2.17, 6.55). After controlling for confounders, a multivariate logistic regression model supported a link between food insecurity and mental distress (AOR = 4.15, 95% CI: 1.67, 10.32). The study found a significant association between food insecurity and mental distress. However, the mechanism by which food insecurity is associated with mental distress is not clear. Further investigation is therefore needed to understand either how food insecurity during pregnancy leads to mental distress or whether mental distress is a contributing factor in the development of food insecurity.

  8. Does midwife experience affect the rate of severe perineal tears?

    PubMed

    Mizrachi, Yossi; Leytes, Sophia; Levy, Michal; Hiaev, Zvia; Ginath, Shimon; Bar, Jacob; Kovo, Michal

    2017-06-01

    Our aim was to study whether midwife experience affects the rate of severe perineal tears (3rd and 4th degree). A retrospective cohort study of all women with term vertex singleton pregnancies, who underwent normal vaginal deliveries, in a single tertiary hospital, between 2011 and 2015, was performed. Exclusion criteria were instrumental deliveries and stillbirth. All midwives used a "hands on" technique for protecting the perineum. The midwife experience at each delivery was calculated as the time interval between her first delivery and current delivery. A comparison was performed between deliveries in which midwife experience was less than 2 years (inexperienced), between 2 and 10 years (moderately experienced), and more than 10 years (highly experienced). A multivariate regression analysis was performed to assess the association between midwife experience and the incidence of severe perineal tears, after controlling for confounders. Overall, 15 146 deliveries were included. Severe perineal tears were diagnosed in 51 (0.33%) deliveries. Women delivered by inexperienced midwives had a higher rate of severe perineal tears compared with women delivered by highly experienced midwives (0.5% vs 0.2%, respectively, P=.024). On multivariate regression analysis, midwife experience was independently associated with a lower rate of severe perineal tears, after controlling for confounding factors. Each additional year of experience was associated with a 4.7% decrease in the risk of severe perineal tears (adjusted OR 0.95 [95% CI 0.91-0.99], P=.03). More experienced midwives had a lower rate of severe perineal tears, and may be preferred for managing deliveries of women at high risk for such tears. © 2017 Wiley Periodicals, Inc.
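
    A per-year adjusted odds ratio such as the 0.95 reported in this record compounds multiplicatively when experience enters the model as a linear term on the log-odds scale. The sketch below assumes that linear form, which the abstract does not confirm; the function name compounded_odds_ratio is hypothetical and the calculation is arithmetic on the published point estimate, not a result from the study.

```python
def compounded_odds_ratio(per_unit_or, units):
    """Odds ratio implied after `units` steps if each step multiplies the odds by `per_unit_or`.

    Assumes the predictor is linear on the log-odds scale (an assumption,
    not something stated in the abstract).
    """
    return per_unit_or ** units


# Using the reported point estimate of 0.95 per additional year of experience,
# ten additional years would imply an odds ratio of roughly 0.60.
ten_year_or = compounded_odds_ratio(0.95, 10)
```

    The same compounding applied to the confidence limits (0.91 and 0.99) gives a much wider range over ten years, so the point figure should be read with that uncertainty in mind.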

  9. Blended learning in situated contexts: 3-year evaluation of an online peer review project.

    PubMed

    Bridges, S; Chang, J W W; Chu, C H; Gardner, K

    2014-08-01

    Situated and sociocultural perspectives on learning indicate that the design of complex tasks supported by educational technologies holds potential for dental education in moving novices towards closer approximation of the clinical outcomes of their expert mentors. A cross-faculty, student-centred, web-based project in operative dentistry was established within the Universitas 21 (U21) network of higher education institutions to support university goals for internationalisation in clinical learning by enabling distributed interactions across sites and institutions. This paper aims to present an evaluation of one dental faculty's project experience of curriculum redesign for deeper student learning. A mixed-method case study approach was utilised. Three cohorts of second-year students from a 5-year bachelor of dental surgery (BDS) programme were invited to participate in annual surveys and focus group interviews on project completion. Survey data were analysed for differences between years using multivariate logistic regression analysis. Thematic analysis of questionnaire open responses and interview transcripts was conducted. Multivariate logistic regression analysis noted significant differences across items over time indicating learning improvements, attainment of university aims and the positive influence of redesign. Students perceived the enquiry-based project as stimulating and motivating, and building confidence in operative techniques. Institutional goals for greater understanding of others and lifelong learning showed improvement over time. Despite positive scores, students indicated global citizenship and intercultural understanding were conceptually challenging. Establishment of online student learning communities through a blended approach to learning stimulated motivation and intellectual engagement, thereby supporting a situated approach to cognition. Sociocultural perspectives indicate that novice-expert interactions supported student development of professional identities. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  10. Death Anxiety as a Predictor of Posttraumatic Stress Levels among Individuals with Spinal Cord Injuries

    ERIC Educational Resources Information Center

    Martz, Erin

    2004-01-01

    Because the onset of a spinal cord injury may involve a brush with death and because serious injury and disability can act as a reminder of death, death anxiety was examined as a predictor of posttraumatic stress levels among individuals with disabilities. This cross-sectional study used multiple regression and multivariate multiple regression to…

  11. Cellulose I crystallinity determination using FT-Raman spectroscopy : univariate and multivariate methods

    Treesearch

    Umesh P. Agarwal; Richard S. Reiner; Sally A. Ralph

    2010-01-01

    Two new methods based on FT–Raman spectroscopy, one simple, based on band intensity ratio, and the other using a partial least squares (PLS) regression model, are proposed to determine cellulose I crystallinity. In the simple method, crystallinity in cellulose I samples was determined based on univariate regression that was first developed using the Raman band...

  12. Predicting Potential Changes in Suitable Habitat and Distribution by 2100 for Tree Species of the Eastern United States

    Treesearch

    Louis R Iverson; Anantha M. Prasad; Mark W. Schwartz

    2005-01-01

    We predict current distribution and abundance for tree species present in eastern North America, and subsequently estimate potential suitable habitat for those species under a changed climate with 2 x CO2. We used a series of statistical models (i.e., Regression Tree Analysis (RTA), Multivariate Adaptive Regression Splines (MARS), Bagging Trees (...

  13. Per capita community-level effects of an invasive grass, Microstegium vimineum, on vegetation in mesic forests in northern Mississippi (USA)

    Treesearch

    J. Stephen Brewer

    2010-01-01

    Quantifying per capita impacts of invasive species on resident communities requires integrating regression analyses with experiments under natural conditions. Using multivariate and univariate approaches, I regressed the abundance of 105 resident species of groundcover plants and tree seedlings against the abundance and height of an invasive grass, Microstegium...

  14. Regression analysis for LED color detection of visual-MIMO system

    NASA Astrophysics Data System (ADS)

    Banik, Partha Pratim; Saha, Rappy; Kim, Ki-Doo

    2018-04-01

    Color detection from a light emitting diode (LED) array using a smartphone camera is very difficult in a visual multiple-input multiple-output (visual-MIMO) system. In this paper, we propose a method to determine the LED color using a smartphone camera by applying regression analysis. We employ a multivariate regression model to identify the LED color. After taking a picture of an LED array, we select the LED array region, and detect the LED using an image processing algorithm. We then apply the k-means clustering algorithm to determine the number of potential colors for feature extraction of each LED. Finally, we apply the multivariate regression model to predict the color of the transmitted LEDs. In this paper, we show our results for three types of environmental light condition: room environmental light, low environmental light (560 lux), and strong environmental light (2450 lux). We compare the results of our proposed algorithm using training and test R-Square (%) values and the percentage of closeness between transmitted and predicted colors, and we also report the number of distorted test data points from an analysis of the distortion bar graph in CIE1931 color space.

  15. Multivariate stochastic simulation with subjective multivariate normal distributions

    Treesearch

    P. J. Ince; J. Buongiorno

    1991-01-01

    In many applications of Monte Carlo simulation in forestry or forest products, it may be known that some variables are correlated. However, for simplicity, in most simulations it has been assumed that random variables are independently distributed. This report describes an alternative Monte Carlo simulation technique for subjectively assessed multivariate normal...
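
    Because this abstract is truncated, the report's own procedure is not visible here. As a general illustration of simulating correlated rather than independent inputs, the sketch below draws pairs from a bivariate normal distribution using the 2x2 Cholesky factor of the correlation matrix. This is a standard textbook construction and is not claimed to be the report's method; all names and parameter values (for example the means, standard deviations and the correlation of 0.8) are hypothetical.

```python
import math
import random


def sample_correlated_normals(mean_x, sd_x, mean_y, sd_y, rho, rng):
    """Draw one (x, y) pair from a bivariate normal with correlation `rho`.

    Uses the Cholesky factor of [[1, rho], [rho, 1]], which is
    [[1, 0], [rho, sqrt(1 - rho^2)]], applied to two independent
    standard normal draws.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = mean_x + sd_x * z1
    y = mean_y + sd_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x, y


def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical inputs: two correlated quantities with subjectively chosen
# means, standard deviations and correlation (illustrative values only).
rng = random.Random(0)
draws = [sample_correlated_normals(100.0, 15.0, 50.0, 5.0, 0.8, rng)
         for _ in range(20000)]
estimated_rho = sample_correlation(draws)
```

    With enough draws the estimated correlation settles close to the target value, which is a quick check that the simulated inputs are no longer being treated as independent.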

  16. Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy.

    PubMed

    Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L

    2017-05-07

    In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade  ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC  =  0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
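
    A multivariable logistic NTCP model of the kind described in this record maps a linear combination of predictors to a complication probability. The sketch below shows only the general functional form; the intercept, the coefficient values and the choice of predictors (mean esophageal dose plus one binary clinical factor) are hypothetical placeholders and are not the fitted parameters of the published model.

```python
import math


def ntcp_logistic(intercept, coefficients, predictors):
    """Complication probability from a logistic model: 1 / (1 + exp(-s)),
    where s = intercept + sum(coefficient_i * predictor_i)."""
    if len(coefficients) != len(predictors):
        raise ValueError("coefficients and predictors must have the same length")
    s = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-s))


# Hypothetical parameters (NOT from the paper): intercept, a coefficient per Gy
# of mean esophageal dose (MED), and a coefficient for one binary clinical factor.
hypothetical_intercept = -3.0
hypothetical_coefficients = [0.1, 0.5]

risk_low_dose = ntcp_logistic(hypothetical_intercept, hypothetical_coefficients, [10.0, 0.0])
risk_high_dose = ntcp_logistic(hypothetical_intercept, hypothetical_coefficients, [40.0, 0.0])
```

    With a positive dose coefficient the predicted probability rises monotonically with MED, which is the behaviour such a model is meant to capture.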

  17. Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy

    NASA Astrophysics Data System (ADS)

    Dankers, Frank; Wijsman, Robin; Troost, Esther G. C.; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L.

    2017-05-01

    In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade  ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC  =  0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.

  18. Uses of Multivariate Analytical Techniques in Online and Blended Business Education: An Assessment of Current Practice and Recommendations for Future Research

    ERIC Educational Resources Information Center

    Arbaugh, J. B.; Hwang, Alvin

    2013-01-01

    Seeking to assess the analytical rigor of empirical research in management education, this article reviews the use of multivariate statistical techniques in 85 studies of online and blended management education over the past decade and compares them with prescriptions offered by both the organization studies and educational research communities.…

  19. Recent applications of multivariate data analysis methods in the authentication of rice and the most analyzed parameters: A review.

    PubMed

    Maione, Camila; Barbosa, Rommel Melgaço

    2018-01-24

    Rice is one of the most important staple foods around the world. Authentication of rice is one of the most addressed concerns in the present literature, which includes recognition of its geographical origin and variety, certification of organic rice and many other issues. Good results have been achieved by multivariate data analysis and data mining techniques when combined with specific parameters for ascertaining authenticity and many other useful characteristics of rice, such as quality, yield and others. This paper brings a review of the recent research projects on discrimination and authentication of rice using multivariate data analysis and data mining techniques. We found that data obtained from image processing, molecular and atomic spectroscopy, elemental fingerprinting, genetic markers, molecular content and others are promising sources of information regarding geographical origin, variety and other aspects of rice, being widely used combined with multivariate data analysis techniques. Principal component analysis and linear discriminant analysis are the preferred methods, but several other data classification techniques such as support vector machines, artificial neural networks and others are also frequently present in some studies and show high performance for discrimination of rice.

  20. Multivariate geometry as an approach to algal community analysis

    USGS Publications Warehouse

    Allen, T.F.H.; Skagen, S.

    1973-01-01

    Multivariate analyses are put in the context of more usual approaches to phycological investigations. The intuitive common-sense involved in methods of ordination, classification and discrimination are emphasised by simple geometric accounts which avoid jargon and matrix algebra. Warnings are given that artifacts result from technique abuses by the naive or over-enthusiastic. An analysis of a simple periphyton data set is presented as an example of the approach. Suggestions are made as to situations in phycological investigations, where the techniques could be appropriate. The discipline is reprimanded for its neglect of the multivariate approach.

  1. Assessing landslide susceptibility by statistical data analysis and GIS: the case of Daunia (Apulian Apennines, Italy)

    NASA Astrophysics Data System (ADS)

    Ceppi, C.; Mancini, F.; Ritrovato, G.

    2009-04-01

    This study aims at landslide susceptibility mapping within an area of the Daunia (Apulian Apennines, Italy) using a multivariate statistical method and data manipulation in a Geographical Information System (GIS) environment. Among the variety of existing statistical data analysis techniques, logistic regression was chosen to produce a susceptibility map over an area where small settlements have historically been threatened by landslides. Logistic regression finds the best fit between the presence or absence of a landslide (dependent variable) and the set of independent variables on the basis of a maximum likelihood criterion, leading to the estimation of the regression coefficients. The reliability of such an analysis therefore rests on its ability to quantify the proneness to landslide occurrence through the probability level it produces. The inventory of dependent and independent variables was managed in a GIS, where geometric properties and attributes were translated into raster cells in order to proceed with the logistic regression by means of the SPSS (Statistical Package for the Social Sciences) package. A landslide inventory was used to produce the binary dependent variable, whereas the set of independent variables comprised slope, aspect, elevation, curvature, drained area, lithology and land use after their reduction to dummy variables. The effect of the independent parameters on landslide occurrence was assessed through the corresponding coefficients in the logistic regression function, highlighting a major role played by the land use variable in determining the occurrence and distribution of phenomena. Once the outcomes of the logistic regression are determined, the data are re-introduced into the GIS to produce a map reporting the proneness to landslide as a predicted level of probability. As validation of the results and the regression model, a cell-by-cell comparison between the susceptibility map and the initial inventory of landslide events was performed, and agreement at the 75% level was achieved.
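
    The fitting and validation steps described above can be sketched as follows. This is a minimal illustration, assuming plain gradient ascent on the log-likelihood rather than the SPSS procedure the authors used. The raster cells, the two predictors and all function names are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, xi):
    # w[0] is the intercept; w[1:] are the regression coefficients.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))


def fit_logistic(X, y, lr=0.1, steps=5000):
    """Maximum likelihood fit by gradient ascent on the mean log-likelihood."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def agreement(w, X, y, threshold=0.5):
    """Cell-by-cell agreement between thresholded probabilities and the inventory."""
    hits = sum((predict_proba(w, xi) >= threshold) == bool(yi) for xi, yi in zip(X, y))
    return hits / len(y)


# Hypothetical raster cells: [scaled slope, dummy variable for one land-use class].
cells = [[0.1, 0], [0.2, 0], [0.3, 1], [0.6, 0], [0.7, 1], [0.8, 1], [0.9, 0], [0.4, 1]]
landslide = [0, 0, 0, 1, 1, 1, 1, 0]  # presence/absence from a made-up inventory
weights = fit_logistic(cells, landslide)
cell_agreement = agreement(weights, cells, landslide)
```

    The sign and size of each fitted coefficient play the role the abstract assigns to them: they indicate how strongly a predictor pushes the predicted probability of a landslide up or down.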

  2. Quantitative methods for compensation of matrix effects and self-absorption in Laser Induced Breakdown Spectroscopy signals of solids

    NASA Astrophysics Data System (ADS)

    Takahashi, Tomoko; Thornton, Blair

    2017-12-01

    This paper reviews methods to compensate for matrix effects and self-absorption during quantitative analysis of compositions of solids measured using Laser Induced Breakdown Spectroscopy (LIBS) and their applications to in-situ analysis. Methods to reduce matrix and self-absorption effects on calibration curves are first introduced. The conditions where calibration curves are applicable to quantification of compositions of solid samples and their limitations are discussed. While calibration-free LIBS (CF-LIBS), which corrects matrix effects theoretically based on the Boltzmann distribution law and Saha equation, has been applied in a number of studies, requirements need to be satisfied for the calculation of chemical compositions to be valid. Also, peaks of all elements contained in the target need to be detected, which is a bottleneck for in-situ analysis of unknown materials. Multivariate analysis techniques are gaining momentum in LIBS analysis. Among the available techniques, principal component regression (PCR) analysis and partial least squares (PLS) regression analysis, which can extract related information to compositions from all spectral data, are widely established methods and have been applied to various fields including in-situ applications in air and for planetary explorations. Artificial neural networks (ANNs), where non-linear effects can be modelled, have also been investigated as a quantitative method and their applications are introduced. The ability to make quantitative estimates based on LIBS signals is seen as a key element for the technique to gain wider acceptance as an analytical method, especially in in-situ applications. In order to accelerate this process, it is recommended that the accuracy should be described using common figures of merit which express the overall normalised accuracy, such as the normalised root mean square errors (NRMSEs), when comparing the accuracy obtained from different setups and analytical methods.
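
    The recommendation to report normalised root mean square errors can be illustrated with a short sketch. The abstract does not fix the normalisation, so dividing by the range of the reference values is an assumption made here; the function name and the sample values are hypothetical.

```python
import math


def nrmse(reference, predicted):
    """RMSE divided by the range of the reference values (one common convention)."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    span = max(reference) - min(reference)
    if span == 0:
        raise ValueError("reference values have zero range")
    return math.sqrt(mse) / span


# Hypothetical reference concentrations and model estimates (arbitrary units).
reference_values = [10.0, 20.0, 30.0, 40.0]
estimated_values = [12.0, 18.0, 33.0, 39.0]
example_nrmse = nrmse(reference_values, estimated_values)
```

    Because the result is dimensionless, it can be compared across setups whose raw errors are expressed in different units or concentration ranges.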

  3. Lameness detection in dairy cattle: single predictor v. multivariate analysis of image-based posture processing and behaviour and performance sensing.

    PubMed

    Van Hertem, T; Bahr, C; Schlageter Tello, A; Viazzi, S; Steensels, M; Romanini, C E B; Lokhorst, C; Maltz, E; Halachmi, I; Berckmans, D

    2016-09-01

    The objective of this study was to evaluate if a multi-sensor system (milk, activity, body posture) was a better classifier for lameness than the single-sensor-based detection models. Between September 2013 and August 2014, 3629 cow observations were collected on a commercial dairy farm in Belgium. Human locomotion scoring was used as reference for the model development and evaluation. Cow behaviour and performance were measured with existing sensors that were already present at the farm. A prototype three-dimensional video recording system was used to quantify automatically the back posture of a cow. For the single predictor comparisons, a receiver operating characteristics curve was made. For the multivariate detection models, logistic regression and generalized linear mixed models (GLMM) were developed. The best lameness classification model was obtained by the multi-sensor analysis (area under the receiver operating characteristics curve (AUC)=0.757±0.029), containing a combination of milk and milking variables, activity and gait and posture variables from videos. Second, the multivariate video-based system (AUC=0.732±0.011) performed better than the multivariate milk sensors (AUC=0.604±0.026) and the multivariate behaviour sensors (AUC=0.633±0.018). The video-based system performed better than the combined behaviour and performance-based detection model (AUC=0.669±0.028), indicating that it is worthwhile to consider a video-based lameness detection system, regardless of the presence of other existing sensors on the farm. The results suggest that Θ2, the feature variable for the back curvature around the hip joints, with an AUC of 0.719, is the best single predictor variable for lameness detection based on locomotion scoring. In general, this study showed that the video-based back posture monitoring system outperforms the behaviour and performance sensing techniques for locomotion scoring-based lameness detection.
A GLMM with seven specific variables (walking speed, back posture measurement, daytime activity, milk yield, lactation stage, milk peak flow rate and milk peak conductivity) is the best combination of variables for lameness classification. The accuracy on four-level lameness classification was 60.3%. The accuracy improved to 79.8% for binary lameness classification. The binary GLMM obtained a sensitivity of 68.5% and a specificity of 87.6%, which both exceed the sensitivity (52.1%±4.7%) and specificity (83.2%±2.3%) of the multi-sensor logistic regression model. This shows that the repeated measures analysis in the GLMM, taking into account the individual history of the animal, outperforms the classification when thresholds based on herd level (a statistical population) are used.
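
    Sensitivity, specificity and accuracy of a binary classifier, as quoted above, all derive from the same confusion counts. The sketch below shows the arithmetic; the function name and the ten example observations are hypothetical and unrelated to the study's herd data.

```python
def binary_metrics(reference, predicted):
    """Sensitivity, specificity and accuracy from paired 0/1 labels."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # lame, flagged
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # lame, missed
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # sound, not flagged
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # sound, flagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical locomotion scores (1 = lame) and model classifications.
scored = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
classified = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(scored, classified)
```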

  4. Pan evaporation modeling using six different heuristic computing methods in different climates of China

    NASA Astrophysics Data System (ADS)

    Wang, Lunche; Kisi, Ozgur; Zounemat-Kermani, Mohammad; Li, Hui

    2017-01-01

    Pan evaporation (Ep) plays important roles in agricultural water resources management. One of the basic challenges is modeling Ep using limited climatic parameters because there are a number of factors affecting the evaporation rate. This study investigated the abilities of six different soft computing methods, multi-layer perceptron (MLP), generalized regression neural network (GRNN), fuzzy genetic (FG), least square support vector machine (LSSVM), multivariate adaptive regression spline (MARS), adaptive neuro-fuzzy inference systems with grid partition (ANFIS-GP), and two regression methods, multiple linear regression (MLR) and Stephens and Stewart model (SS) in predicting monthly Ep. Long-term climatic data at various sites crossing a wide range of climates during 1961-2000 are used for model development and validation. The results showed that the models have different accuracies in different climates and the MLP model performed superior to the other models in predicting monthly Ep at most stations using local input combinations (for example, the MAE (mean absolute errors), RMSE (root mean square errors), and determination coefficient (R2) are 0.314 mm/day, 0.405 mm/day and 0.988, respectively for HEB station), while GRNN model performed better in Tibetan Plateau (MAE, RMSE and R2 are 0.459 mm/day, 0.592 mm/day and 0.932, respectively). The accuracies of above models ranked as: MLP, GRNN, LSSVM, FG, ANFIS-GP, MARS and MLR. The overall results indicated that the soft computing techniques generally performed better than the regression methods, but MLR and SS models can be more preferred at some climatic zones instead of complex nonlinear models, for example, the BJ (Beijing), CQ (Chongqing) and HK (Haikou) stations. Therefore, it can be concluded that Ep could be successfully predicted using above models in hydrological modeling studies.

  5. Evaluation of in-line Raman data for end-point determination of a coating process: Comparison of Science-Based Calibration, PLS-regression and univariate data analysis.

    PubMed

    Barimani, Shirin; Kleinebudde, Peter

    2017-10-01

    A multivariate analysis method, Science-Based Calibration (SBC), was used for the first time for endpoint determination of a tablet coating process using Raman data. Two types of tablet cores, placebo and caffeine cores, received a coating suspension comprising a polyvinyl alcohol-polyethylene glycol graft-copolymer and titanium dioxide to a maximum coating thickness of 80 µm. Raman spectroscopy was used as an in-line PAT tool. The spectra were acquired every minute and correlated to the amount of applied aqueous coating suspension. SBC was compared to another well-known multivariate analysis method, Partial Least Squares-regression (PLS), and a simpler approach, Univariate Data Analysis (UVDA). All developed calibration models had coefficient of determination values (R²) higher than 0.99. The coating endpoints could be predicted with root mean square errors (RMSEP) less than 3.1% of the applied coating suspensions. Compared to PLS and UVDA, SBC proved to be an alternative multivariate calibration method with high predictive power. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. Influence factors and forecast of carbon emission in China: structure adjustment for emission peak

    NASA Astrophysics Data System (ADS)

    Wang, B.; Cui, C. Q.; Li, Z. P.

    2018-02-01

    This paper uses Principal Component Analysis and a Multivariate Linear Regression Model to verify long-term balance relationships between carbon emissions and their impact factors. The integrated model, which combines improved PCA with multivariate regression analysis, is able to identify the pattern of carbon emission sources. The main empirical results indicate that, among all selected variables, the energy consumption scale plays the largest role. GDP and population follow and also have significant impacts on carbon emission. Industrialization rate and fossil fuel proportion, which are indicators of the economic structure and the energy structure, are more important than the urbanization rate and the consumption level of urban residents. On this basis, some suggestions are put forward for the government to achieve the peak of carbon emissions.

  7. DEFINITION OF MULTIVARIATE GEOCHEMICAL ASSOCIATIONS WITH POLYMETALLIC MINERAL OCCURRENCES USING A SPATIALLY DEPENDENT CLUSTERING TECHNIQUE AND RASTERIZED STREAM SEDIMENT DATA - AN ALASKAN EXAMPLE.

    USGS Publications Warehouse

    Jenson, Susan K.; Trautwein, C.M.

    1984-01-01

    The application of an unsupervised, spatially dependent clustering technique (AMOEBA) to interpolated raster arrays of stream sediment data has been found to provide useful multivariate geochemical associations for modeling regional polymetallic resource potential. The technique is based on three assumptions regarding the compositional and spatial relationships of stream sediment data and their regional significance. These assumptions are: (1) compositionally separable classes exist and can be statistically distinguished; (2) the classification of multivariate data should minimize the pair probability of misclustering to establish useful compositional associations; and (3) a compositionally defined class represented by three or more contiguous cells within an array is a more important descriptor of a terrane than a class represented by spatial outliers.

  8. Estimating the concrete compressive strength using hard clustering and fuzzy clustering based regression techniques.

    PubMed

    Nagwani, Naresh Kumar; Deo, Shirish V

    2014-01-01

    Understanding of the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, and proportioning new mixtures and for the quality assurance. Regression techniques are most widely used for prediction tasks where relationship between the independent variables and dependent (prediction) variable is identified. The accuracy of the regression techniques for prediction can be improved if clustering can be used along with regression. Clustering along with regression will ensure the more accurate curve fitting between the dependent and independent variables. In this work cluster regression technique is applied for estimating the compressive strength of the concrete and a novel state of the art is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression ensures less prediction errors for estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group the similar characteristics concrete data and then in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. It is found from experiments that clustering along with regression techniques gives minimum errors for predicting compressive strength of concrete; also fuzzy clustering algorithm C-means performs better than K-means algorithm.
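
    The two-stage idea described above (cluster first, then fit a regression inside each cluster) can be sketched as follows. This is a simplified illustration, assuming a single predictor, hard k-means clustering and ordinary least squares lines. It is not the authors' implementation, it does not cover the fuzzy C-means variant, and all names and data are hypothetical.

```python
def nearest(x, centres, allowed=None):
    # Index of the closest centre, optionally restricted to some cluster ids.
    ids = range(len(centres)) if allowed is None else allowed
    return min(ids, key=lambda c: abs(x - centres[c]))


def kmeans_1d(values, k, iters=50):
    """Hard k-means on one variable with a deterministic, spread-out start."""
    srt = sorted(values)
    centres = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centres)].append(v)
        new = [sum(g) / len(g) if g else centres[i] for i, g in enumerate(groups)]
        if new == centres:
            break
        centres = new
    return centres


def fit_line(xs, ys):
    """Ordinary least squares slope and intercept for one cluster."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def fit_cluster_regression(xs, ys, k):
    # Stage 1: group similar records. Stage 2: one regression per group.
    centres = kmeans_1d(xs, k)
    models = {}
    for j in range(k):
        pts = [(x, y) for x, y in zip(xs, ys) if nearest(x, centres) == j]
        if pts:
            models[j] = fit_line([p[0] for p in pts], [p[1] for p in pts])
    return centres, models


def predict(x, centres, models):
    slope, intercept = models[nearest(x, centres, allowed=list(models))]
    return slope * x + intercept


# Hypothetical mix variable and response with two distinct regimes.
mix = [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0]
response = [2.0, 4.0, 6.0, 8.0, 40.0, 39.0, 38.0, 37.0]
centres, models = fit_cluster_regression(mix, response, k=2)
```

    A single line fitted to all eight points would miss both regimes, which is the intuition behind the claim that clustering before regression can reduce prediction error.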

  9. Estimating the Concrete Compressive Strength Using Hard Clustering and Fuzzy Clustering Based Regression Techniques

    PubMed Central

    Nagwani, Naresh Kumar; Deo, Shirish V.

    2014-01-01

    Understanding of the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, and proportioning new mixtures and for the quality assurance. Regression techniques are most widely used for prediction tasks where relationship between the independent variables and dependent (prediction) variable is identified. The accuracy of the regression techniques for prediction can be improved if clustering can be used along with regression. Clustering along with regression will ensure the more accurate curve fitting between the dependent and independent variables. In this work cluster regression technique is applied for estimating the compressive strength of the concrete and a novel state of the art is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression ensures less prediction errors for estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group the similar characteristics concrete data and then in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. It is found from experiments that clustering along with regression techniques gives minimum errors for predicting compressive strength of concrete; also fuzzy clustering algorithm C-means performs better than K-means algorithm. PMID:25374939

  10. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia.

    PubMed

    Henrard, S; Speybroeck, N; Hermans, C

    2015-11-01

    Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in details. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
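
    The repeated partitioning at the heart of CART can be sketched for a regression tree with one predictor. The split criterion used here, minimising the summed squared error of the two subgroups, is one common choice and is an assumption, as are the function names, the stopping rules and the data.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, cost) for the split with the lowest summed SSE, or None."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        cost = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if best is None or cost < best[1]:
            best = (thr, cost)
    return best


def grow_tree(xs, ys, min_size=2, depth=0, max_depth=3):
    """Recursively partition the sample; leaves store the subgroup mean."""
    leaf = {"leaf": sum(ys) / len(ys)}
    if len(ys) < 2 * min_size or depth >= max_depth:
        return leaf
    split = best_split(xs, ys)
    if split is None or split[1] >= sse(ys):
        return leaf
    thr = split[0]
    left = [(x, y) for x, y in zip(xs, ys) if x <= thr]
    right = [(x, y) for x, y in zip(xs, ys) if x > thr]
    if len(left) < min_size or len(right) < min_size:
        return leaf
    return {
        "threshold": thr,
        "left": grow_tree([p[0] for p in left], [p[1] for p in left], min_size, depth + 1, max_depth),
        "right": grow_tree([p[0] for p in right], [p[1] for p in right], min_size, depth + 1, max_depth),
    }


def predict(tree, x):
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["leaf"]


# Hypothetical predictor and continuous outcome with a step at x = 3.5.
predictor = [1, 2, 3, 4, 5, 6]
outcome = [5.0, 5.0, 5.0, 9.0, 9.0, 9.0]
tree = grow_tree(predictor, outcome)
```

    The resulting tree reads as a sequence of threshold questions, which is the ease of interpretation the abstract emphasises. A classification tree would follow the same recursion with an impurity measure in place of the squared error.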

  11. Stochastic modelling of temperatures affecting the in situ performance of a solar-assisted heat pump: The multivariate approach and physical interpretation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Loveday, D.L.; Craggs, C.

    Box-Jenkins-based multivariate stochastic modeling is carried out using data recorded from a domestic heating system. The system comprises an air-source heat pump sited in the roof space of a house, solar assistance being provided by the conventional tile roof acting as a radiation absorber. Multivariate models are presented which illustrate the time-dependent relationships between three air temperatures - at external ambient, at entry to, and at exit from, the heat pump evaporator. Using a deterministic modeling approach, physical interpretations are placed on the results of the multivariate technique. It is concluded that the multivariate Box-Jenkins approach is a suitable technique for building thermal analysis. Application to multivariate model-based control is discussed, with particular reference to building energy management systems. It is further concluded that stochastic modeling of data drawn from a short monitoring period offers a means of retrofitting an advanced model-based control system in existing buildings, which could be used to optimize energy savings. An approach to system simulation is suggested.

  12. New strategy for determination of anthocyanins, polyphenols and antioxidant capacity of Brassica oleracea liquid extract using infrared spectroscopies and multivariate regression

    NASA Astrophysics Data System (ADS)

    de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.

    2018-04-01

    A new method was developed to determine the antioxidant properties of red cabbage extract (Brassica oleracea) by mid (MID) and near (NIR) infrared spectroscopies and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9° Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for the building of PLS models. For the first time, this strategy was applied for building multivariate regression models. Reference analyses and spectral data were obtained from diluted extracts. The determinate properties were total and monomeric anthocyanins, total polyphenols and antioxidant capacity by ABTS (2,2-azino-bis(3-ethyl-benzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. PLS-1 models provided more predictive models than did PLS-2 regression. PLS-OPS and PLS-GA models presented excellent prediction results with a correlation coefficient higher than 0.98. However, the best models were obtained using PLS and variable selection with the OPS algorithm and the models based on NIR spectra were considered more predictive for all properties. Then, these models provided a simple, rapid and accurate method for determination of red cabbage extract antioxidant properties and its suitability for use in the food industry.

  13. Determining the response of sea level to atmospheric pressure forcing using TOPEX/POSEIDON data

    NASA Technical Reports Server (NTRS)

    Fu, Lee-Lueng; Pihos, Greg

    1994-01-01

    The static response of sea level to the forcing of atmospheric pressure, the so-called inverted barometer (IB) effect, is investigated using TOPEX/POSEIDON data. This response, characterized by the rise and fall of sea level to compensate for the change of atmospheric pressure at a rate of -1 cm/mbar, is not associated with any ocean currents and hence is normally treated as an error to be removed from sea level observation. Linear regression and spectral transfer function analyses are applied to sea level and pressure to examine the validity of the IB effect. In regions outside the tropics, the regression coefficient is found to be consistently close to the theoretical value except for the regions of western boundary currents, where the mesoscale variability interferes with the IB effect. The spectral transfer function shows near IB response at periods of 30 degrees is -0.84 +/- 0.29 cm/mbar (1 standard deviation). The deviation from -1 cm/mbar is shown to be caused primarily by the effect of wind forcing on sea level, based on a multivariate linear regression model involving both pressure and wind forcing. The regression coefficient for pressure resulting from the multivariate analysis is -0.96 +/- 0.32 cm/mbar. In the tropics the multivariate analysis fails because sea level in the tropics is primarily responding to remote wind forcing. However, after removing from the data the wind-forced sea level estimated by a dynamic model of the tropical Pacific, the pressure regression coefficient improves from -1.22 +/- 0.69 cm/mbar to -0.99 +/- 0.46 cm/mbar, clearly revealing an IB response. The result of the study suggests that with a proper removal of the effect of wind forcing the IB effect is valid in most of the open ocean at periods longer than 20 days and spatial scales larger than 500 km.
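
    The point that a pressure-only regression can deviate from the theoretical coefficient when wind forcing is correlated with pressure can be illustrated with a small sketch. The series below are synthetic and constructed so that sea level equals exactly -1.0 times pressure plus 0.5 times a wind term; the 0.5 value, the variable names and the data are assumptions chosen only for illustration.

```python
def mean(v):
    return sum(v) / len(v)


def simple_slope(x, y):
    """Least squares slope of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)


def fit_two_predictors(x1, x2, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 via the 2x2 normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are collinear")
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return my - b1 * m1 - b2 * m2, b1, b2


# Synthetic anomalies: pressure (mbar), a wind-forcing term, sea level (cm).
pressure = [-4.0, -2.0, 0.0, 2.0, 4.0, 1.0]
wind = [1.0, -1.0, 2.0, 0.0, -2.0, 3.0]
sea_level = [-1.0 * p + 0.5 * w for p, w in zip(pressure, wind)]

pressure_only = simple_slope(pressure, sea_level)
intercept, b_pressure, b_wind = fit_two_predictors(pressure, wind, sea_level)
```

    In this synthetic case the two-predictor fit recovers the -1 cm/mbar pressure coefficient, while the pressure-only slope is pulled away from it by the correlated wind term.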

  14. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression.

    PubMed

    Delwiche, Stephen R; Reeves, James B

    2010-01-01

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
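
    The Savitzky-Golay preprocess functions mentioned above are fixed convolution filters. As a minimal sketch, the smallest case in the stated range (5 points, quadratic polynomial) is shown below with its standard smoothing and first-derivative coefficients. Unit spacing between spectral points is assumed, edge points are simply dropped, and the function name and test signal are hypothetical.

```python
# Standard 5-point Savitzky-Golay coefficients for a quadratic polynomial.
SMOOTH_5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
DERIV1_5 = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]


def convolve_centered(signal, coeffs):
    """Apply a centrally symmetric window; the first and last half-window points are dropped."""
    half = len(coeffs) // 2
    out = []
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        out.append(sum(c * v for c, v in zip(coeffs, window)))
    return out


# A noise-free quadratic "spectrum" sampled at x = 0..6 (unit spacing).
spectrum = [float(x * x) for x in range(7)]
smoothed = convolve_centered(spectrum, SMOOTH_5)    # values at x = 2, 3, 4
derivative = convolve_centered(spectrum, DERIV1_5)  # slopes at x = 2, 3, 4
```

    On a noise-free quadratic the smoother returns the input unchanged and the derivative filter returns the exact slope, which is a quick sanity check before applying wider windows to real spectra.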

  15. Total body weight loss of ≥ 10 % is associated with improved hepatic fibrosis in patients with nonalcoholic steatohepatitis.

    PubMed

    Glass, Lisa M; Dickson, Rolland C; Anderson, Joseph C; Suriawinata, Arief A; Putra, Juan; Berk, Brian S; Toor, Arifa

    2015-04-01

    Given the rising epidemics of obesity and metabolic syndrome, nonalcoholic steatohepatitis (NASH) is now the most common cause of liver disease in the developed world. Effective treatment for NASH, either to reverse or prevent the progression of hepatic fibrosis, is currently lacking. To define the predictors associated with improved hepatic fibrosis in NASH patients undergoing serial liver biopsies at prolonged biopsy interval. This is a cohort study of 45 NASH patients undergoing serial liver biopsies for clinical monitoring in a tertiary care setting. Biopsies were scored using the NASH Clinical Research Network guidelines. Fibrosis regression was defined as improvement in fibrosis score ≥1 stage. Univariate analysis utilized Fisher's exact or Student's t test. Multivariate regression models determined independent predictors for regression of fibrosis. Forty-five NASH patients with biopsies collected at a mean interval of 4.6 years (±1.4) were included. The mean initial fibrosis stage was 1.96, two patients had cirrhosis and 12 patients (26.7 %) underwent bariatric surgery. There was a significantly higher rate of fibrosis regression among patients who lost ≥10 % total body weight (TBW) (63.2 vs. 9.1 %; p = 0.001) and who underwent bariatric surgery (47.4 vs. 4.5 %; p = 0.003). Factors such as age, gender, glucose intolerance, elevated ferritin, and A1AT heterozygosity did not influence fibrosis regression. On multivariate analysis, only weight loss of ≥10 % TBW predicted fibrosis regression [OR 8.14 (CI 1.08-61.17)]. Results indicate that regression of fibrosis in NASH is possible, even in advanced stages. Weight loss of ≥10 % TBW predicts fibrosis regression.

  16. Applications of fluorescence spectroscopy for predicting percent wastewater in an urban stream

    USGS Publications Warehouse

    Goldman, Jami H.; Rounds, Stewart A.; Needoba, Joseph A.

    2012-01-01

    Dissolved organic carbon (DOC) is a significant organic carbon reservoir in many ecosystems, and its characteristics and sources determine many aspects of ecosystem health and water quality. Fluorescence spectroscopy methods can quantify and characterize the subset of the DOC pool that can absorb and re-emit electromagnetic energy as fluorescence and thus provide a rapid technique for environmental monitoring of DOC in lakes and rivers. Using high resolution fluorescence techniques, we characterized DOC in the Tualatin River watershed near Portland, Oregon, and identified fluorescence parameters associated with effluent from two wastewater treatment plants and samples from sites within and outside the urban region. Using a variety of statistical approaches, we developed and validated a multivariate linear regression model to predict the amount of wastewater in the river as a function of the relative abundance of specific fluorescence excitation/emission pairs. The model was tested with independent data and predicts the percentage of wastewater in a sample within 80% confidence. Model results can be used to develop in situ instrumentation, inform monitoring programs, and develop additional water quality indicators for aquatic systems.

  17. Gene set analysis using variance component tests.

    PubMed

    Huang, Yen-Tsung; Lin, Xihong

    2013-06-28

    Gene set analyses have become increasingly important in genomic research, as many complex diseases are contributed jointly by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analyses method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and global test in both simulation and a diabetes microarray data.

  18. Molecular Classification Substitutes for the Prognostic Variables Stage, Age, and MYCN Status in Neuroblastoma Risk Assessment.

    PubMed

    Rosswog, Carolina; Schmidt, Rene; Oberthuer, André; Juraeva, Dilafruz; Brors, Benedikt; Engesser, Anne; Kahlert, Yvonne; Volland, Ruth; Bartenhagen, Christoph; Simon, Thorsten; Berthold, Frank; Hero, Barbara; Faldum, Andreas; Fischer, Matthias

    2017-12-01

    Current risk stratification systems for neuroblastoma patients consider clinical, histopathological, and genetic variables, and additional prognostic markers have been proposed in recent years. We here sought to select highly informative covariates in a multistep strategy based on consecutive Cox regression models, resulting in a risk score that integrates hazard ratios of prognostic variables. A cohort of 695 neuroblastoma patients was divided into a discovery set (n=75) for multigene predictor generation, a training set (n=411) for risk score development, and a validation set (n=209). Relevant prognostic variables were identified by stepwise multivariable L1-penalized least absolute shrinkage and selection operator (LASSO) Cox regression, followed by backward selection in multivariable Cox regression, and then integrated into a novel risk score. The variables stage, age, MYCN status, and two multigene predictors, NB-th24 and NB-th44, were selected as independent prognostic markers by LASSO Cox regression analysis. Following backward selection, only the multigene predictors were retained in the final model. Integration of these classifiers in a risk scoring system distinguished three patient subgroups that differed substantially in their outcome. The scoring system discriminated patients with diverging outcome in the validation cohort (5-year event-free survival, 84.9±3.4 vs 63.6±14.5 vs 31.0±5.4; P<.001), and its prognostic value was validated by multivariable analysis. We here propose a translational strategy for developing risk assessment systems based on hazard ratios of relevant prognostic variables. Our final neuroblastoma risk score comprised two multigene predictors only, supporting the notion that molecular properties of the tumor cells strongly impact clinical courses of neuroblastoma patients. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  19. Future Performance Trend Indicators: A Current Value Approach to Human Resources Accounting. Report III. Multivariate Predictions of Organizational Performance Across Time.

    ERIC Educational Resources Information Center

    Pecorella, Patricia A.; Bowers, David G.

    Multiple regression in a double cross-validated design was used to predict two performance measures (total variable expense and absence rate) by multi-month period in five industrial firms. The regressions do cross-validate, and produce multiple coefficients which display both concurrent and predictive effects, peaking 18 months to two years…

  20. Simultaneous determination of estrogens (ethinylestradiol and norgestimate) concentrations in human and bovine serum albumin by use of fluorescence spectroscopy and multivariate regression analysis.

    PubMed

    Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O

    2016-05-15

    The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined utility of a steady-state fluorescence spectroscopy and multivariate partial-least-square (PLS) regression analysis for the simultaneous determination of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) concentrations in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA, EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in an increase in emission characteristics of HSA and BSA and a significant blue spectral shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescent data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10^-8 M for EE and 2.4×10^-7 M for NOR and good linearity (R^2>0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological condition. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. High accuracy, low sensitivity, simplicity, and low cost with no prior analyte extraction or separation required make this method a promising, compelling, and attractive alternative for the rapid determination of estrogen concentrations in biomedical and biological specimens, pharmaceuticals, or environmental samples. Published by Elsevier B.V.

  1. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. 
    The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.
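
    To illustrate how a fitted logistic regression model turns basin predictors into a debris-flow probability, here is a minimal sketch using two of the variables the abstract reports as significant in all models (percentage of high burn severity and 3-hour peak rainfall intensity). The function name, the units, and every coefficient are hypothetical placeholders; the report's fitted values are not given in the abstract.

```python
import math

def debris_flow_probability(burn_severity_pct, peak_rain_3h_mm_per_h,
                            intercept=-4.0, b_burn=0.03, b_rain=0.25):
    """Probability from a logistic model; all coefficients are hypothetical."""
    linear = intercept + b_burn * burn_severity_pct + b_rain * peak_rain_3h_mm_per_h
    return 1.0 / (1.0 + math.exp(-linear))

# Two made-up basins: lightly burned with modest rain vs. severely burned with heavy rain.
low = debris_flow_probability(20.0, 5.0)
high = debris_flow_probability(80.0, 15.0)
print(f"low-risk basin: {low:.2f}, high-risk basin: {high:.2f}")
```

    Evaluating such a function for every basin in a burned area is what would allow the probabilities to be mapped in a geographic information system.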

  2. Analysis and compensation for the effect of the catheter position on image intensities in intravascular optical coherence tomography

    NASA Astrophysics Data System (ADS)

    Liu, Shengnan; Eggermont, Jeroen; Wolterbeek, Ron; Broersen, Alexander; Busk, Carol A. G. R.; Precht, Helle; Lelieveldt, Boudewijn P. F.; Dijkstra, Jouke

    2016-12-01

    Intravascular optical coherence tomography (IVOCT) is an imaging technique that is used to analyze the underlying cause of cardiovascular disease. Because a catheter is used during imaging, the intensities can be affected by the catheter position. This work aims to analyze the effect of the catheter position on IVOCT image intensities and to propose a compensation method to minimize this effect in order to improve the visualization and the automatic analysis of IVOCT images. The effect of catheter position is modeled with respect to the distance between the catheter and the arterial wall (distance-dependent factor) and the incident angle onto the arterial wall (angle-dependent factor). A light transmission model incorporating both factors is introduced. On the basis of this model, the interaction effect of both factors is estimated with a hierarchical multivariant linear regression model. Statistical analysis shows that IVOCT intensities are significantly affected by both factors with p<0.001, as either aspect increases the intensity decreases. This effect differs for different pullbacks. The regression results were used to compensate for this effect. Experiments show that the proposed compensation method can improve the performance of the automatic bioresorbable vascular scaffold strut detection.

  3. Comparative study of outcome measures and analysis methods for traumatic brain injury trials.

    PubMed

    Alali, Aziz S; Vavrek, Darcy; Barber, Jason; Dikmen, Sureyya; Nathens, Avery B; Temkin, Nancy R

    2015-04-15

    Batteries of functional and cognitive measures have been proposed as alternatives to the Extended Glasgow Outcome Scale (GOSE) as the primary outcome for traumatic brain injury (TBI) trials. We evaluated several approaches to analyzing GOSE and a battery of four functional and cognitive measures. Using data from a randomized trial, we created a "super" dataset of 16,550 subjects from patients with complete data (n=331) and then simulated multiple treatment effects across multiple outcome measures. Patients were sampled with replacement (bootstrapping) to generate 10,000 samples for each treatment effect (n=400 patients/group). The percentage of samples where the null hypothesis was rejected estimates the power. All analytic techniques had appropriate rates of type I error (≤5%). Accounting for baseline prognosis either by using sliding dichotomy for GOSE or using regression-based methods substantially increased the power over the corresponding analysis without accounting for prognosis. Analyzing GOSE using multivariate proportional odds regression or analyzing the four-outcome battery with regression-based adjustments had the highest power, assuming equal treatment effect across all components. Analyzing GOSE using a fixed dichotomy provided the lowest power for both unadjusted and regression-adjusted analyses. We assumed an equal treatment effect for all measures. This may not be true in an actual clinical trial. Accounting for baseline prognosis is critical to attaining high power in Phase III TBI trials. The choice of primary outcome for future trials should be guided by power, the domain of brain function that an intervention is likely to impact, and the feasibility of collecting outcome data.
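
    The resampling logic described above (sample patients with replacement, test the null hypothesis in each sample, and take the rejection rate as the power) can be sketched as follows. This is a simplified illustration, not the authors' procedure: the outcome pool is synthetic, the treatment effect is a simple additive shift, the test is a two-sided z-test on means, and the sample counts are reduced to keep it fast.

```python
import random
import statistics

def estimate_power(pool, effect, n_per_group=100, n_samples=500, seed=0):
    """Fraction of bootstrap samples in which a two-sided z-test rejects at alpha = 0.05."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_samples):
        # Sample patients with replacement; shift the treated arm by the simulated effect.
        control = [rng.choice(pool) for _ in range(n_per_group)]
        treated = [rng.choice(pool) + effect for _ in range(n_per_group)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        se = (statistics.variance(treated) / n_per_group
              + statistics.variance(control) / n_per_group) ** 0.5
        if abs(diff / se) > 1.96:
            rejections += 1
    return rejections / n_samples

# Hypothetical outcome pool standing in for the trial data (not the study's values).
pool_rng = random.Random(42)
pool = [pool_rng.gauss(0.0, 1.0) for _ in range(331)]

type_one_error = estimate_power(pool, effect=0.0)   # rejection rate with no true effect
power = estimate_power(pool, effect=0.4)            # rejection rate with a simulated effect
print(f"type I error ~ {type_one_error:.3f}, power ~ {power:.3f}")
```

    With a zero effect the rejection rate approximates the type I error, which is how the appropriateness of each analytic technique can be checked before comparing power.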

  4. Comparative Study of Outcome Measures and Analysis Methods for Traumatic Brain Injury Trials

    PubMed Central

    Alali, Aziz S.; Vavrek, Darcy; Barber, Jason; Dikmen, Sureyya; Nathens, Avery B.

    2015-01-01

    Batteries of functional and cognitive measures have been proposed as alternatives to the Extended Glasgow Outcome Scale (GOSE) as the primary outcome for traumatic brain injury (TBI) trials. We evaluated several approaches to analyzing GOSE and a battery of four functional and cognitive measures. Using data from a randomized trial, we created a “super” dataset of 16,550 subjects from patients with complete data (n=331) and then simulated multiple treatment effects across multiple outcome measures. Patients were sampled with replacement (bootstrapping) to generate 10,000 samples for each treatment effect (n=400 patients/group). The percentage of samples where the null hypothesis was rejected estimates the power. All analytic techniques had appropriate rates of type I error (≤5%). Accounting for baseline prognosis either by using sliding dichotomy for GOSE or using regression-based methods substantially increased the power over the corresponding analysis without accounting for prognosis. Analyzing GOSE using multivariate proportional odds regression or analyzing the four-outcome battery with regression-based adjustments had the highest power, assuming equal treatment effect across all components. Analyzing GOSE using a fixed dichotomy provided the lowest power for both unadjusted and regression-adjusted analyses. We assumed an equal treatment effect for all measures. This may not be true in an actual clinical trial. Accounting for baseline prognosis is critical to attaining high power in Phase III TBI trials. The choice of primary outcome for future trials should be guided by power, the domain of brain function that an intervention is likely to impact, and the feasibility of collecting outcome data. PMID:25317951

  5. Automated processing of label-free Raman microscope images of macrophage cells with standardized regression for high-throughput analysis.

    PubMed

    Milewski, Robert J; Kumagai, Yutaro; Fujita, Katsumasa; Standley, Daron M; Smith, Nicholas I

    2010-11-19

    Macrophages represent the front lines of our immune system; they recognize and engulf pathogens or foreign particles thus initiating the immune response. Imaging macrophages presents unique challenges, as most optical techniques require labeling or staining of the cellular compartments in order to resolve organelles, and such stains or labels have the potential to perturb the cell, particularly in cases where incomplete information exists regarding the precise cellular reaction under observation. Label-free imaging techniques such as Raman microscopy are thus valuable tools for studying the transformations that occur in immune cells upon activation, both on the molecular and organelle levels. Due to extremely low signal levels, however, Raman microscopy requires sophisticated image processing techniques for noise reduction and signal extraction. To date, efficient, automated algorithms for resolving sub-cellular features in noisy, multi-dimensional image sets have not been explored extensively. We show that hybrid z-score normalization and standard regression (Z-LSR) can highlight the spectral differences within the cell and provide image contrast dependent on spectral content. In contrast to typical Raman imaging processing methods using multivariate analysis, such as singular value decomposition (SVD), our implementation of the Z-LSR method can operate nearly in real-time. In spite of its computational simplicity, Z-LSR can automatically remove background and bias in the signal, improve the resolution of spatially distributed spectral differences and enable sub-cellular features to be resolved in Raman microscopy images of mouse macrophage cells. Significantly, the Z-LSR processed images automatically exhibited subcellular architectures whereas SVD, in general, requires human assistance in selecting the components of interest. The computational efficiency of Z-LSR enables automated resolution of sub-cellular features in large Raman microscopy data sets without compromise in image quality or information loss in associated spectra. These results motivate further use of label free microscopy techniques in real-time imaging of live immune cells.

  6. Negative Events in Childhood Predict Trajectories of Internalizing Symptoms Up to Young Adulthood: An 18-Year Longitudinal Study

    PubMed Central

    Melchior, Maria; Touchette, Évelyne; Prokofyeva, Elena; Chollet, Aude; Fombonne, Eric; Elidemir, Gulizar; Galéra, Cédric

    2014-01-01

    Background: Common negative events can precipitate the onset of internalizing symptoms. We studied whether their occurrence in childhood is associated with mental health trajectories over the course of development. Methods: Using data from the TEMPO study, a French community-based cohort study of youths, we studied the association between negative events in 1991 (when participants were aged 4–16 years) and internalizing symptoms, assessed by the ASEBA family of instruments in 1991, 1999, and 2009 (n = 1503). Participants' trajectories of internalizing symptoms were estimated with semi-parametric regression methods (PROC TRAJ). Data were analyzed using multinomial regression models controlled for participants' sex, age, parental family status, socio-economic position, and parental history of depression. Results: Negative childhood events were associated with an increased likelihood of concurrent internalizing symptoms which sometimes persisted into adulthood (multivariate ORs associated with ≥3 negative events, respectively: high and decreasing internalizing symptoms: 5.54, 95% CI: 3.20–9.58; persistently high internalizing symptoms: 8.94, 95% CI: 2.82–28.31). Specific negative events most strongly associated with youths' persistent internalizing symptoms included: school difficulties (multivariate OR: 5.31, 95% CI: 2.24–12.59), parental stress (multivariate OR: 4.69, 95% CI: 2.02–10.87), serious illness/health problems (multivariate OR: 4.13, 95% CI: 1.76–9.70), and social isolation (multivariate OR: 2.24, 95% CI: 1.00–5.08). Conclusions: Common negative events can contribute to the onset of children's lasting psychological difficulties. PMID:25485875

  7. Application of quality by design concepts in the development of fluidized bed granulation and tableting processes.

    PubMed

    Djuris, Jelena; Medarevic, Djordje; Krstic, Marko; Djuric, Zorica; Ibric, Svetlana

    2013-06-01

    This study illustrates the application of experimental design and multivariate data analysis in defining design space for granulation and tableting processes. According to the quality by design concepts, critical quality attributes (CQAs) of granules and tablets, as well as critical parameters of granulation and tableting processes, were identified and evaluated. Acetaminophen was used as the model drug, and one of the study aims was to investigate the possibility of the development of immediate- or extended-release acetaminophen tablets. Granulation experiments were performed in the fluid bed processor using polyethylene oxide polymer as a binder in the direct granulation method. Tablets were compressed in the laboratory excenter tablet press. The first set of experiments was organized according to Plackett-Burman design, followed by the full factorial experimental design. Principal component analysis and partial least squares regression were applied as the multivariate analysis techniques. By using these different methods, CQAs and process parameters were identified and quantified. Furthermore, an in-line method was developed to monitor the temperature during the fluidized bed granulation process, to foresee possible defects in granules CQAs. Various control strategies that are based on the process understanding and assure desired quality attributes of the product are proposed. Copyright © 2013 Wiley Periodicals, Inc.

  8. Prediction of beef color using time-domain nuclear magnetic resonance (TD-NMR) relaxometry data and multivariate analyses.

    PubMed

    Moreira, Luiz Felipe Pompeu Prado; Ferrari, Adriana Cristina; Moraes, Tiago Bueno; Reis, Ricardo Andrade; Colnago, Luiz Alberto; Pereira, Fabíola Manhas Verbi

    2016-05-19

    Time-domain nuclear magnetic resonance and chemometrics were used to predict color parameters, such as lightness (L*), redness (a*), and yellowness (b*) of beef (Longissimus dorsi muscle) samples. Analyzing the relaxation decays with multivariate models performed with partial least-squares regression, color quality parameters were predicted. The partial least-squares models showed low errors independent of the sample size, indicating the potentiality of the method. Minced procedure and weighing were not necessary to improve the predictive performance of the models. The reduction of transverse relaxation time (T2) measured by Carr-Purcell-Meiboom-Gill pulse sequence in darker beef in comparison with lighter ones can be explained by the lower relaxivity Fe2+ present in deoxymyoglobin and oxymyoglobin (red beef) to the higher relaxivity of Fe3+ present in metmyoglobin (brown beef). These results point that time-domain nuclear magnetic resonance spectroscopy can become a useful tool for quality assessment of beef cattle on bulk of the sample and through-packages, because this technique is also widely applied to measure sensorial parameters, such as flavor, juiciness and tenderness, and physicochemical parameters, cooking loss, fat and moisture content, and instrumental tenderness using Warner Bratzler shear force. Copyright © 2016 John Wiley & Sons, Ltd.

  9. Evaluating the "cushion effect" among children in frontal motor vehicle crashes.

    PubMed

    Harbaugh, Calista M; Zhang, Peng; Henderson, Brianna; Derstine, Brian A; Holcombe, Sven A; Wang, Stewart C; Kohoyda-Inglis, Carla; Ehrlich, Peter F

    2018-05-01

    The "Cushion Effect," the phenomenon in which obesity protects against abdominal injury in adults in motor vehicle accidents, has not been evaluated among pediatric patients. This work evaluates the association between subcutaneous fat cross-sectional area, quantified using analytic morphomic techniques, and abdominal injury. This retrospective study includes 119 patients aged 1 to 18 years involved in frontal impact motor vehicle accidents (2003-2015) with computed tomography scans. Subcutaneous fat cross-sectional area was measured and converted to age- and gender-adjusted percentiles from population-based normative data. Multivariable analysis determined the risk of the primary outcome, Maximum Abbreviated Injury Scale (MAIS) 2+ abdominal injury, after adjusting for age, weight, seatbelt status, and impact rating. MAIS 2+ abdominal injuries occurred in 20 (16.8%) of the patients. Subcutaneous fat area percentile was not significantly associated with MAIS 2+ abdominal injury on multivariable logistic regression (adjusted Odds Ratio, 0.86; 95% CI, 0.72-1.03; p=0.10). The "cushion effect" was not apparent among pediatric frontal motor vehicle crash victims in this study. Future work is needed to investigate other analytic morphomic measures. By understanding how body composition relates to injury patterns, there is a unique opportunity to improve vehicle safety design. Prognosis Study, Level III. Copyright © 2018. Published by Elsevier Inc.

  10. Development of an accelerometer-based multivariate model to predict free-living energy expenditure in a large military cohort.

    PubMed

    Horner, Fleur; Bilzon, James L; Rayson, Mark; Blacker, Sam; Richmond, Victoria; Carter, James; Wright, Anthony; Nevill, Alan

    2013-01-01

    This study developed a multivariate model to predict free-living energy expenditure (EE) in independent military cohorts. Two hundred and eighty-eight individuals (20.6 ± 3.9 years, 67.9 ± 12.0 kg, 1.71 ± 0.10 m) from 10 cohorts wore accelerometers during observation periods of 7 or 10 days. Accelerometer counts (PAC) were recorded at 1-minute epochs. Total energy expenditure (TEE) and physical activity energy expenditure (PAEE) were derived using the doubly labelled water technique. Data were reduced to n = 155 based on wear-time. Associations between PAC and EE were assessed using allometric modelling. Models were derived using multiple log-linear regression analysis and gender differences assessed using analysis of covariance. In all models PAC, height and body mass were related to TEE (P < 0.01). For models predicting TEE (r^2 = 0.65, SE = 462 kcal · d^-1 (13.0%)), PAC explained 4% of the variance. For models predicting PAEE (r^2 = 0.41, SE = 490 kcal · d^-1 (32.0%)), PAC accounted for 6% of the variance. Accelerometry increases the accuracy of EE estimation in military populations. However, the unique nature of military life means accurate prediction of individual free-living EE is highly dependent on anthropometric measurements.
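
    Allometric models are usually fitted by taking logarithms, which turns a power law y = a·x^b into a straight line, ln y = ln a + b·ln x. The sketch below shows that idea for a single predictor only; the study used multiple log-linear regression with PAC, height and body mass, and none of the numbers here come from it. The data are synthetic and the function name is hypothetical.

```python
import math
import random

def fit_power_law(x, y):
    """Least-squares fit of ln(y) = ln(a) + b * ln(x); returns (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    sxx = sum((u - mean_x) ** 2 for u in lx)
    b = sxy / sxx                          # slope on the log-log scale = exponent
    a = math.exp(mean_y - b * mean_x)      # back-transformed intercept = scale factor
    return a, b

# Synthetic data: body mass (kg) and an energy-expenditure-like outcome generated
# from a known power law with multiplicative noise. Purely illustrative values.
rng = random.Random(1)
mass = [rng.uniform(50.0, 95.0) for _ in range(155)]
energy = [70.0 * m ** 0.75 * math.exp(rng.gauss(0.0, 0.05)) for m in mass]

a_hat, b_hat = fit_power_law(mass, energy)
print(f"a = {a_hat:.1f}, b = {b_hat:.3f}")
```

    Extending this to several predictors means regressing ln y on the logarithm of each predictor at once, which is the "multiple log-linear regression" the abstract refers to.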

  11. Perception of control, coping and psychological stress of infertile women undergoing IVF.

    PubMed

    Gourounti, Kleanthi; Anagnostopoulos, Fotios; Potamianos, Grigorios; Lykeridou, Katerina; Schmidt, Lone; Vaslamatzis, Grigorios

    2012-06-01

    The study aimed to examine: (i) the association between perception of infertility controllability and coping strategies; and (ii) the association between perception of infertility controllability and coping strategies to psychological distress, applying multivariate statistical techniques to control for the effects of demographic variables. This cross-sectional study included 137 women with fertility problems undergoing IVF in a public hospital. All participants completed questionnaires that measured fertility-related stress, state anxiety, depressive symptomatology, perception of control and coping strategies. Pearson's correlation coefficients were calculated between all study variables, followed by hierarchical multiple linear regression. Low perception of personal and treatment controllability was associated with frequent use of avoidance coping and high perception of treatment controllability was positively associated with problem-focused coping. Multivariate analysis showed that, when controlling for demographic factors, low perception of personal control and avoidance coping were positively associated with fertility-related stress and state anxiety, and problem-appraisal coping was negatively and significantly associated with fertility-related stress and depressive symptomatology scores. The findings of this study merit the understanding of the role of control perception and coping in psychological stress of infertile women to identify beforehand those women who might be at risk of experiencing high stress and in need of support. Copyright © 2012 Reproductive Healthcare Ltd. Published by Elsevier Ltd. All rights reserved.

  12. [Retrospective statistical analysis of clinical factors of recurrence in chronic subdural hematoma: correlation between univariate and multivariate analysis].

    PubMed

    Takayama, Motoharu; Terui, Keita; Oiwa, Yoshitsugu

    2012-10-01

    Chronic subdural hematoma is common in elderly individuals and surgical procedures are simple. The recurrence rate of chronic subdural hematoma, however, varies from 9.2 to 26.5% after surgery. The authors studied factors of the recurrence using univariate and multivariate analyses in patients with chronic subdural hematoma. We retrospectively reviewed 239 consecutive cases of chronic subdural hematoma who received burr-hole surgery with irrigation and closed-system drainage. We analyzed the relationships between recurrence of chronic subdural hematoma and factors such as sex, age, laterality, bleeding tendency, other complicated diseases, density on CT, volume of the hematoma, residual air in the hematoma cavity, and use of artificial cerebrospinal fluid. Twenty-one patients (8.8%) experienced a recurrence of chronic subdural hematoma. Multiple logistic regression found that the recurrence rate was higher in patients with a large volume of the residual air, and was lower in patients using artificial cerebrospinal fluid. No statistical differences were found in bleeding tendency. Techniques to reduce the air in the hematoma cavity are important for good outcome in surgery of chronic subdural hematoma. Also, the use of artificial cerebrospinal fluid reduces recurrence of chronic subdural hematoma. The surgical procedures can be the same for patients with bleeding tendencies.

  13. Predictors of the Perception of Smoking Health Risks in Smokers With or Without Schizophrenia.

    PubMed

    Kowalczyk, William J; Wehring, Heidi J; Burton, George; Raley, Heather; Feldman, Stephanie; Heishman, Stephen J; Kelly, Deanna L

    2017-01-01

    This study sought to examine the predictors of health risk perception in smokers with or without schizophrenia. The health risk subscale from the Smoking Consequences Questionnaire was dichotomized and used to measure health risk perception in smokers with (n = 67) and without schizophrenia (n = 100). A backward stepwise logistic regression was conducted using variables associated at the bivariate level to determine multivariate predictors. Overall, 62.5% of smokers without schizophrenia and 40.3% of smokers with schizophrenia completely recognize the health risks of smoking (p ≤ .01). Multivariate predictors for smokers without schizophrenia included: sex (Exp (B) = .3; p < .05), Smoking Consequences Questionnaire state enhancement (Exp (B) = .69; p < .01), and craving relief (Exp (B) = 1.8; p < .01). Among smokers with schizophrenia, predictors were education (Exp (B) = .7; p < .05), nicotine dependence (Exp (B) = .5; p < .01), motivation to quit (Exp (B) = 1.8; p < .01), and Smoking Consequences Questionnaire craving relief (Exp (B) = 1.8; p < .01). There was overlap and differences between predictors in smokers with and without schizophrenia. Commonly used techniques for education on the health consequences of cigarettes may work in smokers with schizophrenia, but intervention efforts specifically tailored to smokers with schizophrenia might be more efficacious.

  14. Predictors of the Perception of Smoking Health Risks in Smokers With or Without Schizophrenia

    PubMed Central

    Kowalczyk, William J.; Wehring, Heidi J.; Burton, George; Raley, Heather; Feldman, Stephanie; Heishman, Stephen J.; Kelly, Deanna L.

    2017-01-01

    Objective: This study sought to examine the predictors of health risk perception in smokers with or without schizophrenia. Methods: The health risk subscale from the Smoking Consequences Questionnaire was dichotomized and used to measure health risk perception in smokers with (n = 67) and without schizophrenia (n = 100). A backward stepwise logistic regression was conducted using variables associated at the bivariate level to determine multivariate predictors. Results: Overall, 62.5% of smokers without schizophrenia and 40.3% of smokers with schizophrenia completely recognize the health risks of smoking (p ≤ .01). Multivariate predictors for smokers without schizophrenia included: sex (Exp (B) = .3; p < .05), Smoking Consequences Questionnaire state enhancement (Exp (B) = .69; p < .01), and craving relief (Exp (B) = 1.8; p < .01). Among smokers with schizophrenia, predictors were education (Exp (B) = .7; p < .05), nicotine dependence (Exp (B) = .5; p < .01), motivation to quit (Exp (B) = 1.8; p < .01), and Smoking Consequences Questionnaire craving relief (Exp (B) = 1.8; p < .01). Conclusions: There was overlap and differences between predictors in smokers with and without schizophrenia. Commonly used techniques for education on the health consequences of cigarettes may work in smokers with schizophrenia, but intervention efforts specifically tailored to smokers with schizophrenia might be more efficacious. PMID:27858591

  15. Ripening-dependent metabolic changes in the volatiles of pineapple (Ananas comosus (L.) Merr.) fruit: II. Multivariate statistical profiling of pineapple aroma compounds based on comprehensive two-dimensional gas chromatography-mass spectrometry.

    PubMed

    Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg

    2015-03-01

    Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.

  16. Postoperative pain after one-visit root-canal treatment on teeth with vital pulps: Comparison of three different obturation technique

    PubMed Central

    Alonso-Ezpeleta, Luis O.; Gasco-Garcia, Carmen; Castellanos-Cosano, Lizett; Martín-González, Jenifer; López-Frías, Francsico J.

    2012-01-01

    Objectives. To investigate and compare postoperative pain after one-visit root canal treatment (RCT) on teeth with vital pulps using three different obturation techniques. Study Design. Two hundred and four patients (105 men and 99 women) aged 12 to 77 years were randomly assigned into three treatment groups: cold lateral compaction of gutta-percha (LC), Thermafil technique (TT), and Backfill - Thermafil obturation technique (BT). Postoperative pain was recorded on a visual analogue scale (VAS) of 0 - 10 after 2 and 6 hours, and 1, 2, 3, 4, 5, 6 and 7 days. Data were statistically analyzed using multivariate logistic regression analysis. Results. In the total sample, 87% of patients experienced discomfort or pain at some point between RCT and the seventh day. The discomfort experienced was weak, light, moderate and intense in 6%, 44%, 20% and 6% of the cases, respectively. Mean pain levels were 0.4 ± 0.4, 0.4 ± 0.3, and 1.4 ± 0.7 in LC, BT, and TT groups, respectively. Patients of the TT group experienced a significantly higher mean pain level compared to the other two groups (p < 0.0001). In the TT group, all patients felt some level of pain at six hours after RCT. Conclusions. Postoperative pain was significantly associated with the obturation technique used during root canal treatment. Patients whose teeth were filled with Thermafil obturators (TT technique) showed significantly higher levels of discomfort than patients whose teeth were filled using either of the other two techniques. Key words: Postoperative pain, root-canal obturation, root-canal treatment, Thermafil. PMID:22322522

  17. qFeature

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2015-09-14

    This package contains statistical routines for extracting features from multivariate time-series data, which can then be used for subsequent multivariate statistical analysis to identify patterns and anomalous behavior. It calculates local linear or quadratic regression model fits to moving windows for each series and then summarizes the model coefficients across user-defined time intervals for each series. These methods are domain agnostic, but they have been successfully applied to a variety of domains, including commercial aviation and electric power grid data.
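
    The moving-window idea described in this record can be illustrated with a minimal sketch. This is not the qFeature code or its API: the function names (`linear_fit`, `moving_window_slopes`, `summarize`), the centered window, the use of the mean as the summary statistic, and the toy series are all assumptions made for illustration, and only the linear (not quadratic) fit is shown.

```python
from statistics import mean


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def moving_window_slopes(series, half_width):
    """Fit a local line in a centered window around each interior point.

    Returns the fitted slope for every position that has a full window.
    """
    xs = list(range(-half_width, half_width + 1))
    slopes = []
    for i in range(half_width, len(series) - half_width):
        window = series[i - half_width : i + half_width + 1]
        _, b = linear_fit(xs, window)
        slopes.append(b)
    return slopes


def summarize(values, interval):
    """Mean of the coefficients over consecutive, non-overlapping intervals."""
    return [mean(values[j : j + interval]) for j in range(0, len(values), interval)]


# Hypothetical series: a linear ramp followed by a plateau.
series = [0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0]
slopes = moving_window_slopes(series, half_width=1)
features = summarize(slopes, interval=3)
print(slopes)
print(features)
```

    In this sketch the per-interval mean slope acts as the extracted feature: it is close to 1 on the ramp and close to 0 on the plateau, which is the kind of summary a downstream multivariate analysis could consume.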

  18. Modelling lecturer performance index of private university in Tulungagung by using survival analysis with multivariate adaptive regression spline

    NASA Astrophysics Data System (ADS)

    Hasyim, M.; Prastyo, D. D.

    2018-03-01

    Survival analysis models the relationship between independent variables and survival time as the dependent variable. In practice, not all survival data can be recorded completely, for various reasons. In such situations, the data are called censored data. Moreover, several models for survival analysis require assumptions. One approach in survival analysis is nonparametric, which relaxes these assumptions. In this research, the nonparametric approach employed is Multivariate Adaptive Regression Splines (MARS). This study aims to measure the performance of private university lecturers. The survival time in this study is the duration needed by a lecturer to obtain their professional certificate. The results show that research activity is a significant factor, along with developing course materials, good publications in international or national journals, and activities in research collaboration.

  19. The association between a body shape index and cardiovascular risk in overweight and obese children and adolescents.

    PubMed

    Mameli, Chiara; Krakauer, Nir Y; Krakauer, Jesse C; Bosetti, Alessandra; Ferrari, Chiara Matilde; Moiana, Norma; Schneider, Laura; Borsani, Barbara; Genoni, Teresa; Zuccotti, Gianvincenzo

    2018-01-01

    A Body Shape Index (ABSI) and normalized hip circumference (Hip Index, HI) have been recently shown to be strong risk factors for mortality and for cardiovascular disease in adults. We conducted an observational cross-sectional study to evaluate the relationship between ABSI, HI and cardiometabolic risk factors and obesity-related comorbidities in overweight and obese children and adolescents aged 2-18 years. We performed multivariate linear and logistic regression analyses with BMI, ABSI, and HI age and sex normalized z scores as predictors to examine the association with cardiometabolic risk markers (systolic and diastolic blood pressure, fasting glucose and insulin, total cholesterol and its components, transaminases, fat mass % detected by bioelectrical impedance analysis) and obesity-related conditions (including hepatic steatosis and metabolic syndrome). We recruited 217 patients (114 males), mean age 11.3 years. Multivariate linear regression showed a significant association of ABSI z score with 10 out of 15 risk markers expressed as continuous variables, while BMI z score showed a significant correlation with 9 and HI only with 1. In multivariate logistic regression to predict occurrence of obesity-related conditions and above-threshold values of risk factors, BMI z score was significantly correlated to 7 out of 12, ABSI to 5, and HI to 1. Overall, ABSI is an independent anthropometric index that was significantly associated with cardiometabolic risk markers in a pediatric population affected by overweight and obesity.

  20. Influence of professional preparation and class structure on sexuality topics taught in middle and high schools.

    PubMed

    Rhodes, Darson L; Kirchofer, Gregg; Hammig, Bart J; Ogletree, Roberta J

    2013-05-01

    This study examined the impact of professional preparation and class structure on sexuality topics taught and use of practice-based instructional strategies in US middle and high school health classes. Data from the classroom-level file of the 2006 School Health Policies and Programs were used. A series of multivariable logistic regression models were employed to determine if sexuality content taught was dependent on professional preparation and/or class structure (HE only versus HE/another subject combined). Additional multivariable logistic regression models were employed to determine if use of practice-based instructional strategies was dependent upon professional preparation and/or class structure. Years of teaching health topics and size of the school district were included as covariates in the multivariable logistic regression models. Findings indicated professionally prepared health educators were significantly more likely to teach 7 of the 13 sexuality topics as compared to nonprofessionally prepared health educators. There was no statistically significant difference in the instructional strategies used by professionally prepared and nonprofessionally prepared health educators. Exclusively health education classes versus combined classes were significantly more likely to have included 6 of the 13 topics and to have incorporated practice-based instructional strategies in the curricula. This study indicated professional preparation and class structure impacted sexuality content taught. Class structure also impacted whether opportunities for students to practice skills were made available. Results support the need for continued advocacy for professionally prepared health educators and health only courses. © 2013, American School Health Association.

  1. Association between cardiovascular risk factors and carotid intima-media thickness in prepubertal Brazilian children.

    PubMed

    Gazolla, Fernanda Mussi; Neves Bordallo, Maria Alice; Madeira, Isabel Rey; de Miranda Carvalho, Cecilia Noronha; Vieira Monteiro, Alexandra Maria; Pinheiro Rodrigues, Nádia Cristina; Borges, Marcos Antonio; Collett-Solberg, Paulo Ferrez; Muniz, Bruna Moreira; de Oliveira, Cecilia Lacroix; Pinheiro, Suellen Martins; de Queiroz Ribeiro, Rebeca Mathias

    2015-05-01

    Early exposure to cardiovascular risk factors creates a chronic inflammatory state that could damage the endothelium followed by thickening of the carotid intima-media. To investigate the association of cardiovascular risk factors and thickening of the carotid intima-media in prepubertal children. In this cross-sectional study, carotid intima-media thickness (cIMT) and cardiovascular risk factors were assessed in 129 prepubertal children aged from 5 to 10 years. Association was assessed by simple and multivariate logistic regression analyses. In simple logistic regression analyses, body mass index (BMI) z-score, waist circumference, and systolic blood pressure (SBP) were positively associated with increased left, right, and average cIMT, whereas diastolic blood pressure was positively associated only with increased left and average cIMT (p<0.05). In multivariate logistic regression analyses, increased left cIMT was positively associated with BMI z-score and SBP, and increased average cIMT was only positively associated with SBP (p<0.05). BMI z-score and SBP were the strongest risk factors for increased cIMT.

  2. Predictors of effects of lifestyle intervention on diabetes mellitus type 2 patients.

    PubMed

    Jacobsen, Ramune; Vadstrup, Eva; Røder, Michael; Frølich, Anne

    2012-01-01

    The main aim of the study was to identify predictors of the effects of lifestyle intervention on diabetes mellitus type 2 patients by means of multivariate analysis. Data from a previously published randomised clinical trial, which compared the effects of a rehabilitation programme including standardised education and physical training sessions in the municipality's health care centre with the same duration of individual counseling in the diabetes outpatient clinic, were used. Data from 143 diabetes patients were analysed. The merged lifestyle intervention resulted in statistically significant improvements in patients' systolic blood pressure, waist circumference, exercise capacity, glycaemic control, and some aspects of general health-related quality of life. The linear multivariate regression models explained 45% to 80% of the variance in these improvements. Consistent with the regression-to-the-mean phenomenon, the baseline values of the outcomes were the only statistically significant and robust predictors in all regression models. These results are important from a clinical point of view as they highlight the more urgent need for and better outcomes following lifestyle intervention for those patients who have worse general and disease-specific health.

  3. Creep-Rupture Data Analysis - Engineering Application of Regression Techniques. Ph.D. Thesis - North Carolina State Univ.

    NASA Technical Reports Server (NTRS)

    Rummler, D. R.

    1976-01-01

    The results are presented of investigations to apply regression techniques to the development of methodology for creep-rupture data analysis. Regression analysis techniques are applied to the explicit description of the creep behavior of materials for space shuttle thermal protection systems. A regression analysis technique is compared with five parametric methods for analyzing three simulated and twenty real data sets, and a computer program for the evaluation of creep-rupture data is presented.

  4. Modelling nitrate pollution pressure using a multivariate statistical approach: the case of Kinshasa groundwater body, Democratic Republic of Congo

    NASA Astrophysics Data System (ADS)

    Mfumu Kihumba, Antoine; Ndembo Longo, Jean; Vanclooster, Marnik

    2016-03-01

    A multivariate statistical modelling approach was applied to explain the anthropogenic pressure of nitrate pollution on the Kinshasa groundwater body (Democratic Republic of Congo). Multiple regression and regression tree models were compared and used to identify major environmental factors that control the groundwater nitrate concentration in this region. The analyses were made in terms of physical attributes related to the topography, land use, geology and hydrogeology in the capture zone of different groundwater sampling stations. For the nitrate data, groundwater datasets from two different surveys were used. The statistical models identified the topography, the residential area, the service land (cemetery), and the surface-water land-use classes as major factors explaining nitrate occurrence in the groundwater. Also, groundwater nitrate pollution depends not on one single factor but on the combined influence of factors representing nitrogen loading sources and aquifer susceptibility characteristics. The groundwater nitrate pressure was better predicted with the regression tree model than with the multiple regression model. Furthermore, the results elucidated the sensitivity of the model performance towards the method of delineation of the capture zones. For pollution modelling at the monitoring points, therefore, it is better to identify capture-zone shapes based on a conceptual hydrogeological model rather than to adopt arbitrary circular capture zones.

  5. Successive Projections Algorithm-Multivariable Linear Regression Classifier for the Detection of Contaminants on Chicken Carcasses in Hyperspectral Images

    NASA Astrophysics Data System (ADS)

    Wu, W.; Chen, G. Y.; Kang, R.; Xia, J. C.; Huang, Y. P.; Chen, K. J.

    2017-07-01

    During slaughtering and further processing, chicken carcasses are inevitably contaminated by microbial pathogen contaminants. Due to food safety concerns, many countries implement a zero-tolerance policy that forbids the placement of visibly contaminated carcasses in ice-water chiller tanks during processing. Manual detection of contaminants is labor consuming and imprecise. Here, a successive projections algorithm (SPA)-multivariable linear regression (MLR) classifier based on an optimal performance threshold was developed for automatic detection of contaminants on chicken carcasses. Hyperspectral images were obtained using a hyperspectral imaging system. A regression model of the classifier was established by MLR based on twelve characteristic wavelengths (505, 537, 561, 562, 564, 575, 604, 627, 656, 665, 670, and 689 nm) selected by SPA, and the optimal threshold T = 1 was obtained from the receiver operating characteristic (ROC) analysis. The SPA-MLR classifier provided the best detection results when compared with the SPA-partial least squares (PLS) regression classifier and the SPA-least squares support vector machine (LS-SVM) classifier. The true positive rate (TPR) of 100% and the false positive rate (FPR) of 0.392% indicate that the SPA-MLR classifier can utilize spatial and spectral information to effectively detect contaminants on chicken carcasses.
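
    The step from a regression output to a classifier via an ROC-derived threshold can be sketched as follows. The record does not say which ROC criterion defined the "optimal performance threshold", so maximizing Youden's J (TPR minus FPR) is an assumption here, as are the function names and the toy scores and labels; no SPA wavelength selection or MLR fitting is performed.

```python
def rates(scores, labels, threshold):
    """True positive rate and false positive rate when predicting 1 if score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def youden_j(scores, labels, threshold):
    """Youden's J statistic: TPR - FPR at the given threshold."""
    tpr, fpr = rates(scores, labels, threshold)
    return tpr - fpr


def best_threshold(scores, labels):
    """Scan every observed score as a candidate threshold and keep the one maximizing J."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: youden_j(scores, labels, t))


# Hypothetical regression outputs for pixels (1 = contaminant, 0 = clean).
scores = [0.1, 0.3, 0.4, 0.8, 0.9, 1.2, 1.5]
labels = [0, 0, 0, 0, 1, 1, 1]

t = best_threshold(scores, labels)
tpr, fpr = rates(scores, labels, t)
print(t, tpr, fpr)
```

    Other criteria, such as fixing a maximum tolerated FPR, would slot into `best_threshold` in the same way.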

  6. Aging, not menopause, is associated with higher prevalence of hyperuricemia among older women.

    PubMed

    Krishnan, Eswar; Bennett, Mihoko; Chen, Linjun

    2014-11-01

    This work aims to study the associations, if any, of hyperuricemia, gout, and menopause status in the US population. Using multiyear data from the National Health and Nutrition Examination Survey, we performed unmatched comparisons and one to three age-matched comparisons of women aged 20 to 70 years with and without hyperuricemia (serum urate ≥6 mg/dL). Analyses were performed using survey-weighted multiple logistic regression and conditional logistic regression, respectively. Overall, there were 1,477 women with hyperuricemia. Age and serum urate were significantly correlated. In unmatched analyses (n = 9,573 controls), postmenopausal women were older, were heavier, and had higher prevalence of renal impairment, hypertension, diabetes, and hyperlipidemia. In multivariable regression, after accounting for age, body mass index, glomerular filtration rate, and diuretic use, menopause was associated with hyperuricemia (odds ratio, 1.36; 95% CI, 1.05-1.76; P = 0.002). In corresponding multivariable regression using age-matched data (n = 4,431 controls), the odds ratio for menopause was 0.94 (95% CI, 0.83-1.06). Current use of hormone therapy was not associated with prevalent hyperuricemia in both unmatched and matched analyses. Age is a better statistical explanation for the higher prevalence of hyperuricemia among older women than menopause status.

  7. Risk factors and outcomes of high peritonitis rate in continuous ambulatory peritoneal dialysis patients: A retrospective study.

    PubMed

    Tian, Yuanshi; Xie, Xishao; Xiang, Shilong; Yang, Xin; Zhang, Xiaohui; Shou, Zhangfei; Chen, Jianghua

    2016-12-01

    Peritonitis remains a major complication of peritoneal dialysis (PD). A high peritonitis rate (HPR) affects continuous ambulatory peritoneal dialysis (CAPD) patients' technique survival and mortality. Predictors and outcomes of HPR, rather than the first peritonitis episode, were rarely studied in the Chinese population. In this study, we examined the risk factors associated with HPR and its effects on clinical outcomes in CAPD patients. This is a single center, retrospective, observational cohort study. A total of 294 patients who developed at least 1 episode of peritonitis were followed up from March 1st, 2002, to July 31, 2014, in our PD center. Multivariate logistic regression was used to determine the factors associated with HPR, and the Cox proportional hazard model was conducted to assess the effects of HPR on clinical outcomes. During the study period of 2917.5 patient-years, 489 episodes of peritonitis were recorded, and the total peritonitis rate was 0.168 episodes per patient-year. The multivariate analysis showed that factors associated with HPR include a quick occurrence of peritonitis after CAPD initiation (shorter than 12 months), and a low serum albumin level at the start of CAPD. In the Cox proportional hazard model, HPR was a significant predictor of technique failure. There were no differences between HPR and low peritonitis rate (LPR) group for all-cause mortality. However, when the peritonitis rate was considered as a continuous variable, a positive correlation was observed between the peritonitis rate and mortality. We found the quick peritonitis occurrence after CAPD and the low serum albumin level before CAPD were strongly associated with an HPR. Also, our results verified that HPR was positively correlated with technique failure. More importantly, the increase in the peritonitis rate suggested a higher risk of all-cause mortality. These results may help to identify and target patients who are at higher risk of HPR at the start of CAPD and to take interventions to reduce peritonitis incidence and improve clinical outcomes.

  8. Risk factors and outcomes of high peritonitis rate in continuous ambulatory peritoneal dialysis patients

    PubMed Central

    Tian, Yuanshi; Xie, Xishao; Xiang, Shilong; Yang, Xin; Zhang, Xiaohui; Shou, Zhangfei; Chen, Jianghua

    2016-01-01

    Peritonitis remains a major complication of peritoneal dialysis (PD). A high peritonitis rate (HPR) affects continuous ambulatory peritoneal dialysis (CAPD) patients’ technique survival and mortality. Predictors and outcomes of HPR, rather than the first peritonitis episode, were rarely studied in the Chinese population. In this study, we examined the risk factors associated with HPR and its effects on clinical outcomes in CAPD patients. This is a single center, retrospective, observational cohort study. A total of 294 patients who developed at least 1 episode of peritonitis were followed up from March 1st, 2002, to July 31, 2014, in our PD center. Multivariate logistic regression was used to determine the factors associated with HPR, and the Cox proportional hazard model was conducted to assess the effects of HPR on clinical outcomes. During the study period of 2917.5 patient-years, 489 episodes of peritonitis were recorded, and the total peritonitis rate was 0.168 episodes per patient-year. The multivariate analysis showed that factors associated with HPR include a quick occurrence of peritonitis after CAPD initiation (shorter than 12 months), and a low serum albumin level at the start of CAPD. In the Cox proportional hazard model, HPR was a significant predictor of technique failure. There were no differences between HPR and low peritonitis rate (LPR) group for all-cause mortality. However, when the peritonitis rate was considered as a continuous variable, a positive correlation was observed between the peritonitis rate and mortality. We found the quick peritonitis occurrence after CAPD and the low serum albumin level before CAPD were strongly associated with an HPR. Also, our results verified that HPR was positively correlated with technique failure. More importantly, the increase in the peritonitis rate suggested a higher risk of all-cause mortality. These results may help to identify and target patients who are at higher risk of HPR at the start of CAPD and to take interventions to reduce peritonitis incidence and improve clinical outcomes. PMID:27930566

  9. Using Logistic Regression and Random Forests multivariate statistical methods for landslide spatial probability assessment in North-Est Sicily, Italy

    NASA Astrophysics Data System (ADS)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-04-01

    North-East Sicily is strongly exposed to shallow landslide events. On October 1st, 2009, a severe rainstorm (225.5 mm of cumulative rainfall in 9 hours) caused flash floods and more than 1000 landslides, which struck several small villages such as Giampilieri, Altolia, Molino, Pezzolo, Scaletta Zanclea, and Itala, with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly consisting of earth and debris translational slides evolving into debris flows, were triggered on steep slopes involving colluvium and regolith materials which cover the underlying metamorphic bedrock of the Peloritani Mountains. In this area catchments are small (about 10 square kilometres), elongated, with steep slopes, low order streams, short time of concentration, and discharge directly into the sea. In the past, landslides occurred at Altolia in 1613 and 2000, at Molino in 1750, 1805 and 2000, at Giampilieri in 1791, 1918, 1929, 1932, 2000 and on October 25, 2007. The aim of this work is to define susceptibility models for shallow landslides using multivariate statistical analyses in the Giampilieri area (25 square kilometres). A detailed landslide inventory map has been produced, as the first step, through field surveys coupled with the observation of high resolution aerial colour orthophotos taken immediately after the event. 1,490 initiation zones have been identified; most of them have planimetric dimensions ranging from tens to a few hundred square metres. The spatial hazard assessment has been focused on the detachment areas. Susceptibility models, performed in a GIS environment, took into account several parameters. The morphometric and hydrologic parameters have been derived from detailed 1×1 m LiDAR data. Square grid cells of 4×4 m were adopted as mapping units, on the basis of the area-frequency distribution of the detachment zones, and the optimal representation of the local morphometric conditions (e.g. slope angle, plan curvature). The first phase of the work aimed to identify the spatial relationships between the landslide locations and the 13 related factors by using the Frequency Ratio bivariate statistical method. The analysis was then carried out by adopting a multivariate statistical approach, according to the Logistic Regression technique and the Random Forests technique that gave the best results in terms of AUC. The models were performed and evaluated with different sample sizes and also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are: the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.
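
    The Frequency Ratio method named in this record compares, for each class of a factor, the share of landslide cells falling in that class with the share of all cells falling in that class. A minimal sketch under stated assumptions follows: the class names, the ten-cell grid, and the function name `frequency_ratio` are hypothetical and do not come from the study.

```python
from collections import Counter


def frequency_ratio(classes, landslide):
    """Frequency ratio per class.

    FR(c) = (landslide cells in c / all landslide cells)
            / (cells in c / all cells)
    Values above 1 mean the class is over-represented among landslide cells.
    """
    total_cells = len(classes)
    total_slides = sum(landslide)
    cells_per_class = Counter(classes)
    slides_per_class = Counter(c for c, s in zip(classes, landslide) if s)
    return {
        c: (slides_per_class[c] / total_slides) / (cells_per_class[c] / total_cells)
        for c in cells_per_class
    }


# Hypothetical land-use class per grid cell and a 0/1 landslide-initiation flag.
classes = ["forest"] * 4 + ["burned"] * 2 + ["urban"] * 4
landslide = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]

fr = frequency_ratio(classes, landslide)
for name, value in sorted(fr.items()):
    print(f"{name}: {value:.2f}")
```

    The same function would be applied once per factor (13 in the record) before moving to a multivariate model.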

  10. The Outlier Detection for Ordinal Data Using Scalling Technique of Regression Coefficients

    NASA Astrophysics Data System (ADS)

    Adnan, Arisman; Sugiarto, Sigit

    2017-06-01

    The aim of this study is to detect outliers by using the coefficients of Ordinal Logistic Regression (OLR) for the case of k-category responses, where the score ranges from 1 (the best) to 8 (the worst). We detect them by using the sum of moduli of the ordinal regression coefficients calculated by the jackknife technique. This technique is improved by scaling the regression coefficients to their means. The R language has been used on a set of ordinal data from a reference distribution. Furthermore, we compare this approach with studentised residual plots of the jackknife technique for ANOVA (Analysis of Variance) and OLR. This study shows that the jackknifing technique along with proper scaling may reveal outliers in ordinal regression reasonably well.

  11. LIVER ULTRASONOGRAPHY IN DOLPHINS: USE OF ULTRASONOGRAPHY TO ESTABLISH A TECHNIQUE FOR HEPATOBILIARY IMAGING AND TO EVALUATE METABOLIC DISEASE-ASSOCIATED LIVER CHANGES IN BOTTLENOSE DOLPHINS (TURSIOPS TRUNCATUS).

    PubMed

    Seitz, Kelsey E; Smith, Cynthia R; Marks, Stanley L; Venn-Watson, Stephanie K; Ivančić, Marina

    2016-12-01

    The objective of this study was to establish a comprehensive technique for ultrasound examination of the dolphin hepatobiliary system and apply this technique to 30 dolphins to determine what, if any, sonographic changes are associated with blood-based indicators of metabolic syndrome (insulin greater than 14 μIU/ml or glucose greater than 112 mg/dl) and iron overload (transferrin saturation greater than 65%). A prospective study of individuals in a cross-sectional population with and without elevated postprandial insulin levels was performed. Twenty-nine bottlenose dolphins (Tursiops truncatus) in a managed collection were included in the final data analysis. An in-water ultrasound technique was developed that included detailed analysis of the liver and pancreas. Dolphins with hyperinsulinemia had larger livers compared with dolphins with nonelevated insulin concentrations. Using stepwise, multivariate regression including blood-based indicators of metabolic syndrome in dolphins, glucose was the best predictor of and had a positive linear association with liver size (P = 0.007, R² = 0.24). Bottlenose dolphins are susceptible to metabolic syndrome and associated complications that affect the liver, including fatty liver disease and iron overload. This study facilitated the establishment of a technique for a rapid, diagnostic, and noninvasive ultrasonographic evaluation of the dolphin liver. In addition, the study identified ultrasound-detectable hepatic changes associated primarily with elevated glucose concentration in dolphins. Future investigations will strive to detail the pathophysiological mechanisms for these changes.

  12. [Logistic regression model of noninvasive prediction for portal hypertensive gastropathy in patients with hepatitis B associated cirrhosis].

    PubMed

    Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo

    2015-05-12

    To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. A Logistic regression model was established and the odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of the model were evaluated by the receiver operating characteristic (ROC) curve. According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major factors associated with the occurrence of PHG. The established regression model was Logit P = -2.667 + 2.186X1 - 2.167X2 + 0.725X3 + 0.976X4. The accuracy of the model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predictive Logistic regression model was Logit P = -2.667 + 2.186X1 - 2.167X2 + 0.725X3 + 0.976X4.
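
    The reported equation gives the log-odds of PHG; converting it to a probability uses the inverse logit, P = 1 / (1 + exp(-Logit P)). The sketch below uses only the coefficients quoted in the record. The record does not state how X1 to X4 are coded or scaled, so the input values shown are purely hypothetical, and the function names are assumptions made for illustration.

```python
import math

# Coefficients as quoted in the record: intercept, then X1..X4
# (hepatic dysfunction, TB, PLT, splenic vein diameter).
INTERCEPT = -2.667
COEFS = (2.186, -2.167, 0.725, 0.976)


def phg_logit(x1, x2, x3, x4):
    """Linear predictor (log-odds) of the reported model."""
    return INTERCEPT + sum(b * x for b, x in zip(COEFS, (x1, x2, x3, x4)))


def phg_probability(x1, x2, x3, x4):
    """Inverse-logit transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-phg_logit(x1, x2, x3, x4)))


# Odds ratio per unit increase of each predictor is exp(coefficient).
odds_ratios = [math.exp(b) for b in COEFS]

# Hypothetical inputs; the coding of the predictors is not given in the record.
baseline = phg_probability(0, 0, 0, 0)
with_x1 = phg_probability(1, 0, 0, 0)
print(round(baseline, 4), round(with_x1, 4))
```

    A positive coefficient corresponds to an odds ratio above 1 and a negative one to an odds ratio below 1, which is how the per-factor odds ratios mentioned in the record relate to the equation.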

  13. Spatially resolved regression analysis of pre-treatment FDG, FLT and Cu-ATSM PET from post-treatment FDG PET: an exploratory study

    PubMed Central

    Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert

    2012-01-01

    Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748

  14. Power and sample size for multivariate logistic modeling of unmatched case-control studies.

    PubMed

    Gail, Mitchell H; Haneuse, Sebastien

    2017-01-01

    Sample size calculations are needed to design and assess the feasibility of case-control studies. Although such calculations are readily available for simple case-control designs and univariate analyses, there is limited theory and software for multivariate unconditional logistic analysis of case-control data. Here we outline the theory needed to detect scalar exposure effects or scalar interactions while controlling for other covariates in logistic regression. Both analytical and simulation methods are presented, together with links to the corresponding software.

  15. The time frame of Epstein-Barr virus latent membrane protein-1 gene to disappear in nasopharyngeal swabs after initiation of primary radiotherapy is an independently significant prognostic factor predicting local control for patients with nasopharyngeal carcinoma

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, S.-Y.; Chang, K.-P.; Graduate Institute of Clinical Medical Sciences, Chang Gung University, Linkou, Taiwan

    Purpose: The presence of Epstein-Barr virus latent membrane protein-1 (LMP-1) gene in nasopharyngeal swabs indicates the presence of nasopharyngeal carcinoma (NPC) mucosal tumor cells. This study was undertaken to investigate whether the time taken for LMP-1 to disappear after initiation of primary radiotherapy (RT) was inversely associated with NPC local control. Methods and Materials: Between July 1999 and October 2002, 127 nondisseminated NPC patients received serial nasopharyngeal swab examinations with detection of LMP-1 during the RT course. The time for LMP-1 regression was defined as the number of days after initiation of RT for LMP-1 results to turn negative. The primary outcome was local control, which was represented by freedom from local recurrence. Results: The time for LMP-1 regression showed a statistically significant influence on NPC local control both univariately (p < 0.0001) and multivariately (p = 0.004). In multivariate analysis, the administration of chemotherapy conferred a significantly more favorable local control (p = 0.03). Advanced T status (≥ T2b), overall treatment time of external photon radiotherapy longer than 55 days, and older age showed trends toward being poor prognosticators. The time for LMP-1 regression was very heterogeneous. According to the quartiles of the time for LMP-1 regression, we defined the pattern of LMP-1 regression as late regression if it required 40 days or more. Kaplan-Meier plots indicated that the patients with late regression had a significantly worse local control than those with intermediate or early regression (p 0.0129). Conclusion: Among the potential prognostic factors examined in this study, the time for LMP-1 regression was the most independently significant factor that was inversely associated with NPC local control.

  16. Application of multivariate statistical techniques for differentiation of ripe banana flour based on the composition of elements.

    PubMed

    Alkarkhi, Abbas F M; Ramli, Saifullah Bin; Easa, Azhar Mat

    2009-01-01

    Major (sodium, potassium, calcium, magnesium) and minor elements (iron, copper, zinc, manganese) and one heavy metal (lead) of Cavendish banana flour and Dream banana flour were determined, and data were analyzed using multivariate statistical techniques of factor analysis and discriminant analysis. Factor analysis yielded four factors explaining more than 81% of the total variance: the first factor explained 28.73%, comprising magnesium, sodium, and iron; the second factor explained 21.47%, comprising only manganese and copper; the third factor explained 15.66%, comprising zinc and lead; while the fourth factor explained 15.50%, comprising potassium. Discriminant analysis showed that magnesium and sodium exhibited a strong contribution in discriminating the two types of banana flour, affording 100% correct assignation. This study presents the usefulness of multivariate statistical techniques for analysis and interpretation of complex mineral content data from banana flour of different varieties.

  17. Quantifying the impact of between-study heterogeneity in multivariate meta-analyses

    PubMed Central

    Jackson, Dan; White, Ian R; Riley, Richard D

    2012-01-01

    Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I2, which we call . We also provide a multivariate H2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I2 statistic, . Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
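
    For orientation, the following minimal sketch computes the standard univariate quantities that the abstract takes as its starting point: Cochran's Q, H2 = Q/df and I2 = (Q - df)/Q truncated at zero. It does not implement the authors' multivariate generalisations, which are not specified in the abstract; the function names and the example inputs in the usage note are illustrative assumptions.

```python
def cochran_q(effects, variances):
    """Cochran's heterogeneity statistic using inverse-variance weights."""
    if len(effects) != len(variances):
        raise ValueError("effects and variances must have the same length")
    weights = [1.0 / v for v in variances]
    # Fixed-effect (inverse-variance weighted) pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def h2_and_i2(effects, variances):
    """Return (H2, I2) for a univariate meta-analysis of study estimates."""
    df = len(effects) - 1
    if df < 1:
        raise ValueError("at least two studies are required")
    q = cochran_q(effects, variances)
    h2 = q / df
    # I2 is truncated at zero when Q falls below its degrees of freedom.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return h2, i2
```

    For three hypothetical studies with estimates 0, 2 and 4 and unit variances, Q is 8 on 2 degrees of freedom, giving H2 = 4 and I2 = 0.75.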

  18. Empirical Bayes approach to the estimation of "unsafety": the multivariate regression method.

    PubMed

    Hauer, E

    1992-10-01

    There are two kinds of clues to the unsafety of an entity: its traits (such as traffic, geometry, age, or gender) and its historical accident record. The Empirical Bayes approach to unsafety estimation makes use of both kinds of clues. It requires information about the mean and the variance of the unsafety in a "reference population" of similar entities. The method now in use for this purpose suffers from several shortcomings. First, a very large reference population is required. Second, the choice of reference population is to some extent arbitrary. Third, entities in the reference population usually cannot match the traits of the entity the unsafety of which is estimated. To alleviate these shortcomings the multivariate regression method for estimating the mean and variance of unsafety in reference populations is offered. Its logical foundations are described and its soundness is demonstrated. The use of the multivariate method makes the Empirical Bayes approach to unsafety estimation applicable to a wider range of circumstances and yields better estimates of unsafety. The application of the method to the tasks of identifying deviant entities and of estimating the effect of interventions on unsafety are discussed and illustrated by numerical examples.

  19. A multivariate analysis of clinical and morphological prognostic factors in squamous cell carcinoma of the vulva.

    PubMed

    Smyczek-Gargya, B; Volz, B; Geppert, M; Dietl, J

    1997-01-01

    Clinical and histological data of 168 patients with squamous cell carcinoma of the vulva were analyzed with respect to survival. 151 patients underwent surgery, 12 patients were treated with primary radiation and in 5 patients no treatment was performed. Follow-up lasted from at least 2 up to 22 years' posttreatment. In univariate analysis, the following factors were highly significant: presurgery lymph node status, tumor infiltration beyond the vulva, tumor grading, histological inguinal lymph node status, pre- and postsurgery tumor stage, depth of invasion and tumor diameter. In the multivariate analysis (Cox regression), the most powerful factors were shown to be histological inguinal lymph node status, tumor diameter and tumor grading. The multivariate logistic regression analysis worked out as main prognostic factors for metastases of inguinal lymph nodes: presurgery inguinal lymph node status, tumor size, depth of invasion and tumor grading. Based on these results, tumor biology seems to be the decisive factor concerning recurrence and survival. Therefore, we suggest a more conservative treatment of vulvar carcinoma. Patients with confined carcinoma to the vulva, with a tumor diameter up to 3 cm and without clinical suspected lymph nodes, should be treated by wide excision/partial vulvectomy with ipsilateral lymphadenectomy.

  20. Three-Hand Endoscopic Endonasal Transsphenoidal Surgery: Experience With an Anatomy-Preserving Mononostril Approach Technique.

    PubMed

    Eseonu, Chikezie I; ReFaey, Karim; Pamias-Portalatin, Eva; Asensio, Javier; Garcia, Oscar; Boahene, Kofi D; Quiñones-Hinojosa, Alfredo

    2018-02-01

    Variations on the endoscopic transsphenoidal approach present unique surgical techniques that have unique effects on surgical outcomes, extent of resection (EOR), and anatomical complications. To analyze the learning curve and perioperative outcomes of the 3-hand endoscopic endonasal mononostril transsphenoidal technique. Prospective case series and retrospective data analysis of patients who were treated with the 3-hand transsphenoidal technique between January 2007 and May 2015 by a single neurosurgeon. Patient characteristics, preoperative presentation, tumor characteristics, operative times, learning curve, and postoperative outcomes were analyzed. Volumetric EOR was evaluated, and a logistic regression analysis was used to assess predictors of EOR. Two hundred seventy-five patients underwent an endoscopic transsphenoidal surgery using the 3-hand technique. One hundred eighteen patients in the early group had surgery between 2007 and 2010, while 157 patients in the late group had surgery between 2011 and 2015. Operative time was significantly shorter in the late group (161.6 min) compared to the early group (211.3 min, P = .001). Both cohorts had similar EOR (early group 84.6% vs late group 85.5%, P = .846) and postoperative outcomes. The learning curve showed that it took 54 cases to achieve operative proficiency with the 3-handed technique. Multivariate modeling suggested that prior resections and preoperative tumor size are important predictors for EOR. We describe a 3-hand, mononostril endoscopic transsphenoidal technique performed by a single neurosurgeon that has minimal anatomic distortion and postoperative complications. During the learning curve of this technique, operative time can significantly decrease, while EOR, postoperative outcomes, and complications are not jeopardized. Copyright © 2017 by the Congress of Neurological Surgeons

  1. Analysis of spreadable cheese by Raman spectroscopy and chemometric tools.

    PubMed

    Oliveira, Kamila de Sá; Callegaro, Layce de Souza; Stephani, Rodrigo; Almeida, Mariana Ramos; de Oliveira, Luiz Fernando Cappa

    2016-03-01

    In this work, FT-Raman spectroscopy was explored to evaluate spreadable cheese samples. A partial least squares discriminant analysis was employed to identify the spreadable cheese samples containing starch. To build the models, two types of samples were used: commercial samples and samples manufactured in local industries. The method of supervised classification PLS-DA was employed to classify the samples as adulterated or without starch. Multivariate regression was performed using the partial least squares method to quantify the starch in the spreadable cheese. The limit of detection obtained for the model was 0.34% (w/w) and the limit of quantification was 1.14% (w/w). The reliability of the models was evaluated by determining the confidence interval, which was calculated using the bootstrap re-sampling technique. The results show that the classification models can be used to complement classical analysis and as screening methods. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Discrimination of honeys using colorimetric sensor arrays, sensory analysis and gas chromatography techniques.

    PubMed

    Tahir, Haroon Elrasheid; Xiaobo, Zou; Xiaowei, Huang; Jiyong, Shi; Mariod, Abdalbasit Adam

    2016-09-01

    Aroma profiles of six honey varieties of different botanical origins were investigated using colorimetric sensor array, gas chromatography-mass spectrometry (GC-MS) and descriptive sensory analysis. Fifty-eight aroma compounds were identified, including 2 norisoprenoids, 5 hydrocarbons, 4 terpenes, 6 phenols, 7 ketones, 9 acids, 12 aldehydes and 13 alcohols. Twenty abundant or active compounds were chosen as key compounds to characterize honey aroma. Discrimination of the honeys was subsequently implemented using multivariate analysis, including hierarchical clustering analysis (HCA) and principal component analysis (PCA). Honeys of the same botanical origin were grouped together in the PCA score plot and HCA dendrogram. SPME-GC/MS and colorimetric sensor array were able to discriminate the honeys effectively with the advantages of being rapid, simple and low-cost. Moreover, partial least squares regression (PLSR) was applied to indicate the relationship between sensory descriptors and aroma compounds. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Advanced spectrophotometric chemometric methods for resolving the binary mixture of doxylamine succinate and pyridoxine hydrochloride.

    PubMed

    Katsarov, Plamen; Gergov, Georgi; Alin, Aylin; Pilicheva, Bissera; Al-Degs, Yahya; Simeonov, Vasil; Kassarova, Margarita

    2018-03-01

    The prediction power of partial least squares (PLS) and multivariate curve resolution-alternating least squares (MCR-ALS) methods have been studied for simultaneous quantitative analysis of the binary drug combination - doxylamine succinate and pyridoxine hydrochloride. Analysis of first-order UV overlapped spectra was performed using different PLS models - classical PLS1 and PLS2 as well as partial robust M-regression (PRM). These linear models were compared to MCR-ALS with equality and correlation constraints (MCR-ALS-CC). All techniques operated within the full spectral region and extracted maximum information for the drugs analysed. The developed chemometric methods were validated on external sample sets and were applied to the analyses of pharmaceutical formulations. The obtained statistical parameters were satisfactory for calibration and validation sets. All developed methods can be successfully applied for simultaneous spectrophotometric determination of doxylamine and pyridoxine both in laboratory-prepared mixtures and commercial dosage forms.

  4. Diet and the role of lipoproteins, lipases, and thyroid hormones in coronary lesion growth

    NASA Technical Reports Server (NTRS)

    Barth, Jacques D.; Jansen, Hans; Reiber, Johan H. C.; Birkenhager, Jan C.; Kromhout, Daan

    1987-01-01

    The relationships between coronary lesion growth and the blood contents of lipoprotein fractions, thyroid hormones, and the lipoprotein lipase activity were investigated in male patients with severe coronary atherosclerosis, who participated in a lipid-lowering dietary intervention program. A quantitative computer-assisted image-processing technique was used to assess the severity of coronary obstructions at the beginning of the program and at its termination two years later. Based on absolute coronary scores, patients were divided into a no-lesion growth group (14 patients) and a progression group (21 patients). At the end of the trial, the very-low-density lipoprotein cholesterol and triglycerides were found to be significantly higher, while the high-density lipoprotein cholesterol and hepatic lipase (HL) were lower in the progression group. Multivariate regression analysis showed HL to be the most important determinant of changes in coronary atherosclerotic lesions.

  5. Rapid analysis of pharmaceutical drugs using LIBS coupled with multivariate analysis.

    PubMed

    Tiwari, P K; Awasthi, S; Kumar, R; Anand, R K; Rai, P K; Rai, A K

    2018-02-01

    Type 2 diabetes drug tablets of various brands containing voglibose at dose strengths of 0.2 and 0.3 mg have been examined using the laser-induced breakdown spectroscopy (LIBS) technique. Statistical methods such as principal component analysis (PCA) and partial least squares regression (PLSR) have been applied to the LIBS spectral data for classifying the drug samples and developing calibration models. We have developed a ratio-based calibration model applying PLSR in which the relative spectral intensity ratios H/C, H/N and O/N are used. Further, the developed model has been employed to predict the relative concentration of elements in unknown drug samples. The experiment has been performed in air and argon atmospheres, respectively, and the obtained results have been compared. The present model provides a rapid spectroscopic method for drug analysis with high statistical significance for online control and measurement processes in a wide variety of pharmaceutical industrial applications.

  6. Real-time absorption and scattering characterization of slab-shaped turbid samples obtained by a combination of angular and spatially resolved measurements.

    PubMed

    Dam, Jan S; Yavari, Nazila; Sørensen, Søren; Andersson-Engels, Stefan

    2005-07-10

    We present a fast and accurate method for real-time determination of the absorption coefficient, the scattering coefficient, and the anisotropy factor of thin turbid samples by using simple continuous-wave noncoherent light sources. The three optical properties are extracted from recordings of angularly resolved transmittance in addition to spatially resolved diffuse reflectance and transmittance. The applied multivariate calibration and prediction techniques are based on multiple polynomial regression in combination with a Newton-Raphson algorithm. The numerical test results based on Monte Carlo simulations showed mean prediction errors of approximately 0.5% for all three optical properties within ranges typical for biological media. Preliminary experimental results are also presented yielding errors of approximately 5%. Thus the presented methods show a substantial potential for simultaneous absorption and scattering characterization of turbid media.

  7. Influence of intravenous opioid dose on postoperative ileus.

    PubMed

    Barletta, Jeffrey F; Asgeirsson, Theodor; Senagore, Anthony J

    2011-07-01

    Intravenous opioids represent a major component in the pathophysiology of postoperative ileus (POI). However, the most appropriate measure and threshold to quantify the association between opioid dose (eg, average daily, cumulative, maximum daily) and POI remains unknown. To evaluate the relationship between opioid dose, POI, and length of stay (LOS) and identify the opioid measure that was most strongly associated with POI. Consecutive patients admitted to a community teaching hospital who underwent elective colorectal surgery by any technique with an enhanced-recovery protocol postoperatively were retrospectively identified. Patients were excluded if they received epidural analgesia, developed a major intraabdominal complication or medical complication, or had a prolonged workup prior to surgery. Intravenous opioid doses were quantified and converted to hydromorphone equivalents. Classification and regression tree (CART) analysis was used to determine the dosing threshold for the opioid measure most associated with POI and define high versus low use of opioids. Risk factors for POI and prolonged LOS were determined through multivariate analysis. The incidence of POI in 279 patients was 8.6%. CART analysis identified a maximum daily intravenous hydromorphone dose of 2 mg or more as the opioid measure most associated with POI. Multivariate analysis revealed maximum daily hydromorphone dose of 2 mg or more (p = 0.034), open surgical technique (p = 0.045), and days of intravenous narcotic therapy (p = 0.003) as significant risk factors for POI. Variables associated with increased LOS were POI (p < 0.001), maximum daily hydromorphone dose of 2 mg or more (p < 0.001), and age (p = 0.005); laparoscopy (p < 0.001) was associated with a decreased LOS. Intravenous opioid therapy is significantly associated with POI and prolonged LOS, particularly when the maximum hydromorphone dose per day exceeds 2 mg. Clinicians should consider alternative, nonopioid-based pain management options when this occurs.

  8. A climate-based multivariate extreme emulator of met-ocean-hydrological events for coastal flooding

    NASA Astrophysics Data System (ADS)

    Camus, Paula; Rueda, Ana; Mendez, Fernando J.; Tomas, Antonio; Del Jesus, Manuel; Losada, Iñigo J.

    2015-04-01

    Atmosphere-ocean general circulation models (AOGCMs) are useful to analyze large-scale climate variability (long-term historical periods, future climate projections). However, applications such as coastal flood modeling require climate information at finer scale. Besides, flooding events depend on multiple climate conditions: waves, surge levels from the open ocean and river discharge caused by precipitation. Therefore, a multivariate statistical downscaling approach is adopted, both to reproduce relationships between variables and for its low computational cost. The proposed method can be considered a hybrid approach which combines a probabilistic weather type downscaling model with a stochastic weather generator component. Predictand distributions are reproduced by modeling the relationship with AOGCM predictors based on a physical division in weather types (Camus et al., 2012). The multivariate dependence structure of the predictand (extreme events) is introduced by linking the independent marginal distributions of the variables by a probabilistic copula regression (Ben Ayala et al., 2014). This hybrid approach is applied for the downscaling of AOGCM data to daily precipitation and maximum significant wave height and storm surge in different locations along the Spanish coast. Reanalysis data is used to assess the proposed method. A common predictor for the three variables involved is classified using a regression-guided clustering algorithm. The most appropriate statistical model (generalized extreme value distribution, Pareto distribution) for daily conditions is fitted. Stochastic simulation of the present climate is performed, obtaining the set of hydraulic boundary conditions needed for high resolution coastal flood modeling. References: Camus, P., Menéndez, M., Méndez, F.J., Izaguirre, C., Espejo, A., Cánovas, V., Pérez, J., Rueda, A., Losada, I.J., Medina, R. (2014b). A weather-type statistical downscaling framework for ocean wave climate. Journal of Geophysical Research, doi: 10.1002/2014JC010141. Ben Ayala, M.A., Chebana, F., Ouarda, T.B.M.J. (2014). Probabilistic Gaussian Copula Regression Model for Multisite and Multivariable Downscaling, Journal of Climate, 27, 3331-3347.

  9. Kidney transplantation from deceased donors with elevated serum creatinine.

    PubMed

    Gallinat, Anja; Leerhoff, Sabine; Paul, Andreas; Molmenti, Ernesto P; Schulze, Maren; Witzke, Oliver; Sotiropoulos, Georgios C

    2016-12-01

    Elevated donor serum creatinine has been associated with inferior graft survival in kidney transplantation (KT). The aim of this study was to evaluate the impact of elevated donor serum creatinine on short and long-term outcomes and to determine possible ways to optimize the use of these organs. All kidney transplants from 01-2000 to 12-2012 with donor creatinine ≥ 2 mg/dl were considered. Risk factors for delayed graft function (DGF) were explored with uni- and multivariate regression analyses. Donor and recipient data were analyzed with uni- and multivariate Cox proportional hazards analyses. Graft and patient survival were calculated using the Kaplan-Meier method. Seventy-eight patients were considered. Median recipient age and waiting time on dialysis were 53 years and 5.1 years, respectively. After a median follow-up of 6.2 years, 63 patients are alive. 1, 3, and 5-year graft and patient survival rates were 92, 89, and 89 % and 96, 93, and 89 %, respectively. Serum creatinine level at procurement and recipient's dialysis time prior to KT were predictors of DGF in multivariate analysis (p = 0.0164 and p = 0.0101, respectively). Charlson comorbidity score retained statistical significance by multivariate regression analysis for graft survival (p = 0.0321). Recipient age (p = 0.0035) was predictive of patient survival by multivariate analysis. Satisfactory long-term kidney transplant outcomes in the setting of elevated donor serum creatinine ≥2 mg/dl can be achieved when donor creatinine is <3.5 mg/dl, and the recipient has low comorbidities, is under 56 years of age, and remains in dialysis prior to KT for <6.8 years.

  10. Application of stepwise multiple regression techniques to inversion of Nimbus 'IRIS' observations.

    NASA Technical Reports Server (NTRS)

    Ohring, G.

    1972-01-01

    Exploratory studies with Nimbus-3 infrared interferometer-spectrometer (IRIS) data indicate that, in addition to temperature, such meteorological parameters as geopotential heights of pressure surfaces, tropopause pressure, and tropopause temperature can be inferred from the observed spectra with the use of simple regression equations. The technique of screening the IRIS spectral data by means of stepwise regression to obtain the best radiation predictors of meteorological parameters is validated. The simplicity of application of the technique and the simplicity of the derived linear regression equations - which contain only a few terms - suggest usefulness for this approach. Based upon the results obtained, suggestions are made for further development and exploitation of the stepwise regression analysis technique.

  11. Reduction of interferences in graphite furnace atomic absorption spectrometry by multiple linear regression modelling

    NASA Astrophysics Data System (ADS)

    Grotti, Marco; Abelmoschi, Maria Luisa; Soggia, Francesco; Tiberiade, Christian; Frache, Roberto

    2000-12-01

    The multivariate effects of Na, K, Mg and Ca as nitrates on the electrothermal atomisation of manganese, cadmium and iron were studied by multiple linear regression modelling. Since the models proved to efficiently predict the effects of the considered matrix elements in a wide range of concentrations, they were applied to correct the interferences occurring in the determination of trace elements in seawater after pre-concentration of the analytes. In order to obtain a statistically significant number of samples, a large volume of the certified seawater reference materials CASS-3 and NASS-3 was treated with Chelex-100 resin; then, the chelating resin was separated from the solution, divided into several sub-samples, each of them was eluted with nitric acid and analysed by electrothermal atomic absorption spectrometry (for trace element determinations) and inductively coupled plasma optical emission spectrometry (for matrix element determinations). To minimise any other systematic error besides that due to matrix effects, accuracy of the pre-concentration step and contamination levels of the procedure were checked by inductively coupled plasma mass spectrometric measurements. Analytical results obtained by applying the multiple linear regression models were compared with those obtained with other calibration methods, such as external calibration using acid-based standards, external calibration using matrix-matched standards and the analyte addition technique. Empirical models proved to efficiently reduce interferences occurring in the analysis of real samples, allowing an improvement of accuracy better than for other calibration methods.

  12. Voxel-wise prostate cell density prediction using multiparametric magnetic resonance imaging and machine learning.

    PubMed

    Sun, Yu; Reynolds, Hayley M; Wraith, Darren; Williams, Scott; Finnegan, Mary E; Mitchell, Catherine; Murphy, Declan; Haworth, Annette

    2018-04-26

    There are currently no methods to estimate cell density in the prostate. This study aimed to develop predictive models to estimate prostate cell density from multiparametric magnetic resonance imaging (mpMRI) data at a voxel level using machine learning techniques. In vivo mpMRI data were collected from 30 patients before radical prostatectomy. Sequences included T2-weighted imaging, diffusion-weighted imaging and dynamic contrast-enhanced imaging. Ground truth cell density maps were computed from histology and co-registered with mpMRI. Feature extraction and selection were performed on mpMRI data. Final models were fitted using three regression algorithms including multivariate adaptive regression spline (MARS), polynomial regression (PR) and generalised additive model (GAM). Model parameters were optimised using leave-one-out cross-validation on the training data and model performance was evaluated on test data using root mean square error (RMSE) measurements. Predictive models to estimate voxel-wise prostate cell density were successfully trained and tested using the three algorithms. The best model (GAM) achieved an RMSE of 1.06 (± 0.06) × 10³ cells/mm² and a relative deviation of 13.3 ± 0.8%. Prostate cell density can be quantitatively estimated non-invasively from mpMRI data using high-quality co-registered data at a voxel level. These cell density predictions could be used for tissue classification, treatment response evaluation and personalised radiotherapy.

  13. Prediction equations of forced oscillation technique: the insidious role of collinearity.

    PubMed

    Narchi, Hassib; AlBlooshi, Afaf

    2018-03-27

    Many studies have reported reference data for forced oscillation technique (FOT) in healthy children. The prediction equations for FOT parameters were derived from a multivariable regression model examining the effect of age, gender, weight and height on each parameter. As many of these variables are likely to be correlated, collinearity might have affected the accuracy of the model, potentially resulting in misleading, erroneous or difficult to interpret conclusions. The aims of this work were to review all FOT publications in children since 2005 to determine whether collinearity was considered in the construction of the published prediction equations, to compare these prediction equations with our own study, and to analyse, in our study, how collinearity between the explanatory variables might affect the prediction equations if it was not considered in the model. The results showed that none of the ten reviewed studies had stated whether collinearity was checked for. Half of the reports had also included in their equations variables which are physiologically correlated, such as age, weight and height. The predicted resistance varied by up to 28% amongst these studies. In our study, multicollinearity was identified between the explanatory variables initially considered for the regression model (age, weight and height). Ignoring it would have resulted in inaccuracies in the coefficients of the equation, their signs (positive or negative), their 95% confidence intervals, their significance level and the model goodness of fit. In conclusion, with inaccurately constructed and improperly reported models, understanding the results and reproducing the models for future research might be compromised.
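
    One common way to check for the collinearity discussed in this abstract is the variance inflation factor (VIF). The abstract does not say which diagnostic the authors used, so the sketch below is an assumption: it computes each predictor's VIF as the corresponding diagonal entry of the inverse correlation matrix, using hypothetical age, height and weight values.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sv = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (su * sv)


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (fails if singular)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF_j = 1 / (1 - R_j^2), read off the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical paediatric data, for illustration only.
age = [4, 6, 8, 10, 12, 14]
height = [102, 116, 127, 139, 150, 163]
weight = [16, 21, 26, 33, 41, 50]
print([round(v, 1) for v in vif([age, height, weight])])
```

    A VIF of 1 means a predictor is uncorrelated with the others; large values indicate that its coefficient is poorly determined. The threshold at which a VIF counts as problematic is a matter of convention.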

  14. Predicting individualized clinical measures by a generalized prediction framework and multimodal fusion of MRI data

    PubMed Central

    Meng, Xing; Jiang, Rongtao; Lin, Dongdong; Bustillo, Juan; Jones, Thomas; Chen, Jiayu; Yu, Qingbao; Du, Yuhui; Zhang, Yu; Jiang, Tianzi; Sui, Jing; Calhoun, Vince D.

    2016-01-01

    Neuroimaging techniques have greatly enhanced the understanding of neurodiversity (human brain variation across individuals) in both health and disease. The ultimate goal of using brain imaging biomarkers is to perform individualized predictions. Here we proposed a generalized framework that can predict explicit values of the targeted measures by taking advantage of joint information from multiple modalities. This framework also enables whole brain voxel-wise searching by combining multivariate techniques such as ReliefF, clustering, correlation-based feature selection and multiple regression models, which is more flexible and can achieve better prediction performance than alternative atlas-based methods. For 50 healthy controls and 47 schizophrenia patients, three kinds of features derived from resting-state fMRI (fALFF), sMRI (gray matter) and DTI (fractional anisotropy) were extracted and fed into a regression model, achieving high prediction for both cognitive scores (MCCB composite r = 0.7033, MCCB social cognition r = 0.7084) and symptomatic scores (positive and negative syndrome scale [PANSS] positive r = 0.7785, PANSS negative r = 0.7804). Moreover, the brain areas likely responsible for cognitive deficits of schizophrenia, including middle temporal gyrus, dorsolateral prefrontal cortex, striatum, cuneus and cerebellum, were located with different weights, as well as regions predicting PANSS symptoms, including thalamus, striatum and inferior parietal lobule, pinpointing the potential neuromarkers. Finally, compared to a single modality, multimodal combination achieves higher prediction accuracy and enables individualized prediction on multiple clinical measures. There is more work to be done, but the current results highlight the potential utility of multimodal brain imaging biomarkers to eventually inform clinical decision-making. PMID:27177764

  15. A Novel Continuous Blood Pressure Estimation Approach Based on Data Mining Techniques.

    PubMed

    Miao, Fen; Fu, Nan; Zhang, Yuan-Ting; Ding, Xiao-Rong; Hong, Xi; He, Qingyun; Li, Ye

    2017-11-01

    Continuous blood pressure (BP) estimation using pulse transit time (PTT) is a promising method for unobtrusive BP measurement. However, the accuracy of this approach must be improved for it to be viable for a wide range of applications. This study proposes a novel continuous BP estimation approach that combines data mining techniques with a traditional mechanism-driven model. First, 14 features derived from simultaneous electrocardiogram and photoplethysmogram signals were extracted for beat-to-beat BP estimation. A genetic algorithm-based feature selection method was then used to select BP indicators for each subject. Multivariate linear regression and support vector regression were employed to develop the BP model. The accuracy and robustness of the proposed approach were validated for static, dynamic, and follow-up performance. Experimental results based on 73 subjects showed that the proposed approach exhibited excellent accuracy in static BP estimation, with a correlation coefficient and mean error of 0.852 and -0.001 ± 3.102 mmHg for systolic BP, and 0.790 and -0.004 ± 2.199 mmHg for diastolic BP. Similar performance was observed for dynamic BP estimation. The robustness results indicated that the estimation accuracy was lower by a certain degree one day after model construction but was relatively stable from one day to six months after construction. The proposed approach is superior to the state-of-the-art PTT-based model, with an approximately 2-mmHg reduction in the standard deviation at different time intervals, thus providing potentially novel insights for cuffless BP estimation.

  16. Estimation of soil clay and organic matter using two quantitative methods (PLSR and MARS) based on reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Nawar, Said; Buddenbaum, Henning; Hill, Joachim

    2014-05-01

    A rapid and inexpensive soil analytical technique is needed for soil quality assessment and accurate mapping. This study investigated a method for improved estimation of soil clay (SC) and organic matter (OM) using reflectance spectroscopy. Seventy soil samples were collected from the Sinai Peninsula in Egypt to estimate the soil clay and organic matter relative to the soil spectra. Soil samples were scanned with an Analytical Spectral Devices (ASD) spectrometer (350-2500 nm). Three spectral formats were used in the calibration models derived from the spectra and the soil properties: (1) original reflectance spectra (OR), (2) first-derivative spectra smoothed using the Savitzky-Golay technique (FD-SG) and (3) continuum-removed reflectance (CR). Partial least-squares regression (PLSR) models using the CR of the 400-2500 nm spectral region resulted in R² = 0.76 and 0.57, and RPD = 2.1 and 1.5 for estimating SC and OM, respectively, indicating better performance than that obtained using OR and FD-SG. The multivariate adaptive regression splines (MARS) calibration model with the CR spectra resulted in an improved performance (R² = 0.89 and 0.83, RPD = 3.1 and 2.4) for estimating SC and OM, respectively. The results show that the MARS models have a great potential for estimating SC and OM compared with PLSR models. The results obtained in this study have potential value in the field of soil spectroscopy because they can be applied directly to the mapping of soil properties using remote sensing imagery in arid environment conditions. Key Words: soil clay, organic matter, PLSR, MARS, reflectance spectroscopy.
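
    The two figures of merit quoted in this abstract can be computed as in the following sketch. It assumes the common definition of RPD as the standard deviation of the reference values divided by the RMSE of prediction; the abstract does not state which variant the authors used (sample or population standard deviation, RMSE or bias-corrected SEP), and the data are hypothetical.

```python
import math
import statistics


def r2_and_rpd(observed, predicted):
    """Return (R^2, RPD) for paired reference and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    # Assumption: sample standard deviation of the reference values.
    rpd = statistics.stdev(observed) / rmse
    return r2, rpd


# Hypothetical clay contents (%), for illustration only.
measured = [12.0, 18.5, 25.1, 31.0, 40.2, 47.8]
estimated = [14.1, 17.0, 27.3, 29.2, 38.0, 49.5]
r2, rpd = r2_and_rpd(measured, estimated)
print(round(r2, 2), round(rpd, 2))
```

    Both numbers express prediction error relative to the spread of the reference data, so they are only comparable between models validated on the same samples.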

  17. Estimation of railroad capacity using parametric methods.

    DOT National Transportation Integrated Search

    2013-12-01

    This paper reviews different methodologies used for railroad capacity estimation and presents a user-friendly method to measure capacity. The objective of this paper is to use multivariate regression analysis to develop a continuous relation of the d...

  18. Use of the forced-oscillation technique to estimate spirometry values.

    PubMed

    Yamamoto, Shoichiro; Miyoshi, Seigo; Katayama, Hitoshi; Okazaki, Mikio; Shigematsu, Hisayuki; Sano, Yoshifumi; Matsubara, Minoru; Hamaguchi, Naohiko; Okura, Takafumi; Higaki, Jitsuo

    2017-01-01

    Spirometry is sometimes difficult to perform in elderly patients and in those with severe respiratory distress. The forced-oscillation technique (FOT) is a simple and noninvasive method of measuring respiratory impedance. The aim of this study was to determine if FOT data reflect spirometric indices. Patients underwent both FOT and spirometry procedures prior to inclusion in development (n=1,089) and validation (n=552) studies. Multivariate linear regression analysis was performed to identify FOT parameters predictive of vital capacity (VC), forced VC (FVC), and forced expiratory volume in 1 second (FEV1). A regression equation was used to calculate estimated VC, FVC, and FEV1. We then determined whether the estimated data reflected spirometric indices. Agreement between actual and estimated spirometry data was assessed by Bland-Altman analysis. Significant correlations were observed between actual and estimated VC, FVC, and FEV1 values (all r > 0.8 and P < 0.001). These results were deemed robust by a separate validation study (all r > 0.8 and P < 0.001). Bias between the actual data and estimated data for VC, FVC, and FEV1 in the development study was 0.007 L (95% limits of agreement [LOA] 0.907 and -0.893 L), -0.064 L (95% LOA 0.843 and -0.971 L), and -0.039 L (95% LOA 0.735 and -0.814 L), respectively. On the other hand, bias between the actual data and estimated data for VC, FVC, and FEV1 in the validation study was -0.201 L (95% LOA 0.62 and -1.022 L), -0.262 L (95% LOA 0.582 and -1.106 L), and -0.174 L (95% LOA 0.576 and -0.923 L), respectively, suggesting that the estimated data in the validation study did not have high accuracy. Further studies are needed to generate more accurate regression equations for spirometric indices based on FOT measurements.
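
    The bias and 95% limits of agreement reported in this abstract come from Bland-Altman analysis, which can be sketched as follows. The sign convention (estimated minus actual) is an assumption, since the abstract does not state it, and the volumes are hypothetical.

```python
import statistics


def bland_altman(actual, estimated):
    """Return (bias, lower LOA, upper LOA) for paired measurements."""
    # Assumption: differences are taken as estimated minus actual.
    diffs = [e - a for a, e in zip(actual, estimated)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical vital capacities in litres, for illustration only.
actual_vc = [3.10, 2.45, 4.02, 3.56, 2.88, 3.74]
estimated_vc = [3.25, 2.30, 3.80, 3.70, 3.05, 3.60]
bias, lower, upper = bland_altman(actual_vc, estimated_vc)
print(round(bias, 3), round(lower, 3), round(upper, 3))
```

    A high correlation between two methods does not by itself imply agreement, which is why the abstract reports the limits alongside r: limits close to ±1 L are wide relative to typical spirometric volumes.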

  19. Neural network uncertainty assessment using Bayesian statistics: a remote sensing application

    NASA Technical Reports Server (NTRS)

    Aires, F.; Prigent, C.; Rossow, W. B.

    2004-01-01

    Neural network (NN) techniques have proved successful for many regression problems, in particular for remote sensing; however, uncertainty estimates are rarely provided. In this article, a Bayesian technique to evaluate uncertainties of the NN parameters (i.e., synaptic weights) is first presented. In contrast to more traditional approaches based on point estimation of the NN weights, we assess uncertainties on such estimates to monitor the robustness of the NN model. These theoretical developments are illustrated by applying them to the problem of retrieving surface skin temperature, microwave surface emissivities, and integrated water vapor content from a combined analysis of satellite microwave and infrared observations over land. The weight uncertainty estimates are then used to compute analytically the uncertainties in the network outputs (i.e., error bars and correlation structure of these errors). Such quantities are very important for evaluating any application of an NN model. The uncertainties on the NN Jacobians are then considered in the third part of this article. Used for regression fitting, NN models can be used effectively to represent highly nonlinear, multivariate functions. In this situation, most emphasis is put on estimating the output errors, but almost no attention has been given to errors associated with the internal structure of the regression model. The complex structure of dependency inside the NN is the essence of the model, and assessing its quality, coherency, and physical character makes all the difference between a blackbox model with small output errors and a reliable, robust, and physically coherent model. Such dependency structures are described to the first order by the NN Jacobians: they indicate the sensitivity of one output with respect to the inputs of the model for given input data. We use a Monte Carlo integration procedure to estimate the robustness of the NN Jacobians. 
A regularization strategy based on principal component analysis is proposed to suppress the multicollinearities in order to make these Jacobians robust and physically meaningful.

  20. On the degrees of freedom of reduced-rank estimators in multivariate regression

    PubMed Central

    Mukherjee, A.; Chen, K.; Wang, N.; Zhu, J.

    2015-01-01

    We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example. PMID:26702155

  1. Computational Visual Stress Level Analysis of Calcareous Algae Exposed to Sedimentation

    PubMed Central

    Nilssen, Ingunn; Eide, Ingvar; de Oliveira Figueiredo, Marcia Abreu; de Souza Tâmega, Frederico Tapajós; Nattkemper, Tim W.

    2016-01-01

    This paper presents a machine learning based approach for analyses of photos collected from laboratory experiments conducted to assess the potential impact of water-based drill cuttings on deep-water rhodolith-forming calcareous algae. This pilot study uses imaging technology to quantify and monitor the stress levels of the calcareous algae Mesophyllum engelhartii (Foslie) Adey caused by various degrees of light exposure, flow intensity and amount of sediment. A machine learning based algorithm was applied to assess the temporal variation of the calcareous algae size (∼ mass) and color automatically. Measured size and color were correlated to the photosynthetic efficiency (maximum quantum yield of charge separation in photosystem II, ΦPSIImax) and degree of sediment coverage using multivariate regression. The multivariate regression showed correlations between time and calcareous algae sizes, as well as correlations between fluorescence and calcareous algae colors. PMID:27285611

  2. Factors affecting the outcome of excimer laser photorefractive keratectomy: a preliminary multivariable regression analysis

    NASA Astrophysics Data System (ADS)

    Maguen, Ezra I.; Papaioannou, Thanassis; Nesburn, Anthony B.; Salz, James J.; Warren, Cathy; Grundfest, Warren S.

    1996-05-01

    Multivariable regression analysis was used to evaluate the combined effects of some preoperative and operative variables on the change of refraction following excimer laser photorefractive keratectomy for myopia (PRK). This analysis was performed on 152 eyes (at 6 months postoperatively) and 156 eyes (at 12 months postoperatively). The following variables were considered: intended refractive correction, patient age, treatment zone, central corneal thickness, average corneal curvature, and intraocular pressure. At 6 months after surgery, the cumulative R2 was 0.43 with 0.38 attributed to the intended correction and 0.06 attributed to the preoperative corneal curvature. At 12 months, the cumulative R2 was 0.37 where 0.33 was attributed to the intended correction, 0.02 to the preoperative corneal curvature, and 0.01 to both preoperative corneal thickness and to the patient age. Further model augmentation is necessary to account for the remaining variability and the behavior of the residuals.

  3. Specific prognostic factors for secondary pancreatic infection in severe acute pancreatitis.

    PubMed

    Armengol-Carrasco, M; Oller, B; Escudero, L E; Roca, J; Gener, J; Rodríguez, N; del Moral, P; Moreno, P

    1999-01-01

    The aim of the present study was to investigate whether there are specific prognostic factors to predict the development of secondary pancreatic infection (SPI) in severe acute pancreatitis in order to perform a computed tomography-fine needle aspiration with bacteriological sampling at the right moment and confirm the diagnosis. Twenty-five clinical and laboratory parameters were determined sequentially in 150 patients with severe acute pancreatitis (SAP) and univariate, and multivariate regression analyses were done looking for correlation with the development of SPI. Only APACHE II score and C-reactive protein levels were related to the development of SPI in the multivariate analysis. A regression equation was designed using these two parameters, and empiric cut-off points defined the subgroup of patients at high risk of developing secondary pancreatic infection. The results showed that it is possible to predict SPI during SAP allowing bacteriological confirmation and early treatment of this severe condition.

  4. Compulsive buying: Earlier illicit drug use, impulse buying, depression, and adult ADHD symptoms.

    PubMed

    Brook, Judith S; Zhang, Chenshu; Brook, David W; Leukefeld, Carl G

    2015-08-30

    This longitudinal study examined the association between psychosocial antecedents, including illicit drug use, and adult compulsive buying (CB) across a 29-year time period from mean age 14 to mean age 43. Participants originally came from a community-based random sample of residents in two upstate New York counties. Multivariate linear regression analysis was used to study the relationship between the participant's earlier psychosocial antecedents and adult CB in the fifth decade of life. The results of the multivariate linear regression analyses showed that gender (female), earlier adult impulse buying (IB), depressive mood, illicit drug use, and concurrent ADHD symptoms were all significantly associated with adult CB at mean age 43. Clinicians treating CB in adults should consider the role of drug use, symptoms of ADHD, IB, depression, and family factors in CB.

  5. Compulsive Buying: Earlier Illicit Drug Use, Impulse Buying, Depression, and Adult ADHD Symptoms

    PubMed Central

    Brook, Judith S.; Zhang, Chenshu; Brook, David W.; Leukefeld, Carl G.

    2015-01-01

    This longitudinal study examined the association between psychosocial antecedents, including illicit drug use, and adult compulsive buying (CB) across a 29-year time period from mean age 14 to mean age 43. Participants originally came from a community-based random sample of residents in two upstate New York counties. Multivariate linear regression analysis was used to study the relationship between the participant’s earlier psychosocial antecedents and adult CB in the fifth decade of life. The results of the multivariate linear regression analyses showed that gender (female), earlier adult impulse buying (IB), depressive mood, illicit drug use, and concurrent ADHD symptoms were all significantly associated with adult CB at mean age 43. Clinicians treating CB in adults should consider the role of drug use, symptoms of ADHD, IB, depression, and family factors in CB. PMID:26165963

  6. Self-reported mental health among US military personnel prior and subsequent to the terrorist attacks of September 11, 2001.

    PubMed

    Smith, Tyler C; Smith, Besa; Corbeil, Thomas E; Riddle, James R; Ryan, Margaret A K

    2004-08-01

    There is much concern over the potential for short- and long-term adverse mental health effects caused by the terrorist attacks on September 11, 2001. This analysis used data from the Millennium Cohort Study to identify subgroups of US military members who enrolled in the cohort and reported their mental health status before the traumatic events of September 11 and soon after September 11. While adjusting for confounding, multivariable logistic regression, analysis of variance, and multivariate ordinal, or polychotomous logistic regression were used to compare 18 self-reported mental health measures in US military members who enrolled in the cohort before September 11, 2001 with those military personnel who enrolled after September 11, 2001. In contrast to studies of other populations, military respondents reported fewer mental health problems in the months immediately after September 11, 2001.

  7. Development and validation of multivariate calibration methods for simultaneous estimation of Paracetamol, Enalapril maleate and hydrochlorothiazide in pharmaceutical dosage form

    NASA Astrophysics Data System (ADS)

    Singh, Veena D.; Daharwal, Sanjay J.

    2017-01-01

    Three multivariate calibration spectrophotometric methods were developed for the simultaneous estimation of Paracetamol (PARA), Enalapril maleate (ENM) and Hydrochlorothiazide (HCTZ) in tablet dosage form, namely multi-linear regression calibration (MLRC), trilinear regression calibration (TLRC) and classical least squares (CLS). The selectivity of the proposed methods was studied by analyzing a laboratory-prepared ternary mixture, and the methods were successfully applied to the combined dosage form. The proposed methods were validated as per ICH guidelines, and good accuracy, precision and specificity were confirmed within the concentration ranges of 5-35 μg mL⁻¹, 5-40 μg mL⁻¹ and 5-40 μg mL⁻¹ for PARA, HCTZ and ENM, respectively. The results were statistically compared with a reported HPLC method. Thus, the proposed methods can be used effectively for routine quality control analysis of these drugs in commercial tablet dosage form.

  8. Method for enhanced accuracy in predicting peptides using liquid separations or chromatography

    DOEpatents

    Kangas, Lars J.; Auberry, Kenneth J.; Anderson, Gordon A.; Smith, Richard D.

    2006-11-14

    A method for predicting the elution time of a peptide in chromatographic and electrophoretic separations by first providing a data set of known elution times of known peptides, then creating a plurality of vectors, each vector having a plurality of dimensions, and each dimension representing the elution time of amino acids present in each of these known peptides from the data set. The elution time of any protein is then predicted by first creating a vector by assigning dimensional values for the elution time of amino acids of at least one hypothetical peptide and then calculating a predicted elution time for the vector by performing a multivariate regression of the dimensional values of the hypothetical peptide using the dimensional values of the known peptides. Preferably, the multivariate regression is accomplished by the use of an artificial neural network and the elution times are first normalized using a transfer function.

  9. [Multivariate Adaptive Regression Splines (MARS), an alternative for the analysis of time series].

    PubMed

    Vanegas, Jairo; Vásquez, Fabián

    Multivariate Adaptive Regression Splines (MARS) is a non-parametric modelling method that extends the linear model, incorporating nonlinearities and interactions between variables. It is a flexible tool that automates the construction of predictive models: selecting relevant variables, transforming the predictor variables, processing missing values and preventing overshooting using a self-test. It is also able to predict, taking into account structural factors that might influence the outcome variable, thereby generating hypothetical models. The end result could identify relevant cut-off points in data series. It is rarely used in health, so it is proposed as a tool for the evaluation of relevant public health indicators. For demonstrative purposes, data series regarding the mortality of children under 5 years of age in Costa Rica were used, comprising the period 1978-2008.

  10. Generation of multivariate near shore extreme wave conditions based on an extreme value copula for offshore boundary conditions.

    NASA Astrophysics Data System (ADS)

    Leyssen, Gert; Mercelis, Peter; De Schoesitter, Philippe; Blanckaert, Joris

    2013-04-01

    Near shore extreme wave conditions, used as input for numerical wave agitation simulations and for the dimensioning of coastal defense structures, need to be determined at a harbour entrance situated at the French North Sea coast. To obtain significant wave heights, the numerical wave model SWAN has been used. A multivariate approach was used to account for the joint probabilities. Considered variables are: wind velocity and direction, water level and significant offshore wave height and wave period. In a first step a univariate extreme value distribution has been determined for the main variables. By means of a technique based on the mean excess function, an appropriate member of the GPD is selected. An optimal threshold for peak over threshold selection is determined by maximum likelihood optimization. Next, the joint dependency structure for the primary random variables is modeled by an extreme value copula. Eventually the multivariate domain of variables was stratified in different classes, each of which representing a combination of variable quantiles with a joint probability, which are used for model simulation. The main variable is the wind velocity, as in the area of concern extreme wave conditions are wind driven. The analysis is repeated for 9 different wind directions. The secondary variable is water level. In shallow waters extreme waves will be directly affected by water depth. Hence the joint probability of occurrence for water level and wave height is of major importance for design of coastal defense structures. Wind velocity and water levels are only dependent for some wind directions (wind induced setup). Dependent directions are detected using a Kendall and Spearman test and appeared to be those with the longest fetch. For these directions, wind velocity and water level extreme value distributions are multivariately linked through a Gumbel Copula. 
These distributions are stratified into classes of which the frequency of occurrence can be calculated. For the remaining directions the univariate extreme wind velocity distribution is stratified, each class combined with 5 high water levels. The wave height at the model boundaries was taken into account by a regression with the extreme wind velocity at the offshore location. The regression line and the 95% confidence limits were combined with each class. Eventually the wave period is computed by a new regression with the significant wave height. This way 1103 synthetic events were selected and simulated with the SWAN wave model, and a frequency of occurrence was calculated for each of them. Hence near shore significant wave heights are obtained with corresponding frequencies. The statistical distribution of the near shore wave heights is determined by sorting the model results in descending order and accumulating the corresponding frequencies. This approach allows determination of conditional return periods. For example, for the imposed univariate design return periods of 100 years for significant wave height and 30 years for water level, the joint return period for a simultaneous exceedance of both conditions can be computed as 4000 years. Hence, this methodology allows for a probabilistic design of coastal defense structures.

  11. Sampling effort affects multivariate comparisons of stream assemblages

    USGS Publications Warehouse

    Cao, Y.; Larsen, D.P.; Hughes, R.M.; Angermeier, P.L.; Patton, T.M.

    2002-01-01

    Multivariate analyses are used widely for determining patterns of assemblage structure, inferring species-environment relationships and assessing human impacts on ecosystems. The estimation of ecological patterns often depends on sampling effort, so the degree to which sampling effort affects the outcome of multivariate analyses is a concern. We examined the effect of sampling effort on site and group separation, which was measured using a mean similarity method. Two similarity measures, the Jaccard Coefficient and Bray-Curtis Index were investigated with 1 benthic macroinvertebrate and 2 fish data sets. Site separation was significantly improved with increased sampling effort because the similarity between replicate samples of a site increased more rapidly than between sites. Similarly, the faster increase in similarity between sites of the same group than between sites of different groups caused clearer separation between groups. The strength of site and group separation completely stabilized only when the mean similarity between replicates reached 1. These results are applicable to commonly used multivariate techniques such as cluster analysis and ordination because these multivariate techniques start with a similarity matrix. Completely stable outcomes of multivariate analyses are not feasible. Instead, we suggest 2 criteria for estimating the stability of multivariate analyses of assemblage data: 1) mean within-site similarity across all sites compared, indicating sample representativeness, and 2) the SD of within-site similarity across sites, measuring sample comparability.
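
    The two similarity measures named in this abstract can be sketched as below. The Jaccard Coefficient is computed on presence/absence and the Bray-Curtis Index is written in its similarity form (one minus the dissimilarity); whether the authors transformed abundances first is not stated, so raw counts are an assumption, and the counts are hypothetical.

```python
def jaccard(sample_a, sample_b):
    """Shared taxa divided by all taxa present in either sample."""
    present_a = {i for i, count in enumerate(sample_a) if count > 0}
    present_b = {i for i, count in enumerate(sample_b) if count > 0}
    union = present_a | present_b
    if not union:
        return 1.0  # two empty samples are treated as identical
    return len(present_a & present_b) / len(union)


def bray_curtis_similarity(sample_a, sample_b):
    """1 - sum|a_i - b_i| / sum(a_i + b_i) on abundance counts."""
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        return 1.0
    difference = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    return 1.0 - difference / total


# Hypothetical counts of four taxa in two replicate samples of one site.
replicate_1 = [10, 0, 5, 3]
replicate_2 = [8, 2, 0, 3]
print(jaccard(replicate_1, replicate_2))
print(round(bray_curtis_similarity(replicate_1, replicate_2), 3))
```

    Averaging such pairwise values over the replicates within each site gives the mean within-site similarity that the abstract proposes as a stability criterion.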

  12. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
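
    The idea of fitting a logistic regression classifier on the ground and then evaluating it as a trigger can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two standardized features, the labels, the plain gradient-descent fit and the 0.5 threshold are hypothetical and are not taken from the paper or from any flight software.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, lr=0.5, steps=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def should_trigger(weights, bias, x, threshold=0.5):
    """Cheap online evaluation: one dot product and one comparison."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z) >= threshold


# Hypothetical standardized sensor features; label 1 means "deploy".
samples = [(-2.0, -1.0), (-1.0, -1.5), (-1.5, 0.2), (-0.5, -0.8),
           (0.5, 0.7), (1.0, 1.5), (2.0, 0.3), (1.2, -0.1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
weights, bias = train_logistic(samples, labels)
print(should_trigger(weights, bias, (1.0, 0.5)))
```

    The split mirrors the abstract: the expensive fitting step runs offline, while the runtime check reduces to evaluating a fixed linear score.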

  13. Retrieval of total suspended matter concentrations from high resolution WorldView-2 imagery: a case study of inland rivers

    NASA Astrophysics Data System (ADS)

    Shi, Liangliang; Mao, Zhihua; Wang, Zheng

    2018-02-01

    Satellite imagery plays an important role in monitoring the water quality of lakes and coastal waters, but it has scarcely been applied to inland rivers. This paper assesses the feasibility of applying regression models to quantify and map the concentration of total suspended matter (CTSM) in inland rivers, which span a large spatial scale and have a high CTSM dynamic range, using high-resolution WorldView-2 satellite remote sensing data. An empirical approach is used to quantify CTSM through the integrated use of high-resolution WorldView-2 multispectral data and 21 in situ CTSM measurements. Radiometric, geometric, and atmospheric corrections are carried out during image processing to derive the surface reflectance, which is then correlated with CTSM using single-variable and multivariable regression techniques. The regression results show that the single near-infrared (NIR) band 8 of WorldView-2 has a relatively strong relationship (R2=0.93) with CTSM. Different prediction models were developed from various combinations of WorldView-2 bands, and the Akaike Information Criterion was used to choose the best model. The model involving bands 1, 3, 5, and 8 of WorldView-2 performed best, with an R2 of 0.92 and an SEE of 53.30 g/m3. Spatial distribution maps were produced using the best multiple regression model. The results indicate that it is feasible to apply the empirical model with high-resolution satellite imagery to retrieve the CTSM of inland rivers in routine water quality monitoring.
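
    The model-selection step mentioned in this abstract can be illustrated with a short sketch. It assumes the common least-squares form of the Akaike criterion, AIC = n ln(RSS/n) + 2k up to an additive constant, which the abstract does not spell out. Only n = 21 comes from the abstract; the residual sums of squares below are invented, and `aic_least_squares` is a hypothetical helper name.

```python
import math


def aic_least_squares(rss, n, k):
    """AIC for a Gaussian least-squares fit, up to an additive constant.

    rss: residual sum of squares
    n:   number of observations
    k:   number of fitted coefficients, including the intercept
    """
    return n * math.log(rss / n) + 2 * k


# Hypothetical candidates: (label, residual sum of squares, coefficient count).
# The RSS values are made up for illustration; only n = 21 is from the abstract.
n = 21
candidates = [
    ("band 8 only", 60000.0, 2),
    ("bands 3 + 8", 57000.0, 3),
    ("bands 1 + 3 + 5 + 8", 30000.0, 5),
]

scores = {name: aic_least_squares(rss, n, k) for name, rss, k in candidates}
best = min(scores, key=scores.get)      # lowest AIC is preferred
```

    With these invented numbers, adding band 3 alone lowers the RSS slightly but raises the AIC, because the 2k penalty outweighs the small gain in fit; the four-band model wins only because its RSS drops substantially.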

  14. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  15. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter. In order to increase overall robustness, the vehicle also has an alternate method of triggering the drogue parachute deployment based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this velocity-based trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers excellent performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  16. The Association Between Internet Use and Ambulatory Care-Seeking Behaviors in Taiwan: A Cross-Sectional Study

    PubMed Central

    Chen, Tsung-Fu; Liang, Jyh-Chong; Lin, Tzu-Bin; Tsai, Chin-Chung

    2016-01-01

    Background: Compared with the traditional ways of gaining health-related information from newspapers, magazines, radio, and television, the Internet is inexpensive, accessible, and conveys diverse opinions. Several studies on how increasing Internet use affected outpatient clinic visits were inconclusive. Objective: The objective of this study was to examine the role of Internet use on ambulatory care-seeking behaviors as indicated by the number of outpatient clinic visits after adjusting for confounding variables. Methods: We conducted this study using a sample randomly selected from the general population in Taiwan. To handle the missing data, we built a multivariate logistic regression model for propensity score matching using age and sex as the independent variables. The questionnaires with no missing data were then included in a multivariate linear regression model for examining the association between Internet use and outpatient clinic visits. Results: We included a sample of 293 participants who answered the questionnaire with no missing data in the multivariate linear regression model. We found that Internet use was significantly associated with more outpatient clinic visits (P=.04). The participants with chronic diseases tended to make more outpatient clinic visits (P<.01). Conclusions: The inconsistent quality of health-related information obtained from the Internet may be associated with patients’ increasing need for interpreting and discussing the information with health care professionals, thus resulting in an increasing number of outpatient clinic visits. In addition, the media literacy of Web-based health-related information seekers may also affect their ambulatory care-seeking behaviors, such as outpatient clinic visits. PMID:27927606

  17. Landscape controls on total and methyl Hg in the Upper Hudson River basin, New York, USA

    USGS Publications Warehouse

    Burns, Douglas A.; Riva-Murray, K.; Bradley, P.M.; Aiken, G.R.; Brigham, M.E.

    2012-01-01

    Approaches are needed to better predict spatial variation in riverine Hg concentrations across heterogeneous landscapes that include mountains, wetlands, and open waters. We applied multivariate linear regression to determine the landscape factors and chemical variables that best account for the spatial variation of total Hg (THg) and methyl Hg (MeHg) concentrations in 27 sub-basins across the 493 km2 upper Hudson River basin in the Adirondack Mountains of New York. THg concentrations varied by sixfold, and those of MeHg by 40-fold in synoptic samples collected at low-to-moderate flow, during spring and summer of 2006 and 2008. Bivariate linear regression relations of THg and MeHg concentrations with either percent wetland area or DOC concentrations were significant but could account for only about 1/3 of the variation in these Hg forms in summer. In contrast, multivariate linear regression relations that included metrics of (1) hydrogeomorphology, (2) riparian/wetland area, and (3) open water, explained about 66% to >90% of spatial variation in each Hg form in spring and summer samples. These metrics reflect the influence of basin morphometry and riparian soils on Hg source and transport, and the role of open water as a Hg sink. Multivariate models based solely on these landscape metrics generally accounted for as much or more of the variation in Hg concentrations than models based on chemical and physical metrics, and show great promise for identifying waters with expected high Hg concentrations in the Adirondack region and similar glaciated riverine ecosystems.

  18. The combination of ovarian volume and outline has better diagnostic accuracy than prostate-specific antigen (PSA) concentrations in women with polycystic ovarian syndrome (PCOs).

    PubMed

    Bili, Eleni; Dampala, Kaliopi; Iakovou, Ioannis; Tsolakidis, Dimitrios; Giannakou, Anastasia; Tarlatzis, Basil C

    2014-08-01

    The aim of this study was to determine the performance of prostate-specific antigen (PSA) and ultrasound parameters, such as ovarian volume and outline, in the diagnosis of polycystic ovary syndrome (PCOS). This prospective, observational, case-controlled study included 43 women with PCOS and 40 controls. Between day 3 and 5 of the menstrual cycle, fasting serum samples were collected and transvaginal ultrasound was performed. The diagnostic performance of each parameter [total PSA (tPSA), total-to-free PSA ratio (tPSA:fPSA), ovarian volume, ovarian outline] was estimated by means of receiver operating characteristic (ROC) analysis, along with area under the curve (AUC), threshold, sensitivity, specificity as well as positive (+) and negative (-) likelihood ratios (LRs). Multivariate logistic regression models, using ovarian volume and ovarian outline, were constructed. The tPSA and tPSA:fPSA ratio resulted in AUC of 0.74 and 0.70, respectively, with moderate specificity/sensitivity and insufficient LR+/- values. In the multivariate logistic regression model, the combination of ovarian volume and outline had a sensitivity of 97.7% and a specificity of 97.5% in the diagnosis of PCOS, with +LR and -LR values of 39.1 and 0.02, respectively. In women with PCOS, tPSA and tPSA:fPSA ratio have similar diagnostic performance. The use of a multivariate logistic regression model, incorporating ovarian volume and outline, offers very good diagnostic accuracy in distinguishing women with PCOS from controls. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
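
    The likelihood ratios quoted in this abstract follow from the reported sensitivity and specificity by the standard definitions, +LR = sensitivity / (1 - specificity) and -LR = (1 - sensitivity) / specificity. The short check below uses only the two percentages given in the abstract; the function name `likelihood_ratios` is a hypothetical helper.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (+LR, -LR) from sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Values reported for the combined ovarian volume + outline model.
lr_pos, lr_neg = likelihood_ratios(0.977, 0.975)
```

    Rounded to the precision used in the abstract, the two values agree with the reported 39.1 and 0.02.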

  19. [Multivariate ordinal logistic regression analysis on the association between consumption of fried food and both esophageal cancer and precancerous lesions].

    PubMed

    Guo, L W; Liu, S Z; Zhang, M; Chen, Q; Zhang, S K; Sun, X B

    2017-12-10

    Objective: To investigate the effect of fried food intake on the pathogenesis of esophageal cancer and precancerous lesions. Methods: From 2005 to 2013, all residents aged 40-69 years from 11 counties (cities) in rural areas of Henan province where screening for upper gastrointestinal cancer had been conducted were recruited as study subjects. Information on demography and lifestyle was collected. The residents under study were screened by endoscopic examination with iodine staining, and biopsy samples were diagnosed pathologically under standardized criteria. High-risk subjects were divided into groups according to pathological grade. Multivariate ordinal logistic regression analysis was used to analyze the relationship between the frequency of fried food intake and esophageal cancer and precancerous lesions. Results: A total of 8,792 subjects with normal esophagus, 3,680 with mild hyperplasia, 972 with moderate hyperplasia, 413 with severe hyperplasia/carcinoma in situ, and 336 with esophageal cancer were recruited. Multivariate logistic regression analysis showed that, compared with not eating fried food, the intake of fried food (<2 times/week: OR=1.60, 95% CI: 1.40-1.83; ≥2 times/week: OR=2.58, 95% CI: 1.98-3.37) appeared to be a risk factor for both esophageal cancer and precancerous lesions after adjustment for age, sex, marital status, educational level, body mass index, smoking and alcohol intake. Conclusion: The intake of fried food appeared to be a risk factor for both esophageal cancer and precancerous lesions.
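
    As a reading aid for the odds ratios above, the sketch below checks that each reported interval is consistent with a Wald-type interval that is symmetric on the log-odds scale. That form is an assumption, since the abstract does not say how the intervals were computed, and `implied_log_se` and `ci_from_or` are hypothetical helper names. Under this assumption the odds ratio sits at the geometric midpoint of its interval.

```python
import math


def implied_log_se(lower, upper, z=1.96):
    """Standard error of ln(OR) implied by a symmetric 95% interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def ci_from_or(odds_ratio, log_se, z=1.96):
    """Rebuild the interval exp(ln(OR) -/+ z * SE) from an odds ratio and its log-scale SE."""
    centre = math.log(odds_ratio)
    return math.exp(centre - z * log_se), math.exp(centre + z * log_se)


# Figures reported in the abstract: (OR, lower bound, upper bound).
reported = {
    "<2 times/week": (1.60, 1.40, 1.83),
    ">=2 times/week": (2.58, 1.98, 3.37),
}

rebuilt = {}
for label, (odds_ratio, lower, upper) in reported.items():
    se = implied_log_se(lower, upper)
    rebuilt[label] = ci_from_or(odds_ratio, se)
```

    Both rebuilt intervals match the reported bounds to two decimal places, so the published figures are at least internally consistent with this interval form.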

  20. Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

    PubMed

    Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

    2017-01-01

    The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. The proposed method allows us not only to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM with two integrative genomics analysis examples, a breast cancer dataset and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays and to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.
