Correlative and multivariate analysis of increased radon concentration in underground laboratory.
Maletić, Dimitrije M; Udovičić, Vladimir I; Banjanac, Radomir M; Joković, Dejan R; Dragić, Aleksandar L; Veselinović, Nikola B; Filipović, Jelena
2014-11-01
The results of analysis using correlative and multivariate methods, as developed for data analysis in high-energy physics and implemented in the Toolkit for Multivariate Analysis software package, of the relations of the variation of increased radon concentration with climate variables in shallow underground laboratory is presented. Multivariate regression analysis identified a number of multivariate methods which can give a good evaluation of increased radon concentrations based on climate variables. The use of the multivariate regression methods will enable the investigation of the relations of specific climate variable with increased radon concentrations by analysis of regression methods resulting in 'mapped' underlying functional behaviour of radon concentrations depending on a wide spectrum of climate variables. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C
2018-06-29
A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran
2018-03-01
This paper uses multivariate regression to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran, using multivariate regression for mineral prospectivity mapping (MPM). The main target of this paper is to apply multivariate regression analysis (as an MPM method) to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the results of the probability value (p value), coefficient of determination value (R2) and adjusted determination coefficient (Radj2), the second regression model (which consistent of multiple UIVs) fitted better than other models. The accuracy of the model was confirmed by iron outcrops map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery.
Liu, Han; Wang, Lie; Zhao, Tuo
2015-08-01
We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence O (1/ ϵ ), where ϵ is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.
Jackson, Dan; White, Ian R; Riley, Richard D
2013-01-01
Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
USDA-ARS?s Scientific Manuscript database
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...
Multivariate Boosting for Integrative Analysis of High-Dimensional Cancer Genomic Data
Xiong, Lie; Kuan, Pei-Fen; Tian, Jianan; Keles, Sunduz; Wang, Sijian
2015-01-01
In this paper, we propose a novel multivariate component-wise boosting method for fitting multivariate response regression models under the high-dimension, low sample size setting. Our method is motivated by modeling the association among different biological molecules based on multiple types of high-dimensional genomic data. Particularly, we are interested in two applications: studying the influence of DNA copy number alterations on RNA transcript levels and investigating the association between DNA methylation and gene expression. For this purpose, we model the dependence of the RNA expression levels on DNA copy number alterations and the dependence of gene expression on DNA methylation through multivariate regression models and utilize boosting-type method to handle the high dimensionality as well as model the possible nonlinear associations. The performance of the proposed method is demonstrated through simulation studies. Finally, our multivariate boosting method is applied to two breast cancer studies. PMID:26609213
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
NASA Astrophysics Data System (ADS)
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
From direct observations, facial, vocal, gestural, physiological, and central nervous signals, estimating human affective states through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural network, have been proposed in the past decade. In these models, linear models are generally lack of precision because of ignoring intrinsic nonlinearities of complex psychophysiological processes; and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify model, we introduce a new computational modeling method named as higher-order multivariable polynomial regression to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method is able to obtain efficient correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide certain indirect evidences that valence and arousal have their brain’s motivational circuit origins. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-01-01
From direct observations, facial, vocal, gestural, physiological, and central nervous signals, estimating human affective states through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural network, have been proposed in the past decade. In these models, linear models are generally lack of precision because of ignoring intrinsic nonlinearities of complex psychophysiological processes; and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify model, we introduce a new computational modeling method named as higher-order multivariable polynomial regression to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method is able to obtain efficient correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide certain indirect evidences that valence and arousal have their brain’s motivational circuit origins. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states. PMID:26996254
Warton, David I; Thibaut, Loïc; Wang, Yi Alice
2017-01-01
Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)-common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
Thibaut, Loïc; Wang, Yi Alice
2017-01-01
Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)—common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of “model-free bootstrap”, adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods. PMID:28738071
Flood-frequency prediction methods for unregulated streams of Tennessee, 2000
Law, George S.; Tasker, Gary D.
2003-01-01
Up-to-date flood-frequency prediction methods for unregulated, ungaged rivers and streams of Tennessee have been developed. Prediction methods include the regional-regression method and the newer region-of-influence method. The prediction methods were developed using stream-gage records from unregulated streams draining basins having from 1 percent to about 30 percent total impervious area. These methods, however, should not be used in heavily developed or storm-sewered basins with impervious areas greater than 10 percent. The methods can be used to estimate 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence-interval floods of most unregulated rural streams in Tennessee. A computer application was developed that automates the calculation of flood frequency for unregulated, ungaged rivers and streams of Tennessee. Regional-regression equations were derived by using both single-variable and multivariable regional-regression analysis. Contributing drainage area is the explanatory variable used in the single-variable equations. Contributing drainage area, main-channel slope, and a climate factor are the explanatory variables used in the multivariable equations. Deleted-residual standard error for the single-variable equations ranged from 32 to 65 percent. Deleted-residual standard error for the multivariable equations ranged from 31 to 63 percent. These equations are included in the computer application to allow easy comparison of results produced by the different methods. The region-of-influence method calculates multivariable regression equations for each ungaged site and recurrence interval using basin characteristics from 60 similar sites selected from the study area. Explanatory variables that may be used in regression equations computed by the region-of-influence method include contributing drainage area, main-channel slope, a climate factor, and a physiographic-region factor. Deleted-residual standard error for the region-of-influence method tended to be only slightly smaller than those for the regional-regression method and ranged from 27 to 62 percent.
Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan
2012-01-01
Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. Firstly, the local polynomial fitting is applied to estimate heteroscedastic function, then the coefficients of regression model are obtained by using generalized least squares method. One noteworthy feature of our approach is that we avoid the testing for heteroscedasticity by improving the traditional two-stage method. Due to non-parametric technique of local polynomial estimation, it is unnecessary to know the form of heteroscedastic function. Therefore, we can improve the estimation precision, when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients is asymptotic normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is surely effective in finite-sample situations.
Levine, Matthew E; Albers, David J; Hripcsak, George
2016-01-01
Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.
A refined method for multivariate meta-analysis and meta-regression.
Jackson, Daniel; Riley, Richard D
2014-02-20
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.
2018-05-01
Bayesian method is a method that can be used to estimate the parameters of multivariate multiple regression model. Bayesian method has two distributions, there are prior and posterior distributions. Posterior distribution is influenced by the selection of prior distribution. Jeffreys’ prior distribution is a kind of Non-informative prior distribution. This prior is used when the information about parameter not available. Non-informative Jeffreys’ prior distribution is combined with the sample information resulting the posterior distribution. Posterior distribution is used to estimate the parameter. The purposes of this research is to estimate the parameters of multivariate regression model using Bayesian method with Non-informative Jeffreys’ prior distribution. Based on the results and discussion, parameter estimation of β and Σ which were obtained from expected value of random variable of marginal posterior distribution function. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart. However, in calculation of the expected value involving integral of a function which difficult to determine the value. Therefore, approach is needed by generating of random samples according to the posterior distribution characteristics of each parameter using Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman
2011-01-01
This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
Characterizing multivariate decoding models based on correlated EEG spectral features
McFarland, Dennis J.
2013-01-01
Objective Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. PMID:23466267
A refined method for multivariate meta-analysis and meta-regression
Jackson, Daniel; Riley, Richard D
2014-01-01
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects’ standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:23996351
Dinç, Erdal; Ozdemir, Abdil
2005-01-01
Multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of multivariate chromatographic calibration technique is based on the use of the linear regression equations constructed using relationship between concentration and peak area at the five-wavelength set. The algorithm of this mathematical calibration model having a simple mathematical content was briefly described. This approach is a powerful mathematical tool for an optimum chromatographic multivariate calibration and elimination of fluctuations coming from instrumental and experimental conditions. This multivariate chromatographic calibration contains reduction of multivariate linear regression functions to univariate data set. The validation of model was carried out by analyzing various synthetic binary mixtures and using the standard addition technique. Developed calibration technique was applied to the analysis of the real pharmaceutical tablets containing EA and HCT. The obtained results were compared with those obtained by classical HPLC method. It was observed that the proposed multivariate chromatographic calibration gives better results than classical HPLC.
NASA Astrophysics Data System (ADS)
Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.
2009-08-01
In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data.
Abram, Samantha V; Helwig, Nathaniel E; Moodie, Craig A; DeYoung, Colin G; MacDonald, Angus W; Waller, Niels G
2016-01-01
Recent advances in fMRI research highlight the use of multivariate methods for examining whole-brain connectivity. Complementary data-driven methods are needed for determining the subset of predictors related to individual differences. Although commonly used for this purpose, ordinary least squares (OLS) regression may not be ideal due to multi-collinearity and over-fitting issues. Penalized regression is a promising and underutilized alternative to OLS regression. In this paper, we propose a nonparametric bootstrap quantile (QNT) approach for variable selection with neuroimaging data. We use real and simulated data, as well as annotated R code, to demonstrate the benefits of our proposed method. Our results illustrate the practical potential of our proposed bootstrap QNT approach. Our real data example demonstrates how our method can be used to relate individual differences in neural network connectivity with an externalizing personality measure. Also, our simulation results reveal that the QNT method is effective under a variety of data conditions. Penalized regression yields more stable estimates and sparser models than OLS regression in situations with large numbers of highly correlated neural predictors. Our results demonstrate that penalized regression is a promising method for examining associations between neural predictors and clinically relevant traits or behaviors. These findings have important implications for the growing field of functional connectivity research, where multivariate methods produce numerous, highly correlated brain networks.
Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data
Abram, Samantha V.; Helwig, Nathaniel E.; Moodie, Craig A.; DeYoung, Colin G.; MacDonald, Angus W.; Waller, Niels G.
2016-01-01
Recent advances in fMRI research highlight the use of multivariate methods for examining whole-brain connectivity. Complementary data-driven methods are needed for determining the subset of predictors related to individual differences. Although commonly used for this purpose, ordinary least squares (OLS) regression may not be ideal due to multi-collinearity and over-fitting issues. Penalized regression is a promising and underutilized alternative to OLS regression. In this paper, we propose a nonparametric bootstrap quantile (QNT) approach for variable selection with neuroimaging data. We use real and simulated data, as well as annotated R code, to demonstrate the benefits of our proposed method. Our results illustrate the practical potential of our proposed bootstrap QNT approach. Our real data example demonstrates how our method can be used to relate individual differences in neural network connectivity with an externalizing personality measure. Also, our simulation results reveal that the QNT method is effective under a variety of data conditions. Penalized regression yields more stable estimates and sparser models than OLS regression in situations with large numbers of highly correlated neural predictors. Our results demonstrate that penalized regression is a promising method for examining associations between neural predictors and clinically relevant traits or behaviors. These findings have important implications for the growing field of functional connectivity research, where multivariate methods produce numerous, highly correlated brain networks. PMID:27516732
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.
Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman
2017-03-01
We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
Finding structure in data using multivariate tree boosting
Miller, Patrick J.; Lubke, Gitta H.; McArtor, Daniel B.; Bergeman, C. S.
2016-01-01
Technology and collaboration enable dramatic increases in the size of psychological and psychiatric data collections, but finding structure in these large data sets with many collected variables is challenging. Decision tree ensembles such as random forests (Strobl, Malley, & Tutz, 2009) are a useful tool for finding structure, but are difficult to interpret with multiple outcome variables which are often of interest in psychology. To find and interpret structure in data sets with multiple outcomes and many predictors (possibly exceeding the sample size), we introduce a multivariate extension to a decision tree ensemble method called gradient boosted regression trees (Friedman, 2001). Our extension, multivariate tree boosting, is a method for nonparametric regression that is useful for identifying important predictors, detecting predictors with nonlinear effects and interactions without specification of such effects, and for identifying predictors that cause two or more outcome variables to covary. We provide the R package ‘mvtboost’ to estimate, tune, and interpret the resulting model, which extends the implementation of univariate boosting in the R package ‘gbm’ (Ridgeway et al., 2015) to continuous, multivariate outcomes. To illustrate the approach, we analyze predictors of psychological well-being (Ryff & Keyes, 1995). Simulations verify that our approach identifies predictors with nonlinear effects and achieves high prediction accuracy, exceeding or matching the performance of (penalized) multivariate multiple regression and multivariate decision trees over a wide range of conditions. PMID:27918183
NASA Astrophysics Data System (ADS)
Bressan, Lucas P.; do Nascimento, Paulo Cícero; Schmidt, Marcella E. P.; Faccin, Henrique; de Machado, Leandro Carvalho; Bohrer, Denise
2017-02-01
A novel method was developed to determine low molecular weight polycyclic aromatic hydrocarbons in aqueous leachates from soils and sediments using a salting-out assisted liquid-liquid extraction, synchronous fluorescence spectrometry and a multivariate calibration technique. Several experimental parameters were controlled and the optimum conditions were: sodium carbonate as the salting-out agent at concentration of 2 mol L- 1, 3 mL of acetonitrile as extraction solvent, 6 mL of aqueous leachate, vortexing for 5 min and centrifuging at 4000 rpm for 5 min. The partial least squares calibration was optimized to the lowest values of root mean squared error and five latent variables were chosen for each of the targeted compounds. The regression coefficients for the true versus predicted concentrations were higher than 0.99. Figures of merit for the multivariate method were calculated, namely sensitivity, multivariate detection limit and multivariate quantification limit. The selectivity was also evaluated and other polycyclic aromatic hydrocarbons did not interfere in the analysis. Likewise, high performance liquid chromatography was used as a comparative methodology, and the regression analysis between the methods showed no statistical difference (t-test). The proposed methodology was applied to soils and sediments of a Brazilian river and the recoveries ranged from 74.3% to 105.8%. Overall, the proposed methodology was suitable for the targeted compounds, showing that the extraction method can be applied to spectrofluorometric analysis and that the multivariate calibration is also suitable for these compounds in leachates from real samples.
Nonlinear multivariate and time series analysis by neural network methods
NASA Astrophysics Data System (ADS)
Hsieh, William W.
2004-03-01
Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.
ERIC Educational Resources Information Center
Baker, Bruce D.; Richards, Craig E.
1999-01-01
Applies neural network methods for forecasting 1991-95 per-pupil expenditures in U.S. public elementary and secondary schools. Forecasting models included the National Center for Education Statistics' multivariate regression model and three neural architectures. Regarding prediction accuracy, neural network results were comparable or superior to…
Umesh P. Agarwal; Richard S. Reiner; Sally A. Ralph
2010-01-01
Two new methods based on FTâRaman spectroscopy, one simple, based on band intensity ratio, and the other using a partial least squares (PLS) regression model, are proposed to determine cellulose I crystallinity. In the simple method, crystallinity in cellulose I samples was determined based on univariate regression that was first developed using the Raman band...
The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...
Learning investment indicators through data extension
NASA Astrophysics Data System (ADS)
Dvořák, Marek
2017-07-01
Stock prices in the form of time series were analysed using single and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this single variate time series to a multivariate representation. This method makes use of sliding windows to calculate several dozen of new variables using simple statistic tools like first and second moments as well as more complicated statistic, like auto-regression coefficients and residual analysis, followed by an optional quadratic transformation that was further used for data extension. These were used as a explanatory variables in a regularized logistic LASSO regression which tried to estimate Buy-Sell Index (BSI) from real stock market data.
NASA Astrophysics Data System (ADS)
Singh, Veena D.; Daharwal, Sanjay J.
2017-01-01
Three multivariate calibration spectrophotometric methods were developed for simultaneous estimation of Paracetamol (PARA), Enalapril maleate (ENM) and Hydrochlorothiazide (HCTZ) in tablet dosage form; namely multi-linear regression calibration (MLRC), trilinear regression calibration method (TLRC) and classical least square (CLS) method. The selectivity of the proposed methods were studied by analyzing the laboratory prepared ternary mixture and successfully applied in their combined dosage form. The proposed methods were validated as per ICH guidelines and good accuracy; precision and specificity were confirmed within the concentration range of 5-35 μg mL- 1, 5-40 μg mL- 1 and 5-40 μg mL- 1of PARA, HCTZ and ENM, respectively. The results were statistically compared with reported HPLC method. Thus, the proposed methods can be effectively useful for the routine quality control analysis of these drugs in commercial tablet dosage form.
Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A
2014-08-01
Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analytes BP in D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4%, and 10.8% respectively. In contrast, the overall RMS%RE of 32.9% and 40.4%, respectively obtained for BP determination in D3710 and MA VHP using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can be potentially used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.
Louys, Julien; Meloro, Carlo; Elton, Sarah; Ditchfield, Peter; Bishop, Laura C
2015-01-01
We test the performance of two models that use mammalian communities to reconstruct multivariate palaeoenvironments. While both models exploit the correlation between mammal communities (defined in terms of functional groups) and arboreal heterogeneity, the first uses a multiple multivariate regression of community structure and arboreal heterogeneity, while the second uses a linear regression of the principal components of each ecospace. The success of these methods means the palaeoenvironment of a particular locality can be reconstructed in terms of the proportions of heavy, moderate, light, and absent tree canopy cover. The linear regression is less biased, and more precisely and accurately reconstructs heavy tree canopy cover than the multiple multivariate model. However, the multiple multivariate model performs better than the linear regression for all other canopy cover categories. Both models consistently perform better than randomly generated reconstructions. We apply both models to the palaeocommunity of the Upper Laetolil Beds, Tanzania. Our reconstructions indicate that there was very little heavy tree cover at this site (likely less than 10%), with the palaeo-landscape instead comprising a mixture of light and absent tree cover. These reconstructions help resolve the previous conflicting palaeoecological reconstructions made for this site. Copyright © 2014 Elsevier Ltd. All rights reserved.
Applications of modern statistical methods to analysis of data in physical science
NASA Astrophysics Data System (ADS)
Wicker, James Eric
Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plagues this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, several shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcomes the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.
McArtor, Daniel B.; Lubke, Gitta H.; Bergeman, C. S.
2017-01-01
Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains. PMID:27738957
McArtor, Daniel B; Lubke, Gitta H; Bergeman, C S
2017-12-01
Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains.
Quirós, Elia; Felicísimo, Angel M; Cuartero, Aurora
2009-01-01
This work proposes a new method to classify multi-spectral satellite images based on multivariate adaptive regression splines (MARS) and compares this classification system with the more common parallelepiped and maximum likelihood (ML) methods. We apply the classification methods to the land cover classification of a test zone located in southwestern Spain. The basis of the MARS method and its associated procedures are explained in detail, and the area under the ROC curve (AUC) is compared for the three methods. The results show that the MARS method provides better results than the parallelepiped method in all cases, and it provides better results than the maximum likelihood method in 13 cases out of 17. These results demonstrate that the MARS method can be used in isolation or in combination with other methods to improve the accuracy of soil cover classification. The improvement is statistically significant according to the Wilcoxon signed rank test.
NASA Astrophysics Data System (ADS)
Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.
2014-12-01
Accurate estimation of grassland biomass at their peak productivity can provide crucial information regarding the functioning and productivity of the rangelands. Hyperspectral remote sensing has proved to be valuable for estimation of vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to large amount of correlated hyper-spectral reflectance measurements. The aim of this study was to examine the prospect of above ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above ground biomass for 170 sample plots. Multivariate calibrations including partial least squares regression (PLSR), principal component regression (PCR), and Least-Squared Support Vector Machine (LS-SVM) were used to estimate the above ground biomass. The prediction accuracy of the multivariate calibration methods were assessed using cross validated R2 and RMSE. The best model performance was obtained using LS_SVM and then PLSR both calibrated with first derivative reflectance dataset with R2cv = 0.88 & 0.86 and RMSEcv= 1.15 & 1.07 respectively. The weakest prediction accuracy was appeared when PCR were used (R2cv = 0.31 and RMSEcv= 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.
Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A
2016-08-01
The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis -PCA- and Cluster Analysis -CA-) as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied on pairs of very large experimental data series successfully retain information resulting from high frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrates (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data basis for multivariate analysis methods, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression analysis produced information equivalent to results obtained through application of PCA an CA. Copyright © 2016 Elsevier B.V. All rights reserved.
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M
2016-05-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE.Review of a representative sample of articles indexed in MEDLINE (n = 428) with observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimate, and specification of more than 1 adjusted model.The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor.A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature.
Characterizing multivariate decoding models based on correlated EEG spectral features.
McFarland, Dennis J
2013-07-01
Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression *
Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.
2014-01-01
The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as the Hotelling’s T2 test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPREM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM out-performs other state-of-the-art methods. PMID:26527844
Empirical Bayes approach to the estimation of "unsafety": the multivariate regression method.
Hauer, E
1992-10-01
There are two kinds of clues to the unsafety of an entity: its traits (such as traffic, geometry, age, or gender) and its historical accident record. The Empirical Bayes approach to unsafety estimation makes use of both kinds of clues. It requires information about the mean and the variance of the unsafety in a "reference population" of similar entities. The method now in use for this purpose suffers from several shortcomings. First, a very large reference population is required. Second, the choice of reference population is to some extent arbitrary. Third, entities in the reference population usually cannot match the traits of the entity the unsafety of which is estimated. To alleviate these shortcomings the multivariate regression method for estimating the mean and variance of unsafety in reference populations is offered. Its logical foundations are described and its soundness is demonstrated. The use of the multivariate method makes the Empirical Bayes approach to unsafety estimation applicable to a wider range of circumstances and yields better estimates of unsafety. The application of the method to the tasks of identifying deviant entities and of estimating the effect of interventions on unsafety are discussed and illustrated by numerical examples.
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models
Anderson, Ryan; Clegg, Samuel M.; Frydenvang, Jens; Wiens, Roger C.; McLennan, Scott M.; Morris, Richard V.; Ehlmann, Bethany L.; Dyar, M. Darby
2017-01-01
Accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibrations methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. The sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
Liu, Fei; Ye, Lanhan; Peng, Jiyu; Song, Kunlin; Shen, Tingting; Zhang, Chu; He, Yong
2018-02-27
Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R 2 more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where R c 2 and R p 2 reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice.
Ye, Lanhan; Song, Kunlin; Shen, Tingting
2018-01-01
Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R2 more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc2 and Rp2 reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice. PMID:29495445
Non-proportional odds multivariate logistic regression of ordinal family data.
Zaloumis, Sophie G; Scurrah, Katrina J; Harrap, Stephen B; Ellis, Justine A; Gurrin, Lyle C
2015-03-01
Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non-proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non-proportional odds multivariate logistic regression model and take a simulation-based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Estimation of railroad capacity using parametric methods.
DOT National Transportation Integrated Search
2013-12-01
This paper reviews different methodologies used for railroad capacity estimation and presents a user-friendly method to measure capacity. The objective of this paper is to use multivariate regression analysis to develop a continuous relation of the d...
Barimani, Shirin; Kleinebudde, Peter
2017-10-01
A multivariate analysis method, Science-Based Calibration (SBC), was used for the first time for endpoint determination of a tablet coating process using Raman data. Two types of tablet cores, placebo and caffeine cores, received a coating suspension comprising a polyvinyl alcohol-polyethylene glycol graft-copolymer and titanium dioxide to a maximum coating thickness of 80µm. Raman spectroscopy was used as in-line PAT tool. The spectra were acquired every minute and correlated to the amount of applied aqueous coating suspension. SBC was compared to another well-known multivariate analysis method, Partial Least Squares-regression (PLS) and a simpler approach, Univariate Data Analysis (UVDA). All developed calibration models had coefficient of determination values (R 2 ) higher than 0.99. The coating endpoints could be predicted with root mean square errors (RMSEP) less than 3.1% of the applied coating suspensions. Compared to PLS and UVDA, SBC proved to be an alternative multivariate calibration method with high predictive power. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Rounaghi, Mohammad Mahdi; Abbaszadeh, Mohammad Reza; Arashi, Mohammad
2015-11-01
One of the most important topics of interest to investors is stock price changes. Investors whose goals are long term are sensitive to stock price and its changes and react to them. In this regard, we used multivariate adaptive regression splines (MARS) model and semi-parametric splines technique for predicting stock price in this study. The MARS model as a nonparametric method is an adaptive method for regression and it fits for problems with high dimensions and several variables. semi-parametric splines technique was used in this study. Smoothing splines is a nonparametric regression method. In this study, we used 40 variables (30 accounting variables and 10 economic variables) for predicting stock price using the MARS model and using semi-parametric splines technique. After investigating the models, we select 4 accounting variables (book value per share, predicted earnings per share, P/E ratio and risk) as influencing variables on predicting stock price using the MARS model. After fitting the semi-parametric splines technique, only 4 accounting variables (dividends, net EPS, EPS Forecast and P/E Ratio) were selected as variables effective in forecasting stock prices.
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in details. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini
2018-03-01
In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in near infrared (NIR) range viz. ratio spectral index (RSI) and normalised difference spectral indices (NDSI) sensitive to sucrose, reducing sugar and total sugar content were identified which were subsequently calibrated and validated. The RSI and NDSI models had R2 values of 0.65, 0.71 and 0.67; RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively for validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be, MARS, ANN and MARS, respectively with respect to RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress as this technique is fast, economic, and noninvasive.
2011-01-01
Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook’s distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards. PMID:21966586
Field applications of stand-off sensing using visible/NIR multivariate optical computing
NASA Astrophysics Data System (ADS)
Eastwood, DeLyle; Soyemi, Olusola O.; Karunamuni, Jeevanandra; Zhang, Lixia; Li, Hongli; Myrick, Michael L.
2001-02-01
12 A novel multivariate visible/NIR optical computing approach applicable to standoff sensing will be demonstrated with porphyrin mixtures as examples. The ultimate goal is to develop environmental or counter-terrorism sensors for chemicals such as organophosphorus (OP) pesticides or chemical warfare simulants in the near infrared spectral region. The mathematical operation that characterizes prediction of properties via regression from optical spectra is a calculation of inner products between the spectrum and the pre-determined regression vector. The result is scaled appropriately and offset to correspond to the basis from which the regression vector is derived. The process involves collecting spectroscopic data and synthesizing a multivariate vector using a pattern recognition method. Then, an interference coating is designed that reproduces the pattern of the multivariate vector in its transmission or reflection spectrum, and appropriate interference filters are fabricated. High and low refractive index materials such as Nb2O5 and SiO2 are excellent choices for the visible and near infrared regions. The proof of concept has now been established for this system in the visible and will later be extended to chemicals such as OP compounds in the near and mid-infrared.
Keithley, Richard B; Wightman, R Mark
2011-06-07
Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook's distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards.
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Ryan B.; Clegg, Samuel M.; Frydenvang, Jens
We report that accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibrations methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response ofmore » an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “submodel” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. Lastly, the sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.« less
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models
Anderson, Ryan B.; Clegg, Samuel M.; Frydenvang, Jens; ...
2016-12-15
We report that accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibrations methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response ofmore » an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “submodel” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. Lastly, the sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.« less
Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu
2017-01-01
The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
A regularization corrected score method for nonlinear regression models with covariate error.
Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna
2013-03-01
Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. Copyright © 2013, The International Biometric Society.
NASA Astrophysics Data System (ADS)
de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.
2018-04-01
A new method was developed to determine the antioxidant properties of red cabbage extract (Brassica oleracea) by mid (MID) and near (NIR) infrared spectroscopies and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9° Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for the building of PLS models. For the first time, this strategy was applied for building multivariate regression models. Reference analyses and spectral data were obtained from diluted extracts. The determinate properties were total and monomeric anthocyanins, total polyphenols and antioxidant capacity by ABTS (2,2-azino-bis(3-ethyl-benzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. PLS-1 models provided more predictive models than did PLS-2 regression. PLS-OPS and PLS-GA models presented excellent prediction results with a correlation coefficient higher than 0.98. However, the best models were obtained using PLS and variable selection with the OPS algorithm and the models based on NIR spectra were considered more predictive for all properties. Then, these models provided a simple, rapid and accurate method for determination of red cabbage extract antioxidant properties and its suitability for use in the food industry.
Improving Cluster Analysis with Automatic Variable Selection Based on Trees
2014-12-01
regression trees Daisy DISsimilAritY PAM partitioning around medoids PMA penalized multivariate analysis SPC sparse principal components UPGMA unweighted...unweighted pair-group average method ( UPGMA ). This method measures dissimilarities between all objects in two clusters and takes the average value
Gene set analysis using variance component tests.
Huang, Yen-Tsung; Lin, Xihong
2013-06-28
Gene set analyses have become increasingly important in genomic research, as many complex diseases are contributed jointly by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analyses method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and global test in both simulation and a diabetes microarray data.
Baratieri, Sabrina C; Barbosa, Juliana M; Freitas, Matheus P; Martins, José A
2006-01-23
A multivariate method of analysis of nystatin and metronidazole in a semi-solid matrix, based on diffuse reflectance NIR measurements and partial least squares regression, is reported. The product, a vaginal cream used in the antifungal and antibacterial treatment, is usually, quantitatively analyzed through microbiological tests (nystatin) and HPLC technique (metronidazole), according to pharmacopeial procedures. However, near infrared spectroscopy has demonstrated to be a valuable tool for content determination, given the rapidity and scope of the method. In the present study, it was successfully applied in the prediction of nystatin (even in low concentrations, ca. 0.3-0.4%, w/w, which is around 100,000 IU/5g) and metronidazole contents, as demonstrated by some figures of merit, namely linearity, precision (mean and repeatability) and accuracy.
NASA Astrophysics Data System (ADS)
Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.
2017-06-01
The recent popularity of the use of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods of generating and analyzing such data, but almost always relying in asymptotic distributions and in consequence being not adequate for small sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based in exact distributions this procedure may even be used in small sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiters combination rule to multiply imputed synthetic datasets and an application to the 2000 Current Population Survey is discussed.
NASA Astrophysics Data System (ADS)
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
Physical Function in Older Men With Hyperkyphosis
Harrison, Stephanie L.; Fink, Howard A.; Marshall, Lynn M.; Orwoll, Eric; Barrett-Connor, Elizabeth; Cawthon, Peggy M.; Kado, Deborah M.
2015-01-01
Background. Age-related hyperkyphosis has been associated with poor physical function and is a well-established predictor of adverse health outcomes in older women, but its impact on health in older men is less well understood. Methods. We conducted a cross-sectional study to evaluate the association of hyperkyphosis and physical function in 2,363 men, aged 71–98 (M = 79) from the Osteoporotic Fractures in Men Study. Kyphosis was measured using the Rancho Bernardo Study block method. Measurements of grip strength and lower extremity function, including gait speed over 6 m, narrow walk (measure of dynamic balance), repeated chair stands ability and time, and lower extremity power (Nottingham Power Rig) were included separately as primary outcomes. We investigated associations of kyphosis and each outcome in age-adjusted and multivariable linear or logistic regression models, controlling for age, clinic, education, race, bone mineral density, height, weight, diabetes, and physical activity. Results. In multivariate linear regression, we observed a dose-related response of worse scores on each lower extremity physical function test as number of blocks increased, p for trend ≤.001. Using a cutoff of ≥4 blocks, 20% (N = 469) of men were characterized with hyperkyphosis. In multivariate logistic regression, men with hyperkyphosis had increased odds (range 1.5–1.8) of being in the worst quartile of performing lower extremity physical function tasks (p < .001 for each outcome). Kyphosis was not associated with grip strength in any multivariate analysis. Conclusions. Hyperkyphosis is associated with impaired lower extremity physical function in older men. Further studies are needed to determine the direction of causality. PMID:25431353
Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace
ERIC Educational Resources Information Center
Culpepper, Steven Andrew; Park, Trevor
2017-01-01
A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…
Ponsoda, Vicente; Martínez, Kenia; Pineda-Pardo, José A; Abad, Francisco J; Olea, Julio; Román, Francisco J; Barbey, Aron K; Colom, Roberto
2017-02-01
Neuroimaging research involves analyses of huge amounts of biological data that might or might not be related with cognition. This relationship is usually approached using univariate methods, and, therefore, correction methods are mandatory for reducing false positives. Nevertheless, the probability of false negatives is also increased. Multivariate frameworks have been proposed for helping to alleviate this balance. Here we apply multivariate distance matrix regression for the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals regarding their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges better predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that one widespread, but limited number, of regions in the human brain, supports high-level cognitive ability differences. Hum Brain Mapp 38:803-816, 2017. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Wen, Cheng; Dallimer, Martin; Carver, Steve; Ziv, Guy
2018-05-06
Despite the great potential of mitigating carbon emission, development of wind farms is often opposed by local communities due to the visual impact on landscape. A growing number of studies have applied nonmarket valuation methods like Choice Experiments (CE) to value the visual impact by eliciting respondents' willingness to pay (WTP) or willingness to accept (WTA) for hypothetical wind farms through survey questions. Several meta-analyses have been found in the literature to synthesize results from different valuation studies, but they have various limitations related to the use of the prevailing multivariate meta-regression analysis. In this paper, we propose a new meta-analysis method to establish general functions for the relationships between the estimated WTP or WTA and three wind farm attributes, namely the distance to residential/coastal areas, the number of turbines and turbine height. This method involves establishing WTA or WTP functions for individual studies, fitting the average derivative functions and deriving the general integral functions of WTP or WTA against wind farm attributes. Results indicate that respondents in different studies consistently showed increasing WTP for moving wind farms to greater distances, which can be fitted by non-linear (natural logarithm) functions. However, divergent preferences for the number of turbines and turbine height were found in different studies. We argue that the new analysis method proposed in this paper is an alternative to the mainstream multivariate meta-regression analysis for synthesizing CE studies and the general integral functions of WTP or WTA against wind farm attributes are useful for future spatial modelling and benefit transfer studies. We also suggest that future multivariate meta-analyses should include non-linear components in the regression functions. Copyright © 2018. Published by Elsevier B.V.
Multivariate Regression Analysis and Slaughter Livestock,
AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an over reliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R(2)) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
Muradian, Kh K; Utko, N O; Mozzhukhina, T H; Pishel', I M; Litoshenko, O Ia; Bezrukov, V V; Fraĭfel'd, V E
2002-01-01
Correlative and regressive relations between the gaseous exchange, thermoregulation and mitochondrial protein content were analyzed by two- and three-dimensional statistics in mice. It has been shown that the pair wise linear methods of analysis did not reveal any significant correlation between the parameters under exploration. However, it became evident at three-dimensional and non-linear plotting for which the coefficients of multivariable correlation reached and even exceeded 0.7-0.8. The calculations based on partial differentiation of the multivariable regression equations allow to conclude that at certain values of VO2, VCO2 and body temperature negative relations between the systems of gaseous exchange and thermoregulation become dominating.
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain Nonlinear Least Squares Estimator (NLSE), Maximum Likelihood Estimator (MLE) and Linear Pseudo model for nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute MLE and NLSE. Anh [2] derived NLSE and MLE of a heteroscedatistic regression model. Lemcoff [3] discussed a procedure to get linear pseudo model for nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et.al used empirical process techniques to study the asymptotic of the LSE (Least-squares estimation) for the fitting of nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation
Predictive equations for the estimation of body size in seals and sea lions (Carnivora: Pinnipedia)
Churchill, Morgan; Clementz, Mark T; Kohno, Naoki
2014-01-01
Body size plays an important role in pinniped ecology and life history. However, body size data is often absent for historical, archaeological, and fossil specimens. To estimate the body size of pinnipeds (seals, sea lions, and walruses) for today and the past, we used 14 commonly preserved cranial measurements to develop sets of single variable and multivariate predictive equations for pinniped body mass and total length. Principal components analysis (PCA) was used to test whether separate family specific regressions were more appropriate than single predictive equations for Pinnipedia. The influence of phylogeny was tested with phylogenetic independent contrasts (PIC). The accuracy of these regressions was then assessed using a combination of coefficient of determination, percent prediction error, and standard error of estimation. Three different methods of multivariate analysis were examined: bidirectional stepwise model selection using Akaike information criteria; all-subsets model selection using Bayesian information criteria (BIC); and partial least squares regression. The PCA showed clear discrimination between Otariidae (fur seals and sea lions) and Phocidae (earless seals) for the 14 measurements, indicating the need for family-specific regression equations. The PIC analysis found that phylogeny had a minor influence on relationship between morphological variables and body size. The regressions for total length were more accurate than those for body mass, and equations specific to Otariidae were more accurate than those for Phocidae. Of the three multivariate methods, the all-subsets approach required the fewest number of variables to estimate body size accurately. We then used the single variable predictive equations and the all-subsets approach to estimate the body size of two recently extinct pinniped taxa, the Caribbean monk seal (Monachus tropicalis) and the Japanese sea lion (Zalophus japonicus). Body size estimates using single variable regressions generally under or over-estimated body size; however, the all-subset regression produced body size estimates that were close to historically recorded body length for these two species. This indicates that the all-subset regression equations developed in this study can estimate body size accurately. PMID:24916814
Bello, Alessandra; Bianchi, Federica; Careri, Maria; Giannetto, Marco; Mori, Giovanni; Musci, Marilena
2007-11-05
A new NIR method based on multivariate calibration for determination of ethanol in industrially packed wholemeal bread was developed and validated. GC-FID was used as reference method for the determination of actual ethanol concentration of different samples of wholemeal bread with proper content of added ethanol, ranging from 0 to 3.5% (w/w). Stepwise discriminant analysis was carried out on the NIR dataset, in order to reduce the number of original variables by selecting those that were able to discriminate between the samples of different ethanol concentrations. With the so selected variables a multivariate calibration model was then obtained by multiple linear regression. The prediction power of the linear model was optimized by a new "leave one out" method, so that the number of original variables resulted further reduced.
Method for enhanced accuracy in predicting peptides using liquid separations or chromatography
Kangas, Lars J.; Auberry, Kenneth J.; Anderson, Gordon A.; Smith, Richard D.
2006-11-14
A method for predicting the elution time of a peptide in chromatographic and electrophoretic separations by first providing a data set of known elution times of known peptides, then creating a plurality of vectors, each vector having a plurality of dimensions, and each dimension representing the elution time of amino acids present in each of these known peptides from the data set. The elution time of any protein is then be predicted by first creating a vector by assigning dimensional values for the elution time of amino acids of at least one hypothetical peptide and then calculating a predicted elution time for the vector by performing a multivariate regression of the dimensional values of the hypothetical peptide using the dimensional values of the known peptides. Preferably, the multivariate regression is accomplished by the use of an artificial neural network and the elution times are first normalized using a transfer function.
Access disparities to Magnet hospitals for patients undergoing neurosurgical operations
Missios, Symeon; Bekelis, Kimon
2017-01-01
Background Centers of excellence focusing on quality improvement have demonstrated superior outcomes for a variety of surgical interventions. We investigated the presence of access disparities to hospitals recognized by the Magnet Recognition Program of the American Nurses Credentialing Center (ANCC) for patients undergoing neurosurgical operations. Methods We performed a cohort study of all neurosurgery patients who were registered in the New York Statewide Planning and Research Cooperative System (SPARCS) database from 2009–2013. We examined the association of African-American race and lack of insurance with Magnet status hospitalization for neurosurgical procedures. A mixed effects propensity adjusted multivariable regression analysis was used to control for confounding. Results During the study period, 190,535 neurosurgical patients met the inclusion criteria. Using a multivariable logistic regression, we demonstrate that African-Americans had lower admission rates to Magnet institutions (OR 0.62; 95% CI, 0.58–0.67). This persisted in a mixed effects logistic regression model (OR 0.77; 95% CI, 0.70–0.83) to adjust for clustering at the patient county level, and a propensity score adjusted logistic regression model (OR 0.75; 95% CI, 0.69–0.82). Additionally, lack of insurance was associated with lower admission rates to Magnet institutions (OR 0.71; 95% CI, 0.68–0.73), in a multivariable logistic regression model. This persisted in a mixed effects logistic regression model (OR 0.72; 95% CI, 0.69–0.74), and a propensity score adjusted logistic regression model (OR 0.72; 95% CI, 0.69–0.75). Conclusions Using a comprehensive all-payer cohort of neurosurgery patients in New York State we identified an association of African-American race and lack of insurance with lower rates of admission to Magnet hospitals. PMID:28684152
Multivariate Analysis of Seismic Field Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alam, M. Kathleen
1999-06-01
This report includes the details of the model building procedure and prediction of seismic field data. Principal Components Regression, a multivariate analysis technique, was used to model seismic data collected as two pieces of equipment were cycled on and off. Models built that included only the two pieces of equipment of interest had trouble predicting data containing signals not included in the model. Evidence for poor predictions came from the prediction curves as well as spectral F-ratio plots. Once the extraneous signals were included in the model, predictions improved dramatically. While Principal Components Regression performed well for the present datamore » sets, the present data analysis suggests further work will be needed to develop more robust modeling methods as the data become more complex.« less
Zhang, Fang; Wagner, Anita K; Soumerai, Stephen B; Ross-Degnan, Dennis
2009-02-01
Interrupted time series (ITS) is a strong quasi-experimental research design, which is increasingly applied to estimate the effects of health services and policy interventions. We describe and illustrate two methods for estimating confidence intervals (CIs) around absolute and relative changes in outcomes calculated from segmented regression parameter estimates. We used multivariate delta and bootstrapping methods (BMs) to construct CIs around relative changes in level and trend, and around absolute changes in outcome based on segmented linear regression analyses of time series data corrected for autocorrelated errors. Using previously published time series data, we estimated CIs around the effect of prescription alerts for interacting medications with warfarin on the rate of prescriptions per 10,000 warfarin users per month. Both the multivariate delta method (MDM) and the BM produced similar results. BM is preferred for calculating CIs of relative changes in outcomes of time series studies, because it does not require large sample sizes when parameter estimates are obtained correctly from the model. Caution is needed when sample size is small.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2015-09-14
This package contains statistical routines for extracting features from multivariate time-series data which can then be used for subsequent multivariate statistical analysis to identify patterns and anomalous behavior. It calculates local linear or quadratic regression model fits to moving windows for each series and then summarizes the model coefficients across user-defined time intervals for each series. These methods are domain agnostic-but they have been successfully applied to a variety of domains, including commercial aviation and electric power grid data.
NASA Astrophysics Data System (ADS)
Rossi, M.; Apuani, T.; Felletti, F.
2009-04-01
The aim of this paper is to compare the results of two statistical methods for landslide susceptibility analysis: 1) univariate probabilistic method based on landslide susceptibility index, 2) multivariate method (logistic regression). The study area is the Febbraro valley, located in the central Italian Alps, where different types of metamorphic rocks croup out. On the eastern part of the studied basin a quaternary cover represented by colluvial and secondarily, by glacial deposits, is dominant. In this study 110 earth flows, mainly located toward NE portion of the catchment, were analyzed. They involve only the colluvial deposits and their extension mainly ranges from 36 to 3173 m2. Both statistical methods require to establish a spatial database, in which each landslide is described by several parameters that can be assigned using a main scarp central point of landslide. The spatial database is constructed using a Geographical Information System (GIS). Each landslide is described by several parameters corresponding to the value of main scarp central point of the landslide. Based on bibliographic review a total of 15 predisposing factors were utilized. The width of the intervals, in which the maps of the predisposing factors have to be reclassified, has been defined assuming constant intervals to: elevation (100 m), slope (5 °), solar radiation (0.1 MJ/cm2/year), profile curvature (1.2 1/m), tangential curvature (2.2 1/m), drainage density (0.5), lineament density (0.00126). For the other parameters have been used the results of the probability-probability plots analysis and the statistical indexes of landslides site. In particular slope length (0 ÷ 2, 2 ÷ 5, 5 ÷ 10, 10 ÷ 20, 20 ÷ 35, 35 ÷ 260), accumulation flow (0 ÷ 1, 1 ÷ 2, 2 ÷ 5, 5 ÷ 12, 12 ÷ 60, 60 ÷27265), Topographic Wetness Index 0 ÷ 0.74, 0.74 ÷ 1.94, 1.94 ÷ 2.62, 2.62 ÷ 3.48, 3.48 ÷ 6,00, 6.00 ÷ 9.44), Stream Power Index (0 ÷ 0.64, 0.64 ÷ 1.28, 1.28 ÷ 1.81, 1.81 ÷ 4.20, 4.20 ÷ 9.40). Geological map and land use map were also used, considering geological and land use properties as categorical variables. Appling the univariate probabilistic method the Landslide Susceptibility Index (LSI) is defined as the sum of the ratio Ra/Rb calculated for each predisposing factor, where Ra is the ratio between number of pixel of class and the total number of pixel of the study area, and Rb is the ratio between number of landslides respect to the pixel number of the interval area. From the analysis of the Ra/Rb ratio the relationship between landslide occurrence and predisposing factors were defined. Then the equation of LSI was used in GIS to trace the landslide susceptibility maps. The multivariate method for landslide susceptibility analysis, based on logistic regression, was performed starting from the density maps of the predisposing factors, calculated with the intervals defined above using the equation Rb/Rbtot, where Rbtot is a sum of all Rb values. Using stepwise forward algorithms the logistic regression was performed in two successive steps: first a univariate logistic regression is used to choose the most significant predisposing factors, then the multivariate logistic regression can be performed. The univariate regression highlighted the importance of the following factors: elevation, accumulation flow, drainage density, lineament density, geology and land use. When the multivariate regression was applied the number of controlling factors was reduced neglecting the geological properties. The resulting final susceptibility equation is: P = 1 / (1 + exp-(6.46-22.34*elevation-5.33*accumulation flow-7.99* drainage density-4.47*lineament density-17.31*land use)) and using this equation the susceptibility maps were obtained. To easy compare the results of the two methodologies, the susceptibility maps were reclassified in five susceptibility intervals (very high, high, moderate, low and very low) using natural breaks. Then the maps were validated using two cumulative distribution curves, one related to the landslides (number of landslides in each susceptibility class) and one to the basin (number of pixel covering each class). Comparing the curves for each method, it results that the two approaches (univariate and multivariate) are appropriate, providing acceptable results. In both maps the distribution of high susceptibility condition is mainly localized on the left slope of the catchment in agreement with the field evidences. The comparison between the methods was obtained by subtraction of the two maps. This operation shows that about 40% of the basin is classified by the same class of susceptibility. In general the univariate probabilistic method tends to overestimate the areal extension of the high susceptibility class with respect to the maps obtained by the logistic regression method.
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
Melchior, Maria; Touchette, Évelyne; Prokofyeva, Elena; Chollet, Aude; Fombonne, Eric; Elidemir, Gulizar; Galéra, Cédric
2014-01-01
Background Common negative events can precipitate the onset of internalizing symptoms. We studied whether their occurrence in childhood is associated with mental health trajectories over the course of development. Methods Using data from the TEMPO study, a French community-based cohort study of youths, we studied the association between negative events in 1991 (when participants were aged 4–16 years) and internalizing symptoms, assessed by the ASEBA family of instruments in 1991, 1999, and 2009 (n = 1503). Participants' trajectories of internalizing symptoms were estimated with semi-parametric regression methods (PROC TRAJ). Data were analyzed using multinomial regression models controlled for participants' sex, age, parental family status, socio-economic position, and parental history of depression. Results Negative childhood events were associated with an increased likelihood of concurrent internalizing symptoms which sometimes persisted into adulthood (multivariate ORs associated with > = 3 negative events respectively: high and decreasing internalizing symptoms: 5.54, 95% CI: 3.20–9.58; persistently high internalizing symptoms: 8.94, 95% CI: 2.82–28.31). Specific negative events most strongly associated with youths' persistent internalizing symptoms included: school difficulties (multivariate OR: 5.31, 95% CI: 2.24–12.59), parental stress (multivariate OR: 4.69, 95% CI: 2.02–10.87), serious illness/health problems (multivariate OR: 4.13, 95% CI: 1.76–9.70), and social isolation (multivariate OR: 2.24, 95% CI: 1.00–5.08). Conclusions Common negative events can contribute to the onset of children's lasting psychological difficulties. PMID:25485875
Power and sample size for multivariate logistic modeling of unmatched case-control studies.
Gail, Mitchell H; Haneuse, Sebastien
2017-01-01
Sample size calculations are needed to design and assess the feasibility of case-control studies. Although such calculations are readily available for simple case-control designs and univariate analyses, there is limited theory and software for multivariate unconditional logistic analysis of case-control data. Here we outline the theory needed to detect scalar exposure effects or scalar interactions while controlling for other covariates in logistic regression. Both analytical and simulation methods are presented, together with links to the corresponding software.
Spectroscopic analysis and control
Tate; , James D.; Reed, Christopher J.; Domke, Christopher H.; Le, Linh; Seasholtz, Mary Beth; Weber, Andy; Lipp, Charles
2017-04-18
Apparatus for spectroscopic analysis which includes a tunable diode laser spectrometer having a digital output signal and a digital computer for receiving the digital output signal from the spectrometer, the digital computer programmed to process the digital output signal using a multivariate regression algorithm. In addition, a spectroscopic method of analysis using such apparatus. Finally, a method for controlling an ethylene cracker hydrogenator.
Victimization and Suicidality among Female College Students
ERIC Educational Resources Information Center
Leone, Janel M.; Carroll, James M.
2016-01-01
Objective: To investigate the predictive role of victimization in suicidality among college women. Participants: Female respondents to the American College Health Association National College Health Assessment II (N = 258). Methods: Multivariate logistic regression analyses examined the relationship between victimization and suicidality. Results:…
Prediction of energy expenditure and physical activity in preschoolers
USDA-ARS?s Scientific Manuscript database
Accurate, nonintrusive, and feasible methods are needed to predict energy expenditure (EE) and physical activity (PA) levels in preschoolers. Herein, we validated cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on accelerometry and heart rate (HR) ...
Reduced rank regression via adaptive nuclear norm penalization
Chen, Kun; Dong, Hongbo; Chan, Kung-Sik
2014-01-01
Summary We propose an adaptive nuclear norm penalization approach for low-rank matrix approximation, and use it to develop a new reduced rank estimation method for high-dimensional multivariate regression. The adaptive nuclear norm is defined as the weighted sum of the singular values of the matrix, and it is generally non-convex under the natural restriction that the weight decreases with the singular value. However, we show that the proposed non-convex penalized regression method has a global optimal solution obtained from an adaptively soft-thresholded singular value decomposition. The method is computationally efficient, and the resulting solution path is continuous. The rank consistency of and prediction/estimation performance bounds for the estimator are established for a high-dimensional asymptotic regime. Simulation studies and an application in genetics demonstrate its efficacy. PMID:25045172
Multivariate meta-analysis using individual participant data.
Riley, R D; Price, M J; Jackson, D; Wardle, M; Gueyffier, F; Wang, J; Staessen, J A; White, I R
2015-06-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment-covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. © 2014 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Riad, Safaa M.; Salem, Hesham; Elbalkiny, Heba T.; Khattab, Fatma I.
2015-04-01
Five, accurate, precise, and sensitive univariate and multivariate spectrophotometric methods were developed for the simultaneous determination of a ternary mixture containing Trimethoprim (TMP), Sulphamethoxazole (SMZ) and Oxytetracycline (OTC) in waste water samples collected from different cites either production wastewater or livestock wastewater after their solid phase extraction using OASIS HLB cartridges. In univariate methods OTC was determined at its λmax 355.7 nm (0D), while (TMP) and (SMZ) were determined by three different univariate methods. Method (A) is based on successive spectrophotometric resolution technique (SSRT). The technique starts with the ratio subtraction method followed by ratio difference method for determination of TMP and SMZ. Method (B) is successive derivative ratio technique (SDR). Method (C) is mean centering of the ratio spectra (MCR). The developed multivariate methods are principle component regression (PCR) and partial least squares (PLS). The specificity of the developed methods is investigated by analyzing laboratory prepared mixtures containing different ratios of the three drugs. The obtained results are statistically compared with those obtained by the official methods, showing no significant difference with respect to accuracy and precision at p = 0.05.
Riad, Safaa M; Salem, Hesham; Elbalkiny, Heba T; Khattab, Fatma I
2015-04-05
Five, accurate, precise, and sensitive univariate and multivariate spectrophotometric methods were developed for the simultaneous determination of a ternary mixture containing Trimethoprim (TMP), Sulphamethoxazole (SMZ) and Oxytetracycline (OTC) in waste water samples collected from different cites either production wastewater or livestock wastewater after their solid phase extraction using OASIS HLB cartridges. In univariate methods OTC was determined at its λmax 355.7 nm (0D), while (TMP) and (SMZ) were determined by three different univariate methods. Method (A) is based on successive spectrophotometric resolution technique (SSRT). The technique starts with the ratio subtraction method followed by ratio difference method for determination of TMP and SMZ. Method (B) is successive derivative ratio technique (SDR). Method (C) is mean centering of the ratio spectra (MCR). The developed multivariate methods are principle component regression (PCR) and partial least squares (PLS). The specificity of the developed methods is investigated by analyzing laboratory prepared mixtures containing different ratios of the three drugs. The obtained results are statistically compared with those obtained by the official methods, showing no significant difference with respect to accuracy and precision at p=0.05. Copyright © 2015 Elsevier B.V. All rights reserved.
2011-01-01
Introduction Necrotizing fasciitis (NF) is a life threatening infectious disease with a high mortality rate. We carried out a microbiological characterization of the causative pathogens. We investigated the correlation of mortality in NF with bloodstream infection and with the presence of co-morbidities. Methods In this retrospective study, we analyzed 323 patients who presented with necrotizing fasciitis at two different institutions. Bloodstream infection (BSI) was defined as a positive blood culture result. The patients were categorized as survivors and non-survivors. Eleven clinically important variables which were statistically significant by univariate analysis were selected for multivariate regression analysis and a stepwise logistic regression model was developed to determine the association between BSI and mortality. Results Univariate logistic regression analysis showed that patients with hypotension, heart disease, liver disease, presence of Vibrio spp. in wound cultures, presence of fungus in wound cultures, and presence of Streptococcus group A, Aeromonas spp. or Vibrio spp. in blood cultures, had a significantly higher risk of in-hospital mortality. Our multivariate logistic regression analysis showed a higher risk of mortality in patients with pre-existing conditions like hypotension, heart disease, and liver disease. Multivariate logistic regression analysis also showed that presence of Vibrio spp in wound cultures, and presence of Streptococcus Group A in blood cultures were associated with a high risk of mortality while debridement > = 3 was associated with improved survival. Conclusions Mortality in patients with necrotizing fasciitis was significantly associated with the presence of Vibrio in wound cultures and Streptococcus group A in blood cultures. PMID:21693053
Poláček, Roman; Májek, Pavel; Hroboňová, Katarína; Sádecká, Jana
2016-04-01
Fluoxetine is the most prescribed antidepressant chiral drug worldwide. Its enantiomers have a different duration of serotonin inhibition. A novel simple and rapid method for determination of the enantiomeric composition of fluoxetine in pharmaceutical pills is presented. Specifically, emission, excitation, and synchronous fluorescence techniques were employed to obtain the spectral data, which with multivariate calibration methods, namely, principal component regression (PCR) and partial least square (PLS), were investigated. The chiral recognition of fluoxetine enantiomers in the presence of β-cyclodextrin was based on diastereomeric complexes. The results of the multivariate calibration modeling indicated good prediction abilities. The obtained results for tablets were compared with those from chiral HPLC and no significant differences are shown by Fisher's (F) test and Student's t-test. The smallest residuals between reference or nominal values and predicted values were achieved by multivariate calibration of synchronous fluorescence spectral data. This conclusion is supported by calculated values of the figure of merit.
Multivariate meta-analysis using individual participant data
Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.
2016-01-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment–covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. PMID:26099484
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clegg, Samuel M; Barefield, James E; Wiens, Roger C
2008-01-01
The ChemCam instrument on the Mars Science Laboratory (MSL) will include a laser-induced breakdown spectrometer (LIBS) to quantify major and minor elemental compositions. The traditional analytical chemistry approach to calibration curves for these data regresses a single diagnostic peak area against concentration for each element. This approach contrasts with a new multivariate method in which elemental concentrations are predicted by step-wise multiple regression analysis based on areas of a specific set of diagnostic peaks for each element. The method is tested on LIBS data from igneous and metamorphosed rocks. Between 4 and 13 partial regression coefficients are needed to describemore » each elemental abundance accurately (i.e., with a regression line of R{sup 2} > 0.9995 for the relationship between predicted and measured elemental concentration) for all major and minor elements studied. Validation plots suggest that the method is limited at present by the small data set, and will work best for prediction of concentration when a wide variety of compositions and rock types has been analyzed.« less
Dong, Chunjiao; Clarke, David B; Yan, Xuedong; Khattak, Asad; Huang, Baoshan
2014-09-01
Crash data are collected through police reports and integrated with road inventory data for further analysis. Integrated police reports and inventory data yield correlated multivariate data for roadway entities (e.g., segments or intersections). Analysis of such data reveals important relationships that can help focus on high-risk situations and coming up with safety countermeasures. To understand relationships between crash frequencies and associated variables, while taking full advantage of the available data, multivariate random-parameters models are appropriate since they can simultaneously consider the correlation among the specific crash types and account for unobserved heterogeneity. However, a key issue that arises with correlated multivariate data is the number of crash-free samples increases, as crash counts have many categories. In this paper, we describe a multivariate random-parameters zero-inflated negative binomial (MRZINB) regression model for jointly modeling crash counts. The full Bayesian method is employed to estimate the model parameters. Crash frequencies at urban signalized intersections in Tennessee are analyzed. The paper investigates the performance of MZINB and MRZINB regression models in establishing the relationship between crash frequencies, pavement conditions, traffic factors, and geometric design features of roadway intersections. Compared to the MZINB model, the MRZINB model identifies additional statistically significant factors and provides better goodness of fit in developing the relationships. The empirical results show that MRZINB model possesses most of the desirable statistical properties in terms of its ability to accommodate unobserved heterogeneity and excess zero counts in correlated data. Notably, in the random-parameters MZINB model, the estimated parameters vary significantly across intersections for different crash types. Copyright © 2014 Elsevier Ltd. All rights reserved.
TG study of the Li0.4Fe2.4Zn0.2O4 ferrite synthesis
NASA Astrophysics Data System (ADS)
Lysenko, E. N.; Nikolaev, E. V.; Surzhikov, A. P.
2016-02-01
In this paper, the kinetic analysis of Li-Zn ferrite synthesis was studied using thermogravimetry (TG) method through the simultaneous application of non-linear regression to several measurements run at different heating rates (multivariate non-linear regression). Using TG-curves obtained for the four heating rates and Netzsch Thermokinetics software package, the kinetic models with minimal adjustable parameters were selected to quantitatively describe the reaction of Li-Zn ferrite synthesis. It was shown that the experimental TG-curves clearly suggest a two-step process for the ferrite synthesis and therefore a model-fitting kinetic analysis based on multivariate non-linear regressions was conducted. The complex reaction was described by a two-step reaction scheme consisting of sequential reaction steps. It is established that the best results were obtained using the Yander three-dimensional diffusion model at the first stage and Ginstling-Bronstein model at the second step. The kinetic parameters for lithium-zinc ferrite synthesis reaction were found and discussed.
Alternative High School Students: Prevalence and Correlates of Overweight
ERIC Educational Resources Information Center
Kubik, Martha Y.; Davey, Cynthia; Fulkerson, Jayne A.; Sirard, John; Story, Mary; Arcan, Chrisa
2009-01-01
Objective: To determine prevalence and correlates of overweight among adolescents attending alternative high schools (AHS). Methods: AHS students (n=145) from 6 schools completed surveys and anthropometric measures. Cross-sectional associations were assessed using mixed model multivariate logistic regression. Results: Among students, 42% were…
Innovation Analysis | Energy Analysis | NREL
. New empirical methods for estimating technical and commercial impact (based on patent citations and Commercial Breakthroughs, NREL employed regression models and multivariate simulations to compare social in the marketplace and found that: Web presence may provide a better representation of the commercial
College Student Invulnerability Beliefs and HIV Vaccine Acceptability
ERIC Educational Resources Information Center
Ravert, Russell D.; Zimet, Gregory D.
2009-01-01
Objective: To examine behavioral history, beliefs, and vaccine characteristics as predictors of HIV vaccine acceptability. Methods: Two hundred forty-five US under graduates were surveyed regarding their sexual history, risk beliefs, and likelihood of accepting hypothetical HIV vaccines. Results: Multivariate regression analysis indicated that…
Zenebe, Chernet Baye; Adefris, Mulat; Yenit, Melaku Kindie; Gelaw, Yalemzewod Assefa
2017-09-06
Despite the fact that long acting family planning methods reduce population growth and improve maternal health, their utilization remains poor. Therefore, this study assessed the prevalence of long acting and permanent family planning method utilization and associated factors among women in reproductive age groups who have decided not to have more children in Gondar city, northwest Ethiopia. An institution based cross-sectional study was conducted from August to October, 2015. Three hundred seventeen women who have decided not to have more children were selected consecutively into the study. A structured and pretested questionnaire was used to collect data. Both bivariate and multi-variable logistic regressions analyses were used to identify factors associated with utilization of long acting and permanent family planning methods. The multi-variable logistic regression analysis was used to investigate factors associated with the utilization of long acting and permanent family planning methods. The Adjusted Odds Ratio (AOR) with the corresponding 95% Confidence Interval (CI) was used to show the strength of associations, and variables with a P-value of <0.05 were considered statistically significant. In this study, the overall prevalence of long acting and permanent contraceptive (LAPCM) method utilization was 34.7% (95% CI: 29.5-39.9). According to the multi-variable logistic regression analysis, utilization of long acting and permanent contraceptive methods was significantly associated with women who had secondary school, (AOR: 2279, 95% CI: 1.17, 4.44), college, and above education (AOR: 2.91, 95% CI: 1.36, 6.24), history of previous utilization (AOR: 3.02, 95% CI: 1.69, 5.38), and information about LAPCM (AOR: 8.85, 95% CI: 2.04, 38.41). In this study the prevalence of long acting and permanent family planning method utilization among women who have decided not to have more children was high compared with previous studies conducted elsewhere. Advanced educational status, previous utilization of LAPCM, and information on LAPCM were significantly associated with the utilization of LAPCM. As a result, strengthening behavioral change communication channels to make information accessible is highly recommended.
NASA Astrophysics Data System (ADS)
Darwish, Hany W.; Hassan, Said A.; Salem, Maissa Y.; El-Zeany, Badr A.
2013-09-01
Four simple, accurate and specific methods were developed and validated for the simultaneous estimation of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in commercial tablets. The derivative spectrophotometric methods include Derivative Ratio Zero Crossing (DRZC) and Double Divisor Ratio Spectra-Derivative Spectrophotometry (DDRS-DS) methods, while the multivariate calibrations used are Principal Component Regression (PCR) and Partial Least Squares (PLSs). The proposed methods were applied successfully in the determination of the drugs in laboratory-prepared mixtures and in commercial pharmaceutical preparations. The validity of the proposed methods was assessed using the standard addition technique. The linearity of the proposed methods is investigated in the range of 2-32, 4-44 and 2-20 μg/mL for AML, VAL and HCT, respectively.
Penalized regression procedures for variable selection in the potential outcomes framework
Ghosh, Debashis; Zhu, Yeying; Coffman, Donna L.
2015-01-01
A recent topic of much interest in causal inference is model selection. In this article, we describe a framework in which to consider penalized regression approaches to variable selection for causal effects. The framework leads to a simple ‘impute, then select’ class of procedures that is agnostic to the type of imputation algorithm as well as penalized regression used. It also clarifies how model selection involves a multivariate regression model for causal inference problems, and that these methods can be applied for identifying subgroups in which treatment effects are homogeneous. Analogies and links with the literature on machine learning methods, missing data and imputation are drawn. A difference LASSO algorithm is defined, along with its multiple imputation analogues. The procedures are illustrated using a well-known right heart catheterization dataset. PMID:25628185
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
Kyle J. Haynes; Andrew M. Liebhold; Ottar N. Bjørnstad; Andrew J. Allstadt; Randall S. Morin
2018-01-01
Evaluating the causes of spatial synchrony in population dynamics in nature is notoriously difficult due to a lack of data and appropriate statistical methods. Here, we use a recently developed method, a multivariate extension of the local indicators of spatial autocorrelation statistic, to map geographic variation in the synchrony of gypsy moth outbreaks. Regression...
ERIC Educational Resources Information Center
Chen, Duan-Rung; Truong, Khoa D.; Tsai, Meng-Ju
2013-01-01
Background: The linkage between sleep quality and weight status among teenagers has gained more attention in the recent literature and health policy but no consensus has been reached. Methods: Using both a propensity score method and multivariate linear regression for a cross-sectional sample of 2,113 teenagers, we analyzed their body mass index…
Tang, Yongqiang
2018-04-30
The controlled imputation method refers to a class of pattern mixture models that have been commonly used as sensitivity analyses of longitudinal clinical trials with nonignorable dropout in recent years. These pattern mixture models assume that participants in the experimental arm after dropout have similar response profiles to the control participants or have worse outcomes than otherwise similar participants who remain on the experimental treatment. In spite of its popularity, the controlled imputation has not been formally developed for longitudinal binary and ordinal outcomes partially due to the lack of a natural multivariate distribution for such endpoints. In this paper, we propose 2 approaches for implementing the controlled imputation for binary and ordinal data based respectively on the sequential logistic regression and the multivariate probit model. Efficient Markov chain Monte Carlo algorithms are developed for missing data imputation by using the monotone data augmentation technique for the sequential logistic regression and a parameter-expanded monotone data augmentation scheme for the multivariate probit model. We assess the performance of the proposed procedures by simulation and the analysis of a schizophrenia clinical trial and compare them with the fully conditional specification, last observation carried forward, and baseline observation carried forward imputation methods. Copyright © 2018 John Wiley & Sons, Ltd.
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
Multivariate decoding of brain images using ordinal regression.
Doyle, O M; Ashburner, J; Zelaya, F O; Williams, S C R; Mehta, M A; Marquand, A F
2013-11-01
Neuroimaging data are increasingly being used to predict potential outcomes or groupings, such as clinical severity, drug dose response, and transitional illness states. In these examples, the variable (target) we want to predict is ordinal in nature. Conventional classification schemes assume that the targets are nominal and hence ignore their ranked nature, whereas parametric and/or non-parametric regression models enforce a metric notion of distance between classes. Here, we propose a novel, alternative multivariate approach that overcomes these limitations - whole brain probabilistic ordinal regression using a Gaussian process framework. We applied this technique to two data sets of pharmacological neuroimaging data from healthy volunteers. The first study was designed to investigate the effect of ketamine on brain activity and its subsequent modulation with two compounds - lamotrigine and risperidone. The second study investigates the effect of scopolamine on cerebral blood flow and its modulation using donepezil. We compared ordinal regression to multi-class classification schemes and metric regression. Considering the modulation of ketamine with lamotrigine, we found that ordinal regression significantly outperformed multi-class classification and metric regression in terms of accuracy and mean absolute error. However, for risperidone ordinal regression significantly outperformed metric regression but performed similarly to multi-class classification both in terms of accuracy and mean absolute error. For the scopolamine data set, ordinal regression was found to outperform both multi-class and metric regression techniques considering the regional cerebral blood flow in the anterior cingulate cortex. Ordinal regression was thus the only method that performed well in all cases. Our results indicate the potential of an ordinal regression approach for neuroimaging data while providing a fully probabilistic framework with elegant approaches for model selection. Copyright © 2013. Published by Elsevier Inc.
Le Strat, Yann
2017-01-01
The objective of this paper is to evaluate a panel of statistical algorithms for temporal outbreak detection. Based on a large dataset of simulated weekly surveillance time series, we performed a systematic assessment of 21 statistical algorithms, 19 implemented in the R package surveillance and two other methods. We estimated false positive rate (FPR), probability of detection (POD), probability of detection during the first week, sensitivity, specificity, negative and positive predictive values and F1-measure for each detection method. Then, to identify the factors associated with these performance measures, we ran multivariate Poisson regression models adjusted for the characteristics of the simulated time series (trend, seasonality, dispersion, outbreak sizes, etc.). The FPR ranged from 0.7% to 59.9% and the POD from 43.3% to 88.7%. Some methods had a very high specificity, up to 99.4%, but a low sensitivity. Methods with a high sensitivity (up to 79.5%) had a low specificity. All methods had a high negative predictive value, over 94%, while positive predictive values ranged from 6.5% to 68.4%. Multivariate Poisson regression models showed that performance measures were strongly influenced by the characteristics of time series. Past or current outbreak size and duration strongly influenced detection performances. PMID:28715489
NASA Astrophysics Data System (ADS)
Anderson, R. B.; Finch, N.; Clegg, S. M.; Graff, T.; Morris, R. V.; Laura, J.
2018-04-01
The PySAT point spectra tool provides a flexible graphical interface, enabling scientists to apply a wide variety of preprocessing and machine learning methods to point spectral data, with an emphasis on multivariate regression.
Correlates of Gambling among Eighth-Grade Boys and Girls
ERIC Educational Resources Information Center
Chaumeton, Nigel R.; Ramowski, Sarah K.; Nystrom, Robert J.
2011-01-01
Background: This study examined the correlates of gambling behavior among eighth-grade students. Methods: Children (n = 15,865) enrolled in publicly funded schools in Oregon completed the 2008 Oregon Healthy Teens survey. Multivariate logistic regression analyses assessed the combined and independent associations between risk and protective…
Alcohol Use, Eating Patterns, and Weight Behaviors in a University Population
ERIC Educational Resources Information Center
Nelson, Melissa C.; Lust, Katherine; Story, Mary; Ehlinger, Ed
2009-01-01
Objective: To explore associations between alcohol, alcohol-related eating, and weight-related health indicators. Methods: Cross-sectional, multivariate regression of weight behaviors, binge drinking, and alcohol-related eating, using self-reported student survey data (n = 3206 undergraduates/graduates). Results: Binge drinking was associated with…
Optical scatterometry of quarter-micron patterns using neural regression
NASA Astrophysics Data System (ADS)
Bischoff, Joerg; Bauer, Joachim J.; Haak, Ulrich; Hutschenreuther, Lutz; Truckenbrodt, Horst
1998-06-01
With shrinking dimensions and increasing chip areas, a rapid and non-destructive full wafer characterization after every patterning cycle is an inevitable necessity. In former publications it was shown that Optical Scatterometry (OS) has the potential to push the attainable feature limits of optical techniques from 0.8 . . . 0.5 microns for imaging methods down to 0.1 micron and below. Thus the demands of future metrology can be met. Basically being a nonimaging method, OS combines light scatter (or diffraction) measurements with modern data analysis schemes to solve the inverse scatter issue. For very fine patterns with lambda-to-pitch ratios grater than one, the specular reflected light versus the incidence angle is recorded. Usually, the data analysis comprises two steps -- a training cycle connected the a rigorous forward modeling and the prediction itself. Until now, two data analysis schemes are usually applied -- the multivariate regression based Partial Least Squares method (PLS) and a look-up-table technique which is also referred to as Minimum Mean Square Error approach (MMSE). Both methods are afflicted with serious drawbacks. On the one hand, the prediction accuracy of multivariate regression schemes degrades with larger parameter ranges due to the linearization properties of the method. On the other hand, look-up-table methods are rather time consuming during prediction thus prolonging the processing time and reducing the throughput. An alternate method is an Artificial Neural Network (ANN) based regression which combines the advantages of multivariate regression and MMSE. Due to the versatility of a neural network, not only can its structure be adapted more properly to the scatter problem, but also the nonlinearity of the neuronal transfer functions mimic the nonlinear behavior of optical diffraction processes more adequately. In spite of these pleasant properties, the prediction speed of ANN regression is comparable with that of the PLS-method. In this paper, the viability and performance of ANN-regression will be demonstrated with the example of sub-quarter-micron resist metrology. To this end, 0.25 micrometer line/space patterns have been printed in positive photoresist by means of DUV projection lithography. In order to evaluate the total metrology chain from light scatter measurement through data analysis, a thorough modeling has been performed. Assuming a trapezoidal shape of the developed resist profile, a training data set was generated by means of the Rigorous Coupled Wave Approach (RCWA). After training the model, a second data set was computed and deteriorated by Gaussian noise to imitate real measuring conditions. Then, these data have been fed into the models established before resulting in a Standard Error of Prediction (SEP) which corresponds to the measuring accuracy. Even with putting only little effort in the design of a back-propagation network, the ANN is clearly superior to the PLS-method. Depending on whether a network with one or two hidden layers was used, accuracy gains between 2 and 5 can be achieved compared with PLS regression. Furthermore, the ANN is less noise sensitive, for there is only a doubling of the SEP at 5% noise for ANN whereas for PLS the accuracy degrades rapidly with increasing noise. The accuracy gain also depends on the light polarization and on the measured parameters. Finally, these results have been proven experimentally, where the OS-results are in good accordance with the profiles obtained from cross- sectioning micrographs.
Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K
2017-01-01
The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology-sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.
Moseson, Heidi; Gerdts, Caitlin; Dehlendorf, Christine; Hiatt, Robert A; Vittinghoff, Eric
2017-12-21
The list experiment is a promising measurement tool for eliciting truthful responses to stigmatized or sensitive health behaviors. However, investigators may be hesitant to adopt the method due to previously untestable assumptions and the perceived inability to conduct multivariable analysis. With a recently developed statistical test that can detect the presence of a design effect - the absence of which is a central assumption of the list experiment method - we sought to test the validity of a list experiment conducted on self-reported abortion in Liberia. We also aim to introduce recently developed multivariable regression estimators for the analysis of list experiment data, to explore relationships between respondent characteristics and having had an abortion - an important component of understanding the experiences of women who have abortions. To test the null hypothesis of no design effect in the Liberian list experiment data, we calculated the percentage of each respondent "type," characterized by response to the control items, and compared these percentages across treatment and control groups with a Bonferroni-adjusted alpha criterion. We then implemented two least squares and two maximum likelihood models (four total), each representing different bias-variance trade-offs, to estimate the association between respondent characteristics and abortion. We find no clear evidence of a design effect in list experiment data from Liberia (p = 0.18), affirming the first key assumption of the method. Multivariable analyses suggest a negative association between education and history of abortion. The retrospective nature of measuring lifetime experience of abortion, however, complicates interpretation of results, as the timing and safety of a respondent's abortion may have influenced her ability to pursue an education. Our work demonstrates that multivariable analyses, as well as statistical testing of a key design assumption, are possible with list experiment data, although with important limitations when considering lifetime measures. We outline how to implement this methodology with list experiment data in future research.
Ferreira, Ana P; Tobyn, Mike
2015-01-01
In the pharmaceutical industry, chemometrics is rapidly establishing itself as a tool that can be used at every step of product development and beyond: from early development to commercialization. This set of multivariate analysis methods allows the extraction of information contained in large, complex data sets thus contributing to increase product and process understanding which is at the core of the Food and Drug Administration's Process Analytical Tools (PAT) Guidance for Industry and the International Conference on Harmonisation's Pharmaceutical Development guideline (Q8). This review is aimed at providing pharmaceutical industry professionals an introduction to multivariate analysis and how it is being adopted and implemented by companies in the transition from "quality-by-testing" to "quality-by-design". It starts with an introduction to multivariate analysis and the two methods most commonly used: principal component analysis and partial least squares regression, their advantages, common pitfalls and requirements for their effective use. That is followed with an overview of the diverse areas of application of multivariate analysis in the pharmaceutical industry: from the development of real-time analytical methods to definition of the design space and control strategy, from formulation optimization during development to the application of quality-by-design principles to improve manufacture of existing commercial products.
Alternatives for using multivariate regression to adjust prospective payment rates
Sheingold, Steven H.
1990-01-01
Multivariate regression analysis has been used in structuring three of the adjustments to Medicare's prospective payment rates. Because the indirect-teaching adjustment, the disproportionate-share adjustment, and the adjustment for large cities are responsible for distributing approximately $3 billion in payments each year, the specification of regression models for these adjustments is of critical importance. In this article, the application of regression for adjusting Medicare's prospective rates is discussed, and the implications that differing specifications could have for these adjustments are demonstrated. PMID:10113271
NASA Astrophysics Data System (ADS)
Naguib, Ibrahim A.; Darwish, Hany W.
2012-02-01
A comparison between support vector regression (SVR) and Artificial Neural Networks (ANNs) multivariate regression methods is established showing the underlying algorithm for each and making a comparison between them to indicate the inherent advantages and limitations. In this paper we compare SVR to ANN with and without variable selection procedure (genetic algorithm (GA)). To project the comparison in a sensible way, the methods are used for the stability indicating quantitative analysis of mixtures of mebeverine hydrochloride and sulpiride in binary mixtures as a case study in presence of their reported impurities and degradation products (summing up to 6 components) in raw materials and pharmaceutical dosage form via handling the UV spectral data. For proper analysis, a 6 factor 5 level experimental design was established resulting in a training set of 25 mixtures containing different ratios of the interfering species. An independent test set consisting of 5 mixtures was used to validate the prediction ability of the suggested models. The proposed methods (linear SVR (without GA) and linear GA-ANN) were successfully applied to the analysis of pharmaceutical tablets containing mebeverine hydrochloride and sulpiride mixtures. The results manifest the problem of nonlinearity and how models like the SVR and ANN can handle it. The methods indicate the ability of the mentioned multivariate calibration models to deconvolute the highly overlapped UV spectra of the 6 components' mixtures, yet using cheap and easy to handle instruments like the UV spectrophotometer.
Punzo, Antonio; Ingrassia, Salvatore; Maruotti, Antonello
2018-04-22
A time-varying latent variable model is proposed to jointly analyze multivariate mixed-support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which is the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. HMRMFCs are inadequate for applications in which a clustering structure can be identified in the distribution of the covariates, as the clustering is independent from the covariates distribution. Here, hidden Markov regression models with random covariates are introduced by explicitly specifying state-specific distributions for the covariates, with the aim of improving the recovering of the clusters in the data with respect to a fixed covariates paradigm. The hidden Markov regression models with random covariates class is defined focusing on the exponential family, in a generalized linear model framework. Model identifiability conditions are sketched, an expectation-maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through simulation experiments and compared with those of HMRMFCs. The method is applied to physical activity data. Copyright © 2018 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Eyarkai Nambi, Vijayaram; Thangavel, Kuladaisamy; Manickavasagan, Annamalai; Shahir, Sultan
2017-01-01
Prediction of ripeness level in climacteric fruits is essential for post-harvest handling. An index capable of predicting ripening level with minimum inputs would be highly beneficial to the handlers, processors and researchers in fruit industry. A study was conducted with Indian mango cultivars to develop a ripeness index and associated model. Changes in physicochemical, colour and textural properties were measured throughout the ripening period and the period was classified into five stages (unripe, early ripe, partially ripe, ripe and over ripe). Multivariate regression techniques like partial least square regression, principal component regression and multi linear regression were compared and evaluated for its prediction. Multi linear regression model with 12 parameters was found more suitable in ripening prediction. Scientific variable reduction method was adopted to simplify the developed model. Better prediction was achieved with either 2 or 3 variables (total soluble solids, colour and acidity). Cross validation was done to increase the robustness and it was found that proposed ripening index was more effective in prediction of ripening stages. Three-variable model would be suitable for commercial applications where reasonable accuracies are sufficient. However, 12-variable model can be used to obtain more precise results in research and development applications.
A climate-based multivariate extreme emulator of met-ocean-hydrological events for coastal flooding
NASA Astrophysics Data System (ADS)
Camus, Paula; Rueda, Ana; Mendez, Fernando J.; Tomas, Antonio; Del Jesus, Manuel; Losada, Iñigo J.
2015-04-01
Atmosphere-ocean general circulation models (AOGCMs) are useful to analyze large-scale climate variability (long-term historical periods, future climate projections). However, applications such as coastal flood modeling require climate information at finer scale. Besides, flooding events depend on multiple climate conditions: waves, surge levels from the open-ocean and river discharge caused by precipitation. Therefore, a multivariate statistical downscaling approach is adopted to reproduce relationships between variables and due to its low computational cost. The proposed method can be considered as a hybrid approach which combines a probabilistic weather type downscaling model with a stochastic weather generator component. Predictand distributions are reproduced modeling the relationship with AOGCM predictors based on a physical division in weather types (Camus et al., 2012). The multivariate dependence structure of the predictand (extreme events) is introduced linking the independent marginal distributions of the variables by a probabilistic copula regression (Ben Ayala et al., 2014). This hybrid approach is applied for the downscaling of AOGCM data to daily precipitation and maximum significant wave height and storm-surge in different locations along the Spanish coast. Reanalysis data is used to assess the proposed method. A commonly predictor for the three variables involved is classified using a regression-guided clustering algorithm. The most appropriate statistical model (general extreme value distribution, pareto distribution) for daily conditions is fitted. Stochastic simulation of the present climate is performed obtaining the set of hydraulic boundary conditions needed for high resolution coastal flood modeling. References: Camus, P., Menéndez, M., Méndez, F.J., Izaguirre, C., Espejo, A., Cánovas, V., Pérez, J., Rueda, A., Losada, I.J., Medina, R. (2014b). A weather-type statistical downscaling framework for ocean wave climate. Journal of Geophysical Research, doi: 10.1002/2014JC010141. Ben Ayala, M.A., Chebana, F., Ouarda, T.B.M.J. (2014). Probabilistic Gaussian Copula Regression Model for Multisite and Multivariable Downscaling, Journal of Climate, 27, 3331-3347.
Creativity, Bipolar Disorder Vulnerability and Psychological Well-Being: A Preliminary Study
ERIC Educational Resources Information Center
Gostoli, Sara; Cerini, Veronica; Piolanti, Antonio; Rafanelli, Chiara
2017-01-01
The aim of this research was to investigate the relationships between creativity, subclinical bipolar disorder symptomatology, and psychological well-being. The study method was of descriptive, correlational type. Significant tests were performed using multivariate regression analysis. Students of the 4th grade of 6 different Italian colleges…
Dynamics of Volunteering in Older Europeans
ERIC Educational Resources Information Center
Hank, Karsten; Erlinghagen, Marcel
2010-01-01
Purpose: To investigate the dynamics of volunteering in the population aged 50 years or older across 11 Continental European countries. Design and Methods: Using longitudinal data from the first 2 waves of the Survey of Health, Ageing and Retirement in Europe, we run multivariate regressions on a set of binary-dependent variables indicating…
USDA-ARS?s Scientific Manuscript database
Spectral scattering is useful for nondestructive sensing of fruit firmness. Prediction models, however, are typically built using multivariate statistical methods such as partial least squares regression (PLSR), whose performance generally depends on the characteristics of the data. The aim of this ...
Detecting Outliers in Factor Analysis Using the Forward Search Algorithm
ERIC Educational Resources Information Center
Mavridis, Dimitris; Moustaki, Irini
2008-01-01
In this article we extend and implement the forward search algorithm for identifying atypical subjects/observations in factor analysis models. The forward search has been mainly developed for detecting aberrant observations in regression models (Atkinson, 1994) and in multivariate methods such as cluster and discriminant analysis (Atkinson, Riani,…
Biostatistics Series Module 10: Brief Overview of Multivariate Methods.
Hazra, Avijit; Gogtay, Nithya
2017-01-01
Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. The calculation intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with wider availability, and increasing sophistication of statistical software and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.
Regression analysis for LED color detection of visual-MIMO system
NASA Astrophysics Data System (ADS)
Banik, Partha Pratim; Saha, Rappy; Kim, Ki-Doo
2018-04-01
Color detection from a light emitting diode (LED) array using a smartphone camera is very difficult in a visual multiple-input multiple-output (visual-MIMO) system. In this paper, we propose a method to determine the LED color using a smartphone camera by applying regression analysis. We employ a multivariate regression model to identify the LED color. After taking a picture of an LED array, we select the LED array region, and detect the LED using an image processing algorithm. We then apply the k-means clustering algorithm to determine the number of potential colors for feature extraction of each LED. Finally, we apply the multivariate regression model to predict the color of the transmitted LEDs. In this paper, we show our results for three types of environmental light condition: room environmental light, low environmental light (560 lux), and strong environmental light (2450 lux). We compare the results of our proposed algorithm from the analysis of training and test R-Square (%) values, percentage of closeness of transmitted and predicted colors, and we also mention about the number of distorted test data points from the analysis of distortion bar graph in CIE1931 color space.
A diagnostic analysis of the VVP single-doppler retrieval technique
NASA Technical Reports Server (NTRS)
Boccippio, Dennis J.
1995-01-01
A diagnostic analysis of the VVP (volume velocity processing) retrieval method is presented, with emphasis on understanding the technique as a linear, multivariate regression. Similarities and differences to the velocity-azimuth display and extended velocity-azimuth display retrieval techniques are discussed, using this framework. Conventional regression diagnostics are then employed to quantitatively determine situations in which the VVP technique is likely to fail. An algorithm for preparation and analysis of a robust VVP retrieval is developed and applied to synthetic and actual datasets with high temporal and spatial resolution. A fundamental (but quantifiable) limitation to some forms of VVP analysis is inadequate sampling dispersion in the n space of the multivariate regression, manifest as a collinearity between the basis functions of some fitted parameters. Such collinearity may be present either in the definition of these basis functions or in their realization in a given sampling configuration. This nonorthogonality may cause numerical instability, variance inflation (decrease in robustness), and increased sensitivity to bias from neglected wind components. It is shown that these effects prevent the application of VVP to small azimuthal sectors of data. The behavior of the VVP regression is further diagnosed over a wide range of sampling constraints, and reasonable sector limits are established.
Darwish, Hany W; Hassan, Said A; Salem, Maissa Y; El-Zeany, Badr A
2013-09-01
Four simple, accurate and specific methods were developed and validated for the simultaneous estimation of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in commercial tablets. The derivative spectrophotometric methods include Derivative Ratio Zero Crossing (DRZC) and Double Divisor Ratio Spectra-Derivative Spectrophotometry (DDRS-DS) methods, while the multivariate calibrations used are Principal Component Regression (PCR) and Partial Least Squares (PLSs). The proposed methods were applied successfully in the determination of the drugs in laboratory-prepared mixtures and in commercial pharmaceutical preparations. The validity of the proposed methods was assessed using the standard addition technique. The linearity of the proposed methods is investigated in the range of 2-32, 4-44 and 2-20 μg/mL for AML, VAL and HCT, respectively. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Anggraeni, Anni; Arianto, Fernando; Mutalib, Abdul; Pratomo, Uji; Bahti, Husein H.
2017-05-01
Rare Earth Elements (REE) are elements that a lot of function for life, such as metallurgy, optical devices, and manufacture of electronic devices. Sources of REE is present in the mineral, in which each element has similar properties. Currently, to determining the content of REE is used instruments such as ICP-OES, ICP-MS, XRF, and HPLC. But in each instruments, there are still have some weaknesses. Therefore we need an alternative analytical method for the determination of rare earth metal content, one of them is by a combination of UV-Visible spectrophotometry and multivariate analysis, including Principal Component Analysis (PCA), Principal Component Regression (PCR), and Partial Least Square Regression (PLS). The purpose of this experiment is to determine the content of light and medium rare earth elements in the mineral monazite without chemical separation by using a combination of multivariate analysis and UV-Visible spectrophotometric methods. Training set created 22 variations of concentration and absorbance was measured using a UV-Vis spectrophotometer, then the data is processed by PCA, PCR, and PLSR. The results were compared and validated to obtain the mathematical equation with the smallest percent error. From this experiment, mathematical equation used PLS methods was better than PCR after validated, which has RMSE value for La, Ce, Pr, Nd, Gd, Sm, Eu, and Tb respectively 0.095; 0.573; 0.538; 0.440; 3.387; 1.240; 1.870; and 0.639.
Statistical primer: propensity score matching and its alternatives.
Benedetto, Umberto; Head, Stuart J; Angelini, Gianni D; Blackstone, Eugene H
2018-06-01
Propensity score (PS) methods offer certain advantages over more traditional regression methods to control for confounding by indication in observational studies. Although multivariable regression models adjust for confounders by modelling the relationship between covariates and outcome, the PS methods estimate the treatment effect by modelling the relationship between confounders and treatment assignment. Therefore, methods based on the PS are not limited by the number of events, and their use may be warranted when the number of confounders is large, or the number of outcomes is small. The PS is the probability for a subject to receive a treatment conditional on a set of baseline characteristics (confounders). The PS is commonly estimated using logistic regression, and it is used to match patients with similar distribution of confounders so that difference in outcomes gives unbiased estimate of treatment effect. This review summarizes basic concepts of the PS matching and provides guidance in implementing matching and other methods based on the PS, such as stratification, weighting and covariate adjustment.
Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM
ERIC Educational Resources Information Center
Warner, Rebecca M.
2007-01-01
This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…
Ozdemir, Durmus; Dinc, Erdal
2004-07-01
Simultaneous determination of binary mixtures pyridoxine hydrochloride and thiamine hydrochloride in a vitamin combination using UV-visible spectrophotometry and classical least squares (CLS) and three newly developed genetic algorithm (GA) based multivariate calibration methods was demonstrated. The three genetic multivariate calibration methods are Genetic Classical Least Squares (GCLS), Genetic Inverse Least Squares (GILS) and Genetic Regression (GR). The sample data set contains the UV-visible spectra of 30 synthetic mixtures (8 to 40 microg/ml) of these vitamins and 10 tablets containing 250 mg from each vitamin. The spectra cover the range from 200 to 330 nm in 0.1 nm intervals. Several calibration models were built with the four methods for the two components. Overall, the standard error of calibration (SEC) and the standard error of prediction (SEP) for the synthetic data were in the range of <0.01 and 0.43 microg/ml for all the four methods. The SEP values for the tablets were in the range of 2.91 and 11.51 mg/tablets. A comparison of genetic algorithm selected wavelengths for each component using GR method was also included.
Yoneoka, Daisuke; Henmi, Masayuki
2017-06-01
Recently, the number of regression models has dramatically increased in several academic fields. However, within the context of meta-analysis, synthesis methods for such models have not been developed in a commensurate trend. One of the difficulties hindering the development is the disparity in sets of covariates among literature models. If the sets of covariates differ across models, interpretation of coefficients will differ, thereby making it difficult to synthesize them. Moreover, previous synthesis methods for regression models, such as multivariate meta-analysis, often have problems because covariance matrix of coefficients (i.e. within-study correlations) or individual patient data are not necessarily available. This study, therefore, proposes a brief explanation regarding a method to synthesize linear regression models under different covariate sets by using a generalized least squares method involving bias correction terms. Especially, we also propose an approach to recover (at most) threecorrelations of covariates, which is required for the calculation of the bias term without individual patient data. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Yu, H.; Gu, H.
2017-12-01
A novel multivariate seismic formation pressure prediction methodology is presented, which incorporates high-resolution seismic velocity data from prestack AVO inversion, and petrophysical data (porosity and shale volume) derived from poststack seismic motion inversion. In contrast to traditional seismic formation prediction methods, the proposed methodology is based on a multivariate pressure prediction model and utilizes a trace-by-trace multivariate regression analysis on seismic-derived petrophysical properties to calibrate model parameters in order to make accurate predictions with higher resolution in both vertical and lateral directions. With prestack time migration velocity as initial velocity model, an AVO inversion was first applied to prestack dataset to obtain high-resolution seismic velocity with higher frequency that is to be used as the velocity input for seismic pressure prediction, and the density dataset to calculate accurate Overburden Pressure (OBP). Seismic Motion Inversion (SMI) is an inversion technique based on Markov Chain Monte Carlo simulation. Both structural variability and similarity of seismic waveform are used to incorporate well log data to characterize the variability of the property to be obtained. In this research, porosity and shale volume are first interpreted on well logs, and then combined with poststack seismic data using SMI to build porosity and shale volume datasets for seismic pressure prediction. A multivariate effective stress model is used to convert velocity, porosity and shale volume datasets to effective stress. After a thorough study of the regional stratigraphic and sedimentary characteristics, a regional normally compacted interval model is built, and then the coefficients in the multivariate prediction model are determined in a trace-by-trace multivariate regression analysis on the petrophysical data. The coefficients are used to convert velocity, porosity and shale volume datasets to effective stress and then to calculate formation pressure with OBP. Application of the proposed methodology to a research area in East China Sea has proved that the method can bridge the gap between seismic and well log pressure prediction and give predicted pressure values close to pressure meassurements from well testing.
Yilmaz, Banu; Aras, Egemen; Nacar, Sinan; Kankal, Murat
2018-05-23
The functional life of a dam is often determined by the rate of sediment delivery to its reservoir. Therefore, an accurate estimate of the sediment load in rivers with dams is essential for designing and predicting a dam's useful lifespan. The most credible method is direct measurements of sediment input, but this can be very costly and it cannot always be implemented at all gauging stations. In this study, we tested various regression models to estimate suspended sediment load (SSL) at two gauging stations on the Çoruh River in Turkey, including artificial bee colony (ABC), teaching-learning-based optimization algorithm (TLBO), and multivariate adaptive regression splines (MARS). These models were also compared with one another and with classical regression analyses (CRA). Streamflow values and previously collected data of SSL were used as model inputs with predicted SSL data as output. Two different training and testing dataset configurations were used to reinforce the model accuracy. For the MARS method, the root mean square error value was found to range between 35% and 39% for the test two gauging stations, which was lower than errors for other models. Error values were even lower (7% to 15%) using another dataset. Our results indicate that simultaneous measurements of streamflow with SSL provide the most effective parameter for obtaining accurate predictive models and that MARS is the most accurate model for predicting SSL. Copyright © 2017 Elsevier B.V. All rights reserved.
Effect of Contact Damage on the Strength of Ceramic Materials.
1982-10-01
variables that are important to erosion, and a multivariate , linear regression analysis is used to fit the data to the dimensional analysis. The...of Equations 7 and 8 by a multivariable regression analysis (room tem- perature data) Exponent Regression Standard error Computed coefficient of...1980) 593. WEAVER, Proc. Brit. Ceram. Soc. 22 (1973) 125. 39. P. W. BRIDGMAN, "Dimensional Analaysis ", (Yale 18. R. W. RICE, S. W. FREIMAN and P. F
Guo, Ying; Little, Roderick J; McConnell, Daniel S
2012-01-01
Covariate measurement error is common in epidemiologic studies. Current methods for correcting measurement error with information from external calibration samples are insufficient to provide valid adjusted inferences. We consider the problem of estimating the regression of an outcome Y on covariates X and Z, where Y and Z are observed, X is unobserved, but a variable W that measures X with error is observed. Information about measurement error is provided in an external calibration sample where data on X and W (but not Y and Z) are recorded. We describe a method that uses summary statistics from the calibration sample to create multiple imputations of the missing values of X in the regression sample, so that the regression coefficients of Y on X and Z and associated standard errors can be estimated using simple multiple imputation combining rules, yielding valid statistical inferences under the assumption of a multivariate normal distribution. The proposed method is shown by simulation to provide better inferences than existing methods, namely the naive method, classical calibration, and regression calibration, particularly for correction for bias and achieving nominal confidence levels. We also illustrate our method with an example using linear regression to examine the relation between serum reproductive hormone concentrations and bone mineral density loss in midlife women in the Michigan Bone Health and Metabolism Study. Existing methods fail to adjust appropriately for bias due to measurement error in the regression setting, particularly when measurement error is substantial. The proposed method corrects this deficiency.
Estuarine Sediment Deposition during Wetland Restoration: A GIS and Remote Sensing Modeling Approach
NASA Technical Reports Server (NTRS)
Newcomer, Michelle; Kuss, Amber; Kentron, Tyler; Remar, Alex; Choksi, Vivek; Skiles, J. W.
2011-01-01
Restoration of the industrial salt flats in the San Francisco Bay, California is an ongoing wetland rehabilitation project. Remote sensing maps of suspended sediment concentration, and other GIS predictor variables were used to model sediment deposition within these recently restored ponds. Suspended sediment concentrations were calibrated to reflectance values from Landsat TM 5 and ASTER using three statistical techniques -- linear regression, multivariate regression, and an Artificial Neural Network (ANN), to map suspended sediment concentrations. Multivariate and ANN regressions using ASTER proved to be the most accurate methods, yielding r2 values of 0.88 and 0.87, respectively. Predictor variables such as sediment grain size and tidal frequency were used in the Marsh Sedimentation (MARSED) model for predicting deposition rates for three years. MARSED results for a fully restored pond show a root mean square deviation (RMSD) of 66.8 mm (<1) between modeled and field observations. This model was further applied to a pond breached in November 2010 and indicated that the recently breached pond will reach equilibrium levels after 60 months of tidal inundation.
Giacomo, Della Riccia; Stefania, Del Zotto
2013-12-15
Fumonisins are mycotoxins produced by Fusarium species that commonly live in maize. Whereas fungi damage plants, fumonisins cause disease both to cattle breedings and human beings. Law limits set fumonisins tolerable daily intake with respect to several maize based feed and food. Chemical techniques assure the most reliable and accurate measurements, but they are expensive and time consuming. A method based on Near Infrared spectroscopy and multivariate statistical regression is described as a simpler, cheaper and faster alternative. We apply Partial Least Squares with full cross validation. Two models are described, having high correlation of calibration (0.995, 0.998) and of validation (0.908, 0.909), respectively. Description of observed phenomenon is accurate and overfitting is avoided. Screening of contaminated maize with respect to European legal limit of 4 mg kg(-1) should be assured. Copyright © 2013 Elsevier Ltd. All rights reserved.
[Multivariate Adaptive Regression Splines (MARS), an alternative for the analysis of time series].
Vanegas, Jairo; Vásquez, Fabián
Multivariate Adaptive Regression Splines (MARS) is a non-parametric modelling method that extends the linear model, incorporating nonlinearities and interactions between variables. It is a flexible tool that automates the construction of predictive models: selecting relevant variables, transforming the predictor variables, processing missing values and preventing overshooting using a self-test. It is also able to predict, taking into account structural factors that might influence the outcome variable, thereby generating hypothetical models. The end result could identify relevant cut-off points in data series. It is rarely used in health, so it is proposed as a tool for the evaluation of relevant public health indicators. For demonstrative purposes, data series regarding the mortality of children under 5 years of age in Costa Rica were used, comprising the period 1978-2008. Copyright © 2016 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
Sexual Orientation, Weight Concerns, and Eating-Disordered Behaviors in Adolescent Girls and Boys.
ERIC Educational Resources Information Center
Austin, S. Bryn; Ziyadeh, Najat; Kahn, Jessica A.; Camargo, Carlos A.; Colditz, Graham A.; Field, Alison E.
2004-01-01
Objective: To examine sexual orientation group differences in eating disorder symptoms in adolescent girls and boys. Method: Cross-sectional associations were examined using multivariate regression techniques using data gathered in 1999 from 10,583 adolescents in the Growing Up Today Study, a cohort of children of women participating in the…
Associations between Smoking and Extreme Dieting among Adolescents
ERIC Educational Resources Information Center
Seo, Dong-Chul; Jiang, Nan
2009-01-01
This study examined the association between cigarette smoking and dieting behaviors and trends in that association among US adolescents in grades 9-12 between 1999 and 2007. Youth Risk Behavior Survey datasets were analyzed using the multivariable logistic regression method. The sample size of each survey year ranged from 13,554 to 15,273 with…
Comparison of Employer Factors in Disability and Other Employment Discrimination Charges
ERIC Educational Resources Information Center
Nazarov, Zafar E.; von Schrader, Sarah
2014-01-01
Purpose: We explore whether certain employer characteristics predict Americans with Disabilities Act (ADA) charges and whether the same characteristics predict receipt of the Age Discrimination in Employment Act and Title VII of the Civil Rights Act charges. Method: We estimate a set of multivariate regressions using the ordinary least squares…
Parental Youth Assets and Sexual Activity: Differences by Race/Ethnicity
ERIC Educational Resources Information Center
Tolma, Eleni L.; Oman, Roy F.; Vesely, Sara K.; Aspy, Cheryl B.; Beebe, Laura; Fluhr, Janene
2011-01-01
Objectives: To examine how the relationship between parental-related youth assets and youth sexual activity differed by race/ethnicity. Methods: A random sample of 976 youth and their parents living in a Midwestern city participated in the study. Multivariate logistic regression analyses were conducted for 3 major ethnic groups controlling for the…
Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua
2013-03-01
Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2 %). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95 % CI = 3.076-219.747). The middle-aged and elderly have less incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.
Menon, Ramkumar; Bhat, Geeta; Saade, George R; Spratt, Heidi
2014-04-01
To develop classification models of demographic/clinical factors and biomarker data from spontaneous preterm birth in African Americans and Caucasians. Secondary analysis of biomarker data using multivariate adaptive regression splines (MARS), a supervised machine learning algorithm method. Analysis of data on 36 biomarkers from 191 women was reduced by MARS to develop predictive models for preterm birth in African Americans and Caucasians. Maternal plasma, cord plasma collected at admission for preterm or term labor and amniotic fluid at delivery. Data were partitioned into training and testing sets. Variable importance, a relative indicator (0-100%) and area under the receiver operating characteristic curve (AUC) characterized results. Multivariate adaptive regression splines generated models for combined and racially stratified biomarker data. Clinical and demographic data did not contribute to the model. Racial stratification of data produced distinct models in all three compartments. In African Americans maternal plasma samples IL-1RA, TNF-α, angiopoietin 2, TNFRI, IL-5, MIP1α, IL-1β and TGF-α modeled preterm birth (AUC train: 0.98, AUC test: 0.86). In Caucasians TNFR1, ICAM-1 and IL-1RA contributed to the model (AUC train: 0.84, AUC test: 0.68). African Americans cord plasma samples produced IL-12P70, IL-8 (AUC train: 0.82, AUC test: 0.66). Cord plasma in Caucasians modeled IGFII, PDGFBB, TGF-β1 , IL-12P70, and TIMP1 (AUC train: 0.99, AUC test: 0.82). Amniotic fluid in African Americans modeled FasL, TNFRII, RANTES, KGF, IGFI (AUC train: 0.95, AUC test: 0.89) and in Caucasians, TNF-α, MCP3, TGF-β3 , TNFR1 and angiopoietin 2 (AUC train: 0.94 AUC test: 0.79). Multivariate adaptive regression splines models multiple biomarkers associated with preterm birth and demonstrated racial disparity. © 2014 Nordic Federation of Societies of Obstetrics and Gynecology.
Risk prediction for myocardial infarction via generalized functional regression models.
Ieva, Francesca; Paganoni, Anna M
2016-08-01
In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to 118 Dispatch Center of Milan (the Italian free-toll number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Thus, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the Principal Components decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Brunch Block). The performance of this classification method is evaluated and compared with other methods proposed in literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.
Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru
2010-08-01
The sole use of pyridoxine hydrochloride during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set comprising binary mixtures of PYR and ISO consisting of 20 different combinations were randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recovery results obtained by applying PCR and PLS calibrations to the artificial mixtures were found between 100.0 and 100.7%. Satisfactory results obtained by applying the PLS and PCR methods to both artificial and commercial samples were obtained. The results obtained in this manuscript strongly encourage us to use them for the quality control and the routine analysis of the marketing tablets containing PYR and ISO drugs. Copyright © 2010 John Wiley & Sons, Ltd.
Variable Selection in Logistic Regression.
1987-06-01
23 %. AUTIOR(.) S. CONTRACT OR GRANT NUMBE Rf.i %Z. D. Bai, P. R. Krishnaiah and . C. Zhao F49620-85- C-0008 " PERFORMING ORGANIZATION NAME AND AOORESS...d I7 IOK-TK- d 7 -I0 7’ VARIABLE SELECTION IN LOGISTIC REGRESSION Z. D. Bai, P. R. Krishnaiah and L. C. Zhao Center for Multivariate Analysis...University of Pittsburgh Center for Multivariate Analysis University of Pittsburgh Y !I VARIABLE SELECTION IN LOGISTIC REGRESSION Z- 0. Bai, P. R. Krishnaiah
NASA Astrophysics Data System (ADS)
Durmaz, Murat; Karslioglu, Mahmut Onur
2015-04-01
There are various global and regional methods that have been proposed for the modeling of ionospheric vertical total electron content (VTEC). Global distribution of VTEC is usually modeled by spherical harmonic expansions, while tensor products of compactly supported univariate B-splines can be used for regional modeling. In these empirical parametric models, the coefficients of the basis functions as well as differential code biases (DCBs) of satellites and receivers can be treated as unknown parameters which can be estimated from geometry-free linear combinations of global positioning system observables. In this work we propose a new semi-parametric multivariate adaptive regression B-splines (SP-BMARS) method for the regional modeling of VTEC together with satellite and receiver DCBs, where the parametric part of the model is related to the DCBs as fixed parameters and the non-parametric part adaptively models the spatio-temporal distribution of VTEC. The latter is based on multivariate adaptive regression B-splines which is a non-parametric modeling technique making use of compactly supported B-spline basis functions that are generated from the observations automatically. This algorithm takes advantage of an adaptive scale-by-scale model building strategy that searches for best-fitting B-splines to the data at each scale. The VTEC maps generated from the proposed method are compared numerically and visually with the global ionosphere maps (GIMs) which are provided by the Center for Orbit Determination in Europe (CODE). The VTEC values from SP-BMARS and CODE GIMs are also compared with VTEC values obtained through calibration using local ionospheric model. The estimated satellite and receiver DCBs from the SP-BMARS model are compared with the CODE distributed DCBs. The results show that the SP-BMARS algorithm can be used to estimate satellite and receiver DCBs while adaptively and flexibly modeling the daily regional VTEC.
Byun, Bo-Ram; Kim, Yong-Il; Yamaguchi, Tetsutaro; Maki, Koutaro; Ko, Ching-Chang; Hwang, Dea-Seok; Park, Soo-Byung; Son, Woo-Sung
2015-11-01
The purpose of this study was to establish multivariable regression models for the estimation of skeletal maturation status in Japanese boys and girls using the cone-beam computed tomography (CBCT)-based cervical vertebral maturation (CVM) assessment method and hand-wrist radiography. The analyzed sample consisted of hand-wrist radiographs and CBCT images from 47 boys and 57 girls. To quantitatively evaluate the correlation between the skeletal maturation status and measurement ratios, a CBCT-based CVM assessment method was applied to the second, third, and fourth cervical vertebrae. Pearson's correlation coefficient analysis and multivariable regression analysis were used to determine the ratios for each of the cervical vertebrae (p < 0.05). Four characteristic parameters ((OH2 + PH2)/W2, (OH2 + AH2)/W2, D2, AH3/W3), as independent variables, were used to build the multivariable regression models: for the Japanese boys, the skeletal maturation status according to the CBCT-based quantitative cervical vertebral maturation (QCVM) assessment was 5.90 + 99.11 × AH3/W3 - 14.88 × (OH2 + AH2)/W2 + 13.24 × D2; for the Japanese girls, it was 41.39 + 59.52 × AH3/W3 - 15.88 × (OH2 + PH2)/W2 + 10.93 × D2. The CBCT-generated CVM images proved very useful to the definition of the cervical vertebral body and the odontoid process. The newly developed CBCT-based QCVM assessment method showed a high correlation between the derived ratios from the second cervical vertebral body and odontoid process. There are high correlations between the skeletal maturation status and the ratios of the second cervical vertebra based on the remnant of dentocentral synchondrosis.
NASA Astrophysics Data System (ADS)
Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-01
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). While, the univariate CWT failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, the ternary mixtures of DRO, CAF, and PAR and CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. While for the development of the CWT-PLS model a calibration set was prepared by means of an orthogonal experimental design and their absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.
Decomposing Racial/Ethnic Disparities in Influenza Vaccination among the Elderly
Yoo, Byung-Kwang; Hasebe, Takuya; Szilagyi, Peter G.
2015-01-01
While persistent racial/ethnic disparities in influenza vaccination have been reported among the elderly, characteristics contributing to disparities are poorly understood. This study aimed to assess characteristics associated with racial/ethnic disparities in influenza vaccination using a nonlinear Oaxaca-Blinder decomposition method. We performed cross-sectional multivariable logistic regression analyses for which the dependent variable was self-reported receipt of influenza vaccine during the 2010–2011 season among community dwelling non-Hispanic African-American (AA), non-Hispanic White (W), English-speaking Hispanic (EH) and Spanish-speaking Hispanic (SH) elderly, enrolled in the 2011 Medicare Current Beneficiary Survey (MCBS) (un-weighted/weighted N= 6,095/19.2million). Using the nonlinear Oaxaca-Blinder decomposition method, we assessed the relative contribution of seventeen covariates—including socio-demographic characteristics, health status, insurance, access, preference regarding healthcare, and geographic regions —to disparities in influenza vaccination. Unadjusted racial/ethnic disparities in influenza vaccination were 14.1 percentage points (pp) (W-AA disparity, p<.001), 25.7 pp (W-SH disparity, p<.001) and 0.6 pp (W-EH disparity, p>.8). The Oaxaca-Blinder decomposition method estimated that the unadjusted W-AA and W-SH disparities in vaccination could be reduced by only 45% even if AA and SH groups become equivalent to Whites in all covariates in multivariable regression models. The remaining 55% of disparities were attributed to (a) racial/ethnic differences in the estimated coefficients (e.g., odds ratios) in the regression models and (b) characteristics not included in the regression models. Our analysis found that only about 45% of racial/ethnic disparities in influenza vaccination among the elderly could be reduced by equalizing recognized characteristics among racial/ethnic groups. Future studies are needed to identify additional modifiable characteristics causing disparities in influenza vaccination. PMID:25900133
Kinoshita, Shoji; Kakuda, Wataru; Momosaki, Ryo; Yamada, Naoki; Sugawara, Hidekazu; Watanabe, Shu; Abo, Masahiro
2015-05-01
Early rehabilitation for acute stroke patients is widely recommended. We tested the hypothesis that clinical outcome of stroke patients who receive early rehabilitation managed by board-certificated physiatrists (BCP) is generally better than that provided by other medical specialties. Data of stroke patients who underwent early rehabilitation in 19 acute hospitals between January 2005 and December 2013 were collected from the Japan Rehabilitation Database and analyzed retrospectively. Multivariate linear regression analysis using generalized estimating equations method was performed to assess the association between Functional Independence Measure (FIM) effectiveness and management provided by BCP in early rehabilitation. In addition, multivariate logistic regression analysis was also performed to assess the impact of management provided by BCP in acute phase on discharge destination. After setting the inclusion criteria, data of 3838 stroke patients were eligible for analysis. BCP provided early rehabilitation in 814 patients (21.2%). Both the duration of daily exercise time and the frequency of regular conferencing were significantly higher for patients managed by BCP than by other specialties. Although the mortality rate was not different, multivariate regression analysis showed that FIM effectiveness correlated significantly and positively with the management provided by BCP (coefficient, .35; 95% confidence interval [CI], .012-.059; P < .005). In addition, multivariate logistic analysis identified clinical management by BCP as a significant determinant of home discharge (odds ratio, 1.24; 95% CI, 1.08-1.44; P < .005). Our retrospective cohort study demonstrated that clinical management provided by BCP in early rehabilitation can lead to functional recovery of acute stroke. Copyright © 2015 National Stroke Association. Published by Elsevier Inc. All rights reserved.
Physical function in older men with hyperkyphosis.
Katzman, Wendy B; Harrison, Stephanie L; Fink, Howard A; Marshall, Lynn M; Orwoll, Eric; Barrett-Connor, Elizabeth; Cawthon, Peggy M; Kado, Deborah M
2015-05-01
Age-related hyperkyphosis has been associated with poor physical function and is a well-established predictor of adverse health outcomes in older women, but its impact on health in older men is less well understood. We conducted a cross-sectional study to evaluate the association of hyperkyphosis and physical function in 2,363 men, aged 71-98 (M = 79) from the Osteoporotic Fractures in Men Study. Kyphosis was measured using the Rancho Bernardo Study block method. Measurements of grip strength and lower extremity function, including gait speed over 6 m, narrow walk (measure of dynamic balance), repeated chair stands ability and time, and lower extremity power (Nottingham Power Rig) were included separately as primary outcomes. We investigated associations of kyphosis and each outcome in age-adjusted and multivariable linear or logistic regression models, controlling for age, clinic, education, race, bone mineral density, height, weight, diabetes, and physical activity. In multivariate linear regression, we observed a dose-related response of worse scores on each lower extremity physical function test as number of blocks increased, p for trend ≤.001. Using a cutoff of ≥4 blocks, 20% (N = 469) of men were characterized with hyperkyphosis. In multivariate logistic regression, men with hyperkyphosis had increased odds (range 1.5-1.8) of being in the worst quartile of performing lower extremity physical function tasks (p < .001 for each outcome). Kyphosis was not associated with grip strength in any multivariate analysis. Hyperkyphosis is associated with impaired lower extremity physical function in older men. Further studies are needed to determine the direction of causality. © The Author 2014. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Shi, Jinfei; Zhu, Songqing; Chen, Ruwen
2017-12-01
An order selection method based on multiple stepwise regressions is proposed for General Expression of Nonlinear Autoregressive model which converts the model order problem into the variable selection of multiple linear regression equation. The partial autocorrelation function is adopted to define the linear term in GNAR model. The result is set as the initial model, and then the nonlinear terms are introduced gradually. Statistics are chosen to study the improvements of both the new introduced and originally existed variables for the model characteristics, which are adopted to determine the model variables to retain or eliminate. So the optimal model is obtained through data fitting effect measurement or significance test. The simulation and classic time-series data experiment results show that the method proposed is simple, reliable and can be applied to practical engineering.
Improved estimation of PM2.5 using Lagrangian satellite-measured aerosol optical depth
NASA Astrophysics Data System (ADS)
Olivas Saunders, Rolando
Suspended particulate matter (aerosols) with aerodynamic diameters less than 2.5 mum (PM2.5) has negative effects on human health, plays an important role in climate change and also causes the corrosion of structures by acid deposition. Accurate estimates of PM2.5 concentrations are thus relevant in air quality, epidemiology, cloud microphysics and climate forcing studies. Aerosol optical depth (AOD) retrieved by the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite instrument has been used as an empirical predictor to estimate ground-level concentrations of PM2.5 . These estimates usually have large uncertainties and errors. The main objective of this work is to assess the value of using upwind (Lagrangian) MODIS-AOD as predictors in empirical models of PM2.5. The upwind locations of the Lagrangian AOD were estimated using modeled backward air trajectories. Since the specification of an arrival elevation is somewhat arbitrary, trajectories were calculated to arrive at four different elevations at ten measurement sites within the continental United States. A systematic examination revealed trajectory model calculations to be sensitive to starting elevation. With a 500 m difference in starting elevation, the 48-hr mean horizontal separation of trajectory endpoints was 326 km. When the difference in starting elevation was doubled and tripled to 1000 m and 1500m, the mean horizontal separation of trajectory endpoints approximately doubled and tripled to 627 km and 886 km, respectively. A seasonal dependence of this sensitivity was also found: the smallest mean horizontal separation of trajectory endpoints was exhibited during the summer and the largest separations during the winter. A daily average AOD product was generated and coupled to the trajectory model in order to determine AOD values upwind of the measurement sites during the period 2003-2007. Empirical models that included in situ AOD and upwind AOD as predictors of PM2.5 were generated by multivariate linear regressions using the least squares method. The multivariate models showed improved performance over the single variable regression (PM2.5 and in situ AOD) models. The statistical significance of the improvement of the multivariate models over the single variable regression models was tested using the extra sum of squares principle. In many cases, even when the R-squared was high for the multivariate models, the improvement over the single models was not statistically significant. The R-squared of these multivariate models varied with respect to seasons, with the best performance occurring during the summer months. A set of seasonal categorical variables was included in the regressions to exploit this variability. The multivariate regression models that included these categorical seasonal variables performed better than the models that didn't account for seasonal variability. Furthermore, 71% of these regressions exhibited improvement over the single variable models that was statistically significant at a 95% confidence level.
NASA Astrophysics Data System (ADS)
Mfumu Kihumba, Antoine; Ndembo Longo, Jean; Vanclooster, Marnik
2016-03-01
A multivariate statistical modelling approach was applied to explain the anthropogenic pressure of nitrate pollution on the Kinshasa groundwater body (Democratic Republic of Congo). Multiple regression and regression tree models were compared and used to identify major environmental factors that control the groundwater nitrate concentration in this region. The analyses were made in terms of physical attributes related to the topography, land use, geology and hydrogeology in the capture zone of different groundwater sampling stations. For the nitrate data, groundwater datasets from two different surveys were used. The statistical models identified the topography, the residential area, the service land (cemetery), and the surface-water land-use classes as major factors explaining nitrate occurrence in the groundwater. Also, groundwater nitrate pollution depends not on one single factor but on the combined influence of factors representing nitrogen loading sources and aquifer susceptibility characteristics. The groundwater nitrate pressure was better predicted with the regression tree model than with the multiple regression model. Furthermore, the results elucidated the sensitivity of the model performance towards the method of delineation of the capture zones. For pollution modelling at the monitoring points, therefore, it is better to identify capture-zone shapes based on a conceptual hydrogeological model rather than to adopt arbitrary circular capture zones.
NASA Astrophysics Data System (ADS)
Moustafa, Azza Aziz; Salem, Hesham; Hegazy, Maha; Ali, Omnia
2015-02-01
Simple, accurate, and selective methods have been developed and validated for simultaneous determination of a ternary mixture of Chlorpheniramine maleate (CPM), Pseudoephedrine HCl (PSE) and Ibuprofen (IBF), in tablet dosage form. Four univariate methods manipulating ratio spectra were applied, method A is the double divisor-ratio difference spectrophotometric method (DD-RD). Method B is double divisor-derivative ratio spectrophotometric method (DD-RD). Method C is derivative ratio spectrum-zero crossing method (DRZC), while method D is mean centering of ratio spectra (MCR). Two multivariate methods were also developed and validated, methods E and F are Principal Component Regression (PCR) and Partial Least Squares (PLSs). The proposed methods have the advantage of simultaneous determination of the mentioned drugs without prior separation steps. They were successfully applied to laboratory-prepared mixtures and to commercial pharmaceutical preparation without any interference from additives. The proposed methods were validated according to the ICH guidelines. The obtained results were statistically compared with the official methods where no significant difference was observed regarding both accuracy and precision.
Multivariate Analysis and Prediction of Dioxin-Furan ...
Peer Review Draft of Regional Methods Initiative Final Report Dioxins, which are bioaccumulative and environmentally persistent, pose an ongoing risk to human and ecosystem health. Fish constitute a significant source of dioxin exposure for humans and fish-eating wildlife. Current dioxin analytical methods are costly, time-consuming, and produce hazardous by-products. A Danish team developed a novel, multivariate statistical methodology based on the covariance of dioxin-furan congener Toxic Equivalences (TEQs) and fatty acid methyl esters (FAMEs) and applied it to North Atlantic Ocean fishmeal samples. The goal of the current study was to attempt to extend this Danish methodology to 77 whole and composite fish samples from three trophic groups: predator (whole largemouth bass), benthic (whole flathead and channel catfish) and forage fish (composite bluegill, pumpkinseed and green sunfish) from two dioxin contaminated rivers (Pocatalico R. and Kanawha R.) in West Virginia, USA. Multivariate statistical analyses, including, Principal Components Analysis (PCA), Hierarchical Clustering, and Partial Least Squares Regression (PLS), were used to assess the relationship between the FAMEs and TEQs in these dioxin contaminated freshwater fish from the Kanawha and Pocatalico Rivers. These three multivariate statistical methods all confirm that the pattern of Fatty Acid Methyl Esters (FAMEs) in these freshwater fish covaries with and is predictive of the WHO TE
ERIC Educational Resources Information Center
Grinshteyn, Erin; Yang, Y. T.
2017-01-01
Background: We examined the relationship between exposure to electronic bullying and absenteeism as a result of being afraid. Methods: This multivariate, multinomial regression analysis of the 2013 Youth Risk Behavior Survey data assessed the association between experiencing electronic bullying in the past year and how often students were absent…
ERIC Educational Resources Information Center
Casola, Allison R.; Nelson, Deborah B.; Patterson, Freda
2017-01-01
Background: Contraception non-use among sexually active adolescents is a major cause of unintended pregnancy (UP). Methods: In this cross-sectional study we sought to identify overall and sex-specific correlates of contraception non-use using the 2015 Philadelphia Youth Risk Behavior Survey (YRBS) (N = 9540). Multivariate regression models were…
ERIC Educational Resources Information Center
Tamers, Sara L.; Allen, Jennifer; Yang, May; Stoddard, Anne; Harley, Amy; Sorensen, Glorian
2014-01-01
Objective: To explore relationships between concerns and physical activity and body mass index (BMI) among a racially/ethnically diverse low-income population. Method: A cross-sectional survey documented behavioral risks among racially/ethnically diverse low-income residents in the Boston area (2005-2009). Multivariable logistic regressions were…
ERIC Educational Resources Information Center
Muslihah, Oleh Eneng
2015-01-01
The research examines the correlation between the understanding of school-based management, emotional intelligences and headmaster performance. Data was collected, using quantitative methods. The statistical analysis used was the Pearson Correlation, and multivariate regression analysis. The results of this research suggest firstly that there is…
Quantifying the impact of between-study heterogeneity in multivariate meta-analyses
Jackson, Dan; White, Ian R; Riley, Richard D
2012-01-01
Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I2, which we call . We also provide a multivariate H2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I2 statistic, . Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
Rapid and Simultaneous Prediction of Eight Diesel Quality Parameters through ATR-FTIR Analysis.
Nespeca, Maurilio Gustavo; Hatanaka, Rafael Rodrigues; Flumignan, Danilo Luiz; de Oliveira, José Eduardo
2018-01-01
Quality assessment of diesel fuel is highly necessary for society, but the costs and time spent are very high while using standard methods. Therefore, this study aimed to develop an analytical method capable of simultaneously determining eight diesel quality parameters (density; flash point; total sulfur content; distillation temperatures at 10% (T10), 50% (T50), and 85% (T85) recovery; cetane index; and biodiesel content) through attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy and the multivariate regression method, partial least square (PLS). For this purpose, the quality parameters of 409 samples were determined using standard methods, and their spectra were acquired in ranges of 4000-650 cm -1 . The use of the multivariate filters, generalized least squares weighting (GLSW) and orthogonal signal correction (OSC), was evaluated to improve the signal-to-noise ratio of the models. Likewise, four variable selection approaches were tested: manual exclusion, forward interval PLS (FiPLS), backward interval PLS (BiPLS), and genetic algorithm (GA). The multivariate filters and variables selection algorithms generated more fitted and accurate PLS models. According to the validation, the FTIR/PLS models presented accuracy comparable to the reference methods and, therefore, the proposed method can be applied in the diesel routine monitoring to significantly reduce costs and analysis time.
Rapid and Simultaneous Prediction of Eight Diesel Quality Parameters through ATR-FTIR Analysis
Hatanaka, Rafael Rodrigues; Flumignan, Danilo Luiz; de Oliveira, José Eduardo
2018-01-01
Quality assessment of diesel fuel is highly necessary for society, but the costs and time spent are very high while using standard methods. Therefore, this study aimed to develop an analytical method capable of simultaneously determining eight diesel quality parameters (density; flash point; total sulfur content; distillation temperatures at 10% (T10), 50% (T50), and 85% (T85) recovery; cetane index; and biodiesel content) through attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy and the multivariate regression method, partial least square (PLS). For this purpose, the quality parameters of 409 samples were determined using standard methods, and their spectra were acquired in ranges of 4000–650 cm−1. The use of the multivariate filters, generalized least squares weighting (GLSW) and orthogonal signal correction (OSC), was evaluated to improve the signal-to-noise ratio of the models. Likewise, four variable selection approaches were tested: manual exclusion, forward interval PLS (FiPLS), backward interval PLS (BiPLS), and genetic algorithm (GA). The multivariate filters and variables selection algorithms generated more fitted and accurate PLS models. According to the validation, the FTIR/PLS models presented accuracy comparable to the reference methods and, therefore, the proposed method can be applied in the diesel routine monitoring to significantly reduce costs and analysis time. PMID:29629209
Multivariate regression model for predicting lumber grade volumes of northern red oak sawlogs
Daniel A. Yaussy; Robert L. Brisbin
1983-01-01
A multivariate regression model was developed to predict green board-foot yields for the seven common factory lumber grades processed from northern red oak (Quercus rubra L.) factory grade logs. The model uses the standard log measurements of grade, scaling diameter, length, and percent defect. It was validated with an independent data set. The model...
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs
Andrew F. Howard; Daniel A. Yaussy
1986-01-01
A multivariate regression model was developed to predict green board-foot yields for the common grades of factory lumber processed from yellow birch factory-grade logs. The model incorporates the standard log measurements of scaling diameter, length, proportion of scalable defects, and the assigned USDA Forest Service log grade. Differences in yields between band and...
NASA Technical Reports Server (NTRS)
Rogers, David
1991-01-01
G/SPLINES are a hybrid of Friedman's Multivariable Adaptive Regression Splines (MARS) algorithm with Holland's Genetic Algorithm. In this hybrid, the incremental search is replaced by a genetic search. The G/SPLINE algorithm exhibits performance comparable to that of the MARS algorithm, requires fewer least squares computations, and allows significantly larger problems to be considered.
USDA-ARS?s Scientific Manuscript database
Accurate, nonintrusive, and inexpensive techniques are needed to measure energy expenditure (EE) in free-living populations. Our primary aim in this study was to validate cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on observable participant cha...
Regression Models For Multivariate Count Data
Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei
2016-01-01
Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data. PMID:28348500
Regression Models For Multivariate Count Data.
Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei
2017-01-01
Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.
Confounder summary scores when comparing the effects of multiple drug exposures.
Cadarette, Suzanne M; Gagne, Joshua J; Solomon, Daniel H; Katz, Jeffrey N; Stürmer, Til
2010-01-01
Little information is available comparing methods to adjust for confounding when considering multiple drug exposures. We compared three analytic strategies to control for confounding based on measured variables: conventional multivariable, exposure propensity score (EPS), and disease risk score (DRS). Each method was applied to a dataset (2000-2006) recently used to examine the comparative effectiveness of four drugs. The relative effectiveness of risedronate, nasal calcitonin, and raloxifene in preventing non-vertebral fracture, were each compared to alendronate. EPSs were derived both by using multinomial logistic regression (single model EPS) and by three separate logistic regression models (separate model EPS). DRSs were derived and event rates compared using Cox proportional hazard models. DRSs derived among the entire cohort (full cohort DRS) was compared to DRSs derived only among the referent alendronate (unexposed cohort DRS). Less than 8% deviation from the base estimate (conventional multivariable) was observed applying single model EPS, separate model EPS or full cohort DRS. Applying the unexposed cohort DRS when background risk for fracture differed between comparison drug exposure cohorts resulted in -7 to + 13% deviation from our base estimate. With sufficient numbers of exposed and outcomes, either conventional multivariable, EPS or full cohort DRS may be used to adjust for confounding to compare the effects of multiple drug exposures. However, our data also suggest that unexposed cohort DRS may be problematic when background risks differ between referent and exposed groups. Further empirical and simulation studies will help to clarify the generalizability of our findings.
Likhvantseva, V G; Sokolov, V A; Levanova, O N; Kovelenova, I V
2018-01-01
Prediction of the clinical course of primary open-angle glaucoma (POAG) is one of the main directions in solving the problem of vision loss prevention and stabilization of the pathological process. Simple statistical methods of correlation analysis show the extent of each risk factor's impact, but do not indicate the total impact of these factors in personalized combinations. The relationships between the risk factors is subject to correlation and regression analysis. The regression equation represents the dependence of the mathematical expectation of the resulting sign on the combination of factor signs. To develop a technique for predicting the probability of development and progression of primary open-angle glaucoma based on a personalized combination of risk factors by linear multivariate regression analysis. The study included 66 patients (23 female and 43 male; 132 eyes) with newly diagnosed primary open-angle glaucoma. The control group consisted of 14 patients (8 male and 6 female). Standard ophthalmic examination was supplemented with biochemical study of lacrimal fluid. Concentration of matrix metalloproteinase MMP-2 and MMP-9 in tear fluid in both eyes was determined using 'sandwich' enzyme-linked immunosorbent assay (ELISA) method. The study resulted in the development of regression equations and step-by-step multivariate logistic models that can help calculate the risk of development and progression of POAG. Those models are based on expert evaluation of clinical and instrumental indicators of hydrodynamic disturbances (coefficient of outflow ease - C, volume of intraocular fluid secretion - F, fluctuation of intraocular pressure), as well as personalized morphometric parameters of the retina (central retinal thickness in the macular area) and concentration of MMP-2 and MMP-9 in the tear film. The newly developed regression equations are highly informative and can be a reliable tool for studying of the influence vector and assessment of pathogenic potential of the independent risk factors in specific personalized combinations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rupšys, P.
A system of stochastic differential equations (SDE) with mixed-effects parameters and multivariate normal copula density function were used to develop tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside bark diameter at breast height, and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameters SDE tree height model calculated during this research were compared to the regression tree height equations. The results are implemented in the symbolic computational language MAPLE.
A single determinant dominates the rate of yeast protein evolution.
Drummond, D Allan; Raval, Alpan; Wilke, Claus O
2006-02-01
A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
Oviedo de la Fuente, Manuel; Febrero-Bande, Manuel; Muñoz, María Pilar; Domínguez, Àngela
2018-01-01
This paper proposes a novel approach that uses meteorological information to predict the incidence of influenza in Galicia (Spain). It extends the Generalized Least Squares (GLS) methods in the multivariate framework to functional regression models with dependent errors. These kinds of models are useful when the recent history of the incidence of influenza are readily unavailable (for instance, by delays on the communication with health informants) and the prediction must be constructed by correcting the temporal dependence of the residuals and using more accessible variables. A simulation study shows that the GLS estimators render better estimations of the parameters associated with the regression model than they do with the classical models. They obtain extremely good results from the predictive point of view and are competitive with the classical time series approach for the incidence of influenza. An iterative version of the GLS estimator (called iGLS) was also proposed that can help to model complicated dependence structures. For constructing the model, the distance correlation measure [Formula: see text] was employed to select relevant information to predict influenza rate mixing multivariate and functional variables. These kinds of models are extremely useful to health managers in allocating resources in advance to manage influenza epidemics.
Beyond Reading Alone: The Relationship Between Aural Literacy And Asthma Management
Rosenfeld, Lindsay; Rudd, Rima; Emmons, Karen M.; Acevedo-García, Dolores; Martin, Laurie; Buka, Stephen
2010-01-01
Objectives To examine the relationship between literacy and asthma management with a focus on the oral exchange. Methods Study participants, all of whom reported asthma, were drawn from the New England Family Study (NEFS), an examination of links between education and health. NEFS data included reading, oral (speaking), and aural (listening) literacy measures. An additional survey was conducted with this group of study participants related to asthma issues, particularly asthma management. Data analysis focused on bivariate and multivariable logistic regression. Results In bivariate logistic regression models exploring aural literacy, there was a statistically significant association between those participants with lower aural literacy skills and less successful asthma management (OR:4.37, 95%CI:1.11, 17.32). In multivariable logistic regression analyses, controlling for gender, income, and race in separate models (one-at-a-time), there remained a statistically significant association between those participants with lower aural literacy skills and less successful asthma management. Conclusion Lower aural literacy skills seem to complicate asthma management capabilities. Practice Implications Greater attention to the oral exchange, in particular the listening skills highlighted by aural literacy, as well as other related literacy skills may help us develop strategies for clear communication related to asthma management. PMID:20399060
Mammalian cell culture monitoring using in situ spectroscopy: Is your method really optimised?
André, Silvère; Lagresle, Sylvain; Hannas, Zahia; Calvosa, Éric; Duponchel, Ludovic
2017-03-01
In recent years, as a result of the process analytical technology initiative of the US Food and Drug Administration, many different works have been carried out on direct and in situ monitoring of critical parameters for mammalian cell cultures by Raman spectroscopy and multivariate regression techniques. However, despite interesting results, it cannot be said that the proposed monitoring strategies, which will reduce errors of the regression models and thus confidence limits of the predictions, are really optimized. Hence, the aim of this article is to optimize some critical steps of spectroscopic acquisition and data treatment in order to reach a higher level of accuracy and robustness of bioprocess monitoring. In this way, we propose first an original strategy to assess the most suited Raman acquisition time for the processes involved. In a second part, we demonstrate the importance of the interbatch variability on the accuracy of the predictive models with a particular focus on the optical probes adjustment. Finally, we propose a methodology for the optimization of the spectral variables selection in order to decrease prediction errors of multivariate regressions. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 33:308-316, 2017. © 2017 American Institute of Chemical Engineers.
NASA Astrophysics Data System (ADS)
Nikolić, G. S.; Žerajić, S.; Cakić, M.
2011-10-01
Multivariate calibration method is a powerful mathematical tool that can be applied in analytical chemistry when the analytical signals are highly overlapped. The method with regression by partial least squares is proposed for the simultaneous spectrophotometric determination of adrenergic vasoconstrictors in decongestive solution containing two active components: phenyleprine hydrochloride and trimazoline hydrochloride. These sympathomimetic agents are that frequently associated in pharmaceutical formulations against the common cold. The proposed method, which is, simple and rapid, offers the advantages of sensitivity and wide range of determinations without the need for extraction of the vasoconstrictors. In order to minimize the optimal factors necessary to obtain the calibration matrix by multivariate calibration, different parameters were evaluated. The adequate selection of the spectral regions proved to be important on the number of factors. In order to simultaneously quantify both hydrochlorides among excipients, the spectral region between 250 and 290 nm was selected. A recovery for the vasoconstrictor was 98-101%. The developed method was applied to assay of two decongestive pharmaceutical preparations.
Hegazy, Maha A; Lotfy, Hayam M; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-05
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). While, the univariate CWT failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, the ternary mixtures of DRO, CAF, and PAR and CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. While for the development of the CWT-PLS model a calibration set was prepared by means of an orthogonal experimental design and their absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations. Copyright © 2016 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Agaku, Israel T.; Ayo-Yusuf, Olalekan A.
2014-01-01
Introduction: This study assessed the influence of exposure to pro-tobacco advertisements on experimentation with emerging tobacco products among U.S. adolescents aged =9 years, in Grades 6 to 12. Method: Data were obtained from the 2011 National Youth Tobacco Survey. Multivariate logistic regression was used to measure the association between…
ERIC Educational Resources Information Center
Hanson, Carl L.; Novilla, M. Lelinneth L. B.; Barnes, Michael D.; Eggett, Dennis; McKell, Chelsea; Reichman, Peter; Havens, Mike
2009-01-01
The purpose of the study was to compare 30-day prevalence of alcohol, tobacco, and other drug use among twelfth-grade students in Montana across a rural-urban continuum during 2000, 2002, and 2004. The methods include an analysis of the Montana Prevention Needs Assessment (N = 15,372) using multivariable logistic regression adjusting for risk…
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R2. Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost~0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost~FDGpre0.93, p<0.001). Univariate mixture model fits of FDGpre improved R2 from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748
Amirian, Mohammad-Elyas; Fazilat-Pour, Masoud
2016-08-01
The present study examined simple and multivariate relationships of spiritual intelligence with general health and happiness. The employed method was descriptive and correlational. King's Spiritual Quotient scales, GHQ-28 and Oxford Happiness Inventory, are filled out by a sample consisted of 384 students, which were selected using stratified random sampling from the students of Shahid Bahonar University of Kerman. Data are subjected to descriptive and inferential statistics including correlations and multivariate regressions. Bivariate correlations support positive and significant predictive value of spiritual intelligence toward general health and happiness. Further analysis showed that among the Spiritual Intelligence' subscales, Existential Critical Thinking Predicted General Health and Happiness, reversely. In addition, happiness was positively predicted by generation of personal meaning and transcendental awareness. The findings are discussed in line with the previous studies and the relevant theoretical background.
Misspecification of Cox regression models with composite endpoints
Wu, Longyang; Cook, Richard J
2012-01-01
Researchers routinely adopt composite endpoints in multicenter randomized trials designed to evaluate the effect of experimental interventions in cardiovascular disease, diabetes, and cancer. Despite their widespread use, relatively little attention has been paid to the statistical properties of estimators of treatment effect based on composite endpoints. We consider this here in the context of multivariate models for time to event data in which copula functions link marginal distributions with a proportional hazards structure. We then examine the asymptotic and empirical properties of the estimator of treatment effect arising from a Cox regression model for the time to the first event. We point out that even when the treatment effect is the same for the component events, the limiting value of the estimator based on the composite endpoint is usually inconsistent for this common value. We find that in this context the limiting value is determined by the degree of association between the events, the stochastic ordering of events, and the censoring distribution. Within the framework adopted, marginal methods for the analysis of multivariate failure time data yield consistent estimators of treatment effect and are therefore preferred. We illustrate the methods by application to a recent asthma study. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22736519
Michael S. Balshi; A. David McGuire; Paul Duffy; Mike Flannigan; John Walsh; Jerry Melillo
2009-01-01
We developed temporally and spatially explicit relationships between air temperature and fuel moisture codes derived from the Canadian Fire Weather Index System to estimate annual area burned at 2.5o (latitude x longitude) resolution using a Multivariate Adaptive Regression Spline (MARS) approach across Alaska and Canada. Burned area was...
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.
Ma, Yan; Mazumdar, Madhu
2011-10-30
Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously taking into account the correlation between the outcomes. Likelihood-based approaches, in particular restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods are illustrated by their application to data from two published meta-analysis from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
Corron, Louise; Marchal, François; Condemi, Silvana; Chaumoître, Kathia; Adalian, Pascal
2017-01-01
Juvenile age estimation methods used in forensic anthropology generally lack methodological consistency and/or statistical validity. Considering this, a standard approach using nonparametric Multivariate Adaptive Regression Splines (MARS) models were tested to predict age from iliac biometric variables of male and female juveniles from Marseilles, France, aged 0-12 years. Models using unidimensional (length and width) and bidimensional iliac data (module and surface) were constructed on a training sample of 176 individuals and validated on an independent test sample of 68 individuals. Results show that MARS prediction models using iliac width, module and area give overall better and statistically valid age estimates. These models integrate punctual nonlinearities of the relationship between age and osteometric variables. By constructing valid prediction intervals whose size increases with age, MARS models take into account the normal increase of individual variability. MARS models can qualify as a practical and standardized approach for juvenile age estimation. © 2016 American Academy of Forensic Sciences.
A survey of variable selection methods in two Chinese epidemiology journals
2010-01-01
Background Although much has been written on developing better procedures for variable selection, there is little research on how it is practiced in actual studies. This review surveys the variable selection methods reported in two high-ranking Chinese epidemiology journals. Methods Articles published in 2004, 2006, and 2008 in the Chinese Journal of Epidemiology and the Chinese Journal of Preventive Medicine were reviewed. Five categories of methods were identified whereby variables were selected using: A - bivariate analyses; B - multivariable analysis; e.g. stepwise or individual significance testing of model coefficients; C - first bivariate analyses, followed by multivariable analysis; D - bivariate analyses or multivariable analysis; and E - other criteria like prior knowledge or personal judgment. Results Among the 287 articles that reported using variable selection methods, 6%, 26%, 30%, 21%, and 17% were in categories A through E, respectively. One hundred sixty-three studies selected variables using bivariate analyses, 80% (130/163) via multiple significance testing at the 5% alpha-level. Of the 219 multivariable analyses, 97 (44%) used stepwise procedures, 89 (41%) tested individual regression coefficients, but 33 (15%) did not mention how variables were selected. Sixty percent (58/97) of the stepwise routines also did not specify the algorithm and/or significance levels. Conclusions The variable selection methods reported in the two journals were limited in variety, and details were often missing. Many studies still relied on problematic techniques like stepwise procedures and/or multiple testing of bivariate associations at the 0.05 alpha-level. These deficiencies should be rectified to safeguard the scientific validity of articles published in Chinese epidemiology journals. PMID:20920252
NASA Astrophysics Data System (ADS)
Braga, Jez Willian Batista; Trevizan, Lilian Cristina; Nunes, Lidiane Cristina; Rufini, Iolanda Aparecida; Santos, Dário, Jr.; Krug, Francisco José
2010-01-01
The application of laser induced breakdown spectrometry (LIBS) aiming the direct analysis of plant materials is a great challenge that still needs efforts for its development and validation. In this way, a series of experimental approaches has been carried out in order to show that LIBS can be used as an alternative method to wet acid digestions based methods for analysis of agricultural and environmental samples. The large amount of information provided by LIBS spectra for these complex samples increases the difficulties for selecting the most appropriated wavelengths for each analyte. Some applications have suggested that improvements in both accuracy and precision can be achieved by the application of multivariate calibration in LIBS data when compared to the univariate regression developed with line emission intensities. In the present work, the performance of univariate and multivariate calibration, based on partial least squares regression (PLSR), was compared for analysis of pellets of plant materials made from an appropriate mixture of cryogenically ground samples with cellulose as the binding agent. The development of a specific PLSR model for each analyte and the selection of spectral regions containing only lines of the analyte of interest were the best conditions for the analysis. In this particular application, these models showed a similar performance, but PLSR seemed to be more robust due to a lower occurrence of outliers in comparison to the univariate method. Data suggests that efforts dealing with sample presentation and fitness of standards for LIBS analysis must be done in order to fulfill the boundary conditions for matrix independent development and validation.
Montes, Alejandro; Pazos, Gustavo
2016-02-01
Identifying children at risk of failing the National Developmental Screening Test by combining prevalences of children suspected of having inapparent developmental disorders (IDDs) and associated risk factors (RFs) would allow to save resources. 1. To estimate the prevalence of children suspected of having IDDs. 2. To identify associated RFs. 3. To assess three methods developed based on observed RFs and propose a pre-screening procedure. The National Developmental Screening Test was administered to 60 randomly selected children aged between 2 and 4 years old from a socioeconomically disadvantaged area from Puerto Madryn. Twenty-four biological and socioenvironmental outcome measures were assessed in order to identify potential RFs using bivariate and multivariate analyses. The likelihood of failing the screening test was estimated as follows: 1. a multivariate logistic regression model was developed; 2. a relationship was established between the number of RFs present in each child and the percentage of children who failed the test; 3. these two methods were combined. The prevalence of children suspected of having IDDs was 55.0% (95% confidence interval: 42.4%-67.6%). Six RFs were initially identified using the bivariate approach. Three of them (maternal education, number of health checkups and Z scores for height-for-age, and maternal age) were included in the logistic regression model, which has a greater explanatory power. The third method included in the assessment showed greater sensitivity and specificity (85% and 79%, respectively). The estimated prevalence of children suspected of having IDDs was four times higher than the national standards. Seven RFs were identified. Combining the analysis of risk factor accumulation and a multivariate model provides a firm basis for developing a sensitive, specific and practical pre-screening procedure for socioeconomically disadvantaged areas. Sociedad Argentina de Pediatría.
Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O
2016-05-15
The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined utility of a steady-state fluorescence spectroscopy and multivariate partial-least-square (PLS) regression analysis for the simultaneous determination of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) concentrations in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in increase in emission characteristics of HSA and BSA and a significant blue spectra shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescent data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10(-8) M for EE and 2.4×10(-7) M for NOR and good linearity (R(2)>0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological condition. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. High accuracy, low sensitivity, simplicity, low-cost with no prior analyte extraction or separation required makes this method promising, compelling, and attractive alternative for the rapid determination of estrogen concentrations in biomedical and biological specimens, pharmaceuticals, or environmental samples. Published by Elsevier B.V.
Regression models for analyzing costs and their determinants in health care: an introductory review.
Gregori, Dario; Petrinco, Michele; Bo, Simona; Desideri, Alessandro; Merletti, Franco; Pagano, Eva
2011-06-01
This article aims to describe the various approaches in multivariable modelling of healthcare costs data and to synthesize the respective criticisms as proposed in the literature. We present regression methods suitable for the analysis of healthcare costs and then apply them to an experimental setting in cardiovascular treatment (COSTAMI study) and an observational setting in diabetes hospital care. We show how methods can produce different results depending on the degree of matching between the underlying assumptions of each method and the specific characteristics of the healthcare problem. The matching of healthcare cost models to the analytic objectives and characteristics of the data available to a study requires caution. The study results and interpretation can be heavily dependent on the choice of model with a real risk of spurious results and conclusions.
2015-01-01
different PRBC transfusion volumes. We performed multivariate regression analysis using HRV metrics and routine vital signs to test the hypothesis that...study sponsors did not have any role in the study design, data collection, analysis and interpretation of data, report writing, or the decision to...primary outcome was hemorrhagic injury plus different PRBC transfusion volumes. We performed multivariate regression analysis using HRV metrics and
Kidney transplantation from deceased donors with elevated serum creatinine.
Gallinat, Anja; Leerhoff, Sabine; Paul, Andreas; Molmenti, Ernesto P; Schulze, Maren; Witzke, Oliver; Sotiropoulos, Georgios C
2016-12-01
Elevated donor serum creatinine has been associated with inferior graft survival in kidney transplantation (KT). The aim of this study was to evaluate the impact of elevated donor serum creatinine on short and long-term outcomes and to determine possible ways to optimize the use of these organs. All kidney transplants from 01-2000 to 12-2012 with donor creatinine ≥ 2 mg/dl were considered. Risk factors for delayed graft function (DGF) were explored with uni- and multivariate regression analyses. Donor and recipient data were analyzed with uni- and multivariate cox proportional hazard analyses. Graft and patient survival were calculated using the Kaplan-Meier method. Seventy-eight patients were considered. Median recipient age and waiting time on dialysis were 53 years and 5.1 years, respectively. After a median follow-up of 6.2 years, 63 patients are alive. 1, 3, and 5-year graft and patient survival rates were 92, 89, and 89 % and 96, 93, and 89 %, respectively. Serum creatinine level at procurement and recipient's dialysis time prior to KT were predictors of DGF in multivariate analysis (p = 0.0164 and p = 0.0101, respectively). Charlson comorbidity score retained statistical significance by multivariate regression analysis for graft survival (p = 0.0321). Recipient age (p = 0.0035) was predictive of patient survival by multivariate analysis. Satisfactory long-term kidney transplant outcomes in the setting of elevated donor serum creatinine ≥2 mg/dl can be achieved when donor creatinine is <3.5 mg/dl, and the recipient has low comorbidities, is under 56 years of age, and remains in dialysis prior to KT for <6.8 years.
Chromatography methods and chemometrics for determination of milk fat adulterants
NASA Astrophysics Data System (ADS)
Trbović, D.; Petronijević, R.; Đorđević, V.
2017-09-01
Milk and milk-based products are among the leading food categories according to reported cases of food adulteration. Although many authentication problems exist in all areas of the food industry, adequate control methods are required to evaluate the authenticity of milk and milk products in the dairy industry. Moreover, gas chromatography (GC) analysis of triacylglycerols (TAGs) or fatty acid (FA) profiles of milk fat (MF) in combination with multivariate statistical data processing have been used to detect adulterations of milk and dairy products with foreign fats. The adulteration of milk and butter is a major issue for the dairy industry. The major adulterants of MF are vegetable oils (soybean, sunflower, groundnut, coconut, palm and peanut oil) and animal fat (cow tallow and pork lard). Multivariate analysis enables adulterated MF to be distinguished from authentic MF, while taking into account many analytical factors. Various multivariate analysis methods have been proposed to quantitatively detect levels of adulterant non-MFs, with multiple linear regression (MLR) seemingly the most suitable. There is a need for increased use of chemometric data analyses to detect adulterated MF in foods and for their expanded use in routine quality assurance testing.
Shi, Xiao; Zhang, Ting-ting; Hu, Wei-ping; Ji, Qing-hai
2017-01-01
Background The relationship between marital status and oral cavity squamous cell carcinoma (OCSCC) survival has not been explored. The objective of our study was to evaluate the impact of marital status on OCSCC survival and investigate the potential mechanisms. Results Married patients had better 5-year cancer-specific survival (CSS) (66.7% vs 54.9%) and 5-year overall survival (OS) (56.0% vs 41.1%). In multivariate Cox regression models, unmarried patients also showed higher mortality risk for both CSS (Hazard Ratio [HR]: 1.260, 95% confidence interval (CI): 1.187–1.339, P < 0.001) and OS (HR: 1.328, 95% CI: 1.266–1.392, P < 0.001). Multivariate logistic regression showed married patients were more likely to be diagnosed at earlier stage (P < 0.001) and receive surgery (P < 0.001). Married patients still demonstrated better prognosis in the 1:1 matched group analysis (CSS: 62.9% vs 60.8%, OS: 52.3% vs 46.5%). Materials and Methods 11022 eligible OCSCC patients were identified from Surveillance, Epidemiology, and End Results (SEER) database, including 5902 married and 5120 unmarried individuals. Kaplan-Meier analysis, Log-rank test and Cox proportional hazards regression model were used to analyze survival and mortality risk. Influence of marital status on stage, age at diagnosis and selection of treatment was determined by binomial and multinomial logistic regression. Propensity score matching method was adopted to perform a 1:1 matched cohort. Conclusions Marriage has an independently protective effect on OCSCC survival. Earlier diagnosis and more sufficient treatment are possible explanations. Besides, even after 1:1 matching, survival advantage of married group still exists, indicating that spousal support from other aspects may also play an important role. PMID:28415710
The antagonistic effect between STAT1 and Survivin and its clinical significance in gastric cancer.
Deng, Hao; Zhen, Hongyan; Fu, Zhengqi; Huang, Xuan; Zhou, Hongyan; Liu, Lijiang
2012-01-01
In previous studies, we observed that STAT1 and Survivin correlated negatively with gastric cancer tissues, and that the functions of the IFN-γ-STAT1 pathway and Survivin in gastric cancer are the same as those reported for other types of cancer. In this study, the SGC7901 gastric cancer cell line and 83 gastric cancer specimens were used to confirm the relationship between STAT1 and Survivin, as well as the clinical significance of this relationship in gastric cancer. IFN-γ and STAT1 and Survivin antisense oligonucleotides (ASONs) were used to knock down the expression in SGC7901 cells. The protein expression of STAT1 and Survivin was tested by immunocytochemical and image analysis methods. A gastric cancer tissue microarray was prepared and tested by immunohistochemical methods. Data were analyzed by the Spearman's rank correlation analysis, the χ(2) test and Cox's multivariate regression analysis. Upon knockdown of IFN-γ, STAT1 and Survivin expression by ASON in the SGC7901 cell line, an antagonistic effect was observed between STAT1 and Survivin. In gastric cancer tissues, STAT1 showed a negative correlation with depth of invasion (p<0.05) in gastric cancer tissues exhibiting a negative Survivin protein expression. Furthermore, in tissues exhibiting a negative STAT1 protein expression, Survivin correlated negatively with N stage (p<0.05). Pathological and molecular markers were used to conduct Cox's multivariate regression analysis, and depth of invasion and N stage were found to be prognostic factors (p<0.05). On the other hand, in tissues exhibiting a negative Survivin protein expression, Cox's multivariate regression analysis revealed that the differentiation type and STAT1 protein expression were prognostic factors (p<0.05). There is an antagonistic effect between STAT1 and Survivin in gastric cancer, and this antagonistic effect is of clinical significance in gastric cancer.
NASA Astrophysics Data System (ADS)
Ferrera, Elisabetta; Giammanco, Salvatore; Cannata, Andrea; Montalto, Placido
2013-04-01
From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol® probe located on the upper NE flank of Mt. Etna volcano, close either to the Piano Provenzana fault or to the NE-Rift. Seismic and volcanological data have been analyzed together with radon data. We also analyzed air and soil temperature, barometric pressure, snow and rain fall data. In order to find possible correlations among the above parameters, and hence to reveal possible anomalies in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R2 = 0.31). When using 100-days time windows, the R2 values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-days moving averages showed that, similar to multivariate linear regression analysis, the summer period is characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the time series acquired. This approach allows to study the relations among different signals either in time or frequency domain. It confirmed the results of the previous methods, but also allowed to recognize correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitations, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Our work suggests that in order to make an accurate analysis of the relations among distinct signals it is necessary to use different techniques that give complementary analytical information. In particular, the wavelet analysis showed to be very effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.
Chen, Tsung-Fu; Liang, Jyh-Chong; Lin, Tzu-Bin; Tsai, Chin-Chung
2016-01-01
Background Compared with the traditional ways of gaining health-related information from newspapers, magazines, radio, and television, the Internet is inexpensive, accessible, and conveys diverse opinions. Several studies on how increasing Internet use affected outpatient clinic visits were inconclusive. Objective The objective of this study was to examine the role of Internet use on ambulatory care-seeking behaviors as indicated by the number of outpatient clinic visits after adjusting for confounding variables. Methods We conducted this study using a sample randomly selected from the general population in Taiwan. To handle the missing data, we built a multivariate logistic regression model for propensity score matching using age and sex as the independent variables. The questionnaires with no missing data were then included in a multivariate linear regression model for examining the association between Internet use and outpatient clinic visits. Results We included a sample of 293 participants who answered the questionnaire with no missing data in the multivariate linear regression model. We found that Internet use was significantly associated with more outpatient clinic visits (P=.04). The participants with chronic diseases tended to make more outpatient clinic visits (P<.01). Conclusions The inconsistent quality of health-related information obtained from the Internet may be associated with patients’ increasing need for interpreting and discussing the information with health care professionals, thus resulting in an increasing number of outpatient clinic visits. In addition, the media literacy of Web-based health-related information seekers may also affect their ambulatory care-seeking behaviors, such as outpatient clinic visits. PMID:27927606
Guo, L W; Liu, S Z; Zhang, M; Chen, Q; Zhang, S K; Sun, X B
2017-12-10
Objective: To investigate the effect of fried food intake on the pathogenesis of esophageal cancer and precancerous lesions. Methods: From 2005 to 2013, all the residents aged 40-69 years from 11 counties (cities) where cancer screening of upper gastrointestinal cancer had been conducted in rural areas of Henan province, were recruited as the subjects of study. Information on demography and lifestyle was collected. The residents under study were screened with iodine staining endoscopic examination and biopsy samples were diagnosed pathologically, under standardized criteria. Subjects with high risk were divided into the groups based on their different pathological degrees. Multivariate ordinal logistic regression analysis was used to analyze the relationship between the frequency of fried food intake and esophageal cancer and precancerous lesions. Results: A total number of 8 792 cases with normal esophagus, 3 680 with mild hyperplasia, 972 with moderate hyperplasia, 413 with severe hyperplasia carcinoma in situ, and 336 cases of esophageal cancer were recruited. Results from multivariate logistic regression analysis showed that, when compared with those who did not eat fried food, the intake of fried food (<2 times/week: OR =1.60, 95% CI : 1.40-1.83; ≥2 times/week: OR =2.58, 95% CI : 1.98-3.37) appeared a risk factor for both esophageal cancer or precancerous lesions after adjustment for age, sex, marital status, educational level, body mass index, smoking and alcohol intake. Conclusion: The intake of fried food appeared a risk factor for both esophageal cancer and precancerous lesions.
Wang, T T; Jiang, L
2017-10-01
Objective: To investigate the prognostic value of highly sensitive cardiac Troponin T (hs-cTn T) for sepsis in critically ill patients. Methods: Patients estimated to stay in the ICU of Fuxing Hospital for more than 24h were enrolled at from March 2014 to December 2014. Serum hs-cTn T was tested within two hours. Univariate and multivariate linear regression analyses were used to determine the association of variables with the hs-cTn T. Multivariable logistic regression analysis was used to evaluate the risk factors of 28-day mortality. Results: A total of 125 patients were finally enrolled including 68 patients with sepsis and 57 without. The levels of hs-cTn T in sepsis and non-sepsis groups were significantly different[52.0(32.5, 87.5) ng/L vs 14.0(6.5, 29.0) ng/L respectively, P <0.001]. In sepsis group, hs-cTn T among common sepsis, severe sepsis and septic shock were similar. Hs-cTn T was significantly higher in non-survivors than survivors [27(13, 52)ng/L vs 44.5(28.8, 83.5)ng/L, P <0.001]. Age, sepsis, serum creatinine were independent risk factors affecting hs-cTn T by multivariate linear regression analyses. But hs-cTn T was not a risk factor for death. Conclusion: Patients with sepsis had higher serum hs-cTn T than those without sepsis. but it was not found to be associated with the severity of sepsis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, S.-Y.; Chang, K.-P.; Graduate Institute of Clinical Medical Sciences, Chang Gung University, Linkou, Taiwan
Purpose: The presence of Epstein-Barr virus latent membrane protein-1 (LMP-1) gene in nasopharyngeal swabs indicates the presence of nasopharyngeal carcinoma (NPC) mucosal tumor cells. This study was undertaken to investigate whether the time taken for LMP-1 to disappear after initiation of primary radiotherapy (RT) was inversely associated with NPC local control. Methods and Materials: During July 1999 and October 2002, there were 127 nondisseminated NPC patients receiving serial examinations of nasopharyngeal swabbing with detection of LMP-1 during the RT course. The time for LMP-1 regression was defined as the number of days after initiation of RT for LMP-1 results tomore » turn negative. The primary outcome was local control, which was represented by freedom from local recurrence. Results: The time for LMP-1 regression showed a statistically significant influence on NPC local control both univariately (p < 0.0001) and multivariately (p = 0.004). In multivariate analysis, the administration of chemotherapy conferred a significantly more favorable local control (p = 0.03). Advanced T status ({>=} T2b), overall treatment time of external photon radiotherapy longer than 55 days, and older age showed trends toward being poor prognosticators. The time for LMP-1 regression was very heterogeneous. According to the quartiles of the time for LMP-1 regression, we defined the pattern of LMP-1 regression as late regression if it required 40 days or more. Kaplan-Meier plots indicated that the patients with late regression had a significantly worse local control than those with intermediate or early regression (p 0.0129). Conclusion: Among the potential prognostic factors examined in this study, the time for LMP-1 regression was the most independently significant factor that was inversely associated with NPC local control.« less
Xu, Z J; Pan, J; Zhou, Q; Wang, D J
2017-10-24
Objective: To estimate the prevalence and the risk factors of preoperative coronary angiography (CAG) confirmed coronary stenosis in patients with degenerative valvular heart disease. Methods: A total of 491 patients who underwent screening CAG before valvular surgery due to degenerative valvular heart disease were enrolled from January 2011 to September 2014 in our hospital, and clinical data were analyzed. According to CAG results, patients were divided into positive CAG result (PCAG) group or negative CAG (NCAG) group. Positive CAG result was defined as stenosis ≥50% of the diameter of the left main coronary artery or stenosis ≥70% of the diameter of left anterior descending, left circumflex artery, and right coronary artery.Risk factors of positive CAG result were analyzed by multivariable logistic regression analysis, and Bootstrap method was used to verify the results. Results: There were 47(9.57%)degenerative valvular heart disease patients with PCAG. Patients were older ((68.0±7.6)years vs.(62.6±7.1)years, P <0.001) and the prevalence of typical angina was significantly higher (14.89%(7/47)vs. 2.03%(9/444), P <0.001)in PCAG group than in NCAG group. Multivariable logistic regression analysis showed that age ( OR =1.118, 95% CI 1.067-1.172, P <0.001), typical angina ( OR =8.970, 95% CI 2.963-27.154, P <0.001), and serum concentration of apolipoprotein B ( OR =20.311, 95% CI 4.774-86.416, P <0.001) were the independent risk factors of PCAG in degenerative valvular heart disease patients. Bootstrap method revealed satisfactory repeatability of multivariable logistic regression analysis results (age: OR =1.118, 95% CI 1.068-1.178, P =0.001; typical angina: OR =8.970, 95% CI 2.338-35.891, P =0.001; serum concentration of apolipoprotein B: OR =20.311, 95% CI 4.639-91.977, P =0.001). Conclusions: A low prevalence of PCAG before valvular surgery is observed in degenerative valvular heart disease patients in this patient cohort. Age, typical angina, and serum concentration of apolipoprotein B are independent risk factors of PCAG in this patient cohort.
Inouye, David I.; Ravikumar, Pradeep; Dhillon, Inderjit S.
2016-01-01
We develop Square Root Graphical Models (SQR), a novel class of parametric graphical models that provides multivariate generalizations of univariate exponential family distributions. Previous multivariate graphical models (Yang et al., 2015) did not allow positive dependencies for the exponential and Poisson generalizations. However, in many real-world datasets, variables clearly have positive dependencies. For example, the airport delay time in New York—modeled as an exponential distribution—is positively related to the delay time in Boston. With this motivation, we give an example of our model class derived from the univariate exponential distribution that allows for almost arbitrary positive and negative dependencies with only a mild condition on the parameter matrix—a condition akin to the positive definiteness of the Gaussian covariance matrix. Our Poisson generalization allows for both positive and negative dependencies without any constraints on the parameter values. We also develop parameter estimation methods using node-wise regressions with ℓ1 regularization and likelihood approximation methods using sampling. Finally, we demonstrate our exponential generalization on a synthetic dataset and a real-world dataset of airport delay times. PMID:27563373
Estuarial fingerprinting through multidimensional fluorescence and multivariate analysis.
Hall, Gregory J; Clow, Kerin E; Kenny, Jonathan E
2005-10-01
As part of a strategy for preventing the introduction of aquatic nuisance species (ANS) to U.S. estuaries, ballast water exchange (BWE) regulations have been imposed. Enforcing these regulations requires a reliable method for determining the port of origin of water in the ballast tanks of ships entering U.S. waters. This study shows that a three-dimensional fluorescence fingerprinting technique, excitation emission matrix (EEM) spectroscopy, holds great promise as a ballast water analysis tool. In our technique, EEMs are analyzed by multivariate classification and curve resolution methods, such as N-way partial least squares Regression-discriminant analysis (NPLS-DA) and parallel factor analysis (PARAFAC). We demonstrate that classification techniques can be used to discriminate among sampling sites less than 10 miles apart, encompassing Boston Harbor and two tributaries in the Mystic River Watershed. To our knowledge, this work is the first to use multivariate analysis to classify water as to location of origin. Furthermore, it is shown that curve resolution can show seasonal features within the multidimensional fluorescence data sets, which correlate with difficulty in classification.
Goicoechea, H C; Olivieri, A C
1999-08-01
The use of multivariate spectrophotometric calibration is presented for the simultaneous determination of the active components of tablets used in the treatment of pulmonary tuberculosis. The resolution of ternary mixtures of rifampicin, isoniazid and pyrazinamide has been accomplished by using partial least squares (PLS-1) regression analysis. Although the components show an important degree of spectral overlap, they have been simultaneously determined with high accuracy and precision, rapidly and with no need of nonaqueous solvents for dissolving the samples. No interference has been observed from the tablet excipients. A comparison is presented with the related multivariate method of classical least squares (CLS) analysis, which is shown to yield less reliable results due to the severe spectral overlap among the studied compounds. This is highlighted in the case of isoniazid, due to the small absorbances measured for this component.
Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M
2011-12-01
This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate additive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. ?? 2007 National Ground Water Association.
Noninvasive and fast measurement of blood glucose in vivo by near infrared (NIR) spectroscopy
NASA Astrophysics Data System (ADS)
Jintao, Xue; Liming, Ye; Yufei, Liu; Chunyan, Li; Han, Chen
2017-05-01
This research was to develop a method for noninvasive and fast blood glucose assay in vivo. Near-infrared (NIR) spectroscopy, a more promising technique compared to other methods, was investigated in rats with diabetes and normal rats. Calibration models are generated by two different multivariate strategies: partial least squares (PLS) as linear regression method and artificial neural networks (ANN) as non-linear regression method. The PLS model was optimized individually by considering spectral range, spectral pretreatment methods and number of model factors, while the ANN model was studied individually by selecting spectral pretreatment methods, parameters of network topology, number of hidden neurons, and times of epoch. The results of the validation showed the two models were robust, accurate and repeatable. Compared to the ANN model, the performance of the PLS model was much better, with lower root mean square error of validation (RMSEP) of 0.419 and higher correlation coefficients (R) of 96.22%.
NASA Astrophysics Data System (ADS)
Yehia, Ali M.; Arafa, Reham M.; Abbas, Samah S.; Amer, Sawsan M.
2016-01-01
Spectral resolution of cefquinome sulfate (CFQ) in the presence of its degradation products was studied. Three selective, accurate and rapid spectrophotometric methods were performed for the determination of CFQ in the presence of either its hydrolytic, oxidative or photo-degradation products. The proposed ratio difference, derivative ratio and mean centering are ratio manipulating spectrophotometric methods that were satisfactorily applied for selective determination of CFQ within linear range of 5.0-40.0 μg mL- 1. Concentration Residuals Augmented Classical Least Squares was applied and evaluated for the determination of the cited drug in the presence of its all degradation products. Traditional Partial Least Squares regression was also applied and benchmarked against the proposed advanced multivariate calibration. Experimentally designed 25 synthetic mixtures of three factors at five levels were used to calibrate and validate the multivariate models. Advanced chemometrics succeeded in quantitative and qualitative analyses of CFQ along with its hydrolytic, oxidative and photo-degradation products. The proposed methods were applied successfully for different pharmaceutical formulations analyses. These developed methods were simple and cost-effective compared with the manufacturer's RP-HPLC method.
Fallah, Aria; Weil, Alexander G; Juraschka, Kyle; Ibrahim, George M; Wang, Anthony C; Crevier, Louis; Tseng, Chi-Hong; Kulkarni, Abhaya V; Ragheb, John; Bhatia, Sanjiv
2017-12-01
OBJECTIVE Combined endoscopic third ventriculostomy (ETC) and choroid plexus cauterization (CPC)-ETV/CPC- is being investigated to increase the rate of shunt independence in infants with hydrocephalus. The degree of CPC necessary to achieve improved rates of shunt independence is currently unknown. METHODS Using data from a single-center, retrospective, observational cohort study involving patients who underwent ETV/CPC for treatment of infantile hydrocephalus, comparative statistical analyses were performed to detect a difference in need for subsequent CSF diversion procedure in patients undergoing partial CPC (describes unilateral CPC or bilateral CPC that only extended from the foramen of Monro [FM] to the atrium on one side) or subtotal CPC (describes CPC extending from the FM to the posterior temporal horn bilaterally) using a rigid neuroendoscope. Propensity scores for extent of CPC were calculated using age and etiology. Propensity scores were used to perform 1) case-matching comparisons and 2) Cox multivariable regression, adjusting for propensity score in the unmatched cohort. Cox multivariable regression adjusting for age and etiology, but not propensity score was also performed as a third statistical technique. RESULTS Eighty-four patients who underwent ETV/CPC had sufficient data to be included in the analysis. Subtotal CPC was performed in 58 patients (69%) and partial CPC in 26 (31%). The ETV/CPC success rates at 6 and 12 months, respectively, were 49% and 41% for patients undergoing subtotal CPC and 35% and 31% for those undergoing partial CPC. Cox multivariate regression in a 48-patient cohort case-matched by propensity score demonstrated no added effect of increased extent of CPC on ETV/CPC survival (HR 0.868, 95% CI 0.422-1.789, p = 0.702). Cox multivariate regression including all patients, with adjustment for propensity score, demonstrated no effect of extent of CPC on ETV/CPC survival (HR 0.845, 95% CI 0.462-1.548, p = 0.586). Cox multivariate regression including all patients, with adjustment for age and etiology, but not propensity score, demonstrated no effect of extent of CPC on ETV/CPC survival (HR 0.908, 95% CI 0.495-1.664, p = 0.755). CONCLUSIONS Using multiple comparative statistical analyses, no difference in need for subsequent CSF diversion procedure was detected between patients in this cohort who underwent partial versus subtotal CPC. Further investigation regarding whether there is truly no difference between partial versus subtotal extent of CPC in larger patient populations and whether further gain in CPC success can be achieved with complete CPC is warranted.
Williams, L. Keoki; Buu, Anne
2017-01-01
We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of the Fisher’s combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains the power at the level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches–dichotomizing all observed phenotypes or treating them as continuous variables–could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that may not be, otherwise, chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206
Association of internet use and depression among the spinal cord injury population.
Tsai, I-Hsuan; Graves, Daniel E; Lai, Ching-Huang; Hwang, Lu-Yu; Pompeii, Lisa A
2014-02-01
To examine the relation between the frequency of Internet use and depression among people with spinal cord injury (SCI). Cross-sectional survey. SCI Model Systems. People with SCI (N=4618) who were interviewed between 2004 and 2010. Not applicable. The frequency of Internet use and the severity of depressive symptoms were measured simultaneously by interview. Internet use was reported as daily, weekly, monthly, or none. The depressive symptoms were measured by the Patient Health Questionnaire-9 (PHQ-9), with 2 published criteria being used to screen for depressive disorder. The diagnostic method places more weight on nonsomatic items (ie, items 1, 2, and 9), and the cut-off method that determines depression by a (PHQ-9) score ≥10 places more weight on somatic factors. The average scores of somatic and nonsomatic items represented the severity of somatic and nonsomatic symptoms, respectively. Our multivariate logistic regression model indicated that daily Internet users were less likely to have depressive symptoms (odds ratio=.77; 95% confidence interval, .64-.93), if the diagnostic method was used. The linear multivariate regression analysis indicated that daily and weekly Internet usage were associated with fewer nonsomatic symptoms; no significant association was observed between daily or weekly Internet usage and somatic symptoms. People with SCI who used the Internet daily were less likely to have depressive symptoms. Copyright © 2014 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Multiple imputation for handling missing outcome data when estimating the relative risk.
Sullivan, Thomas R; Lee, Katherine J; Ryan, Philip; Salter, Amy B
2017-09-06
Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. However fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework.
Workers' compensation costs among construction workers: a robust regression analysis.
Friedman, Lee S; Forst, Linda S
2009-11-01
Workers' compensation data are an important source for evaluating costs associated with construction injuries. We describe the characteristics of injured construction workers filing claims in Illinois between 2000 and 2005 and the factors associated with compensation costs using a robust regression model. In the final multivariable model, the cumulative percent temporary and permanent disability-measures of severity of injury-explained 38.7% of the variance of cost. Attorney costs explained only 0.3% of the variance of the dependent variable. The model used in this study clearly indicated that percent disability was the most important determinant of cost, although the method and uniformity of percent impairment allocation could be better elucidated. There is a need to integrate analytical methods that are suitable for skewed data when analyzing claim costs.
Aubrey-Bassler, Kris; Cullen, Richard M.; Simms, Alvin; Asghari, Shabnam; Crane, Joan; Wang, Peizhong Peter; Godwin, Marshall
2015-01-01
Background: Previous research has suggested that obstetric outcomes are similar for deliveries by family physicians and obstetricians, but many of these studies were small, and none of them adjusted for unmeasured selection bias. We compared obstetric outcomes between these provider types using an econometric method designed to adjust for unobserved confounding. Methods: We performed a retrospective population-based cohort study of all Canadian (except Quebec) hospital births with delivery by family physicians and obstetricians at more than 20 weeks gestational age, with birth weight greater than 500 g, between Apr. 1, 2006, and Mar. 31, 2009. The primary outcomes were the relative risks of in-hospital perinatal death and a composite of maternal mortality and major morbidity assessed with multivariable logistic regression and instrumental variable–adjusted multivariable regression. Results: After exclusions, there were 3600 perinatal deaths and 14 394 cases of maternal morbidity among 799 823 infants and 793 053 mothers at 390 hospitals. For deliveries by family physicians v. obstetricians, the relative risk of perinatal mortality was 0.98 (95% confidence interval [CI] 0.85–1.14) and of maternal morbidity was 0.81 (95% CI 0.70–0.94) according to logistic regression. The respective relative risks were 0.97 (95% CI 0.58–1.64) and 1.13 (95% CI 0.65–1.95) according to instrumental variable methods. Interpretation: After adjusting for both observed and unobserved confounders, we found a similar risk of perinatal mortality and adverse maternal outcome for obstetric deliveries by family physicians and obstetricians. Whether there are differences between these groups for other outcomes remains to be seen. PMID:26303244
DOE Office of Scientific and Technical Information (OSTI.GOV)
Temple, P.J.; Mutters, R.J.; Adams, C.
1995-06-01
Biomass sampling plots were established at 29 locations within the dominant vegetation zones of the study area. Estimates of foliar biomass were made for each plot by three independent methods: regression analysis on the basis of tree diameter, calculation of the amount of light intercepted by the leaf canopy, and extrapolation from branch leaf area. Multivariate regression analysis was used to relate these foliar biomass estimates for oak plots and conifer plots to several independent predictor variables, including elevation, slope, aspect, temperature, precipitation, and soil chemical characteristics.
Solving large mixed linear models using preconditioned conjugate gradient iteration.
Strandén, I; Lidauer, M
1999-12-01
Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were recorded to three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20 and 435% more time to solve the univariate and multivariate animal models, respectively. Computations of the second best iteration on data took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.
Penalized spline estimation for functional coefficient regression models.
Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z; Yu, Yan
2010-04-01
The functional coefficient regression models assume that the regression coefficients vary with some "threshold" variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called "curse of dimensionality" in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning different penalty λ accordingly. We demonstrate the proposed approach by both simulation examples and a real data application.
Henry, Stephen G.; Jerant, Anthony; Iosif, Ana-Maria; Feldman, Mitchell D.; Cipri, Camille; Kravitz, Richard L.
2015-01-01
Objective To identify factors associated with participant consent to record visits; to estimate effects of recording on patient-clinician interactions Methods Secondary analysis of data from a randomized trial studying communication about depression; participants were asked for optional consent to audio record study visits. Multiple logistic regression was used to model likelihood of patient and clinician consent. Multivariable regression and propensity score analyses were used to estimate effects of audio recording on 6 dependent variables: discussion of depressive symptoms, preventive health, and depression diagnosis; depression treatment recommendations; visit length; visit difficulty. Results Of 867 visits involving 135 primary care clinicians, 39% were recorded. For clinicians, only working in academic settings (P=0.003) and having worked longer at their current practice (P=0.02) were associated with increased likelihood of consent. For patients, white race (P=0.002) and diabetes (P=0.03) were associated with increased likelihood of consent. Neither multivariable regression nor propensity score analyses revealed any significant effects of recording on the variables examined. Conclusion Few clinician or patient characteristics were significantly associated with consent. Audio recording had no significant effect on any dependent variables. Practice Implications Benefits of recording clinic visits likely outweigh the risks of bias in this setting. PMID:25837372
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-25
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by the Grubb's test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and the plain partial least squares regression. Both R² and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916 with bitterness values ranging from 0.63 to 4.78 were obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV, which was calculated using the models constructed by other methods, was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS model constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data.
Brian K. Via; Todd F. Shupe; Leslie H. Groom; Michael Stine; Chi-Leung So
2003-01-01
In manufacturing, monitoring the mechanical properties of wood with near infrared spectroscopy (NIR) is an attractive alternative to more conventional methods. However, no attention has been given to see if models differ between juvenile and mature wood. Additionally, it would be convenient if multiple linear regression (MLR) could perform well in the place of more...
Sharif, Nasim
2010-01-01
Objective This study was conducted to compare the personal well-being among the wives of Iranian veterans living in the city of Qom. Method A sample of 300 was randomly selected from a database containing the addresses of veteran's families at Iran's Veterans Foundation in Qom (Bonyad-e-Shahid va Omoore Isargaran). The veterans' wives were divided into three groups: wives of martyrs (killed veterans), wives of prisoners of war, and wives of disabled veterans. The Persian translation of Personal Well-being Index and Stress Symptoms Checklist (SSC) were administered for data collection. Four women chose not to respond to Personal Well-being Index. Data were then analyzed using linear multivariate regression (stepwise method), analysis of variance, and by computing the correlation between variables. Results Results showed a negative correlation between well-being and stress symptoms. However, each group demonstrated different levels of stress symptoms. Furthermore, multivariate linear regression in the 3 groups showed that overall satisfaction of life and personal well-being (total score and its domains) could be predicted by different symptoms. Conclusion Each group experienced different challenges and thus different stress symptoms. Therefore, although they all need help, each group needs to be helped in a different way. PMID:22952487
Jiang, Wei; Xu, Chao-Zhen; Jiang, Si-Zhi; Zhang, Tang-Duo; Wang, Shi-Zhen; Fang, Bai-Shan
2017-04-01
L-tert-Leucine (L-Tle) and its derivatives are extensively used as crucial building blocks for chiral auxiliaries, pharmaceutically active ingredients, and ligands. Combining with formate dehydrogenase (FDH) for regenerating the expensive coenzyme NADH, leucine dehydrogenase (LeuDH) is continually used for synthesizing L-Tle from α-keto acid. A multilevel factorial experimental design was executed for research of this system. In this work, an efficient optimization method for improving the productivity of L-Tle was developed. And the mathematical model between different fermentation conditions and L-Tle yield was also determined in the form of the equation by using uniform design and regression analysis. The multivariate regression equation was conveniently implemented in water, with a space time yield of 505.9 g L -1 day -1 and an enantiomeric excess value of >99 %. These results demonstrated that this method might become an ideal protocol for industrial production of chiral compounds and unnatural amino acids such as chiral drug intermediates.
NASA Astrophysics Data System (ADS)
de Santana, Felipe Bachion; de Souza, André Marcelo; Poppi, Ronei Jesus
2018-02-01
This study evaluates the use of visible and near infrared spectroscopy (Vis-NIRS) combined with multivariate regression based on random forest to quantify some quality soil parameters. The parameters analyzed were soil cation exchange capacity (CEC), sum of exchange bases (SB), organic matter (OM), clay and sand present in the soils of several regions of Brazil. Current methods for evaluating these parameters are laborious, timely and require various wet analytical methods that are not adequate for use in precision agriculture, where faster and automatic responses are required. The random forest regression models were statistically better than PLS regression models for CEC, OM, clay and sand, demonstrating resistance to overfitting, attenuating the effect of outlier samples and indicating the most important variables for the model. The methodology demonstrates the potential of the Vis-NIR as an alternative for determination of CEC, SB, OM, sand and clay, making possible to develop a fast and automatic analytical procedure.
NASA Astrophysics Data System (ADS)
Metwally, Fadia H.
2008-02-01
The quantitative predictive abilities of the new and simple bivariate spectrophotometric method are compared with the results obtained by the use of multivariate calibration methods [the classical least squares (CLS), principle component regression (PCR) and partial least squares (PLS)], using the information contained in the absorption spectra of the appropriate solutions. Mixtures of the two drugs Nifuroxazide (NIF) and Drotaverine hydrochloride (DRO) were resolved by application of the bivariate method. The different chemometric approaches were applied also with previous optimization of the calibration matrix, as they are useful in simultaneous inclusion of many spectral wavelengths. The results found by application of the bivariate, CLS, PCR and PLS methods for the simultaneous determinations of mixtures of both components containing 2-12 μg ml -1 of NIF and 2-8 μg ml -1 of DRO are reported. Both approaches were satisfactorily applied to the simultaneous determination of NIF and DRO in pure form and in pharmaceutical formulation. The results were in accordance with those given by the EVA Pharma reference spectrophotometric method.
Logistic models--an odd(s) kind of regression.
Jupiter, Daniel C
2013-01-01
The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
[Statistical prediction methods in violence risk assessment and its application].
Liu, Yuan-Yuan; Hu, Jun-Mei; Yang, Min; Li, Xiao-Song
2013-06-01
It is an urgent global problem how to improve the violence risk assessment. As a necessary part of risk assessment, statistical methods have remarkable impacts and effects. In this study, the predicted methods in violence risk assessment from the point of statistics are reviewed. The application of Logistic regression as the sample of multivariate statistical model, decision tree model as the sample of data mining technique, and neural networks model as the sample of artificial intelligence technology are all reviewed. This study provides data in order to contribute the further research of violence risk assessment.
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Selecting minimum dataset soil variables using PLSR as a regressive multivariate method
NASA Astrophysics Data System (ADS)
Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.
2017-04-01
Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used, however supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP) statistics was used to quantitatively assess the predictors most relevant for response variable estimation and then for variable selection (Andersen and Bro, 2010). PCA and SDA returned TOC and RFC as influential variables both on the set of chemical and physical data analyzed separately as well as on the whole dataset (Stellacci et al., 2016). Highly weighted variables in PCA were also TEC, followed by K, and AC, followed by Pmac and BD, in the first PC (41.2% of total variance); Olsen P and HA-FA in the second PC (12.6%), Ca in the third (10.6%) component. Variables enabling maximum discrimination among treatments for SDA were WEOC, on the whole dataset, humic substances, followed by Olsen P, EC and clay, in the separate data analyses. The highest PLS-VIP statistics were recorded for Olsen P and Pmac, followed by TOC, TEC, pH and Mg for chemical variables and clay, RFC and AC for the physical variables. Results show that different methods may provide different ranking of the selected variables and the presence of a response variable, in regressive techniques, may affect variable selection. Further investigation with different response variables and with multi-year datasets would allow to better define advantages and limits of single or combined approaches. Acknowledgment The work was supported by the projects "BIOTILLAGE, approcci innovative per il miglioramento delle performances ambientali e produttive dei sistemi cerealicoli no-tillage", financed by PSR-Basilicata 2007-2013, and "DESERT, Low-cost water desalination and sensor technology compact module" financed by ERANET-WATERWORKS 2014. References Andersen C.M. and Bro R., 2010. Variable selection in regression - a tutorial. Journal of Chemometrics, 24 728-737. Armenise et al., 2013. Developing a soil quality index to compare soil fitness for agricultural use under different managements in the mediterranean environment. Soil and Tillage Research, 130:91-98. de Paul Obade et al., 2016. A standardized soil quality index for diverse field conditions. Sci. Total Env. 541:424-434. Pulido Moncada et al., 2014. Data-driven analysis of soil quality indicators using limited data. Geoderma, 235:271-278. Stellacci et al., 2016. Comparison of different multivariate methods to select key soil variables for soil quality indices computation. XLV Congress of the Italian Society of Agronomy (SIA), Sassari, 20-22 September 2016.
NASA Technical Reports Server (NTRS)
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
Castritius, Stefan; Kron, Alexander; Schäfer, Thomas; Rädle, Matthias; Harms, Diedrich
2010-12-22
A new approach of combination of near-infrared (NIR) spectroscopy and refractometry was developed in this work to determine the concentration of alcohol and real extract in various beer samples. A partial least-squares (PLS) regression, as multivariate calibration method, was used to evaluate the correlation between the data of spectroscopy/refractometry and alcohol/extract concentration. This multivariate combination of spectroscopy and refractometry enhanced the precision in the determination of alcohol, compared to single spectroscopy measurements, due to the effect of high extract concentration on the spectral data, especially of nonalcoholic beer samples. For NIR calibration, two mathematical pretreatments (first-order derivation and linear baseline correction) were applied to eliminate light scattering effects. A sample grouping of the refractometry data was also applied to increase the accuracy of the determined concentration. The root mean squared errors of validation (RMSEV) of the validation process concerning alcohol and extract concentration were 0.23 Mas% (method A), 0.12 Mas% (method B), and 0.19 Mas% (method C) and 0.11 Mas% (method A), 0.11 Mas% (method B), and 0.11 Mas% (method C), respectively.
O’Brien, Catherine; True, Lawrence D.; Higano, Celestia S.; Rademacher, Brooks L. S.; Garzotto, Mark; Beer, Tomasz M.
2011-01-01
Clinical trials are evaluating the effect of neoadjuvant chemotherapy on men with high risk prostate cancer. Little is known about the clinical significance of post-chemotherapy tumor histopathology. We assessed the prognostic and predictive value of histological features (intraductal carcinoma, vacuolated cell morphology, inconspicuous glands, cribriform architecture, and inconspicuous cancer cells) observed in 50 high-risk prostate cancers treated with pre-prostatectomy docetaxel and mitoxantrone. At a median follow-up of 65 months, the overall relapse-free survival (RFS) at 2 and 5 years was 65% and 49%, respectively. In univariate analyses (using Kaplan-Meier method and log-rank tests) intraductal (p=0.001) and cribriform (p=0.014) histologies were associated with shorter RFS. In multivariate analyses, using Cox’s proportional hazards regression, baseline PSA (p=0.004), lymph node metastases (p<0.001), and cribriform histology (p=0.007) were associated with shorter RFS. In multivariable logistic regression analysis, only intraductal pattern (p=0.007) predicted lymph node metastases. Intraductal and cribriform histologies apparently predict post-chemotherapy outcome. PMID:20231619
Correlates of HIV knowledge and Sexual risk behaviors among Female Military Personnel
Essien, E. James; Monjok, Emmanuel; Chen, Hua; Abughosh, Susan; Ekong, Ernest; Peters, Ronald J.; Holmes, Laurens; Holstad, Marcia M.; Mgbere, Osaro
2010-01-01
Objective Uniformed services personnel are at an increased risk of HIV infection. We examined the HIV/AIDS knowledge and sexual risk behaviors among female military personnel to determine the correlates of HIV risk behaviors in this population. Method The study used a cross-sectional design to examine HIV/AIDS knowledge and sexual risk behaviors in a sample of 346 females drawn from two military cantonments in Southwestern Nigeria. Data was collected between 2006 and 2008. Using bivariate analysis and multivariate logistic regression, HIV/AIDS knowledge and sexual behaviors were described in relation to socio-demographic characteristics of the participants. Results Multivariate logistic regression analysis revealed that level of education and knowing someone with HIV/AIDS were significant (p<0.05) predictors of HIV knowledge in this sample. HIV prevention self-efficacy was significantly (P<0.05) predicted by annual income and race/ethnicity. Condom use attitudes were also significantly (P<0.05) associated with number of children, annual income, and number of sexual partners. Conclusion Data indicates the importance of incorporating these predictor variables into intervention designs. PMID:20387111
Arteriopathy after transarterial chemo-lipiodolization for hepatocellular carcinoma.
Matsui, Y; Figi, A; Horikawa, M; Jahangiri Noudeh, Y; Tomozawa, Y; Hashimoto, K; Kaufman, J A; Farsad, K
2017-12-01
The purpose of this study was to investigate the incidence of and the risk factors for arteriopathy in hepatic arteries after transarterial chemo-lipiodolization in patients with hepatocellular carcinoma and the subsequent treatment strategy changes due to arteriopathy. A total of 365 arteries in 167 patients (126 men and 41 women; mean age, 60.4±15.0 [SD] years [range: 18-87 years]) were evaluated for the development of arteriopathy after chemo-lipiodolization with epirubicin- or doxorubicin-Lipiodol ® emulsion. The development of arteriopathy after chemo-lipiodolization was assessed on arteriograms performed during subsequent transarterial treatments. The treatment strategy changes due to arteriopathy, including change in the chemo-lipiodolization method and the application of alternative therapies was also investigated. Univariate and multivariate binary logistic regression models were used to identify risk factors for arteriopathy and subsequent treatment strategy change. One hundred two (27.9%) arteriopathies were detected in 62/167 (37.1%) patients (45 men, 17 women) with a mean age of 63.3±7.1 [SD] years (age range, 50-86 years). The incidence of arteriopathy was highly patient dependent, demonstrating significant correlation in a fully-adjusted multivariate regression model (P<0.0001). Multivariate-adjusted regression analysis with adjustment for the patient effect showed a statistically significant association of super-selective chemo-lipiodolization (P=0.003) with the incidence of arteriopathy. Thirty of the 102 arteriopathies (29.4%) caused a change in treatment strategy. No factors were found to be significantly associated with the treatment strategy change. The incidence of arteriopathy after chemo-lipiodolization is 27.9%. Among them, 29.4% result in a change in treatment strategy. Copyright © 2017 Editions françaises de radiologie. Published by Elsevier Masson SAS. All rights reserved.
Serum Vitamin D Levels and Markers of Severity of Childhood Asthma in Costa Rica
Brehm, John M.; Celedón, Juan C.; Soto-Quiros, Manuel E.; Avila, Lydiana; Hunninghake, Gary M.; Forno, Erick; Laskey, Daniel; Sylvia, Jody S.; Hollis, Bruce W.; Weiss, Scott T.; Litonjua, Augusto A.
2009-01-01
Rationale: Maternal vitamin D intake during pregnancy has been inversely associated with asthma symptoms in early childhood. However, no study has examined the relationship between measured vitamin D levels and markers of asthma severity in childhood. Objectives: To determine the relationship between measured vitamin D levels and both markers of asthma severity and allergy in childhood. Methods: We examined the relation between 25-hydroxyvitamin D levels (the major circulating form of vitamin D) and markers of allergy and asthma severity in a cross-sectional study of 616 Costa Rican children between the ages of 6 and 14 years. Linear, logistic, and negative binomial regressions were used for the univariate and multivariate analyses. Measurements and Main Results: Of the 616 children with asthma, 175 (28%) had insufficient levels of vitamin D (<30 ng/ml). In multivariate linear regression models, vitamin D levels were significantly and inversely associated with total IgE and eosinophil count. In multivariate logistic regression models, a log10 unit increase in vitamin D levels was associated with reduced odds of any hospitalization in the previous year (odds ratio [OR], 0.05; 95% confidence interval [CI], 0.004–0.71; P = 0.03), any use of antiinflammatory medications in the previous year (OR, 0.18; 95% CI, 0.05–0.67; P = 0.01), and increased airway responsiveness (a ≤8.58-μmol provocative dose of methacholine producing a 20% fall in baseline FEV1 [OR, 0.15; 95% CI, 0.024–0.97; P = 0.05]). Conclusions: Our results suggest that vitamin D insufficiency is relatively frequent in an equatorial population of children with asthma. In these children, lower vitamin D levels are associated with increased markers of allergy and asthma severity. PMID:19179486
NASA Astrophysics Data System (ADS)
Carisi, Francesca; Domeneghetti, Alessio; Kreibich, Heidi; Schröter, Kai; Castellarin, Attilio
2017-04-01
Flood risk is function of flood hazard and vulnerability, therefore its accurate assessment depends on a reliable quantification of both factors. The scientific literature proposes a number of objective and reliable methods for assessing flood hazard, yet it highlights a limited understanding of the fundamental damage processes. Loss modelling is associated with large uncertainty which is, among other factors, due to a lack of standard procedures; for instance, flood losses are often estimated based on damage models derived in completely different contexts (i.e. different countries or geographical regions) without checking its applicability, or by considering only one explanatory variable (i.e. typically water depth). We consider the Secchia river flood event of January 2014, when a sudden levee-breach caused the inundation of nearly 200 km2 in Northern Italy. In the aftermath of this event, local authorities collected flood loss data, together with additional information on affected private households and industrial activities (e.g. buildings surface and economic value, number of company's employees and others). Based on these data we implemented and compared a quadratic-regression damage function, with water depth as the only explanatory variable, and a multi-variable model that combines multiple regression trees and considers several explanatory variables (i.e. bagging decision trees). Our results show the importance of data collection revealing that (1) a simple quadratic regression damage function based on empirical data from the study area can be significantly more accurate than literature damage-models derived for a different context and (2) multi-variable modelling may outperform the uni-variable approach, yet it is more difficult to develop and apply due to a much higher demand of detailed data.
Concentration-Dependent Antagonism and Culture Conversion in Pulmonary Tuberculosis
Pasipanodya, Jotam G.; Denti, Paolo; Sirgel, Frederick; Lesosky, Maia; Gumbo, Tawanda; Meintjes, Graeme; McIlleron, Helen; Wilkinson, Robert J.
2017-01-01
Abstract Background. There is scant evidence to support target drug exposures for optimal tuberculosis outcomes. We therefore assessed whether pharmacokinetic/pharmacodynamic (PK/PD) parameters could predict 2-month culture conversion. Methods. One hundred patients with pulmonary tuberculosis (65% human immunodeficiency virus coinfected) were intensively sampled to determine rifampicin, isoniazid, and pyrazinamide plasma concentrations after 7–8 weeks of therapy, and PK parameters determined using nonlinear mixed-effects models. Detailed clinical data and sputum for culture were collected at baseline, 2 months, and 5–6 months. Minimum inhibitory concentrations (MICs) were determined on baseline isolates. Multivariate logistic regression and the assumption-free multivariate adaptive regression splines (MARS) were used to identify clinical and PK/PD predictors of 2-month culture conversion. Potential PK/PD predictors included 0- to 24-hour area under the curve (AUC0-24), maximum concentration (Cmax), AUC0-24/MIC, Cmax/MIC, and percentage of time that concentrations persisted above the MIC (%TMIC). Results. Twenty-six percent of patients had Cmax of rifampicin <8 mg/L, pyrazinamide <35 mg/L, and isoniazid <3 mg/L. No relationship was found between PK exposures and 2-month culture conversion using multivariate logistic regression after adjusting for MIC. However, MARS identified negative interactions between isoniazid Cmax and rifampicin Cmax/MIC ratio on 2-month culture conversion. If isoniazid Cmax was <4.6 mg/L and rifampicin Cmax/MIC <28, the isoniazid concentration had an antagonistic effect on culture conversion. For patients with isoniazid Cmax >4.6 mg/L, higher isoniazid exposures were associated with improved rates of culture conversion. Conclusions. PK/PD analyses using MARS identified isoniazid Cmax and rifampicin Cmax/MIC thresholds below which there is concentration-dependent antagonism that reduces 2-month sputum culture conversion. PMID:28205671
Shaw, Souradet Y.; Lorway, Robert R.; Deering, Kathleen N.; Avery, Lisa; Mohan, H. L.; Bhattacharjee, Parinita; Reza-Paul, Sushena; Isac, Shajy; Ramesh, Banadakoppa M.; Washington, Reynold; Moses, Stephen; Blanchard, James F.
2012-01-01
Objectives There is a lack of information on sexual violence (SV) among men who have sex with men and transgendered individuals (MSM-T) in southern India. As SV has been associated with HIV vulnerability, this study examined health related behaviours and practices associated with SV among MSM-T. Design Data were from cross-sectional surveys from four districts in Karnataka, India. Methods Multivariable logistic regression models were constructed to examine factors related to SV. Multivariable negative binomial regression models examined the association between physician visits and SV. Results A total of 543 MSM-T were included in the study. Prevalence of SV was 18% in the past year. HIV prevalence among those reporting SV was 20%, compared to 12% among those not reporting SV (p = .104). In multivariable models, and among sex workers, those reporting SV were more likely to report anal sex with 5+ casual sex partners in the past week (AOR: 4.1; 95%CI: 1.2–14.3, p = .029). Increased physician visits among those reporting SV was reported only for those involved in sex work (ARR: 1.7; 95%CI: 1.1–2.7, p = .012). Conclusions These results demonstrate high levels of SV among MSM-T populations, highlighting the importance of integrating interventions to reduce violence as part of HIV prevention programs and health services. PMID:22448214
Dong, Chunjiao; Clarke, David B; Richards, Stephen H; Huang, Baoshan
2014-01-01
The influence of intersection features on safety has been examined extensively because intersections experience a relatively large proportion of motor vehicle conflicts and crashes. Although there are distinct differences between passenger cars and large trucks-size, operating characteristics, dimensions, and weight-modeling crash counts across vehicle types is rarely addressed. This paper develops and presents a multivariate regression model of crash frequencies by collision vehicle type using crash data for urban signalized intersections in Tennessee. In addition, the performance of univariate Poisson-lognormal (UVPLN), multivariate Poisson (MVP), and multivariate Poisson-lognormal (MVPLN) regression models in establishing the relationship between crashes, traffic factors, and geometric design of roadway intersections is investigated. Bayesian methods are used to estimate the unknown parameters of these models. The evaluation results suggest that the MVPLN model possesses most of the desirable statistical properties in developing the relationships. Compared to the UVPLN and MVP models, the MVPLN model better identifies significant factors and predicts crash frequencies. The findings suggest that traffic volume, truck percentage, lighting condition, and intersection angle significantly affect intersection safety. Important differences in car, car-truck, and truck crash frequencies with respect to various risk factors were found to exist between models. The paper provides some new or more comprehensive observations that have not been covered in previous studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
PAPANNA, Ramesha; BLOCK-ABRAHAM, Dana; Mann, Lovepreet K; BUHIMSCHI, Irina A.; BEBBINGTON, Michael; GARCIA, Elisa; KAHLEK, Nahla; HARMAN, Christopher; JOHNSON, Anthony; BASCHAT, Ahmet; MOISE, Kenneth J.
2014-01-01
OBJECTIVE Despite improved perinatal survival following fetoscopic laser surgery (FLS) for twin twin transfusion syndrome (TTTS), prematurity remains an important contributor to perinatal mortality and morbidity. The objective of the study was to identify risk factors for complicated preterm delivery after FLS. STUDY DESIGN Retrospective cohort study of prospectively collected data on maternal/fetal demographics and pre-operative, operative and post-operative variables of 459 patients treated in 3 U.S. fetal centers. Multivariate linear regression was performed to identify significant risk factors associated with preterm delivery, which was cross-validated using K-fold method. Multivariate logistic regression was performed to identify risk factors for early vs. late preterm delivery based on median gestational age at delivery of 32 weeks. RESULTS There were significant differences in case selection and outcomes between the centers. After controlling for the center of surgery, a multivariate analysis indicated a lower maternal age at procedure, history of previous prematurity, shortened cervical length, use of amnioinfusion, 12 Fr cannula diameter, lack of a collagen plug placement and iatrogenic preterm premature rupture of membranes (iPPROM) were significantly associated with a lower gestational age at delivery. CONCLUSION Specific fetal/maternal and operative variables are associated with preterm delivery after FLS for the treatment of TTTS. Further studies to modify some of these variables may decrease the perinatal morbidity after laser therapy. PMID:24013922
Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo
2011-03-04
Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. Copyright © 2010 Elsevier B.V. All rights reserved.
Anantha M. Prasad; Louis R. Iverson; Andy Liaw; Andy Liaw
2006-01-01
We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.
Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald
2011-06-01
Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.
2011-01-01
Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. Conclusions HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID:21627852
NASA Astrophysics Data System (ADS)
Attia, Khalid A. M.; Nassar, Mohammed W. I.; El-Zeiny, Mohamed B.; Serag, Ahmed
2017-01-01
For the first time, a new variable selection method based on swarm intelligence namely firefly algorithm is coupled with three different multivariate calibration models namely, concentration residual augmented classical least squares, artificial neural network and support vector regression in UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of using this new powerful algorithm over the well-known genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between all the models regarding their predictabilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration.
FGWAS: Functional genome wide association analysis.
Huang, Chao; Thompson, Paul; Wang, Yalin; Yu, Yang; Zhang, Jingwen; Kong, Dehan; Colen, Rivka R; Knickmeyer, Rebecca C; Zhu, Hongtu
2017-10-01
Functional phenotypes (e.g., subcortical surface representation), which commonly arise in imaging genetic studies, have been used to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. However, existing statistical methods largely ignore the functional features (e.g., functional smoothness and correlation). The aim of this paper is to develop a functional genome-wide association analysis (FGWAS) framework to efficiently carry out whole-genome analyses of functional phenotypes. FGWAS consists of three components: a multivariate varying coefficient model, a global sure independence screening procedure, and a test procedure. Compared with the standard multivariate regression model, the multivariate varying coefficient model explicitly models the functional features of functional phenotypes through the integration of smooth coefficient functions and functional principal component analysis. Statistically, compared with existing methods for genome-wide association studies (GWAS), FGWAS can substantially boost the detection power for discovering important genetic variants influencing brain structure and function. Simulation studies show that FGWAS outperforms existing GWAS methods for searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. We have successfully applied FGWAS to large-scale analysis of data from the Alzheimer's Disease Neuroimaging Initiative for 708 subjects, 30,000 vertices on the left and right hippocampal surfaces, and 501,584 SNPs. Copyright © 2017 Elsevier Inc. All rights reserved.
Chelgani, S.C.; Hart, B.; Grady, W.C.; Hower, J.C.
2011-01-01
The relationship between maceral content plus mineral matter and gross calorific value (GCV) for a wide range of West Virginia coal samples (from 6518 to 15330 BTU/lb; 15.16 to 35.66MJ/kg) has been investigated by multivariable regression and adaptive neuro-fuzzy inference system (ANFIS). The stepwise least square mathematical method comparison between liptinite, vitrinite, plus mineral matter as input data sets with measured GCV reported a nonlinear correlation coefficient (R2) of 0.83. Using the same data set the correlation between the predicted GCV from the ANFIS model and the actual GCV reported a R2 value of 0.96. It was determined that the GCV-based prediction methods, as used in this article, can provide a reasonable estimation of GCV. Copyright ?? Taylor & Francis Group, LLC.
Chemometric methods for the simultaneous determination of some water-soluble vitamins.
Mohamed, Abdel-Maaboud I; Mohamed, Horria A; Mohamed, Niveen A; El-Zahery, Marwa R
2011-01-01
Two spectrophotometric methods, derivative and multivariate methods, were applied for the determination of binary, ternary, and quaternary mixtures of the water-soluble vitamins thiamine HCI (I), pyridoxine HCI (II), riboflavin (III), and cyanocobalamin (IV). The first method is divided into first derivative and first derivative of ratio spectra methods, and the second into classical least squares and principal components regression methods. Both methods are based on spectrophotometric measurements of the studied vitamins in 0.1 M HCl solution in the range of 200-500 nm for all components. The linear calibration curves were obtained from 2.5-90 microg/mL, and the correlation coefficients ranged from 0.9991 to 0.9999. These methods were applied for the analysis of the following mixtures: (I) and (II); (I), (II), and (III); (I), (II), and (IV); and (I), (II), (III), and (IV). The described methods were successfully applied for the determination of vitamin combinations in synthetic mixtures and dosage forms from different manufacturers. The recovery ranged from 96.1 +/- 1.2 to 101.2 +/- 1.0% for derivative methods and 97.0 +/- 0.5 to 101.9 +/- 1.3% for multivariate methods. The results of the developed methods were compared with those of reported methods, and gave good accuracy and precision.
Functional Relationships and Regression Analysis.
ERIC Educational Resources Information Center
Preece, Peter F. W.
1978-01-01
Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…
Zhao, Huaqing; Corrado, Rachel; Mastrogiannnis, Dimitrios M.; Lepore, Stephen J.
2017-01-01
Abstract Objectives: Ineffective contraceptive use among young sexually active women is extremely prevalent and poses a significant risk for unintended pregnancy (UP). Ineffective contraception involves the use of the withdrawal method or the inconsistent use of other types of contraception (i.e., condoms and birth control pills). This investigation examined violence exposure and psychological factors related to ineffective contraceptive use among young sexually active women. Materials and Methods: Young, nonpregnant sexually active women (n = 315) were recruited from an urban family planning clinic in 2013 to participate in a longitudinal study. Tablet-based surveys measured childhood violence, community-level violence, intimate partner violence, depressive symptoms, and self-esteem. Follow-up surveys measured type and consistency of contraception used 9 months later. Multivariate logistic regression models assessed violence and psychological risk factors as main effects and moderators related to ineffective compared with effective use of contraception. Results: The multivariate logistic regression model showed that childhood sexual violence and low self-esteem were significantly related to ineffective use of contraception (adjusted odds ratio [aOR] = 2.69, confidence interval [95% CI]: 1.18–6.17, and aOR = 0.51, 95% CI: 0.28–0.93; respectively), although self-esteem did not moderate the relationship between childhood sexual violence and ineffective use of contraception (aOR = 0.38, 95% CI: 0.08–1.84). Depressive symptoms were not related to ineffective use of contraception in the multivariate model. Conclusions: Interventions to reduce UP should recognize the long-term effects of childhood sexual violence and address the role of low self-esteem on the ability of young sexually active women to effectively and consistently use contraception to prevent UP. PMID:28045570
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.
Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei
2013-12-03
We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm(-1)); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm(-1))) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R(2) = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R(2) = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R(2) = 0.998) independent of laser excitation power variations. This work demonstrates that multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in challenging Raman endoscopic applications.
Baldacchino, Tara; Jacobs, William R; Anderson, Sean R; Worden, Keith; Rowson, Jennifer
2018-01-01
This contribution presents a novel methodology for myolectric-based control using surface electromyographic (sEMG) signals recorded during finger movements. A multivariate Bayesian mixture of experts (MoE) model is introduced which provides a powerful method for modeling force regression at the fingertips, while also performing finger movement classification as a by-product of the modeling algorithm. Bayesian inference of the model allows uncertainties to be naturally incorporated into the model structure. This method is tested using data from the publicly released NinaPro database which consists of sEMG recordings for 6 degree-of-freedom force activations for 40 intact subjects. The results demonstrate that the MoE model achieves similar performance compared to the benchmark set by the authors of NinaPro for finger force regression. Additionally, inherent to the Bayesian framework is the inclusion of uncertainty in the model parameters, naturally providing confidence bounds on the force regression predictions. Furthermore, the integrated clustering step allows a detailed investigation into classification of the finger movements, without incurring any extra computational effort. Subsequently, a systematic approach to assessing the importance of the number of electrodes needed for accurate control is performed via sensitivity analysis techniques. A slight degradation in regression performance is observed for a reduced number of electrodes, while classification performance is unaffected.
A Unified Framework for Association Analysis with Multiple Related Phenotypes
Stephens, Matthew
2013-01-01
We consider the problem of assessing associations between multiple related outcome variables, and a single explanatory variable of interest. This problem arises in many settings, including genetic association studies, where the explanatory variable is genotype at a genetic variant. We outline a framework for conducting this type of analysis, based on Bayesian model comparison and model averaging for multivariate regressions. This framework unifies several common approaches to this problem, and includes both standard univariate and standard multivariate association tests as special cases. The framework also unifies the problems of testing for associations and explaining associations – that is, identifying which outcome variables are associated with genotype. This provides an alternative to the usual, but conceptually unsatisfying, approach of resorting to univariate tests when explaining and interpreting significant multivariate findings. The method is computationally tractable genome-wide for modest numbers of phenotypes (e.g. 5–10), and can be applied to summary data, without access to raw genotype and phenotype data. We illustrate the methods on both simulated examples, and to a genome-wide association study of blood lipid traits where we identify 18 potential novel genetic associations that were not identified by univariate analyses of the same data. PMID:23861737
Regional flow duration curves: Geostatistical techniques versus multivariate regression
Pugliese, Alessio; Farmer, William H.; Castellarin, Attilio; Archfield, Stacey A.; Vogel, Richard M.
2016-01-01
A period-of-record flow duration curve (FDC) represents the relationship between the magnitude and frequency of daily streamflows. Prediction of FDCs is of great importance for locations characterized by sparse or missing streamflow observations. We present a detailed comparison of two methods which are capable of predicting an FDC at ungauged basins: (1) an adaptation of the geostatistical method, Top-kriging, employing a linear weighted average of dimensionless empirical FDCs, standardised with a reference streamflow value; and (2) regional multiple linear regression of streamflow quantiles, perhaps the most common method for the prediction of FDCs at ungauged sites. In particular, Top-kriging relies on a metric for expressing the similarity between catchments computed as the negative deviation of the FDC from a reference streamflow value, which we termed total negative deviation (TND). Comparisons of these two methods are made in 182 largely unregulated river catchments in the southeastern U.S. using a three-fold cross-validation algorithm. Our results reveal that the two methods perform similarly throughout flow-regimes, with average Nash-Sutcliffe Efficiencies 0.566 and 0.662, (0.883 and 0.829 on log-transformed quantiles) for the geostatistical and the linear regression models, respectively. The differences between the reproduction of FDC's occurred mostly for low flows with exceedance probability (i.e. duration) above 0.98.
Yehia, Ali M; Arafa, Reham M; Abbas, Samah S; Amer, Sawsan M
2016-01-15
Spectral resolution of cefquinome sulfate (CFQ) in the presence of its degradation products was studied. Three selective, accurate and rapid spectrophotometric methods were performed for the determination of CFQ in the presence of either its hydrolytic, oxidative or photo-degradation products. The proposed ratio difference, derivative ratio and mean centering are ratio manipulating spectrophotometric methods that were satisfactorily applied for selective determination of CFQ within linear range of 5.0-40.0 μg mL(-1). Concentration Residuals Augmented Classical Least Squares was applied and evaluated for the determination of the cited drug in the presence of its all degradation products. Traditional Partial Least Squares regression was also applied and benchmarked against the proposed advanced multivariate calibration. Experimentally designed 25 synthetic mixtures of three factors at five levels were used to calibrate and validate the multivariate models. Advanced chemometrics succeeded in quantitative and qualitative analyses of CFQ along with its hydrolytic, oxidative and photo-degradation products. The proposed methods were applied successfully for different pharmaceutical formulations analyses. These developed methods were simple and cost-effective compared with the manufacturer's RP-HPLC method. Copyright © 2015 Elsevier B.V. All rights reserved.
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Reporting of surgical complications is common, but few provide information about the severity and estimate risk factors of complications. If have, but lack of specificity. We retrospectively analyzed data on 2795 gastric cancer patients underwent surgical procedure at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, established multivariate logistic regression model to predictive risk factors related to the postoperative complications according to the Clavien-Dindo classification system. Twenty-four out of 86 variables were identified statistically significant in univariate logistic regression analysis, 11 significant variables entered multivariate analysis were employed to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, introperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on logistic regression equation, p=Exp∑BiXi / (1+Exp∑BiXi), multivariate logistic regression predictive model that calculated the risk of postoperative morbidity was developed, p = 1/(1 + e((4.810-1.287X1-0.504X2-0.500X3-0.474X4-0.405X5-0.318X6-0.316X7-0.305X8-0.278X9-0.255X10-0.138X11))). The accuracy, sensitivity and specificity of the model to predict the postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model based on Clavien-Dindo grading severity of complications system and logistic regression analysis can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool and may serve as a template for the development of risk models for other surgical groups.
Xiao, Li; Wei, Hui; Himmel, Michael E.; Jameel, Hasan; Kelley, Stephen S.
2014-01-01
Optimizing the use of lignocellulosic biomass as the feedstock for renewable energy production is currently being developed globally. Biomass is a complex mixture of cellulose, hemicelluloses, lignins, extractives, and proteins; as well as inorganic salts. Cell wall compositional analysis for biomass characterization is laborious and time consuming. In order to characterize biomass fast and efficiently, several high through-put technologies have been successfully developed. Among them, near infrared spectroscopy (NIR) and pyrolysis-molecular beam mass spectrometry (Py-mbms) are complementary tools and capable of evaluating a large number of raw or modified biomass in a short period of time. NIR shows vibrations associated with specific chemical structures whereas Py-mbms depicts the full range of fragments from the decomposition of biomass. Both NIR vibrations and Py-mbms peaks are assigned to possible chemical functional groups and molecular structures. They provide complementary information of chemical insight of biomaterials. However, it is challenging to interpret the informative results because of the large amount of overlapping bands or decomposition fragments contained in the spectra. In order to improve the efficiency of data analysis, multivariate analysis tools have been adapted to define the significant correlations among data variables, so that the large number of bands/peaks could be replaced by a small number of reconstructed variables representing original variation. Reconstructed data variables are used for sample comparison (principal component analysis) and for building regression models (partial least square regression) between biomass chemical structures and properties of interests. In this review, the important biomass chemical structures measured by NIR and Py-mbms are summarized. The advantages and disadvantages of conventional data analysis methods and multivariate data analysis methods are introduced, compared and evaluated. This review aims to serve as a guide for choosing the most effective data analysis methods for NIR and Py-mbms characterization of biomass. PMID:25147552
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-01
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by the Grubb’s test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and the plain partial least squares regression. Both R2 and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916 with bitterness values ranging from 0.63 to 4.78 were obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV, which was calculated using the models constructed by other methods, was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS model constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data. PMID:26821026
Beer fermentation: monitoring of process parameters by FT-NIR and multivariate data analysis.
Grassi, Silvia; Amigo, José Manuel; Lyndgaard, Christian Bøge; Foschino, Roberto; Casiraghi, Ernestina
2014-07-15
This work investigates the capability of Fourier-Transform near infrared (FT-NIR) spectroscopy to monitor and assess process parameters in beer fermentation at different operative conditions. For this purpose, the fermentation of wort with two different yeast strains and at different temperatures was monitored for nine days by FT-NIR. To correlate the collected spectra with °Brix, pH and biomass, different multivariate data methodologies were applied. Principal component analysis (PCA), partial least squares (PLS) and locally weighted regression (LWR) were used to assess the relationship between FT-NIR spectra and the abovementioned process parameters that define the beer fermentation. The accuracy and robustness of the obtained results clearly show the suitability of FT-NIR spectroscopy, combined with multivariate data analysis, to be used as a quality control tool in the beer fermentation process. FT-NIR spectroscopy, when combined with LWR, demonstrates to be a perfectly suitable quantitative method to be implemented in the production of beer. Copyright © 2014 Elsevier Ltd. All rights reserved.
Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye
2016-01-13
A framework of establishing standard reference scale (texture) is proposed by multivariate statistical analysis according to instrumental measurement and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework method is verified by establishing standard reference scale of texture attribute (hardness) with Chinese well-known food. More than 100 food products in 16 categories were tested using instrumental measurement (TPA test), and the result was analyzed with clustering analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were determined to construct the hardness standard reference scale. The results indicate that the regression coefficient between the estimated sensory value and the instrumentally measured value is significant (R(2) = 0.9765), which fits well with Stevens's theory. The research provides reliable a theoretical basis and practical guide for quantitative standard reference scale establishment on food texture characteristics.
Causal diagrams and multivariate analysis II: precision work.
Jupiter, Daniel C
2014-01-01
In this Investigators' Corner, I continue my discussion of when and why we researchers should include variables in multivariate regression. My examination focuses on studies comparing treatment groups and situations for which we can either exclude variables from multivariate analyses or include them for reasons of precision. Copyright © 2014 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Hasan, Haliza; Ahmad, Sanizah; Osman, Balkish Mohd; Sapri, Shamsiah; Othman, Nadirah
2017-08-01
In regression analysis, missing covariate data has been a common problem. Many researchers use ad hoc methods to overcome this problem due to the ease of implementation. However, these methods require assumptions about the data that rarely hold in practice. Model-based methods such as Maximum Likelihood (ML) using the expectation maximization (EM) algorithm and Multiple Imputation (MI) are more promising when dealing with difficulties caused by missing data. Then again, inappropriate methods of missing value imputation can lead to serious bias that severely affects the parameter estimates. The main objective of this study is to provide a better understanding regarding missing data concept that can assist the researcher to select the appropriate missing data imputation methods. A simulation study was performed to assess the effects of different missing data techniques on the performance of a regression model. The covariate data were generated using an underlying multivariate normal distribution and the dependent variable was generated as a combination of explanatory variables. Missing values in covariate were simulated using a mechanism called missing at random (MAR). Four levels of missingness (10%, 20%, 30% and 40%) were imposed. ML and MI techniques available within SAS software were investigated. A linear regression analysis was fitted and the model performance measures; MSE, and R-Squared were obtained. Results of the analysis showed that MI is superior in handling missing data with highest R-Squared and lowest MSE when percent of missingness is less than 30%. Both methods are unable to handle larger than 30% level of missingness.
Gupta, Deepak K; Claggett, Brian; Wells, Quinn; Cheng, Susan; Li, Man; Maruthur, Nisa; Selvin, Elizabeth; Coresh, Josef; Konety, Suma; Butler, Kenneth R; Mosley, Thomas; Boerwinkle, Eric; Hoogeveen, Ron; Ballantyne, Christie M; Solomon, Scott D
2015-01-01
Background Natriuretic peptides promote natriuresis, diuresis, and vasodilation. Experimental deficiency of natriuretic peptides leads to hypertension (HTN) and cardiac hypertrophy, conditions more common among African Americans. Hospital-based studies suggest that African Americans may have reduced circulating natriuretic peptides, as compared to Caucasians, but definitive data from community-based cohorts are lacking. Methods and Results We examined plasma N-terminal pro B-type natriuretic peptide (NTproBNP) levels according to race in 9137 Atherosclerosis Risk in Communities (ARIC) Study participants (22% African American) without prevalent cardiovascular disease at visit 4 (1996–1998). Multivariable linear and logistic regression analyses were performed adjusting for clinical covariates. Among African Americans, percent European ancestry was determined from genetic ancestry informative markers and then examined in relation to NTproBNP levels in multivariable linear regression analysis. NTproBNP levels were significantly lower in African Americans (median, 43 pg/mL; interquartile range [IQR], 18, 88) than Caucasians (median, 68 pg/mL; IQR, 36, 124; P<0.0001). In multivariable models, adjusted log NTproBNP levels were 40% lower (95% confidence interval [CI], −43, −36) in African Americans, compared to Caucasians, which was consistent across subgroups of age, gender, HTN, diabetes, insulin resistance, and obesity. African-American race was also significantly associated with having nondetectable NTproBNP (adjusted OR, 5.74; 95% CI, 4.22, 7.80). In multivariable analyses in African Americans, a 10% increase in genetic European ancestry was associated with a 7% (95% CI, 1, 13) increase in adjusted log NTproBNP. Conclusions African Americans have lower levels of plasma NTproBNP than Caucasians, which may be partially owing to genetic variation. Low natriuretic peptide levels in African Americans may contribute to the greater risk for HTN and its sequalae in this population. PMID:25999400
Jupiter, Daniel C
2012-01-01
In this first of a series of statistical methodology commentaries for the clinician, we discuss the use of multivariate linear regression. Copyright © 2012 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Battista, Marco Johannes; Cotarelo, Cristina; Jakobi, Sina; Steetskamp, Joscha; Makris, Georgios; Sicking, Isabel; Weyer, Veronika; Schmidt, Marcus
2014-07-01
The aim of this study was to evaluate the prognostic influence of epithelial cell adhesion molecule (EpCAM) in an unselected cohort of ovarian cancer (OC) patients. Expression of EpCAM was determined by immunohistochemistry in an unselected cohort of 117 patients with OC. Univariable and multivariable Cox regression analyses adjusted for age, tumor stage, histological grading, histological subtype, postoperative tumor burden and completeness of chemotherapy were performed in order to determine the prognostic influence of EpCAM. The Kaplan-Meier method is used to estimate survival rates. Univariable Cox regression analysis showed that overexpression of EpCAM is associated with favorable prognosis in terms of progression-free survival (PFS) (p = 0.011) and disease-specific survival (DSS) (p = 0.003). In multivariable Cox regression analysis, overexpression of EpCAM retains its significance independent of established prognostic factors for longer PFS [hazard ratios (HR) 0.408, 95 % confidence interval (CI) 0.197-0.846, p = 0.003] but not for PFS (HR 0.666, 95 % CI 0.366-1.212, p = 0.183). Kaplan-Meier plots demonstrate an influence on 5-year PFS rates (0 vs. 27.6 %, p = 0.048) and DSS rates (11.8 vs. 54.0 %, p = 0.018). These findings support the hypothesis that the expression of EpCAM is associated with favorable prognosis in OC.
Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra
NASA Astrophysics Data System (ADS)
Zhan, Hao; Fang, Jing; Tang, Liying; Yang, Hongjun; Li, Hua; Wang, Zhuju; Yang, Bin; Wu, Hongwei; Fu, Meihong
2017-08-01
Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility to classify the samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated to analyze gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra as the reference. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were performed to calibrate the regression model. Different data pretreatments such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky-Golay filter, and Norris derivative filter were applied to remove the systematic errors. The performance of the model was evaluated according to the root mean square of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that compared to PCR and SMLR, PLS had a lower RMSEC, RMSECV, and RMSEP and higher r for all the four analytes. PLS coupled with proper pretreatments showed good performance in both the fitting and predicting results. Furthermore, the original areas of Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra.
2014-01-01
This paper examined the efficiency of multivariate linear regression (MLR) and artificial neural network (ANN) models in prediction of two major water quality parameters in a wastewater treatment plant. Biochemical oxygen demand (BOD) and chemical oxygen demand (COD) as well as indirect indicators of organic matters are representative parameters for sewer water quality. Performance of the ANN models was evaluated using coefficient of correlation (r), root mean square error (RMSE) and bias values. The computed values of BOD and COD by model, ANN method and regression analysis were in close agreement with their respective measured values. Results showed that the ANN performance model was better than the MLR model. Comparative indices of the optimized ANN with input values of temperature (T), pH, total suspended solid (TSS) and total suspended (TS) for prediction of BOD was RMSE = 25.1 mg/L, r = 0.83 and for prediction of COD was RMSE = 49.4 mg/L, r = 0.81. It was found that the ANN model could be employed successfully in estimating the BOD and COD in the inlet of wastewater biochemical treatment plants. Moreover, sensitive examination results showed that pH parameter have more effect on BOD and COD predicting to another parameters. Also, both implemented models have predicted BOD better than COD. PMID:24456676
NASA Astrophysics Data System (ADS)
Giammanco, S.; Ferrera, E.; Cannata, A.; Montalto, P.; Neri, M.
2013-12-01
From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol probe located on the upper NE flank of Mt. Etna volcano (Italy), close both to the Piano Provenzana fault and to the NE-Rift. Seismic, volcanological and radon data were analysed together with data on environmental parameters, such as air and soil temperature, barometric pressure, snow and rain fall. In order to find possible correlations among the above parameters, and hence to reveal possible anomalous trends in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R2 = 0.31). When using 100-day time windows, the R2 values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, similar to multivariate linear regression analysis, the summer period was characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the time series acquired. This approach allowed to study the relations among different signals either in the time or in the frequency domain. It confirmed the results of the previous methods, but also allowed to recognize correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitations, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Using the above analysis, two periods were recognized when radon variations were significantly correlated with marked soil temperature changes and also with local seismic or volcanic activity. This allowed to produce two different physical models of soil gas transport that explain the observed anomalies. Our work suggests that in order to make an accurate analysis of the relations among different signals it is necessary to use different techniques that give complementary analytical information. In particular, the wavelet analysis showed to be the most effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.
Gao, Yongnian; Gao, Junfeng; Yin, Hongbin; Liu, Chuansheng; Xia, Ting; Wang, Jing; Huang, Qi
2015-03-15
Remote sensing has been widely used for ater quality monitoring, but most of these monitoring studies have only focused on a few water quality variables, such as chlorophyll-a, turbidity, and total suspended solids, which have typically been considered optically active variables. Remote sensing presents a challenge in estimating the phosphorus concentration in water. The total phosphorus (TP) in lakes has been estimated from remotely sensed observations, primarily using the simple individual band ratio or their natural logarithm and the statistical regression method based on the field TP data and the spectral reflectance. In this study, we investigated the possibility of establishing a spatial modeling scheme to estimate the TP concentration of a large lake from multi-spectral satellite imagery using band combinations and regional multivariate statistical modeling techniques, and we tested the applicability of the spatial modeling scheme. The results showed that HJ-1A CCD multi-spectral satellite imagery can be used to estimate the TP concentration in a lake. The correlation and regression analysis showed a highly significant positive relationship between the TP concentration and certain remotely sensed combination variables. The proposed modeling scheme had a higher accuracy for the TP concentration estimation in the large lake compared with the traditional individual band ratio method and the whole-lake scale regression-modeling scheme. The TP concentration values showed a clear spatial variability and were high in western Lake Chaohu and relatively low in eastern Lake Chaohu. The northernmost portion, the northeastern coastal zone and the southeastern portion of western Lake Chaohu had the highest TP concentrations, and the other regions had the lowest TP concentration values, except for the coastal zone of eastern Lake Chaohu. These results strongly suggested that the proposed modeling scheme, i.e., the band combinations and the regional multivariate statistical modeling techniques, demonstrated advantages for estimating the TP concentration in a large lake and had a strong potential for universal application for the TP concentration estimation in large lake waters worldwide. Copyright © 2014 Elsevier Ltd. All rights reserved.
Value of Information Analysis for Time-lapse Seismic Data by Simulation-Regression
NASA Astrophysics Data System (ADS)
Dutta, G.; Mukerji, T.; Eidsvik, J.
2016-12-01
A novel method to estimate the Value of Information (VOI) of time-lapse seismic data in the context of reservoir development is proposed. VOI is a decision analytic metric quantifying the incremental value that would be created by collecting information prior to making a decision under uncertainty. The VOI has to be computed before collecting the information and can be used to justify its collection. Previous work on estimating the VOI of geophysical data has involved explicit approximation of the posterior distribution of reservoir properties given the data and then evaluating the prospect values for that posterior distribution of reservoir properties. Here, we propose to directly estimate the prospect values given the data by building a statistical relationship between them using regression. Various regression techniques such as Partial Least Squares Regression (PLSR), Multivariate Adaptive Regression Splines (MARS) and k-Nearest Neighbors (k-NN) are used to estimate the VOI, and the results compared. For a univariate Gaussian case, the VOI obtained from simulation-regression has been shown to be close to the analytical solution. Estimating VOI by simulation-regression is much less computationally expensive since the posterior distribution of reservoir properties given each possible dataset need not be modeled and the prospect values need not be evaluated for each such posterior distribution of reservoir properties. This method is flexible, since it does not require rigid model specification of posterior but rather fits conditional expectations non-parametrically from samples of values and data.
NASA Astrophysics Data System (ADS)
Hegazy, Maha A.; Lotfy, Hayam M.; Rezk, Mamdouh R.; Omran, Yasmin Rostom
2015-04-01
Smart and novel spectrophotometric and chemometric methods have been developed and validated for the simultaneous determination of a binary mixture of chloramphenicol (CPL) and dexamethasone sodium phosphate (DSP) in presence of interfering substances without prior separation. The first method depends upon derivative subtraction coupled with constant multiplication. The second one is ratio difference method at optimum wavelengths which were selected after applying derivative transformation method via multiplying by a decoding spectrum in order to cancel the contribution of non labeled interfering substances. The third method relies on partial least squares with regression model updating. They are so simple that they do not require any preliminary separation steps. Accuracy, precision and linearity ranges of these methods were determined. Moreover, specificity was assessed by analyzing synthetic mixtures of both drugs. The proposed methods were successfully applied for analysis of both drugs in their pharmaceutical formulation. The obtained results have been statistically compared to that of an official spectrophotometric method to give a conclusion that there is no significant difference between the proposed methods and the official ones with respect to accuracy and precision.
Farouk, M; Elaziz, Omar Abd; Tawakkol, Shereen M; Hemdan, A; Shehata, Mostafa A
2014-04-05
Four simple, accurate, reproducible, and selective methods have been developed and subsequently validated for the determination of Benazepril (BENZ) alone and in combination with Amlodipine (AML) in pharmaceutical dosage form. The first method is pH induced difference spectrophotometry, where BENZ can be measured in presence of AML as it showed maximum absorption at 237nm and 241nm in 0.1N HCl and 0.1N NaOH, respectively, while AML has no wavelength shift in both solvents. The second method is the new Extended Ratio Subtraction Method (EXRSM) coupled to Ratio Subtraction Method (RSM) for determination of both drugs in commercial dosage form. The third and fourth methods are multivariate calibration which include Principal Component Regression (PCR) and Partial Least Squares (PLSs). A detailed validation of the methods was performed following the ICH guidelines and the standard curves were found to be linear in the range of 2-30μg/mL for BENZ in difference and extended ratio subtraction spectrophotometric method, and 5-30 for AML in EXRSM method, with well accepted mean correlation coefficient for each analyte. The intra-day and inter-day precision and accuracy results were well within the acceptable limits. Copyright © 2013 Elsevier B.V. All rights reserved.
Formisano, Elia; De Martino, Federico; Valente, Giancarlo
2008-09-01
Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.
NASA Astrophysics Data System (ADS)
Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza
2014-10-01
The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of these numerical calculations, using the multivariate adaptive regression splines (MARS) technique, conclusions of this research work are exposed.
Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan
2014-01-01
Purpose The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Methods and Materials Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R2, chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Results Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R2 was satisfactory and corresponded well with the expected values. Conclusions Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT. PMID:24586971
Gebremichael, Gebrekiros; Yihune, Manaye; Ajema, Dessalegn; Haftu, Desta; Gedamu, Genet
2018-01-01
Background. Perinatal depression is a serious mental health problem that can negatively affect the lives of women and children. The adverse consequences of perinatal depression in high-income countries also occur in low-income countries. Objective. To assess the perinatal depression and associated factors among mothers in Southern Ethiopia. Methods. A community based cross-sectional study was conducted among selected 728 study participants in Arba Minch Zuria HDSS. A pretested questionnaire was used to collect the data. Data were analyzed using STATA version 12 software. Descriptive statistical methods were used to summarize the characteristics of the mothers. Bivariate and multivariable logistic regression was used for analysis. Results. The prevalence of perinatal depression among the study period was 26.7%. In the final multivariable logistic regression, monthly income AOR (95% C.I): 4.2 (1.9, 9.3), parity [AOR (95% C.I): 0.14 (0.03, 0.65)], pregnancy complications AOR (95% C.I): 5 (2.5, 10.4), husband smoking status [AOR (95% C.I): 4.12 (1.6, 10.6)], history of previous depression AOR (95% C.I): 2.7 (1.54, 4.8), and family history of psychiatric disorders were the independent factors associated with perinatal depression. Conclusion. The study showed a high prevalence of perinatal depression among pregnant mothers and mothers who have less than a one-year-old child.
Multivariate methods for indoor PM10 and PM2.5 modelling in naturally ventilated schools buildings
NASA Astrophysics Data System (ADS)
Elbayoumi, Maher; Ramli, Nor Azam; Md Yusof, Noor Faizah Fitri; Yahaya, Ahmad Shukri Bin; Al Madhoun, Wesam; Ul-Saufie, Ahmed Zia
2014-09-01
In this study the concentrations of PM10, PM2.5, CO and CO2 concentrations and meteorological variables (wind speed, air temperature, and relative humidity) were employed to predict the annual and seasonal indoor concentration of PM10 and PM2.5 using multivariate statistical methods. The data have been collected in twelve naturally ventilated schools in Gaza Strip (Palestine) from October 2011 to May 2012 (academic year). The bivariate correlation analysis showed that the indoor PM10 and PM2.5 were highly positive correlated with outdoor concentration of PM10 and PM2.5. Further, Multiple linear regression (MLR) was used for modelling and R2 values for indoor PM10 were determined as 0.62 and 0.84 for PM10 and PM2.5 respectively. The Performance indicators of MLR models indicated that the prediction for PM10 and PM2.5 annual models were better than seasonal models. In order to reduce the number of input variables, principal component analysis (PCA) and principal component regression (PCR) were applied by using annual data. The predicted R2 were 0.40 and 0.73 for PM10 and PM2.5, respectively. PM10 models (MLR and PCR) show the tendency to underestimate indoor PM10 concentrations as it does not take into account the occupant's activities which highly affect the indoor concentrations during the class hours.
Attia, Khalid A M; Nassar, Mohammed W I; El-Zeiny, Mohamed B; Serag, Ahmed
2017-01-05
For the first time, a new variable selection method based on swarm intelligence namely firefly algorithm is coupled with three different multivariate calibration models namely, concentration residual augmented classical least squares, artificial neural network and support vector regression in UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of using this new powerful algorithm over the well-known genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between all the models regarding their predictabilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration. Copyright © 2016 Elsevier B.V. All rights reserved.
A FORTRAN program for multivariate survival analysis on the personal computer.
Mulder, P G
1988-01-01
In this paper a FORTRAN program is presented for multivariate survival or life table regression analysis in a competing risks' situation. The relevant failure rate (for example, a particular disease or mortality rate) is modelled as a log-linear function of a vector of (possibly time-dependent) explanatory variables. The explanatory variables may also include the variable time itself, which is useful for parameterizing piecewise exponential time-to-failure distributions in a Gompertz-like or Weibull-like way as a more efficient alternative to Cox's proportional hazards model. Maximum likelihood estimates of the coefficients of the log-linear relationship are obtained from the iterative Newton-Raphson method. The program runs on a personal computer under DOS; running time is quite acceptable, even for large samples.
Ordinary chondrites - Multivariate statistical analysis of trace element contents
NASA Technical Reports Server (NTRS)
Lipschutz, Michael E.; Samuels, Stephen M.
1991-01-01
The contents of mobile trace elements (Co, Au, Sb, Ga, Se, Rb, Cs, Te, Bi, Ag, In, Tl, Zn, and Cd) in Antarctic and non-Antarctic populations of H4-6 and L4-6 chondrites, were compared using standard multivariate discriminant functions borrowed from linear discriminant analysis and logistic regression. A nonstandard randomization-simulation method was developed, making it possible to carry out probability assignments on a distribution-free basis. Compositional differences were found both between the Antarctic and non-Antarctic H4-6 chondrite populations and between two L4-6 chondrite populations. It is shown that, for various types of meteorites (in particular, for the H4-6 chondrites), the Antarctic/non-Antarctic compositional difference is due to preterrestrial differences in the genesis of their parent materials.
NASA Astrophysics Data System (ADS)
Cannon, Alex
2017-04-01
Estimating historical trends in short-duration rainfall extremes at regional and local scales is challenging due to low signal-to-noise ratios and the limited availability of homogenized observational data. In addition to being of scientific interest, trends in rainfall extremes are of practical importance, as their presence calls into question the stationarity assumptions that underpin traditional engineering and infrastructure design practice. Even with these fundamental challenges, increasingly complex questions are being asked about time series of extremes. For instance, users may not only want to know whether or not rainfall extremes have changed over time, they may also want information on the modulation of trends by large-scale climate modes or on the nonstationarity of trends (e.g., identifying hiatus periods or periods of accelerating positive trends). Efforts have thus been devoted to the development and application of more robust and powerful statistical estimators for regional and local scale trends. While a standard nonparametric method like the regional Mann-Kendall test, which tests for the presence of monotonic trends (i.e., strictly non-decreasing or non-increasing changes), makes fewer assumptions than parametric methods and pools information from stations within a region, it is not designed to visualize detected trends, include information from covariates, or answer questions about the rate of change in trends. As a remedy, monotone quantile regression (MQR) has been developed as a nonparametric alternative that can be used to estimate a common monotonic trend in extremes at multiple stations. Quantile regression makes efficient use of data by directly estimating conditional quantiles based on information from all rainfall data in a region, i.e., without having to precompute the sample quantiles. The MQR method is also flexible and can be used to visualize and analyze the nonlinearity of the detected trend. However, it is fundamentally a univariate technique, and cannot incorporate information from additional covariates, for example ENSO state or physiographic controls on extreme rainfall within a region. Here, the univariate MQR model is extended to allow the use of multiple covariates. Multivariate monotone quantile regression (MMQR) is based on a single hidden-layer feedforward network with the quantile regression error function and partial monotonicity constraints. The MMQR model is demonstrated via Monte Carlo simulations and the estimation and visualization of regional trends in moderate rainfall extremes based on homogenized sub-daily precipitation data at stations in Canada.
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
Jiménez-Carvelo, Ana M; González-Casado, Antonio; Cuadros-Rodríguez, Luis
2017-03-01
A new analytical method for the quantification of olive oil and palm oil in blends with other vegetable edible oils (canola, safflower, corn, peanut, seeds, grapeseed, linseed, sesame and soybean) using normal phase liquid chromatography, and applying chemometric tools was developed. The procedure for obtaining of chromatographic fingerprint from the methyl-transesterified fraction from each blend is described. The multivariate quantification methods used were Partial Least Square-Regression (PLS-R) and Support Vector Regression (SVR). The quantification results were evaluated by several parameters as the Root Mean Square Error of Validation (RMSEV), Mean Absolute Error of Validation (MAEV) and Median Absolute Error of Validation (MdAEV). It has to be highlighted that the new proposed analytical method, the chromatographic analysis takes only eight minutes and the results obtained showed the potential of this method and allowed quantification of mixtures of olive oil and palm oil with other vegetable oils. Copyright © 2016 Elsevier B.V. All rights reserved.
Rausch, Laura A.; Kouchoukos, Nicholas T.; Lobdell, Kevin W.; Khabbaz, Kamal; Murphy, Edward; Hagberg, Robert C.
2016-01-01
Background The goal of this study was to compare early postoperative outcomes and actuarial-free survival between patients who underwent repair of acute type A aortic dissection by the method of cerebral perfusion used. Methods A total of 324 patients from five academic medical centers underwent repair of acute type A aortic dissection between January 2000 and December 2010. Of those, antegrade cerebral perfusion (ACP) was used for 84 patients, retrograde cerebral perfusion (RCP) was used for 55 patients, and deep hypothermic circulatory arrest (DHCA) was used for 184 patients during repair. Major morbidity, operative mortality, and 5-year actuarial survival were compared between groups. Multivariate logistic regression was used to determine predictors of operative mortality and Cox Regression hazard ratios were calculated to determine the predictors of long term mortality. Results Operative mortality was not influenced by the type of cerebral protection (19% for ACP, 14.5% for RCP and 19.1% for DHCA, P=0.729). In multivariable logistic regression analysis, hemodynamic instability [odds ratio (OR) =19.6, 95% confidence intervals (CI), 0.102–0.414, P<0.001] and CPB time >200 min(OR =4.7, 95% CI, 1.962–1.072, P=0.029) emerged as independent predictors of operative mortality. Actuarial 5-year survival was unchanged by cerebral protection modality (48.8% for ACP, 61.8% for RCP and 66.8% for no cerebral protection, log-rank P=0.844). Conclusions During surgical repair of type A aortic dissection, ACP, RCP or DHCA are safe strategies for cerebral protection in selected patients with type A aortic dissection. PMID:27563545
Chiu, Yu-Jen; Liao, Wen-Chieh; Wang, Tien-Hsiang; Shih, Yu-Chung; Ma, Hsu; Lin, Chih-Hsun; Wu, Szu-Hsien; Perng, Cherng-Kang
2017-08-01
Despite significant advances in medical care and surgical techniques, pressure sore reconstruction is still prone to elevated rates of complication and recurrence. We conducted a retrospective study to investigate not only complication and recurrence rates following pressure sore reconstruction but also preoperative risk stratification. This study included 181 ulcers underwent flap operations between January 2002 and December 2013 were included in the study. We performed a multivariable logistic regression model, which offers a regression-based method accounting for the within-patient correlation of the success or failure of each flap. The overall complication and recurrence rates for all flaps were 46.4% and 16.0%, respectively, with a mean follow-up period of 55.4 ± 38.0 months. No statistically significant differences of complication and recurrence rates were observed among three different reconstruction methods. In subsequent analysis, albumin ≤3.0 g/dl and paraplegia were significantly associated with higher postoperative complication. The anatomic factor, ischial wound location, significantly trended toward the development of ulcer recurrence. In the fasciocutaneous group, paraplegia had significant correlation to higher complication and recurrence rates. In the musculocutaneous flap group, variables had no significant correlation to complication and recurrence rates. In the free-style perforator group, ischial wound location and malnourished status correlated with significantly higher complication rates; ischial wound location also correlated with significantly higher recurrence rate. Ultimately, our review of a noteworthy cohort with lengthy follow-up helped identify and confirm certain risk factors that can facilitate a more informed and thoughtful pre- and postoperative decision-making process for patients with pressure ulcers. Copyright © 2017 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Qiu, Shanshan; Wang, Jun; Gao, Liping
2014-07-09
An electronic nose (E-nose) and an electronic tongue (E-tongue) have been used to characterize five types of strawberry juices based on processing approaches (i.e., microwave pasteurization, steam blanching, high temperature short time pasteurization, frozen-thawed, and freshly squeezed). Juice quality parameters (vitamin C, pH, total soluble solid, total acid, and sugar/acid ratio) were detected by traditional measuring methods. Multivariate statistical methods (linear discriminant analysis (LDA) and partial least squares regression (PLSR)) and neural networks (Random Forest (RF) and Support Vector Machines) were employed to qualitative classification and quantitative regression. E-tongue system reached higher accuracy rates than E-nose did, and the simultaneous utilization did have an advantage in LDA classification and PLSR regression. According to cross-validation, RF has shown outstanding and indisputable performances in the qualitative and quantitative analysis. This work indicates that the simultaneous utilization of E-nose and E-tongue can discriminate processed fruit juices and predict quality parameters successfully for the beverage industry.
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (ndvi), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficient in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), where as Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficient in other two areas, the case of Selangor based on logistic coefficient of Cameron showed highest (90%) prediction accuracy where as the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
Merchak, Noelle; Silvestre, Virginie; Loquet, Denis; Rizk, Toufic; Akoka, Serge; Bejjani, Joseph
2017-01-01
Triacylglycerols, which are quasi-universal components of food matrices, consist of complex mixtures of molecules. Their site-specific 13 C content, their fatty acid profile, and their position on the glycerol moiety may significantly vary with the geographical, botanical, or animal origin of the sample. Such variables are valuable tracers for food authentication issues. The main objective of this work was to develop a new method based on a rapid and precise 13 C-NMR spectroscopy (using a polarization transfer technique) coupled with multivariate linear regression analyses in order to quantify the whole set of individual fatty acids within triacylglycerols. In this respect, olive oil samples were analyzed by means of both adiabatic 13 C-INEPT sequence and gas chromatography (GC). For each fatty acid within the studied matrix and for squalene as well, a multivariate prediction model was constructed using the deconvoluted peak areas of 13 C-INEPT spectra as predictors, and the data obtained by GC as response variables. This 13 C-NMR-based strategy, tested on olive oil, could serve as an alternative to the gas chromatographic quantification of individual fatty acids in other matrices, while providing additional compositional and isotopic information. Graphical abstract A strategy based on the multivariate linear regression of variables obtained by a rapid 13 C-NMR technique was developed for the quantification of individual fatty acids within triacylglycerol matrices. The conceived strategy was tested on olive oil.
Serum dehydroepiandrosterone sulphate, psychosocial factors and musculoskeletal pain in workers.
Marinelli, A; Prodi, A; Pesel, G; Ronchese, F; Bovenzi, M; Negro, C; Larese Filon, F
2017-12-30
The serum level of dehydroepiandrosterone sulphate (DHEA-S) has been suggested as a biological marker of stress. To assess the association between serum DHEA-S, psychosocial factors and musculoskeletal (MS) pain in university workers. The study population included voluntary workers at the scientific departments of the University of Trieste (Italy) who underwent periodical health surveillance from January 2011 to June 2012. DHEA-S level was analysed in serum. The assessment tools included the General Health Questionnaire (GHQ) and a modified Nordic musculoskeletal symptoms questionnaire. The relation between DHEA-S, individual characteristics, pain perception and psychological factors was assessed by means of multivariable linear regression analysis. There were 189 study participants. The study population was characterized by high reward and low effort. Pain perception in the neck, shoulder, upper limbs, upper back and lower back was reported by 42, 32, 19, 29 and 43% of people, respectively. In multivariable regression analysis, gender, age and pain perception in the shoulder and upper limbs were significantly related to serum DHEA-S. Effort and overcommitment were related to shoulder and neck pain but not to DHEA-S. The GHQ score was associated with pain perception in different body sites and inversely to DHEA-S but significance was lost in multivariable regression analysis. DHEA-S was associated with age, gender and perception of MS pain, while effort-reward imbalance dimensions and GHQ score failed to reach the statistical significance in multivariable regression analysis. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Independent Prognostic Factors for Acute Organophosphorus Pesticide Poisoning.
Tang, Weidong; Ruan, Feng; Chen, Qi; Chen, Suping; Shao, Xuebo; Gao, Jianbo; Zhang, Mao
2016-07-01
Acute organophosphorus pesticide poisoning (AOPP) is becoming a significant problem and a potential cause of human mortality because of the abuse of organophosphate compounds. This study aims to determine the independent prognostic factors of AOPP by using multivariate logistic regression analysis. The clinical data for 71 subjects with AOPP admitted to our hospital were retrospectively analyzed. This information included the Acute Physiology and Chronic Health Evaluation II (APACHE II) scores, 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, admission blood cholinesterase levels, 6-h post-admission blood cholinesterase levels, cholinesterase activity, blood pH, and other factors. Univariate analysis and multivariate logistic regression analyses were conducted to identify all prognostic factors and independent prognostic factors, respectively. A receiver operating characteristic curve was plotted to analyze the testing power of independent prognostic factors. Twelve of 71 subjects died. Admission blood lactate levels, 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, blood pH, and APACHE II scores were identified as prognostic factors for AOPP according to the univariate analysis, whereas only 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, and blood pH were independent prognostic factors identified by multivariate logistic regression analysis. The receiver operating characteristic analysis suggested that post-admission 6-h lactate clearance rates were of moderate diagnostic value. High 6-h post-admission blood lactate levels, low blood pH, and low post-admission 6-h lactate clearance rates were independent prognostic factors identified by multivariate logistic regression analysis. Copyright © 2016 by Daedalus Enterprises.
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...
Conceptual and statistical problems associated with the use of diversity indices in ecology.
Barrantes, Gilbert; Sandoval, Luis
2009-09-01
Diversity indices, particularly the Shannon-Wiener index, have extensively been used in analyzing patterns of diversity at different geographic and ecological scales. These indices have serious conceptual and statistical problems which make comparisons of species richness or species abundances across communities nearly impossible. There is often no a single statistical method that retains all information needed to answer even a simple question. However, multivariate analyses could be used instead of diversity indices, such as cluster analyses or multiple regressions. More complex multivariate analyses, such as Canonical Correspondence Analysis, provide very valuable information on environmental variables associated to the presence and abundance of the species in a community. In addition, particular hypotheses associated to changes in species richness across localities, or change in abundance of one, or a group of species can be tested using univariate, bivariate, and/or rarefaction statistical tests. The rarefaction method has proved to be robust to standardize all samples to a common size. Even the simplest method as reporting the number of species per taxonomic category possibly provides more information than a diversity index value.
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
Prognostic predictors of patients with carcinoma of the gastric cardia.
Zhang, Ming; Li, Zhigao; Ma, Yan; Zhu, Guanyu; Zhang, Hongfeng; Xue, Yingwei
2012-05-01
This study gives insight into survival predictors and clinicopathological features of carcinoma of the gastric cardia. The study included 233 patients who underwent operation for carcinoma of the gastric cardia. Clinicopathological prognostic variables were evaluated as predictors of long-term survival by univariate and multivariate analysis. Cox regression was used for multivariate analysis and survival curves were drawn by the Kaplan- Meier method. Carcinoma of the gastric cardia was characterized by positive lymph node metastasis (77.3%), serosal invasion (83.3%) and more stage III or IV tumors (72.5%). Overall 5-year survival rate was 21.9% and median survival period was 24 months. The 5-year survival rate was influenced by tumor size, depth on invasion, lymph node metastasis, extent of lymph node dissection, disease stage, operation methods and resection margin. The absent of serosal invasion and lymph node metastasis, curative resection should be considered to be the favourable predictors of long-term survival of patients with carcinoma of the gastric cardia.
Bicycle Use and Cyclist Safety Following Boston’s Bicycle Infrastructure Expansion, 2009–2012
Angriman, Federico; Bellows, Alexandra L.; Taylor, Kathryn
2016-01-01
Objectives. To evaluate changes in bicycle use and cyclist safety in Boston, Massachusetts, following the rapid expansion of its bicycle infrastructure between 2007 and 2014. Methods. We measured bicycle lane mileage, a surrogate for bicycle infrastructure expansion, and quantified total estimated number of commuters. In addition, we calculated the number of reported bicycle accidents from 2009 to 2012. Bicycle accident and injury trends over time were assessed via generalized linear models. Multivariable logistic regression was used to examine factors associated with bicycle injuries. Results. Boston increased its total bicycle lane mileage from 0.034 miles in 2007 to 92.2 miles in 2014 (P < .001). The percentage of bicycle commuters increased from 0.9% in 2005 to 2.4% in 2014 (P = .002) and the total percentage of bicycle accidents involving injuries diminished significantly, from 82.7% in 2009 to 74.6% in 2012. The multivariable logistic regression analysis showed that for every 1-year increase in time from 2009 to 2012, there was a 14% reduction in the odds of being injured in an accident. Conclusions. The expansion of Boston’s bicycle infrastructure was associated with increases in both bicycle use and cyclist safety. PMID:27736203
Magnus, Manya; Kuo, Irene; Wang, Lei; Liu, Ting-Yuan; Mayer, Kenneth H.
2014-01-01
Objectives. We examined lifetime incarceration history and its association with key characteristics among 1553 Black men who have sex with men (BMSM) recruited in 6 US cities. Methods. We conducted bivariate analyses of data collected from the HIV Prevention Trials Network 061 study from July 2009 through December 2011 to examine the relationship between incarceration history and demographic and psychosocial variables predating incarceration and multivariate logistic regression analyses to explore the associations between incarceration history and demographic and psychosocial variables found to be significant. We then used multivariate logistic regression models to explore the independent association between incarceration history and 6 outcome variables. Results. After adjusting for confounders, we found that increasing age, transgender identity, heterosexual or straight identity, history of childhood violence, and childhood sexual experience were significantly associated with incarceration history. A history of incarceration was also independently associated with any alcohol and drug use in the past 6 months. Conclusions. The findings highlight an elevated lifetime incarceration history among a geographically diverse sample of BMSM and the need to adequately assess the impact of incarceration among BMSM in the United States. PMID:24432948
NASA Astrophysics Data System (ADS)
Candefjord, Stefan; Nyberg, Morgan; Jalkanen, Ville; Ramser, Kerstin; Lindahl, Olof A.
2010-12-01
Tissue characterization is fundamental for identification of pathological conditions. Raman spectroscopy (RS) and tactile resonance measurement (TRM) are two promising techniques that measure biochemical content and stiffness, respectively. They have potential to complement the golden standard--histological analysis. By combining RS and TRM, complementary information about tissue content can be obtained and specific drawbacks can be avoided. The aim of this study was to develop a multivariate approach to compare RS and TRM information. The approach was evaluated on measurements at the same points on porcine abdominal tissue. The measurement points were divided into five groups by multivariate analysis of the RS data. A regression analysis was performed and receiver operating characteristic (ROC) curves were used to compare the RS and TRM data. TRM identified one group efficiently (area under ROC curve 0.99). The RS data showed that the proportion of saturated fat was high in this group. The regression analysis showed that stiffness was mainly determined by the amount of fat and its composition. We concluded that RS provided additional, important information for tissue identification that was not provided by TRM alone. The results are promising for development of a method combining RS and TRM for intraoperative tissue characterization.
Social support for women of reproductive age and its predictors: a population-based study
2012-01-01
Background Social support is an exchange of resources between at least two individuals perceived by the provider or recipient to be intended to promote the health of the recipient. Social support is a major determinant of health. The objective of this study was to determine the perceived social support and its associated sociodemographic factors among women of reproductive age. Methods This was a population-based cross-sectional study with multistage random cluster sampling of 1359 women of reproductive age. Data were collected using questionnaires on sociodemographic factors and perceived social support (PRQ85-Part 2). The relationship between the dependent variable (perceived social support) and the independent variables (sociodemographic characteristics) was analyzed using the multivariable linear regression model. Results The mean score of social support was 134.3 ± 17.9. Women scored highest in the “worth” dimension and lowest in the “social integration” dimension. Multivariable linear regression analysis indicated that the variables of education, spouse’s occupation, Sufficiency of income for expenses and primary support source were significantly related to the perceived social support. Conclusion Sociodemographic factors affect social support and could be considered in planning interventions to improve social support for Iranian women. PMID:22988834
Zhong-xiang, Feng; Shi-sheng, Lu; Wei-hua, Zhang; Nan-nan, Zhang
2014-01-01
In order to build a combined model which can meet the variation rule of death toll data for road traffic accidents and can reflect the influence of multiple factors on traffic accidents and improve prediction accuracy for accidents, the Verhulst model was built based on the number of death tolls for road traffic accidents in China from 2002 to 2011; and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors to build the death toll multivariate linear regression model. Then the two models were combined to be a combined prediction model which has weight coefficient. Shapley value method was applied to calculate the weight coefficient by assessing contributions. Finally, the combined model was used to recalculate the number of death tolls from 2002 to 2011, and the combined model was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data characteristics but also quantify the degree of influence to the death toll by each influencing factor and had high accuracy as well as strong practicability. PMID:25610454
Feng, Zhong-xiang; Lu, Shi-sheng; Zhang, Wei-hua; Zhang, Nan-nan
2014-01-01
In order to build a combined model which can meet the variation rule of death toll data for road traffic accidents and can reflect the influence of multiple factors on traffic accidents and improve prediction accuracy for accidents, the Verhulst model was built based on the number of death tolls for road traffic accidents in China from 2002 to 2011; and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors to build the death toll multivariate linear regression model. Then the two models were combined to be a combined prediction model which has weight coefficient. Shapley value method was applied to calculate the weight coefficient by assessing contributions. Finally, the combined model was used to recalculate the number of death tolls from 2002 to 2011, and the combined model was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data characteristics but also quantify the degree of influence to the death toll by each influencing factor and had high accuracy as well as strong practicability.
Basques, B A; McLynn, R P; Lukasiewicz, A M; Samuel, A M; Bohl, D D; Grauer, J N
2018-02-01
The aims of this study were to characterize the frequency of missing data in the National Surgical Quality Improvement Program (NSQIP) database and to determine how missing data can influence the results of studies dealing with elderly patients with a fracture of the hip. Patients who underwent surgery for a fracture of the hip between 2005 and 2013 were identified from the NSQIP database and the percentage of missing data was noted for demographics, comorbidities and laboratory values. These variables were tested for association with 'any adverse event' using multivariate regressions based on common ways of handling missing data. A total of 26 066 patients were identified. The rate of missing data was up to 77.9% for many variables. Multivariate regressions comparing three methods of handling missing data found different risk factors for postoperative adverse events. Only seven of 35 identified risk factors (20%) were common to all three analyses. Missing data is an important issue in national database studies that researchers must consider when evaluating such investigations. Cite this article: Bone Joint J 2018;100-B:226-32. ©2018 The British Editorial Society of Bone & Joint Surgery.
Factors Influencing Cecal Intubation Time during Retrograde Approach Single-Balloon Enteroscopy
Chen, Peng-Jen; Shih, Yu-Lueng; Huang, Hsin-Hung; Hsieh, Tsai-Yuan
2014-01-01
Background and Aim. The predisposing factors for prolonged cecal intubation time (CIT) during colonoscopy have been well identified. However, the factors influencing CIT during retrograde SBE have not been addressed. The aim of this study was to determine the factors influencing CIT during retrograde SBE. Methods. We investigated patients who underwent retrograde SBE at a medical center from January 2011 to March 2014. The medical charts and SBE reports were reviewed. The patients' characteristics and procedure-associated data were recorded. These data were analyzed with univariate analysis as well as multivariate logistic regression analysis to identify the possible predisposing factors. Results. We enrolled 66 patients into this study. The median CIT was 17.4 minutes. With univariate analysis, there was no statistical difference in age, sex, BMI, or history of abdominal surgery, except for bowel preparation (P = 0.021). Multivariate logistic regression analysis showed that inadequate bowel preparation (odds ratio 30.2, 95% confidence interval 4.63–196.54; P < 0.001) was the independent predisposing factors for prolonged CIT during retrograde SBE. Conclusions. For experienced endoscopist, inadequate bowel preparation was the independent predisposing factor for prolonged CIT during retrograde SBE. PMID:25505904
Zwetsloot, P P; Kouwenberg, L H J A; Sena, E S; Eding, J E; den Ruijter, H M; Sluijter, J P G; Pasterkamp, G; Doevendans, P A; Hoefer, I E; Chamuleau, S A J; van Hout, G P J; Jansen Of Lorkeers, S J
2017-10-27
Large animal models are essential for the development of novel therapeutics for myocardial infarction. To optimize translation, we need to assess the effect of experimental design on disease outcome and model experimental design to resemble the clinical course of MI. The aim of this study is therefore to systematically investigate how experimental decisions affect outcome measurements in large animal MI models. We used control animal-data from two independent meta-analyses of large animal MI models. All variables of interest were pre-defined. We performed univariable and multivariable meta-regression to analyze whether these variables influenced infarct size and ejection fraction. Our analyses incorporated 246 relevant studies. Multivariable meta-regression revealed that infarct size and cardiac function were influenced independently by choice of species, sex, co-medication, occlusion type, occluded vessel, quantification method, ischemia duration and follow-up duration. We provide strong systematic evidence that commonly used endpoints significantly depend on study design and biological variation. This makes direct comparison of different study-results difficult and calls for standardized models. Researchers should take this into account when designing large animal studies to most closely mimic the clinical course of MI and enable translational success.
Elliott, Marc N.; Kanouse, David E.; Klein, David J.; Davies, Susan L.; Cuccaro, Paula M.; Banspach, Stephen W.; Peskin, Melissa F.; Schuster, Mark A.
2013-01-01
Objectives. We examined the contribution of perceived racial/ethnic discrimination to disparities in problem behaviors among preadolescent Black, Latino, and White youths. Methods. We used cross-sectional data from Healthy Passages, a 3-community study of 5119 fifth graders and their parents from August 2004 through September 2006 in Birmingham, Alabama; Los Angeles County, California; and Houston, Texas. We used multivariate regressions to examine the relationships of perceived racial/ethnic discrimination and race/ethnicity to problem behaviors. We used values from these regressions to calculate the percentage of disparities in problem behaviors associated with the discrimination effect. Results. In multivariate models, perceived discrimination was associated with greater problem behaviors among Black and Latino youths. Compared with Whites, Blacks were significantly more likely to report problem behaviors, whereas Latinos were significantly less likely (a “reverse disparity”). When we set Blacks’ and Latinos’ discrimination experiences to zero, the adjusted disparity between Blacks and Whites was reduced by an estimated one third to two thirds; the reverse adjusted disparity favoring Latinos widened by about one fifth to one half. Conclusions. Eliminating discrimination could considerably reduce mental health issues, including problem behaviors, among Black and Latino youths. PMID:23597387
Predictors of condom use and refusal among the population of Free State province in South Africa
2012-01-01
Background This study investigated the extent and predictors of condom use and condom refusal in the Free State province in South Africa. Methods Through a household survey conducted in the Free Sate province of South Africa, 5,837 adults were interviewed. Univariate and multivariate survey logistic regressions and classification trees (CT) were used for analysing two response variables ‘ever used condom’ and ‘ever refused condom’. Results Eighty-three per cent of the respondents had ever used condoms, of which 38% always used them; 61% used them during the last sexual intercourse and 9% had ever refused to use them. The univariate logistic regression models and CT analysis indicated that a strong predictor of condom use was its perceived need. In the CT analysis, this variable was followed in importance by ‘knowledge of correct use of condom’, condom availability, young age, being single and higher education. ‘Perceived need’ for condoms did not remain significant in the multivariate analysis after controlling for other variables. The strongest predictor of condom refusal, as shown by the CT, was shame associated with condoms followed by the presence of sexual risk behaviour, knowing one’s HIV status, older age and lacking knowledge of condoms (i.e., ability to prevent sexually transmitted diseases and pregnancy, availability, correct and consistent use and existence of female condoms). In the multivariate logistic regression, age was not significant for condom refusal while affordability and perceived need were additional significant variables. Conclusions The use of complementary modelling techniques such as CT in addition to logistic regressions adds to a better understanding of condom use and refusal. Further improvement in correct and consistent use of condoms will require targeted interventions. In addition to existing social marketing campaigns, tailored approaches should focus on establishing the perceived need for condom-use and improving skills for correct use. They should also incorporate interventions to reduce the shame associated with condoms and individual counselling of those likely to refuse condoms. PMID:22639964
Predicting volumes in four Hawaii hardwoods...first multivariate equations developed
David A. Sharpnack
1966-01-01
Multivariate regression equations were developed for predicting board-foot (Int. 1/ 4-inch log rule ) and cubic-foot volumes in each 8.15-foot section of trees of four Hawaii hardwood species. The species are koa (Acacia koa), ohia (Metrosideros polymorpha), robusta eucalyptus (Eucalyptus robusta), and...
A Multivariate Test of the Bott Hypothesis in an Urban Irish Setting
ERIC Educational Resources Information Center
Gordon, Michael; Downing, Helen
1978-01-01
Using a sample of 686 married Irish women in Cork City the Bott hypothesis was tested, and the results of a multivariate regression analysis revealed that neither network connectedness nor the strength of the respondent's emotional ties to the network had any explanatory power. (Author)
Brouckaert, D; Uyttersprot, J-S; Broeckx, W; De Beer, T
2018-03-01
Calibration transfer or standardisation aims at creating a uniform spectral response on different spectroscopic instruments or under varying conditions, without requiring a full recalibration for each situation. In the current study, this strategy is applied to construct at-line multivariate calibration models and consequently employ them in-line in a continuous industrial production line, using the same spectrometer. Firstly, quantitative multivariate models are constructed at-line at laboratory scale for predicting the concentration of two main ingredients in hard surface cleaners. By regressing the Raman spectra of a set of small-scale calibration samples against their reference concentration values, partial least squares (PLS) models are developed to quantify the surfactant levels in the liquid detergent compositions under investigation. After evaluating the models performance with a set of independent validation samples, a univariate slope/bias correction is applied in view of transporting these at-line calibration models to an in-line manufacturing set-up. This standardisation technique allows a fast and easy transfer of the PLS regression models, by simply correcting the model predictions on the in-line set-up, without adjusting anything to the original multivariate calibration models. An extensive statistical analysis is performed in order to assess the predictive quality of the transferred regression models. Before and after transfer, the R 2 and RMSEP of both models is compared for evaluating if their magnitude is similar. T-tests are then performed to investigate whether the slope and intercept of the transferred regression line are not statistically different from 1 and 0, respectively. Furthermore, it is inspected whether no significant bias can be noted. F-tests are executed as well, for assessing the linearity of the transfer regression line and for investigating the statistical coincidence of the transfer and validation regression line. Finally, a paired t-test is performed to compare the original at-line model to the slope/bias corrected in-line model, using interval hypotheses. It is shown that the calibration models of Surfactant 1 and Surfactant 2 yield satisfactory in-line predictions after slope/bias correction. While Surfactant 1 passes seven out of eight statistical tests, the recommended validation parameters are 100% successful for Surfactant 2. It is hence concluded that the proposed strategy for transferring at-line calibration models to an in-line industrial environment via a univariate slope/bias correction of the predicted values offers a successful standardisation approach. Copyright © 2017 Elsevier B.V. All rights reserved.
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
NASA Astrophysics Data System (ADS)
Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.
2017-05-01
The braking system is one of the most important and complex subsystems of railway vehicles, especially when it comes for safety. Therefore, installing efficient safe brakes on the modern railway vehicles is essential. Nowadays is devoted attention to solving problems connected with using high performance brake materials and its impact on thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, due to the interaction between the brake block and the wheel produce complex thermos-mechanical phenomena. In this work, the investigated subjects are the cast-iron brake shoes, which are still widely used on freight wagons. Therefore, the cast-iron brake shoes - with lamellar graphite and with a high content of phosphorus (0.8-1.1%) - need a special investigation. In order to establish the optimal condition for the cast-iron brake shoes we proposed a mathematical modelling study by using the statistical analysis and multiple regression equations. Multivariate research is important in areas of cast-iron brake shoes manufacturing, because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulties in comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. In order to settle the multiple correlation between the hardness of the cast-iron brake shoes, and the chemical compositions elements several model of regression equation types has been proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representation of the regression equations in order to explain interaction of the variables and locate the optimal level of each variable for maximal response. For the calculation of the regression coefficients, dispersion and correlation coefficients, the software Matlab was used.
NASA Astrophysics Data System (ADS)
Jin, Seung-Seop; Jung, Hyung-Jo
2014-03-01
It is well known that the dynamic properties of a structure such as natural frequencies depend not only on damage but also on environmental condition (e.g., temperature). The variation in dynamic characteristics of a structure due to environmental condition may mask damage of the structure. Without taking the change of environmental condition into account, false-positive or false-negative damage diagnosis may occur so that structural health monitoring becomes unreliable. In order to address this problem, an approach to construct a regression model based on structural responses considering environmental factors has been usually used by many researchers. The key to success of this approach is the formulation between the input and output variables of the regression model to take into account the environmental variations. However, it is quite challenging to determine proper environmental variables and measurement locations in advance for fully representing the relationship between the structural responses and the environmental variations. One alternative (i.e., novelty detection) is to remove the variations caused by environmental factors from the structural responses by using multivariate statistical analysis (e.g., principal component analysis (PCA), factor analysis, etc.). The success of this method is deeply depending on the accuracy of the description of normal condition. Generally, there is no prior information on normal condition during data acquisition, so that the normal condition is determined by subjective perspective with human-intervention. The proposed method is a novel adaptive multivariate statistical analysis for monitoring of structural damage detection under environmental change. One advantage of this method is the ability of a generative learning to capture the intrinsic characteristics of the normal condition. The proposed method is tested on numerically simulated data for a range of noise in measurement under environmental variation. A comparative study with conventional methods (i.e., fixed reference scheme) demonstrates the superior performance of the proposed method for structural damage detection.
NASA Astrophysics Data System (ADS)
Tewari, Jagdish; Strong, Richard; Boulas, Pierre
2017-02-01
This article summarizes the development and validation of a Fourier transform near infrared spectroscopy (FT-NIR) method for the rapid at-line prediction of active pharmaceutical ingredient (API) in a powder blend to optimize small molecule formulations. The method was used to determine the blend uniformity end-point for a pharmaceutical solid dosage formulation containing a range of API concentrations. A set of calibration spectra from samples with concentrations ranging from 1% to 15% of API (w/w) were collected at-line from 4000 to 12,500 cm- 1. The ability of the FT-NIR method to predict API concentration in the blend samples was validated against a reference high performance liquid chromatography (HPLC) method. The prediction efficiency of four different types of multivariate data modeling methods such as partial least-squares 1 (PLS1), partial least-squares 2 (PLS2), principal component regression (PCR) and artificial neural network (ANN), were compared using relevant multivariate figures of merit. The prediction ability of the regression models were cross validated against results generated with the reference HPLC method. PLS1 and ANN showed excellent and superior prediction abilities when compared to PLS2 and PCR. Based upon these results and because of its decreased complexity compared to ANN, PLS1 was selected as the best chemometric method to predict blend uniformity at-line. The FT-NIR measurement and the associated chemometric analysis were implemented in the production environment for rapid at-line determination of the end-point of the small molecule blending operation. FIGURE 1: Correlation coefficient vs Rank plot FIGURE 2: FT-NIR spectra of different steps of Blend and final blend FIGURE 3: Predictions ability of PCR FIGURE 4: Blend uniformity predication ability of PLS2 FIGURE 5: Prediction efficiency of blend uniformity using ANN FIGURE 6: Comparison of prediction efficiency of chemometric models TABLE 1: Order of Addition for Blending Steps
Divya, O; Mishra, Ashok K
2007-05-29
Quantitative determination of kerosene fraction present in diesel has been carried out based on excitation emission matrix fluorescence (EEMF) along with parallel factor analysis (PARAFAC) and N-way partial least squares regression (N-PLS). EEMF is a simple, sensitive and nondestructive method suitable for the analysis of multifluorophoric mixtures. Calibration models consisting of varying compositions of diesel and kerosene were constructed and their validation was carried out using leave-one-out cross validation method. The accuracy of the model was evaluated through the root mean square error of prediction (RMSEP) for the PARAFAC, N-PLS and unfold PLS methods. N-PLS was found to be a better method compared to PARAFAC and unfold PLS method because of its low RMSEP values.
Yam, Eileen A; Okal, Jerry; Musyoki, Helgar; Muraguri, Nicholas; Tun, Waimar; Sheehy, Meredith; Geibel, Scott
2016-03-01
To examine whether nonbarrier modern contraceptive use is associated with less consistent condom use among Kenyan female sex workers (FSWs). Researchers recruited 579 FSWs using respondent-driven sampling. We conducted multivariate logistic regression to examine the association between consistent condom use and female-controlled nonbarrier modern contraceptive use. A total of 98.8% reported using male condoms in the past month, and 64.6% reported using female-controlled nonbarrier modern contraception. In multivariate analysis, female-controlled nonbarrier modern contraceptive use was not associated with decreased condom use with clients or nonpaying partners. Consistency of condom use is not compromised when FSWs use available female-controlled nonbarrier modern contraception. FSWs should be encouraged to use condoms consistently, whether or not other methods are used simultaneously. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Wang, Lunche; Kisi, Ozgur; Zounemat-Kermani, Mohammad; Li, Hui
2017-01-01
Pan evaporation (Ep) plays important roles in agricultural water resources management. One of the basic challenges is modeling Ep using limited climatic parameters because there are a number of factors affecting the evaporation rate. This study investigated the abilities of six different soft computing methods, multi-layer perceptron (MLP), generalized regression neural network (GRNN), fuzzy genetic (FG), least square support vector machine (LSSVM), multivariate adaptive regression spline (MARS), adaptive neuro-fuzzy inference systems with grid partition (ANFIS-GP), and two regression methods, multiple linear regression (MLR) and Stephens and Stewart model (SS) in predicting monthly Ep. Long-term climatic data at various sites crossing a wide range of climates during 1961-2000 are used for model development and validation. The results showed that the models have different accuracies in different climates and the MLP model performed superior to the other models in predicting monthly Ep at most stations using local input combinations (for example, the MAE (mean absolute errors), RMSE (root mean square errors), and determination coefficient (R2) are 0.314 mm/day, 0.405 mm/day and 0.988, respectively for HEB station), while GRNN model performed better in Tibetan Plateau (MAE, RMSE and R2 are 0.459 mm/day, 0.592 mm/day and 0.932, respectively). The accuracies of above models ranked as: MLP, GRNN, LSSVM, FG, ANFIS-GP, MARS and MLR. The overall results indicated that the soft computing techniques generally performed better than the regression methods, but MLR and SS models can be more preferred at some climatic zones instead of complex nonlinear models, for example, the BJ (Beijing), CQ (Chongqing) and HK (Haikou) stations. Therefore, it can be concluded that Ep could be successfully predicted using above models in hydrological modeling studies.
L.R. Grosenbaugh
1967-01-01
Describes an expansible computerized system that provides data needed in regression or covariance analysis of as many as 50 variables, 8 of which may be dependent. Alternatively, it can screen variously generated combinations of independent variables to find the regression with the smallest mean-squared-residual, which will be fitted if desired. The user can easily...
Fragoso, André; Silva, Claudia; Viegas, Carla; Tavares, Nelson; Guilherme, Patrícia; Santos, Nélio; Rato, Fátima; Camacho, Ana; Cavaco, Cidália; Pereira, Victor; Faísca, Marilia; Ataíde, João; Jesus, Ilídio; Neves, Pedro
2013-01-01
Aims. To evaluate the association of different apelin levels with cardiovascular mortality, hospitalization, renal function, and cardiovascular risk factors in type 2 diabetic patients with mild to moderate CKD. Methods. An observational, prospective study involving 150 patients divided into groups according to baseline apelin levels: 1 ≤ 98 pg/mL, 2 = 98–328 pg/mL, and 3 ≥ 329 pg/mL. Baseline characteristics were analyzed and compared. Multivariate Cox regression was used to find out predictors of cardiovascular mortality, and multivariate logistic regression was used to find out predictors of hospitalization and disease progression. Simple linear regressions and Pearson correlations were used to investigate correlations between apelin and renal disease and cardiovascular risk factors. Results. Patients' survival at 83 months in groups 1, 2, and 3 was 39%, 40%, and 71.2%, respectively (P = 0.046). Apelin, age, and eGFR were independent predictors of mortality, and apelin, creatinine, eGFR, resistin, and visfatin were independent predictors of hospitalization. Apelin levels were negatively correlated with cardiovascular risk factors and positively correlated with eGFR. Patients with lower apelin levels were more likely to start a depurative technique. Conclusions. Apelin levels might have a significant clinical use as a marker/predictor of cardiovascular mortality and hospitalization or even as a therapeutic agent for CKD patients with cardiovascular disease. PMID:24089668
Black, L E; Brion, G M; Freitas, S J
2007-06-01
Predicting the presence of enteric viruses in surface waters is a complex modeling problem. Multiple water quality parameters that indicate the presence of human fecal material, the load of fecal material, and the amount of time fecal material has been in the environment are needed. This paper presents the results of a multiyear study of raw-water quality at the inlet of a potable-water plant that related 17 physical, chemical, and biological indices to the presence of enteric viruses as indicated by cytopathic changes in cell cultures. It was found that several simple, multivariate logistic regression models that could reliably identify observations of the presence or absence of total culturable virus could be fitted. The best models developed combined a fecal age indicator (the atypical coliform [AC]/total coliform [TC] ratio), the detectable presence of a human-associated sterol (epicoprostanol) to indicate the fecal source, and one of several fecal load indicators (the levels of Giardia species cysts, coliform bacteria, and coprostanol). The best fit to the data was found when the AC/TC ratio, the presence of epicoprostanol, and the density of fecal coliform bacteria were input into a simple, multivariate logistic regression equation, resulting in 84.5% and 78.6% accuracies for the identification of the presence and absence of total culturable virus, respectively. The AC/TC ratio was the most influential input variable in all of the models generated, but producing the best prediction required additional input related to the fecal source and the fecal load. The potential for replacing microbial indicators of fecal load with levels of coprostanol was proposed and evaluated by multivariate logistic regression modeling for the presence and absence of virus.
Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong
2015-01-01
Purpose This study was conducted to evaluate colic pain as a prognostic pretreatment factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. The colic pain as the chief complaint was defined as either subjective flank pain during history taking and physical examination. Propensity-scores for established for colic pain was calculated for each patient using multivariate logistic regression based upon the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as predictor for stone-free status by Bayesian and non-Bayesian logistic regression model. Results After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in variables used in propensity- score matching. One-session success and stone-free rate were also higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions Colic pain in patients with ureter calculi was one of the significant predicting factors including MSL and MSD for one-session stone-free status of SWL. PMID:25902059
Shi, Xiao; Zhang, Ting-Ting; Hu, Wei-Ping; Ji, Qing-Hai
2017-04-25
The relationship between marital status and oral cavity squamous cell carcinoma (OCSCC) survival has not been explored. The objective of our study was to evaluate the impact of marital status on OCSCC survival and investigate the potential mechanisms. Married patients had better 5-year cancer-specific survival (CSS) (66.7% vs 54.9%) and 5-year overall survival (OS) (56.0% vs 41.1%). In multivariate Cox regression models, unmarried patients also showed higher mortality risk for both CSS (Hazard Ratio [HR]: 1.260, 95% confidence interval (CI): 1.187-1.339, P < 0.001) and OS (HR: 1.328, 95% CI: 1.266-1.392, P < 0.001). Multivariate logistic regression showed married patients were more likely to be diagnosed at earlier stage (P < 0.001) and receive surgery (P < 0.001). Married patients still demonstrated better prognosis in the 1:1 matched group analysis (CSS: 62.9% vs 60.8%, OS: 52.3% vs 46.5%). 11022 eligible OCSCC patients were identified from Surveillance, Epidemiology, and End Results (SEER) database, including 5902 married and 5120 unmarried individuals. Kaplan-Meier analysis, Log-rank test and Cox proportional hazards regression model were used to analyze survival and mortality risk. Influence of marital status on stage, age at diagnosis and selection of treatment was determined by binomial and multinomial logistic regression. Propensity score matching method was adopted to perform a 1:1 matched cohort. Marriage has an independently protective effect on OCSCC survival. Earlier diagnosis and more sufficient treatment are possible explanations. Besides, even after 1:1 matching, survival advantage of married group still exists, indicating that spousal support from other aspects may also play an important role.
Enhanced ID Pit Sizing Using Multivariate Regression Algorithm
NASA Astrophysics Data System (ADS)
Krzywosz, Kenji
2007-03-01
EPRI is funding a program to enhance and improve the reliability of inside diameter (ID) pit sizing for balance-of plant heat exchangers, such as condensers and component cooling water heat exchangers. More traditional approaches to ID pit sizing involve the use of frequency-specific amplitude or phase angles. The enhanced multivariate regression algorithm for ID pit depth sizing incorporates three simultaneous input parameters of frequency, amplitude, and phase angle. A set of calibration data sets consisting of machined pits of various rounded and elongated shapes and depths was acquired in the frequency range of 100 kHz to 1 MHz for stainless steel tubing having nominal wall thickness of 0.028 inch. To add noise to the acquired data set, each test sample was rotated and test data acquired at 3, 6, 9, and 12 o'clock positions. The ID pit depths were estimated using a second order and fourth order regression functions by relying on normalized amplitude and phase angle information from multiple frequencies. Due to unique damage morphology associated with the microbiologically-influenced ID pits, it was necessary to modify the elongated calibration standard-based algorithms by relying on the algorithm developed solely from the destructive sectioning results. This paper presents the use of transformed multivariate regression algorithm to estimate ID pit depths and compare the results with the traditional univariate phase angle analysis. Both estimates were then compared with the destructive sectioning results.
Optimizing complex phenotypes through model-guided multiplex genome engineering
Kuznetsov, Gleb; Goodman, Daniel B.; Filsinger, Gabriel T.; ...
2017-05-25
Here, we present a method for identifying genomic modifications that optimize a complex phenotype through multiplex genome engineering and predictive modeling. We apply our method to identify six single nucleotide mutations that recover 59% of the fitness defect exhibited by the 63-codon E. coli strain C321.ΔA. By introducing targeted combinations of changes in multiplex we generate rich genotypic and phenotypic diversity and characterize clones using whole-genome sequencing and doubling time measurements. Regularized multivariate linear regression accurately quantifies individual allelic effects and overcomes bias from hitchhiking mutations and context-dependence of genome editing efficiency that would confound other strategies.
Optimizing complex phenotypes through model-guided multiplex genome engineering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuznetsov, Gleb; Goodman, Daniel B.; Filsinger, Gabriel T.
Here, we present a method for identifying genomic modifications that optimize a complex phenotype through multiplex genome engineering and predictive modeling. We apply our method to identify six single nucleotide mutations that recover 59% of the fitness defect exhibited by the 63-codon E. coli strain C321.ΔA. By introducing targeted combinations of changes in multiplex we generate rich genotypic and phenotypic diversity and characterize clones using whole-genome sequencing and doubling time measurements. Regularized multivariate linear regression accurately quantifies individual allelic effects and overcomes bias from hitchhiking mutations and context-dependence of genome editing efficiency that would confound other strategies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Southworth, Frank; Garrow, Dr. Laurie
This chapter describes the principal types of both passenger and freight demand models in use today, providing a brief history of model development supported by references to a number of popular texts on the subject, and directing the reader to papers covering some of the more recent technical developments in the area. Over the past half century a variety of methods have been used to estimate and forecast travel demands, drawing concepts from economic/utility maximization theory, transportation system optimization and spatial interaction theory, using and often combining solution techniques as varied as Box-Jenkins methods, non-linear multivariate regression, non-linear mathematical programming,more » and agent-based microsimulation.« less
Iturriaga, H; Hirsch, S; Bunout, D; Díaz, M; Kelly, M; Silva, G; de la Maza, M P; Petermann, M; Ugarte, G
1993-04-01
Looking for a noninvasive method to predict liver histologic alterations in alcoholic patients without clinical signs of liver failure, we studied 187 chronic alcoholics recently abstinent, divided in 2 series. In the model series (n = 94) several clinical variables and results of common laboratory tests were confronted to the findings of liver biopsies. These were classified in 3 groups: 1. Normal liver; 2. Moderate alterations; 3. Marked alterations, including alcoholic hepatitis and cirrhosis. Multivariate methods used were logistic regression analysis and a classification and regression tree (CART). Both methods entered gamma-glutamyltransferase (GGT), aspartate-aminotransferase (AST), weight and age as significant and independent variables. Univariate analysis with GGT and AST at different cutoffs were also performed. To predict the presence of any kind of damage (Groups 2 and 3), CART and AST > 30 IU showed the higher sensitivity, specificity and correct prediction, both in the model and validation series. For prediction of marked liver damage, a score based on logistic regression and GGT > 110 IU had the higher efficiencies. It is concluded that GGT and AST are good markers of alcoholic liver damage and that, using sample cutoffs, histologic diagnosis can be correctly predicted in 80% of recently abstinent asymptomatic alcoholics.
Estimation of Subpixel Snow-Covered Area by Nonparametric Regression Splines
NASA Astrophysics Data System (ADS)
Kuter, S.; Akyürek, Z.; Weber, G.-W.
2016-10-01
Measurement of the areal extent of snow cover with high accuracy plays an important role in hydrological and climate modeling. Remotely-sensed data acquired by earth-observing satellites offer great advantages for timely monitoring of snow cover. However, the main obstacle is the tradeoff between temporal and spatial resolution of satellite imageries. Soft or subpixel classification of low or moderate resolution satellite images is a preferred technique to overcome this problem. The most frequently employed snow cover fraction methods applied on Moderate Resolution Imaging Spectroradiometer (MODIS) data have evolved from spectral unmixing and empirical Normalized Difference Snow Index (NDSI) methods to latest machine learning-based artificial neural networks (ANNs). This study demonstrates the implementation of subpixel snow-covered area estimation based on the state-of-the-art nonparametric spline regression method, namely, Multivariate Adaptive Regression Splines (MARS). MARS models were trained by using MODIS top of atmospheric reflectance values of bands 1-7 as predictor variables. Reference percentage snow cover maps were generated from higher spatial resolution Landsat ETM+ binary snow cover maps. A multilayer feed-forward ANN with one hidden layer trained with backpropagation was also employed to estimate the percentage snow-covered area on the same data set. The results indicated that the developed MARS model performed better than th
DOE Office of Scientific and Technical Information (OSTI.GOV)
Neeway, James J.; Rieke, Peter C.; Parruzot, Benjamin P.
In far-from-equilibrium conditions, the dissolution of borosilicate glasses used to immobilize nuclear waste is known to be a function of both temperature and pH. The aim of this paper is to study effects of these variables on three model waste glasses (SON68, ISG, AFCI). To do this, experiments were conducted at temperatures of 23, 40, 70, and 90 °C and pH(RT) values of 9, 10, 11, and 12 with the single-pass flow-through (SPFT) test method. The results from these tests were then used to parameterize a kinetic rate model based on transition state theory. Both the absolute dissolution rates andmore » the rate model parameters are compared with previous results. Discrepancies in the absolute dissolution rates as compared to those obtained using other test methods are discussed. Rate model parameters for the three glasses studied here are nearly equivalent within error and in relative agreement with previous studies. The results were analyzed with a linear multivariate regression (LMR) and a nonlinear multivariate regression performed with the use of the Glass Corrosion Modeling Tool (GCMT), which is capable of providing a robust uncertainty analysis. This robust analysis highlights the high degree of correlation of various parameters in the kinetic rate model. As more data are obtained on borosilicate glasses with varying compositions, the effect of glass composition on the rate parameter values could possibly be obtained. This would allow for the possibility of predicting the forward dissolution rate of glass based solely on composition« less
Multilevel covariance regression with correlated random effects in the mean and variance structure.
Quintero, Adrian; Lesaffre, Emmanuel
2017-09-01
Multivariate regression methods generally assume a constant covariance matrix for the observations. In case a heteroscedastic model is needed, the parametric and nonparametric covariance regression approaches can be restrictive in the literature. We propose a multilevel regression model for the mean and covariance structure, including random intercepts in both components and allowing for correlation between them. The implied conditional covariance function can be different across clusters as a result of the random effect in the variance structure. In addition, allowing for correlation between the random intercepts in the mean and covariance makes the model convenient for skewedly distributed responses. Furthermore, it permits us to analyse directly the relation between the mean response level and the variability in each cluster. Parameter estimation is carried out via Gibbs sampling. We compare the performance of our model to other covariance modelling approaches in a simulation study. Finally, the proposed model is applied to the RN4CAST dataset to identify the variables that impact burnout of nurses in Belgium. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
Neuropsychological tests for predicting cognitive decline in older adults
Baerresen, Kimberly M; Miller, Karen J; Hanson, Eric R; Miller, Justin S; Dye, Richelin V; Hartman, Richard E; Vermeersch, David; Small, Gary W
2015-01-01
Summary Aim To determine neuropsychological tests likely to predict cognitive decline. Methods A sample of nonconverters (n = 106) was compared with those who declined in cognitive status (n = 24). Significant univariate logistic regression prediction models were used to create multivariate logistic regression models to predict decline based on initial neuropsychological testing. Results Rey–Osterrieth Complex Figure Test (RCFT) Retention predicted conversion to mild cognitive impairment (MCI) while baseline Buschke Delay predicted conversion to Alzheimer’s disease (AD). Due to group sample size differences, additional analyses were conducted using a subsample of demographically matched nonconverters. Analyses indicated RCFT Retention predicted conversion to MCI and AD, and Buschke Delay predicted conversion to AD. Conclusion Results suggest RCFT Retention and Buschke Delay may be useful in predicting cognitive decline. PMID:26107318
Grandke, Fabian; Singh, Priyanka; Heuven, Henri C M; de Haan, Jorn R; Metzler, Dirk
2016-08-24
Association studies are an essential part of modern plant breeding, but are limited for polyploid crops. The increased number of possible genotype classes complicates the differentiation between them. Available methods are limited with respect to the ploidy level or data producing technologies. While genotype classification is an established noise reduction step in diploids, it gains complexity with increasing ploidy levels. Eventually, the errors produced by misclassifications exceed the benefits of genotype classes. Alternatively, continuous genotype values can be used for association analysis in higher polyploids. We associated continuous genotypes to three different traits and compared the results to the output of the genotype caller SuperMASSA. Linear, Bayesian and partial least squares regression were applied, to determine if the use of continuous genotypes is limited to a specific method. A disease, a flowering and a growth trait with h (2) of 0.51, 0.78 and 0.91 were associated with a hexaploid chrysanthemum genotypes. The data set consisted of 55,825 probes and 228 samples. We were able to detect associating probes using continuous genotypes for multiple traits, using different regression methods. The identified probe sets were overlapping, but not identical between the methods. Baysian regression was the most restrictive method, resulting in ten probes for one trait and none for the others. Linear and partial least squares regression led to numerous associating probes. Association based on genotype classes resulted in similar values, but missed several significant probes. A simulation study was used to successfully validate the number of associating markers. Association of various phenotypic traits with continuous genotypes is successful with both uni- and multivariate regression methods. Genotype calling does not improve the association and shows no advantages in this study. Instead, use of continuous genotypes simplifies the analysis, saves computational time and results more potential markers.
Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M
In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines in Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic linear Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of journals author guidelines explicitly recommend to follow the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models in published observational studies such as logistic regression, linear, Cox and Poisson are increasingly used both at international level, as well as in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Publicado por Elsevier España, S.L.U. All rights reserved.
Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi
2015-01-01
Univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was proposing four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS) method as a multivariate meta-analysis approach. We evaluated the efficiency of four new approaches including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC) on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard models coefficients in a simulation study. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that MMC approach was the most accurate procedure compared to EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. This study highlights advantages of MGLS meta-analysis on UM approach. The results suggested the use of MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients.
A simple prognostic model for overall survival in metastatic renal cell carcinoma
Assi, Hazem I.; Patenaude, Francois; Toumishey, Ethan; Ross, Laura; Abdelsalam, Mahmoud; Reiman, Tony
2016-01-01
Introduction: The primary purpose of this study was to develop a simpler prognostic model to predict overall survival for patients treated for metastatic renal cell carcinoma (mRCC) by examining variables shown in the literature to be associated with survival. Methods: We conducted a retrospective analysis of patients treated for mRCC at two Canadian centres. All patients who started first-line treatment were included in the analysis. A multivariate Cox proportional hazards regression model was constructed using a stepwise procedure. Patients were assigned to risk groups depending on how many of the three risk factors from the final multivariate model they had. Results: There were three risk factors in the final multivariate model: hemoglobin, prior nephrectomy, and time from diagnosis to treatment. Patients in the high-risk group (two or three risk factors) had a median survival of 5.9 months, while those in the intermediate-risk group (one risk factor) had a median survival of 16.2 months, and those in the low-risk group (no risk factors) had a median survival of 50.6 months. Conclusions: In multivariate analysis, shorter survival times were associated with hemoglobin below the lower limit of normal, absence of prior nephrectomy, and initiation of treatment within one year of diagnosis. PMID:27217858
Multivariate statistical approach to estimate mixing proportions for unknown end members
Valder, Joshua F.; Long, Andrew J.; Davis, Arden D.; Kenner, Scott J.
2012-01-01
A multivariate statistical method is presented, which includes principal components analysis (PCA) and an end-member mixing model to estimate unknown end-member hydrochemical compositions and the relative mixing proportions of those end members in mixed waters. PCA, together with the Hotelling T2 statistic and a conceptual model of groundwater flow and mixing, was used in selecting samples that best approximate end members, which then were used as initial values in optimization of the end-member mixing model. This method was tested on controlled datasets (i.e., true values of estimates were known a priori) and found effective in estimating these end members and mixing proportions. The controlled datasets included synthetically generated hydrochemical data, synthetically generated mixing proportions, and laboratory analyses of sample mixtures, which were used in an evaluation of the effectiveness of this method for potential use in actual hydrological settings. For three different scenarios tested, correlation coefficients (R2) for linear regression between the estimated and known values ranged from 0.968 to 0.993 for mixing proportions and from 0.839 to 0.998 for end-member compositions. The method also was applied to field data from a study of end-member mixing in groundwater as a field example and partial method validation.
Boshuizen, Hendriek C; Lanti, Mariapaola; Menotti, Alessandro; Moschandreas, Joanna; Tolonen, Hanna; Nissinen, Aulikki; Nedeljkovic, Srecko; Kafatos, Anthony; Kromhout, Daan
2007-02-15
The authors aimed to quantify the effects of current systolic blood pressure (SBP) and serum total cholesterol on the risk of mortality in comparison with SBP or serum cholesterol 25 years previously, taking measurement error into account. The authors reanalyzed 35-year follow-up data on mortality due to coronary heart disease and stroke among subjects aged 65 years or more from nine cohorts of the Seven Countries Study. The two-step method of Tsiatis et al. (J Am Stat Assoc 1995;90:27-37) was used to adjust for regression dilution bias, and results were compared with those obtained using more commonly applied methods of adjustment for regression dilution bias. It was found that the commonly used univariate adjustment for regression dilution bias overestimates the effects of both SBP and cholesterol compared with multivariate methods. Also, the two-step method makes better use of the information available, resulting in smaller confidence intervals. Results comparing recent and past exposure indicated that past SBP is more important than recent SBP in terms of its effect on coronary heart disease mortality, while both recent and past values seem to be important for effects of cholesterol on coronary heart disease mortality and effects of SBP on stroke mortality. Associations between serum cholesterol concentration and risk of stroke mortality are weak.
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.
2003-01-01
Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
[Theory, method and application of method R on estimation of (co)variance components].
Liu, Wen-Zhong
2004-07-01
Theory, method and application of Method R on estimation of (co)variance components were reviewed in order to make the method be reasonably used. Estimation requires R values,which are regressions of predicted random effects that are calculated using complete dataset on predicted random effects that are calculated using random subsets of the same data. By using multivariate iteration algorithm based on a transformation matrix,and combining with the preconditioned conjugate gradient to solve the mixed model equations, the computation efficiency of Method R is much improved. Method R is computationally inexpensive,and the sampling errors and approximate credible intervals of estimates can be obtained. Disadvantages of Method R include a larger sampling variance than other methods for the same data,and biased estimates in small datasets. As an alternative method, Method R can be used in larger datasets. It is necessary to study its theoretical properties and broaden its application range further.
MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.
Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin
2015-04-01
Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype-phenotype models. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
STAMATE, MIRELA CRISTINA; TODOR, NICOLAE; COSGAREA, MARCEL
2015-01-01
Background and aim The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has been long studied. Both transient otoacoustic emissions and distorsion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine transient otoacoustic emissions and distorsion products performance in identifying normal and impaired hearing loss, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. Methods The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, otoacoustic emission tests. We chose to use the logistic regression as a multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristics curve analysis. We used the logistic score to generate receivers operating curves and to estimate the areas under the curves in order to compare different multivariate analyses. Results We compared the performance of each otoacoustic emission (transient, distorsion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve proving the performance of the otoacoustic emissions. Each otoacoustic emission test presented high values of area under the curve, suggesting that implementing a multivariate approach to evaluate the performances of each otoacoustic emission test would serve to increase the accuracy in identifying the normal and impaired ears. We encountered the highest area under the curve value for the combined multivariate analysis suggesting that both otoacoustic emission tests should be used in assessing hearing status. Our multivariate analyses revealed that age is a constant predictor factor of the auditory status for both ears, but the presence of tinnitus was the most important predictor for the hearing level, only for the left ear. Age presented similar coefficients, but tinnitus coefficients, by their high value, produced the highest variations of the logistic scores, only for the left ear group, thus increasing the risk of hearing loss. We did not find gender differences between ears for any otoacoustic emission tests, but studies still debate this question as the results are contradictory. Neither gender, nor environment origin had any predictive value for the hearing status, according to the results of our study. Conclusion Like any other audiological test, using otoacoustic emissions to identify hearing loss is not without error. Even when applying multivariate analysis, perfect test performance is never achieved. Although most studies demonstrated the benefit of using the multivariate analysis, it has not been incorporated into clinical decisions maybe because of the idiosyncratic nature of multivariate solutions or because of the lack of the validation studies. PMID:26733749
Body Fat Percentage Prediction Using Intelligent Hybrid Approaches
Shao, Yuehjen E.
2014-01-01
Excess of body fat often leads to obesity. Obesity is typically associated with serious medical diseases, such as cancer, heart disease, and diabetes. Accordingly, knowing the body fat is an extremely important issue since it affects everyone's health. Although there are several ways to measure the body fat percentage (BFP), the accurate methods are often associated with hassle and/or high costs. Traditional single-stage approaches may use certain body measurements or explanatory variables to predict the BFP. Diverging from existing approaches, this study proposes new intelligent hybrid approaches to obtain fewer explanatory variables, and the proposed forecasting models are able to effectively predict the BFP. The proposed hybrid models consist of multiple regression (MR), artificial neural network (ANN), multivariate adaptive regression splines (MARS), and support vector regression (SVR) techniques. The first stage of the modeling includes the use of MR and MARS to obtain fewer but more important sets of explanatory variables. In the second stage, the remaining important variables are served as inputs for the other forecasting methods. A real dataset was used to demonstrate the development of the proposed hybrid models. The prediction results revealed that the proposed hybrid schemes outperformed the typical, single-stage forecasting models. PMID:24723804
Multivariate regression model for partitioning tree volume of white oak into round-product classes
Daniel A. Yaussy; David L. Sonderman
1984-01-01
Describes the development of multivariate equations that predict the expected cubic volume of four round-product classes from independent variables composed of individual tree-quality characteristics. Although the model has limited application at this time, it does demonstrate the feasibility of partitioning total tree cubic volume into round-product classes based on...
Women's empowerment and choice of contraceptive methods in selected African countries.
Do, Mai; Kurimoto, Nami
2012-03-01
It is generally believed that women's lack of decision-making power may restrict their use of modern contraceptives. However, few studies have examined the different dimensions of women's empowerment and contraceptive use in African countries. Data came from the latest round of Demographic and Health Surveys conducted between 2006 and 2008 in Namibia, Zambia, Ghana and Uganda. Responses from married or cohabiting women aged 15-49 were analyzed for six dimensions of empowerment and the current use of female-only methods or couple methods. Bivariate and multivariate multinomial regressions were used to identify associations between the empowerment dimensions and method use. Positive associations were found between the overall empowerment score and method use in all countries (relative risk ratios, 1.1-1.3). In multivariate analysis, household economic decision making was associated with the use of either female-only or couple methods (1.1 for all), as was agreement on fertility preferences (1.3-1.6) and the ability to negotiate sexual activity (1.1-1.2). In Namibia, women's negative attitudes toward domestic violence were correlated with the use of couple methods (1.1). Intervention programs aimed at increasing contraceptive use may need to involve different approaches, including promoting couples' discussion of fertility preferences and family planning, improving women's self-efficacy in negotiating sexual activity and increasing their economic independence.
Basques, Bryce A.; Golinvaux, Nicholas S.; Bohl, Daniel D.; Yacob, Alem; Toy, Jason O.; Varthi, Arya G.; Grauer, Jonathan N.
2014-01-01
Study Design Retrospective database review. Objective To evaluate whether microscope use during spine procedures is associated with increased operating room times or increased risk of infection. Summary of Background Data Operating microscopes are commonly used in spine procedures. It is debated whether the use of an operating microscope increases operating room time or confers increased risk of infection. Methods The American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database, which includes data from over 370 participating hospitals, was used to identify patients undergoing elective spinal procedures with and without an operating microscope for the years 2011 and 2012. Bivariate and multivariate linear regressions were used to test the association between microscope use and operating room times. Bivariate and multivariate logistic regressions were similarly conducted to test the association between microscope use and infection occurrence within 30 days of surgery. Results A total of 23,670 elective spine procedures were identified, of which 2,226 (9.4%) used an operating microscope. The average patient age was 55.1 ± 14.4 years. The average operative time (incision to closure) was 125.7 ± 82.0 minutes. Microscope use was associated with minor increases in preoperative room time (+2.9 minutes, p=0.013), operative time (+13.2 minutes, p<0.001), and total room time (+18.6 minutes, p<0.001) on multivariate analysis. A total of 328 (1.4%) patients had an infection within 30 days of surgery. Multivariate analysis revealed no significant difference between the microscope and non-microscope groups for occurrence of any infection, superficial surgical site infection (SSI), deep SSI, organ space infection, or sepsis/septic shock, regardless of surgery type. Conclusions We did not find operating room times or infection risk to be significant deterrents for use of an operating microscope during spine surgery. PMID:25188600
A new method for reconstruction of solar irradiance
NASA Astrophysics Data System (ADS)
Privalsky, Victor
2018-07-01
The purpose of this research is to show how time series should be reconstructed using an example with the data on total solar irradiation (TSI) of the Earth and on sunspot numbers (SSN) since 1749. The traditional approach through regression equation(s) is designed for time-invariant vectors of random variables and is not applicable to time series, which present random functions of time. The autoregressive reconstruction (ARR) method suggested here requires fitting a multivariate stochastic difference equation to the target/proxy time series. The reconstruction is done through the scalar equation for the target time series with the white noise term excluded. The time series approach is shown to provide a better reconstruction of TSI than the correlation/regression method. A reconstruction criterion is introduced which allows one to define in advance the achievable level of success in the reconstruction. The conclusion is that time series, including the total solar irradiance, cannot be reconstructed properly if the data are not treated as sample records of random processes and analyzed in both time and frequency domains.
Cunningham, Marc; Brown, Niquelle; Sacher, Suzy; Hatch, Benjamin; Inglis, Andrew; Aronovich, Dana
2015-01-01
Background: Contraceptive prevalence rate (CPR) is a vital indicator used by country governments, international donors, and other stakeholders for measuring progress in family planning programs against country targets and global initiatives as well as for estimating health outcomes. Because of the need for more frequent CPR estimates than population-based surveys currently provide, alternative approaches for estimating CPRs are being explored, including using contraceptive logistics data. Methods: Using data from the Demographic and Health Surveys (DHS) in 30 countries, population data from the United States Census Bureau International Database, and logistics data from the Procurement Planning and Monitoring Report (PPMR) and the Pipeline Monitoring and Procurement Planning System (PipeLine), we developed and evaluated 3 models to generate country-level, public-sector contraceptive prevalence estimates for injectable contraceptives, oral contraceptives, and male condoms. Models included: direct estimation through existing couple-years of protection (CYP) conversion factors, bivariate linear regression, and multivariate linear regression. Model evaluation consisted of comparing the referent DHS prevalence rates for each short-acting method with the model-generated prevalence rate using multiple metrics, including mean absolute error and proportion of countries where the modeled prevalence rate for each method was within 1, 2, or 5 percentage points of the DHS referent value. Results: For the methods studied, family planning use estimates from public-sector logistics data were correlated with those from the DHS, validating the quality and accuracy of current public-sector logistics data. Logistics data for oral and injectable contraceptives were significantly associated (P<.05) with the referent DHS values for both bivariate and multivariate models. For condoms, however, that association was only significant for the bivariate model. With the exception of the CYP-based model for condoms, models were able to estimate public-sector prevalence rates for each short-acting method to within 2 percentage points in at least 85% of countries. Conclusions: Public-sector contraceptive logistics data are strongly correlated with public-sector prevalence rates for short-acting methods, demonstrating the quality of current logistics data and their ability to provide relatively accurate prevalence estimates. The models provide a starting point for generating interim estimates of contraceptive use when timely survey data are unavailable. All models except the condoms CYP model performed well; the regression models were most accurate but the CYP model offers the simplest calculation method. Future work extending the research to other modern methods, relating subnational logistics data with prevalence rates, and tracking that relationship over time is needed. PMID:26374805
"Photographing money" task pricing
NASA Astrophysics Data System (ADS)
Jia, Zhongxiang
2018-05-01
"Photographing money" [1]is a self-service model under the mobile Internet. The task pricing is reasonable, related to the success of the commodity inspection. First of all, we analyzed the position of the mission and the membership, and introduced the factor of membership density, considering the influence of the number of members around the mission on the pricing. Multivariate regression of task location and membership density using MATLAB to establish the mathematical model of task pricing. At the same time, we can see from the life experience that membership reputation and the intensity of the task will also affect the pricing, and the data of the task success point is more reliable. Therefore, the successful point of the task is selected, and its reputation, task density, membership density and Multiple regression of task positions, according to which a nhew task pricing program. Finally, an objective evaluation is given of the advantages and disadvantages of the established model and solution method, and the improved method is pointed out.
Ulibarri, Monica D; Hiller, Sarah P; Lozada, Remedios; Rangel, M Gudelia; Stockman, Jamila K; Silverman, Jay G; Ojeda, Victoria D
2013-01-01
This mixed methods study examined the prevalence and characteristics of physical and sexual abuse and depression symptoms among 624 injection drug-using female sex workers (FSW-IDUs) in Tijuana and Ciudad Juarez, Mexico; a subset of 47 from Tijuana also underwent qualitative interviews. Linear regressions identified correlates of current depression symptoms. In the interviews, FSW-IDUs identified drug use as a method of coping with the trauma they experienced from abuse that occurred before and after age 18 and during the course of sex work. In a multivariate linear regression model, two factors-ever experiencing forced sex and forced sex in the context of sex work-were significantly associated with higher levels of depression symptoms. Our findings suggest the need for integrated mental health and drug abuse services for FSW-IDUs addressing history of trauma as well as for further research on violence revictimization in the context of sex work in Mexico.
Ulibarri, Monica D.; Hiller, Sarah P.; Lozada, Remedios; Rangel, M. Gudelia; Stockman, Jamila K.; Silverman, Jay G.; Ojeda, Victoria D.
2013-01-01
This mixed methods study examined the prevalence and characteristics of physical and sexual abuse and depression symptoms among 624 injection drug-using female sex workers (FSW-IDUs) in Tijuana and Ciudad Juarez, Mexico; a subset of 47 from Tijuana also underwent qualitative interviews. Linear regressions identified correlates of current depression symptoms. In the interviews, FSW-IDUs identified drug use as a method of coping with the trauma they experienced from abuse that occurred before and after age 18 and during the course of sex work. In a multivariate linear regression model, two factors—ever experiencing forced sex and forced sex in the context of sex work—were significantly associated with higher levels of depression symptoms. Our findings suggest the need for integrated mental health and drug abuse services for FSW-IDUs addressing history of trauma as well as for further research on violence revictimization in the context of sex work in Mexico. PMID:23737808
Meng, Yilin; Roux, Benoît
2015-08-11
The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost.
2015-01-01
The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost. PMID:26574437
Gouvinhas, Irene; Machado, Nelson; Carvalho, Teresa; de Almeida, José M M M; Barros, Ana I R N A
2015-01-01
Extra virgin olive oils produced from three cultivars on different maturation stages were characterized using Raman spectroscopy. Chemometric methods (principal component analysis, discriminant analysis, principal component regression and partial least squares regression) applied to Raman spectral data were utilized to evaluate and quantify the statistical differences between cultivars and their ripening process. The models for predicting the peroxide value and free acidity of olive oils showed good calibration and prediction values and presented high coefficients of determination (>0.933). Both the R(2), and the correlation equations between the measured chemical parameters, and the values predicted by each approach are presented; these comprehend both PCR and PLS, used to assess SNV normalized Raman data, as well as first and second derivative of the spectra. This study demonstrates that a combination of Raman spectroscopy with multivariate analysis methods can be useful to predict rapidly olive oil chemical characteristics during the maturation process. Copyright © 2014 Elsevier B.V. All rights reserved.
Haware, Rahul V; Bauer-Brandl, Annette; Tho, Ingunn
2010-01-01
The present work challenges a newly developed approach to tablet formulation development by using chemically identical materials (grades and brands of microcrystalline cellulose). Tablet properties with respect to process and formulation parameters (e.g. compression speed, added lubricant and Emcompress fractions) were evaluated by 2(3)-factorial designs. Tablets of constant true volume were prepared on a compaction simulator at constant pressure (approx. 100 MPa). The highly repeatable and accurate force-displacement data obtained was evaluated by simple 'in-die' Heckel method and work descriptors. Relationships and interactions between formulation, process and tablet parameters were identified and quantified by multivariate analysis techniques; principal component analysis (PCA) and partial least square regressions (PLS). The method proved to be able to distinguish between different grades of MCC and even between two different brands of the same grade (Avicel PH 101 and Vivapur 101). One example of interaction was studied in more detail by mixed level design: The interaction effect of lubricant and Emcompress on elastic recovery of Avicel PH 102 was demonstrated to be complex and non-linear using the development tool under investigation.
Gardiner, Paula; Filippelli, Amanda C; Sadikova, Ekaterina; White, Laura F; Jack, Brian W
2015-01-01
Purpose. To identify characteristics associated with the use of potentially harmful combinations of dietary supplements (DS) and cardiac prescription medications in an urban, underserved, inpatient population. Methods. Cardiac prescription medication users were identified to assess the prevalence and risk factors of potentially harmful dietary supplement-prescription medication interactions (PHDS-PMI). We examined sociodemographic and clinical characteristics for crude (χ (2) or t-tests) and adjusted multivariable logistic regression associations with the outcome. Results. Among 558 patients, there were 121 who also used a DS. Of the 110 participants having a PHDS-PMI, 25% were asked about their DS use at admission, 75% had documentation of DS in their chart, and 21% reported the intention to continue DS use after discharge. A multivariable logistic regression model noted that for every additional medication or DS taken the odds of having a PHDS-PMI increase and that those with a high school education are significantly less likely to have a PHDS-PMI than those with a college education. Conclusion. Inpatients at an urban safety net hospital taking a combination of cardiac prescription medications and DS are at a high risk of harmful supplement-drug interactions. Providers must ask about DS use and should consider the potential for interactions when having patient discussions about cardiac medications and DS.
Impact of Trichiasis Surgery on Physical Functioning in Ethiopian Patients: STAR Trial
Wolle, Meraf A.; Cassard, Sandra D.; Gower, Emily W.; Munoz, Beatriz E.; Wang, Jiangxia; Alemayehu, Wondu; West, Sheila K.
2010-01-01
Purpose To evaluate the physical functioning of Ethiopian trichiasis surgery patients before and six months after surgery. Design Nested Cohort Study Methods This study was nested within the Surgery for Trichiasis, Antibiotics to Prevent Recurrence (STAR) clinical trial conducted in Ethiopia. Demographic information, ocular examinations, and physical functioning assessments were collected before and 6 months after surgery. A single score for patients’ physical functioning was constructed using Rasch analysis. A multivariate linear regression model was used to determine if change in physical functioning was associated with change in visual acuity. Results Of the 438 participants, 411 (93.8%) had both baseline and follow-up questionnaires. Physical functioning scores at baseline ranged from −6.32 (great difficulty) to +6.01 (no difficulty). The percent of participants reporting no difficulty in physical functioning increased by 32.6%; the proportion of participants in the mild/no visual impairment category increased by 8.6%. A multivariate linear regression model showed that for every line of vision gained, physical functioning improves significantly (0.09 units; 95% CI: 0.02–0.16). Conclusions Surgery to correct trichiasis appears to improve patients’ physical functioning as measured at 6 months. More effort in promoting trichiasis surgery is essential, not only to prevent corneal blindness, but also to enable improved functioning in daily life. PMID:21333268
Adjustment of geochemical background by robust multivariate statistics
Zhou, D.
1985-01-01
Conventional analyses of exploration geochemical data assume that the background is a constant or slowly changing value, equivalent to a plane or a smoothly curved surface. However, it is better to regard the geochemical background as a rugged surface, varying with changes in geology and environment. This rugged surface can be estimated from observed geological, geochemical and environmental properties by using multivariate statistics. A method of background adjustment was developed and applied to groundwater and stream sediment reconnaissance data collected from the Hot Springs Quadrangle, South Dakota, as part of the National Uranium Resource Evaluation (NURE) program. Source-rock lithology appears to be a dominant factor controlling the chemical composition of groundwater or stream sediments. The most efficacious adjustment procedure is to regress uranium concentration on selected geochemical and environmental variables for each lithologic unit, and then to delineate anomalies by a common threshold set as a multiple of the standard deviation of the combined residuals. Robust versions of regression and RQ-mode principal components analysis techniques were used rather than ordinary techniques to guard against distortion caused by outliers Anomalies delineated by this background adjustment procedure correspond with uranium prospects much better than do anomalies delineated by conventional procedures. The procedure should be applicable to geochemical exploration at different scales for other metals. ?? 1985.
Nomogram Prediction of Overall Survival After Curative Irradiation for Uterine Cervical Cancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seo, YoungSeok; Yoo, Seong Yul; Kim, Mi-Sook
Purpose: The purpose of this study was to develop a nomogram capable of predicting the probability of 5-year survival after radical radiotherapy (RT) without chemotherapy for uterine cervical cancer. Methods and Materials: We retrospectively analyzed 549 patients that underwent radical RT for uterine cervical cancer between March 1994 and April 2002 at our institution. Multivariate analysis using Cox proportional hazards regression was performed and this Cox model was used as the basis for the devised nomogram. The model was internally validated for discrimination and calibration by bootstrap resampling. Results: By multivariate regression analysis, the model showed that age, hemoglobin levelmore » before RT, Federation Internationale de Gynecologie Obstetrique (FIGO) stage, maximal tumor diameter, lymph node status, and RT dose at Point A significantly predicted overall survival. The survival prediction model demonstrated good calibration and discrimination. The bootstrap-corrected concordance index was 0.67. The predictive ability of the nomogram proved to be superior to FIGO stage (p = 0.01). Conclusions: The devised nomogram offers a significantly better level of discrimination than the FIGO staging system. In particular, it improves predictions of survival probability and could be useful for counseling patients, choosing treatment modalities and schedules, and designing clinical trials. However, before this nomogram is used clinically, it should be externally validated.« less
Mota, Natalie; Elias, Brenda; Tefft, Bruce; Medved, Maria; Munro, Garry
2012-01-01
Objectives. We examined individual, friend or family, and community or tribe correlates of suicidality in a representative on-reserve sample of First Nations adolescents. Methods. Data came from the 2002–2003 Manitoba First Nations Regional Longitudinal Health Survey of Youth. Interviews were conducted with adolescents aged 12 to 17 years (n = 1125) from 23 First Nations communities in Manitoba. We used bivariate logistic regression analyses to examine the relationships between a range of factors and lifetime suicidality. We conducted sex-by-correlate interactions for each significant correlate at the bivariate level. A multivariate logistic regression analysis identified those correlates most strongly related to suicidality. Results. We found several variables to be associated with an increased likelihood of suicidality in the multivariate model, including being female, depressed mood, abuse or fear of abuse, a hospital stay, and substance use (adjusted odds ratio range = 2.43–11.73). Perceived community caring was protective against suicidality (adjusted odds ratio = 0.93; 95% confidence interval = 0.88, 0.97) in the same model. Conclusions. Results of this study may be important in informing First Nations and government policy related to the implementation of suicide prevention strategies in First Nations communities. PMID:22676500
High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics
Carvalho, Carlos M.; Chang, Jeffrey; Lucas, Joseph E.; Nevins, Joseph R.; Wang, Quanli; West, Mike
2010-01-01
We describe studies in molecular profiling and biological pathway analysis that use sparse latent factor and regression models for microarray gene expression data. We discuss breast cancer applications and key aspects of the modeling and computational methodology. Our case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers. Based on the metaphor of statistically derived “factors” as representing biological “subpathway” structure, we explore the decomposition of fitted sparse factor models into pathway subcomponents and investigate how these components overlay multiple aspects of known biological activity. Our methodology is based on sparsity modeling of multivariate regression, ANOVA, and latent factor models, as well as a class of models that combines all components. Hierarchical sparsity priors address questions of dimension reduction and multiple comparisons, as well as scalability of the methodology. The models include practically relevant non-Gaussian/nonparametric components for latent structure, underlying often quite complex non-Gaussianity in multivariate expression patterns. Model search and fitting are addressed through stochastic simulation and evolutionary stochastic search methods that are exemplified in the oncogenic pathway studies. Supplementary supporting material provides more details of the applications, as well as examples of the use of freely available software tools for implementing the methodology. PMID:21218139
Moazzez, Ashkan; de Virgilio, Christian
2016-10-01
With constant changes in health-care laws and payment methods, profitability, and financial sustainability of hospitals are of utmost importance. The purpose of this study is to determine the relationship between surgical services and hospital profitability. The Office of Statewide Health Planning and Development annual financial databases for the years 2009 to 2011 were used for this study. The hospitals' characteristics and income statement elements were extracted for statistical analysis using bivariate and multivariate linear regression. A total of 989 financial records of 339 hospitals were included. On bivariate analysis, the number of inpatient and ambulatory operating rooms (ORs), the number of cases done both as inpatient and outpatient in each OR, and the average minutes used in inpatient ORs were significantly related with the net income of the hospital. On multivariate regression analysis, when controlling for hospitals' payer mix and the study year, only the number of inpatient cases done in the inpatient ORs (β = 832, P = 0.037), and the number of ambulatory ORs (β = 1,485, 466, P = 0.001) were significantly related with the net income of the hospital. These findings suggest that hospitals can maximize their profitability by diverting and allocating outpatient surgeries to ambulatory ORs, to allow for more inpatient surgeries.
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin
2013-01-01
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin
2013-10-15
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
Written violence policies and risk of physical assault against Minnesota educators.
Feda, Denise M; Gerberich, Susan G; Ryan, Andrew D; Nachreiner, Nancy M; McGovern, Patricia M
2010-12-01
Few research studies on school violence policies use quantitative methods to evaluate the impact of policies on workplace violence. This study analyzed nine different written violence policies and their impact on work-related physical assault in educational settings. Data were from the Minnesota Educators' Study. This large, nested case control study included cases (n=372) who reported physical assaults within the last year, and controls (n=1116) who did not. Multivariate logistic regression analyses, using directed acyclic graphs, estimated risk of assault. Results of the adjusted multivariate model suggested decreased risks of physical assault were associated with the presence of policies regarding how to report sexual harassment, verbal abuse, and threat (OR 0.53; 95 per cent CI: 0.30-0.95); assurance of confidential reporting of events (OR 0.67; 95 per cent CI: 0.44-1.04); and zero tolerance for violence (OR 0.70; 95 per cent CI: 0.47-1.04).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dyar, M. Darby; McCanta, Molly; Breves, Elly
2016-03-01
Pre-edge features in the K absorption edge of X-ray absorption spectra are commonly used to predict Fe3+ valence state in silicate glasses. However, this study shows that using the entire spectral region from the pre-edge into the extended X-ray absorption fine-structure region provides more accurate results when combined with multivariate analysis techniques. The least absolute shrinkage and selection operator (lasso) regression technique yields %Fe3+ values that are accurate to ±3.6% absolute when the full spectral region is employed. This method can be used across a broad range of glass compositions, is easily automated, and is demonstrated to yield accurate resultsmore » from different synchrotrons. It will enable future studies involving X-ray mapping of redox gradients on standard thin sections at 1 × 1 μm pixel sizes.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dyar, M. Darby; McCanta, Molly; Breves, Elly
2016-03-01
Pre-edge features in the K absorption edge of X-ray absorption spectra are commonly used to predict Fe 3+ valence state in silicate glasses. However, this study shows that using the entire spectral region from the pre-edge into the extended X-ray absorption fine-structure region provides more accurate results when combined with multivariate analysis techniques. The least absolute shrinkage and selection operator (lasso) regression technique yields %Fe 3+ values that are accurate to ±3.6% absolute when the full spectral region is employed. This method can be used across a broad range of glass compositions, is easily automated, and is demonstrated to yieldmore » accurate results from different synchrotrons. It will enable future studies involving X-ray mapping of redox gradients on standard thin sections at 1 × 1 μm pixel sizes.« less
Hassan, Che Hashim
2013-01-01
Objective To examine the relationship between socioeconomic factors affecting contraceptive use among tribal women of Bangladesh with focusing on son preference over daughter. Materials and methods The study used data gathered through a cross sectional survey on four tribal communities resided in the Rangamati Hill District of the Chittagong Hill Tracts, Bangladesh. A multistage random sampling procedure was applied to collect data from 865 currently married women of whom 806 women were currently married, non-pregnant and had at least one living child, which are the basis of this study. The information was recorded in a pre-structured questionnaire. Simple cross tabulation, chi-square tests and logistic regression analyses were performed to analyzing data. Results The contraceptive prevalence rate among the study tribal women was 73%. The multivariate analyses yielded quantitatively important and reliable estimates of likelihood of contraceptive use. Findings revealed that after controlling for other variables, the likelihood of contraceptive use was found not to be significant among women with at least one son than those who had only daughters, indicating no preference of son over daughter. Multivariate logistic regression analysis suggests that home visitations by family planning workers, tribal identity, place of residence, husband's education, and type of family, television ownership, electricity connection in the household and number of times married are important determinants of any contraceptive method use among the tribal women. Conclusion The contraceptive use rate among the disadvantaged tribal women was more than that of the national level. Door-step delivery services of modern methods should be reached and available targeting the poor and remote zones. PMID:24971107
Linear Least Squares for Correlated Data
NASA Technical Reports Server (NTRS)
Dean, Edwin B.
1988-01-01
Throughout the literature authors have consistently discussed the suspicion that regression results were less than satisfactory when the independent variables were correlated. Camm, Gulledge, and Womer, and Womer and Marcotte provide excellent applied examples of these concerns. Many authors have obtained partial solutions for this problem as discussed by Womer and Marcotte and Wonnacott and Wonnacott, which result in generalized least squares algorithms to solve restrictive cases. This paper presents a simple but relatively general multivariate method for obtaining linear least squares coefficients which are free of the statistical distortion created by correlated independent variables.
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert M.
2013-01-01
A new regression model search algorithm was developed that may be applied to both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The algorithm is a simplified version of a more complex algorithm that was originally developed for the NASA Ames Balance Calibration Laboratory. The new algorithm performs regression model term reduction to prevent overfitting of data. It has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a regression model search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression model. Therefore, the simplified algorithm is not intended to replace the original algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new search algorithm.
Wójcicki, Krzysztof; Khmelinskii, Igor; Sikorski, Marek; Sikorska, Ewa
2015-11-15
Infrared spectroscopic techniques and chemometric methods were used to study oxidation of olive, sunflower and rapeseed oils. Accelerated oxidative degradation of oils at 60°C was monitored using peroxide values and FT-MIR ATR and FT-NIR transmittance spectroscopy. Principal component analysis (PCA) facilitated visualization and interpretation of spectral changes occurring during oxidation. Multivariate curve resolution (MCR) method found three spectral components in the NIR and MIR spectral matrix, corresponding to the oxidation products, and saturated and unsaturated structures. Good quantitative relation was found between peroxide value and contribution of oxidation products evaluated using MCR--based on NIR (R(2) = 0.890), MIR (R(2) = 0.707) and combined NIR and MIR (R(2) = 0.747) data. Calibration models for prediction peroxide value established using partial least squares (PLS) regression were characterized for MIR (R(2) = 0.701, RPD = 1.7), NIR (R(2) = 0.970, RPD = 5.3), and combined NIR and MIR data (R(2) = 0.954, RPD = 3.1). Copyright © 2015 Elsevier Ltd. All rights reserved.
Sasakura, D; Nakayama, K; Sakamoto, T; Chikuma, T
2015-05-01
The use of transmission near infrared spectroscopy (TNIRS) is of particular interest in the pharmaceutical industry. This is because TNIRS does not require sample preparation and can analyze several tens of tablet samples in an hour. It has the capability to measure all relevant information from a tablet, while still on the production line. However, TNIRS has a narrow spectrum range and overtone vibrations often overlap. To perform content uniformity testing in tablets by TNIRS, various properties in the tableting process need to be analyzed by a multivariate prediction model, such as a Partial Least Square Regression modeling. One issue is that typical approaches require several hundred reference samples to act as the basis of the method rather than a strategically designed method. This means that many batches are needed to prepare the reference samples; this requires time and is not cost effective. Our group investigated the concentration dependence of the calibration model with a strategic design. Consequently, we developed a more effective approach to the TNIRS calibration model than the existing methodology.
Affective temperaments and suicidal ideation and behavior in mood and anxiety disorder patients.
Baldessarini, Ross J; Vázquez, Gustavo H; Tondo, Leonardo
2016-07-01
Clinical characteristics proposed to be associated with suicidal risk include affective temperament types. We tested this proposal with two methods in a large sample of subjects with mood and anxiety disorders. We assessed consecutive, consenting subjects clinically for affective temperament types and by TEMPS-A self-ratings for associations of temperament with suicidal ideation and acts, using standard bivariate methods, and multivariate logistic regression models. Among 2561 subjects (major depressive, 1171; bipolar, 919, anxiety disorders, 471), temperament-types and TEMPS-A (39-item Italian version) subscale scores differed by risk of suicidal acts or ideation. Suicidal acts and ideation were most associated with cyclothymic and dysthymic, and less with hyperthymic temperaments. These associations were sustained by multivariate modeling that included diagnosis, age, sex, and diagnosis. Not all subjects completed TEMPS-A self-ratings; clinical assessments of temperaments were not standardized, and long-term stability of temperament assessments was not tested. The findings support and extend associations of cyclothymic-dysthymic temperaments with suicidal acts and ideation, whereas hyperthymic temperament may be protective. Copyright © 2016 Elsevier B.V. All rights reserved.
Wilke, Marko
2018-02-01
This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1-75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php.
Ye, Dong-qing; Hu, Yi-song; Li, Xiang-pei; Huang, Fen; Yang, Shi-gui; Hao, Jia-hu; Yin, Jing; Zhang, Guo-qing; Liu, Hui-hui
2004-11-01
To explore the impact of environmental factors, daily lifestyle, psycho-social factors and the interactions between environmental factors and chemokines genes on systemic lupus erythematosus (SLE). Case-control study was carried out and environmental factors for SLE were analyzed by univariate and multivariate unconditional logistic regression. Interactions between environmental factors and chemokines polymorphism contributing to systemic lupus erythematosus were also analyzed by logistic regression model. There were nineteen factors associated with SLE when univariate unconditional logistic regression was used. However, when multivariate unconditional logistic regression was used, only five factors showed having impacts on the disease, in which drinking well water (OR=0.099) was protective factor for SLE, and multiple drug allergy (OR=8.174), over-exposure to sunshine (OR=18.339), taking antibiotics (OR=9.630) and oral contraceptives were risk factors for SLE. When unconditional logistic regression model was used, results showed that there was interaction between eating irritable food and -2518MCP-1G/G genotype (OR=4.387). No interaction between environmental factors was found that contributing to SLE in this study. Many environmental factors were related to SLE, and there was an interaction between -2518MCP-1G/G genotype and eating irritable food.
Miki, Takako; Kochi, Takeshi; Kuwahara, Keisuke; Eguchi, Masafumi; Kurotani, Kayo; Tsuruoka, Hiroko; Ito, Rie; Kabe, Isamu; Kawakami, Norito; Mizoue, Tetsuya; Nanri, Akiko
2015-09-30
Depression has been linked to the overall diet using both exploratory and pre-defined methods. However, neither of these methods incorporates specific knowledge on nutrient-disease associations. The aim of the present study was to empirically identify dietary patterns using reduced rank regression and to examine their relations to depressive symptoms. Participants were 2006 Japanese employees aged 19-69 years. Depressive symptoms were assessed using the Center for Epidemiologic Studies Depression Scale. Diet was assessed using a validated, self-administered diet history questionnaire. Dietary patterns were extracted by reduced rank regression with 6 depression-related nutrients as response variables. Logistic regression was used to estimate odds ratios of depressive symptoms adjusted for potential confounders. A dietary pattern characterized by a high intake of vegetables, mushrooms, seaweeds, soybean products, green tea, potatoes, fruits, and small fish with bones and a low intake of rice was associated with fewer depressive symptoms. The multivariable-adjusted odds ratios of having depressive symptoms were 0.62 (95% confidence interval, 0.48-0.81) in the highest versus lowest tertiles of dietary score. Results suggest that adherence to a diet rich in vegetables, fruits, and typical Japanese foods, including mushrooms, seaweeds, soybean products, and green tea, is associated with a lower probability of having depressive symptoms. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
de Laat, Jos; van Weele, Michiel; van der A, Ronald
2015-04-01
An important new landmark in present day ozone research is presented through MLS satellite observations of significant ozone increases during the ozone hole season that are attributed unequivocally to declining ozone depleting substances. For many decades the Antarctic ozone hole has been the prime example of both the detrimental effects of human activities on our environment as well as how to construct effective and successful environmental policies. Nowadays atmospheric concentrations of ozone depleting substances are on the decline and first signs of recovery of stratospheric ozone and ozone in the Antarctic ozone hole have been observed. The claimed detection of significant recovery, however, is still subject of debate. In this talk we will discuss first current uncertainties in the assessment of ozone recovery in the Antarctic ozone hole by using multi-variate regression methods, and, secondly present an alternative approach to identify ozone hole recovery unequivocally. Even though multi-variate regression methods help to reduce uncertainties in estimates of ozone recovery, great care has to be taken in their application due to the existence of uncertainties and degrees of freedom in the choice of independent variables. We show that taking all uncertainties into account in the regressions the formal recovery of ozone in the Antarctic ozone hole cannot be established yet, though is likely before the end of the decade (before 2020). Rather than focusing on time and area averages of total ozone columns or ozone profiles, we argue that the time evolution of the probability distribution of vertically resolved ozone in the Antarctic ozone hole contains a better fingerprint for the detection of ozone recovery in the Antarctic ozone hole. The advantages of this method over more tradition methods of trend analyses based on spatio-temporal average ozone are discussed. The 10-year record of MLS satellite measurements of ozone in the Antarctic ozone hole shows a significant change in the distribution of ozone. The occurrence of extremely low ozone (near 100% ozone depletion) has been declining significantly in favor of the occurrence of low ozone (80-90% ozone depletion). Finally the potential for continuation of this attribution method in the light of the currently available and future planned satellite remote sensing capacity will be shortly addressed.
Cunningham, Marc; Bock, Ariella; Brown, Niquelle; Sacher, Suzy; Hatch, Benjamin; Inglis, Andrew; Aronovich, Dana
2015-09-01
Contraceptive prevalence rate (CPR) is a vital indicator used by country governments, international donors, and other stakeholders for measuring progress in family planning programs against country targets and global initiatives as well as for estimating health outcomes. Because of the need for more frequent CPR estimates than population-based surveys currently provide, alternative approaches for estimating CPRs are being explored, including using contraceptive logistics data. Using data from the Demographic and Health Surveys (DHS) in 30 countries, population data from the United States Census Bureau International Database, and logistics data from the Procurement Planning and Monitoring Report (PPMR) and the Pipeline Monitoring and Procurement Planning System (PipeLine), we developed and evaluated 3 models to generate country-level, public-sector contraceptive prevalence estimates for injectable contraceptives, oral contraceptives, and male condoms. Models included: direct estimation through existing couple-years of protection (CYP) conversion factors, bivariate linear regression, and multivariate linear regression. Model evaluation consisted of comparing the referent DHS prevalence rates for each short-acting method with the model-generated prevalence rate using multiple metrics, including mean absolute error and proportion of countries where the modeled prevalence rate for each method was within 1, 2, or 5 percentage points of the DHS referent value. For the methods studied, family planning use estimates from public-sector logistics data were correlated with those from the DHS, validating the quality and accuracy of current public-sector logistics data. Logistics data for oral and injectable contraceptives were significantly associated (P<.05) with the referent DHS values for both bivariate and multivariate models. For condoms, however, that association was only significant for the bivariate model. With the exception of the CYP-based model for condoms, models were able to estimate public-sector prevalence rates for each short-acting method to within 2 percentage points in at least 85% of countries. Public-sector contraceptive logistics data are strongly correlated with public-sector prevalence rates for short-acting methods, demonstrating the quality of current logistics data and their ability to provide relatively accurate prevalence estimates. The models provide a starting point for generating interim estimates of contraceptive use when timely survey data are unavailable. All models except the condoms CYP model performed well; the regression models were most accurate but the CYP model offers the simplest calculation method. Future work extending the research to other modern methods, relating subnational logistics data with prevalence rates, and tracking that relationship over time is needed. © Cunningham et al.
Han, Sheng-Nan
2014-07-01
Chemometrics is a new branch of chemistry which is widely applied to various fields of analytical chemistry. Chemometrics can use theories and methods of mathematics, statistics, computer science and other related disciplines to optimize the chemical measurement process and maximize access to acquire chemical information and other information on material systems by analyzing chemical measurement data. In recent years, traditional Chinese medicine has attracted widespread attention. In the research of traditional Chinese medicine, it has been a key problem that how to interpret the relationship between various chemical components and its efficacy, which seriously restricts the modernization of Chinese medicine. As chemometrics brings the multivariate analysis methods into the chemical research, it has been applied as an effective research tool in the composition-activity relationship research of Chinese medicine. This article reviews the applications of chemometrics methods in the composition-activity relationship research in recent years. The applications of multivariate statistical analysis methods (such as regression analysis, correlation analysis, principal component analysis, etc. ) and artificial neural network (such as back propagation artificial neural network, radical basis function neural network, support vector machine, etc. ) are summarized, including the brief fundamental principles, the research contents and the advantages and disadvantages. Finally, the existing main problems and prospects of its future researches are proposed.
Xuan Chi; Barry Goodwin
2012-01-01
Spatial and temporal relationships among agricultural prices have been an important topic of applied research for many years. Such research is used to investigate the performance of markets and to examine linkages up and down the marketing chain. This research has empirically evaluated price linkages by using correlation and regression models and, later, linear and...
Multivariate time series analysis of neuroscience data: some challenges and opportunities.
Pourahmadi, Mohsen; Noorbaloochi, Siamak
2016-04-01
Neuroimaging data may be viewed as high-dimensional multivariate time series, and analyzed using techniques from regression analysis, time series analysis and spatiotemporal analysis. We discuss issues related to data quality, model specification, estimation, interpretation, dimensionality and causality. Some recent research areas addressing aspects of some recurring challenges are introduced. Copyright © 2015 Elsevier Ltd. All rights reserved.
Factors related to treatment refusal in Taiwanese cancer patients.
Chiang, Ting-Yu; Wang, Chao-Hui; Lin, Yu-Fen; Chou, Shu-Lan; Wang, Ching-Ting; Juang, Hsiao-Ting; Lin, Yung-Chang; Lin, Mei-Hsiang
2015-01-01
Incidence and mortality rates for cancer have increased dramatically in the recent 30 years in Taiwan. However, not all patients receive treatment. Treatment refusal might impair patient survival and life quality. In order to improve this situation, we proposed this study to evaluate factors that are related to refusal of treatment in cancer patients via a cancer case manager system. This study analysed data from a case management system during the period from 2010 to 2012 at a medical center in Northern Taiwan. We enrolled a total of 14,974 patients who were diagnosed with cancer. Using the PRECEDE Model as a framework, we conducted logistic regression analysis to identify independent variables that are significantly associated with refusal of therapy in cancer patients. A multivariate logistic regression model was also applied to estimate adjusted the odds ratios (ORs) with 95% confidence intervals (95%CI). A total of 253 patients (1.69%) refused treatment. The multivariate logistic regression result showed that the high risk factors for refusal of treatment in cancer patient included: concerns about adverse effects (p<0.001), poor performance(p<0.001), changes in medical condition (p<0.001), timing of case manager contact (p=.026), the methods by which case manager contact patients (p<0.001) and the frequency that case managers contact patients (≥10times) (p=0.016). Cancer patients who refuse treatment have poor survival. The present study provides evidence of factors that are related to refusal of therapy and might be helpful for further application and improvement of cancer care.
Krige, Jake E; Jonas, Eduard; Thomson, Sandie R; Kotze, Urda K; Setshedi, Mashiko; Navsaria, Pradeep H; Nicol, Andrew J
2017-01-01
AIM To benchmark severity of complications using the Accordion Severity Grading System (ASGS) in patients undergoing operation for severe pancreatic injuries. METHODS A prospective institutional database of 461 patients with pancreatic injuries treated from 1990 to 2015 was reviewed. One hundred and thirty patients with AAST grade 3, 4 or 5 pancreatic injuries underwent resection (pancreatoduodenectomy, n = 20, distal pancreatectomy, n = 110), including 30 who had an initial damage control laparotomy (DCL) and later definitive surgery. AAST injury grades, type of pancreatic resection, need for DCL and incidence and ASGS severity of complications were assessed. Uni- and multivariate logistic regression analysis was applied. RESULTS Overall 238 complications occurred in 95 (73%) patients of which 73% were ASGS grades 3-6. Nineteen patients (14.6%) died. Patients more likely to have complications after pancreatic resection were older, had a revised trauma score (RTS) < 7.8, were shocked on admission, had grade 5 injuries of the head and neck of the pancreas with associated vascular and duodenal injuries, required a DCL, received a larger blood transfusion, had a pancreatoduodenectomy (PD) and repeat laparotomies. Applying univariate logistic regression analysis, mechanism of injury, RTS < 7.8, shock on admission, DCL, increasing AAST grade and type of pancreatic resection were significant variables for complications. Multivariate logistic regression analysis however showed that only age and type of pancreatic resection (PD) were significant. CONCLUSION This ASGS-based study benchmarked postoperative morbidity after pancreatic resection for trauma. The detailed outcome analysis provided may serve as a reference for future institutional comparisons. PMID:28396721
Lee, Chia Ee; Vincent-Chong, Vui King; Ramanathan, Anand; Kallarakkal, Thomas George; Karen-Ng, Lee Peng; Ghani, Wan Maria Nabillah; Rahman, Zainal Ariff Abdul; Ismail, Siti Mazlipah; Abraham, Mannil Thomas; Tay, Keng Kiong; Mustafa, Wan Mahadzir Wan; Cheong, Sok Ching; Zain, Rosnah Binti
2015-01-01
BACKGROUND: Collagen Triple Helix Repeat Containing 1 (CTHRC1) is a protein often found to be over-expressed in various types of human cancers. However, correlation between CTHRC1 expression level with clinico-pathological characteristics and prognosis in oral cancer remains unclear. Therefore, this study aimed to determine mRNA and protein expression of CTHRC1 in oral squamous cell carcinoma (OSCC) and to evaluate the clinical and prognostic impact of CTHRC1 in OSCC. METHODS: In this study, mRNA and protein expression of CTHRC1 in OSCCs were determined by quantitative PCR and immunohistochemistry, respectively. The association between CTHRC1 and clinico-pathological parameters were evaluated by univariate and multivariate binary logistic regression analyses. Correlation between CTHRC1 protein expressions with survival were analysed using Kaplan-Meier and Cox regression models. RESULTS: Current study demonstrated CTHRC1 was significantly overexpressed at the mRNA level in OSCC. Univariate analyses indicated a high-expression of CTHRC1 that was significantly associated with advanced stage pTNM staging, tumour size ≥ 4 cm and positive lymph node metastasis (LNM). However, only positive LNM remained significant after adjusting with other confounder factors in multivariate logistic regression analyses. Kaplan-Meier survival analyses and Cox model demonstrated that patients with high-expression of CTHRC1 protein were associated with poor prognosis and is an independent prognostic factor in OSCC. CONCLUSION: This study indicated that over-expression of CTHRC1 potentially as an independent predictor for positive LNM and poor prognosis in OSCC. PMID:26664254
Stamou, Sotiris C; Rausch, Laura A; Kouchoukos, Nicholas T; Lobdell, Kevin W; Khabbaz, Kamal; Murphy, Edward; Hagberg, Robert C
2016-07-01
The goal of this study was to compare early postoperative outcomes and actuarial-free survival between patients who underwent repair of acute type A aortic dissection by the method of cerebral perfusion used. A total of 324 patients from five academic medical centers underwent repair of acute type A aortic dissection between January 2000 and December 2010. Of those, antegrade cerebral perfusion (ACP) was used for 84 patients, retrograde cerebral perfusion (RCP) was used for 55 patients, and deep hypothermic circulatory arrest (DHCA) was used for 184 patients during repair. Major morbidity, operative mortality, and 5-year actuarial survival were compared between groups. Multivariate logistic regression was used to determine predictors of operative mortality and Cox Regression hazard ratios were calculated to determine the predictors of long term mortality. Operative mortality was not influenced by the type of cerebral protection (19% for ACP, 14.5% for RCP and 19.1% for DHCA, P=0.729). In multivariable logistic regression analysis, hemodynamic instability [odds ratio (OR) =19.6, 95% confidence intervals (CI), 0.102-0.414, P<0.001] and CPB time >200 min(OR =4.7, 95% CI, 1.962-1.072, P=0.029) emerged as independent predictors of operative mortality. Actuarial 5-year survival was unchanged by cerebral protection modality (48.8% for ACP, 61.8% for RCP and 66.8% for no cerebral protection, log-rank P=0.844). During surgical repair of type A aortic dissection, ACP, RCP or DHCA are safe strategies for cerebral protection in selected patients with type A aortic dissection.
Agier, Lydiane; Portengen, Lützen; Chadeau-Hyam, Marc; Basagaña, Xavier; Giorgis-Allemand, Lise; Siroux, Valérie; Robinson, Oliver; Vlaanderen, Jelle; González, Juan R; Nieuwenhuijsen, Mark J; Vineis, Paolo; Vrijheid, Martine; Slama, Rémy; Vermeulen, Roel
2016-12-01
The exposome constitutes a promising framework to improve understanding of the effects of environmental exposures on health by explicitly considering multiple testing and avoiding selective reporting. However, exposome studies are challenged by the simultaneous consideration of many correlated exposures. We compared the performances of linear regression-based statistical methods in assessing exposome-health associations. In a simulation study, we generated 237 exposure covariates with a realistic correlation structure and with a health outcome linearly related to 0 to 25 of these covariates. Statistical methods were compared primarily in terms of false discovery proportion (FDP) and sensitivity. On average over all simulation settings, the elastic net and sparse partial least-squares regression showed a sensitivity of 76% and an FDP of 44%; Graphical Unit Evolutionary Stochastic Search (GUESS) and the deletion/substitution/addition (DSA) algorithm revealed a sensitivity of 81% and an FDP of 34%. The environment-wide association study (EWAS) underperformed these methods in terms of FDP (average FDP, 86%) despite a higher sensitivity. Performances decreased considerably when assuming an exposome exposure matrix with high levels of correlation between covariates. Correlation between exposures is a challenge for exposome research, and the statistical methods investigated in this study were limited in their ability to efficiently differentiate true predictors from correlated covariates in a realistic exposome context. Although GUESS and DSA provided a marginally better balance between sensitivity and FDP, they did not outperform the other multivariate methods across all scenarios and properties examined, and computational complexity and flexibility should also be considered when choosing between these methods. Citation: Agier L, Portengen L, Chadeau-Hyam M, Basagaña X, Giorgis-Allemand L, Siroux V, Robinson O, Vlaanderen J, González JR, Nieuwenhuijsen MJ, Vineis P, Vrijheid M, Slama R, Vermeulen R. 2016. A systematic comparison of linear regression-based statistical methods to assess exposome-health associations. Environ Health Perspect 124:1848-1856; http://dx.doi.org/10.1289/EHP172.
Tangen, C M; Koch, G G
1999-03-01
In the randomized clinical trial setting, controlling for covariates is expected to produce variance reduction for the treatment parameter estimate and to adjust for random imbalances of covariates between the treatment groups. However, for the logistic regression model, variance reduction is not obviously obtained. This can lead to concerns about the assumptions of the logistic model. We introduce a complementary nonparametric method for covariate adjustment. It provides results that are usually compatible with expectations for analysis of covariance. The only assumptions required are based on randomization and sampling arguments. The resulting treatment parameter is a (unconditional) population average log-odds ratio that has been adjusted for random imbalance of covariates. Data from a randomized clinical trial are used to compare results from the traditional maximum likelihood logistic method with those from the nonparametric logistic method. We examine treatment parameter estimates, corresponding standard errors, and significance levels in models with and without covariate adjustment. In addition, we discuss differences between unconditional population average treatment parameters and conditional subpopulation average treatment parameters. Additional features of the nonparametric method, including stratified (multicenter) and multivariate (multivisit) analyses, are illustrated. Extensions of this methodology to the proportional odds model are also made.
Roland, Lauren T.; Kallogjeri, Dorina; Sinks, Belinda C.; Rauch, Steven D.; Shepard, Neil T.; White, Judith A.; Goebel, Joel A.
2015-01-01
Objective Test performance of a focused dizziness questionnaire’s ability to discriminate between peripheral and non-peripheral causes of vertigo. Study Design Prospective multi-center Setting Four academic centers with experienced balance specialists Patients New dizzy patients Interventions A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Main outcomes Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and non-peripheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. Results 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and non-peripheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central and other causes were considered good as measured by c-indices of 0.75, 0.7 and 0.78, respectively. Conclusions This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from non-peripheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed. PMID:26485598
Roland, Lauren T; Kallogjeri, Dorina; Sinks, Belinda C; Rauch, Steven D; Shepard, Neil T; White, Judith A; Goebel, Joel A
2015-12-01
Test performance of a focused dizziness questionnaire's ability to discriminate between peripheral and nonperipheral causes of vertigo. Prospective multicenter. Four academic centers with experienced balance specialists. New dizzy patients. A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and nonperipheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. In total, 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and nonperipheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central, and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central, and other causes was considered good as measured by c-indices of 0.75, 0.7, and 0.78, respectively. This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from nonperipheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed.
Bütof, Rebecca; Hofheinz, Frank; Zöphel, Klaus; Stadelmann, Tobias; Schmollack, Julia; Jentsch, Christina; Löck, Steffen; Kotzerke, Jörg; Baumann, Michael; van den Hoff, Jörg
2015-08-01
Despite ongoing efforts to develop new treatment options, the prognosis for patients with inoperable esophageal carcinoma is still poor and the reliability of individual therapy outcome prediction based on clinical parameters is not convincing. The aim of this work was to investigate whether PET can provide independent prognostic information in such a patient group and whether the tumor-to-blood standardized uptake ratio (SUR) can improve the prognostic value of tracer uptake values. (18)F-FDG PET/CT was performed in 130 consecutive patients (mean age ± SD, 63 ± 11 y; 113 men, 17 women) with newly diagnosed esophageal cancer before definitive radiochemotherapy. In the PET images, the metabolically active tumor volume (MTV) of the primary tumor was delineated with an adaptive threshold method. The blood standardized uptake value (SUV) was determined by manually delineating the aorta in the low-dose CT. SUR values were computed as the ratio of tumor SUV and blood SUV. Uptake values were scan-time-corrected to 60 min after injection. Univariate Cox regression and Kaplan-Meier analysis with respect to overall survival (OS), distant metastases-free survival (DM), and locoregional tumor control (LRC) was performed. Additionally, a multivariate Cox regression including clinically relevant parameters was performed. In multivariate Cox regression with respect to OS, including T stage, N stage, and smoking state, MTV- and SUR-based parameters were significant prognostic factors for OS with similar effect size. Multivariate analysis with respect to DM revealed smoking state, MTV, and all SUR-based parameters as significant prognostic factors. The highest hazard ratios (HRs) were found for scan-time-corrected maximum SUR (HR = 3.9) and mean SUR (HR = 4.4). None of the PET parameters was associated with LRC. Univariate Cox regression with respect to LRC revealed a significant effect only for N stage greater than 0 (P = 0.048). PET provides independent prognostic information for OS and DM but not for LRC in patients with locally advanced esophageal carcinoma treated with definitive radiochemotherapy in addition to clinical parameters. Among the investigated uptake-based parameters, only SUR was an independent prognostic factor for OS and DM. These results suggest that the prognostic value of tracer uptake can be improved when characterized by SUR instead of SUV. Further investigations are required to confirm these preliminary results. © 2015 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
Igne, Benoît; de Juan, Anna; Jaumot, Joaquim; Lallemand, Jordane; Preys, Sébastien; Drennen, James K; Anderson, Carl A
2014-10-01
The implementation of a blend monitoring and control method based on a process analytical technology such as near infrared spectroscopy requires the selection and optimization of numerous criteria that will affect the monitoring outputs and expected blend end-point. Using a five component formulation, the present article contrasts the modeling strategies and end-point determination of a traditional quantitative method based on the prediction of the blend parameters employing partial least-squares regression with a qualitative strategy based on principal component analysis and Hotelling's T(2) and residual distance to the model, called Prototype. The possibility to monitor and control blend homogeneity with multivariate curve resolution was also assessed. The implementation of the above methods in the presence of designed experiments (with variation of the amount of active ingredient and excipients) and with normal operating condition samples (nominal concentrations of the active ingredient and excipients) was tested. The impact of criteria used to stop the blends (related to precision and/or accuracy) was assessed. Results demonstrated that while all methods showed similarities in their outputs, some approaches were preferred for decision making. The selectivity of regression based methods was also contrasted with the capacity of qualitative methods to determine the homogeneity of the entire formulation. Copyright © 2014. Published by Elsevier B.V.
2011-01-01
Background Meta-analysis is a popular methodology in several fields of medical research, including genetic association studies. However, the methods used for meta-analysis of association studies that report haplotypes have not been studied in detail. In this work, methods for performing meta-analysis of haplotype association studies are summarized, compared and presented in a unified framework along with an empirical evaluation of the literature. Results We present multivariate methods that use summary-based data as well as methods that use binary and count data in a generalized linear mixed model framework (logistic regression, multinomial regression and Poisson regression). The methods presented here avoid the inflation of the type I error rate that could be the result of the traditional approach of comparing a haplotype against the remaining ones, whereas, they can be fitted using standard software. Moreover, formal global tests are presented for assessing the statistical significance of the overall association. Although the methods presented here assume that the haplotypes are directly observed, they can be easily extended to allow for such an uncertainty by weighting the haplotypes by their probability. Conclusions An empirical evaluation of the published literature and a comparison against the meta-analyses that use single nucleotide polymorphisms, suggests that the studies reporting meta-analysis of haplotypes contain approximately half of the included studies and produce significant results twice more often. We show that this excess of statistically significant results, stems from the sub-optimal method of analysis used and, in approximately half of the cases, the statistical significance is refuted if the data are properly re-analyzed. Illustrative examples of code are given in Stata and it is anticipated that the methods developed in this work will be widely applied in the meta-analysis of haplotype association studies. PMID:21247440
ERIC Educational Resources Information Center
West, Lindsey M.; Davis, Telsie A.; Thompson, Martie P.; Kaslow, Nadine J.
2011-01-01
Protective factors for fostering reasons for living were examined among low-income, suicidal, African American women. Bivariate logistic regressions revealed that higher levels of optimism, spiritual well-being, and family social support predicted reasons for living. Multivariate logistic regressions indicated that spiritual well-being showed…
Asher, Anthony L; Devin, Clinton J; McCutcheon, Brandon; Chotai, Silky; Archer, Kristin R; Nian, Hui; Harrell, Frank E; McGirt, Matthew; Mummaneni, Praveen V; Shaffrey, Christopher I; Foley, Kevin; Glassman, Steven D; Bydon, Mohamad
2017-12-01
OBJECTIVE In this analysis the authors compare the characteristics of smokers to nonsmokers using demographic, socioeconomic, and comorbidity variables. They also investigate which of these characteristics are most strongly associated with smoking status. Finally, the authors investigate whether the association between known patient risk factors and disability outcome is differentially modified by patient smoking status for those who have undergone surgery for lumbar degeneration. METHODS A total of 7547 patients undergoing degenerative lumbar surgery were entered into a prospective multicenter registry (Quality Outcomes Database [QOD]). A retrospective analysis of the prospectively collected data was conducted. Patients were dichotomized as smokers (current smokers) and nonsmokers. Multivariable logistic regression analysis fitted for patient smoking status and subsequent measurement of variable importance was performed to identify the strongest patient characteristics associated with smoking status. Multivariable linear regression models fitted for 12-month Oswestry Disability Index (ODI) scores in subsets of smokers and nonsmokers was performed to investigate whether differential effects of risk factors by smoking status might be present. RESULTS In total, 18% (n = 1365) of patients were smokers and 82% (n = 6182) were nonsmokers. In a multivariable logistic regression analysis, the factors significantly associated with patients' smoking status were sex (p < 0.0001), age (p < 0.0001), body mass index (p < 0.0001), educational status (p < 0.0001), insurance status (p < 0.001), and employment/occupation (p = 0.0024). Patients with diabetes had lowers odds of being a smoker (p = 0.0008), while patients with coronary artery disease had greater odds of being a smoker (p = 0.044). Patients' propensity for smoking was also significantly associated with higher American Society of Anesthesiologists (ASA) class (p < 0.0001), anterior-alone surgical approach (p = 0.018), greater number of levels (p = 0.0246), decompression only (p = 0.0001), and higher baseline ODI score (p < 0.0001). In a multivariable proportional odds logistic regression model, the adjusted odds ratio of risk factors and direction of improvement in 12-month ODI scores remained similar between the subsets of smokers and nonsmokers. CONCLUSIONS Using a large, national, multiinstitutional registry, the authors described the profile of patients who undergo lumbar spine surgery and its association with their smoking status. Compared with nonsmokers, smokers were younger, male, nondiabetic, nonobese patients presenting with leg pain more so than back pain, with higher ASA classes, higher disability, less education, more likely to be unemployed, and with Medicaid/uninsured insurance status. Smoking status did not affect the association between these risk factors and 12-month ODI outcome, suggesting that interventions for modifiable risk factors are equally efficacious between smokers and nonsmokers.
NASA Astrophysics Data System (ADS)
Min, Qing-xu; Zhu, Jun-zhen; Feng, Fu-zhou; Xu, Chao; Sun, Ji-wei
2017-06-01
In this paper, the lock-in vibrothermography (LVT) is utilized for defect detection. Specifically, for a metal plate with an artificial fatigue crack, the temperature rise of the defective area is used for analyzing the influence of different test conditions, i.e. engagement force, excitation intensity, and modulated frequency. The multivariate nonlinear and logistic regression models are employed to estimate the POD (probability of detection) and POA (probability of alarm) of fatigue crack, respectively. The resulting optimal selection of test conditions is presented. The study aims to provide an optimized selection method of the test conditions in the vibrothermography system with the enhanced detection ability.
NASA Astrophysics Data System (ADS)
Hart, Brian K.; Griffiths, Peter R.
1998-06-01
Partial least squares (PLS) regression has been evaluated as a robust calibration technique for over 100 hazardous air pollutants (HAPs) measured by open path Fourier transform infrared (OP/FT-IR) spectrometry. PLS has the advantage over the current recommended calibration method of classical least squares (CLS), in that it can look at the whole useable spectrum (700-1300 cm-1, 2000-2150 cm-1, and 2400-3000 cm-1), and detect several analytes simultaneously. Up to one hundred HAPs synthetically added to OP/FT-IR backgrounds have been simultaneously calibrated and detected using PLS. PLS also has the advantage in requiring less preprocessing of spectra than that which is required in CLS calibration schemes, allowing PLS to provide user independent real-time analysis of OP/FT-IR spectra.
Madaniyazi, Lina; Guo, Yuming; Chen, Renjie; Kan, Haidong; Tong, Shilu
2016-01-01
Estimating the burden of mortality associated with particulates requires knowledge of exposure-response associations. However, the evidence on exposure-response associations is limited in many cities, especially in developing countries. In this study, we predicted associations of particulates smaller than 10 μm in aerodynamic diameter (PM10) with mortality in 73 Chinese cities. The meta-regression model was used to test and quantify which city-specific characteristics contributed significantly to the heterogeneity of PM10-mortality associations for 16 Chinese cities. Then, those city-specific characteristics with statistically significant regression coefficients were treated as independent variables to build multivariate meta-regression models. The model with the best fitness was used to predict PM10-mortality associations in 73 Chinese cities in 2010. Mean temperature, PM10 concentration and green space per capita could best explain the heterogeneity in PM10-mortality associations. Based on city-specific characteristics, we were able to develop multivariate meta-regression models to predict associations between air pollutants and health outcomes reasonably well. Copyright © 2015 Elsevier Ltd. All rights reserved.
Barriers to health-care and psychological distress among mothers living with HIV in Quebec (Canada).
Blais, Martin; Fernet, Mylène; Proulx-Boucher, Karène; Lebouché, Bertrand; Rodrigue, Carl; Lapointe, Normand; Otis, Joanne; Samson, Johanne
2015-01-01
Health-care providers play a major role in providing good quality care and in preventing psychological distress among mothers living with HIV (MLHIV). The objectives of this study are to explore the impact of health-care services and satisfaction with care providers on psychological distress in MLHIV. One hundred MLHIV were recruited from community and clinical settings in the province of Quebec (Canada). Prevalence estimation of clinical psychological distress and univariate and multivariable logistic regression models were performed to predict clinical psychological distress. Forty-five percent of the participants reported clinical psychological distress. In the multivariable regression, the following variables were significantly associated with psychological distress while controlling for sociodemographic variables: resilience, quality of communication with the care providers, resources, and HIV disclosure concerns. The multivariate results support the key role of personal, structural, and medical resources in understanding psychological distress among MLHIV. Interventions that can support the psychological health of MLHIV are discussed.
Finley, Andrew O.; Banerjee, Sudipto; Cook, Bruce D.; Bradford, John B.
2013-01-01
In this paper we detail a multivariate spatial regression model that couples LiDAR, hyperspectral and forest inventory data to predict forest outcome variables at a high spatial resolution. The proposed model is used to analyze forest inventory data collected on the US Forest Service Penobscot Experimental Forest (PEF), ME, USA. In addition to helping meet the regression model's assumptions, results from the PEF analysis suggest that the addition of multivariate spatial random effects improves model fit and predictive ability, compared with two commonly applied modeling approaches. This improvement results from explicitly modeling the covariation among forest outcome variables and spatial dependence among observations through the random effects. Direct application of such multivariate models to even moderately large datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. We apply a spatial dimension reduction technique to help overcome this computational hurdle without sacrificing richness in modeling.
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.
Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P
2015-11-01
To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
Zhu, Hongxiao; Morris, Jeffrey S; Wei, Fengrong; Cox, Dennis D
2017-07-01
Many scientific studies measure different types of high-dimensional signals or images from the same subject, producing multivariate functional data. These functional measurements carry different types of information about the scientific process, and a joint analysis that integrates information across them may provide new insights into the underlying mechanism for the phenomenon under study. Motivated by fluorescence spectroscopy data in a cervical pre-cancer study, a multivariate functional response regression model is proposed, which treats multivariate functional observations as responses and a common set of covariates as predictors. This novel modeling framework simultaneously accounts for correlations between functional variables and potential multi-level structures in data that are induced by experimental design. The model is fitted by performing a two-stage linear transformation-a basis expansion to each functional variable followed by principal component analysis for the concatenated basis coefficients. This transformation effectively reduces the intra-and inter-function correlations and facilitates fast and convenient calculation. A fully Bayesian approach is adopted to sample the model parameters in the transformed space, and posterior inference is performed after inverse-transforming the regression coefficients back to the original data domain. The proposed approach produces functional tests that flag local regions on the functional effects, while controlling the overall experiment-wise error rate or false discovery rate. It also enables functional discriminant analysis through posterior predictive calculation. Analysis of the fluorescence spectroscopy data reveals local regions with differential expressions across the pre-cancer and normal samples. These regions may serve as biomarkers for prognosis and disease assessment.
Hamzehgardeshi, Zeinab; Malary, Mina; Moosazadeh, Mahmood; Khani, Soghra; Pourasghar, Mehdi
2017-01-01
Background and aims Hypoactive Sexual Desire Disorder (HSDD) is the most prevalent sexual disorder among women that may pop up due to worrying about Body Image (BI). Body image is a complicated concept encompassing the biological, psychological and social factors. The current study aims to analyze the relationship between BI and HSDD among Iranian women in their reproductive age. Methods this research is cross-sectional (descriptive –analytical) performed on 1000 woman in their reproductive age (15–49 yrs). The samples have been selected by systematic random sampling method. The data collection tool includes demographics and Sexual Interest and Desire Inventory-Female (SIDI-F) plus the Female Sexual Distress Scale-Revised (FSDS-R) for measuring HSDD completed as self-report by the samples. To analyze the data, univariate and multivariate regression test have been applied. Results The mean age of the study community has been 32.09±7.33. After adjusting the effect of the confounder variables by logistic regression multivariate analysis, the odd ratio for HSDD in the individuals not satisfied or slightly satisfied with their BI has been OR: 4.2 (95% CI: 1.98–9.05) and OR: 3.9 (95% CI: 2.29–6.65), respectively, times more than those highly contended with their BI. Conclusion this study depicted that dissatisfaction with BI is a determinant factor of HSDD. Thus taking this factor into account when acquiring sexual history can be useful order to provide optimal sexual counseling.
Sigurdardottir, Lara G; Markt, Sarah C; Sigurdsson, Sigurdur; Aspelund, Thor; Fall, Katja; Schernhammer, Eva; Rider, Jennifer R; Launer, Lenore; Harris, Tamara; Stampfer, Meir J; Gudnason, Vilmundur; Czeisler, Charles A; Lockley, Steven W; Valdimarsdottir, Unnur A; Mucci, Lorelei A
2016-10-01
The pineal gland produces the hormone melatonin, and its volume may influence melatonin levels. We describe an innovative method for estimating pineal volume in humans and present the association of pineal parenchyma volume with levels of the primary melatonin metabolite, 6-sulfatoxymelatonin. We selected a random sample of 122 older Icelandic men nested within the AGES-Reykjavik cohort and measured their total pineal volume, their parenchyma volume, and the extent of calcification and cysts. For volume estimations we used manual segmentation of magnetic resonance images in the axial plane with simultaneous side-by-side view of the sagittal and coronal plane. We used multivariable adjusted linear regression models to estimate the association of pineal parenchyma volume and baseline characteristics, including 6-sulfatoxymelatonin levels. We used logistic regression to test for differences in first morning urinary 6-sulfatoxymelatonin levels among men with or without cystic or calcified glands. The pineal glands varied in volume, shape, and composition. Cysts were present in 59% of the glands and calcifications in 21%. The mean total pineal volume measured 207 mm(3) (range 65-536 mm(3)) and parenchyma volume 178 mm(3) (range 65-503 mm(3)). In multivariable-adjusted models, pineal parenchyma volume was positively correlated with 6-sulfatoxymelatonin levels (β = 0.52, p < 0.001). Levels of 6-sulfatoxymelatonin did not differ significantly by presence of cysts or calcification. By using an innovative method for pineal assessment, we found pineal parenchyma volume to be positively correlated with 6-sulfatoxymelatonin levels, in line with other recent studies. © 2016 The Author(s).
Striley, Catherine Woodstock; Kelso-Chichetto, Natalie E.; Cottler, Linda B.
2017-01-01
Purpose Little is known about the risk factors for nonmedical use (NMU) of prescription stimulants among adolescent girls. We aimed to measure the association of nonmedical prescription stimulant use with empirically linked risk factors, including weight control behavior (WCB), gambling, and depressed mood, in pre-teen and teenaged girls. Methods We assessed the relationship between age and race, gambling, WCB, depressive mood, and nonmedical prescription stimulant use using multivariable logistic regression. The study sample included 5,585 females, aged 10–18 years, recruited via an entertainment venue intercept method in 10 U.S. metropolitan areas as part of the National Monitoring of Adolescent Prescription Stimulants Study (2008–2011). Results NMU of prescription stimulants was reported by 6.6% (n = 370) of the sample. In multivariable logistic regression, 1-year increase in age was associated with a 21% (95% confidence interval [CI]: .15, .28) increase in risk for NMU. Whites and other race/ethnicity girls had 2.67 (CI: 1.85, 3.87) and 1.71 (1.11, 2.65) times higher odds for NMU, compared to African-Americans. Depressive mood (adjusted odds ratio: 2.69, CI: 2.04, 5.57) and gambling (adjusted odds ratio: 1.90, 1.23, 2.92) were associated with increased odds for NMU. A dose-response was identified between WCB and NMU, where girls with unhealthy and extreme WCB were over five times more likely to endorse NMU. Conclusions We contribute to the literature linking WCB, depression, gambling, and the NMU of prescription stimulants in any population and uniquely do so among girls. PMID:27998704
Muto, Satoru; Sugiura, Syo-Ichiro; Nakajima, Akiko; Horiuchi, Akira; Inoue, Masahiro; Saito, Keisuke; Isotani, Shuji; Yamaguchi, Raizo; Ide, Hisamitsu; Horie, Shigeo
2014-10-01
We aimed to identify patients with a chief complaint of hematuria who could safely avoid unnecessary radiation and instrumentation in the diagnosis of bladder cancer (BC), using automated urine flow cytometry to detect isomorphic red blood cells (RBCs) in urine. We acquired urine samples from 134 patients over the age of 35 years with a chief complaint of hematuria and a positive urine occult blood test or microhematuria. The data were analyzed using the UF-1000i (®) (Sysmex Co., Ltd., Kobe, Japan) automated urine flow cytometer to determine RBC morphology, which was classified as isomorphic or dysmorphic. The patients were divided into two groups (BC versus non-BC) for statistical analysis. Multivariate logistic regression analysis was used to determine the predictive value of flow cytometry versus urine cytology, the bladder tumor antigen test, occult blood in urine test, and microhematuria test. BC was confirmed in 26 of 134 patients (19.4 %). The area under the curve for RBC count using the automated urine flow cytometer was 0.94, representing the highest reference value obtained in this study. Isomorphic RBCs were detected in all patients in the BC group. On multivariate logistic regression analysis, only isomorphic RBC morphology was significantly predictive for BC (p < 0.001). Analytical parameters such as sensitivity, specificity, positive predictive value, and negative predictive value of isomorphic RBCs in urine were 100.0, 91.7, 74.3, and 100.0 %, respectively. Detection of urinary isomorphic RBCs using automated urine flow cytometry is a reliable method in the diagnosis of BC with hematuria.
Sigurdardottir, Lara G.; Markt, Sarah C.; Sigurdsson, Sigurdur; Aspelund, Thor; Fall, Katja; Schernhammer, Eva; Rider, Jennifer R.; Launer, Lenore; Harris, Tamara; Stampfer, Meir J.; Gudnason, Vilmundur; Czeisler, Charles A.; Lockley, Steven W.; Valdimarsdottir, Unnur A.; Mucci, Lorelei A.
2017-01-01
The pineal gland produces the hormone melatonin and its volume may influence melatonin levels. We describe an innovative method for estimating pineal volume in humans and present the association of pineal parenchyma volume with levels of the primary melatonin metabolite, 6-sulfatoxymelatonin. We selected a random sample of 122 older Icelandic men nested within the AGES-Reykjavik cohort and measured their total pineal volume, parenchyma volume, and the extent of calcification and cysts. For volume estimations we used manual segmentation of MR images in the axial plane with simultaneous side-by-side view of the sagittal and coronal plane. We used multivariable adjusted linear regression models to estimate the association of pineal parenchyma volume and baseline characteristics, including 6-sulfatoxymelatonin levels. We used logistic regression to test for differences in first morning urinary 6-sulfatoxymelatonin levels among men with or without cystic or calcified glands. The pineal glands varied in volume, shape and composition. Cysts were present in 59% of the glands and calcifications in 21%. The mean total pineal volume measured 207 mm3 (range 65–536 mm3) and parenchyma volume 178 mm3 (range 65–503 mm3). In multivariable-adjusted models pineal parenchyma volume was positively correlated with 6-sulfatoxymelatonin levels (β=0.52, p<0.001). 6-sulfatoxymelatonin levels did not differ significantly by presence of cysts or calcification. By using an innovative method for pineal assessment we found pineal parenchyma volume to be positively correlated with 6-sulfatoxymelatonin levels, in line with other recent studies. PMID:27449477
Kamal, S M Mostafa; Hassan, Che Hashim
2013-06-01
To examine the relationship between socioeconomic factors affecting contraceptive use among tribal women of Bangladesh with focusing on son preference over daughter. The study used data gathered through a cross sectional survey on four tribal communities resided in the Rangamati Hill District of the Chittagong Hill Tracts, Bangladesh. A multistage random sampling procedure was applied to collect data from 865 currently married women of whom 806 women were currently married, non-pregnant and had at least one living child, which are the basis of this study. The information was recorded in a pre-structured questionnaire. Simple cross tabulation, chi-square tests and logistic regression analyses were performed to analyzing data. The contraceptive prevalence rate among the study tribal women was 73%. The multivariate analyses yielded quantitatively important and reliable estimates of likelihood of contraceptive use. Findings revealed that after controlling for other variables, the likelihood of contraceptive use was found not to be significant among women with at least one son than those who had only daughters, indicating no preference of son over daughter. Multivariate logistic regression analysis suggests that home visitations by family planning workers, tribal identity, place of residence, husband's education, and type of family, television ownership, electricity connection in the household and number of times married are important determinants of any contraceptive method use among the tribal women. The contraceptive use rate among the disadvantaged tribal women was more than that of the national level. Door-step delivery services of modern methods should be reached and available targeting the poor and remote zones.
Hamilton, C. A.; Miller, A.; Casablanca, Y.; Horowitz, N. S.; Rungruang, B.; Krivak, T. C.; Richard, S. D.; Rodriguez, N.; Birrer, M.J.; Backes, F.J.; Geller, M.A.; Quinn, M.; Goodheart, M.J.; Mutch, D.G.; Kavanagh, J.J.; Maxwell, G. L.; Bookman, M. A.
2018-01-01
Objective To identify clinicopathologic factors associated with 10-year overall survival in epithelial ovarian cancer (EOC) and primary peritoneal cancer (PPC), and to develop a predictive model identifying long-term survivors. Methods Demographic, surgical, and clinicopathologic data were abstracted from GOG 182 records. The association between clinical variables and long-term survival (LTS) (>10 years) was assessed using multivariable regression analysis. Bootstrap methods were used to develop predictive models from known prognostic clinical factors and predictive accuracy was quantified using optimism-adjusted area under the receiver operating characteristic curve (AUC). Results The analysis dataset included 3,010 evaluable patients, of whom 195 survived greater than ten years. These patients were more likely to have better performance status, endometrioid histology, stage III (rather than stage IV) disease, absence of ascites, less extensive preoperative disease distribution, microscopic disease residual following cyoreduction (R0), and decreased complexity of surgery (p<0.01). Multivariable regression analysis revealed that lower CA-125 levels, absence of ascites, stage, and R0 were significant independent predictors of LTS. A predictive model created using these variables had an AUC=0.729, which outperformed any of the individual predictors. Conclusions The absence of ascites, a low CA-125, stage, and R0 at the time of cytoreduction are factors associated with LTS when controlling for other confounders. An extensively annotated clinicopathologic prediction model for LTS fell short of clinical utility suggesting that prognostic molecular profiles are needed to better predict which patients are likely to be long-term survivors. PMID:29195926
Nelson, Deborah B; Zhao, Huaqing; Corrado, Rachel; Mastrogiannnis, Dimitrios M; Lepore, Stephen J
2017-04-01
Ineffective contraceptive use among young sexually active women is extremely prevalent and poses a significant risk for unintended pregnancy (UP). Ineffective contraception involves the use of the withdrawal method or the inconsistent use of other types of contraception (i.e., condoms and birth control pills). This investigation examined violence exposure and psychological factors related to ineffective contraceptive use among young sexually active women. Young, nonpregnant sexually active women (n = 315) were recruited from an urban family planning clinic in 2013 to participate in a longitudinal study. Tablet-based surveys measured childhood violence, community-level violence, intimate partner violence, depressive symptoms, and self-esteem. Follow-up surveys measured type and consistency of contraception used 9 months later. Multivariate logistic regression models assessed violence and psychological risk factors as main effects and moderators related to ineffective compared with effective use of contraception. The multivariate logistic regression model showed that childhood sexual violence and low self-esteem were significantly related to ineffective use of contraception (adjusted odds ratio [aOR] = 2.69, confidence interval [95% CI]: 1.18-6.17, and aOR = 0.51, 95% CI: 0.28-0.93; respectively), although self-esteem did not moderate the relationship between childhood sexual violence and ineffective use of contraception (aOR = 0.38, 95% CI: 0.08-1.84). Depressive symptoms were not related to ineffective use of contraception in the multivariate model. Interventions to reduce UP should recognize the long-term effects of childhood sexual violence and address the role of low self-esteem on the ability of young sexually active women to effectively and consistently use contraception to prevent UP.
2013-01-01
Background Household survey data of Changlang district, Arunachal Pradesh, were used in the present study to assess the prevalence of opium use among different tribes, and to examine the association between sociodemographic factors and opium use. Methods A sample of 3421 individuals (1795 men and 1626 women) aged 15 years and older was analyzed using a multivariate logistic regression model to determine factors associated with opium use. Sociodemographic information such as age, education, occupation, religion, ethnicity and marital status were included in the analysis. Results The prevalence of opium use was significantly higher (10.6%) among men than among women (2.1%). It varied according to age, educational level, occupation, marital status and religion of the respondents. In both sexes, opium use was significantly higher among Singpho and Khamti tribes compared with other tribes. Multivariate logistic regression indicated that opium use was significantly associated with age, occupation, ethnicity, religion and marital status of the respondents of both sexes. Multivariate rate ratios (MRR) for opium use were significantly higher (4–6 times) among older age groups (≥35 years) and male respondents. In males, the MRR was also significantly higher in respondents of Buddhist and Indigenous religion, while in females, the MRR was significantly higher in Buddhists. Most of the female opium users had taken opium for more than 5 years and were introduced to it by their husbands after marriage. Use of other substances among opium users comprised mainly tobacco (76%) and alcohol (44%). Conclusions The study reveals the sociodemographic factors, such as age, sex, ethnicity, religion and occupation, which are associated with opium use. Such information is useful for institution of intervention measures to reduce opium use. PMID:23575143
Guo, L W; Liu, S Z; Zhang, M; Chen, Q; Zhang, S K; Sun, X B
2018-02-06
Objective: To investigate the effect of fried food intake on the pathogenesis of gastric cancer and precancerous lesions. Methods: From 2005 to 2013, the residents aged 40-69 years from 11 counties/cities where cancer screening of upper gastrointestinal cancer were conducted in rural areas of Henan province as the subjects (82 367 cases). The information such as demography and lifestyle was collected. The residents were screened with endoscopic examination. The biopsy sampleswere diagnosed pathologically, according to pathological diagnosis criteria, the subjects with high risk were divided into the groups with different pathological degrees. The multivariate ordinal logistic regression analysis was used to analyze the relationship between the frequency of fried food intake and gastric cancer and precancerous lesions. Results: The study coverd 46 425 males and 35 942 females, with a age of (53.46±8.07)years. The study collected 6 707 cases of normal stomach, 2 325 cases of low grade intraepithelial neoplasia, 226 cases of high grade intraepithelial neoplasia and 331 cases of gastric cancer. Multivariate logistic regression analysis showed that, compared with those whoeat fried food less than one time per week, fried foods intake (<2 times/week: OR= 1.89, 95 %CI: 1.57-2.28; ≥ 2 times/week: OR= 1.91, 95 %CI: 1.66-2.20) were a risk factor for gastric cancer and precancerous lesions after adjustment for age, sex, marital status, educational level, body mass index (BMI), smoking and drinking status. Conclusion: The intake of fried food is a risk factor for gastric cancer and precancerous lesions. Therefore, reducing the intake of fried food can prevent the occurrence of gastric carcinoma and precancerous lesions.
Time to antibiotics and outcomes in cancer patients with febrile neutropenia
2014-01-01
Background Febrile neutropenia is an oncologic emergency. The timing of antibiotics administration in patients with febrile neutropenia may result in adverse outcomes. Our study aims to determine time-to- antibiotic administration in patients with febrile neutropenia, and its relationship with length of hospital stay, intensive care unit monitoring, and hospital mortality. Methods The study population was comprised of adult cancer patients with febrile neutropenia who were hospitalized, at a tertiary care hospital, between January 2010 and December 2011. Using Multination Association of Supportive Care in Cancer (MASCC) risk score, the study cohort was divided into high and low risk groups. A multivariate regression analysis was performed to assess relationship between time-to- antibiotic administration and various outcome variables. Results One hundred and five eligible patients with median age of 60 years (range: 18–89) and M:F of 43:62 were identified. Thirty-seven (35%) patients were in MASCC high risk group. Median time-to- antibiotic administration was 2.5 hrs (range: 0.03-50) and median length of hospital stay was 6 days (range: 1–57). In the multivariate analysis time-to- antibiotic administration (regression coefficient [RC]: 0.31 days [95% CI: 0.13-0.48]), known source of fever (RC: 4.1 days [95% CI: 0.76-7.5]), and MASCC high risk group (RC: 4 days [95% CI: 1.1-7.0]) were significantly correlated with longer hospital stay. Of 105 patients, 5 (4.7%) died & or required ICU monitoring. In multivariate analysis no variables significantly correlated with mortality or ICU monitoring. Conclusions Our study revealed that delay in antibiotics administration has been associated with a longer hospital stay. PMID:24716604
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China.
Pei, Ling-Ling; Li, Qin; Wang, Zheng-Xin
2018-03-08
The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China's pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N )) model based on the nonlinear least square (NLS) method. The Gauss-Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N ) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N ) and the NLS-based TNGM (1, N ) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO₂ and dust, alongside GDP per capita in China during the period 1996-2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N ) model presents greater precision when forecasting WDPC, SO₂ emissions and dust emissions per capita, compared to the traditional GM (1, N ) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO₂ and dust reduce accordingly.
Multi-Target Regression via Robust Low-Rank Learning.
Zhen, Xiantong; Yu, Mengyang; He, Xiaofei; Li, Shuo
2018-02-01
Multi-target regression has recently regained great popularity due to its capability of simultaneously learning multiple relevant regression tasks and its wide applications in data mining, computer vision and medical image analysis, while great challenges arise from jointly handling inter-target correlations and input-output relationships. In this paper, we propose Multi-layer Multi-target Regression (MMR) which enables simultaneously modeling intrinsic inter-target correlations and nonlinear input-output relationships in a general framework via robust low-rank learning. Specifically, the MMR can explicitly encode inter-target correlations in a structure matrix by matrix elastic nets (MEN); the MMR can work in conjunction with the kernel trick to effectively disentangle highly complex nonlinear input-output relationships; the MMR can be efficiently solved by a new alternating optimization algorithm with guaranteed convergence. The MMR leverages the strength of kernel methods for nonlinear feature learning and the structural advantage of multi-layer learning architectures for inter-target correlation modeling. More importantly, it offers a new multi-layer learning paradigm for multi-target regression which is endowed with high generality, flexibility and expressive ability. Extensive experimental evaluation on 18 diverse real-world datasets demonstrates that our MMR can achieve consistently high performance and outperforms representative state-of-the-art algorithms, which shows its great effectiveness and generality for multivariate prediction.
Regression and Sentinel Lymph Node Status in Melanoma Progression
Letca, Alina Florentina; Ungureanu, Loredana; Şenilă, Simona Corina; Grigore, Lavinia Elena; Pop, Ştefan; Fechete, Oana; Vesa, Ştefan Cristian
2018-01-01
Background The purpose of this study was to assess the role of regression and other clinical and histological features for the prognosis and the progression of cutaneous melanoma. Material/Methods Between 2005 and 2016, 403 patients with melanoma were treated and followed at our Department of Dermatology. Of the 403 patients, 173 patients had cutaneous melanoma and underwent sentinel lymph node (SLN) biopsy and thus were included in this study. Results Histological regression was found in 37 cases of melanoma (21.3%). It was significantly associated with marked and moderate tumor-infiltrating lymphocyte (TIL) and with negative SLN. Progression of the disease occurred in 42 patients (24.2%). On multivariate analysis, we found that a positive lymph node and a Breslow index higher than 2 mm were independent variables associated with disease free survival (DFS). These variables together with a mild TIL were significantly correlated with overall survival (OS). The presence of regression was not associated with DFS or OS. Conclusions We could not demonstrate an association between regression and the outcome of patients with cutaneous melanoma. Tumor thickness greater than 2 mm and a positive SLN were associated with recurrence. Survival was influenced by a Breslow thickness >2 mm, the presence of a mild TIL and a positive SLN status. PMID:29507279
1991-09-01
However, there is no guarantee that this would work; for instance if the data were generated by an ARCH model (Tong, 1990 pp. 116-117) then a simple...Hill, R., Griffiths, W., Lutkepohl, H., and Lee, T., Introduction to the Theory and Practice of Econometrics , 2th ed., Wiley, 1985. Kendall, M., Stuart
Burgués, Javier; Marco, Santiago
2018-08-17
Metal oxide semiconductor (MOX) sensors are usually temperature-modulated and calibrated with multivariate models such as partial least squares (PLS) to increase the inherent low selectivity of this technology. The multivariate sensor response patterns exhibit heteroscedastic and correlated noise, which suggests that maximum likelihood methods should outperform PLS. One contribution of this paper is the comparison between PLS and maximum likelihood principal components regression (MLPCR) in MOX sensors. PLS is often criticized by the lack of interpretability when the model complexity increases beyond the chemical rank of the problem. This happens in MOX sensors due to cross-sensitivities to interferences, such as temperature or humidity and non-linearity. Additionally, the estimation of fundamental figures of merit, such as the limit of detection (LOD), is still not standardized in multivariate models. Orthogonalization methods, such as orthogonal projection to latent structures (O-PLS), have been successfully applied in other fields to reduce the complexity of PLS models. In this work, we propose a LOD estimation method based on applying the well-accepted univariate LOD formulas to the scores of the first component of an orthogonal PLS model. The resulting LOD is compared to the multivariate LOD range derived from error-propagation. The methodology is applied to data extracted from temperature-modulated MOX sensors (FIS SB-500-12 and Figaro TGS 3870-A04), aiming at the detection of low concentrations of carbon monoxide in the presence of uncontrolled humidity (chemical noise). We found that PLS models were simpler and more accurate than MLPCR models. Average LOD values of 0.79 ppm (FIS) and 1.06 ppm (Figaro) were found using the approach described in this paper. These values were contained within the LOD ranges obtained with the error-propagation approach. The mean LOD increased to 1.13 ppm (FIS) and 1.59 ppm (Figaro) when considering validation samples collected two weeks after calibration, which represents a 43% and 46% degradation, respectively. The orthogonal score-plot was a very convenient tool to visualize MOX sensor data and to validate the LOD estimates. Copyright © 2018 Elsevier B.V. All rights reserved.
Akkus, Zeki; Camdeviren, Handan; Celik, Fatma; Gur, Ali; Nas, Kemal
2005-09-01
To determine the risk factors of osteoporosis using a multiple binary logistic regression method and to assess the risk variables for osteoporosis, which is a major and growing health problem in many countries. We presented a case-control study, consisting of 126 postmenopausal healthy women as control group and 225 postmenopausal osteoporotic women as the case group. The study was carried out in the Department of Physical Medicine and Rehabilitation, Dicle University, Diyarbakir, Turkey between 1999-2002. The data from the 351 participants were collected using a standard questionnaire that contains 43 variables. A multiple logistic regression model was then used to evaluate the data and to find the best regression model. We classified 80.1% (281/351) of the participants using the regression model. Furthermore, the specificity value of the model was 67% (84/126) of the control group while the sensitivity value was 88% (197/225) of the case group. We found the distribution of residual values standardized for final model to be exponential using the Kolmogorow-Smirnow test (p=0.193). The receiver operating characteristic curve was found successful to predict patients with risk for osteoporosis. This study suggests that low levels of dietary calcium intake, physical activity, education, and longer duration of menopause are independent predictors of the risk of low bone density in our population. Adequate dietary calcium intake in combination with maintaining a daily physical activity, increasing educational level, decreasing birth rate, and duration of breast-feeding may contribute to healthy bones and play a role in practical prevention of osteoporosis in Southeast Anatolia. In addition, the findings of the present study indicate that the use of multivariate statistical method as a multiple logistic regression in osteoporosis, which maybe influenced by many variables, is better than univariate statistical evaluation.
Jarnevich, Catherine S.; Talbert, Marian; Morisette, Jeffrey T.; Aldridge, Cameron L.; Brown, Cynthia; Kumar, Sunil; Manier, Daniel; Talbert, Colin; Holcombe, Tracy R.
2017-01-01
Evaluating the conditions where a species can persist is an important question in ecology both to understand tolerances of organisms and to predict distributions across landscapes. Presence data combined with background or pseudo-absence locations are commonly used with species distribution modeling to develop these relationships. However, there is not a standard method to generate background or pseudo-absence locations, and method choice affects model outcomes. We evaluated combinations of both model algorithms (simple and complex generalized linear models, multivariate adaptive regression splines, Maxent, boosted regression trees, and random forest) and background methods (random, minimum convex polygon, and continuous and binary kernel density estimator (KDE)) to assess the sensitivity of model outcomes to choices made. We evaluated six questions related to model results, including five beyond the common comparison of model accuracy assessment metrics (biological interpretability of response curves, cross-validation robustness, independent data accuracy and robustness, and prediction consistency). For our case study with cheatgrass in the western US, random forest was least sensitive to background choice and the binary KDE method was least sensitive to model algorithm choice. While this outcome may not hold for other locations or species, the methods we used can be implemented to help determine appropriate methodologies for particular research questions.
ERIC Educational Resources Information Center
Nguyen, Phuong L.
2006-01-01
This study examines the effects of parental SES, school quality, and community factors on children's enrollment and achievement in rural areas in Viet Nam, using logistic regression and ordered logistic regression. Multivariate analysis reveals significant differences in educational enrollment and outcomes by level of household expenditures and…
Procedures for using signals from one sensor as substitutes for signals of another
NASA Technical Reports Server (NTRS)
Suits, G.; Malila, W.; Weller, T.
1988-01-01
Long-term monitoring of surface conditions may require a transfer from using data from one satellite sensor to data from a different sensor having different spectral characteristics. Two general procedures for spectral signal substitution are described in this paper, a principal-components procedure and a complete multivariate regression procedure. They are evaluated through a simulation study of five satellite sensors (MSS, TM, AVHRR, CZCS, and HRV). For illustration, they are compared to another recently described procedure for relating AVHRR and MSS signals. The multivariate regression procedure is shown to be best. TM can accurately emulate the other sensors, but they, on the other hand, have difficulty in accurately emulating its shortwave infrared bands (TM5 and TM7).
Costing Hospital Surgery Services: The Method Matters
Mercier, Gregoire; Naro, Gerald
2014-01-01
Background Accurate hospital costs are required for policy-makers, hospital managers and clinicians to improve efficiency and transparency. However, different methods are used to allocate direct costs, and their agreement is poorly understood. The aim of this study was to assess the agreement between bottom-up and top-down unit costs of a large sample of surgical operations in a French tertiary centre. Methods Two thousand one hundred and thirty consecutive procedures performed between January and October 2010 were analysed. Top-down costs were based on pre-determined weights, while bottom-up costs were calculated through an activity-based costing (ABC) model. The agreement was assessed using correlation coefficients and the Bland and Altman method. Variables associated with the difference between methods were identified with bivariate and multivariate linear regressions. Results The correlation coefficient amounted to 0.73 (95%CI: 0.72; 0.76). The overall agreement between methods was poor. In a multivariate analysis, the cost difference was independently associated with age (Beta = −2.4; p = 0.02), ASA score (Beta = 76.3; p<0.001), RCI (Beta = 5.5; p<0.001), staffing level (Beta = 437.0; p<0.001) and intervention duration (Beta = −10.5; p<0.001). Conclusions The ability of the current method to provide relevant information to managers, clinicians and payers is questionable. As in other European countries, a shift towards time-driven activity-based costing should be advocated. PMID:24817167
Rovadoscki, Gregori A; Petrini, Juliana; Ramirez-Diaz, Johanna; Pertile, Simone F N; Pertille, Fábio; Salvian, Mayara; Iung, Laiza H S; Rodriguez, Mary Ana P; Zampar, Aline; Gaya, Leila G; Carvalho, Rachel S B; Coelho, Antonio A D; Savino, Vicente J M; Coutinho, Luiz L; Mourão, Gerson B
2016-09-01
Repeated measures from the same individual have been analyzed by using repeatability and finite dimension models under univariate or multivariate analyses. However, in the last decade, the use of random regression models for genetic studies with longitudinal data have become more common. Thus, the aim of this research was to estimate genetic parameters for body weight of four experimental chicken lines by using univariate random regression models. Body weight data from hatching to 84 days of age (n = 34,730) from four experimental free-range chicken lines (7P, Caipirão da ESALQ, Caipirinha da ESALQ and Carijó Barbado) were used. The analysis model included the fixed effects of contemporary group (gender and rearing system), fixed regression coefficients for age at measurement, and random regression coefficients for permanent environmental effects and additive genetic effects. Heterogeneous variances for residual effects were considered, and one residual variance was assigned for each of six subclasses of age at measurement. Random regression curves were modeled by using Legendre polynomials of the second and third orders, with the best model chosen based on the Akaike Information Criterion, Bayesian Information Criterion, and restricted maximum likelihood. Multivariate analyses under the same animal mixed model were also performed for the validation of the random regression models. The Legendre polynomials of second order were better for describing the growth curves of the lines studied. Moderate to high heritabilities (h(2) = 0.15 to 0.98) were estimated for body weight between one and 84 days of age, suggesting that selection for body weight at all ages can be used as a selection criteria. Genetic correlations among body weight records obtained through multivariate analyses ranged from 0.18 to 0.96, 0.12 to 0.89, 0.06 to 0.96, and 0.28 to 0.96 in 7P, Caipirão da ESALQ, Caipirinha da ESALQ, and Carijó Barbado chicken lines, respectively. Results indicate that genetic gain for body weight can be achieved by selection. Also, selection for body weight at 42 days of age can be maintained as a selection criterion. © 2016 Poultry Science Association Inc.
Zhang, Y J; Wu, S L; Li, H Y; Zhao, Q H; Ning, C H; Zhang, R Y; Yu, J X; Li, W; Chen, S H; Gao, J S
2018-01-24
Objective: To investigate the impact of blood pressure and age on arterial stiffness in general population. Methods: Participants who took part in 2010, 2012 and 2014 Kailuan health examination were included. Data of brachial ankle pulse wave velocity (baPWV) examination were analyzed. According to the WHO criteria of age, participants were divided into 3 age groups: 18-44 years group ( n= 11 608), 45-59 years group ( n= 12 757), above 60 years group ( n= 5 002). Participants were further divided into hypertension group and non-hypertension group according to the diagnostic criteria for hypertension (2010 Chinese guidelines for the managemengt of hypertension). Multiple linear regression analysis was used to analyze the association between systolic blood pressure (SBP) with baPWV in the total participants and then stratified by age groups. Multivariate logistic regression model was used to analyze the influence of blood pressure on arterial stiffness (baPWV≥1 400 cm/s) of various groups. Results: (1)The baseline characteristics of all participants: 35 350 participants completed 2010, 2012 and 2014 Kailuan examinations and took part in baPWV examination. 2 237 participants without blood pressure measurement values were excluded, 1 569 participants with history of peripheral artery disease were excluded, we also excluded 1 016 participants with history of cardiac-cerebral vascular disease. Data from 29 367 participants were analyzed. The age was (48.0±12.4) years old, 21 305 were males (72.5%). (2) Distribution of baPWV in various age groups: baPWV increased with aging. In non-hypertension population, baPWV in 18-44 years group, 45-59 years group, above 60 years group were as follows: 1 299.3, 1 428.7 and 1 704.6 cm/s, respectively. For hypertension participants, the respective values of baPWV were: 1 498.4, 1 640.7 and 1 921.4 cm/s. BaPWV was significantly higher in hypertension group than non-hypertension group of respective age groups ( P< 0.05). (3) Multiple linear regression analysis defined risk factors of baPWV: Multivariate linear regression analysis showed that baPWV was positively correlated with SBP( t= 39.30, P< 0.001), and same results were found in the sub-age groups ( t -value was 37.72, 27.30, 9.15, all P< 0.001, respectively) after adjustment for other confounding factors, including age, sex, pulse pressure(PP), body mass index (BMI), fasting blood glucose (FBG), total cholesterol (TC), smoking, drinking, physical exercise, antihypertensive medications, lipid-lowering medication. (4) Multivariate logistic regression analysis of baPWV-related factors: After adjustment for other confounding factors, including age, sex, PP, BMI, FBG, TC, smoking, drinking, physical exercise, antihypertensive medication, lipid-lowering medication, multivariate logistic regression analysis showed that risks for increased arterial stiffness in hypertension group were higher than those in non-hypertension group, the OR in participants with hypertension was 2.54 (2.35-2.74) in the total participants, and same results were also found in sub-age groups, the OR s were 3.22(2.86-3.63), 2.48(2.23-2.76), and 1.91(1.42-2.56), respectively, in each sub-age group. Conclusion: SBP is positively related to arterial stiffness in different age groups, and hypertension is a risk factor for increased arterial stiffness in different age groups. Clinical Trial Registry Chinese Clinical Trial Registry, ChiCTR-TNC-11001489.
Large signal-to-noise ratio quantification in MLE for ARARMAX models
NASA Astrophysics Data System (ADS)
Zou, Yiqun; Tang, Xiafei
2014-06-01
It has been shown that closed-loop linear system identification by indirect method can be generally transferred to open-loop ARARMAX (AutoRegressive AutoRegressive Moving Average with eXogenous input) estimation. For such models, the gradient-related optimisation with large enough signal-to-noise ratio (SNR) can avoid the potential local convergence in maximum likelihood estimation. To ease the application of this condition, the threshold SNR needs to be quantified. In this paper, we build the amplitude coefficient which is an equivalence to the SNR and prove the finiteness of the threshold amplitude coefficient within the stability region. The quantification of threshold is achieved by the minimisation of an elaborately designed multi-variable cost function which unifies all the restrictions on the amplitude coefficient. The corresponding algorithm based on two sets of physically realisable system input-output data details the minimisation and also points out how to use the gradient-related method to estimate ARARMAX parameters when local minimum is present as the SNR is small. Then, the algorithm is tested on a theoretical AutoRegressive Moving Average with eXogenous input model for the derivation of the threshold and a gas turbine engine real system for model identification, respectively. Finally, the graphical validation of threshold on a two-dimensional plot is discussed.
A novel strategy for forensic age prediction by DNA methylation and support vector regression model
Xu, Cheng; Qu, Hongzhu; Wang, Guangyu; Xie, Bingbing; Shi, Yi; Yang, Yaran; Zhao, Zhao; Hu, Lan; Fang, Xiangdong; Yan, Jiangwei; Feng, Lei
2015-01-01
High deviations resulting from prediction model, gender and population difference have limited age estimation application of DNA methylation markers. Here we identified 2,957 novel age-associated DNA methylation sites (P < 0.01 and R2 > 0.5) in blood of eight pairs of Chinese Han female monozygotic twins. Among them, nine novel sites (false discovery rate < 0.01), along with three other reported sites, were further validated in 49 unrelated female volunteers with ages of 20–80 years by Sequenom Massarray. A total of 95 CpGs were covered in the PCR products and 11 of them were built the age prediction models. After comparing four different models including, multivariate linear regression, multivariate nonlinear regression, back propagation neural network and support vector regression, SVR was identified as the most robust model with the least mean absolute deviation from real chronological age (2.8 years) and an average accuracy of 4.7 years predicted by only six loci from the 11 loci, as well as an less cross-validated error compared with linear regression model. Our novel strategy provides an accurate measurement that is highly useful in estimating the individual age in forensic practice as well as in tracking the aging process in other related applications. PMID:26635134
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred
2013-01-01
A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.
Ramsthaler, Frank; Kettner, Mattias; Verhoff, Marcel A
2014-01-01
In forensic anthropological casework, estimating age-at-death is key to profiling unknown skeletal remains. The aim of this study was to examine the reliability of a new, simple, fast, and inexpensive digital odontological method for age-at-death estimation. The method is based on the original Lamendin method, which is a widely used technique in the repertoire of odontological aging methods in forensic anthropology. We examined 129 single root teeth employing a digital camera and imaging software for the measurement of the luminance of the teeth's translucent root zone. Variability in luminance detection was evaluated using statistical technical error of measurement analysis. The method revealed stable values largely unrelated to observer experience, whereas requisite formulas proved to be camera-specific and should therefore be generated for an individual recording setting based on samples of known chronological age. Multiple regression analysis showed a highly significant influence of the coefficients of the variables "arithmetic mean" and "standard deviation" of luminance for the regression formula. For the use of this primer multivariate equation for age-at-death estimation in casework, a standard error of the estimate of 6.51 years was calculated. Step-by-step reduction of the number of embedded variables to linear regression analysis employing the best contributor "arithmetic mean" of luminance yielded a regression equation with a standard error of 6.72 years (p < 0.001). The results of this study not only support the premise of root translucency as an age-related phenomenon, but also demonstrate that translucency reflects a number of other influencing factors in addition to age. This new digital measuring technique of the zone of dental root luminance can broaden the array of methods available for estimating chronological age, and furthermore facilitate measurement and age classification due to its low dependence on observer experience.
NASA Astrophysics Data System (ADS)
Passow, Christian; Donner, Reik
2017-04-01
Quantile mapping (QM) is an established concept that allows to correct systematic biases in multiple quantiles of the distribution of a climatic observable. It shows remarkable results in correcting biases in historical simulations through observational data and outperforms simpler correction methods which relate only to the mean or variance. Since it has been shown that bias correction of future predictions or scenario runs with basic QM can result in misleading trends in the projection, adjusted, trend preserving, versions of QM were introduced in the form of detrended quantile mapping (DQM) and quantile delta mapping (QDM) (Cannon, 2015, 2016). Still, all previous versions and applications of QM based bias correction rely on the assumption of time-independent quantiles over the investigated period, which can be misleading in the context of a changing climate. Here, we propose a novel combination of linear quantile regression (QR) with the classical QM method to introduce a consistent, time-dependent and trend preserving approach of bias correction for historical and future projections. Since QR is a regression method, it is possible to estimate quantiles in the same resolution as the given data and include trends or other dependencies. We demonstrate the performance of the new method of linear regression quantile mapping (RQM) in correcting biases of temperature and precipitation products from historical runs (1959 - 2005) of the COSMO model in climate mode (CCLM) from the Euro-CORDEX ensemble relative to gridded E-OBS data of the same spatial and temporal resolution. A thorough comparison with established bias correction methods highlights the strengths and potential weaknesses of the new RQM approach. References: A.J. Cannon, S.R. Sorbie, T.Q. Murdock: Bias Correction of GCM Precipitation by Quantile Mapping - How Well Do Methods Preserve Changes in Quantiles and Extremes? Journal of Climate, 28, 6038, 2015 A.J. Cannon: Multivariate Bias Correction of Climate Model Outputs - Matching Marginal Distributions and Inter-variable Dependence Structure. Journal of Climate, 29, 7045, 2016
2003-04-01
any of the P interfering sources, and Hkt i (1) (P)] T is defined below. The P-variate vector = t kt , • t J consists of complex waveforms radiated by...line. More precisely, the (i, j ) t element of the matrix Hke is a complex 4-4 coefficient which is practically constant over the kth PRI, and is a...multivariate auto-regressive (AR) model of order n: Ykt + Z Bj Yk- j , t = tkt (25) j =l In the above equation, Bj are the M-variate matrices which are the
NASA Astrophysics Data System (ADS)
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-11-01
The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly types such as earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high resolution aerial colour orthophoto; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview on existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curve, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. Land use and wildfire variables were found to have a strong control on the occurrence of very rapid shallow landslides.
Kabir, Mohammad Alamgir; Goh, Kim-Leng; Kamal, Sunny Mohammad Mostafa; Khan, Md. Mobarak Hossain
2013-01-01
Background Tobacco smoking (TS) and illicit drug use (IDU) are of public health concerns especially in developing countries, including Bangladesh. This paper aims to (i) identify the determinants of TS and IDU, and (ii) examine the association of TS with IDU among young slum dwellers in Bangladesh. Methodology/Principal Findings Data on a total of 1,576 young slum dwellers aged 15–24 years were extracted for analysis from the 2006 Urban Health Survey (UHS), which covered a nationally representative sample of 13,819 adult men aged 15–59 years from slums, non-slums and district municipalities of six administrative regions in Bangladesh. Methods used include frequency run, Chi-square test of association and multivariable logistic regression. The overall prevalence of TS in the target group was 42.3%, of which 41.4% smoked cigarettes and 3.1% smoked bidis. The regression model for TS showed that age, marital status, education, duration of living in slums, and those with sexually transmitted infections were significantly (p<0.001 to p<0.05) associated with TS. The overall prevalence of IDU was 9.1%, dominated by those who had drug injections (3.2%), and smoked ganja (2.8%) and tari (1.6%). In the regression model for IDU, the significant (p<0.01 to p<0.10) predictors were education, duration of living in slums, and whether infected by sexually transmitted diseases. The multivariable logistic regression (controlling for other variables) revealed significantly (p<0.001) higher likelihood of IDU (OR = 9.59, 95% CI = 5.81–15.82) among users of any form of TS. The likelihood of IDU increased significantly (p<0.001) with increased use of cigarettes. Conclusions/Significance Certain groups of youth are more vulnerable to TS and IDU. Therefore, tobacco and drug control efforts should target these groups to reduce the consequences of risky lifestyles through information, education and communication (IEC) programs. PMID:23935885
Robust, Adaptive Functional Regression in Functional Mixed Model Framework.
Zhu, Hongxiao; Brown, Philip J; Morris, Jeffrey S
2011-09-01
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g. images), and using other invertible transformations as alternatives to wavelets.
Robust, Adaptive Functional Regression in Functional Mixed Model Framework
Zhu, Hongxiao; Brown, Philip J.; Morris, Jeffrey S.
2012-01-01
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g. images), and using other invertible transformations as alternatives to wavelets. PMID:22308015
NASA Technical Reports Server (NTRS)
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 - 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that a totally different criterion, labile trace element contents - hence thermal histories - or 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
Multivariate analysis of cytokine profiles in pregnancy complications.
Azizieh, Fawaz; Dingle, Kamaludin; Raghupathy, Raj; Johnson, Kjell; VanderPlas, Jacob; Ansari, Ali
2018-03-01
The immunoregulation to tolerate the semiallogeneic fetus during pregnancy includes a harmonious dynamic balance between anti- and pro-inflammatory cytokines. Several earlier studies reported significantly different levels and/or ratios of several cytokines in complicated pregnancy as compared to normal pregnancy. However, as cytokines operate in networks with potentially complex interactions, it is also interesting to compare groups with multi-cytokine data sets, with multivariate analysis. Such analysis will further examine how great the differences are, and which cytokines are more different than others. Various multivariate statistical tools, such as Cramer test, classification and regression trees, partial least squares regression figures, 2-dimensional Kolmogorov-Smirmov test, principal component analysis and gap statistic, were used to compare cytokine data of normal vs anomalous groups of different pregnancy complications. Multivariate analysis assisted in examining if the groups were different, how strongly they differed, in what ways they differed and further reported evidence for subgroups in 1 group (pregnancy-induced hypertension), possibly indicating multiple causes for the complication. This work contributes to a better understanding of cytokines interaction and may have important implications on targeting cytokine balance modulation or design of future medications or interventions that best direct management or prevention from an immunological approach. © 2018 The Authors. American Journal of Reproductive Immunology Published by John Wiley & Sons Ltd.
Gender differences in health-related quality of life of adolescents with cystic fibrosis
Arrington-Sanders, Renata; Yi, Michael S; Tsevat, Joel; Wilmott, Robert W; Mrus, Joseph M; Britto, Maria T
2006-01-01
Background Female patients with cystic fibrosis (CF) have consistently poorer survival rates than males across all ages. To determine if gender differences exist in health-related quality of life (HRQOL) of adolescent patients with CF, we performed a cross-section analysis of CF patients recruited from 2 medical centers in 2 cities during 1997–2001. Methods We used the 87-item child self-report form of the Child Health Questionnaire to measure 12 health domains. Data was also collected on age and forced expiratory volume in 1 second (FEV1). We analyzed data from 98 subjects and performed univariate analyses and linear regression or ordinal logistic regression for multivariable analyses. Results The mean (SD) age was 14.6 (2.5) years; 50 (51.0%) were female; and mean FEV1 was 71.6% (25.6%) of predicted. There were no statistically significant gender differences in age or FEV1. In univariate analyses, females reported significantly poorer HRQOL in 5 of the 12 domains. In multivariable analyses controlling for FEV1 and age, we found that female gender was associated with significantly lower global health (p < 0.05), mental health (p < 0.01), and general health perceptions (p < 0.05) scores. Conclusion Further research will need to focus on the causes of these differences in HRQOL and on potential interventions to improve HRQOL of adolescent patients with CF. PMID:16433917
2011-01-01
Background In Taiwan, there is a high incidence of breast cancer and a high prevalence of viral hepatitis. In this case-control study, we used a population-based insurance dataset to evaluate whether breast cancer in women is associated with chronic viral hepatitis infection. Methods From the claims data, we identified 1,958 patients with newly diagnosed breast cancer during the period 2000-2008. A randomly selected, age-matched cohort of 7,832 subjects without cancer was selected for comparison. Multivariable logistic regression models were constructed to calculate odds ratios of breast cancer associated with viral hepatitis after adjustment for age, residential area, occupation, urbanization, and income. The age-specific (<50 years and ≥50 years) risk of breast cancer was also evaluated. Results There were no significant differences in the prevalence of hepatitis C virus (HCV) infection, hepatitis B virus (HBV), or the prevalence of combined HBC/HBV infection between breast cancer patients and control subjects (p = 0.48). Multivariable logistic regression analysis, however, revealed that age <50 years was associated with a 2-fold greater risk of developing breast cancer (OR = 2.03, 95% CI = 1.23-3.34). Conclusions HCV infection, but not HBV infection, appears to be associated with early onset risk of breast cancer in areas endemic for HCV and HBV. This finding needs to be replicated in further studies. PMID:22115285
Krause, Kathleen H.
2015-01-01
Objective To provide the first study in Vietnam of how gendered social learning about violence and exposure to non-family institutions influence women’s attitudes about a wife’s recourse after physical IPV. Method A probability sample of 532 married women, ages 18–50 years, was surveyed in July–August, 2012 in Mỹ Hào district. We fit a multivariate linear regression model to estimate correlates of favoring recourse in six situations using a validated attitudinal scale. We split attitudes towards recourse into three subscales (disfavor silence, favor informal recourse, favor formal recourse) and fit one multivariate ordinal logistic regression model for each behavior to estimate correlates of favoring recourse. Results On average, women favored recourse in 2.8 situations. Women who were older and had witnessed physical IPV in childhood had less favorable attitudes about recourse. Women who were hit as children, had completed more schooling, worked outside agriculture, and had sought recourse after IPV had more favorable attitudes about recourse. Conclusions Normative change among women may require efforts to curb family violence, counsel those exposed to violence in childhood, and enhance women’s opportunities for higher schooling and non-agricultural wage work. The state and organizations working on IPV might overcome pockets of unfavorable public opinion by enforcing accountability for IPV rather than seeking to alter ideas about recourse among women. PMID:28392967
Relationship between alcohol intake, body fat, and physical activity – a population-based study
Liangpunsakul, Suthat; Crabb, David W.; Qi, Rong
2010-01-01
Objectives Aside from fat, ethanol is the macronutrient with the highest energy density. Whether the energy derived from ethanol affects the body composition and fat mass is debatable. We investigated the relationship between alcohol intake, body composition, and physical activity in the US population using the third National Health and Nutrition Examination Survey (NHANES III). Methods Ten thousand five hundred and fifty subjects met eligible criteria and constituted our study cohort. Estimated percent body fat and resting metabolic rate were calculated based on the sum of the skinfolds. Multivariate regression analyses were performed accounting for the study sampling weight. Results In both genders, moderate and hazardous alcohol drinkers were younger (p<0.05), had significantly lower BMI (P<0.01) and body weight (p<0.01) than controls, non drinkers. Those with hazardous alcohol consumption had significantly less physical activity compared to those with no alcohol use and moderate drinkers in both genders. Female had significantly higher percent body fat than males. In the multivariate linear regression analyses, the levels of alcohol consumption were found to be an independent predictor associated with lower percent body fat only in male subjects. Conclusions Our results showed that alcoholics are habitually less active and that alcohol drinking is an independent predictor of lower percent body fat especially in male alcoholics. PMID:20696406
2014-01-01
Background Hospitals are merging to become more cost-effective. Mergers are often complex and difficult processes with variable outcomes. The aim of this study was to analyze the effect of mergers on long-term sickness absence among hospital employees. Methods Long-term sickness absence was analyzed among hospital employees (N = 107 209) in 57 hospitals involved in 23 mergers in Norway between 2000 and 2009. Variation in long-term sickness absence was explained through a fixed effects multivariate regression analysis using panel data with years-since-merger as the independent variable. Results We found a significant but modest effect of mergers on long-term sickness absence in the year of the merger, and in years 2, 3 and 4; analyzed by gender there was a significant effect for women, also for these years, but only in year 4 for men. However, men are less represented among the hospital workforce; this could explain the lack of significance. Conclusions Mergers has a significant effect on employee health that should be taken into consideration when deciding to merge hospitals. This study illustrates the importance of analyzing the effects of mergers over several years and the need for more detailed analyses of merger processes and of the changes that may occur as a result of such mergers. PMID:24490750
Nakai, Shunichiro; Matsumiya, Wataru; Miki, Akiko; Nakamura, Makoto
2017-01-01
Purpose To determine the association of age-related maculopathy susceptibility 2 (ARMS2) gene polymorphisms with the 3-year outcomes of photodynamic therapy (PDT) in wet age-related macular degeneration (wet AMD). Methods The single nucleotide polymorphism (SNP) at rs10490924 in the ARMS2 gene of 65 patients with wet AMD who underwent PDT was genotyped using the TaqMan assay. The clinical characteristics and the outcomes of PDT were compared among the three genotypes at rs10490924. A multivariate regression analysis was performed to evaluate the influence of the clinical cofactors on the association of rs10490924 with the visual outcome at 36 months after the first PDT. Results A significant difference was found among the genotypes in the age and the baseline lesion size. The patients with the GG genotype showed a significant improvement in vision, and the patients with the TT genotype showed a significant worsening of vision at all time points measured after the initial PDT. In the multivariate regression analysis, the number of the G allele at rs10490924 was associated with a significantly greater improvement in the baseline best-corrected visual acuity (BCVA) at 36 months after the first PDT. Conclusions ARMS2 variants are likely associated with the 3-year outcomes of PDT in patients with wet AMD. PMID:28761324
Why Multivariate Methods Are Usually Vital in Research: Some Basic Concepts.
ERIC Educational Resources Information Center
Thompson, Bruce
The present paper suggests that multivariate methods ought to be used more frequently in behavioral research and explores the potential consequences of failing to use multivariate methods when these methods are appropriate. The paper explores in detail two reasons why multivariate methods are usually vital. The first is that they limit the…
Djuris, Jelena; Medarevic, Djordje; Krstic, Marko; Djuric, Zorica; Ibric, Svetlana
2013-06-01
This study illustrates the application of experimental design and multivariate data analysis in defining design space for granulation and tableting processes. According to the quality by design concepts, critical quality attributes (CQAs) of granules and tablets, as well as critical parameters of granulation and tableting processes, were identified and evaluated. Acetaminophen was used as the model drug, and one of the study aims was to investigate the possibility of the development of immediate- or extended-release acetaminophen tablets. Granulation experiments were performed in the fluid bed processor using polyethylene oxide polymer as a binder in the direct granulation method. Tablets were compressed in the laboratory excenter tablet press. The first set of experiments was organized according to Plackett-Burman design, followed by the full factorial experimental design. Principal component analysis and partial least squares regression were applied as the multivariate analysis techniques. By using these different methods, CQAs and process parameters were identified and quantified. Furthermore, an in-line method was developed to monitor the temperature during the fluidized bed granulation process, to foresee possible defects in granules CQAs. Various control strategies that are based on the process understanding and assure desired quality attributes of the product are proposed. Copyright © 2013 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Lu, Huishan; Fu, Xiaping
2005-11-01
A new method is proposed to eliminate the varying background and noise simultaneously for multivariate calibration of Fourier transform near infrared (FT-NIR) spectral signals. An ideal spectrum signal prototype was constructed based on the FT-NIR spectrum of fruit sugar content measurement. The performances of wavelet based threshold de-noising approaches via different combinations of wavelet base functions were compared. Three families of wavelet base function (Daubechies, Symlets and Coiflets) were applied to estimate the performance of those wavelet bases and threshold selection rules by a series of experiments. The experimental results show that the best de-noising performance is reached via the combinations of Daubechies 4 or Symlet 4 wavelet base function. Based on the optimization parameter, wavelet regression models for sugar content of pear were also developed and result in a smaller prediction error than a traditional Partial Least Squares Regression (PLSR) mode.
Dirichlet Component Regression and its Applications to Psychiatric Data.
Gueorguieva, Ralitza; Rosenheck, Robert; Zelterman, Daniel
2008-08-15
We describe a Dirichlet multivariable regression method useful for modeling data representing components as a percentage of a total. This model is motivated by the unmet need in psychiatry and other areas to simultaneously assess the effects of covariates on the relative contributions of different components of a measure. The model is illustrated using the Positive and Negative Syndrome Scale (PANSS) for assessment of schizophrenia symptoms which, like many other metrics in psychiatry, is composed of a sum of scores on several components, each in turn, made up of sums of evaluations on several questions. We simultaneously examine the effects of baseline socio-demographic and co-morbid correlates on all of the components of the total PANSS score of patients from a schizophrenia clinical trial and identify variables associated with increasing or decreasing relative contributions of each component. Several definitions of residuals are provided. Diagnostics include measures of overdispersion, Cook's distance, and a local jackknife influence metric.
Macaluso, P J
2011-02-01
Digital photogrammetric methods were used to collect diameter, area, and perimeter data of the acetabulum for a twentieth-century skeletal sample from France (Georges Olivier Collection, Musée de l'Homme, Paris) consisting of 46 males and 36 females. The measurements were then subjected to both discriminant function and logistic regression analyses in order to develop osteometric standards for sex assessment. Univariate discriminant functions and logistic regression equations yielded overall correct classification accuracy rates for both the left and the right acetabula ranging from 84.1% to 89.6%. The multivariate models developed in this study did not provide increased accuracy over those using only a single variable. Classification sex bias ratios ranged between 1.1% and 7.3% for the majority of models. The results of this study, therefore, demonstrate that metric analysis of acetabular size provides a highly accurate, and easily replicable, method of discriminating sex in this documented skeletal collection. The results further suggest that the addition of area and perimeter data derived from digital images may provide a more effective method of sex assessment than that offered by traditional linear measurements alone. Copyright © 2010 Elsevier GmbH. All rights reserved.
Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760
Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.
Tanpitukpongse, Teerath P.; Mazurowski, Maciej A.; Ikhena, John; Petrella, Jeffrey R.
2016-01-01
Background and Purpose To assess prognostic efficacy of individual versus combined regional volumetrics in two commercially-available brain volumetric software packages for predicting conversion of patients with mild cognitive impairment to Alzheimer's disease. Materials and Methods Data was obtained through the Alzheimer's Disease Neuroimaging Initiative. 192 subjects (mean age 74.8 years, 39% female) diagnosed with mild cognitive impairment at baseline were studied. All had T1WI MRI sequences at baseline and 3-year clinical follow-up. Analysis was performed with NeuroQuant® and Neuroreader™. Receiver operating characteristic curves assessing the prognostic efficacy of each software package were generated using a univariable approach employing individual regional brain volumes, as well as two multivariable approaches (multiple regression and random forest), combining multiple volumes. Results On univariable analysis of 11 NeuroQuant® and 11 Neuroreader™ regional volumes, hippocampal volume had the highest area under the curve for both software packages (0.69 NeuroQuant®, 0.68 Neuroreader™), and was not significantly different (p > 0.05) between packages. Multivariable analysis did not increase the area under the curve for either package (0.63 logistic regression, 0.60 random forest NeuroQuant®; 0.65 logistic regression, 0.62 random forest Neuroreader™). Conclusion Of the multiple regional volume measures available in FDA-cleared brain volumetric software packages, hippocampal volume remains the best single predictor of conversion of mild cognitive impairment to Alzheimer's disease at 3-year follow-up. Combining volumetrics did not add additional prognostic efficacy. Therefore, future prognostic studies in MCI, combining such tools with demographic and other biomarker measures, are justified in using hippocampal volume as the only volumetric biomarker. PMID:28057634
Prehospital Helicopter Transport and Survival of Patients With Traumatic Brain Injury
Mackenzie, Todd A.
2015-01-01
Objective To investigate the association of helicopter transport with survival of patients with traumatic brain injury (TBI), in comparison with ground emergency medical services (EMS). Background Helicopter utilization and its effect on the outcomes of TBI remain controversial. Methods We performed a retrospective cohort study involving patients with TBI who were registered in the National Trauma Data Bank between 2009 and 2011. Regression techniques with propensity score matching were used to investigate the association of helicopter transport with survival of patients with TBI, in comparison with ground EMS. Results During the study period, there were 209,529 patients with TBI who were registered in the National Trauma Data Bank and met the inclusion criteria. Of these patients, 35,334 were transported via helicopters and 174,195 via ground EMS. For patients transported to level I trauma centers, 2797 deaths (12%) were recorded after helicopter transport and 8161 (7.8%) after ground EMS. Multivariable logistic regression analysis demonstrated an association of helicopter transport with increased survival [OR (odds ratio), 1.95; 95% confidence interval (CI), 1.81–2.10; absolute risk reduction (ARR), 6.37%]. This persisted after propensity score matching (OR, 1.88; 95% CI, 1.74–2.03; ARR, 5.93%). For patients transported to level II trauma centers, 1282 deaths (10.6%) were recorded after helicopter transport and 5097 (7.3%) after ground EMS. Multivariable logistic regression analysis demonstrated an association of helicopter transport with increased survival (OR, 1.81; 95% CI, 1.64–2.00; ARR 5.17%). This again persisted after propensity score matching (OR, 1.73; 95% CI, 1.55–1.94; ARR, 4.69). Conclusions Helicopter transport of patients with TBI to level I and II trauma centers was associated with improved survival, in comparison with ground EMS. PMID:24743624
Retinal nerve fibre layer thinning is associated with drug resistance in epilepsy
Balestrini, Simona; Clayton, Lisa M S; Bartmann, Ana P; Chinthapalli, Krishna; Novy, Jan; Coppola, Antonietta; Wandschneider, Britta; Stern, William M; Acheson, James; Bell, Gail S; Sander, Josemir W; Sisodiya, Sanjay M
2016-01-01
Objective Retinal nerve fibre layer (RNFL) thickness is related to the axonal anterior visual pathway and is considered a marker of overall white matter ‘integrity’. We hypothesised that RNFL changes would occur in people with epilepsy, independently of vigabatrin exposure, and be related to clinical characteristics of epilepsy. Methods Three hundred people with epilepsy attending specialist clinics and 90 healthy controls were included in this cross-sectional cohort study. RNFL imaging was performed using spectral-domain optical coherence tomography (OCT). Drug resistance was defined as failure of adequate trials of two antiepileptic drugs to achieve sustained seizure freedom. Results The average RNFL thickness and the thickness of each of the 90° quadrants were significantly thinner in people with epilepsy than healthy controls (p<0.001, t test). In a multivariate logistic regression model, drug resistance was the only significant predictor of abnormal RNFL thinning (OR=2.09, 95% CI 1.09 to 4.01, p=0.03). Duration of epilepsy (coefficient −0.16, p=0.004) and presence of intellectual disability (coefficient −4.0, p=0.044) also showed a significant relationship with RNFL thinning in a multivariate linear regression model. Conclusions Our results suggest that people with epilepsy with no previous exposure to vigabatrin have a significantly thinner RNFL than healthy participants. Drug resistance emerged as a significant independent predictor of RNFL borderline attenuation or abnormal thinning in a logistic regression model. As this is easily assessed by OCT, RNFL thickness might be used to better understand the mechanisms underlying drug resistance, and possibly severity. Longitudinal studies are needed to confirm our findings. PMID:25886782
Bone Mineral Density across a Range of Physical Activity Volumes: NHANES 2007–2010
Whitfield, Geoffrey P.; Kohrt, Wendy M.; Pettee Gabriel, Kelley K.; Rahbar, Mohammad H.; Kohl, Harold W.
2014-01-01
Introduction The association between aerobic physical activity volume and bone mineral density (BMD) is not completely understood. The purpose of this study was to clarify the association between BMD and aerobic activity across a broad range of activity volumes, in particular volumes between those recommended in the 2008 Physical Activity Guidelines for Americans and those of trained endurance athletes. Methods Data from the 2007–2010 National Health and Nutrition Examination Survey were used to quantify the association between reported physical activity and BMD at the lumbar spine and proximal femur across the entire range of activity volumes reported by US adults. Participants were categorized into multiples of the minimum guideline-recommended volume based on reported moderate and vigorous intensity leisure activity. Lumbar and proximal femur BMD was assessed with dual-energy x-ray absorptiometry. Results Among women, multivariable-adjusted linear regression analyses revealed no significant differences in lumbar BMD across activity categories, while proximal femur BMD was significantly higher among those who exceeded guidelines by 2–4 times than those who reported no activity. Among men, multivariable-adjusted BMD at both sites neared its highest values among those who exceeded guidelines by at least 4 times and was not progressively higher with additional activity. Logistic regression estimating the odds of low BMD generally echoed the linear regression results. Conclusion The association between physical activity volume and BMD is complex. Among women, exceeding guidelines by 2–4 times may be important for maximizing BMD at the proximal femur, while among men, exceeding guidelines by 4+ times may be beneficial for lumbar and proximal femur BMD. PMID:24870584
Massad, L. Stewart; Xie, Xianhong; Darragh, Teresa; Minkoff, Howard; Levine, Alexandra M.; Watts, D. Heather; Wright, Rodney L.; D’Souza, Gypsyamber; Colie, Christine; Strickler, Howard D.
2011-01-01
Objective To describe the natural history of genital warts and vulvar intraepithelial neoplasia (VIN) in women with human immunodeficiency virus (HIV). Methods A cohort of 2,791 HIV infected and 953 uninfected women followed for up to 13 years had genital examinations at 6-month intervals, with biopsy for lesions suspicious for VIN. Results The prevalence of warts was 4.4% (5.3% for HIV seropositive women and 1.9% for seronegative women, P < 0.0001). The cumulative incidence of warts was 33% (95% C.I. 30, 36%) in HIV seropositive and 9% (95% C.I. 6, 12%) in seronegative women (P < 0.0001). In multivariable analysis, lower CD4 lymphocyte count, younger age, and current smoking were strongly associated with risk for incident warts. Among 501 HIV seropositive and 43 seronegative women, warts regressed in 410 (82%) seropositive and 41 (95%) seronegative women (P = 0.02), most in the first year after diagnosis. In multivariable analysis, regression was negatively associated with HIV status and lower CD4 count as well as older age. Incident VIN of any grade occurred more frequently among HIV seropositive than seronegative women: 0.42 (0.33 – 0.53) vs 0.07 (0.02 – 0.18)/100 person-years (P < 0.0001). VIN2+ was found in 58 women (55 with and 3 without HIV, P < 0.001). Two women with HIV developed stage IB squamous cell vulvar cancers. Conclusion While genital warts and VIN are more common among HIV seropositive than seronegative women, wart regression is common even in women with HIV, and cancers are infrequent. PMID:21934446
Wang, Yang; Wilson, Fernando A; Chen, Li-Wu
2017-06-01
We examined differences in cancer-related office-based provider visits associated with immigration status in the United States. Data from the 2007-2012 Medical Expenditure Panel Survey and National Health Interview Survey included adult patients diagnosed with cancer. Univariate analyses described distributions of cancer-related office-based provider visits received, expenditures, visit characteristics, as well as demographic, socioeconomic, and health covariates, across immigration groups. We measured the relationships of immigrant status to number of visits and associated expenditure within the past 12 months, adjusting for age, sex, educational attainment, race/ethnicity, self-reported health status, time since cancer diagnosis, cancer remission status, marital status, poverty status, insurance status, and usual source of care. We finally performed sensitivity analyses for regression results by using the propensity score matching method to adjust for potential selection bias. Noncitizens had about 2 fewer visits in a 12-month period in comparison to US-born citizens (4.0 vs. 5.9). Total expenditure per patient was higher for US-born citizens than immigrants (not statistically significant). Noncitizens (88.3%) were more likely than US-born citizens (76.6%) to be seen by a medical doctor during a visit. Multivariate regression results showed that noncitizens had 42% lower number of visiting medical providers at office-based settings for cancer care than US-born citizens, after adjusting for all the other covariates. There were no significant differences in expenditures across immigration groups. The propensity score matching results were largely consistent with those in multivariate-adjusted regressions. Results suggest targeted interventions are needed to reduce disparities in utilization between immigrants and US-born citizen cancer patients.
Pathan, Sameer A; Bhutta, Zain A; Moinudheen, Jibin; Jenkins, Dominic; Silva, Ashwin D; Sharma, Yogdutt; Saleh, Warda A; Khudabakhsh, Zeenat; Irfan, Furqan B; Thomas, Stephen H
2016-01-01
Background: Standard Emergency Department (ED) operations goals include minimization of the time interval (tMD) between patients' initial ED presentation and initial physician evaluation. This study assessed factors known (or suspected) to influence tMD with a two-step goal. The first step was generation of a multivariate model identifying parameters associated with prolongation of tMD at a single study center. The second step was the use of a study center-specific multivariate tMD model as a basis for predictive marginal probability analysis; the marginal model allowed for prediction of the degree of ED operations benefit that would be affected with specific ED operations improvements. Methods: The study was conducted using one month (May 2015) of data obtained from an ED administrative database (EDAD) in an urban academic tertiary ED with an annual census of approximately 500,000; during the study month, the ED saw 39,593 cases. The EDAD data were used to generate a multivariate linear regression model assessing the various demographic and operational covariates' effects on the dependent variable tMD. Predictive marginal probability analysis was used to calculate the relative contributions of key covariates as well as demonstrate the likely tMD impact on modifying those covariates with operational improvements. Analyses were conducted with Stata 14MP, with significance defined at p < 0.05 and confidence intervals (CIs) reported at the 95% level. Results: In an acceptable linear regression model that accounted for just over half of the overall variance in tMD (adjusted r 2 0.51), important contributors to tMD included shift census ( p = 0.008), shift time of day ( p = 0.002), and physician coverage n ( p = 0.004). These strong associations remained even after adjusting for each other and other covariates. Marginal predictive probability analysis was used to predict the overall tMD impact (improvement from 50 to 43 minutes, p < 0.001) of consistent staffing with 22 physicians. Conclusions: The analysis identified expected variables contributing to tMD with regression demonstrating significance and effect magnitude of alterations in covariates including patient census, shift time of day, and number of physicians. Marginal analysis provided operationally useful demonstration of the need to adjust physician coverage numbers, prompting changes at the study ED. The methods used in this analysis may prove useful in other EDs wishing to analyze operations information with the goal of predicting which interventions may have the most benefit.
Huang, An-Min; Fei, Ben-Hua; Jiang, Ze-Hui; Hse, Chung-Yun
2007-09-01
Near infrared spectroscopy is widely used as a quantitative method, and the main multivariate techniques consist of regression methods used to build prediction models, however, the accuracy of analysis results will be affected by many factors. In the present paper, the influence of different sample roughness on the mathematical model of NIR quantitative analysis of wood density was studied. The result of experiments showed that if the roughness of predicted samples was consistent with that of calibrated samples, the result was good, otherwise the error would be much higher. The roughness-mixed model was more flexible and adaptable to different sample roughness. The prediction ability of the roughness-mixed model was much better than that of the single-roughness model.
ERIC Educational Resources Information Center
Martz, Erin
2004-01-01
Because the onset of a spinal cord injury may involve a brush with death and because serious injury and disability can act as a reminder of death, death anxiety was examined as a predictor of posttraumatic stress levels among individuals with disabilities. This cross-sectional study used multiple regression and multivariate multiple regression to…
Louis R Iverson; Anantha M. Prasad; Mark W. Schwartz; Mark W. Schwartz
2005-01-01
We predict current distribution and abundance for tree species present in eastern North America, and subsequently estimate potential suitable habitat for those species under a changed climate with 2 x CO2. We used a series of statistical models (i.e., Regression Tree Analysis (RTA), Multivariate Adaptive Regression Splines (MARS), Bagging Trees (...
J. Stephen Brewer
2010-01-01
Quantifying per capita impacts of invasive species on resident communities requires integrating regression analyses with experiments under natural conditions. Using multivariate and univariate approaches, I regressed the abundance of 105 resident species of groundcover plants and tree seedlings against the abundance and height of an invasive grass, Microstegium...
Guo, Canyong; Luo, Xuefang; Zhou, Xiaohua; Shi, Beijia; Wang, Juanjuan; Zhao, Jinqi; Zhang, Xiaoxia
2017-06-05
Vibrational spectroscopic techniques such as infrared, near-infrared and Raman spectroscopy have become popular in detecting and quantifying polymorphism of pharmaceutics since they are fast and non-destructive. This study assessed the ability of three vibrational spectroscopy combined with multivariate analysis to quantify a low-content undesired polymorph within a binary polymorphic mixture. Partial least squares (PLS) regression and support vector machine (SVM) regression were employed to build quantitative models. Fusidic acid, a steroidal antibiotic, was used as the model compound. It was found that PLS regression performed slightly better than SVM regression in all the three spectroscopic techniques. Root mean square errors of prediction (RMSEP) were ranging from 0.48% to 1.17% for diffuse reflectance FTIR spectroscopy and 1.60-1.93% for diffuse reflectance FT-NIR spectroscopy and 1.62-2.31% for Raman spectroscopy. The results indicate that diffuse reflectance FTIR spectroscopy offers significant advantages in providing accurate measurement of polymorphic content in the fusidic acid binary mixtures, while Raman spectroscopy is the least accurate technique for quantitative analysis of polymorphs. Copyright © 2017 Elsevier B.V. All rights reserved.
Domain-Invariant Partial-Least-Squares Regression.
Nikzad-Langerodi, Ramin; Zellinger, Werner; Lughofer, Edwin; Saminger-Platz, Susanne
2018-05-11
Multivariate calibration models often fail to extrapolate beyond the calibration samples because of changes associated with the instrumental response, environmental condition, or sample matrix. Most of the current methods used to adapt a source calibration model to a target domain exclusively apply to calibration transfer between similar analytical devices, while generic methods for calibration-model adaptation are largely missing. To fill this gap, we here introduce domain-invariant partial-least-squares (di-PLS) regression, which extends ordinary PLS by a domain regularizer in order to align the source and target distributions in the latent-variable space. We show that a domain-invariant weight vector can be derived in closed form, which allows the integration of (partially) labeled data from the source and target domains as well as entirely unlabeled data from the latter. We test our approach on a simulated data set where the aim is to desensitize a source calibration model to an unknown interfering agent in the target domain (i.e., unsupervised model adaptation). In addition, we demonstrate unsupervised, semisupervised, and supervised model adaptation by di-PLS on two real-world near-infrared (NIR) spectroscopic data sets.
Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L
2017-05-07
In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
Liu, Chia-Chuan; Shih, Chih-Shiun; Pennarun, Nicolas; Cheng, Chih-Tao
2016-01-01
The feasibility and radicalism of lymph node dissection for lung cancer surgery by a single-port technique has frequently been challenged. We performed a retrospective cohort study to investigate this issue. Two chest surgeons initiated multiple-port thoracoscopic surgery in a 180-bed cancer centre in 2005 and shifted to a single-port technique gradually after 2010. Data, including demographic and clinical information, from 389 patients receiving multiport thoracoscopic lobectomy or segmentectomy and 149 consecutive patients undergoing either single-port lobectomy or segmentectomy for primary non-small-cell lung cancer were retrieved and entered for statistical analysis by multivariable linear regression models and Box-Cox transformed multivariable analysis. The mean number of total dissected lymph nodes in the lobectomy group was 28.5 ± 11.7 for the single-port group versus 25.2 ± 11.3 for the multiport group; the mean number of total dissected lymph nodes in the segmentectomy group was 19.5 ± 10.8 for the single-port group versus 17.9 ± 10.3 for the multiport group. In linear multivariable and after Box-Cox transformed multivariable analyses, the single-port approach was still associated with a higher total number of dissected lymph nodes. The total number of dissected lymph nodes for primary lung cancer surgery by single-port video-assisted thoracoscopic surgery (VATS) was higher than by multiport VATS in univariable, multivariable linear regression and Box-Cox transformed multivariable analyses. This study confirmed that highly effective lymph node dissection could be achieved through single-port VATS in our setting. © The Author 2015. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery. All rights reserved.
NASA Astrophysics Data System (ADS)
Dankers, Frank; Wijsman, Robin; Troost, Esther G. C.; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L.
2017-05-01
In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
Multivariate classification of the infrared spectra of cell and tissue samples
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haaland, D.M.; Jones, H.D.; Thomas, E.V.
1997-03-01
Infrared microspectroscopy of biopsied canine lymph cells and tissue was performed to investigate the possibility of using IR spectra coupled with multivariate classification methods to classify the samples as normal, hyperplastic, or neoplastic (malignant). IR spectra were obtained in transmission mode through BaF{sub 2} windows and in reflection mode from samples prepared on gold-coated microscope slides. Cytology and histopathology samples were prepared by a variety of methods to identify the optimal methods of sample preparation. Cytospinning procedures that yielded a monolayer of cells on the BaF{sub 2} windows produced a limited set of IR transmission spectra. These transmission spectra weremore » converted to absorbance and formed the basis for a classification rule that yielded 100{percent} correct classification in a cross-validated context. Classifications of normal, hyperplastic, and neoplastic cell sample spectra were achieved by using both partial least-squares (PLS) and principal component regression (PCR) classification methods. Linear discriminant analysis applied to principal components obtained from the spectral data yielded a small number of misclassifications. PLS weight loading vectors yield valuable qualitative insight into the molecular changes that are responsible for the success of the infrared classification. These successful classification results show promise for assisting pathologists in the diagnosis of cell types and offer future potential for {ital in vivo} IR detection of some types of cancer. {copyright} {ital 1997} {ital Society for Applied Spectroscopy}« less
Laudico, Adriano V.; Van Dinh, Nguyen; Allred, D. Craig; Uy, Gemma B.; Quang, Le Hong; Salvador, Jonathan Disraeli S.; Siguan, Stephen Sixto S.; Mirasol-Lumague, Maria Rica; Tung, Nguyen Dinh; Benjaafar, Noureddine; Navarro, Narciso S.; Quy, Tran Tu; De La Peña, Arturo S.; Dofitas, Rodney B.; Bisquera, Orlino C.; Linh, Nguyen Dieu; To, Ta Van; Young, Gregory S.; Hade, Erinn M.; Jarjoura, David
2015-01-01
Background: For women with hormone receptor–positive, operable breast cancer, surgical oophorectomy plus tamoxifen is an effective adjuvant therapy. We conducted a phase III randomized clinical trial to test the hypothesis that oophorectomy surgery performed during the luteal phase of the menstrual cycle was associated with better outcomes. Methods: Seven hundred forty premenopausal women entered a clinical trial in which those women estimated not to be in the luteal phase of their menstrual cycle for the next one to six days (n = 509) were randomly assigned to receive treatment with surgical oophorectomy either delayed to be during a five-day window in the history-estimated midluteal phase of the menstrual cycles, or in the next one to six days. Women who were estimated to be in the luteal phase of the menstrual cycle for the next one to six days (n = 231) were excluded from random assignment and received immediate surgical treatments. All patients began tamoxifen within 6 days of surgery and continued this for 5 years. Kaplan-Meier methods, the log-rank test, and multivariable Cox regression models were used to assess differences in five-year disease-free survival (DFS) between the groups. All statistical tests were two-sided. Results: The randomized midluteal phase surgery group had a five-year DFS of 64%, compared with 71% for the immediate surgery random assignment group (hazard ratio [HR] = 1.24, 95% confidence interval [CI] = 0.91 to 1.68, P = .18). Multivariable Cox regression models, which included important prognostic variables, gave similar results (aHR = 1.28, 95% CI = 0.94 to 1.76, P = .12). For overall survival, the univariate hazard ratio was 1.33 (95% CI = 0.94 to 1.89, P = .11) and the multivariable aHR was 1.43 (95% CI = 1.00 to 2.06, P = .05). Better DFS for follicular phase surgery, which was unanticipated, proved consistent across multiple exploratory analyses. Conclusions: The hypothesized benefit of adjuvant luteal phase oophorectomy was not shown in this large trial. PMID:25794890
NASA Astrophysics Data System (ADS)
Szymanowski, Mariusz; Kryza, Maciej
2017-02-01
Our study examines the role of auxiliary variables in the process of spatial modelling and mapping of climatological elements, with air temperature in Poland used as an example. The multivariable algorithms are the most frequently applied for spatialization of air temperature, and their results in many studies are proved to be better in comparison to those obtained by various one-dimensional techniques. In most of the previous studies, two main strategies were used to perform multidimensional spatial interpolation of air temperature. First, it was accepted that all variables significantly correlated with air temperature should be incorporated into the model. Second, it was assumed that the more spatial variation of air temperature was deterministically explained, the better was the quality of spatial interpolation. The main goal of the paper was to examine both above-mentioned assumptions. The analysis was performed using data from 250 meteorological stations and for 69 air temperature cases aggregated on different levels: from daily means to 10-year annual mean. Two cases were considered for detailed analysis. The set of potential auxiliary variables covered 11 environmental predictors of air temperature. Another purpose of the study was to compare the results of interpolation given by various multivariable methods using the same set of explanatory variables. Two regression models: multiple linear (MLR) and geographically weighted (GWR) method, as well as their extensions to the regression-kriging form, MLRK and GWRK, respectively, were examined. Stepwise regression was used to select variables for the individual models and the cross-validation method was used to validate the results with a special attention paid to statistically significant improvement of the model using the mean absolute error (MAE) criterion. The main results of this study led to rejection of both assumptions considered. Usually, including more than two or three of the most significantly correlated auxiliary variables does not improve the quality of the spatial model. The effects of introduction of certain variables into the model were not climatologically justified and were seen on maps as unexpected and undesired artefacts. The results confirm, in accordance with previous studies, that in the case of air temperature distribution, the spatial process is non-stationary; thus, the local GWR model performs better than the global MLR if they are specified using the same set of auxiliary variables. If only GWR residuals are autocorrelated, the geographically weighted regression-kriging (GWRK) model seems to be optimal for air temperature spatial interpolation.
Usage of multivariate geostatistics in interpolation processes for meteorological precipitation maps
NASA Astrophysics Data System (ADS)
Gundogdu, Ismail Bulent
2017-01-01
Long-term meteorological data are very important both for the evaluation of meteorological events and for the analysis of their effects on the environment. Prediction maps which are constructed by different interpolation techniques often provide explanatory information. Conventional techniques, such as surface spline fitting, global and local polynomial models, and inverse distance weighting may not be adequate. Multivariate geostatistical methods can be more significant, especially when studying secondary variables, because secondary variables might directly affect the precision of prediction. In this study, the mean annual and mean monthly precipitations from 1984 to 2014 for 268 meteorological stations in Turkey have been used to construct country-wide maps. Besides linear regression, the inverse square distance and ordinary co-Kriging (OCK) have been used and compared to each other. Also elevation, slope, and aspect data for each station have been taken into account as secondary variables, whose use has reduced errors by up to a factor of three. OCK gave the smallest errors (1.002 cm) when aspect was included.
Fallon, Susan A; Park, Ju Nyeong; Ogbue, Christine Powell; Flynn, Colin; German, Danielle
2017-05-01
This paper assessed characteristics associated with awareness of and willingness to take pre-exposure prophylaxis (PrEP) among Baltimore men who have sex with men (MSM). We used data from BESURE-MSM3, a venue-based cross-sectional HIV surveillance study conducted among MSM in 2011. Multivariate regression was used to identify characteristics associated with PrEP knowledge and acceptability among 399 participants. Eleven percent had heard of PrEP, 48% would be willing to use PrEP, and none had previously used it. In multivariable analysis, black race and perceived discrimination against those with HIV were significantly associated with decreased awareness, and those who perceived higher HIV discrimination reported higher acceptability of PrEP. Our findings indicate a need for further education about the potential utility of PrEP in addition to other prevention methods among MSM. HIV prevention efforts should address the link between discrimination and potential PrEP use, especially among men of color.
Influence factors and forecast of carbon emission in China: structure adjustment for emission peak
NASA Astrophysics Data System (ADS)
Wang, B.; Cui, C. Q.; Li, Z. P.
2018-02-01
This paper introduced Principal Component Analysis and Multivariate Linear Regression Model to verify long-term balance relationships between Carbon Emissions and the impact factors. The integrated model of improved PCA and multivariate regression analysis model is attainable to figure out the pattern of carbon emission sources. Main empirical results indicate that among all selected variables, the role of energy consumption scale was largest. GDP and Population follow and also have significant impacts on carbon emission. Industrialization rate and fossil fuel proportion, which is the indicator of reflecting the economic structure and energy structure, have a higher importance than the factor of urbanization rate and the dweller consumption level of urban areas. In this way, some suggestions are put forward for government to achieve the peak of carbon emissions.
Evaluation of functional outcome of the floating knee injury using multivariate analysis.
Yokoyama, Kazuhiko; Tsukamoto, Tatsuro; Aoki, Shinichi; Wakita, Ryuji; Uchino, Masataka; Noumi, Takashi; Fukushima, Nobuaki; Itoman, Moritoshi
2002-11-01
The objective of this study is to evaluate significant contributing factors affecting the functional prognosis of floating knee injuries using multivariate analysis. A total of 68 floating knee injuries (67 patients) were treated at Kitasato University Hospital from 1986 to 1999. Both the femoral fractures and the tibial fractures were managed surgically by various methods. The functional results of these injuries were evaluated using the grading system of Karlström and Olerud. Follow-up periods ranged from 2 to 19 years (mean 50.2 months) after the original injury. We defined satisfactory (S) outcomes as those cases with excellent or good results and unsatisfactory (US) outcomes as those cases with acceptable or poor results. Logistic regression analysis was used as a multivariate analysis, and the dependent variables were defined as a satisfactory outcome or as an unsatisfactory outcome. The explanatory variables were predicting factors influencing the functional outcome such as age at trauma, gender, severity of soft-tissue injury in the femur and the tibia, AO fracture grade in the femur and the tibia, Fraser type (type I or type II), Injury Severity Score (ISS), and fixation time after injury (less than 1 week or more than 1 week) in the femur and the tibia. The final functional results were as follows: 25 cases had excellent results, 15 cases good results, 16 cases acceptable results, and 12 cases poor results. The predictive logistic regression equation was as follows: Log 1-p/p = 3.12-1.52 x Fraser type - 1.65 x severity of soft-tissue injury in the tibia - 1.31 x fixation time after injury in the tibia - 0.821 x AO fracture grade in the tibia + 1.025 x fixation time after injury in the femur - 0.687 x AO fracture grade in the femur ( p=0.01). Among the variables, Fraser type and the severity of soft-tissue injury in the tibia were significantly related to the final result. The multivariate analysis showed that both the involvement of the knee joint and the severity grade of soft-tissue injury in the tibia represented significant risk factors of poor outcome in floating knee injuries in this study.
The impact of moderate wine consumption on the risk of developing prostate cancer
Ferro, Matteo; Foerster, Beat; Abufaraj, Mohammad; Briganti, Alberto; Karakiewicz, Pierre I; Shariat, Shahrokh F
2018-01-01
Objective To investigate the impact of moderate wine consumption on the risk of prostate cancer (PCa). We focused on the differential effect of moderate consumption of red versus white wine. Design This study was a meta-analysis that includes data from case–control and cohort studies. Materials and methods A systematic search of Web of Science, Medline/PubMed, and Cochrane library was performed on December 1, 2017. Studies were deemed eligible if they assessed the risk of PCa due to red, white, or any wine using multivariable logistic regression analysis. We performed a formal meta-analysis for the risk of PCa according to moderate wine and wine type consumption (white or red). Heterogeneity between studies was assessed using Cochrane’s Q test and I2 statistics. Publication bias was assessed using Egger’s regression test. Results A total of 930 abstracts and titles were initially identified. After removal of duplicates, reviews, and conference abstracts, 83 full-text original articles were screened. Seventeen studies (611,169 subjects) were included for final evaluation and fulfilled the inclusion criteria. In the case of moderate wine consumption: the pooled risk ratio (RR) for the risk of PCa was 0.98 (95% CI 0.92–1.05, p=0.57) in the multivariable analysis. Moderate white wine consumption increased the risk of PCa with a pooled RR of 1.26 (95% CI 1.10–1.43, p=0.001) in the multi-variable analysis. Meanwhile, moderate red wine consumption had a protective role reducing the risk by 12% (RR 0.88, 95% CI 0.78–0.999, p=0.047) in the multivariable analysis that comprised 222,447 subjects. Conclusions In this meta-analysis, moderate wine consumption did not impact the risk of PCa. Interestingly, regarding the type of wine, moderate consumption of white wine increased the risk of PCa, whereas moderate consumption of red wine had a protective effect. Further analyses are needed to assess the differential molecular effect of white and red wine conferring their impact on PCa risk. PMID:29713200
Analysis of Forest Foliage Using a Multivariate Mixture Model
NASA Technical Reports Server (NTRS)
Hlavka, C. A.; Peterson, David L.; Johnson, L. F.; Ganapol, B.
1997-01-01
Data with wet chemical measurements and near infrared spectra of ground leaf samples were analyzed to test a multivariate regression technique for estimating component spectra which is based on a linear mixture model for absorbance. The resulting unmixed spectra for carbohydrates, lignin, and protein resemble the spectra of extracted plant starches, cellulose, lignin, and protein. The unmixed protein spectrum has prominent absorption spectra at wavelengths which have been associated with nitrogen bonds.
Gucciardi, Enza; DeMelo, Margaret; Offenheim, Ana; Stewart, Donna E
2008-01-01
Background Diabetes self-management education is a critical component in diabetes care. Despite worldwide efforts to develop efficacious DSME programs, high attrition rates are often reported in clinical practice. The objective of this study was to examine factors that may contribute to attrition behavior in diabetes self-management programs. Methods We conducted telephone interviews with individuals who had Type 2 diabetes (n = 267) and attended a diabetes education centre. Multivariable logistic regression was performed to identify factors associated with attrition behavior. Forty-four percent of participants (n = 118) withdrew prematurely from the program and were asked an open-ended question regarding their discontinuation of services. We used content analysis to code and generate themes, which were then organized under the Behavioral Model of Health Service Utilization. Results Working full and part-time, being over 65 years of age, having a regular primary care physician or fewer diabetes symptoms were contributing factors to attrition behaviour in our multivariable logistic regression. The most common reasons given by participants for attrition from the program were conflict between their work schedules and the centre's hours of operation, patients' confidence in their own knowledge and ability when managing their diabetes, apathy towards diabetes education, distance to the centre, forgetfulness, regular physician consultation, low perceived seriousness of diabetes, and lack of familiarity with the centre and its services. There was considerable overlap between our quantitative and qualitative results. Conclusion Reducing attrition behaviour requires a range of strategies targeted towards delivering convenient and accessible services, familiarizing individuals with these services, increasing communication between centres and their patients, and creating better partnerships between centres and primary care physicians. PMID:18248673
Péron, Julien; Pond, Gregory R; Gan, Hui K; Chen, Eric X; Almufti, Roula; Maillet, Denis; You, Benoit
2012-07-03
The Consolidated Standards of Reporting Trials (CONSORT) guidelines were developed in the mid-1990s for the explicit purpose of improving clinical trial reporting. However, there is little information regarding the adherence to CONSORT guidelines of recent publications of randomized controlled trials (RCTs) in oncology. All phase III RCTs published between 2005 and 2009 were reviewed using an 18-point overall quality score for reporting based on the 2001 CONSORT statement. Multivariable linear regression was used to identify features associated with improved reporting quality. To provide baseline data for future evaluations of reporting quality, RCTs were also assessed according to the 2010 revised CONSORT statement. All statistical tests were two-sided. A total of 357 RCTs were reviewed. The mean 2001 overall quality score was 13.4 on a scale of 0-18, whereas the mean 2010 overall quality score was 19.3 on a scale of 0-27. The overall RCT reporting quality score improved by 0.21 points per year from 2005 to 2009. Poorly reported items included method used to generate the random allocation (adequately reported in 29% of trials), whether and how blinding was applied (41%), method of allocation concealment (51%), and participant flow (59%). High impact factor (IF, P = .003), recent publication date (P = .008), and geographic origin of RCTs (P = .003) were independent factors statistically significantly associated with higher reporting quality in a multivariable regression model. Sample size, tumor type, and positivity of trial results were not associated with higher reporting quality, whereas funding source and treatment type had a borderline statistically significant impact. The results show that numerous items remained unreported for many trials. Thus, given the potential impact of poorly reported trials, oncology journals should require even stricter adherence to the CONSORT guidelines.
Huang, X C; Guo, H X; Wu, Z H; Guo, C X; Wei, W J; Li, H C; Sun, Q; Zhang, C C; Li, Z Y; Chen, T; Zhong, Q; Zhou, L
2017-05-12
Objective: To understand the characteristics of Mycobacterium tuberculosis (MTB) in epidemiology and distribution from Guangdong Province, and to explore the risk factors associated with drug resistance. Methods: A total of 225 clinical strains of MTB collected from 5 drug resistance monitoring sites of Guangdong Province in 2015 were tested by Regions of Difference 105 (RD105) deletion test and 15 loci mycobacterial interspersed repetitive units (MIRU) were used for genotyping. Gene clustering was analyzed using BioNumerics7.6. Drug susceptibility test was tested by proportion method. The statistical analysis used chi-square test and multivariate logistic regression. Results: There were 158 (70.2%) Beijing family strains from the 225 cases. Hunter-gaston index of MIRU loci varied from each other. The MTBs from Guangdong Province were categorized into 2 gene clusters by clustering analysis in which the rate of cluster of complexⅠwas significantly higher than complexⅡ(χ(2) values were 9.331, P values were 0.020). It was found by multivariate logistic regression that Qub11b was associated with resistance to rifampicin and isoniazid ( P values were 0.013, 0.012 respectively.), ETR F with resistance to isoniazid, streptomycin, ethambutol and ofloxacin ( P values were 0.039, 0.040, 0.023 and 0.003 respectively), Mtub21 with resistance to capreomycin ( P values were 0.040), and QUB26 with resistance to ethionamide ( P values were 0.047). Conclusions: The genes of MTB from Guangdong Province were of polymorphisms and the distribution of strains were stable. QUB11b, ETR F, Mtub21 and QUB26 could be related to biomarkers for predicting drug resistance.
Rank estimation and the multivariate analysis of in vivo fast-scan cyclic voltammetric data
Keithley, Richard B.; Carelli, Regina M.; Wightman, R. Mark
2010-01-01
Principal component regression has been used in the past to separate current contributions from different neuromodulators measured with in vivo fast-scan cyclic voltammetry. Traditionally, a percent cumulative variance approach has been used to determine the rank of the training set voltammetric matrix during model development, however this approach suffers from several disadvantages including the use of arbitrary percentages and the requirement of extreme precision of training sets. Here we propose that Malinowski’s F-test, a method based on a statistical analysis of the variance contained within the training set, can be used to improve factor selection for the analysis of in vivo fast-scan cyclic voltammetric data. These two methods of rank estimation were compared at all steps in the calibration protocol including the number of principal components retained, overall noise levels, model validation as determined using a residual analysis procedure, and predicted concentration information. By analyzing 119 training sets from two different laboratories amassed over several years, we were able to gain insight into the heterogeneity of in vivo fast-scan cyclic voltammetric data and study how differences in factor selection propagate throughout the entire principal component regression analysis procedure. Visualizing cyclic voltammetric representations of the data contained in the retained and discarded principal components showed that using Malinowski’s F-test for rank estimation of in vivo training sets allowed for noise to be more accurately removed. Malinowski’s F-test also improved the robustness of our criterion for judging multivariate model validity, even though signal-to-noise ratios of the data varied. In addition, pH change was the majority noise carrier of in vivo training sets while dopamine prediction was more sensitive to noise. PMID:20527815
Reasons for Cigarillo Initiation and Cigarillo Manipulation Methods among Adolescents.
Kong, Grace; Bold, Krysten W; Simon, Patricia; Camenga, Deepa R; Cavallo, Dana A; Krishnan-Sarin, Suchitra
2017-04-01
To understand reasons for cigarillo initiation and cigarillo manipulation methods among adolescents. We conducted surveys in 8 Connecticut high schools to assess reasons for trying a cigarillo and cigarillo manipulation methods. We used multivariable logistic regressions to assess associations with demographics and tobacco use status. Among ever cigarillo users (N = 697, 33.6% girls, 16.7 years old [SD = 1.14], 62.1% White), top reasons for trying a cigarillo were curiosity (41.9%), appealing flavors (32.9%), because "friends use it" (25.3%), and low cost (22.4%). Overall, 40.3% of ever cigarillo users added marijuana (to create blunts) and 39.2% did not manipulate the product. Endorsement of these reasons for initiation and manipulation methods differed significantly across sex, age, SES and other tobacco use. Cigarillo regulations should include restricting all appealing flavors, increasing the cost, monitoring the restriction of sales of cigarillos to minors, and decreasing the appeal of cigarillo manipulation.
Katsarov, Plamen; Gergov, Georgi; Alin, Aylin; Pilicheva, Bissera; Al-Degs, Yahya; Simeonov, Vasil; Kassarova, Margarita
2018-03-01
The prediction power of partial least squares (PLS) and multivariate curve resolution-alternating least squares (MCR-ALS) methods have been studied for simultaneous quantitative analysis of the binary drug combination - doxylamine succinate and pyridoxine hydrochloride. Analysis of first-order UV overlapped spectra was performed using different PLS models - classical PLS1 and PLS2 as well as partial robust M-regression (PRM). These linear models were compared to MCR-ALS with equality and correlation constraints (MCR-ALS-CC). All techniques operated within the full spectral region and extracted maximum information for the drugs analysed. The developed chemometric methods were validated on external sample sets and were applied to the analyses of pharmaceutical formulations. The obtained statistical parameters were satisfactory for calibration and validation sets. All developed methods can be successfully applied for simultaneous spectrophotometric determination of doxylamine and pyridoxine both in laboratory-prepared mixtures and commercial dosage forms.
Ohno, Yoshiharu; Fujisawa, Yasuko; Takenaka, Daisuke; Kaminaga, Shigeo; Seki, Shinichiro; Sugihara, Naoki; Yoshikawa, Takeshi
2018-02-01
The objective of this study was to compare the capability of xenon-enhanced area-detector CT (ADCT) performed with a subtraction technique and coregistered 81m Kr-ventilation SPECT/CT for the assessment of pulmonary functional loss and disease severity in smokers. Forty-six consecutive smokers (32 men and 14 women; mean age, 67.0 years) underwent prospective unenhanced and xenon-enhanced ADCT, 81m Kr-ventilation SPECT/CT, and pulmonary function tests. Disease severity was evaluated according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) classification. CT-based functional lung volume (FLV), the percentage of wall area to total airway area (WA%), and ventilated FLV on xenon-enhanced ADCT and SPECT/CT were calculated for each smoker. All indexes were correlated with percentage of forced expiratory volume in 1 second (%FEV 1 ) using step-wise regression analyses, and univariate and multivariate logistic regression analyses were performed. In addition, the diagnostic accuracy of the proposed model was compared with that of each radiologic index by means of McNemar analysis. Multivariate logistic regression showed that %FEV 1 was significantly affected (r = 0.77, r 2 = 0.59) by two factors: the first factor, ventilated FLV on xenon-enhanced ADCT (p < 0.0001); and the second factor, WA% (p = 0.004). Univariate logistic regression analyses indicated that all indexes significantly affected GOLD classification (p < 0.05). Multivariate logistic regression analyses revealed that ventilated FLV on xenon-enhanced ADCT and CT-based FLV significantly influenced GOLD classification (p < 0.0001). The diagnostic accuracy of the proposed model was significantly higher than that of ventilated FLV on SPECT/CT (p = 0.03) and WA% (p = 0.008). Xenon-enhanced ADCT is more effective than 81m Kr-ventilation SPECT/CT for the assessment of pulmonary functional loss and disease severity.
Determining the response of sea level to atmospheric pressure forcing using TOPEX/POSEIDON data
NASA Technical Reports Server (NTRS)
Fu, Lee-Lueng; Pihos, Greg
1994-01-01
The static response of sea level to the forcing of atmospheric pressure, the so-called inverted barometer (IB) effect, is investigated using TOPEX/POSEIDON data. This response, characterized by the rise and fall of sea level to compensate for the change of atmospheric pressure at a rate of -1 cm/mbar, is not associated with any ocean currents and hence is normally treated as an error to be removed from sea level observation. Linear regression and spectral transfer function analyses are applied to sea level and pressure to examine the validity of the IB effect. In regions outside the tropics, the regression coefficient is found to be consistently close to the theoretical value except for the regions of western boundary currents, where the mesoscale variability interferes with the IB effect. The spectral transfer function shows near IB response at periods of 30 degrees is -0.84 +/- 0.29 cm/mbar (1 standard deviation). The deviation from = 1 cm /mbar is shown to be caused primarily by the effect of wind forcing on sea level, based on multivariate linear regression model involving both pressure and wind forcing. The regression coefficient for pressure resulting from the multivariate analysis is -0.96 +/- 0.32 cm/mbar. In the tropics the multivariate analysis fails because sea level in the tropics is primarily responding to remote wind forcing. However, after removing from the data the wind-forced sea level estimated by a dynamic model of the tropical Pacific, the pressure regression coefficient improves from -1.22 +/- 0.69 cm/mbar to -0.99 +/- 0.46 cm/mbar, clearly revealing an IB response. The result of the study suggests that with a proper removal of the effect of wind forcing the IB effect is valid in most of the open ocean at periods longer than 20 days and spatial scales larger than 500 km.
Glass, Lisa M; Dickson, Rolland C; Anderson, Joseph C; Suriawinata, Arief A; Putra, Juan; Berk, Brian S; Toor, Arifa
2015-04-01
Given the rising epidemics of obesity and metabolic syndrome, nonalcoholic steatohepatitis (NASH) is now the most common cause of liver disease in the developed world. Effective treatment for NASH, either to reverse or prevent the progression of hepatic fibrosis, is currently lacking. To define the predictors associated with improved hepatic fibrosis in NASH patients undergoing serial liver biopsies at prolonged biopsy interval. This is a cohort study of 45 NASH patients undergoing serial liver biopsies for clinical monitoring in a tertiary care setting. Biopsies were scored using the NASH Clinical Research Network guidelines. Fibrosis regression was defined as improvement in fibrosis score ≥1 stage. Univariate analysis utilized Fisher's exact or Student's t test. Multivariate regression models determined independent predictors for regression of fibrosis. Forty-five NASH patients with biopsies collected at a mean interval of 4.6 years (±1.4) were included. The mean initial fibrosis stage was 1.96, two patients had cirrhosis and 12 patients (26.7 %) underwent bariatric surgery. There was a significantly higher rate of fibrosis regression among patients who lost ≥10 % total body weight (TBW) (63.2 vs. 9.1 %; p = 0.001) and who underwent bariatric surgery (47.4 vs. 4.5 %; p = 0.003). Factors such as age, gender, glucose intolerance, elevated ferritin, and A1AT heterozygosity did not influence fibrosis regression. On multivariate analysis, only weight loss of ≥10 % TBW predicted fibrosis regression [OR 8.14 (CI 1.08-61.17)]. Results indicate that regression of fibrosis in NASH is possible, even in advanced stages. Weight loss of ≥10 % TBW predicts fibrosis regression.
Loss to follow-up in the Australian HIV Observational Database
McManus, Hamish; Petoumenos, Kathy; Brown, Katherine; Baker, David; Russell, Darren; Read, Tim; Smith, Don; Wray, Lynne; Giles, Michelle; Hoy, Jennifer; Carr, Andrew; Law, Matthew
2015-01-01
Background Loss to follow-up (LTFU) in HIV-positive cohorts is an important surrogate for interrupted clinical care which can potentially influence the assessment of HIV disease status and outcomes. After preliminary evaluation of LTFU rates and patient characteristics, we evaluated the risk of mortality by LTFU status in a high resource setting. Methods Rates of LTFU were measured in the Australian HIV Observational Database for a range of patient characteristics. Multivariate repeated measures regression methods were used to identify determinants of LTFU. Mortality by LTFU status was ascertained using linkage to the National Death Index. Survival following combination antiretroviral therapy initiation was investigated using the Kaplan-Meier (KM) method and Cox proportional hazards models. Results Of 3,413 patients included in this analysis, 1,632 (47.8%) had at least one episode of LTFU after enrolment. Multivariate predictors of LTFU included viral load (VL)>10,000 copies/ml (Rate ratio (RR) 1.63 (95% confidence interval (CI):1.45–1.84) (ref ≤400)), time under follow-up (per year) (RR 1.03 (95% CI: 1.02–1.04)) and prior LTFU (per episode) (RR 1.15 (95% CI: 1.06–1.24)). KM curves for survival were similar by LTFU status (p=0.484). LTFU was not associated with mortality in Cox proportional hazards models (univariate hazard ratio (HR) 0.93 (95% CI: 0.69–1.26) and multivariate HR 1.04 (95% CI: 0.77–1.43)). Conclusions Increased risk of LTFU was identified amongst patients with potentially higher infectiousness. We did not find significant mortality risk associated with LTFU. This is consistent with timely re-engagement with treatment, possibly via high levels of unreported linkage to other health care providers. PMID:25377928
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China
Pei, Ling-Ling; Li, Qin
2018-01-01
The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China’s pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N)) model based on the nonlinear least square (NLS) method. The Gauss–Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N) and the NLS-based TNGM (1, N) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO2 and dust, alongside GDP per capita in China during the period 1996–2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N) model presents greater precision when forecasting WDPC, SO2 emissions and dust emissions per capita, compared to the traditional GM (1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO2 and dust reduce accordingly. PMID:29517985
Ferreira, Vicente; Herrero, Paula; Zapata, Julián; Escudero, Ana
2015-08-14
SPME is extremely sensitive to experimental parameters affecting liquid-gas and gas-solid distribution coefficients. Our aims were to measure the weights of these factors and to design a multivariate strategy based on the addition of a pool of internal standards, to minimize matrix effects. Synthetic but real-like wines containing selected analytes and variable amounts of ethanol, non-volatile constituents and major volatile compounds were prepared following a factorial design. The ANOVA study revealed that even using a strong matrix dilution, matrix effects are important and additive with non-significant interaction effects and that it is the presence of major volatile constituents the most dominant factor. A single internal standard provided a robust calibration for 15 out of 47 analytes. Then, two different multivariate calibration strategies based on Partial Least Square Regression were run in order to build calibration functions based on 13 different internal standards able to cope with matrix effects. The first one is based in the calculation of Multivariate Internal Standards (MIS), linear combinations of the normalized signals of the 13 internal standards, which provide the expected area of a given unit of analyte present in each sample. The second strategy is a direct calibration relating concentration to the 13 relative areas measured in each sample for each analyte. Overall, 47 different compounds can be reliably quantified in a single fully automated method with overall uncertainties better than 15%. Copyright © 2015 Elsevier B.V. All rights reserved.
Tourkmani, Abdo Karim; Sánchez-Huerta, Valeria; De Wit, Guillermo; Martínez, Jaime D.; Mingo, David; Mahillo-Fernández, Ignacio; Jiménez-Alfaro, Ignacio
2017-01-01
AIM To analyze the relationship between the score obtained in the Risk Score System (RSS) proposed by Hicks et al with penetrating keratoplasty (PKP) graft failure at 1y postoperatively and among each factor in the RSS with the risk of PKP graft failure using univariate and multivariate analysis. METHODS The retrospective cohort study had 152 PKPs from 152 patients. Eighteen cases were excluded from our study due to primary failure (10 cases), incomplete medical notes (5 cases) and follow-up less than 1y (3 cases). We included 134 PKPs from 134 patients stratified by preoperative risk score. Spearman coefficient was calculated for the relationship between the score obtained and risk of failure at 1y. Univariate and multivariate analysis were calculated for the impact of every single risk factor included in the RSS over graft failure at 1y. RESULTS Spearman coefficient showed statistically significant correlation between the score in the RSS and graft failure (P<0.05). Multivariate logistic regression analysis showed no statistically significant relationship (P>0.05) between diagnosis and lens status with graft failure. The relationship between the other risk factors studied and graft failure was significant (P<0.05), although the results for previous grafts and graft failure was unreliable. None of our patients had previous blood transfusion, thus, it had no impact. CONCLUSION After the application of multivariate analysis techniques, some risk factors do not show the expected impact over graft failure at 1y. PMID:28393027
Rosswog, Carolina; Schmidt, Rene; Oberthuer, André; Juraeva, Dilafruz; Brors, Benedikt; Engesser, Anne; Kahlert, Yvonne; Volland, Ruth; Bartenhagen, Christoph; Simon, Thorsten; Berthold, Frank; Hero, Barbara; Faldum, Andreas; Fischer, Matthias
2017-12-01
Current risk stratification systems for neuroblastoma patients consider clinical, histopathological, and genetic variables, and additional prognostic markers have been proposed in recent years. We here sought to select highly informative covariates in a multistep strategy based on consecutive Cox regression models, resulting in a risk score that integrates hazard ratios of prognostic variables. A cohort of 695 neuroblastoma patients was divided into a discovery set (n=75) for multigene predictor generation, a training set (n=411) for risk score development, and a validation set (n=209). Relevant prognostic variables were identified by stepwise multivariable L1-penalized least absolute shrinkage and selection operator (LASSO) Cox regression, followed by backward selection in multivariable Cox regression, and then integrated into a novel risk score. The variables stage, age, MYCN status, and two multigene predictors, NB-th24 and NB-th44, were selected as independent prognostic markers by LASSO Cox regression analysis. Following backward selection, only the multigene predictors were retained in the final model. Integration of these classifiers in a risk scoring system distinguished three patient subgroups that differed substantially in their outcome. The scoring system discriminated patients with diverging outcome in the validation cohort (5-year event-free survival, 84.9±3.4 vs 63.6±14.5 vs 31.0±5.4; P<.001), and its prognostic value was validated by multivariable analysis. We here propose a translational strategy for developing risk assessment systems based on hazard ratios of relevant prognostic variables. Our final neuroblastoma risk score comprised two multigene predictors only, supporting the notion that molecular properties of the tumor cells strongly impact clinical courses of neuroblastoma patients. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Estimating Soil Cation Exchange Capacity from Soil Physical and Chemical Properties
NASA Astrophysics Data System (ADS)
Bateni, S. M.; Emamgholizadeh, S.; Shahsavani, D.
2014-12-01
The soil Cation Exchange Capacity (CEC) is an important soil characteristic that has many applications in soil science and environmental studies. For example, CEC influences soil fertility by controlling the exchange of ions in the soil. Measurement of CEC is costly and difficult. Consequently, several studies attempted to obtain CEC from readily measurable soil physical and chemical properties such as soil pH, organic matter, soil texture, bulk density, and particle size distribution. These studies have often used multiple regression or artificial neural network models. Regression-based models cannot capture the intricate relationship between CEC and soil physical and chemical attributes and provide inaccurate CEC estimates. Although neural network models perform better than regression methods, they act like a black-box and cannot generate an explicit expression for retrieval of CEC from soil properties. In a departure with regression and neural network models, this study uses Genetic Expression Programming (GEP) and Multivariate Adaptive Regression Splines (MARS) to estimate CEC from easily measurable soil variables such as clay, pH, and OM. CEC estimates from GEP and MARS are compared with measurements at two field sites in Iran. Results show that GEP and MARS can estimate CEC accurately. Also, the MARS model performs slightly better than GEP. Finally, a sensitivity test indicates that organic matter and pH have respectively the least and the most significant impact on CEC.
Quantitative nuclear magnetic resonance for additives determination in an electrolytic nickel bath.
Ostra, Miren; Ubide, Carlos; Vidal, Maider
2011-02-01
The use of proton nuclear magnetic resonance (¹H-NMR) for the quantitation of additives in a commercial electrolytic nickel bath (Supreme Plus Brilliant, Atotech formulation) is reported. A simple and quick method is described that needs only the separation of nickel ions by precipitation with NaOH. The four additives in the bath (A-5(2X), leveler; Supreme Plus Brightener (SPB); SA-1, leveler; NPA, wetting agent; all of them are commercial names from Atotech) can be quantified, whereas no other analytical methods have been found in the literature for SA-1 and NPA. Two calibration methods have been tried: integration of NMR signals with the use of a proper internal standard and partial least squares regression applied to the characteristic NMR peaks. The multivariate method was preferred because of accuracy and precision. Multivariate limits of detection of about 4 mL L⁻¹ A-5(2X), 0.4 mL L⁻¹ SPB, 0.2 mL L⁻¹ SA-1 and 0.6 mL L⁻¹ NPA were found. The dynamic ranges are suitable to follow the concentration of additives in the bath along electrodeposition. ¹H-NMR spectra provide evidence for SPB and SA-1 consumption (A-5(2X) and NPA keep unchanged along the process) and the growth of some products from SA-1 degradation can be followed. The method can, probably, be extended to other electrolytic nickel baths.
Gavrilyuk, Oxana; Braaten, Tonje; Skeie, Guri; Weiderpass, Elisabete; Dumeaux, Vanessa; Lund, Eiliv
2014-03-25
Coffee and its compounds have been proposed to inhibit endometrial carcinogenesis. Studies in the Norwegian population can be especially interesting due to the high coffee consumption and increasing incidence of endometrial cancer in the country. A total of 97 926 postmenopausal Norwegian women from the population-based prospective Norwegian Women and Cancer (NOWAC) Study, were included in the present analysis. We evaluated the general association between total coffee consumption and endometrial cancer risk as well as the possible impact of brewing method. Multivariate Cox regression analysis was used to estimate risks, and heterogeneity tests were performed to compare brewing methods. During an average of 10.9 years of follow-up, 462 incident endometrial cancer cases were identified. After multivariate adjustment, significant risk reduction was found among participants who drank ≥8 cups/day of coffee with a hazard ratio of 0.52 (95% confidence interval, CI 0.34-0.79). However, we did not observe a significant dose-response relationship. No significant heterogeneity in risk was found when comparing filtered and boiled coffee brewing methods. A reduction in endometrial cancer risk was observed in subgroup analyses among participants who drank ≥8 cups/day and had a body mass index ≥25 kg/m2, and in current smokers. These data suggest that in this population with high coffee consumption, endometrial cancer risk decreases in women consuming ≥8 cups/day, independent of brewing method.
ERIC Educational Resources Information Center
Pecorella, Patricia A.; Bowers, David G.
Multiple regression in a double cross-validated design was used to predict two performance measures (total variable expense and absence rate) by multi-month period in five industrial firms. The regressions do cross-validate, and produce multiple coefficients which display both concurrent and predictive effects, peaking 18 months to two years…