Sample records for years multivariate regression

  1. Error Covariance Penalized Regression: A novel multivariate model combining penalized regression with multivariate error structure.

    PubMed

    Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C

    2018-06-29

    A new multivariate regression model, named Error Covariance Penalized Regression (ECPR), is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid- and near-infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
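
    The abstract does not give the ECPR objective function. As a minimal sketch, assuming the penalty takes the generalized-ridge form b^T Σ b with Σ the error covariance matrix, the estimator has a closed form. The function name, the simulated data, and the value of the penalty weight below are all hypothetical and are not taken from the paper.

```python
import numpy as np

def ecm_penalized_fit(X, y, ecm, lam):
    """Solve min_b ||y - X b||^2 + lam * b^T ecm b (assumed objective)."""
    # Normal equations of the penalized problem: (X^T X + lam * ecm) b = X^T y
    return np.linalg.solve(X.T @ X + lam * ecm, X.T @ y)

rng = np.random.default_rng(0)
n_samples, n_channels = 40, 6
b_true = rng.normal(size=n_channels)
X_clean = rng.normal(size=(n_samples, n_channels))
y = X_clean @ b_true

# Hypothetical non-iid measurement noise: correlated across channels
A = rng.normal(size=(n_channels, n_channels))
ecm = A @ A.T / n_channels  # symmetric, positive semi-definite by construction
noise = rng.multivariate_normal(np.zeros(n_channels), 0.05 * ecm, size=n_samples)
X_noisy = X_clean + noise

b_ecm = ecm_penalized_fit(X_noisy, y, ecm, lam=1.0)
# Ordinary ridge regression is the special case ecm = identity
b_ridge = ecm_penalized_fit(X_noisy, y, np.eye(n_channels), lam=1.0)
```

    Under this assumption, ridge regression corresponds to an identity penalty matrix, which is one way to read the abstract's contrast between iid and non-iid error conditions.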

  2. Retro-regression--another important multivariate regression improvement.

    PubMed

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.

  3. Multivariate Regression Analysis and Slaughter Livestock

    DTIC Science & Technology

    (AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY

  4. Alternatives for using multivariate regression to adjust prospective payment rates

    PubMed Central

    Sheingold, Steven H.

    1990-01-01

    Multivariate regression analysis has been used in structuring three of the adjustments to Medicare's prospective payment rates. Because the indirect-teaching adjustment, the disproportionate-share adjustment, and the adjustment for large cities are responsible for distributing approximately $3 billion in payments each year, the specification of regression models for these adjustments is of critical importance. In this article, the application of regression for adjusting Medicare's prospective rates is discussed, and the implications that differing specifications could have for these adjustments are demonstrated. PMID:10113271

  5. Regression Models For Multivariate Count Data.

    PubMed

    Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

    2017-01-01

    Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.

  6. Regression Models For Multivariate Count Data

    PubMed Central

    Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

    2016-01-01

    Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data. PMID:28348500

  7. Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery.

    PubMed

    Liu, Han; Wang, Lie; Zhao, Tuo

    2015-08-01

    We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence O(1/ϵ), where ϵ is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.

  8. Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew; Park, Trevor

    2017-01-01

    A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…

  9. Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

    NASA Astrophysics Data System (ADS)

    Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

    2016-03-01

    Estimating human affective states from direct observations and facial, vocal, gestural, physiological, and central nervous signals through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks has been proposed in the past decade. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures from the International Affective Picture System to induce affective states in thirty subjects, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for the estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states.
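
    The abstract does not specify the polynomial order or the fitting procedure. The following is a minimal sketch of higher-order multivariable polynomial regression in general, assuming an ordinary least-squares fit on all monomials up to a chosen degree. The function names and the synthetic two-variable data are hypothetical and unrelated to the skin conductance measurements in the study.

```python
import numpy as np
from itertools import combinations_with_replacement

def polynomial_design_matrix(X, degree):
    """Columns are all monomials of the input variables up to `degree`, plus a constant."""
    n_samples, n_vars = X.shape
    columns = [np.ones(n_samples)]
    for d in range(1, degree + 1):
        # Each index tuple (i, j, ...) stands for the monomial x_i * x_j * ...
        for idx in combinations_with_replacement(range(n_vars), d):
            columns.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(columns)

def fit_polynomial(X, y, degree):
    """Least-squares coefficients for the polynomial model."""
    Phi = polynomial_design_matrix(X, degree)
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return coef

def predict_polynomial(X, coef, degree):
    return polynomial_design_matrix(X, degree) @ coef

# Synthetic, noise-free target with an interaction term and a cubic term
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 0] * X[:, 1] + 0.5 * X[:, 1] ** 3

coef = fit_polynomial(X, y, degree=3)
y_hat = predict_polynomial(X, coef, degree=3)
r = np.corrcoef(y, y_hat)[0, 1]  # correlation between observed and fitted values
```

    The model stays linear in its coefficients, so it keeps the simplicity of linear regression while capturing nonlinear and interaction effects of the inputs.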

  10. Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

    PubMed Central

    Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

    2016-01-01

    Estimating human affective states from direct observations and facial, vocal, gestural, physiological, and central nervous signals through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks has been proposed in the past decade. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures from the International Affective Picture System to induce affective states in thirty subjects, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for the estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states. PMID:26996254

  11. SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.

    PubMed

    Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman

    2017-03-01

    We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).

  12. Enhanced ID Pit Sizing Using Multivariate Regression Algorithm

    NASA Astrophysics Data System (ADS)

    Krzywosz, Kenji

    2007-03-01

    EPRI is funding a program to enhance and improve the reliability of inside diameter (ID) pit sizing for balance-of-plant heat exchangers, such as condensers and component cooling water heat exchangers. More traditional approaches to ID pit sizing involve the use of frequency-specific amplitude or phase angles. The enhanced multivariate regression algorithm for ID pit depth sizing incorporates three simultaneous input parameters of frequency, amplitude, and phase angle. A set of calibration data sets consisting of machined pits of various rounded and elongated shapes and depths was acquired in the frequency range of 100 kHz to 1 MHz for stainless steel tubing having a nominal wall thickness of 0.028 inch. To add noise to the acquired data set, each test sample was rotated and test data acquired at 3, 6, 9, and 12 o'clock positions. The ID pit depths were estimated using second-order and fourth-order regression functions by relying on normalized amplitude and phase angle information from multiple frequencies. Due to the unique damage morphology associated with the microbiologically-influenced ID pits, it was necessary to modify the elongated calibration standard-based algorithms by relying on the algorithm developed solely from the destructive sectioning results. This paper presents the use of a transformed multivariate regression algorithm to estimate ID pit depths and compares the results with the traditional univariate phase angle analysis. Both estimates were then compared with the destructive sectioning results.

  13. Multivariate decoding of brain images using ordinal regression.

    PubMed

    Doyle, O M; Ashburner, J; Zelaya, F O; Williams, S C R; Mehta, M A; Marquand, A F

    2013-11-01

    Neuroimaging data are increasingly being used to predict potential outcomes or groupings, such as clinical severity, drug dose response, and transitional illness states. In these examples, the variable (target) we want to predict is ordinal in nature. Conventional classification schemes assume that the targets are nominal and hence ignore their ranked nature, whereas parametric and/or non-parametric regression models enforce a metric notion of distance between classes. Here, we propose a novel, alternative multivariate approach that overcomes these limitations - whole brain probabilistic ordinal regression using a Gaussian process framework. We applied this technique to two data sets of pharmacological neuroimaging data from healthy volunteers. The first study was designed to investigate the effect of ketamine on brain activity and its subsequent modulation with two compounds - lamotrigine and risperidone. The second study investigates the effect of scopolamine on cerebral blood flow and its modulation using donepezil. We compared ordinal regression to multi-class classification schemes and metric regression. Considering the modulation of ketamine with lamotrigine, we found that ordinal regression significantly outperformed multi-class classification and metric regression in terms of accuracy and mean absolute error. However, for risperidone ordinal regression significantly outperformed metric regression but performed similarly to multi-class classification both in terms of accuracy and mean absolute error. For the scopolamine data set, ordinal regression was found to outperform both multi-class and metric regression techniques considering the regional cerebral blood flow in the anterior cingulate cortex. Ordinal regression was thus the only method that performed well in all cases. Our results indicate the potential of an ordinal regression approach for neuroimaging data while providing a fully probabilistic framework with elegant approaches for model selection

  14. Predictive and mechanistic multivariate linear regression models for reaction development

    PubMed Central

    Santiago, Celine B.; Guo, Jing-Yao

    2018-01-01

    Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711

  15. A refined method for multivariate meta-analysis and meta-regression

    PubMed Central

    Jackson, Daniel; Riley, Richard D

    2014-01-01

    Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects’ standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:23996351
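
    The abstract does not name the scaling factor it applies. One well-known univariate refinement of this kind is the Hartung-Knapp (Sidik-Jonkman) adjustment; the sketch below assumes that adjustment together with a DerSimonian-Laird estimate of the between-study variance. Treating it as the authors' exact method is an assumption, and the five effect estimates and variances are made up for illustration.

```python
import numpy as np

def random_effects_summary(y, v):
    """y: study effect estimates; v: their within-study variances."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    k = len(y)
    # DerSimonian-Laird moment estimate of the between-study variance tau^2
    w_fixed = 1.0 / v
    mu_fixed = np.sum(w_fixed * y) / np.sum(w_fixed)
    Q = np.sum(w_fixed * (y - mu_fixed) ** 2)
    c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
    tau2 = max(0.0, (Q - (k - 1)) / c)
    # Random-effects pooled estimate and its conventional standard error
    w = 1.0 / (v + tau2)
    mu = np.sum(w * y) / np.sum(w)
    se_conventional = np.sqrt(1.0 / np.sum(w))
    # Hartung-Knapp scaling factor applied to the conventional variance
    scale = np.sum(w * (y - mu) ** 2) / (k - 1)
    se_refined = np.sqrt(scale) * se_conventional
    return mu, tau2, se_conventional, se_refined, scale

# Hypothetical data for five small studies
y = [0.10, 0.35, -0.05, 0.42, 0.20]
v = [0.04, 0.02, 0.05, 0.03, 0.06]
mu, tau2, se_conv, se_ref, scale = random_effects_summary(y, v)
```

    With this adjustment, the confidence interval is usually built from a t distribution with k - 1 degrees of freedom instead of the normal distribution, which widens it when the number of studies is small.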

  16. A refined method for multivariate meta-analysis and meta-regression.

    PubMed

    Jackson, Daniel; Riley, Richard D

    2014-02-20

    Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.

  17. PM10 modeling in the Oviedo urban area (Northern Spain) by using multivariate adaptive regression splines

    NASA Astrophysics Data System (ADS)

    Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza

    2014-10-01

    The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of
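
    MARS builds its model from hinge functions of the form max(0, x - t) and max(0, t - x), where t is a knot chosen from the data. The sketch below is not a full MARS implementation: it assumes a single input variable and performs one greedy step, picking the knot whose mirrored hinge pair gives the lowest squared error. The function names and the synthetic data are hypothetical and have no connection to the Oviedo air quality dataset.

```python
import numpy as np

def hinge_pair_basis(x, knot):
    """Constant column plus the mirrored hinge pair for one knot."""
    return np.column_stack([
        np.ones_like(x),
        np.maximum(0.0, x - knot),   # active to the right of the knot
        np.maximum(0.0, knot - x),   # active to the left of the knot
    ])

def fit_single_hinge_pair(x, y, candidate_knots):
    """Try each candidate knot and keep the one with the smallest squared error."""
    best = None
    for t in candidate_knots:
        B = hinge_pair_basis(x, t)
        coef, *_ = np.linalg.lstsq(B, y, rcond=None)
        sse = float(np.sum((y - B @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, t, coef)
    return best

# Synthetic piecewise-linear target with a kink at x = 0.5
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, size=200)
y = 2.0 + 3.0 * np.maximum(0.0, x - 0.5) - 1.0 * np.maximum(0.0, 0.5 - x)

sse, knot, coef = fit_single_hinge_pair(x, y, np.linspace(0.1, 0.9, 9))
```

    A complete MARS fit repeats this search over all variables and knots in a forward pass, allows products of hinge functions, and then prunes terms in a backward pass. The fitted model remains an explicit sum of hinge terms, which is the interpretability the abstract refers to.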

  18. Multivariate adaptive regression splines analysis to predict biomarkers of spontaneous preterm birth.

    PubMed

    Menon, Ramkumar; Bhat, Geeta; Saade, George R; Spratt, Heidi

    2014-04-01

    To develop classification models of demographic/clinical factors and biomarker data from spontaneous preterm birth in African Americans and Caucasians. Secondary analysis of biomarker data using multivariate adaptive regression splines (MARS), a supervised machine learning algorithm method. Analysis of data on 36 biomarkers from 191 women was reduced by MARS to develop predictive models for preterm birth in African Americans and Caucasians. Maternal plasma, cord plasma collected at admission for preterm or term labor and amniotic fluid at delivery. Data were partitioned into training and testing sets. Variable importance, a relative indicator (0-100%) and area under the receiver operating characteristic curve (AUC) characterized results. Multivariate adaptive regression splines generated models for combined and racially stratified biomarker data. Clinical and demographic data did not contribute to the model. Racial stratification of data produced distinct models in all three compartments. In African Americans maternal plasma samples IL-1RA, TNF-α, angiopoietin 2, TNFRI, IL-5, MIP1α, IL-1β and TGF-α modeled preterm birth (AUC train: 0.98, AUC test: 0.86). In Caucasians TNFR1, ICAM-1 and IL-1RA contributed to the model (AUC train: 0.84, AUC test: 0.68). African Americans cord plasma samples produced IL-12P70, IL-8 (AUC train: 0.82, AUC test: 0.66). Cord plasma in Caucasians modeled IGFII, PDGFBB, TGF-β1, IL-12P70, and TIMP1 (AUC train: 0.99, AUC test: 0.82). Amniotic fluid in African Americans modeled FasL, TNFRII, RANTES, KGF, IGFI (AUC train: 0.95, AUC test: 0.89) and in Caucasians, TNF-α, MCP3, TGF-β3, TNFR1 and angiopoietin 2 (AUC train: 0.94, AUC test: 0.79). Multivariate adaptive regression splines models multiple biomarkers associated with preterm birth and demonstrated racial disparity. © 2014 Nordic Federation of Societies of Obstetrics and Gynecology.

  19. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression

    USDA-ARS's Scientific Manuscript database

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...

  20. Is ovarian hyperstimulation associated with higher blood pressure in 4-year-old IVF offspring? Part I: multivariable regression analysis.

    PubMed

    Seggers, Jorien; Haadsma, Maaike L; La Bastide-Van Gemert, Sacha; Heineman, Maas Jan; Middelburg, Karin J; Roseboom, Tessa J; Schendelaar, Pamela; Van den Heuvel, Edwin R; Hadders-Algra, Mijna

    2014-03-01

    Does ovarian hyperstimulation, the in vitro procedure, or a combination of these two negatively influence blood pressure (BP) and anthropometrics of 4-year-old children born following IVF? Higher systolic blood pressure (SBP) percentiles were found in 4-year-old children born following conventional IVF with ovarian hyperstimulation compared with children born following IVF without ovarian hyperstimulation. Increasing evidence suggests that IVF, which has an increased incidence of preterm birth and low birthweight, is associated with higher BP and altered body fat distribution in offspring but the underlying mechanisms are largely unknown. We performed a prospective, assessor-blinded follow-up study in which 194 children were assessed. The attrition rate up until the 4-year-old assessment was 10%. We measured BP and anthropometrics of 4-year-old singletons born following conventional IVF with controlled ovarian hyperstimulation (COH-IVF, n = 63), or born following modified natural cycle IVF (MNC-IVF, n = 52), or born to subfertile couples who conceived naturally (Sub-NC, n = 79). Both IVF and ICSI were performed. Primary outcome measures were the SBP percentiles and diastolic BP (DBP) percentiles. Anthropometric measures included triceps and subscapular skinfold thickness. Several multivariable regression analyses were applied in order to correct for subsets of confounders. The value 'B' is the unstandardized regression coefficient. SBP percentiles were significantly lower in the MNC-IVF group (mean 59, SD 24) than in the COH-IVF (mean 68, SD 22) and Sub-NC groups (mean 70, SD 16). The difference in SBP between COH-IVF and MNC-IVF remained significant after correction for current, early life and parental characteristics (B: 14.09; 95% confidence interval (CI): 5.39-22.79), whereas the difference between MNC-IVF and Sub-NC did not. DBP percentiles did not differ between groups. After correction for early life factors, subscapular skinfold thickness was thicker in the

  1. Local polynomial estimation of heteroscedasticity in a multivariate linear regression model and its applications in economics.

    PubMed

    Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan

    2012-01-01

    Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
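
    The two-stage idea in this abstract can be sketched as follows: fit ordinary least squares, smooth the squared residuals with a local polynomial to estimate the variance function, then refit with weights equal to the inverse estimated variances. This is a minimal sketch under simplifying assumptions that are not taken from the paper: the variance depends on a single covariate, the local polynomial has degree one, the kernel is Gaussian, and the bandwidth is fixed by hand. All names and the simulated data are hypothetical.

```python
import numpy as np

def local_linear_smooth(x, z, x_eval, bandwidth):
    """Local linear estimate of E[z | x] at each point of x_eval (Gaussian kernel)."""
    fitted = np.empty(len(x_eval))
    for i, x0 in enumerate(x_eval):
        k = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
        D = np.column_stack([np.ones_like(x), x - x0])
        # Kernel-weighted least squares; the intercept is the fitted value at x0
        beta = np.linalg.solve(D.T @ (k[:, None] * D), D.T @ (k * z))
        fitted[i] = beta[0]
    return fitted

def two_stage_gls(X, y, x_var, bandwidth=0.3, floor=1e-6):
    # Stage 1: ordinary least squares and squared residuals
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid2 = (y - X @ beta_ols) ** 2
    # Nonparametric variance estimate; the floor guards against non-positive values
    sigma2_hat = np.maximum(local_linear_smooth(x_var, resid2, x_var, bandwidth), floor)
    # Stage 2: generalized (here weighted) least squares with weights 1 / sigma^2
    w = 1.0 / sigma2_hat
    beta_gls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta_ols, beta_gls, sigma2_hat

# Simulated data whose noise standard deviation grows with x
rng = np.random.default_rng(3)
n = 2000
x = rng.uniform(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + (0.2 + x) * rng.normal(size=n)

beta_ols, beta_gls, sigma2_hat = two_stage_gls(X, y, x_var=x)
```

    No parametric form of the variance function is supplied anywhere in the sketch, which mirrors the abstract's point that the form of the heteroscedastic function does not need to be known.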

  2. Linear regression analysis and its application to multivariate chromatographic calibration for the quantitative analysis of two-component mixtures.

    PubMed

    Dinç, Erdal; Ozdemir, Abdil

    2005-01-01

    A multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures of enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of the multivariate chromatographic calibration technique is based on linear regression equations constructed from the relationship between concentration and peak area at a set of five wavelengths. The algorithm of this mathematically simple calibration model is briefly described. This approach is a powerful mathematical tool for optimum chromatographic multivariate calibration and for eliminating fluctuations arising from instrumental and experimental conditions. The multivariate chromatographic calibration involves the reduction of multivariate linear regression functions to a univariate data set. The model was validated by analyzing various synthetic binary mixtures and by using the standard addition technique. The developed calibration technique was applied to the analysis of real pharmaceutical tablets containing EA and HCT. The results were compared with those obtained by the classical HPLC method. The proposed multivariate chromatographic calibration was observed to give better results than classical HPLC.

  3. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    NASA Astrophysics Data System (ADS)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by a Data Acquisition and Evaluation System. The authors coupled the collected TBM drive data with available information on rock mass properties; the data were cleansed, completed with secondary variables, and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were developed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable, and the data were screened with different computational approaches, allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and response variables, to search for the best subsets of explanatory variables, and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than the constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.

  4. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    PubMed

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  5. Multivariate regression model for predicting lumber grade volumes of northern red oak sawlogs

    Treesearch

    Daniel A. Yaussy; Robert L. Brisbin

    1983-01-01

    A multivariate regression model was developed to predict green board-foot yields for the seven common factory lumber grades processed from northern red oak (Quercus rubra L.) factory grade logs. The model uses the standard log measurements of grade, scaling diameter, length, and percent defect. It was validated with an independent data set. The model...

  6. Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs

    Treesearch

    Andrew F. Howard; Daniel A. Yaussy

    1986-01-01

    A multivariate regression model was developed to predict green board-foot yields for the common grades of factory lumber processed from yellow birch factory-grade logs. The model incorporates the standard log measurements of scaling diameter, length, proportion of scalable defects, and the assigned USDA Forest Service log grade. Differences in yields between band and...

  7. Electricity Consumption in the Industrial Sector of Jordan: Application of Multivariate Linear Regression and Adaptive Neuro-Fuzzy Techniques

    NASA Astrophysics Data System (ADS)

    Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.

    2009-08-01

    In this study, two techniques for modeling electricity consumption of the Jordanian industrial sector are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as a function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables, with a significant effect on future electrical power demand. The results showed that the multivariate linear regression and neuro-fuzzy models are generally comparable and can both be used adequately to simulate industrial electricity consumption. However, a comparison based on the root mean squared error of the data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.

  8. Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

    NASA Astrophysics Data System (ADS)

    Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

    2018-05-01

    The Bayesian method can be used to estimate the parameters of a multivariate multiple regression model. It involves two distributions, the prior and the posterior. The posterior distribution is influenced by the choice of prior distribution. Jeffreys’ prior is a kind of non-informative prior distribution, used when information about the parameter is not available. The non-informative Jeffreys’ prior is combined with the sample information to give the posterior distribution, which is used to estimate the parameters. The purpose of this research is to estimate the parameters of a multivariate regression model using the Bayesian method with the non-informative Jeffreys’ prior distribution. Based on the results and discussion, the estimates of β and Σ are obtained from the expected values of their marginal posterior distributions. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart, respectively. However, calculating the expected values involves integrals that are difficult to evaluate. Therefore, an approximation is needed, generating random samples according to the posterior distribution of each parameter using the Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.

  9. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert M.

    2013-01-01

    A new regression model search algorithm was developed that may be applied to both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The algorithm is a simplified version of a more complex algorithm that was originally developed for the NASA Ames Balance Calibration Laboratory. The new algorithm performs regression model term reduction to prevent overfitting of data. It has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a regression model search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression model. Therefore, the simplified algorithm is not intended to replace the original algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new search algorithm.

  10. The PIT-trap-A "model-free" bootstrap procedure for inference about regression models with discrete, multivariate responses.

    PubMed

    Warton, David I; Thibaut, Loïc; Wang, Yi Alice

    2017-01-01

    Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)-common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
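
    The resampling idea in this abstract can be illustrated with a minimal sketch. It assumes Poisson marginals with already-fitted means, which is only one possible choice of the parametric form F; the function names (pit_residual, pit_trap_resample) and the data layout (one row per site) are hypothetical and are not taken from the paper. Each count is mapped to a randomized PIT residual, whole rows of residuals are resampled so that within-row correlation is carried along, and the resampled residuals are mapped back to counts through the fitted marginals.

```python
import math
import random

def poisson_cdf(y, mu):
    """P(Y <= y) for Y ~ Poisson(mu); returns 0.0 when y < 0."""
    if y < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for k in range(1, y + 1):
        term *= mu / k
        total += term
    return min(total, 1.0)

def pit_residual(y, mu, rng):
    """Randomized PIT residual: a draw between F(y - 1) and F(y)."""
    lower = poisson_cdf(y - 1, mu)
    upper = poisson_cdf(y, mu)
    return lower + rng.random() * (upper - lower)

def poisson_quantile(u, mu):
    """Smallest count y whose CDF value is at least u."""
    y = 0
    term = math.exp(-mu)
    total = term
    while total < u and y < 10_000:  # cap guards against rounding near u = 1
        y += 1
        term *= mu / y
        total += term
    return y

def pit_trap_resample(counts, means, rng):
    """One bootstrap data set: resample rows of PIT residuals, then invert
    them through each row's own fitted marginal means (assumed known)."""
    residuals = [[pit_residual(y, m, rng) for y, m in zip(row_y, row_m)]
                 for row_y, row_m in zip(counts, means)]
    n = len(residuals)
    picked = [residuals[rng.randrange(n)] for _ in range(n)]
    return [[poisson_quantile(u, m) for u, m in zip(row_u, row_m)]
            for row_u, row_m in zip(picked, means)]

# Hypothetical abundance data: 3 sites (rows) by 2 species (columns).
counts = [[0, 3], [2, 5], [1, 0]]
means = [[0.8, 2.5], [1.9, 4.2], [1.1, 0.7]]
bootstrap_counts = pit_trap_resample(counts, means, random.Random(1))
```

    Resampling whole rows rather than individual residuals is what lets the between-species correlation survive without being modelled explicitly.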

  11. [Multivariate Adaptive Regression Splines (MARS), an alternative for the analysis of time series].

    PubMed

    Vanegas, Jairo; Vásquez, Fabián

    Multivariate Adaptive Regression Splines (MARS) is a non-parametric modelling method that extends the linear model, incorporating nonlinearities and interactions between variables. It is a flexible tool that automates the construction of predictive models: selecting relevant variables, transforming the predictor variables, processing missing values and preventing overshooting using a self-test. It is also able to predict, taking into account structural factors that might influence the outcome variable, thereby generating hypothetical models. The end result could identify relevant cut-off points in data series. It is rarely used in health, so it is proposed as a tool for the evaluation of relevant public health indicators. For demonstrative purposes, data series regarding the mortality of children under 5 years of age in Costa Rica were used, comprising the period 1978-2008. Copyright © 2016 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
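
    As a rough illustration of how MARS introduces nonlinearities and cut-off points, the sketch below evaluates a piecewise-linear model built from hinge basis functions. It is not the MARS fitting algorithm (there is no forward or backward term selection); the knot year, the coefficients and the names hinge and mars_predict are hypothetical and are not derived from the Costa Rican series.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot) for direction=1,
    max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))

def mars_predict(x, intercept, terms):
    """Evaluate intercept + sum of coefficient * hinge(x, knot, direction).
    Each entry of `terms` is a (coefficient, knot, direction) tuple."""
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)

# Hypothetical rate series with one cut-off point at 1990: the value falls
# by 2 units per year before the knot and by 0.5 units per year after it.
terms = [(2.0, 1990, -1), (-0.5, 1990, 1)]
fitted = [mars_predict(year, 10.0, terms) for year in (1985, 1990, 1995)]
```

    The knot is where the slope changes, which is the sense in which a fitted MARS model can point to a cut-off in a data series.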

  12. Non-proportional odds multivariate logistic regression of ordinal family data.

    PubMed

    Zaloumis, Sophie G; Scurrah, Katrina J; Harrap, Stephen B; Ellis, Justine A; Gurrin, Lyle C

    2015-03-01

    Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non-proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non-proportional odds multivariate logistic regression model and take a simulation-based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

    NASA Technical Reports Server (NTRS)

    MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

    2005-01-01

    Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.

  14. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred

    2013-01-01

    A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.

  15. Selecting minimum dataset soil variables using PLSR as a regressive multivariate method

    NASA Astrophysics Data System (ADS)

    Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.

    2017-04-01

    Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used, however supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP

  16. The PIT-trap—A “model-free” bootstrap procedure for inference about regression models with discrete, multivariate responses

    PubMed Central

    Thibaut, Loïc; Wang, Yi Alice

    2017-01-01

    Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)—common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of “model-free bootstrap”, adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods. PMID:28738071

  17. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia.

    PubMed

    Henrard, S; Speybroeck, N; Hermans, C

    2015-11-01

    Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
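
    The repeated partitioning described here can be sketched for a single step of a regression tree. The example assumes the splitting criterion is the within-group sum of squared errors, a common choice for continuous outcomes but not one stated in the abstract; the function names and the toy data are hypothetical. A full tree would apply best_split recursively to each resulting subgroup.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, total_sse) for the single split on x that minimizes
    the summed SSE of y in the two subgroups; threshold is None when no
    split improves on the unsplit sample."""
    pairs = sorted(zip(x, y))
    best = (None, sse(list(y)))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best

# Hypothetical predictor and continuous outcome with an obvious break.
x = [1, 2, 3, 10, 11, 12]
y = [5, 6, 5, 20, 21, 19]
threshold, total_sse = best_split(x, y)
```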

  18. Reporting quality of multivariable logistic regression in selected Indian medical journals.

    PubMed

    Kumar, R; Indrayan, A; Chhabra, P

    2012-01-01

    Use of multivariable logistic regression (MLR) modeling has steeply increased in the medical literature over the past few years. Testing of model assumptions and adequate reporting of MLR allow the reader to interpret results more accurately. To review the fulfillment of assumptions and reporting quality of MLR in selected Indian medical journals using established criteria. Analysis of published literature. Medknow.com publishes 68 Indian medical journals with open access. Eight of these journals had at least five articles using MLR between the years 1994 to 2008. Articles from each of these journals were evaluated according to the previously established 10-point quality criteria for reporting and to test the MLR model assumptions. SPSS 17 software and non-parametric test (Kruskal-Wallis H, Mann Whitney U, Spearman Correlation). One hundred and nine articles were finally found using MLR for analyzing the data in the selected eight journals. The number of such articles gradually increased after year 2003, but quality score remained almost similar over time. P value, odds ratio, and 95% confidence interval for coefficients in MLR was reported in 75.2% and sufficient cases (>10) per covariate of limiting sample size were reported in the 58.7% of the articles. No article reported the test for conformity of linear gradient for continuous covariates. Total score was not significantly different across the journals. However, involvement of statistician or epidemiologist as a co-author improved the average quality score significantly (P=0.014). Reporting of MLR in many Indian journals is incomplete. Only one article managed to score 8 out of 10 among 109 articles under review. All others scored less. Appropriate guidelines in instructions to authors, and pre-publication review of articles using MLR by a qualified statistician may improve quality of reporting.

  19. Structural brain connectivity and cognitive ability differences: A multivariate distance matrix regression analysis.

    PubMed

    Ponsoda, Vicente; Martínez, Kenia; Pineda-Pardo, José A; Abad, Francisco J; Olea, Julio; Román, Francisco J; Barbey, Aron K; Colom, Roberto

    2017-02-01

    Neuroimaging research involves analyses of huge amounts of biological data that might or might not be related to cognition. This relationship is usually approached using univariate methods, and, therefore, correction methods are mandatory for reducing false positives. Nevertheless, the probability of false negatives is also increased. Multivariate frameworks have been proposed for helping to alleviate this balance. Here we apply multivariate distance matrix regression for the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals regarding their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges better predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that a widespread but limited number of regions in the human brain supports high-level cognitive ability differences. Hum Brain Mapp 38:803-816, 2017. © 2016 Wiley Periodicals, Inc.

  20. Multivariate functional response regression, with application to fluorescence spectroscopy in a cervical pre-cancer study.

    PubMed

    Zhu, Hongxiao; Morris, Jeffrey S; Wei, Fengrong; Cox, Dennis D

    2017-07-01

    Many scientific studies measure different types of high-dimensional signals or images from the same subject, producing multivariate functional data. These functional measurements carry different types of information about the scientific process, and a joint analysis that integrates information across them may provide new insights into the underlying mechanism for the phenomenon under study. Motivated by fluorescence spectroscopy data in a cervical pre-cancer study, a multivariate functional response regression model is proposed, which treats multivariate functional observations as responses and a common set of covariates as predictors. This novel modeling framework simultaneously accounts for correlations between functional variables and potential multi-level structures in data that are induced by experimental design. The model is fitted by performing a two-stage linear transformation-a basis expansion to each functional variable followed by principal component analysis for the concatenated basis coefficients. This transformation effectively reduces the intra-and inter-function correlations and facilitates fast and convenient calculation. A fully Bayesian approach is adopted to sample the model parameters in the transformed space, and posterior inference is performed after inverse-transforming the regression coefficients back to the original data domain. The proposed approach produces functional tests that flag local regions on the functional effects, while controlling the overall experiment-wise error rate or false discovery rate. It also enables functional discriminant analysis through posterior predictive calculation. Analysis of the fluorescence spectroscopy data reveals local regions with differential expressions across the pre-cancer and normal samples. These regions may serve as biomarkers for prognosis and disease assessment.

  1. G/SPLINES: A hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's genetic algorithm

    NASA Technical Reports Server (NTRS)

    Rogers, David

    1991-01-01

    G/SPLINES are a hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's Genetic Algorithm. In this hybrid, the incremental search is replaced by a genetic search. The G/SPLINE algorithm exhibits performance comparable to that of the MARS algorithm, requires fewer least squares computations, and allows significantly larger problems to be considered.

  2. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression.

    PubMed

    Delwiche, Stephen R; Reeves, James B

    2010-01-01

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various
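
    For orientation, the smallest filter in the range mentioned above (5 points, quadratic polynomial) can be written out explicitly. The sketch uses the standard 5-point Savitzky-Golay smoothing and first-derivative coefficients for unit sample spacing; the helper name apply_filter and the test signal are hypothetical, and the edge points are simply dropped rather than padded.

```python
# Standard 5-point Savitzky-Golay convolution weights for a quadratic fit.
SMOOTH_5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
DERIV1_5 = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]

def apply_filter(signal, coeffs):
    """Slide a centrally symmetric window over the signal and return the
    weighted sums for interior points only (len(signal) - len(coeffs) + 1)."""
    half = len(coeffs) // 2
    return [
        sum(c * signal[i + j - half] for j, c in enumerate(coeffs))
        for i in range(half, len(signal) - half)
    ]

# A quadratic test signal y = x^2 sampled at x = 0..6 (unit spacing).
signal = [float(x * x) for x in range(7)]
smoothed = apply_filter(signal, SMOOTH_5)    # values at x = 2, 3, 4
derivative = apply_filter(signal, DERIV1_5)  # slopes at x = 2, 3, 4
```

    Because the weights come from a local quadratic fit, a purely quadratic signal passes through the smoother unchanged and its first derivative is recovered exactly; real spectra are altered, which is the effect the graphical method is meant to expose.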

  3. Empirical Bayes approach to the estimation of "unsafety": the multivariate regression method.

    PubMed

    Hauer, E

    1992-10-01

    There are two kinds of clues to the unsafety of an entity: its traits (such as traffic, geometry, age, or gender) and its historical accident record. The Empirical Bayes approach to unsafety estimation makes use of both kinds of clues. It requires information about the mean and the variance of the unsafety in a "reference population" of similar entities. The method now in use for this purpose suffers from several shortcomings. First, a very large reference population is required. Second, the choice of reference population is to some extent arbitrary. Third, entities in the reference population usually cannot match the traits of the entity the unsafety of which is estimated. To alleviate these shortcomings the multivariate regression method for estimating the mean and variance of unsafety in reference populations is offered. Its logical foundations are described and its soundness is demonstrated. The use of the multivariate method makes the Empirical Bayes approach to unsafety estimation applicable to a wider range of circumstances and yields better estimates of unsafety. The application of the method to the tasks of identifying deviant entities and of estimating the effect of interventions on unsafety are discussed and illustrated by numerical examples.

  4. Remote-sensing data processing with the multivariate regression analysis method for iron mineral resource potential mapping: a case study in the Sarvian area, central Iran

    NASA Astrophysics Data System (ADS)

    Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran

    2018-03-01

    This paper uses multivariate regression analysis as a mineral prospectivity mapping (MPM) method to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran. The main aim is to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the probability value (p value), coefficient of determination (R²) and adjusted coefficient of determination (R²adj), the second regression model (which consisted of multiple UIVs) fitted better than the other models. The accuracy of the model was confirmed by the iron outcrops map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).

  5. A matrix-based method of moments for fitting the multivariate random effects model for meta-analysis and meta-regression

    PubMed Central

    Jackson, Dan; White, Ian R; Riley, Richard D

    2013-01-01

    Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
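
    The univariate special case named in this abstract, the DerSimonian and Laird method of moments, is short enough to sketch. The code below covers only that one-dimensional estimator, not the matrix-based multivariate method the paper proposes; the function name and the example inputs are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Univariate DerSimonian-Laird estimator.

    effects:   study effect estimates y_i
    variances: their within-study variances s_i^2
    Returns (tau2, pooled), the between-study variance estimate (truncated
    at zero) and the random-effects pooled estimate. Needs at least two
    studies.
    """
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are required")
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q statistic around the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return tau2, pooled

# Hypothetical two-study example with equal within-study variances.
tau2, pooled = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
```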

  6. Vertebral artery injury associated with blunt cervical spine trauma: a multivariate regression analysis.

    PubMed

    Lebl, Darren R; Bono, Christopher M; Velmahos, George; Metkar, Umesh; Nguyen, Joseph; Harris, Mitchel B

    2013-07-15

    Retrospective analysis of prospective registry data. To determine the patient characteristics, risk factors, and fracture patterns associated with vertebral artery injury (VAI) in patients with blunt cervical spine injury. VAI associated with cervical spine trauma has the potential for catastrophic clinical sequelae. The patterns of cervical spine injury and patient characteristics associated with VAI remain to be determined. A retrospective review of prospectively collected data from the American College of Surgeons trauma registries at 3 level-1 trauma centers identified all patients with a cervical spine injury on multidetector computed tomographic scan during a 3-year period (January 1, 2007, to January 1, 2010). Fracture pattern and patient characteristics were recorded. Logistic multivariate regression analysis of independent predictors for VAI and subgroup analysis of neurological events related to VAI were performed. Twenty-one percent of 1204 patients with cervical injuries (n = 253) underwent screening for VAI by multidetector computed tomography angiogram. VAI was diagnosed in 17% (42 of 253), unilateral in 15% (38 of 253), and bilateral in 1.6% (4 of 253) and was associated with a lower Glasgow coma scale (P < 0.001), a higher injury severity score (P < 0.01), and a higher mortality (P < 0.001). VAI was associated with ankylosing spondylitis/diffuse idiopathic skeletal hyperostosis (crude odds ratio [OR] = 8.04; 95% confidence interval [CI], 1.30-49.68; P = 0.034) and occipitocervical dissociation (P < 0.001) by univariate analysis, and with fracture displacement into the transverse foramen of 1 mm or more (adjusted OR = 3.29; 95% CI, 1.15-9.41; P = 0.026) and basilar skull fracture (adjusted OR = 4.25; 95% CI, 1.25-14.47; P = 0.021) by multivariate regression model. In subgroup analysis, neurological events secondary to VAI occurred in 14% (6 of 42) and the stroke-related mortality rate was 4.8% (2 of 42). Neurological events were associated with male sex (P

  7. Multivariate random-parameters zero-inflated negative binomial regression model: an application to estimate crash frequencies at intersections.

    PubMed

    Dong, Chunjiao; Clarke, David B; Yan, Xuedong; Khattak, Asad; Huang, Baoshan

    2014-09-01

    Crash data are collected through police reports and integrated with road inventory data for further analysis. Integrated police reports and inventory data yield correlated multivariate data for roadway entities (e.g., segments or intersections). Analysis of such data reveals important relationships that can help focus on high-risk situations and develop safety countermeasures. To understand relationships between crash frequencies and associated variables, while taking full advantage of the available data, multivariate random-parameters models are appropriate since they can simultaneously consider the correlation among the specific crash types and account for unobserved heterogeneity. However, a key issue that arises with correlated multivariate data is that the number of crash-free samples increases, as crash counts have many categories. In this paper, we describe a multivariate random-parameters zero-inflated negative binomial (MRZINB) regression model for jointly modeling crash counts. The full Bayesian method is employed to estimate the model parameters. Crash frequencies at urban signalized intersections in Tennessee are analyzed. The paper investigates the performance of MZINB and MRZINB regression models in establishing the relationship between crash frequencies, pavement conditions, traffic factors, and geometric design features of roadway intersections. Compared to the MZINB model, the MRZINB model identifies additional statistically significant factors and provides better goodness of fit in developing the relationships. The empirical results show that the MRZINB model possesses most of the desirable statistical properties in terms of its ability to accommodate unobserved heterogeneity and excess zero counts in correlated data. Notably, in the random-parameters MZINB model, the estimated parameters vary significantly across intersections for different crash types. Copyright © 2014 Elsevier Ltd. All rights reserved.
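
    The excess-zero idea behind this record can be illustrated with the univariate building block only. The following is a minimal sketch, assuming the common mean/dispersion (NB2) parameterization with variance mu + alpha * mu^2; the abstract does not give the paper's parameterization, and the function names and parameter values are illustrative. It does not implement the multivariate or random-parameters parts of the model.

```python
import math


def nb_pmf(k, mu, alpha):
    """Negative binomial probability of count k (assumed NB2 form).

    mu    : mean of the count distribution (must be > 0)
    alpha : dispersion (must be > 0), so that variance = mu + alpha * mu**2
    """
    r = 1.0 / alpha
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k, mu, alpha, pi_zero):
    """Zero-inflated negative binomial probability of count k.

    With probability pi_zero the entity is a structural zero (for example,
    a crash-free site); otherwise the count follows the negative binomial.
    """
    base = (1.0 - pi_zero) * nb_pmf(k, mu, alpha)
    return pi_zero + base if k == 0 else base


# Hypothetical parameter values, chosen only for illustration.
example_probabilities = [zinb_pmf(k, mu=2.0, alpha=0.5, pi_zero=0.3) for k in range(5)]
```

    Working on the log scale with `math.lgamma` avoids overflow in the gamma-function ratio for larger counts.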

  8. Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

    PubMed

    Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

    2017-01-01

    The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM with two integrative genomics analysis examples, a breast cancer dataset and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.

  9. Multivariate generalized hidden Markov regression models with random covariates: Physical exercise in an elderly population.

    PubMed

    Punzo, Antonio; Ingrassia, Salvatore; Maruotti, Antonello

    2018-04-22

    A time-varying latent variable model is proposed to jointly analyze multivariate mixed-support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which is the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. HMRMFCs are inadequate for applications in which a clustering structure can be identified in the distribution of the covariates, as the clustering is independent from the covariates distribution. Here, hidden Markov regression models with random covariates are introduced by explicitly specifying state-specific distributions for the covariates, with the aim of improving the recovering of the clusters in the data with respect to a fixed covariates paradigm. The hidden Markov regression models with random covariates class is defined focusing on the exponential family, in a generalized linear model framework. Model identifiability conditions are sketched, an expectation-maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through simulation experiments and compared with those of HMRMFCs. The method is applied to physical activity data. Copyright © 2018 John Wiley & Sons, Ltd.

  10. Quality Reporting of Multivariable Regression Models in Observational Studies: Review of a Representative Sample of Articles Published in Biomedical Journals.

    PubMed

    Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M

    2016-05-01

    Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. Review of a representative sample of articles indexed in MEDLINE (n = 428) with observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimate, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature.

  11. Multivariate logistic regression analysis of postoperative complications and risk model establishment of gastrectomy for gastric cancer: A single-center cohort report.

    PubMed

    Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing

    2016-01-01

    Reporting of surgical complications is common, but few reports provide information about the severity of complications or estimate their risk factors, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors related to postoperative complications according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis; the 11 significant variables that entered the multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation, p = exp(ΣBiXi) / (1 + exp(ΣBiXi)), a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: p = 1 / (1 + e^(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model to predict the postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
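
    The predictive equation quoted in this abstract can be evaluated directly. The following is a minimal sketch that uses only the coefficients printed above. The abstract does not say which risk factor corresponds to which of X1 to X11 or how each is coded, so the 0/1 coding in the example is an assumption and the function name is illustrative.

```python
import math

# Constant and coefficients exactly as printed in the abstract's equation:
# p = 1 / (1 + e^(4.810 - 1.287*X1 - ... - 0.138*X11))
CONSTANT = 4.810
COEFFICIENTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                0.316, 0.305, 0.278, 0.255, 0.138]


def predicted_risk(x):
    """Return the predicted probability p for the 11 predictor values in x."""
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 11 predictor values")
    exponent = CONSTANT - sum(b * xi for b, xi in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(exponent))


# Assumed 0/1 coding (hypothetical): no risk factor present vs. only X1 present.
baseline = predicted_risk([0] * 11)
with_x1 = predicted_risk([1] + [0] * 10)
```

    Because every coefficient enters the exponent with a negative sign, increasing any predictor raises the predicted probability.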

  12. Simple linear and multivariate regression models.

    PubMed

    Rodríguez del Águila, M M; Benítez-Parejo, N

    2011-01-01

    In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.

  13. Real estate value prediction using multivariate regression models

    NASA Astrophysics Data System (ADS)

    Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav

    2017-11-01

    The real estate market is one of the most competitive in terms of pricing, and prices vary significantly with many factors, which makes it a prime field for applying machine learning to predict prices with high accuracy. In this paper, we present various important features to use when predicting housing prices with good accuracy. We describe regression models that use various features to achieve a lower residual sum of squares (RSS) error. Using features in a regression model requires some feature engineering for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. The paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
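
    The combination of polynomial features and ridge regression described in this abstract can be sketched in a few lines. This is a minimal illustration, assuming the standard ridge objective (RSS plus lam times the squared norm of the weights) solved through the normal equations without an intercept. The function names and the toy numbers are made up and are not taken from the paper.

```python
def poly_features(x, degree):
    """Return [x, x**2, ..., x**degree] for one scalar input."""
    return [x ** d for d in range(1, degree + 1)]


def solve_linear(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def ridge_fit(rows, y, lam):
    """Weights minimising RSS + lam * ||w||^2, i.e. (X'X + lam I) w = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


# Toy data (hypothetical): the target is exactly linear in x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
rows = [poly_features(x, 3) for x in xs]
targets = [2.0 * x for x in xs]
w_unpenalised = ridge_fit(rows, targets, 0.0)
w_ridge = ridge_fit(rows, targets, 10.0)
```

    Setting `lam` to 0 recovers ordinary least squares; larger values shrink the weight vector, which is the mechanism the abstract relies on to limit overfitting.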

  14. Extending multivariate distance matrix regression with an effect size measure and the asymptotic null distribution of the test statistic.

    PubMed

    McArtor, Daniel B; Lubke, Gitta H; Bergeman, C S

    2017-12-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains.

  15. Determination of boiling point of petrochemicals by gas chromatography-mass spectrometry and multivariate regression analysis of structural activity relationship.

    PubMed

    Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A

    2014-08-01

    Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analytes BP in D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4%, and 10.8% respectively. In contrast, the overall RMS%RE of 32.9% and 40.4%, respectively obtained for BP determination in D3710 and MA VHP using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can be potentially used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. Inference for multivariate regression model based on multiply imputed synthetic data generated via posterior predictive sampling

    NASA Astrophysics Data System (ADS)

    Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.

    2017-06-01

    The recent popularity of the use of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods of generating and analyzing such data, but almost always relying on asymptotic distributions and, in consequence, not being adequate for small-sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based on exact distributions, this procedure may even be used with small-sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiter's combination rule to multiply imputed synthetic datasets, and an application to the 2000 Current Population Survey is discussed.

  17. [Multivariate ordinal logistic regression analysis on the association between consumption of fried food and both esophageal cancer and precancerous lesions].

    PubMed

    Guo, L W; Liu, S Z; Zhang, M; Chen, Q; Zhang, S K; Sun, X B

    2017-12-10

    Objective: To investigate the effect of fried food intake on the pathogenesis of esophageal cancer and precancerous lesions. Methods: From 2005 to 2013, all residents aged 40-69 years from 11 counties (cities) in rural areas of Henan province where screening for upper gastrointestinal cancer had been conducted were recruited as the subjects of the study. Information on demography and lifestyle was collected. The residents under study were screened with iodine staining endoscopic examination, and biopsy samples were diagnosed pathologically under standardized criteria. Subjects with high risk were divided into groups based on their different pathological degrees. Multivariate ordinal logistic regression analysis was used to analyze the relationship between the frequency of fried food intake and esophageal cancer and precancerous lesions. Results: A total of 8,792 cases with normal esophagus, 3,680 with mild hyperplasia, 972 with moderate hyperplasia, 413 with severe hyperplasia carcinoma in situ, and 336 cases of esophageal cancer were recruited. Results from multivariate logistic regression analysis showed that, when compared with those who did not eat fried food, the intake of fried food (<2 times/week: OR = 1.60, 95% CI: 1.40-1.83; ≥2 times/week: OR = 2.58, 95% CI: 1.98-3.37) appeared to be a risk factor for both esophageal cancer and precancerous lesions after adjustment for age, sex, marital status, educational level, body mass index, smoking and alcohol intake. Conclusion: The intake of fried food appeared to be a risk factor for both esophageal cancer and precancerous lesions.

  18. EXTENDING MULTIVARIATE DISTANCE MATRIX REGRESSION WITH AN EFFECT SIZE MEASURE AND THE ASYMPTOTIC NULL DISTRIBUTION OF THE TEST STATISTIC

    PubMed Central

    McArtor, Daniel B.; Lubke, Gitta H.; Bergeman, C. S.

    2017-01-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains. PMID:27738957

  19. On the degrees of freedom of reduced-rank estimators in multivariate regression

    PubMed Central

    Mukherjee, A.; Chen, K.; Wang, N.; Zhu, J.

    2015-01-01

    Summary: We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example. PMID:26702155

  20. Multivariate research in areas of phosphorus cast-iron brake shoes manufacturing using the statistical analysis and the multiple regression equations

    NASA Astrophysics Data System (ADS)

    Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.

    2017-05-01

    The braking system is one of the most important and complex subsystems of railway vehicles, especially with regard to safety. Therefore, installing efficient, safe brakes on modern railway vehicles is essential. Attention is now devoted to solving problems connected with the use of high-performance brake materials and their impact on the thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, because the interaction between the brake block and the wheel produces complex thermo-mechanical phenomena. In this work, the investigated subjects are the cast-iron brake shoes, which are still widely used on freight wagons. Therefore, the cast-iron brake shoes, with lamellar graphite and a high phosphorus content (0.8-1.1%), need a special investigation. In order to establish the optimal condition for the cast-iron brake shoes, we propose a mathematical modelling study using statistical analysis and multiple regression equations. Multivariate research is important in areas of cast-iron brake shoes manufacturing, because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulties in comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. In order to establish the multiple correlation between the hardness of the cast-iron brake shoes and the elements of their chemical composition, several types of regression equation models have been proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representations of the regression equations in order to explain the interaction of the variables and locate the optimal level of each variable for

  1. A generalized multivariate regression model for modelling ocean wave heights

    NASA Astrophysics Data System (ADS)

    Wang, X. L.; Feng, Y.; Swail, V. R.

    2012-04-01

    In this study, a generalized multivariate linear regression model is developed to represent the relationship between 6-hourly ocean significant wave heights (Hs) and the corresponding 6-hourly mean sea level pressure (MSLP) fields. The model is calibrated using the ERA-Interim reanalysis of Hs and MSLP fields for 1981-2000, and is validated using the ERA-Interim reanalysis for 2001-2010 and ERA40 reanalysis of Hs and MSLP for 1958-2001. The performance of the fitted model is evaluated in terms of Pierce skill score, frequency bias index, and correlation skill score. Being not normally distributed, wave heights are subjected to a data adaptive Box-Cox transformation before being used in the model fitting. Also, since 6-hourly data are being modelled, lag-1 autocorrelation must be and is accounted for. The models with and without Box-Cox transformation, and with and without accounting for autocorrelation, are inter-compared in terms of their prediction skills. The fitted MSLP-Hs relationship is then used to reconstruct historical wave height climate from the 6-hourly MSLP fields taken from the Twentieth Century Reanalysis (20CR, Compo et al. 2011), and to project possible future wave height climates using CMIP5 model simulations of MSLP fields. The reconstructed and projected wave heights, both seasonal means and maxima, are subject to a trend analysis that allows for non-linear (polynomial) trends.
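
    The Box-Cox step mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming the standard one-parameter Box-Cox transform and a grid search over the profile log-likelihood as one possible "data adaptive" way of choosing lambda. The abstract does not say how lambda was selected, and the function names are illustrative.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive data")
    if abs(lam) < 1e-12:
        return math.log(y)  # limiting case lam -> 0
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if abs(lam) < 1e-12:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


def profile_loglik(ys, lam):
    """Profile log-likelihood of lam under a normal model for the transformed data.

    Constant terms are dropped. The data must not all be identical,
    otherwise the variance is zero and the logarithm is undefined.
    """
    zs = [box_cox(y, lam) for y in ys]
    n = len(zs)
    mean = sum(zs) / n
    var = sum((z - mean) ** 2 for z in zs) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(y) for y in ys)


def choose_lambda(ys, grid):
    """Pick the grid value of lam with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(ys, lam))
```

    A model fitted on the transformed scale would use `inverse_box_cox` to express its predictions as wave heights again.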

  2. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  3. Trochanteric entry femoral nails yield better femoral version and lower revision rates-A large cohort multivariate regression analysis.

    PubMed

    Yoon, Richard S; Gage, Mark J; Galos, David K; Donegan, Derek J; Liporace, Frank A

    2017-06-01

    Intramedullary nailing (IMN) has become the standard of care for the treatment of most femoral shaft fractures. Different IMN options include trochanteric and piriformis entry as well as retrograde nails, which may result in varying degrees of femoral rotation. The objective of this study was to analyze postoperative femoral version between three types of nails and to delineate any significant differences in femoral version (DFV) and revision rates. Over a 10-year period, 417 patients underwent IMN of a diaphyseal femur fracture (AO/OTA 32A-C). Of these patients, 316 met inclusion criteria and obtained postoperative computed tomography (CT) scanograms to calculate femoral version and were thus included in the study. In this study, our main outcome measure was the difference in femoral version (DFV) between the uninjured limb and the injured limb. The effect of the following variables on DFV and revision rates were determined via univariate, multivariate, and ordinal regression analyses: gender, age, BMI, ethnicity, mechanism of injury, operative side, open fracture, and table type/position. Statistical significance was set at p<0.05. A total of 316 patients were included. Piriformis entry nails made up the majority (n=141), followed by retrograde (n=108), then trochanteric entry nails (n=67). Univariate regression analysis revealed that a lower BMI was significantly associated with a lower DFV (p=0.006). Controlling for possible covariables, multivariate analysis yielded a significantly lower DFV for trochanteric entry nails than piriformis or retrograde nails (7.9±6.10 vs. 9.5±7.4 vs. 9.4±7.8°, p<0.05). Using revision as an endpoint, trochanteric entry nails also had a significantly lower revision rate, even when controlling for all other variables (p<0.05). Comparative, objective comparisons between DFV between different nails based on entry point revealed that trochanteric nails had a significantly lower DFV and a lower revision rate, even after regression

  4. Integrating Growth Variability of the Ilium, Fifth Lumbar Vertebra, and Clavicle with Multivariate Adaptive Regression Splines Models for Subadult Age Estimation.

    PubMed

    Corron, Louise; Marchal, François; Condemi, Silvana; Telmon, Norbert; Chaumoitre, Kathia; Adalian, Pascal

    2018-05-31

    Subadult age estimation should rely on sampling and statistical protocols capturing development variability for more accurate age estimates. In this perspective, measurements were taken on the fifth lumbar vertebrae and/or clavicles of 534 French males and females aged 0-19 years and the ilia of 244 males and females aged 0-12 years. These variables were fitted in nonparametric multivariate adaptive regression splines (MARS) models with 95% prediction intervals (PIs) of age. The models were tested on two independent samples from Marseille and the Luis Lopes reference collection from Lisbon. Models using ilium width and module, maximum clavicle length, and lateral vertebral body heights were more than 92% accurate. Precision was lower for postpubertal individuals. Integrating punctual nonlinearities of the relationship between age and the variables and dynamic prediction intervals incorporated the normal increase in interindividual growth variability (heteroscedasticity of variance) with age for more biologically accurate predictions. © 2018 American Academy of Forensic Sciences.

  5. Transport modeling and multivariate adaptive regression splines for evaluating performance of ASR systems in freshwater aquifers

    NASA Astrophysics Data System (ADS)

    Forghani, Ali; Peralta, Richard C.

    2017-10-01

    The study presents a procedure using solute transport and statistical models to evaluate the performance of aquifer storage and recovery (ASR) systems designed to earn additional water rights in freshwater aquifers. The recovery effectiveness (REN) index quantifies the performance of these ASR systems. REN is the proportion of the injected water that the same ASR well can recapture during subsequent extraction periods. To estimate REN for individual ASR wells, the presented procedure uses finely discretized groundwater flow and contaminant transport modeling. Then, the procedure uses multivariate adaptive regression splines (MARS) analysis to identify the significant variables affecting REN, and to identify the most recovery-effective wells. Achieving REN values close to 100% is the desire of the studied 14-well ASR system operator. This recovery is feasible for most of the ASR wells by extracting three times the injectate volume during the same year as injection. Most of the wells would achieve RENs below 75% if extracting merely the same volume as they injected. In other words, recovering almost all the same water molecules that are injected requires having a pre-existing water right to extract groundwater annually. MARS shows that REN most significantly correlates with groundwater flow velocity, or hydraulic conductivity and hydraulic gradient. MARS results also demonstrate that maximizing REN requires utilizing the wells located in areas with background Darcian groundwater velocities less than 0.03 m/d. The study also highlights the superiority of MARS over regular multiple linear regressions to identify the wells that can provide the maximum REN. This is the first reported application of MARS for evaluating performance of an ASR system in fresh water aquifers.
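
    For readers unfamiliar with MARS, the form of a fitted model can be sketched briefly. A MARS model is an intercept plus a weighted sum of products of hinge functions, max(0, x - knot) or max(0, knot - x). The following minimal sketch only evaluates such a model; it does not perform the forward and backward selection that chooses knots and terms. The knots, coefficients, and names are hypothetical and are not taken from the study.

```python
def hinge(x, knot, direction=1):
    """Return max(0, x - knot) for direction=1, max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a MARS-style model at the feature vector x.

    terms is a list of (coefficient, factors) pairs, where factors is a list
    of (feature_index, knot, direction) triples whose hinges are multiplied.
    """
    total = intercept
    for coefficient, factors in terms:
        basis = 1.0
        for index, knot, direction in factors:
            basis *= hinge(x[index], knot, direction)
        total += coefficient * basis
    return total


# Hypothetical two-feature model: one main-effect term and one interaction term.
example_terms = [
    (2.0, [(0, 0.5, 1)]),                  # 2.0 * max(0, x0 - 0.5)
    (-3.0, [(0, 0.5, -1), (1, 1.0, 1)]),   # -3.0 * max(0, 0.5 - x0) * max(0, x1 - 1.0)
]
above_knot = mars_predict([0.8, 2.0], 1.0, example_terms)
below_knot = mars_predict([0.2, 2.0], 1.0, example_terms)
```

    Each hinge is zero on one side of its knot, so the fitted response can change slope at the knot, which is how MARS captures threshold-like behaviour that a single linear term cannot.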

  6. Factors related to clinical pregnancy after vitrified-warmed embryo transfer: a retrospective and multivariate logistic regression analysis of 2313 transfer cycles.

    PubMed

    Shi, Wenhao; Zhang, Silin; Zhao, Wanqiu; Xia, Xue; Wang, Min; Wang, Hui; Bai, Haiyan; Shi, Juanzi

    2013-07-01

    What factors does multivariate logistic regression show to be significantly associated with the likelihood of clinical pregnancy in vitrified-warmed embryo transfer (VET) cycles? Assisted hatching (AH) and freezing embryos to avoid the risk of ovarian hyperstimulation syndrome (OHSS) were significantly associated with a greater likelihood of clinical pregnancy. Single factor analysis has shown AH, number of embryos transferred and freezing because of OHSS to be positively, and damaged blastomere to be negatively, significantly associated with the chance of clinical pregnancy after VET. It remains unclear what factors would be significant after multivariate analysis. The study was a retrospective analysis of 2313 VET cycles from 1481 patients performed between January 2008 and April 2012. A multivariate logistic regression analysis was performed to identify the factors affecting the clinical pregnancy outcome of VET. There were 22 candidate variables selected based on clinical experience and the literature. With thresholds of α entry = α removal = 0.05 for both variable entry and variable removal, eight variables were chosen to contribute to the multivariable model by the bootstrap stepwise variable selection algorithm (n = 1000). The eight variables were age at controlled ovarian hyperstimulation (COH), reason for freezing, AH, endometrial thickness, damaged blastomere, number of embryos transferred, number of good-quality embryos, and blood presence on transfer catheter. A descriptive comparison of the relative importance was accomplished by the proportion of explained variation (PEV). Among the reasons for freezing, the OHSS group showed a higher OR than the surplus embryo group when compared with other reasons for VET groups (OHSS versus Other, OR: 2.145; CI: 1.4-3.286; Surplus embryos versus Other, OR: 1.152; CI: 0.761-1.743) and high PEV (marginal 2.77%, P = 0.2911; partial 1.68%; CI of area under receiver operating characteristic

  7. Use of multivariate linear regression and support vector regression to predict functional outcome after surgery for cervical spondylotic myelopathy.

    PubMed

    Hoffman, Haydn; Lee, Sunghoon I; Garst, Jordan H; Lu, Derek S; Li, Charles H; Nagasawa, Daniel T; Ghalehsari, Nima; Jahanforouz, Nima; Razaghy, Mehrdad; Espinal, Marie; Ghavamrezaii, Amir; Paak, Brian H; Wu, Irene; Sarrafzadeh, Majid; Lu, Daniel C

    2015-09-01

    This study introduces the use of multivariate linear regression (MLR) and support vector regression (SVR) models to predict postoperative outcomes in a cohort of patients who underwent surgery for cervical spondylotic myelopathy (CSM). Currently, predicting outcomes after surgery for CSM remains a challenge. We recruited patients who had a diagnosis of CSM and required decompressive surgery with or without fusion. Fine motor function was tested preoperatively and postoperatively with a handgrip-based tracking device that has been previously validated, yielding mean absolute accuracy (MAA) results for two tracking tasks (sinusoidal and step). All patients completed Oswestry disability index (ODI) and modified Japanese Orthopaedic Association questionnaires preoperatively and postoperatively. Preoperative data was utilized in MLR and SVR models to predict postoperative ODI. Predictions were compared to the actual ODI scores with the coefficient of determination (R²) and mean absolute difference (MAD). From this, 20 patients met the inclusion criteria and completed follow-up at least 3 months after surgery. With the MLR model, a combination of the preoperative ODI score, preoperative MAA (step function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.452; MAD = 0.0887; p = 1.17 × 10⁻³). With the SVR model, a combination of preoperative ODI score, preoperative MAA (sinusoidal function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.932; MAD = 0.0283; p = 5.73 × 10⁻¹²). The SVR model was more accurate than the MLR model. The SVR can be used preoperatively in risk/benefit analysis and the decision to operate. Copyright © 2015 Elsevier Ltd. All rights reserved.
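
    The two comparison metrics named in this abstract can be sketched as follows. This is a minimal illustration that assumes the common definitions (R² as one minus the ratio of residual to total sum of squares, MAD as the mean of absolute prediction errors). The abstract does not state which exact formulas the authors used, and the function names are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_difference(actual, predicted):
    """Mean of |actual - predicted| over all cases."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up scores on a 0-1 scale, for illustration only.
actual_scores = [0.2, 0.4, 0.6, 0.8]
predicted_scores = [0.25, 0.35, 0.65, 0.75]
r2 = r_squared(actual_scores, predicted_scores)
mad = mean_absolute_difference(actual_scores, predicted_scores)
```

    A higher R² together with a lower MAD, as reported for the SVR model, indicates predictions that track the observed scores more closely.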

  8. Assessing the response of area burned to changing climate in western boreal North America using a Multivariate Adaptive Regression Splines (MARS) approach

    Treesearch

    Michael S. Balshi; A. David McGuire; Paul Duffy; Mike Flannigan; John Walsh; Jerry Melillo

    2009-01-01

    We developed temporally and spatially explicit relationships between air temperature and fuel moisture codes derived from the Canadian Fire Weather Index System to estimate annual area burned at 2.5° (latitude × longitude) resolution using a Multivariate Adaptive Regression Spline (MARS) approach across Alaska and Canada. Burned area was...

  9. The effect of hospital mergers on long-term sickness absence among hospital employees: a fixed effects multivariate regression analysis using panel data.

    PubMed

    Kjekshus, Lars Erik; Bernstrøm, Vilde Hoff; Dahl, Espen; Lorentzen, Thomas

    2014-02-03

    Hospitals are merging to become more cost-effective. Mergers are often complex and difficult processes with variable outcomes. The aim of this study was to analyze the effect of mergers on long-term sickness absence among hospital employees. Long-term sickness absence was analyzed among hospital employees (N = 107 209) in 57 hospitals involved in 23 mergers in Norway between 2000 and 2009. Variation in long-term sickness absence was explained through a fixed effects multivariate regression analysis using panel data with years-since-merger as the independent variable. We found a significant but modest effect of mergers on long-term sickness absence in the year of the merger, and in years 2, 3 and 4; analyzed by gender there was a significant effect for women, also for these years, but only in year 4 for men. However, men are less represented among the hospital workforce; this could explain the lack of significance. Mergers have a significant effect on employee health that should be taken into consideration when deciding to merge hospitals. This study illustrates the importance of analyzing the effects of mergers over several years and the need for more detailed analyses of merger processes and of the changes that may occur as a result of such mergers.
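
    The idea behind a fixed effects panel regression can be sketched with the "within" transformation: each unit's own mean is subtracted before fitting, so time-invariant differences between units drop out. The sketch below is a single-predictor simplification with hypothetical data. It is not the model specification used in the study, which the abstract does not give in detail.

```python
from collections import defaultdict


def within_estimator(groups, x, y):
    """Slope of y on x after removing each group's mean (fixed-effects 'within' estimator)."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum of x, sum of y, count]
    for g, xi, yi in zip(groups, x, y):
        t = totals[g]
        t[0] += xi
        t[1] += yi
        t[2] += 1
    numerator = denominator = 0.0
    for g, xi, yi in zip(groups, x, y):
        sum_x, sum_y, n = totals[g]
        dx = xi - sum_x / n  # deviation from the group's mean of x
        dy = yi - sum_y / n  # deviation from the group's mean of y
        numerator += dx * dy
        denominator += dx * dx
    if denominator == 0.0:
        raise ValueError("x does not vary within any group")
    return numerator / denominator


# Hypothetical panel: two units with different baselines but the same trend.
unit = ["A", "A", "A", "B", "B", "B"]
years_since_event = [0, 1, 2, 0, 1, 2]
outcome = [10.0, 10.5, 11.0, 20.0, 20.5, 21.0]
slope = within_estimator(unit, years_since_event, outcome)
```

    The large baseline gap between the two hypothetical units does not affect the estimated slope, which is the point of the transformation.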

  10. Factors affecting the outcome of excimer laser photorefractive keratectomy: a preliminary multivariable regression analysis

    NASA Astrophysics Data System (ADS)

    Maguen, Ezra I.; Papaioannou, Thanassis; Nesburn, Anthony B.; Salz, James J.; Warren, Cathy; Grundfest, Warren S.

    1996-05-01

    Multivariable regression analysis was used to evaluate the combined effects of some preoperative and operative variables on the change of refraction following excimer laser photorefractive keratectomy for myopia (PRK). This analysis was performed on 152 eyes (at 6 months postoperatively) and 156 eyes (at 12 months postoperatively). The following variables were considered: intended refractive correction, patient age, treatment zone, central corneal thickness, average corneal curvature, and intraocular pressure. At 6 months after surgery, the cumulative R² was 0.43 with 0.38 attributed to the intended correction and 0.06 attributed to the preoperative corneal curvature. At 12 months, the cumulative R² was 0.37 where 0.33 was attributed to the intended correction, 0.02 to the preoperative corneal curvature, and 0.01 to both preoperative corneal thickness and to the patient age. Further model augmentation is necessary to account for the remaining variability and the behavior of the residuals.

  11. Successive Projections Algorithm-Multivariable Linear Regression Classifier for the Detection of Contaminants on Chicken Carcasses in Hyperspectral Images

    NASA Astrophysics Data System (ADS)

    Wu, W.; Chen, G. Y.; Kang, R.; Xia, J. C.; Huang, Y. P.; Chen, K. J.

    2017-07-01

    During slaughtering and further processing, chicken carcasses are inevitably contaminated by microbial pathogen contaminants. Due to food safety concerns, many countries implement a zero-tolerance policy that forbids the placement of visibly contaminated carcasses in ice-water chiller tanks during processing. Manual detection of contaminants is labor-intensive and imprecise. Here, a successive projections algorithm (SPA)-multivariable linear regression (MLR) classifier based on an optimal performance threshold was developed for automatic detection of contaminants on chicken carcasses. Hyperspectral images were obtained using a hyperspectral imaging system. A regression model of the classifier was established by MLR based on twelve characteristic wavelengths (505, 537, 561, 562, 564, 575, 604, 627, 656, 665, 670, and 689 nm) selected by SPA, and the optimal threshold T = 1 was obtained from the receiver operating characteristic (ROC) analysis. The SPA-MLR classifier provided the best detection results when compared with the SPA-partial least squares (PLS) regression classifier and the SPA-least squares support vector machine (LS-SVM) classifier. The true positive rate (TPR) of 100% and the false positive rate (FPR) of 0.392% indicate that the SPA-MLR classifier can utilize spatial and spectral information to effectively detect contaminants on chicken carcasses.
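
    Turning a regression output into a classifier by thresholding, and scoring it by TPR and FPR, can be sketched as below. The scores and labels are made up, and treating a score at or above the threshold as "contaminated" is an assumption, since the abstract does not state the comparison direction.

```python
def classify(scores, threshold):
    """Label a pixel as contaminated when its regression score reaches the threshold."""
    return [score >= threshold for score in scores]


def tpr_fpr(labels, predictions):
    """Return (true positive rate, false positive rate) for boolean labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical regression scores and ground truth (True = contaminated).
scores = [1.4, 1.0, 0.9, 1.2, 0.1]
truth = [True, True, False, False, False]
predictions = classify(scores, threshold=1.0)  # T = 1 as quoted in the abstract
tpr, fpr = tpr_fpr(truth, predictions)
```

    Sweeping the threshold and recording each (FPR, TPR) pair traces out the ROC curve from which an operating threshold can be chosen.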

  12. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
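
    A minimal sketch of multiple linear regression fitted by ordinary least squares follows, using only the standard library. It solves the normal equations directly, which is adequate for a small illustration but is not how numerical libraries usually do it. The function name and data are hypothetical.

```python
def fit_multiple_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of predictor tuples; an intercept column is added here.
    Returns [intercept, b1, b2, ...].
    """
    design = [[1.0, *row] for row in rows]
    n_coef = len(design[0])
    # Build the augmented matrix [X'X | X'y].
    aug = []
    for j in range(n_coef):
        row_j = [sum(x[j] * x[k] for x in design) for k in range(n_coef)]
        row_j.append(sum(x[j] * y for x, y in zip(design, targets)))
        aug.append(row_j)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n_coef):
        pivot = max(range(col, n_coef), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; X'X is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n_coef):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n_coef] / aug[i][i] for i in range(n_coef)]


# Made-up data generated from y = 1 + 2*x1 - 3*x2, so the fit is exact.
predictors = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
outcomes = [1, 3, -2, 0, 2]
coefficients = fit_multiple_linear_regression(predictors, outcomes)
```

    When one predictor is an exact multiple of another, the solver raises an error instead of returning coefficients. This is the extreme case of the multicollinearity the article discusses.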

  13. Risk factors for pedicled flap necrosis in hand soft tissue reconstruction: a multivariate logistic regression analysis.

    PubMed

    Gong, Xu; Cui, Jianli; Jiang, Ziping; Lu, Laijin; Li, Xiucun

    2018-03-01

    Few clinical retrospective studies have reported the risk factors of pedicled flap necrosis in hand soft tissue reconstruction. The aim of this study was to identify non-technical risk factors associated with pedicled flap perioperative necrosis in hand soft tissue reconstruction via a multivariate logistic regression analysis. For patients with hand soft tissue reconstruction, we carefully reviewed hospital records and identified 163 patients who met the inclusion criteria. The characteristics of these patients, flap transfer procedures and postoperative complications were recorded. Eleven predictors were identified. The correlations between pedicled flap necrosis and risk factors were analysed using a logistic regression model. Of 163 skin flaps, 125 flaps survived completely without any complications. The pedicled flap necrosis rate in hands was 11.04%, which included partial flap necrosis (7.36%) and total flap necrosis (3.68%). Soft tissue defects in fingers were noted in 68.10% of all cases. The logistic regression analysis indicated that the soft tissue defect site (P = 0.046, odds ratio (OR) = 0.079, confidence interval (CI) (0.006, 0.959)), flap size (P = 0.020, OR = 1.024, CI (1.004, 1.045)) and postoperative wound infection (P < 0.001, OR = 17.407, CI (3.821, 79.303)) were statistically significant risk factors for pedicled flap necrosis of the hand. Soft tissue defect site, flap size and postoperative wound infection were risk factors associated with pedicled flap necrosis in hand soft tissue defect reconstruction. © 2017 Royal Australasian College of Surgeons.
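
    The relationship between a logistic regression coefficient, its odds ratio and a 95% confidence interval can be sketched as below. The sketch assumes Wald-type intervals computed on the log-odds scale, which the abstract does not confirm, and the helper names are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta, standard_error, z=Z_95):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - z * standard_error),
            math.exp(beta + z * standard_error))


def standard_error_from_ci(lower, upper, z=Z_95):
    """Back out the log-scale standard error from a reported Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def wald_p_value(beta, standard_error):
    """Two-sided p-value for H0: beta = 0 under a normal approximation."""
    z = abs(beta / standard_error)
    return math.erfc(z / math.sqrt(2.0))


# Reported values for flap size: OR = 1.024, CI (1.004, 1.045).
beta_flap = math.log(1.024)
se_flap = standard_error_from_ci(1.004, 1.045)
p_flap = wald_p_value(beta_flap, se_flap)
```

    Under that Wald assumption, the p-value recovered from the rounded flap-size interval comes out at roughly 0.02, in line with the reported P = 0.020.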

  14. Evaluation of multivariate linear regression and artificial neural networks in prediction of water quality parameters

    PubMed Central

    2014-01-01

    This paper examined the efficiency of multivariate linear regression (MLR) and artificial neural network (ANN) models in prediction of two major water quality parameters in a wastewater treatment plant. Biochemical oxygen demand (BOD) and chemical oxygen demand (COD), as indirect indicators of organic matter, are representative parameters for sewer water quality. Performance of the ANN models was evaluated using coefficient of correlation (r), root mean square error (RMSE) and bias values. The BOD and COD values computed by the ANN method and by regression analysis were in close agreement with their respective measured values. Results showed that the ANN model performed better than the MLR model. Comparative indices of the optimized ANN with input values of temperature (T), pH, total suspended solid (TSS) and total suspended (TS) were RMSE = 25.1 mg/L, r = 0.83 for prediction of BOD and RMSE = 49.4 mg/L, r = 0.81 for prediction of COD. It was found that the ANN model could be employed successfully in estimating the BOD and COD in the inlet of wastewater biochemical treatment plants. Moreover, sensitivity analysis showed that pH has a greater effect on BOD and COD prediction than the other parameters. Also, both implemented models predicted BOD better than COD. PMID:24456676

  15. A multivariate regression model for detection of fumonisins content in maize from near infrared spectra.

    PubMed

    Giacomo, Della Riccia; Stefania, Del Zotto

    2013-12-15

    Fumonisins are mycotoxins produced by Fusarium species that commonly live in maize. Whereas fungi damage plants, fumonisins cause disease in both cattle and human beings. Legal limits set the tolerable daily intake of fumonisins for several maize-based feeds and foods. Chemical techniques assure the most reliable and accurate measurements, but they are expensive and time consuming. A method based on Near Infrared spectroscopy and multivariate statistical regression is described as a simpler, cheaper and faster alternative. We apply Partial Least Squares with full cross validation. Two models are described, having high correlation of calibration (0.995, 0.998) and of validation (0.908, 0.909), respectively. Description of the observed phenomenon is accurate and overfitting is avoided. Screening of contaminated maize with respect to the European legal limit of 4 mg kg⁻¹ should be assured. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. The effect of hospital mergers on long-term sickness absence among hospital employees: a fixed effects multivariate regression analysis using panel data

    PubMed Central

    2014-01-01

    Background Hospitals are merging to become more cost-effective. Mergers are often complex and difficult processes with variable outcomes. The aim of this study was to analyze the effect of mergers on long-term sickness absence among hospital employees. Methods Long-term sickness absence was analyzed among hospital employees (N = 107 209) in 57 hospitals involved in 23 mergers in Norway between 2000 and 2009. Variation in long-term sickness absence was explained through a fixed effects multivariate regression analysis using panel data with years-since-merger as the independent variable. Results We found a significant but modest effect of mergers on long-term sickness absence in the year of the merger, and in years 2, 3 and 4; analyzed by gender there was a significant effect for women, also for these years, but only in year 4 for men. However, men are less represented among the hospital workforce; this could explain the lack of significance. Conclusions Mergers have a significant effect on employee health that should be taken into consideration when deciding to merge hospitals. This study illustrates the importance of analyzing the effects of mergers over several years and the need for more detailed analyses of merger processes and of the changes that may occur as a result of such mergers. PMID:24490750

  17. A comparison between univariate probabilistic and multivariate (logistic regression) methods for landslide susceptibility analysis: the example of the Febbraro valley (Northern Alps, Italy)

    NASA Astrophysics Data System (ADS)

    Rossi, M.; Apuani, T.; Felletti, F.

    2009-04-01

    The aim of this paper is to compare the results of two statistical methods for landslide susceptibility analysis: 1) a univariate probabilistic method based on the landslide susceptibility index, 2) a multivariate method (logistic regression). The study area is the Febbraro valley, located in the central Italian Alps, where different types of metamorphic rocks crop out. In the eastern part of the studied basin a Quaternary cover, represented by colluvial and, secondarily, glacial deposits, is dominant. In this study 110 earth flows, mainly located toward the NE portion of the catchment, were analyzed. They involve only the colluvial deposits and their extension mainly ranges from 36 to 3173 m². Both statistical methods require establishing a spatial database, in which each landslide is described by several parameters that can be assigned using the main scarp central point of the landslide. The spatial database is constructed using a Geographical Information System (GIS). Each landslide is described by several parameters corresponding to the value at the main scarp central point of the landslide. Based on a bibliographic review, a total of 15 predisposing factors were utilized. The width of the intervals into which the maps of the predisposing factors have to be reclassified has been defined assuming constant intervals for: elevation (100 m), slope (5°), solar radiation (0.1 MJ/cm²/year), profile curvature (1.2 1/m), tangential curvature (2.2 1/m), drainage density (0.5), lineament density (0.00126). For the other parameters, the results of the probability-probability plot analysis and the statistical indexes of the landslide sites were used. In particular slope length (0 ÷ 2, 2 ÷ 5, 5 ÷ 10, 10 ÷ 20, 20 ÷ 35, 35 ÷ 260), accumulation flow (0 ÷ 1, 1 ÷ 2, 2 ÷ 5, 5 ÷ 12, 12 ÷ 60, 60 ÷ 27265), Topographic Wetness Index (0 ÷ 0.74, 0.74 ÷ 1.94, 1.94 ÷ 2.62, 2.62 ÷ 3.48, 3.48 ÷ 6.00, 6.00 ÷ 9.44), Stream Power Index (0 ÷ 0.64, 0.64 ÷ 1.28, 1.28 ÷ 1.81, 1.81 ÷ 4.20, 4.20 ÷ 9

  18. A mixed-effects regression model for longitudinal multivariate ordinal data.

    PubMed

    Liu, Li C; Hedeker, Donald

    2006-03-01

    A mixed-effects item response theory model that allows for three-level multivariate ordinal outcomes and accommodates multiple random subject effects is proposed for analysis of multivariate ordinal outcomes in longitudinal studies. This model allows for the estimation of different item factor loadings (item discrimination parameters) for the multiple outcomes. The covariates in the model do not have to follow the proportional odds assumption and can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is proposed utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher scoring solution, which provides standard errors for all model parameters, is used. An analysis of a longitudinal substance use data set, where four items of substance use behavior (cigarette use, alcohol use, marijuana use, and getting drunk or high) are repeatedly measured over time, is used to illustrate application of the proposed model.

  19. Endpoint in plasma etch process using new modified w-multivariate charts and windowed regression

    NASA Astrophysics Data System (ADS)

    Zakour, Sihem Ben; Taleb, Hassen

    2017-09-01

    Endpoint detection is essential for understanding and verifying that a plasma etching process has been carried out correctly, especially when the etched area is very small (0.1%). It is a crucial part of delivering repeatable results on every wafer. The endpoint is reached when the film being etched has been completely cleared. To ensure the desired device performance of the produced integrated circuit, the high optical emission spectroscopy (OES) sensor is employed. The large number of gathered wavelengths (profiles) is then pre-processed and analyzed using a newly proposed simple algorithm named spectra peak selection (SPS) to select the important wavelengths; wavelet analysis (WA) is then employed to enhance detection performance by suppressing noise and redundant information. The selected and treated OES wavelengths are then used in modified multivariate control charts (MEWMA and Hotelling) for three statistics (mean, SD and CV) and in windowed polynomial regression for the mean. The use of these three statistics is motivated by the need to monitor the mean shift, the variance shift and their ratio (CV) when both mean and SD are not stable. The control charts perform well in detecting the endpoint, especially the W-mean Hotelling chart, and the worst result is given by the CV statistic. Because the best endpoint detection is given by the W-Hotelling mean statistic, this statistic is used to construct a windowed wavelet Hotelling polynomial regression. The latter can only identify the window containing the endpoint phenomenon.

  20. Variable Selection in Logistic Regression.

    DTIC Science & Technology

    1987-06-01

    Variable Selection in Logistic Regression. Z. D. Bai, P. R. Krishnaiah and L. C. Zhao. Contract or grant number F49620-85-C-0008. Center for Multivariate Analysis, University of Pittsburgh.

  1. Regional trends in short-duration precipitation extremes: a flexible multivariate monotone quantile regression approach

    NASA Astrophysics Data System (ADS)

    Cannon, Alex

    2017-04-01

    univariate technique, and cannot incorporate information from additional covariates, for example ENSO state or physiographic controls on extreme rainfall within a region. Here, the univariate MQR model is extended to allow the use of multiple covariates. Multivariate monotone quantile regression (MMQR) is based on a single hidden-layer feedforward network with the quantile regression error function and partial monotonicity constraints. The MMQR model is demonstrated via Monte Carlo simulations and the estimation and visualization of regional trends in moderate rainfall extremes based on homogenized sub-daily precipitation data at stations in Canada.
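
    The quantile regression error function mentioned in this abstract is commonly written as the asymmetric "pinball" loss. The sketch below shows only that loss and checks that minimizing it over a constant recovers a sample quantile. It does not reproduce the network or the monotonicity constraints of the MMQR model, and the data are made up.

```python
def quantile_loss(residual, tau):
    """Pinball loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def mean_quantile_loss(observed, predicted, tau):
    """Average pinball loss of predictions against observations for quantile level tau."""
    return sum(quantile_loss(o - p, tau)
               for o, p in zip(observed, predicted)) / len(observed)


def best_constant(observed, tau):
    """Observed value that minimizes the mean pinball loss when used as a constant prediction."""
    return min(observed,
               key=lambda c: mean_quantile_loss(observed, [c] * len(observed), tau))


# Made-up sample with one large value, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
median_estimate = best_constant(sample, tau=0.5)
upper_estimate = best_constant(sample, tau=0.9)
```

    Because under-prediction is penalized more heavily as tau approaches 1, the minimizer moves toward the upper tail of the sample. That property is what makes this loss suitable for modeling extremes.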

  2. Quality of reporting of multivariable logistic regression models in Chinese clinical medical journals.

    PubMed

    Zhang, Ying-Ying; Zhou, Xiao-Bin; Wang, Qiu-Zhen; Zhu, Xiao-Yan

    2017-05-01

    Multivariable logistic regression (MLR) has been increasingly used in Chinese clinical medical research during the past few years. However, few evaluations of the quality of the reporting strategies in these studies are available. The aim was to evaluate the reporting quality and model accuracy of MLR used in published work, and to provide related advice for authors, readers, reviewers, and editors. A total of 316 articles published in 5 leading Chinese clinical medical journals with high impact factor from January 2010 to July 2015 were selected for evaluation. Articles were evaluated according to 12 established criteria for proper use and reporting of MLR models. Among the articles, the highest quality score was 9, the lowest 1, and the median 5 (4-5). A total of 85.1% of the articles scored below 6. No significant differences were found among these journals with respect to quality score (χ² = 6.706, P = .15). More than 50% of the articles met the following 5 criteria: complete identification of the statistical software application that was used (97.2%), calculation of the odds ratio and its confidence interval (86.4%), description of sufficient events (>10) per variable, selection of variables, and fitting procedure (78.2%, 69.3%, and 58.5%, respectively). Less than 35% of the articles reported the coding of variables (18.7%). The remaining 5 criteria were not satisfied by a sufficient number of articles: goodness-of-fit (10.1%), interactions (3.8%), checking for outliers (3.2%), collinearity (1.9%), and participation of statisticians and epidemiologists (0.3%). The criterion of conformity with linear gradients was applicable to 186 articles; however, only 7 (3.8%) mentioned or tested it. The reporting quality and model accuracy of MLR in selected articles were not satisfactory. In fact, severe deficiencies were noted. Only 1 article scored 9. We recommend that authors, readers, reviewers, and editors consider MLR models more carefully and cooperate more closely with statisticians and

  3. Correlative and multivariate analysis of increased radon concentration in underground laboratory.

    PubMed

    Maletić, Dimitrije M; Udovičić, Vladimir I; Banjanac, Radomir M; Joković, Dejan R; Dragić, Aleksandar L; Veselinović, Nikola B; Filipović, Jelena

    2014-11-01

    The results of an analysis of the relation between variations of increased radon concentration and climate variables in a shallow underground laboratory are presented. The analysis uses correlative and multivariate methods, as developed for data analysis in high-energy physics and implemented in the Toolkit for Multivariate Analysis software package. Multivariate regression analysis identified a number of multivariate methods which can give a good evaluation of increased radon concentrations based on climate variables. The use of the multivariate regression methods will enable the investigation of the relations of specific climate variables with increased radon concentrations by analysis of regression methods resulting in 'mapped' underlying functional behaviour of radon concentrations depending on a wide spectrum of climate variables. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  4. Validation of cross-sectional time series and multivariate adaptive regression splines models for the prediction of energy expenditure in children and adolescents using doubly labeled water

    USDA-ARS's Scientific Manuscript database

    Accurate, nonintrusive, and inexpensive techniques are needed to measure energy expenditure (EE) in free-living populations. Our primary aim in this study was to validate cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on observable participant cha...

  5. Multivariate random regression analysis for body weight and main morphological traits in genetically improved farmed tilapia (Oreochromis niloticus).

    PubMed

    He, Jie; Zhao, Yunfeng; Zhao, Jingli; Gao, Jin; Han, Dandan; Xu, Pao; Yang, Runqing

    2017-11-02

    Because of their high economic importance, growth traits in fish are under continuous improvement. For growth traits that are recorded at multiple time-points in life, the use of univariate and multivariate animal models is limited because of the variable and irregular timing of these measures. Thus, the univariate random regression model (RRM) was introduced for the genetic analysis of dynamic growth traits in fish breeding. We used a multivariate random regression model (MRRM) to analyze genetic changes in growth traits recorded at multiple time-points in genetically improved farmed tilapia. Legendre polynomials of different orders were applied to characterize the influences of fixed and random effects on growth trajectories. The final MRRM was determined by optimizing the univariate RRM for the analyzed traits separately by adaptively penalizing the likelihood statistical criterion, which is superior to both the Akaike information criterion and the Bayesian information criterion. In the selected MRRM, the additive genetic effects were modeled by Legendre polynomials of three orders for body weight (BWE) and body length (BL) and of two orders for body depth (BD). By using the covariance functions of the MRRM, estimated heritabilities were between 0.086 and 0.628 for BWE, 0.155 and 0.556 for BL, and 0.056 and 0.607 for BD. Only heritabilities for BD measured from 60 to 140 days of age were consistently higher than those estimated by the univariate RRM. All genetic correlations between growth time-points exceeded 0.5 for either single or pairwise time-points. Moreover, correlations between early and late growth time-points were lower. Thus, for phenotypes that are measured repeatedly in aquaculture, an MRRM can enhance the efficiency of the comprehensive selection for BWE and the main morphological traits.
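
    Random regression models of this kind describe a trajectory through Legendre polynomial covariates evaluated at a standardized age. The sketch below builds those covariates with Bonnet's recursion. It uses unnormalized polynomials and a hypothetical age range of 60 to 140 days, whereas the study may have used a normalized variant and a different range.

```python
def standardize_age(age, age_min, age_max):
    """Map an age onto [-1, 1], the interval on which Legendre polynomials are orthogonal."""
    return 2.0 * (age - age_min) / (age_max - age_min) - 1.0


def legendre_basis(x, max_degree):
    """Values P_0(x)..P_max_degree(x) via (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    values = [1.0]
    if max_degree >= 1:
        values.append(x)
    for n in range(1, max_degree):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values


# Hypothetical recording ages in days; the range is an assumption for illustration.
AGE_MIN, AGE_MAX = 60.0, 140.0
covariates_by_age = {
    age: legendre_basis(standardize_age(age, AGE_MIN, AGE_MAX), max_degree=3)
    for age in (60.0, 100.0, 140.0)
}
```

    Each animal's random regression coefficients multiply these covariates, so a trait value at any age within the range is a weighted sum of the basis values.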

  6. SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression

    PubMed Central

    Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.

    2014-01-01

    The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as the Hotelling’s T² test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM outperforms other state-of-the-art methods. PMID:26527844

  7. Estimating suspended sediment load with multivariate adaptive regression spline, teaching-learning based optimization, and artificial bee colony models.

    PubMed

    Yilmaz, Banu; Aras, Egemen; Nacar, Sinan; Kankal, Murat

    2018-05-23

    The functional life of a dam is often determined by the rate of sediment delivery to its reservoir. Therefore, an accurate estimate of the sediment load in rivers with dams is essential for designing and predicting a dam's useful lifespan. The most credible method is direct measurement of sediment input, but this can be very costly and it cannot always be implemented at all gauging stations. In this study, we tested various regression models to estimate suspended sediment load (SSL) at two gauging stations on the Çoruh River in Turkey, including artificial bee colony (ABC), teaching-learning-based optimization algorithm (TLBO), and multivariate adaptive regression splines (MARS). These models were also compared with one another and with classical regression analyses (CRA). Streamflow values and previously collected data of SSL were used as model inputs with predicted SSL data as output. Two different training and testing dataset configurations were used to reinforce the model accuracy. For the MARS method, the root mean square error value was found to range between 35% and 39% for the test set at the two gauging stations, which was lower than errors for other models. Error values were even lower (7% to 15%) using another dataset. Our results indicate that simultaneous measurements of streamflow with SSL provide the most effective parameter for obtaining accurate predictive models and that MARS is the most accurate model for predicting SSL. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. A New Predictive Model of Centerline Segregation in Continuous Cast Steel Slabs by Using Multivariate Adaptive Regression Splines Approach

    PubMed Central

    García Nieto, Paulino José; González Suárez, Victor Manuel; Álvarez Antón, Juan Carlos; Mayo Bayón, Ricardo; Sirgo Blanco, José Ángel; Díaz Fernández, Ana María

    2015-01-01

    The aim of this study was to obtain a predictive model able to perform an early detection of central segregation severity in continuous cast steel slabs. Segregation in steel cast products is an internal defect that can be very harmful when slabs are rolled in heavy plate mills. In this research work, the central segregation was studied with success using the data mining methodology based on the multivariate adaptive regression splines (MARS) technique. For this purpose, the most important physical-chemical parameters are considered. The results of the present study are two-fold. First, the significance of each physical-chemical variable on the segregation is presented through the model. Second, a model for forecasting segregation is obtained. Regression with optimal hyperparameters was performed, and coefficients of determination of 0.93 for continuity factor estimation and 0.95 for average width were obtained when the MARS technique was applied to the experimental dataset. The agreement between experimental data and the model confirmed the good performance of the latter.

  9. A New Approach of Juvenile Age Estimation using Measurements of the Ilium and Multivariate Adaptive Regression Splines (MARS) Models for Better Age Prediction.

    PubMed

    Corron, Louise; Marchal, François; Condemi, Silvana; Chaumoître, Kathia; Adalian, Pascal

    2017-01-01

    Juvenile age estimation methods used in forensic anthropology generally lack methodological consistency and/or statistical validity. Considering this, a standard approach using nonparametric Multivariate Adaptive Regression Splines (MARS) models was tested to predict age from iliac biometric variables of male and female juveniles from Marseilles, France, aged 0-12 years. Models using unidimensional (length and width) and bidimensional iliac data (module and surface) were constructed on a training sample of 176 individuals and validated on an independent test sample of 68 individuals. Results show that MARS prediction models using iliac width, module and area give overall better and statistically valid age estimates. These models integrate punctual nonlinearities of the relationship between age and osteometric variables. By constructing valid prediction intervals whose size increases with age, MARS models take into account the normal increase of individual variability. MARS models can qualify as a practical and standardized approach for juvenile age estimation. © 2016 American Academy of Forensic Sciences.

  10. Multivariable Regression Analysis in Schistosoma mansoni-Infected Individuals in the Sudan Reveals Unique Immunoepidemiological Profiles in Uninfected, egg+ and Non-egg+ Infected Individuals.

    PubMed

    Elfaki, Tayseer Elamin Mohamed; Arndts, Kathrin; Wiszniewsky, Anna; Ritter, Manuel; Goreish, Ibtisam A; Atti El Mekki, Misk El Yemen A; Arriens, Sandra; Pfarr, Kenneth; Fimmers, Rolf; Doenhoff, Mike; Hoerauf, Achim; Layland, Laura E

    2016-05-01

    In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity. This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old) from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins) and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+), n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+) and n = 61 people who were infection-free (Sm uninf). Immunoepidemiological findings were further investigated using two binary multivariable regression analyses. Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regard to age, infection and an egg+ status were linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently-infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups, which was further confirmed during multivariate regression analysis. Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers in order to

  11. Multivariable Regression Analysis in Schistosoma mansoni-Infected Individuals in the Sudan Reveals Unique Immunoepidemiological Profiles in Uninfected, egg+ and Non-egg+ Infected Individuals

    PubMed Central

    Wiszniewsky, Anna; Ritter, Manuel; Goreish, Ibtisam A.; Atti El Mekki, Misk El Yemen A.; Arriens, Sandra; Pfarr, Kenneth; Fimmers, Rolf; Doenhoff, Mike; Hoerauf, Achim; Layland, Laura E.

    2016-01-01

    Background: In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity. Methodology: This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4–85 years old) from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins) and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+), n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+) and n = 61 people who were infection-free (Sm uninf). Immunoepidemiological findings were further investigated using two binary multivariable regression analyses. Principal Findings: Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regard to age, infection and an egg+ status were linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently-infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups, which was further confirmed during multivariate regression analysis. Conclusions/Significance: Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways

  12. Finding structure in data using multivariate tree boosting

    PubMed Central

    Miller, Patrick J.; Lubke, Gitta H.; McArtor, Daniel B.; Bergeman, C. S.

    2016-01-01

    Technology and collaboration enable dramatic increases in the size of psychological and psychiatric data collections, but finding structure in these large data sets with many collected variables is challenging. Decision tree ensembles such as random forests (Strobl, Malley, & Tutz, 2009) are a useful tool for finding structure, but are difficult to interpret with multiple outcome variables which are often of interest in psychology. To find and interpret structure in data sets with multiple outcomes and many predictors (possibly exceeding the sample size), we introduce a multivariate extension to a decision tree ensemble method called gradient boosted regression trees (Friedman, 2001). Our extension, multivariate tree boosting, is a method for nonparametric regression that is useful for identifying important predictors, detecting predictors with nonlinear effects and interactions without specification of such effects, and for identifying predictors that cause two or more outcome variables to covary. We provide the R package ‘mvtboost’ to estimate, tune, and interpret the resulting model, which extends the implementation of univariate boosting in the R package ‘gbm’ (Ridgeway et al., 2015) to continuous, multivariate outcomes. To illustrate the approach, we analyze predictors of psychological well-being (Ryff & Keyes, 1995). Simulations verify that our approach identifies predictors with nonlinear effects and achieves high prediction accuracy, exceeding or matching the performance of (penalized) multivariate multiple regression and multivariate decision trees over a wide range of conditions. PMID:27918183

  13. Functional Relationships and Regression Analysis.

    ERIC Educational Resources Information Center

    Preece, Peter F. W.

    1978-01-01

    Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…

  14. Multi-analyte quantification in bioprocesses by Fourier-transform-infrared spectroscopy by partial least squares regression and multivariate curve resolution.

    PubMed

    Koch, Cosima; Posch, Andreas E; Goicoechea, Héctor C; Herwig, Christoph; Lendl, Bernhard

    2014-01-07

    This paper presents the quantification of Penicillin V and phenoxyacetic acid, a precursor, inline during Penicillium chrysogenum fermentations by FTIR spectroscopy and partial least squares (PLS) regression and multivariate curve resolution - alternating least squares (MCR-ALS). First, the applicability of an attenuated total reflection FTIR fiber optic probe was assessed offline by measuring standards of the analytes of interest and investigating matrix effects of the fermentation broth. Then measurements were performed inline during four fed-batch fermentations with online HPLC for the determination of Penicillin V and phenoxyacetic acid as reference analysis. PLS and MCR-ALS models were built using these data and validated by comparison of single analyte spectra with the selectivity ratio of the PLS models and the extracted spectral traces of the MCR-ALS models, respectively. The achieved root mean square errors of cross-validation for the PLS regressions were 0.22 g L⁻¹ for Penicillin V and 0.32 g L⁻¹ for phenoxyacetic acid and the root mean square errors of prediction for MCR-ALS were 0.23 g L⁻¹ for Penicillin V and 0.15 g L⁻¹ for phenoxyacetic acid. A general work-flow for building and assessing chemometric regression models for the quantification of multiple analytes in bioprocesses by FTIR spectroscopy is given. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.

  15. New strategy for determination of anthocyanins, polyphenols and antioxidant capacity of Brassica oleracea liquid extract using infrared spectroscopies and multivariate regression

    NASA Astrophysics Data System (ADS)

    de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.

    2018-04-01

    A new method was developed to determine the antioxidant properties of red cabbage extract (Brassica oleracea) by mid (MID) and near (NIR) infrared spectroscopies and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9° Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for the building of PLS models. For the first time, this strategy was applied for building multivariate regression models. Reference analyses and spectral data were obtained from diluted extracts. The determined properties were total and monomeric anthocyanins, total polyphenols and antioxidant capacity by ABTS (2,2-azino-bis(3-ethyl-benzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. PLS-1 regression provided more predictive models than did PLS-2 regression. PLS-OPS and PLS-GA models presented excellent prediction results with a correlation coefficient higher than 0.98. However, the best models were obtained using PLS and variable selection with the OPS algorithm, and the models based on NIR spectra were considered more predictive for all properties. Thus, these models provide a simple, rapid and accurate method for determination of red cabbage extract antioxidant properties, suitable for use in the food industry.

  16. Dental age assessment of young Iranian adults using third molars: A multivariate regression study.

    PubMed

    Bagherpour, Ali; Anbiaee, Najmeh; Partovi, Parnia; Golestani, Shayan; Afzalinasab, Shakiba

    2012-10-01

    In recent years, a noticeable increase in forensic age estimations of living individuals has been observed. Radiologic assessment of the mineralisation stage of third molars is of particular importance with regard to the relevant age group. The goal of this study was to obtain a referral database and regression equations for dental age estimation of unaccompanied minors in an Iranian population. Moreover, the probability of an individual being over the age of 18 in the case of full third molar development was determined. Using the scoring system of Gleiser and Hunt, modified by Köhler, a cross-sectional sample of 1274 orthopantomograms of 885 females and 389 males aged between 15 and 22 years was investigated. Intra-observer reliability was tested using kappa statistics. The correlation between the scores of all four wisdom teeth was evaluated with the Spearman correlation coefficient. We also carried out the Wilcoxon signed-rank test on asymmetry and calculated the regression formulae. The kappa value showed strong intra-observer agreement. The Wilcoxon signed-rank test found no significant left-right asymmetry (p-values for the upper and lower jaws were 0.07 and 0.59, respectively). The developmental stages of the upper right and upper left third molars yielded the greatest correlation coefficient. The probability of an individual being over the age of 18 is 95.6% for males and 100.0% for females when four fully developed third molars are present. Regression formulae were derived taking into consideration gender, location and number of wisdom teeth. Use of population-specific standards is recommended as a means of improving the accuracy of forensic age estimates based on third molar mineralisation. To obtain more exact regression formulae, studies with a wider age range are recommended. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
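
    The intra-observer reliability check mentioned above can be sketched with an unweighted Cohen's kappa computed on two scoring rounds of the same radiographs. This is an assumption about the variant: the abstract does not say whether a weighted or unweighted kappa was used, and the function name `cohen_kappa` and the stage labels in the example are hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Observed agreement: share of items given the same label both times.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal label frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[label] * c2[label] for label in c1) / (n * n)
    if expected == 1.0:
        return 1.0  # both rounds used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical mineralisation stages from two scoring sessions.
round_one = ["7", "8", "9", "10", "10", "8"]
round_two = ["7", "8", "9", "10", "9", "8"]
print(cohen_kappa(round_one, round_two))
```

    For ordinal stage scores a weighted kappa, which penalises near misses less than distant ones, is a common alternative to the unweighted form shown here.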

  17. Simultaneous chemometric determination of pyridoxine hydrochloride and isoniazid in tablets by multivariate regression methods.

    PubMed

    Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru

    2010-08-01

    The sole use of isoniazid during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set comprising 20 different binary mixtures of PYR and ISO was randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recovery results obtained by applying PCR and PLS calibrations to the artificial mixtures were between 100.0 and 100.7%. Satisfactory results were obtained by applying the PLS and PCR methods to both artificial and commercial samples. The results obtained in this manuscript strongly encourage us to use these methods for the quality control and routine analysis of marketed tablets containing PYR and ISO. Copyright © 2010 John Wiley & Sons, Ltd.

  18. Stock price forecasting for companies listed on Tehran stock exchange using multivariate adaptive regression splines model and semi-parametric splines technique

    NASA Astrophysics Data System (ADS)

    Rounaghi, Mohammad Mahdi; Abbaszadeh, Mohammad Reza; Arashi, Mohammad

    2015-11-01

    One of the most important topics of interest to investors is stock price changes. Investors whose goals are long term are sensitive to stock price and its changes and react to them. In this regard, we used the multivariate adaptive regression splines (MARS) model and the semi-parametric splines technique for predicting stock price in this study. The MARS model is a nonparametric, adaptive regression method that is suited to problems with high dimensions and several variables. Smoothing splines is a nonparametric regression method. In this study, we used 40 variables (30 accounting variables and 10 economic variables) for predicting stock price using the MARS model and the semi-parametric splines technique. After investigating the models, we selected 4 accounting variables (book value per share, predicted earnings per share, P/E ratio and risk) as influential variables for predicting stock price using the MARS model. After fitting the semi-parametric splines technique, only 4 accounting variables (dividends, net EPS, EPS forecast and P/E ratio) were selected as effective variables in forecasting stock prices.

  19. Modelling lecturer performance index of private university in Tulungagung by using survival analysis with multivariate adaptive regression spline

    NASA Astrophysics Data System (ADS)

    Hasyim, M.; Prastyo, D. D.

    2018-03-01

    Survival analysis models the relationship between independent variables and survival time as the dependent variable. In practice, not all survival data can be recorded completely, for various reasons; such data are called censored data. Moreover, several models for survival analysis require assumptions. One approach in survival analysis is the nonparametric approach, which relaxes these assumptions. In this research, the nonparametric approach employed is Multivariate Adaptive Regression Splines (MARS). This study aims to measure the performance of lecturers at a private university. The survival time in this study is the time needed by a lecturer to obtain their professional certificate. The results show that research activity is a significant factor, along with developing course materials, good publications in international or national journals, and activity in research collaboration.

  20. Multivariate logistic regression for predicting total culturable virus presence at the intake of a potable-water treatment plant: novel application of the atypical coliform/total coliform ratio.

    PubMed

    Black, L E; Brion, G M; Freitas, S J

    2007-06-01

    Predicting the presence of enteric viruses in surface waters is a complex modeling problem. Multiple water quality parameters that indicate the presence of human fecal material, the load of fecal material, and the amount of time fecal material has been in the environment are needed. This paper presents the results of a multiyear study of raw-water quality at the inlet of a potable-water plant that related 17 physical, chemical, and biological indices to the presence of enteric viruses as indicated by cytopathic changes in cell cultures. It was found that several simple, multivariate logistic regression models that could reliably identify observations of the presence or absence of total culturable virus could be fitted. The best models developed combined a fecal age indicator (the atypical coliform [AC]/total coliform [TC] ratio), the detectable presence of a human-associated sterol (epicoprostanol) to indicate the fecal source, and one of several fecal load indicators (the levels of Giardia species cysts, coliform bacteria, and coprostanol). The best fit to the data was found when the AC/TC ratio, the presence of epicoprostanol, and the density of fecal coliform bacteria were input into a simple, multivariate logistic regression equation, resulting in 84.5% and 78.6% accuracies for the identification of the presence and absence of total culturable virus, respectively. The AC/TC ratio was the most influential input variable in all of the models generated, but producing the best prediction required additional input related to the fecal source and the fecal load. The potential for replacing microbial indicators of fecal load with levels of coprostanol was proposed and evaluated by multivariate logistic regression modeling for the presence and absence of virus.
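
    The separate accuracies reported for virus presence and absence can be illustrated with a small sketch that applies a logistic regression equation and scores each class on its own. The intercept, weights and input ordering (AC/TC ratio, epicoprostanol presence, fecal coliform density) are hypothetical placeholders, not the fitted values from the study.

```python
import math


def logistic(z):
    """Map a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_presence(features, intercept, weights, threshold=0.5):
    """Return True when the modelled probability reaches the threshold."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return logistic(z) >= threshold


def classwise_accuracy(y_true, y_pred):
    """Return (accuracy on true presences, accuracy on true absences)."""
    on_pos = [p for t, p in zip(y_true, y_pred) if t]
    on_neg = [p for t, p in zip(y_true, y_pred) if not t]
    presence_acc = sum(1 for p in on_pos if p) / len(on_pos)
    absence_acc = sum(1 for p in on_neg if not p) / len(on_neg)
    return presence_acc, absence_acc


# Hypothetical coefficients; the signs and sizes are illustrative only.
intercept, weights = -1.0, [-0.5, 2.0, 0.01]
samples = [[1.0, 1, 120.0], [4.0, 0, 10.0], [2.0, 1, 5.0], [6.0, 0, 30.0]]
observed = [True, False, True, False]
predicted = [predict_presence(s, intercept, weights) for s in samples]
print(classwise_accuracy(observed, predicted))
```

    Reporting the two accuracies separately, as the abstract does, avoids a single overall figure that can look good simply because one class dominates the observations.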

  1. Geographical variation of unmet medical needs in Italy: a multivariate logistic regression analysis

    PubMed Central

    2013-01-01

    Background Unmet health needs should be, in theory, a minor issue in Italy where a publicly funded and universally accessible health system exists. This, however, does not seem to be the case. Moreover, in the last two decades responsibilities for health care have been progressively decentralized to regional governments, which have differently organized health service delivery within their territories. Regional decision-making has affected the use of health care services, further increasing the existing geographical disparities in the access to care across the country. This study aims at comparing self-perceived unmet needs across Italian regions and assessing how the reported reasons - grouped into the categories of availability, accessibility and acceptability – vary geographically. Methods Data from the 2006 Italian component of the European Union Statistics on Income and Living Conditions are employed to explore reasons and predictors of self-reported unmet medical needs among 45,175 Italian respondents aged 18 and over. Multivariate logistic regression models are used to determine adjusted rates for overall unmet medical needs and for each of the three categories of reasons. Results Results show that, overall, 6.9% of the Italian population stated having experienced at least one unmet medical need during the last 12 months. The unadjusted rates vary markedly across regions, thus resulting in a clear-cut north–south divide (4.6% in the North-East vs. 10.6% in the South). Among those reporting unmet medical needs, the leading reason was problems of accessibility related to cost or transportation (45.5%), followed by acceptability (26.4%) and availability due to the presence of too long waiting lists (21.4%). In the South, more than one out of two individuals with an unmet need refrained from seeing a physician due to economic reasons. In the northern regions, working and family responsibilities contribute relatively more to the underutilization of medical

  2. Simultaneous determination of estrogens (ethinylestradiol and norgestimate) concentrations in human and bovine serum albumin by use of fluorescence spectroscopy and multivariate regression analysis.

    PubMed

    Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O

    2016-05-15

    The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined utility of steady-state fluorescence spectroscopy and multivariate partial-least-squares (PLS) regression analysis for the simultaneous determination of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) concentrations in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA, EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in an increase in emission characteristics of HSA and BSA and a significant blue spectral shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescent data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10⁻⁸ M for EE and 2.4×10⁻⁷ M for NOR and good linearity (R² > 0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological conditions. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. High accuracy, low sensitivity, simplicity, low-cost with no prior analyte extraction or separation

  3. Estimation of soil cation exchange capacity using Genetic Expression Programming (GEP) and Multivariate Adaptive Regression Splines (MARS)

    NASA Astrophysics Data System (ADS)

    Emamgolizadeh, S.; Bateni, S. M.; Shahsavani, D.; Ashrafi, T.; Ghorbani, H.

    2015-10-01

    The soil cation exchange capacity (CEC) is one of the main soil chemical properties, which is required in various fields such as environmental and agricultural engineering as well as soil science. In situ measurement of CEC is time consuming and costly. Hence, numerous studies have used traditional regression-based techniques to estimate CEC from more easily measurable soil parameters (e.g., soil texture, organic matter (OM), and pH). However, these models may not be able to adequately capture the complex and highly nonlinear relationship between CEC and its influential soil variables. In this study, Genetic Expression Programming (GEP) and Multivariate Adaptive Regression Splines (MARS) were employed to estimate CEC from more readily measurable soil physical and chemical variables (e.g., OM, clay, and pH) by developing functional relations. The GEP- and MARS-based functional relations were tested at two field sites in Iran. Results showed that GEP and MARS can provide reliable estimates of CEC. Also, it was found that the MARS model (with root-mean-square-error (RMSE) of 0.318 Cmol+ kg-1 and correlation coefficient (R2) of 0.864) generated slightly better results than the GEP model (with RMSE of 0.270 Cmol+ kg-1 and R2 of 0.807). The performance of GEP and MARS models was compared with two existing approaches, namely artificial neural network (ANN) and multiple linear regression (MLR). The comparison indicated that MARS and GEP outperformed the MLR model, but they did not perform as well as ANN. Finally, a sensitivity analysis was conducted to determine the most and the least influential variables affecting CEC. It was found that OM and pH have the most and least significant effect on CEC, respectively.
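
    The two goodness-of-fit figures quoted for the GEP and MARS models can be computed as in the sketch below. It assumes R2 denotes the coefficient of determination (the abstract calls it a correlation coefficient, so a squared Pearson correlation is another possible reading), and the CEC values in the example are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up CEC measurements and model estimates, for illustration only.
measured = [12.0, 15.5, 9.8, 20.1, 17.3]
estimated = [12.4, 15.0, 10.5, 19.2, 17.9]
print(rmse(measured, estimated), r_squared(measured, estimated))
```

    A lower RMSE and a higher R2 both indicate a closer fit, so the two measures are usually expected to rank models in the same order when they are computed on the same data.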

  4. A novel strategy for forensic age prediction by DNA methylation and support vector regression model

    PubMed Central

    Xu, Cheng; Qu, Hongzhu; Wang, Guangyu; Xie, Bingbing; Shi, Yi; Yang, Yaran; Zhao, Zhao; Hu, Lan; Fang, Xiangdong; Yan, Jiangwei; Feng, Lei

    2015-01-01

    High deviations resulting from prediction model, gender and population difference have limited age estimation application of DNA methylation markers. Here we identified 2,957 novel age-associated DNA methylation sites (P < 0.01 and R² > 0.5) in blood of eight pairs of Chinese Han female monozygotic twins. Among them, nine novel sites (false discovery rate < 0.01), along with three other reported sites, were further validated in 49 unrelated female volunteers with ages of 20–80 years by Sequenom Massarray. A total of 95 CpGs were covered in the PCR products and 11 of them were used to build the age prediction models. After comparing four different models, including multivariate linear regression, multivariate nonlinear regression, back propagation neural network and support vector regression (SVR), SVR was identified as the most robust model with the least mean absolute deviation from real chronological age (2.8 years) and an average accuracy of 4.7 years predicted by only six loci from the 11 loci, as well as a lower cross-validated error compared with the linear regression model. Our novel strategy provides an accurate measurement that is highly useful in estimating the individual age in forensic practice as well as in tracking the aging process in other related applications. PMID:26635134

  5. Why are we regressing?

    PubMed

    Jupiter, Daniel C

    2012-01-01

    In this first of a series of statistical methodology commentaries for the clinician, we discuss the use of multivariate linear regression. Copyright © 2012 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  6. Multivariable regression analysis of list experiment data on abortion: results from a large, randomly-selected population based study in Liberia.

    PubMed

    Moseson, Heidi; Gerdts, Caitlin; Dehlendorf, Christine; Hiatt, Robert A; Vittinghoff, Eric

    2017-12-21

    The list experiment is a promising measurement tool for eliciting truthful responses to stigmatized or sensitive health behaviors. However, investigators may be hesitant to adopt the method due to previously untestable assumptions and the perceived inability to conduct multivariable analysis. With a recently developed statistical test that can detect the presence of a design effect - the absence of which is a central assumption of the list experiment method - we sought to test the validity of a list experiment conducted on self-reported abortion in Liberia. We also aim to introduce recently developed multivariable regression estimators for the analysis of list experiment data, to explore relationships between respondent characteristics and having had an abortion - an important component of understanding the experiences of women who have abortions. To test the null hypothesis of no design effect in the Liberian list experiment data, we calculated the percentage of each respondent "type," characterized by response to the control items, and compared these percentages across treatment and control groups with a Bonferroni-adjusted alpha criterion. We then implemented two least squares and two maximum likelihood models (four total), each representing different bias-variance trade-offs, to estimate the association between respondent characteristics and abortion. We find no clear evidence of a design effect in list experiment data from Liberia (p = 0.18), affirming the first key assumption of the method. Multivariable analyses suggest a negative association between education and history of abortion. The retrospective nature of measuring lifetime experience of abortion, however, complicates interpretation of results, as the timing and safety of a respondent's abortion may have influenced her ability to pursue an education. Our work demonstrates that multivariable analyses, as well as statistical testing of a key design assumption, are possible with list experiment data
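
    For context, the basic (unadjusted) list experiment estimate that the multivariable estimators build on is a difference in mean item counts between the treatment list, which includes the sensitive item, and the control list, which does not. The sketch below shows only that difference-in-means step with made-up counts; it is not the least squares or maximum likelihood estimators used in the study, and the function name is hypothetical.

```python
from statistics import mean


def list_experiment_prevalence(control_counts, treatment_counts):
    """Estimate the share answering yes to the sensitive item.

    Each list holds, per respondent, the number of items endorsed.
    The treatment list contains one extra (sensitive) item.
    """
    return mean(treatment_counts) - mean(control_counts)


# Made-up responses: counts of endorsed items per respondent.
control = [1, 2, 3, 2, 1, 2]
treatment = [2, 2, 3, 3, 1, 2]
print(list_experiment_prevalence(control, treatment))
```

    The estimate is only meaningful if adding the sensitive item does not change how respondents answer the control items, which is the no-design-effect assumption the abstract describes testing.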

  7. Regression and multivariate models for predicting particulate matter concentration level.

    PubMed

    Nazif, Amina; Mohammed, Nurul Izma; Malakahmad, Amirhossein; Abualqumboz, Motasem S

    2018-01-01

    The devastating health effects of particulate matter (PM10) exposure on susceptible populations have made it necessary to evaluate PM10 pollution. Meteorological parameters and seasonal variation increase PM10 concentration levels, especially in areas that have multiple anthropogenic activities. Hence, stepwise regression (SR), multiple linear regression (MLR) and principal component regression (PCR) analyses were used to analyse daily average PM10 concentration levels. The analyses were carried out using daily average PM10 concentration, temperature, humidity, wind speed and wind direction data from 2006 to 2010. The data were from an industrial air quality monitoring station in Malaysia. The SR analysis established that meteorological parameters had less influence on PM10 concentration levels, with coefficient of determination (R²) results from 23 to 29% based on seasoned and unseasoned analysis. The prediction analysis showed that PCR models had better R² results than MLR models: based on both seasoned and unseasoned data, MLR models had R² results from 0.50 to 0.60, while PCR models had R² results from 0.66 to 0.89. In addition, the validation analysis using 2016 data also recognised that the PCR model outperformed the MLR model, with the PCR model for the seasoned analysis having the best result. These analyses will aid in achieving sustainable air quality management strategies.

  8. Testing Multivariate Adaptive Regression Splines (MARS) as a Method of Land Cover Classification of TERRA-ASTER Satellite Images.

    PubMed

    Quirós, Elia; Felicísimo, Angel M; Cuartero, Aurora

    2009-01-01

    This work proposes a new method to classify multi-spectral satellite images based on multivariate adaptive regression splines (MARS) and compares this classification system with the more common parallelepiped and maximum likelihood (ML) methods. We apply the classification methods to the land cover classification of a test zone located in southwestern Spain. The basis of the MARS method and its associated procedures are explained in detail, and the area under the ROC curve (AUC) is compared for the three methods. The results show that the MARS method provides better results than the parallelepiped method in all cases, and it provides better results than the maximum likelihood method in 13 cases out of 17. These results demonstrate that the MARS method can be used in isolation or in combination with other methods to improve the accuracy of soil cover classification. The improvement is statistically significant according to the Wilcoxon signed rank test.

  9. The impact of operative time on complications after plastic surgery: a multivariate regression analysis of 1753 cases.

    PubMed

    Hardy, Krista L; Davis, Kathryn E; Constantine, Ryan S; Chen, Mo; Hein, Rachel; Jewell, James L; Dirisala, Karunakar; Lysikowski, Jerzy; Reed, Gary; Kenkel, Jeffrey M

    2014-05-01

    Little evidence within plastic surgery literature supports the precept that longer operative times lead to greater morbidity. The authors investigate surgery duration as a determinant of morbidity, with the goal of defining a clinically relevant time for increased risk. A retrospective chart review was conducted of patients who underwent a broad range of complex plastic surgical procedures (n = 1801 procedures) at UT Southwestern Medical Center in Dallas, Texas, from January 1, 2008 to January 31, 2012. Adjusting for possible confounders, multivariate logistic regression assessed surgery duration as an independent predictor of morbidity. To define a cutoff for increased risk, incidence of complications was compared among quintiles of surgery duration. Stratification by type of surgery controlled for procedural complexity. A total of 1753 cases were included in multivariate analyses with an overall complication rate of 27.8%. Most operations were combined (75.8%), averaging 4.9 concurrent procedures. Each hour increase in surgery duration was associated with a 21% rise in odds of morbidity (P < .0001). Compared with the first quintile of operative time (<2.0 hours), there was no change in complications until after 3.1 hours of surgery (odds ratio, 1.6; P = .017), with progressively greater odds increases of 3.1 times after 4.5 hours (P < .0001) and 4.7 times after 6.8 hours (P < .0001). When stratified by type of surgery, longer operations continued to be associated with greater morbidity. Surgery duration is an independent predictor of complications, with a significantly increased risk above 3 hours. Although procedural complexity undoubtedly affects morbidity, operative time should factor into surgical decision making.
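
    To make the per-hour figure concrete: in a logistic regression, a 21% rise in odds per hour corresponds to a coefficient of ln(1.21) on surgery duration, and the implied odds ratio for several additional hours is that per-hour ratio compounded. The sketch below is illustrative arithmetic only. It assumes the log-linear effect of the per-hour model; the quintile odds ratios reported in the abstract are separate estimates and need not match it. The function name is hypothetical.

```python
import math

PER_HOUR_OR = 1.21  # from the abstract: 21% rise in odds per additional hour

def odds_ratio_for(extra_hours, per_hour_or=PER_HOUR_OR):
    """Implied odds ratio for `extra_hours` more surgery time,
    assuming the effect is linear on the log-odds scale."""
    beta = math.log(per_hour_or)       # logistic regression coefficient per hour
    return math.exp(beta * extra_hours)

for hours in (1, 2, 3):
    print(hours, round(odds_ratio_for(hours), 3))
```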

  10. Aspects of porosity prediction using multivariate linear regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Byrnes, A.P.; Wilson, M.D.

    1991-03-01

    Highly accurate multiple linear regression models have been developed for sandstones of diverse compositions. Porosity reduction or enhancement processes are controlled by the fundamental variables, Pressure (P), Temperature (T), Time (t), and Composition (X), where composition includes mineralogy, size, sorting, fluid composition, etc. The multiple linear regression equation, of which all linear porosity prediction models are subsets, takes the generalized form: Porosity = C_0 + C_1(P) + C_2(T) + C_3(X) + C_4(t) + C_5(PT) + C_6(PX) + C_7(Pt) + C_8(TX) + C_9(Tt) + C_10(Xt) + C_11(PTX) + C_12(PXt) + C_13(PTt) + C_14(TXt) + C_15(PTXt). The first four primary variables are often interactive, thus requiring terms involving two or more primary variables (the form shown implies interaction and not necessarily multiplication). The final terms used may also involve simple mathematical transforms such as log X, e^T, X^2, or more complex transformations such as the Time-Temperature Index (TTI). The X term in the equation above represents a suite of compositional variables and, therefore, a fully expanded equation may include a series of terms incorporating these variables. Numerous published bivariate porosity prediction models involving P (or depth) or Tt (TTI) are effective to a degree, largely because of the high degree of colinearity between P and TTI. However, all such bivariate models ignore the unique contributions of P and Tt, as well as various X terms. These simpler models become poor predictors in regions where colinear relations change, where important variables have been ignored, or where the database does not include a sufficient range or weight distribution for the critical variables.
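
    The generalized form above has one term for every non-empty combination of the four primary variables (4 single, 6 pairwise, 4 three-way, 1 four-way), plus the intercept. The sketch below enumerates those 15 terms. It assumes, purely for illustration, that each interaction is a simple product, which the abstract notes is not necessarily the case; the function names, the coefficient dictionary, and the input values are hypothetical.

```python
from itertools import combinations
from math import prod

def interaction_terms(values):
    """Map each non-empty combination of variables to the product of its values.

    `values` is an ordered dict such as {"P": ..., "T": ..., "X": ..., "t": ...}.
    Using a product for each interaction is an assumption of this sketch.
    """
    names = list(values)
    terms = {}
    for order in range(1, len(names) + 1):
        for combo in combinations(names, order):
            terms["".join(combo)] = prod(values[n] for n in combo)
    return terms

def predict_porosity(coefs, values):
    """Evaluate intercept plus the weighted sum of terms; missing coefficients count as 0."""
    terms = interaction_terms(values)
    return coefs.get("intercept", 0.0) + sum(
        coefs.get(label, 0.0) * value for label, value in terms.items()
    )

# Hypothetical inputs, for illustration only.
terms = interaction_terms({"P": 2.0, "T": 3.0, "X": 5.0, "t": 7.0})
print(len(terms), sorted(terms, key=len))
```

    Dropping a coefficient from `coefs` yields one of the subset models the abstract mentions, for example a bivariate model that keeps only the intercept and the P term.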

  11. Expert Involvement Predicts mHealth App Downloads: Multivariate Regression Analysis of Urology Apps

    PubMed Central

    Osório, Luís; Cavadas, Vitor; Fraga, Avelino; Carrasquinho, Eduardo; Cardoso de Oliveira, Eduardo; Castelo-Branco, Miguel; Roobol, Monique J

    2016-01-01

    Background Urological mobile medical (mHealth) apps are gaining popularity with both clinicians and patients. mHealth is a rapidly evolving and heterogeneous field, with some urology apps being downloaded over 10,000 times and others not at all. The factors that contribute to medical app downloads have yet to be identified, including the hypothetical influence of expert involvement in app development. Objective The objective of our study was to identify predictors of the number of urology app downloads. Methods We reviewed urology apps available in the Google Play Store and collected publicly available data. Multivariate ordinal logistic regression evaluated the effect of publicly available app variables on the number of apps being downloaded. Results Of 129 urology apps eligible for study, only 2 (1.6%) had >10,000 downloads, with half having ≤100 downloads and 4 (3.1%) having none at all. Apps developed with expert urologist involvement (P=.003), optional in-app purchases (P=.01), higher user rating (P<.001), and more user reviews (P<.001) were more likely to be installed. App cost was inversely related to the number of downloads (P<.001). Only data from the Google Play Store and the developers’ websites, but not other platforms, were publicly available for analysis, and the level and nature of expert involvement was not documented. Conclusions The explicit participation of urologists in app development is likely to enhance its chances to have a higher number of downloads. This finding should help in the design of better apps and further promote urologist involvement in mHealth. Official certification processes are required to ensure app quality and user safety. PMID:27421338

  12. Logistic models--an odd(s) kind of regression.

    PubMed

    Jupiter, Daniel C

    2013-01-01

    The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  13. Multivariate analysis of risk factors for long-term urethroplasty outcome.

    PubMed

    Breyer, Benjamin N; McAninch, Jack W; Whitson, Jared M; Eisenberg, Michael L; Mehdizadeh, Jennifer F; Myers, Jeremy B; Voelzke, Bryan B

    2010-02-01

    We studied the patient risk factors that promote urethroplasty failure. Records of patients who underwent urethroplasty at the University of California, San Francisco Medical Center between 1995 and 2004 were reviewed. Cox proportional hazards regression analysis was used to identify multivariate predictors of urethroplasty outcome. Between 1995 and 2004, 443 patients of 495 who underwent urethroplasty had complete comorbidity data and were included in analysis. Median patient age was 41 years (range 18 to 90). Median followup was 5.8 years (range 1 month to 10 years). Stricture recurred in 93 patients (21%). Primary estimated stricture-free survival at 1, 3 and 5 years was 88%, 82% and 79%. After multivariate analysis smoking (HR 1.8, 95% CI 1.0-3.1, p = 0.05), prior direct vision internal urethrotomy (HR 1.7, 95% CI 1.0-3.0, p = 0.04) and prior urethroplasty (HR 1.8, 95% CI 1.1-3.1, p = 0.03) were predictive of treatment failure. On multivariate analysis diabetes mellitus showed a trend toward prediction of urethroplasty failure (HR 2.0, 95% CI 0.8-4.9, p = 0.14). Length of urethral stricture (greater than 4 cm), prior urethroplasty and failed endoscopic therapy are predictive of failure after urethroplasty. Smoking and diabetes mellitus also may predict failure potentially secondary to microvascular damage. Copyright 2010 American Urological Association. Published by Elsevier Inc. All rights reserved.

  14. Characterizing multivariate decoding models based on correlated EEG spectral features.

    PubMed

    McFarland, Dennis J

    2013-07-01

    Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  15. Characterizing multivariate decoding models based on correlated EEG spectral features

    PubMed Central

    McFarland, Dennis J.

    2013-01-01

    Objective Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. PMID:23466267

  16. Regional flow duration curves: Geostatistical techniques versus multivariate regression

    USGS Publications Warehouse

    Pugliese, Alessio; Farmer, William H.; Castellarin, Attilio; Archfield, Stacey A.; Vogel, Richard M.

    2016-01-01

    A period-of-record flow duration curve (FDC) represents the relationship between the magnitude and frequency of daily streamflows. Prediction of FDCs is of great importance for locations characterized by sparse or missing streamflow observations. We present a detailed comparison of two methods which are capable of predicting an FDC at ungauged basins: (1) an adaptation of the geostatistical method, Top-kriging, employing a linear weighted average of dimensionless empirical FDCs, standardised with a reference streamflow value; and (2) regional multiple linear regression of streamflow quantiles, perhaps the most common method for the prediction of FDCs at ungauged sites. In particular, Top-kriging relies on a metric for expressing the similarity between catchments computed as the negative deviation of the FDC from a reference streamflow value, which we termed total negative deviation (TND). Comparisons of these two methods are made in 182 largely unregulated river catchments in the southeastern U.S. using a three-fold cross-validation algorithm. Our results reveal that the two methods perform similarly throughout flow-regimes, with average Nash-Sutcliffe Efficiencies of 0.566 and 0.662 (0.883 and 0.829 on log-transformed quantiles) for the geostatistical and the linear regression models, respectively. Differences in the reproduction of FDCs occurred mostly for low flows with exceedance probability (i.e. duration) above 0.98.
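
    The Nash-Sutcliffe Efficiency used to score both methods is one minus the ratio of the squared prediction error to the variance of the observations around their mean, so 1 indicates a perfect fit and 0 indicates a model no better than predicting the observed mean. A minimal sketch with hypothetical streamflow quantiles follows; the function name and data are illustrative and do not reproduce the study's computation.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - SSE / sum of squares about the observed mean."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    obs_mean = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst

# Hypothetical quantiles, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
perfect = nash_sutcliffe(obs, obs)
mean_only = nash_sutcliffe(obs, [2.5, 2.5, 2.5, 2.5])
print(perfect, mean_only)
```

    Applying the same function to log-transformed quantiles gives more weight to low flows, which is why the abstract reports both variants.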

  17. Estimating irradiated nuclear fuel characteristics by nonlinear multivariate regression of simulated gamma-ray emissions

    NASA Astrophysics Data System (ADS)

    Åberg Lindell, M.; Andersson, P.; Grape, S.; Håkansson, A.; Thulin, M.

    2018-07-01

    In addition to verifying operator declared parameters of spent nuclear fuel, the ability to experimentally infer such parameters with a minimum of intrusiveness is of great interest and has been long-sought after in the nuclear safeguards community. It can also be anticipated that such ability would be of interest for quality assurance in e.g. recycling facilities in future Generation IV nuclear fuel cycles. One way to obtain information regarding spent nuclear fuel is to measure various gamma-ray intensities using high-resolution gamma-ray spectroscopy. While intensities from a few isotopes obtained from such measurements have traditionally been used pairwise, the approach in this work is to simultaneously analyze correlations between all available isotopes, using multivariate analysis techniques. Based on this approach, a methodology for inferring burnup, cooling time, and initial fissile content of PWR fuels using passive gamma-ray spectroscopy data has been investigated. PWR nuclear fuels, of UOX and MOX type, and their gamma-ray emissions, were simulated using the Monte Carlo code Serpent. Data comprising relative isotope activities was analyzed with decision trees and support vector machines, for predicting fuel parameters and their associated uncertainties. From this work it may be concluded that up to a cooling time of twenty years, the 95% prediction intervals of burnup, cooling time and initial fissile content could be inferred to within approximately 7 MWd/kgHM, 8 months, and 1.4 percentage points, respectively. An attempt aiming to estimate the plutonium content in spent UOX fuel, using the developed multivariate analysis model, is also presented. The results for Pu mass estimation are promising and call for further studies.

  18. Causal diagrams and multivariate analysis II: precision work.

    PubMed

    Jupiter, Daniel C

    2014-01-01

    In this Investigators' Corner, I continue my discussion of when and why we researchers should include variables in multivariate regression. My examination focuses on studies comparing treatment groups and situations for which we can either exclude variables from multivariate analyses or include them for reasons of precision. Copyright © 2014 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  19. Multivariate analysis of nystatin and metronidazole in a semi-solid matrix by means of diffuse reflectance NIR spectroscopy and PLS regression.

    PubMed

    Baratieri, Sabrina C; Barbosa, Juliana M; Freitas, Matheus P; Martins, José A

    2006-01-23

    A multivariate method of analysis of nystatin and metronidazole in a semi-solid matrix, based on diffuse reflectance NIR measurements and partial least squares regression, is reported. The product, a vaginal cream used in antifungal and antibacterial treatment, is usually analyzed quantitatively through microbiological tests (nystatin) and HPLC (metronidazole), according to pharmacopeial procedures. However, near infrared spectroscopy has proven to be a valuable tool for content determination, given the rapidity and scope of the method. In the present study, it was successfully applied to the prediction of nystatin (even in low concentrations, ca. 0.3-0.4%, w/w, which is around 100,000 IU/5g) and metronidazole contents, as demonstrated by some figures of merit, namely linearity, precision (mean and repeatability) and accuracy.

  20. Multivariate Boosting for Integrative Analysis of High-Dimensional Cancer Genomic Data

    PubMed Central

    Xiong, Lie; Kuan, Pei-Fen; Tian, Jianan; Keles, Sunduz; Wang, Sijian

    2015-01-01

    In this paper, we propose a novel multivariate component-wise boosting method for fitting multivariate response regression models under the high-dimension, low sample size setting. Our method is motivated by modeling the association among different biological molecules based on multiple types of high-dimensional genomic data. Particularly, we are interested in two applications: studying the influence of DNA copy number alterations on RNA transcript levels and investigating the association between DNA methylation and gene expression. For this purpose, we model the dependence of the RNA expression levels on DNA copy number alterations and the dependence of gene expression on DNA methylation through multivariate regression models and utilize boosting-type method to handle the high dimensionality as well as model the possible nonlinear associations. The performance of the proposed method is demonstrated through simulation studies. Finally, our multivariate boosting method is applied to two breast cancer studies. PMID:26609213

  1. Financial Aid and First-Year Collegiate GPA: A Regression Discontinuity Approach

    ERIC Educational Resources Information Center

    Curs, Bradley R.; Harper, Casandra E.

    2012-01-01

    Using a regression discontinuity design, we investigate whether a merit-based financial aid program has a causal effect on the first-year grade point average of first-time out-of-state freshmen at the University of Oregon. Our results indicate that merit-based financial aid has a positive and significant effect on first-year collegiate grade point…

  2. Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM]

    ERIC Educational Resources Information Center

    Warner, Rebecca M.

    2007-01-01

    This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…

  3. Rex fortran 4 system for combinatorial screening or conventional analysis of multivariate regressions

    Treesearch

    L.R. Grosenbaugh

    1967-01-01

    Describes an expansible computerized system that provides data needed in regression or covariance analysis of as many as 50 variables, 8 of which may be dependent. Alternatively, it can screen variously generated combinations of independent variables to find the regression with the smallest mean-squared-residual, which will be fitted if desired. The user can easily...

  4. A multivariate auto-regressive combined-harmonics analysis and its application to ozone time series data

    NASA Astrophysics Data System (ADS)

    Yang, Eun-Su

    2001-07-01

    A new statistical approach is used to analyze Dobson Umkehr layer-ozone measurements at Arosa for 1979-1996 and Total Ozone Mapping Spectrometer (TOMS) Version 7 zonal mean ozone for 1979-1993, accounting for stratospheric aerosol optical depth (SAOD), quasi-biennial oscillation (QBO), and solar flux effects. A stepwise regression scheme selects statistically significant periodicities caused by season, SAOD, QBO, and solar variations and filters them out. Auto-regressive (AR) terms are included in ozone residuals and time lags are assumed for the residuals of exogenous variables. Then, the magnitudes of responses of ozone to the SAOD, QBO, and solar index (SI) series are derived from the stationary time series of the residuals. These Multivariate Auto-Regressive Combined Harmonics (MARCH) processes possess the following significant advantages: (1) the ozone trends are estimated more precisely than with previous methods; (2) the influences of the exogenous SAOD, QBO, and solar variations are clearly separated at various time lags; (3) the collinearity of the exogenous variables in the regression is significantly reduced; and (4) the probability of obtaining misleading correlations between ozone and exogenous time series is reduced. The MARCH results indicate that the Umkehr ozone response to SAOD (not a real ozone response but rather an optical interference effect), QBO, and solar effects is driven by combined dynamical radiative-chemical processes. These results are independently confirmed using the revised Standard models that include aerosol and solar forcing mechanisms with all possible time lags but not by the Standard model when restricted to a zero time lag in aerosol and solar ozone forcings. As for Dobson Umkehr ozone measurements at Arosa, the aerosol effects are most significant in layers 8, 7, and 6 with no time lag, as is to be expected due to the optical contamination of Umkehr measurements by SAOD. The QBO and solar UV effects appear in all layers 4

  5. Using Multivariate Adaptive Regression Spline and Artificial Neural Network to Simulate Urbanization in Mumbai, India

    NASA Astrophysics Data System (ADS)

    Ahmadlou, M.; Delavar, M. R.; Tayyebi, A.; Shafizadeh-Moghadam, H.

    2015-12-01

    Land use change (LUC) models used for modelling urban growth differ in structure and performance. Local models divide the data into separate subsets and fit distinct models on each of the subsets. Non-parametric models are data driven and usually do not have a fixed model structure, or the model structure is unknown before the modelling process. On the other hand, global models perform modelling using all the available data. In addition, parametric models have a fixed structure before the modelling process and they are model driven. Since few studies have compared local non-parametric models with global parametric models, this study compares a local non-parametric model called multivariate adaptive regression spline (MARS), and a global parametric model called artificial neural network (ANN) to simulate urbanization in Mumbai, India. Both models determine the relationship between a dependent variable and multiple independent variables. We used receiver operating characteristic (ROC) to compare the power of both models for simulating urbanization. Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling the urbanization process. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to water, distance to forest, distance to railway, distance to central business district, number of agricultural cells in a 7 by 7 neighbourhood, and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS in simulating urban areas in Mumbai, India.
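
    The area under the ROC curve used for the comparison equals the probability that a randomly chosen changed (urbanized) cell receives a higher model score than a randomly chosen unchanged cell, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name, labels, and scores are hypothetical and unrelated to the Mumbai data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. urbanized), 0 otherwise.
    scores: model output where higher means more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical labels and scores, for illustration only.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)
```

    This pairwise form is quadratic in the number of cells, so it suits small illustrations; for full raster maps a rank-based computation would be the more practical choice.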

  6. Modelling daily dissolved oxygen concentration using least square support vector machine, multivariate adaptive regression splines and M5 model tree

    NASA Astrophysics Data System (ADS)

    Heddam, Salim; Kisi, Ozgur

    2018-04-01

    In the present study, three artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T), are applied to model daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality data from three stations operated by the United States Geological Survey (USGS) were used to develop the three models. The selected water quality data, consisting of daily measurements of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI, cfs), were used as inputs to the LSSVM, MARS and M5T models. The three models were applied to each station separately and compared with each other. The results showed that (i) the DO concentration could be successfully estimated using the three models and (ii) the best model differs from one station to another.

  7. A Ten Year Study of Salary Differential by Sex through a Regression Methodology.

    ERIC Educational Resources Information Center

    Williams, John Delane; And Others

    A 10-year study of salary differential by sex was undertaken at the University of North Dakota using a multiple regression methodology, with rank, discipline, degree, years in department, years in current rank, and sex as predictors. The sex variable evidenced lower salaries for women when controlling for the other variables throughout the study…

  8. Multivariate linear regression analysis to identify general factors for quantitative predictions of implant stability quotient values

    PubMed Central

    Huang, Hairong; Xu, Zanzan; Shao, Xianhong; Wismeijer, Daniel; Sun, Ping; Wang, Jingxiao

    2017-01-01

    Objectives This study identified potential general influencing factors for a mathematical prediction of implant stability quotient (ISQ) values in clinical practice. Methods We collected the ISQ values of 557 implants from 2 different brands (SICace and Osstem) placed by 2 surgeons in 336 patients. Surgeon 1 placed 329 SICace implants, and surgeon 2 placed 113 SICace implants and 115 Osstem implants. ISQ measurements were taken at T1 (immediately after implant placement) and T2 (before dental restoration). A multivariate linear regression model was used to analyze the influence of the following 11 candidate factors for stability prediction: sex, age, maxillary/mandibular location, bone type, immediate/delayed implantation, bone grafting, insertion torque, I-stage or II-stage healing pattern, implant diameter, implant length and T1-T2 time interval. Results The need for bone grafting as a predictor significantly influenced ISQ values in all three groups at T1 (weight coefficients ranging from -4 to -5). In contrast, implant diameter consistently influenced the ISQ values in all three groups at T2 (weight coefficients ranging from 3.4 to 4.2). Other factors, such as sex, age, I/II-stage implantation and bone type, did not significantly influence ISQ values at T2, and implant length did not significantly influence ISQ values at T1 or T2. Conclusions These findings provide a rational basis for mathematical models to quantitatively predict the ISQ values of implants in clinical practice. PMID:29084260

  9. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    PubMed

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R²=0.58 vs. 0.55) or a cross-validation procedure (R²=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  10. Spatial regression analysis on 32 years of total column ozone data

    NASA Astrophysics Data System (ADS)

    Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

    2014-08-01

    Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) ozone data (2009-2010). The two-dimensionality in this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on nonseasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño-Southern Oscillation (ENSO) and stratospheric alternative halogens which are parameterized by the effective equivalent stratospheric chlorine (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to those of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at mid- and high latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high northern latitudes, the effect of QBO is positive and negative in the tropics and mid- to high latitudes, respectively, and ENSO affects ozone negatively

  11. Household Food Waste: Multivariate Regression and Principal Components Analyses of Awareness and Attitudes among U.S. Consumers

    PubMed Central

    2016-01-01

    We estimate models of consumer food waste awareness and attitudes using responses from a national survey of U.S. residents. Our models are interpreted through the lens of several theories that describe how pro-social behaviors relate to awareness, attitudes and opinions. Our analysis of patterns among respondents’ food waste attitudes yields a model with three principal components: one that represents perceived practical benefits households may lose if food waste were reduced, one that represents the guilt associated with food waste, and one that represents whether households feel they could be doing more to reduce food waste. We find our respondents express significant agreement that some perceived practical benefits are ascribed to throwing away uneaten food, e.g., nearly 70% of respondents agree that throwing away food after the package date has passed reduces the odds of foodborne illness, while nearly 60% agree that some food waste is necessary to ensure meals taste fresh. We identify that these attitudinal responses significantly load onto a single principal component that may represent a key attitudinal construct useful for policy guidance. Further, multivariate regression analysis reveals a significant positive association between the strength of this component and household income, suggesting that higher income households most strongly agree with statements that link throwing away uneaten food to perceived private benefits. PMID:27441687

  12. Multivariate regression model for partitioning tree volume of white oak into round-product classes

    Treesearch

    Daniel A. Yaussy; David L. Sonderman

    1984-01-01

    Describes the development of multivariate equations that predict the expected cubic volume of four round-product classes from independent variables composed of individual tree-quality characteristics. Although the model has limited application at this time, it does demonstrate the feasibility of partitioning total tree cubic volume into round-product classes based on...

  13. Using Multivariate Regression Model with Least Absolute Shrinkage and Selection Operator (LASSO) to Predict the Incidence of Xerostomia after Intensity-Modulated Radiotherapy for Head and Neck Cancer

    PubMed Central

    Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan

    2014-01-01

    Purpose: The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Methods and Materials: Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R², chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Results: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R² was satisfactory and corresponded well with the expected values. Conclusions

  14. Using multivariate regression model with least absolute shrinkage and selection operator (LASSO) to predict the incidence of Xerostomia after intensity-modulated radiotherapy for head and neck cancer.

    PubMed

    Lee, Tsair-Fwu; Chao, Pei-Ju; Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan

    2014-01-01

    The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3(+) xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R(2), chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R(2) was satisfactory and corresponded well with the expected values. Multivariate NTCP models with LASSO can be used to
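
    The general idea of LASSO-penalized logistic regression combined with bootstrapping can be sketched as below. The abstract does not give the fitting algorithm or the selection rule, so everything here is an assumption for illustration: the proximal-gradient solver, the penalty `lam`, the number of resamples, the use of selection frequency as the summary, and the synthetic data (which only borrows the sample size of 206).

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_logistic(X, y, lam, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept b is left unpenalized.
    """
    n, p = X.shape
    A = np.column_stack([X, np.ones(n)])
    # Step size 1/L, with L the Lipschitz constant of the mean logistic loss gradient.
    step = 1.0 / (0.25 * np.linalg.norm(A, 2) ** 2 / n)
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        resid = prob - y
        w = soft_threshold(w - step * (X.T @ resid) / n, step * lam)
        b -= step * resid.mean()
    return w, b

def selection_frequency(X, y, lam, n_boot=50, seed=0):
    """Fraction of bootstrap resamples in which each coefficient is non-zero."""
    rng = np.random.default_rng(seed)
    n = len(y)
    counts = np.zeros(X.shape[1])
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        w, _b = lasso_logistic(X[idx], y[idx], lam)
        counts += (w != 0)
    return counts / n_boot

# Hypothetical data: 8 standardized candidate factors, only the first two matter.
rng = np.random.default_rng(42)
X = rng.normal(size=(206, 8))
logits = 2.0 * X[:, 0] - 1.5 * X[:, 1]
y = (rng.random(206) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

freq = selection_frequency(X, y, lam=0.05)
print(np.round(freq, 2))
```

    Factors that are selected in nearly every resample are the stable candidates; how many of them to keep is a separate judgment, which the abstract describes in terms of goodness-of-fit and AUC.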

  15. Multivariate meta-analysis for non-linear and other multi-parameter associations

    PubMed Central

    Gasparrini, A; Armstrong, B; Kenward, M G

    2012-01-01

    In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043

  16. Modeling absolute differences in life expectancy with a censored skew-normal regression approach

    PubMed Central

    Clough-Gorr, Kerri; Zwahlen, Marcel

    2015-01-01

    Parameter estimates from commonly used multivariable parametric survival regression models do not directly quantify differences in years of life expectancy. Gaussian linear regression models give results in terms of absolute mean differences, but are not appropriate in modeling life expectancy, because in many situations time to death has a negative skewed distribution. A regression approach using a skew-normal distribution would be an alternative to parametric survival models in the modeling of life expectancy, because parameter estimates can be interpreted in terms of survival time differences while allowing for skewness of the distribution. In this paper we show how to use the skew-normal regression so that censored and left-truncated observations are accounted for. With this we model differences in life expectancy using data from the Swiss National Cohort Study and from official life expectancy estimates and compare the results with those derived from commonly used survival regression models. We conclude that a censored skew-normal survival regression approach for left-truncated observations can be used to model differences in life expectancy across covariates of interest. PMID:26339544

  17. Field applications of stand-off sensing using visible/NIR multivariate optical computing

    NASA Astrophysics Data System (ADS)

    Eastwood, DeLyle; Soyemi, Olusola O.; Karunamuni, Jeevanandra; Zhang, Lixia; Li, Hongli; Myrick, Michael L.

    2001-02-01

    A novel multivariate visible/NIR optical computing approach applicable to standoff sensing will be demonstrated with porphyrin mixtures as examples. The ultimate goal is to develop environmental or counter-terrorism sensors for chemicals such as organophosphorus (OP) pesticides or chemical warfare simulants in the near infrared spectral region. The mathematical operation that characterizes prediction of properties via regression from optical spectra is a calculation of inner products between the spectrum and the pre-determined regression vector. The result is scaled appropriately and offset to correspond to the basis from which the regression vector is derived. The process involves collecting spectroscopic data and synthesizing a multivariate vector using a pattern recognition method. Then, an interference coating is designed that reproduces the pattern of the multivariate vector in its transmission or reflection spectrum, and appropriate interference filters are fabricated. High and low refractive index materials such as Nb2O5 and SiO2 are excellent choices for the visible and near infrared regions. The proof of concept has now been established for this system in the visible and will later be extended to chemicals such as OP compounds in the near and mid-infrared.
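
    The prediction step named in this abstract (an inner product between the spectrum and a regression vector, then scaling and an offset) is small enough to write out. The numbers and the names `scale` and `offset` below are made up for illustration; in the optical implementation the inner product is performed by the interference filter itself rather than in software.

```python
def predict_property(spectrum, regression_vector, scale=1.0, offset=0.0):
    """Predicted property = scale * <spectrum, regression_vector> + offset."""
    if len(spectrum) != len(regression_vector):
        raise ValueError("spectrum and regression vector must have equal length")
    inner = sum(s * b for s, b in zip(spectrum, regression_vector))
    return scale * inner + offset

# Hypothetical four-channel spectrum and regression vector.
spectrum = [0.10, 0.40, 0.35, 0.15]
regression_vector = [2.0, -1.0, 0.5, 3.0]

# Inner product: 0.2 - 0.4 + 0.175 + 0.45 = 0.425
prediction = predict_property(spectrum, regression_vector, scale=10.0, offset=1.0)
print(prediction)
```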

  18. A spline-based regression parameter set for creating customized DARTEL MRI brain templates from infancy to old age.

    PubMed

    Wilke, Marko

    2018-02-01

    This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1-75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php.

  19. Multivariate analysis of cytokine profiles in pregnancy complications.

    PubMed

    Azizieh, Fawaz; Dingle, Kamaludin; Raghupathy, Raj; Johnson, Kjell; VanderPlas, Jacob; Ansari, Ali

    2018-03-01

    The immunoregulation to tolerate the semiallogeneic fetus during pregnancy includes a harmonious dynamic balance between anti- and pro-inflammatory cytokines. Several earlier studies reported significantly different levels and/or ratios of several cytokines in complicated pregnancy as compared to normal pregnancy. However, as cytokines operate in networks with potentially complex interactions, it is also interesting to compare groups with multi-cytokine data sets, with multivariate analysis. Such analysis will further examine how great the differences are, and which cytokines are more different than others. Various multivariate statistical tools, such as Cramer test, classification and regression trees, partial least squares regression figures, 2-dimensional Kolmogorov-Smirnov test, principal component analysis and gap statistic, were used to compare cytokine data of normal vs anomalous groups of different pregnancy complications. Multivariate analysis assisted in examining if the groups were different, how strongly they differed, in what ways they differed and further reported evidence for subgroups in 1 group (pregnancy-induced hypertension), possibly indicating multiple causes for the complication. This work contributes to a better understanding of cytokine interactions and may have important implications for targeting cytokine balance modulation or design of future medications or interventions that best direct management or prevention from an immunological approach. © 2018 The Authors. American Journal of Reproductive Immunology Published by John Wiley & Sons Ltd.

  20. [Academic performance in first year medical students: an explanatory multivariate model].

    PubMed

    Urrutia Aguilar, María Esther; Ortiz León, Silvia; Fouilloux Morales, Claudia; Ponce Rosas, Efrén Raúl; Guevara Guzmán, Rosalinda

    2014-12-01

    Current education is focused on intellectual, affective, and ethical aspects, thus acknowledging their significance in students' metacognition. Nowadays, it is known that an adequate and motivating environment together with a positive attitude towards studies is fundamental to induce learning. Medical students face multiple stressful academic, personal, and vocational situations. The objective was to identify psychosocial, vocational, and academic variables of 2010-2011 first year medical students at UNAM that may help predict their academic performance. Academic surveys of psychological and vocational factors were applied; an academic follow-up was carried out to obtain a multivariate model. The data were analyzed considering descriptive, comparative, correlative, and predictive statistics. The main variables that affect students' academic performance are related to previous knowledge and to psychological variables. The results show the significance of implementing institutional programs to support students throughout their college adaptation.

  1. Improving Prediction Accuracy for WSN Data Reduction by Applying Multivariate Spatio-Temporal Correlation

    PubMed Central

    Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman

    2011-01-01

    This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we are the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
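
    The contrast drawn in this abstract, between predicting a reading from time alone and predicting it from other correlated readings, can be sketched on synthetic sensor data. The signals, coefficients, and variable names below are invented for the example and are not taken from the paper's simulations.

```python
import numpy as np

def ols_fit_predict(X_train, y_train, X_test):
    """Fit least squares with an intercept on the training rows, predict the test rows."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

# Hypothetical node readings: humidity driven by temperature and light, not by time.
rng = np.random.default_rng(1)
n = 200
t = np.arange(n, dtype=float)
temperature = 20 + 5 * np.sin(t / 15) + rng.normal(0, 0.3, n)
light = 300 + 100 * np.cos(t / 25) + rng.normal(0, 5, n)
humidity = 80 - 1.5 * temperature + 0.02 * light + rng.normal(0, 0.5, n)

train, test = slice(0, 150), slice(150, n)

# Simple linear regression: time is the only predictor.
pred_simple = ols_fit_predict(t[train, None], humidity[train], t[test, None])

# Multiple linear regression: the other sensed variables are the predictors.
sensors = np.column_stack([temperature, light])
pred_multi = ols_fit_predict(sensors[train], humidity[train], sensors[test])

rmse_simple = rmse(humidity[test], pred_simple)
rmse_multi = rmse(humidity[test], pred_multi)
print(f"RMSE time-only = {rmse_simple:.2f}, RMSE multi-sensor = {rmse_multi:.2f}")
```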

  2. Two-Year versus One-Year Head Start Program Impact: Addressing Selection Bias by Comparing Regression Modeling with Propensity Score Analysis

    ERIC Educational Resources Information Center

    Leow, Christine; Wen, Xiaoli; Korfmacher, Jon

    2015-01-01

    This article compares regression modeling and propensity score analysis as different types of statistical techniques used in addressing selection bias when estimating the impact of two-year versus one-year Head Start on children's school readiness. The analyses were based on the national Head Start secondary dataset. After controlling for…

  3. Regional vertical total electron content (VTEC) modeling together with satellite and receiver differential code biases (DCBs) using semi-parametric multivariate adaptive regression B-splines (SP-BMARS)

    NASA Astrophysics Data System (ADS)

    Durmaz, Murat; Karslioglu, Mahmut Onur

    2015-04-01

    There are various global and regional methods that have been proposed for the modeling of ionospheric vertical total electron content (VTEC). Global distribution of VTEC is usually modeled by spherical harmonic expansions, while tensor products of compactly supported univariate B-splines can be used for regional modeling. In these empirical parametric models, the coefficients of the basis functions as well as differential code biases (DCBs) of satellites and receivers can be treated as unknown parameters which can be estimated from geometry-free linear combinations of global positioning system observables. In this work we propose a new semi-parametric multivariate adaptive regression B-splines (SP-BMARS) method for the regional modeling of VTEC together with satellite and receiver DCBs, where the parametric part of the model is related to the DCBs as fixed parameters and the non-parametric part adaptively models the spatio-temporal distribution of VTEC. The latter is based on multivariate adaptive regression B-splines which is a non-parametric modeling technique making use of compactly supported B-spline basis functions that are generated from the observations automatically. This algorithm takes advantage of an adaptive scale-by-scale model building strategy that searches for best-fitting B-splines to the data at each scale. The VTEC maps generated from the proposed method are compared numerically and visually with the global ionosphere maps (GIMs) which are provided by the Center for Orbit Determination in Europe (CODE). The VTEC values from SP-BMARS and CODE GIMs are also compared with VTEC values obtained through calibration using local ionospheric model. The estimated satellite and receiver DCBs from the SP-BMARS model are compared with the CODE distributed DCBs. The results show that the SP-BMARS algorithm can be used to estimate satellite and receiver DCBs while adaptively and flexibly modeling the daily regional VTEC.

  4. Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data

    PubMed Central

    Abram, Samantha V.; Helwig, Nathaniel E.; Moodie, Craig A.; DeYoung, Colin G.; MacDonald, Angus W.; Waller, Niels G.

    2016-01-01

    Recent advances in fMRI research highlight the use of multivariate methods for examining whole-brain connectivity. Complementary data-driven methods are needed for determining the subset of predictors related to individual differences. Although commonly used for this purpose, ordinary least squares (OLS) regression may not be ideal due to multi-collinearity and over-fitting issues. Penalized regression is a promising and underutilized alternative to OLS regression. In this paper, we propose a nonparametric bootstrap quantile (QNT) approach for variable selection with neuroimaging data. We use real and simulated data, as well as annotated R code, to demonstrate the benefits of our proposed method. Our results illustrate the practical potential of our proposed bootstrap QNT approach. Our real data example demonstrates how our method can be used to relate individual differences in neural network connectivity with an externalizing personality measure. Also, our simulation results reveal that the QNT method is effective under a variety of data conditions. Penalized regression yields more stable estimates and sparser models than OLS regression in situations with large numbers of highly correlated neural predictors. Our results demonstrate that penalized regression is a promising method for examining associations between neural predictors and clinically relevant traits or behaviors. These findings have important implications for the growing field of functional connectivity research, where multivariate methods produce numerous, highly correlated brain networks. PMID:27516732
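
    A bootstrap quantile rule for variable selection with a penalized estimator can be sketched as follows. This is not the authors' annotated R code: a ridge penalty is swapped in here because it has a closed form, and the penalty value `lam`, the number of resamples, the 95% percentile interval, and the synthetic correlated predictors are all assumptions made for the example.

```python
import numpy as np

def ridge_coef(X, y, lam):
    """Closed-form ridge estimate on centered data: (X'X + lam*I)^-1 X'y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

def bootstrap_quantile_select(X, y, lam=1.0, n_boot=500, alpha=0.05, seed=0):
    """Keep predictors whose bootstrap percentile interval excludes zero."""
    rng = np.random.default_rng(seed)
    n = len(y)
    coefs = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # nonparametric resample of rows
        coefs[b] = ridge_coef(X[idx], y[idx], lam)
    lower = np.quantile(coefs, alpha / 2, axis=0)
    upper = np.quantile(coefs, 1 - alpha / 2, axis=0)
    selected = (lower > 0) | (upper < 0)
    return selected, lower, upper

# Hypothetical correlated predictors; only the first two carry signal.
rng = np.random.default_rng(0)
n, p = 150, 6
shared = rng.normal(size=(n, 1))
X = 0.7 * shared + 0.7 * rng.normal(size=(n, p))
y = 1.0 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(size=n)

selected, lower, upper = bootstrap_quantile_select(X, y)
print(selected)
```

    Because each interval is a 95% percentile interval, a predictor with no true effect can still be flagged occasionally; the rule controls this per predictor, not across the whole set.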

  5. Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data.

    PubMed

    Abram, Samantha V; Helwig, Nathaniel E; Moodie, Craig A; DeYoung, Colin G; MacDonald, Angus W; Waller, Niels G

    2016-01-01

    Recent advances in fMRI research highlight the use of multivariate methods for examining whole-brain connectivity. Complementary data-driven methods are needed for determining the subset of predictors related to individual differences. Although commonly used for this purpose, ordinary least squares (OLS) regression may not be ideal due to multi-collinearity and over-fitting issues. Penalized regression is a promising and underutilized alternative to OLS regression. In this paper, we propose a nonparametric bootstrap quantile (QNT) approach for variable selection with neuroimaging data. We use real and simulated data, as well as annotated R code, to demonstrate the benefits of our proposed method. Our results illustrate the practical potential of our proposed bootstrap QNT approach. Our real data example demonstrates how our method can be used to relate individual differences in neural network connectivity with an externalizing personality measure. Also, our simulation results reveal that the QNT method is effective under a variety of data conditions. Penalized regression yields more stable estimates and sparser models than OLS regression in situations with large numbers of highly correlated neural predictors. Our results demonstrate that penalized regression is a promising method for examining associations between neural predictors and clinically relevant traits or behaviors. These findings have important implications for the growing field of functional connectivity research, where multivariate methods produce numerous, highly correlated brain networks.

  6. Logistic regression analysis of factors associated with avascular necrosis of the femoral head following femoral neck fractures in middle-aged and elderly patients.

    PubMed

    Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua

    2013-03-01

    Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2%). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95% CI = 3.076-219.747). The middle-aged and elderly have less incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.
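
    The very wide interval reported here is easier to read on the log scale. The short calculation below assumes the interval is a Wald-type interval, symmetric in the logarithm of the ratio with a 1.96 multiplier; the abstract does not state how it was computed, so this is a consistency check under that assumption, using only the three reported numbers.

```python
import math

# Values reported in the abstract.
reported_ratio, lower, upper = 26.00, 3.076, 219.747
z = 1.96  # assumed normal quantile for a 95% interval

# If the interval is symmetric on the log scale, its midpoint is the log of the estimate.
implied_point = math.exp((math.log(lower) + math.log(upper)) / 2)

# The half-width on the log scale divided by z gives the implied standard error.
se_log = (math.log(upper) - math.log(lower)) / (2 * z)

print(f"implied point estimate = {implied_point:.2f}")
print(f"implied SE of log ratio = {se_log:.3f}")
```

    The implied point estimate agrees with the reported 26.00, and a standard error of about 1.09 on the log scale reflects how few events (15 cases) underlie the estimate.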

  7. Impact of functional and structural social relationships on two year depression outcomes: A multivariate analysis.

    PubMed

    Davidson, Sandra K; Dowrick, Christopher F; Gunn, Jane M

    2016-03-15

    High rates of persistent depression highlight the need to identify the risk factors associated with poor depression outcomes and to provide targeted interventions to people at high risk. Although social relationships have been implicated in depression course, interventions targeting social relationships have been disappointing. Possibly, interventions have targeted the wrong elements of relationships. Alternatively, the statistical association between relationships and depression course is not causal, but due to shared variance with other factors. We investigated whether elements of social relationships predict major depressive episode (MDE) when multiple relevant variables are considered. Data are from a longitudinal study of primary care patients with depressive symptoms. 494 participants completed questionnaires at baseline and a depression measure (PHQ-9) two years later. Baseline measures included functional (i.e. quality) and structural (i.e. quantity) social relationships, depression, neuroticism, chronic illness, alcohol abuse, childhood abuse, partner violence and sociodemographic characteristics. Logistic regression with generalised estimating equations was used to estimate the association between social relationships and MDE. Both functional and structural social relationships predicted MDE in univariate analysis. Only functional social relationships remained significant in multivariate analysis (OR: 0.87; 95%CI: 0.79-0.97; p=0.01). Other unique predictors of MDE were baseline depression severity, neuroticism, childhood sexual abuse and intimate partner violence. We did not assess how a person's position in their depression trajectory influenced the association between social relationships and depression. Interventions targeting relationship quality may be part of a personalised treatment plan for people at high risk of persistent depression due to poor social relationships. Copyright © 2015 Elsevier B.V. All rights reserved.

  8. Newer classification and regression tree techniques: Bagging and Random Forests for ecological prediction

    Treesearch

    Anantha M. Prasad; Louis R. Iverson; Andy Liaw

    2006-01-01

    We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.

  9. Problems with Multivariate Normality: Can the Multivariate Bootstrap Help?

    ERIC Educational Resources Information Center

    Thompson, Bruce

    Multivariate normality is required for some statistical tests. This paper explores the implications of violating the assumption of multivariate normality and illustrates a graphical procedure for evaluating multivariate normality. The logic for using the multivariate bootstrap is presented. The multivariate bootstrap can be used when distribution…

  10. A retrospective study: Multivariate logistic regression analysis of the outcomes after pressure sores reconstruction with fasciocutaneous, myocutaneous, and perforator flaps.

    PubMed

    Chiu, Yu-Jen; Liao, Wen-Chieh; Wang, Tien-Hsiang; Shih, Yu-Chung; Ma, Hsu; Lin, Chih-Hsun; Wu, Szu-Hsien; Perng, Cherng-Kang

    2017-08-01

    Despite significant advances in medical care and surgical techniques, pressure sore reconstruction is still prone to elevated rates of complication and recurrence. We conducted a retrospective study to investigate not only complication and recurrence rates following pressure sore reconstruction but also preoperative risk stratification. This study included 181 ulcers that underwent flap operations between January 2002 and December 2013. We performed a multivariable logistic regression model, which offers a regression-based method accounting for the within-patient correlation of the success or failure of each flap. The overall complication and recurrence rates for all flaps were 46.4% and 16.0%, respectively, with a mean follow-up period of 55.4 ± 38.0 months. No statistically significant differences in complication and recurrence rates were observed among the three reconstruction methods. In subsequent analysis, albumin ≤3.0 g/dl and paraplegia were significantly associated with higher postoperative complication. The anatomic factor, ischial wound location, significantly trended toward the development of ulcer recurrence. In the fasciocutaneous group, paraplegia had significant correlation to higher complication and recurrence rates. In the musculocutaneous flap group, variables had no significant correlation to complication and recurrence rates. In the free-style perforator group, ischial wound location and malnourished status correlated with significantly higher complication rates; ischial wound location also correlated with significantly higher recurrence rate. Ultimately, our review of a noteworthy cohort with lengthy follow-up helped identify and confirm certain risk factors that can facilitate a more informed and thoughtful pre- and postoperative decision-making process for patients with pressure ulcers. Copyright © 2017 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All

  11. Multivariate meta-analysis using individual participant data

    PubMed Central

    Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.

    2016-01-01

    When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment–covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. PMID:26099484
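
    The bootstrap idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' published code: the simulated trial, the variable names (`sbp_change`, `event`), and the choice of a mean difference and a risk difference as the two effect estimates are all assumptions made for the example. Participants are resampled with replacement, both treatment effects are re-estimated in each resample, and the correlation of the paired estimates serves as the within-study correlation.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def arm_difference(rows, index):
    """Treated-minus-control mean of column `index` (mean or risk difference)."""
    treated = [r[index] for r in rows if r[0] == 1]
    control = [r[index] for r in rows if r[0] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def bootstrap_within_study_corr(rows, n_boot=500, seed=0):
    """Correlation between two effect estimates across bootstrap resamples."""
    rng = random.Random(seed)
    eff_continuous, eff_binary = [], []
    for _ in range(n_boot):
        sample = [rng.choice(rows) for _ in rows]  # resample participants
        if {r[0] for r in sample} != {0, 1}:
            continue  # skip the (very unlikely) resample missing one arm
        eff_continuous.append(arm_difference(sample, 1))
        eff_binary.append(arm_difference(sample, 2))
    return pearson(eff_continuous, eff_binary)


# Simulated individual participant data for one hypothetical trial:
# (treatment arm, continuous outcome, binary outcome).
sim = random.Random(42)
ipd = []
for i in range(200):
    treat = i % 2
    sbp_change = sim.gauss(-5.0 * treat, 10.0)
    event = 1 if sbp_change + sim.gauss(0.0, 5.0) > 0 else 0
    ipd.append((treat, sbp_change, event))

rho = bootstrap_within_study_corr(ipd)
print(f"bootstrap within-study correlation: {rho:.2f}")
```

    Because the binary outcome is constructed from the continuous one, the two effect estimates move together across resamples, so the estimated correlation is positive.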

  12. Multivariate Analysis of Seismic Field Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Alam, M. Kathleen

    1999-06-01

    This report includes the details of the model building procedure and prediction of seismic field data. Principal Components Regression, a multivariate analysis technique, was used to model seismic data collected as two pieces of equipment were cycled on and off. Models built that included only the two pieces of equipment of interest had trouble predicting data containing signals not included in the model. Evidence for poor predictions came from the prediction curves as well as spectral F-ratio plots. Once the extraneous signals were included in the model, predictions improved dramatically. While Principal Components Regression performed well for the present data sets, the present data analysis suggests further work will be needed to develop more robust modeling methods as the data become more complex.

  13. Using multivariate regression modeling for sampling and predicting chemical characteristics of mixed waste in old landfills.

    PubMed

    Brandstätter, Christian; Laner, David; Prantl, Roman; Fellner, Johann

    2014-12-01

    Municipal solid waste landfills pose a threat to the environment and human health, especially old landfills which lack facilities for collection and treatment of landfill gas and leachate. Consequently, missing information about emission flows prevents site-specific environmental risk assessments. To overcome this gap, the combination of waste sampling and analysis with statistical modeling is one option for estimating present and future emission potentials. Optimizing the tradeoff between investigation costs and reliable results requires knowledge of both the number of samples to be taken and the variables to be analyzed. This article aims to identify the optimized number of waste samples and variables in order to predict a larger set of variables. Therefore, we introduce a multivariate linear regression model and test its applicability in two case studies. Landfill A was used to set up and calibrate the model based on 50 waste samples and twelve variables. The calibrated model was applied to Landfill B including 36 waste samples and twelve variables with four predictor variables. The case study results are twofold. First, the reliable and accurate prediction of the twelve variables can be achieved with the knowledge of four predictor variables (Loi, EC, pH and Cl). Second, for Landfill B, only ten full measurements would be needed for a reliable prediction of most response variables. The four predictor variables would exhibit comparably low analytical costs in comparison to the full set of measurements. This cost reduction could be used to increase the number of samples yielding an improved understanding of the spatial waste heterogeneity in landfills. In conclusion, the future application of the developed model potentially improves the reliability of predicted emission potentials. The model could become a standard screening tool for old landfills if its applicability and reliability were tested in additional case studies. Copyright © 2014 Elsevier Ltd

  14. Comprehensive modeling of monthly mean soil temperature using multivariate adaptive regression splines and support vector machine

    NASA Astrophysics Data System (ADS)

    Mehdizadeh, Saeid; Behmanesh, Javad; Khalili, Keivan

    2017-07-01

    Soil temperature (Ts) and its thermal regime are the most important factors in plant growth, biological activities, and water movement in soil. Due to scarcity of the Ts data, estimation of soil temperature is an important issue in different fields of sciences. The main objective of the present study is to investigate the accuracy of multivariate adaptive regression splines (MARS) and support vector machine (SVM) methods for estimating the Ts. For this aim, the monthly mean data of the Ts (at depths of 5, 10, 50, and 100 cm) and meteorological parameters of 30 synoptic stations in Iran were utilized. To develop the MARS and SVM models, various combinations of minimum, maximum, and mean air temperatures (Tmin, Tmax, T); actual and maximum possible sunshine duration; sunshine duration ratio (n, N, n/N); actual, net, and extraterrestrial solar radiation data (Rs, Rn, Ra); precipitation (P); relative humidity (RH); wind speed at 2 m height (u2); and water vapor pressure (Vp) were used as input variables. Three error statistics including root-mean-square-error (RMSE), mean absolute error (MAE), and determination coefficient (R²) were used to check the performance of MARS and SVM models. The results indicated that the MARS was superior to the SVM at different depths. In the test and validation phases, the most accurate estimations for the MARS were obtained at the depth of 10 cm for Tmax, Tmin, T inputs (RMSE = 0.71 °C, MAE = 0.54 °C, and R² = 0.995) and for RH, Vp, P, and u2 inputs (RMSE = 0.80 °C, MAE = 0.61 °C, and R² = 0.996), respectively.
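
    The three error statistics named above can be computed as in the sketch below. The observed and predicted values are made-up numbers for illustration, and the coefficient of determination is taken here as 1 - SS_res/SS_tot, which is an assumption: the abstract does not say whether that form or the squared correlation was used.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly soil temperatures in degrees Celsius.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print(rmse(observed, predicted), mae(observed, predicted), r_squared(observed, predicted))
```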

  15. Nonlinear multivariate and time series analysis by neural network methods

    NASA Astrophysics Data System (ADS)

    Hsieh, William W.

    2004-03-01

    Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.

  16. A multivariate analysis of clinical and morphological prognostic factors in squamous cell carcinoma of the vulva.

    PubMed

    Smyczek-Gargya, B; Volz, B; Geppert, M; Dietl, J

    1997-01-01

    Clinical and histological data of 168 patients with squamous cell carcinoma of the vulva were analyzed with respect to survival. 151 patients underwent surgery, 12 patients were treated with primary radiation and in 5 patients no treatment was performed. Follow-up lasted from at least 2 up to 22 years posttreatment. In univariate analysis, the following factors were highly significant: presurgery lymph node status, tumor infiltration beyond the vulva, tumor grading, histological inguinal lymph node status, pre- and postsurgery tumor stage, depth of invasion and tumor diameter. In the multivariate analysis (Cox regression), the most powerful factors were shown to be histological inguinal lymph node status, tumor diameter and tumor grading. The multivariate logistic regression analysis identified the following main prognostic factors for metastases of inguinal lymph nodes: presurgery inguinal lymph node status, tumor size, depth of invasion and tumor grading. Based on these results, tumor biology seems to be the decisive factor concerning recurrence and survival. Therefore, we suggest a more conservative treatment of vulvar carcinoma. Patients with carcinoma confined to the vulva, with a tumor diameter up to 3 cm and without clinically suspected lymph nodes, should be treated by wide excision/partial vulvectomy with ipsilateral lymphadenectomy.

  17. Dose-response effects for depression and Schizophrenia management on hospital utilization in Illinois Medicaid: a multivariate regression analysis.

    PubMed

    Berg, Gregory D; Donnelly, Shawn; Warnick, Kathleen; Medina, Wendie; Miller, Mary

    2014-07-03

    The prevalence of schizophrenia and depression in the United States is far higher among Medicaid recipients than in the general population. Individuals suffering from mental illness, including schizophrenia and depression, also have higher rates of emergency department utilization, which is costly and may not generate the positive health outcomes desired. Disease management programs strive to help individuals suffering from chronic illnesses better manage their condition(s) and seek health care in the appropriate settings. The objective of this manuscript is to estimate a dose-response impact on hospital inpatient and emergency room utilizations for any reason by Medicaid recipients with depression or schizophrenia who received disease management contacts. Multivariate regression analysis of panel data taken from administrative claims was conducted to test the hypothesis that increased contacts lower the likelihood of all-cause inpatient admissions and emergency room visits. Subjects included 6,274 members of Illinois' non-institutionalized Medicaid-only aged, blind or disabled population diagnosed with depression or schizophrenia. The statistical measure is the odds ratio. The odds ratio association is between the monthly utilization indicators and the number of contacts (doses) a member had for each particular disease management intervention. Higher numbers of intervention contacts for Medicaid recipients diagnosed with depression or schizophrenia were associated with statistically significant reductions in all-cause inpatient admissions and emergency room utilizations. There is a high correlation between depression and schizophrenia disease management contacts and lowered all-cause hospital inpatient and emergency room utilizations.

  18. Predicting volumes in four Hawaii hardwoods...first multivariate equations developed

    Treesearch

    David A. Sharpnack

    1966-01-01

    Multivariate regression equations were developed for predicting board-foot (Int. 1/4-inch log rule) and cubic-foot volumes in each 8.15-foot section of trees of four Hawaii hardwood species. The species are koa (Acacia koa), ohia (Metrosideros polymorpha), robusta eucalyptus (Eucalyptus robusta), and...

  19. Multivariate prediction of upper limb prosthesis acceptance or rejection.

    PubMed

    Biddiss, Elaine A; Chau, Tom T

    2008-07-01

    To develop a model for prediction of upper limb prosthesis use or rejection. A questionnaire exploring factors in prosthesis acceptance was distributed internationally to individuals with upper limb absence through community-based support groups and rehabilitation hospitals. A total of 191 participants (59 prosthesis rejecters and 132 prosthesis wearers) were included in this study. A logistic regression model, a C5.0 decision tree, and a radial basis function neural network were developed and compared in terms of sensitivity (prediction of prosthesis rejecters), specificity (prediction of prosthesis wearers), and overall cross-validation accuracy. The logistic regression and neural network provided comparable overall accuracies of approximately 84 +/- 3%, specificity of 93%, and sensitivity of 61%. Fitting time-frame emerged as the predominant predictor. Individuals fitted within two years of birth (congenital) or six months of amputation (acquired) were 16 times more likely to continue prosthesis use. To increase rates of prosthesis acceptance, clinical directives should focus on timely, client-centred fitting strategies and the development of improved prostheses and healthcare for individuals with high-level or bilateral limb absence. Multivariate analyses are useful in determining the relative importance of the many factors involved in prosthesis acceptance and rejection.
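
    As a worked check of the figures above, the sketch below computes sensitivity, specificity, and overall accuracy from a confusion table. The group sizes (59 rejecters, 132 wearers) come from the abstract, but the cell counts are hypothetical: they were chosen only so that they round to the reported 61% and 93%, and are not taken from the paper.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-table counts."""
    sensitivity = tp / (tp + fn)   # rejecters correctly predicted
    specificity = tn / (tn + fp)   # wearers correctly predicted
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical cell counts consistent with 59 rejecters and 132 wearers.
sens, spec, acc = classification_rates(tp=36, fn=23, tn=123, fp=9)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

    With these illustrative counts the overall accuracy falls inside the reported range of approximately 84 +/- 3%.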

  20. Comparing lagged linear correlation, lagged regression, Granger causality, and vector autoregression for uncovering associations in EHR data.

    PubMed

    Levine, Matthew E; Albers, David J; Hripcsak, George

    2016-01-01

    Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.
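
    A lagged regression with an autoregressive term, of the general kind described above, might look like the following sketch. It is not the authors' model specification: the single lag, the two series, and the generating coefficients are assumptions made for illustration. The lab value at time t is regressed by ordinary least squares on an intercept, the drug indicator at t - lag, and the lab value at t - lag.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def lagged_regression(drug, lab, lag):
    """OLS coefficients for lab[t] ~ 1 + drug[t - lag] + lab[t - lag]."""
    rows = [[1.0, drug[t - lag], lab[t - lag]] for t in range(lag, len(lab))]
    y = [lab[t] for t in range(lag, len(lab))]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Hypothetical noise-free series: each lab value depends on the previous
# drug order and the previous lab value.
drug = [0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0]
lab = [4.0]
for t in range(1, len(drug)):
    lab.append(2.0 + 1.5 * drug[t - 1] + 0.5 * lab[t - 1])

intercept, drug_effect, autoregressive = lagged_regression(drug, lab, lag=1)
print(intercept, drug_effect, autoregressive)
```

    Since the example series contains no noise, the fit recovers the generating coefficients; with real EHR data the estimates would carry sampling error and would need the significance testing the abstract alludes to.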

  1. Force required for correcting the deformity of pectus carinatum and related multivariate analysis.

    PubMed

    Chen, Chenghao; Zeng, Qi; Li, Zhongzhi; Zhang, Na; Yu, Jie

    2017-12-24

    To measure the force required for correcting pectus carinatum to the desired position and investigate the correlations of the required force with patients' gender, age, deformity type, severity and body mass index (BMI). A total of 125 patients with pectus carinatum were enrolled in the study from August 2013 to August 2016. Their gender, age, deformity type, severity and BMI were recorded. A chest wall compressor was used to measure the force required for correcting the chest wall deformity. Multivariate linear regression was used for data analysis. Among the 125 patients, 112 were males and 13 were females. Their mean age was 13.7±1.5 years old, mean Haller index was 2.1±0.2, and mean BMI was 17.4±1.8 kg/m². Multivariate linear regression analysis showed that the desirable force for correcting chest wall deformity was not correlated with gender and deformity type, but positively correlated with age and BMI and negatively correlated with Haller index. The desirable force measured for correcting chest wall deformities of patients with pectus carinatum positively correlates with age and BMI and negatively correlates with Haller index. The study provides valuable information for future improvement of implanted bar, bar fixation technique, and personalized surgery. Retrospective study. Level 3-4. Copyright © 2018. Published by Elsevier Inc.

  2. More insights into early brain development through statistical analyses of eigen-structural elements of diffusion tensor imaging using multivariate adaptive regression splines

    PubMed Central

    Chen, Yasheng; Zhu, Hongtu; An, Hongyu; Armao, Diane; Shen, Dinggang; Gilmore, John H.; Lin, Weili

    2013-01-01

    The aim of this study was to characterize the maturational changes of the three eigenvalues (λ1 ≥ λ2 ≥ λ3) of diffusion tensor imaging (DTI) during early postnatal life for more insights into early brain development. In order to overcome the limitations of using presumed growth trajectories for regression analysis, we employed Multivariate Adaptive Regression Splines (MARS) to derive data-driven growth trajectories for the three eigenvalues. We further employed Generalized Estimating Equations (GEE) to carry out statistical inferences on the growth trajectories obtained with MARS. With a total of 71 longitudinal datasets acquired from 29 healthy, full-term pediatric subjects, we found that the growth velocities of the three eigenvalues were highly correlated, but significantly different from each other. This paradox suggested the existence of mechanisms coordinating the maturations of the three eigenvalues even though different physiological origins may be responsible for their temporal evolutions. Furthermore, our results revealed the limitations of using the average of λ2 and λ3 as the radial diffusivity in interpreting DTI findings during early brain development because these two eigenvalues had significantly different growth velocities even in central white matter. In addition, based upon the three eigenvalues, we have documented the growth trajectory differences between central and peripheral white matter, between anterior and posterior limbs of internal capsule, and between inferior and superior longitudinal fasciculus. Taken together, we have demonstrated that more insights into early brain maturation can be gained through analyzing eigen-structural elements of DTI. PMID:23455648

  3. Biostatistics Series Module 10: Brief Overview of Multivariate Methods.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2017-01-01

    Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
The calculation intensive nature of multivariate analysis

  4. Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution

    NASA Astrophysics Data System (ADS)

    Kisi, Ozgur; Parmar, Kulwinder Singh

    2016-03-01

    This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH) monitored at Nizamuddin, Delhi Yamuna River in India were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost same accuracy and they performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). The average root mean square error (RMSE) of the LSSVM and M5Tree models was decreased by 1.47% and 19.1% using MARS model, respectively. Adding TC input to the models did not increase their accuracy in modeling COD while adding FC and pH inputs to the models generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.

  5. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  6. Multivariate meta-analysis using individual participant data.

    PubMed

    Riley, R D; Price, M J; Jackson, D; Wardle, M; Gueyffier, F; Wang, J; Staessen, J A; White, I R

    2015-06-01

    When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment-covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. © 2014 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.

  7. The effect of postoperative medical treatment on left ventricular mass regression after aortic valve replacement.

    PubMed

    Helder, Meghana R K; Ugur, Murat; Bavaria, Joseph E; Kshettry, Vibhu R; Groh, Mark A; Petracek, Michael R; Jones, Kent W; Suri, Rakesh M; Schaff, Hartzell V

    2015-03-01

    The study objective was to analyze factors associated with left ventricular mass regression in patients undergoing aortic valve replacement with a newer bioprosthesis, the Trifecta valve pericardial bioprosthesis (St Jude Medical Inc, St Paul, Minn). A total of 444 patients underwent aortic valve replacement with the Trifecta bioprosthesis from 2007 to 2009 at 6 US institutions. The clinical and echocardiographic data of 200 of these patients who had left ventricular hypertrophy and follow-up studies 1 year postoperatively were reviewed and compared to analyze factors affecting left ventricular mass regression. Mean (standard deviation) age of the 200 study patients was 73 (9) years, 66% were men, and 92% had pure or predominant aortic valve stenosis. Complete left ventricular mass regression was observed in 102 patients (51%) by 1 year postoperatively. In univariate analysis, male sex, implantation of larger valves, larger left ventricular end-diastolic volume, and beta-blocker or calcium-channel blocker treatment at dismissal were significantly associated with complete mass regression. In the multivariate model, odds ratios (95% confidence intervals) indicated that male sex (3.38 [1.39-8.26]) and beta-blocker or calcium-channel blocker treatment at dismissal (3.41 [1.40-8.34]) were associated with increased probability of complete left ventricular mass regression. Patients with higher preoperative systolic blood pressure were less likely to have complete left ventricular mass regression (0.98 [0.97-0.99]). Among patients with left ventricular hypertrophy, postoperative treatment with beta-blockers or calcium-channel blockers may enhance mass regression. This highlights the need for close medical follow-up after operation. Labeled valve size was not predictive of left ventricular mass regression. Copyright © 2015 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
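
    The odds ratios and confidence intervals reported above relate to logistic regression coefficients through exponentiation. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale with z = 1.96; the abstract does not state how the intervals were computed, so this is an illustration rather than a reconstruction of the authors' analysis. It backs out the log-odds coefficient and standard error implied by the interval for male sex (1.39 to 8.26) and converts them back to an odds ratio.

```python
import math


def log_scale_summary(ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and standard error implied by a CI."""
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    beta = (log_low + log_high) / 2.0        # midpoint on the log scale
    se = (log_high - log_low) / (2.0 * z)    # half-width divided by z
    return beta, se


def odds_ratio_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


beta, se = log_scale_summary(1.39, 8.26)
or_point, or_low, or_high = odds_ratio_with_ci(beta, se)
print(f"beta={beta:.3f} se={se:.3f} OR={or_point:.2f} [{or_low:.2f}-{or_high:.2f}]")
```

    The recovered point estimate lands close to the reported odds ratio of 3.38, which is what one would expect if the published interval is symmetric on the log scale.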

  8. Voxelwise multivariate analysis of multimodality magnetic resonance imaging.

    PubMed

    Naylor, Melissa G; Cardenas, Valerie A; Tosun, Duygu; Schuff, Norbert; Weiner, Michael; Schwartzman, Armin

    2014-03-01

    Most brain magnetic resonance imaging (MRI) studies concentrate on a single MRI contrast or modality, frequently structural MRI. By performing an integrated analysis of several modalities, such as structural, perfusion-weighted, and diffusion-weighted MRI, new insights may be attained to better understand the underlying processes of brain diseases. We compare two voxelwise approaches: (1) fitting multiple univariate models, one for each outcome and then adjusting for multiple comparisons among the outcomes and (2) fitting a multivariate model. In both cases, adjustment for multiple comparisons is performed over all voxels jointly to account for the search over the brain. The multivariate model is able to account for the multiple comparisons over outcomes without assuming independence because the covariance structure between modalities is estimated. Simulations show that the multivariate approach is more powerful when the outcomes are correlated and, even when the outcomes are independent, the multivariate approach is just as powerful or more powerful when at least two outcomes are dependent on predictors in the model. However, multiple univariate regressions with Bonferroni correction remain a desirable alternative in some circumstances. To illustrate the power of each approach, we analyze a case control study of Alzheimer's disease, in which data from three MRI modalities are available. Copyright © 2013 Wiley Periodicals, Inc.
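
    The first approach above, separate univariate tests with an adjustment across outcomes, can be illustrated with a plain Bonferroni correction. The p-values below are hypothetical values for three MRI modalities at a single voxel, and the sketch covers only the across-outcome adjustment, not the adjustment over voxels that the abstract also describes.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Flag each test as significant if p <= alpha / number_of_tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Hypothetical p-values: structural, perfusion-weighted, diffusion-weighted.
p_values = [0.004, 0.03, 0.20]
decisions = bonferroni_reject(p_values)
print(decisions)
```

    Here the per-test threshold is 0.05 / 3, so only the first outcome remains significant; the second would have passed an unadjusted 0.05 cutoff but does not survive the correction.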

  9. Voxelwise multivariate analysis of multimodality magnetic resonance imaging

    PubMed Central

    Naylor, Melissa G.; Cardenas, Valerie A.; Tosun, Duygu; Schuff, Norbert; Weiner, Michael; Schwartzman, Armin

    2015-01-01

    Most brain magnetic resonance imaging (MRI) studies concentrate on a single MRI contrast or modality, frequently structural MRI. By performing an integrated analysis of several modalities, such as structural, perfusion-weighted, and diffusion-weighted MRI, new insights may be attained to better understand the underlying processes of brain diseases. We compare two voxelwise approaches: (1) fitting multiple univariate models, one for each outcome and then adjusting for multiple comparisons among the outcomes and (2) fitting a multivariate model. In both cases, adjustment for multiple comparisons is performed over all voxels jointly to account for the search over the brain. The multivariate model is able to account for the multiple comparisons over outcomes without assuming independence because the covariance structure between modalities is estimated. Simulations show that the multivariate approach is more powerful when the outcomes are correlated and, even when the outcomes are independent, the multivariate approach is just as powerful or more powerful when at least two outcomes are dependent on predictors in the model. However, multiple univariate regressions with Bonferroni correction remain a desirable alternative in some circumstances. To illustrate the power of each approach, we analyze a case control study of Alzheimer's disease, in which data from three MRI modalities are available. PMID:23408378

  10. Multivariate statistical analysis: Principles and applications to coorbital streams of meteorite falls

    NASA Technical Reports Server (NTRS)

    Wolf, S. F.; Lipschutz, M. E.

    1993-01-01

    Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 - 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that a totally different criterion, labile trace element contents - hence thermal histories - of 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.

  11. A suite of global reconstructed precipitation products and their error estimate by multivariate regression using empirical orthogonal functions: 1850-present

    NASA Astrophysics Data System (ADS)

    Shen, S. S.

    2014-12-01

    This presentation describes a suite of global precipitation products reconstructed by a multivariate regression method using an empirical orthogonal function (EOF) expansion. The sampling errors of the reconstruction are estimated for each product datum entry. The maximum temporal coverage is 1850-present and the spatial coverage is quasi-global (75S, 75N). The temporal resolution ranges from 5-day, monthly, to seasonal and annual. The Global Precipitation Climatology Project (GPCP) precipitation data from 1979-2008 are used to calculate the EOFs. The Global Historical Climatology Network (GHCN) gridded data are used to calculate the regression coefficients for reconstructions. The sampling errors of the reconstruction are analyzed in detail for different EOF modes. Our reconstructed 1900-2011 time series of the global average annual precipitation shows a 0.024 (mm/day)/100a trend, which is very close to the trend derived from the mean of 25 models of the CMIP5 (Coupled Model Intercomparison Project Phase 5). Our reconstruction examples of 1983 El Niño precipitation and 1917 La Niña precipitation (Figure 1) demonstrate that the El Niño and La Niña precipitation patterns are well reflected in the first two EOFs. The validation of our reconstruction results with GPCP makes it possible to use the reconstruction as the benchmark data for climate models. This will help the climate modeling community to improve model precipitation mechanisms and reduce the systematic difference between observed global precipitation, which hovers at around 2.7 mm/day for reconstructions and GPCP, and model precipitations, which have a range of 2.6-3.3 mm/day for CMIP5. Our precipitation products are publicly available online, including digital data, precipitation animations, computer codes, readme files, and the user manual. This work is a joint effort between San Diego State University (Sam Shen, Nancy Tafolla, Barbara Sperberg, and Melanie Thorn) and University of Maryland (Phil

  12. Order Selection for General Expression of Nonlinear Autoregressive Model Based on Multivariate Stepwise Regression

    NASA Astrophysics Data System (ADS)

    Shi, Jinfei; Zhu, Songqing; Chen, Ruwen

    2017-12-01

    An order selection method based on multiple stepwise regression is proposed for the General Expression of Nonlinear Autoregressive (GNAR) model; it converts the model order problem into variable selection for a multiple linear regression equation. The partial autocorrelation function is adopted to define the linear term in the GNAR model. The result is set as the initial model, and the nonlinear terms are then introduced gradually. Statistics are chosen to measure how both the newly introduced and the existing variables improve the model characteristics, and these statistics determine which model variables to retain or eliminate. The optimal model is thus obtained through a measure of data fitting or a significance test. Results from simulations and experiments on classic time-series data show that the proposed method is simple and reliable and can be applied in practical engineering.

  13. Groundwater potential mapping using C5.0, random forest, and multivariate adaptive regression spline models in GIS.

    PubMed

    Golkarian, Ali; Naghibi, Seyed Amir; Kalantar, Bahareh; Pradhan, Biswajeet

    2018-02-17

    Ever increasing demand for water resources for different purposes makes it essential to have better understanding and knowledge about water resources. As known, groundwater resources are one of the main water resources especially in countries with arid climatic condition. Thus, this study seeks to provide groundwater potential maps (GPMs) employing new algorithms. Accordingly, this study aims to validate the performance of C5.0, random forest (RF), and multivariate adaptive regression splines (MARS) algorithms for generating GPMs in the eastern part of Mashhad Plain, Iran. For this purpose, a dataset was produced consisting of spring locations as indicator and groundwater-conditioning factors (GCFs) as input. In this research, 13 GCFs were selected including altitude, slope aspect, slope angle, plan curvature, profile curvature, topographic wetness index (TWI), slope length, distance from rivers and faults, rivers and faults density, land use, and lithology. The mentioned dataset was divided into two classes of training and validation with 70 and 30% of the springs, respectively. Then, C5.0, RF, and MARS algorithms were employed using R statistical software, and the final values were transformed into GPMs. Finally, two evaluation criteria including Kappa and area under receiver operating characteristics curve (AUC-ROC) were calculated. According to the findings of this research, MARS had the best performance with AUC-ROC of 84.2%, followed by RF and C5.0 algorithms with AUC-ROC values of 79.7 and 77.3%, respectively. The results indicated that AUC-ROC values for the employed models are more than 70% which shows their acceptable performance. As a conclusion, the produced methodology could be used in other geographical areas. GPMs could be used by water resource managers and related organizations to accelerate and facilitate water resource exploitation.

  14. Assessing Principal Component Regression Prediction of Neurochemicals Detected with Fast-Scan Cyclic Voltammetry

    PubMed Central

    2011-01-01

    Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook’s distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards. PMID:21966586

  15. Assessing principal component regression prediction of neurochemicals detected with fast-scan cyclic voltammetry.

    PubMed

    Keithley, Richard B; Wightman, R Mark

    2011-06-07

    Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook's distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards.
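
    As a rough illustration of the outlier screening mentioned in the abstract, the sketch below computes Cook's distance for a simple least-squares line. It is a simplified stand-in, not the authors' principal component regression workflow: the function name `cooks_distance`, the calibration values, and the 4/n flagging threshold (a common rule of thumb) are all assumptions made for this example.

```python
def cooks_distance(x, y):
    """Cook's distance for each point of a simple least-squares line y = a + b*x."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)  # residual variance estimate
    # Leverage of each point in simple linear regression.
    leverage = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * mse)) * (h / (1.0 - h) ** 2)
        for e, h in zip(residuals, leverage)
    ]


# Hypothetical calibration data; the last point is deliberately off the line.
conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
current = [2.1, 3.9, 6.0, 8.1, 9.9, 20.0]
distances = cooks_distance(conc, current)
# Flag points above the 4/n rule of thumb (an assumed threshold, not from the paper).
flagged = [i for i, d in enumerate(distances) if d > 4.0 / len(conc)]
```

    With these made-up values only the final point is flagged, which is the kind of training-set observation one would inspect before building a calibration model.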

  16. Quantifying the impact of between-study heterogeneity in multivariate meta-analyses

    PubMed Central

    Jackson, Dan; White, Ian R; Riley, Richard D

    2012-01-01

    Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I2, which we call . We also provide a multivariate H2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I2 statistic, . Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
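
    For orientation, the univariate quantities that the abstract takes as its starting point can be computed as in the sketch below, assuming the usual inverse-variance fixed-effect weights. The multivariate generalisations proposed in the paper are not reproduced here; the function name `heterogeneity` and the effect sizes and variances are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, H^2 and I^2 for a univariate meta-analysis."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    h2 = q / df  # Q divided by its degrees of freedom
    # I^2 = (Q - df) / Q, truncated at zero when Q does not exceed df.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, h2, i2


# Hypothetical study-level effect estimates and their within-study variances.
effects = [0.10, 0.30, 0.35, 0.65, 0.45]
variances = [0.01, 0.02, 0.015, 0.01, 0.025]
q, h2, i2 = heterogeneity(effects, variances)
```

    Whenever Q exceeds its degrees of freedom, I^2 equals (H^2 - 1)/H^2, which is the relationship between the two statistics that the abstract refers to.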

  17. Regression analysis for LED color detection of visual-MIMO system

    NASA Astrophysics Data System (ADS)

    Banik, Partha Pratim; Saha, Rappy; Kim, Ki-Doo

    2018-04-01

    Color detection from a light emitting diode (LED) array using a smartphone camera is very difficult in a visual multiple-input multiple-output (visual-MIMO) system. In this paper, we propose a method to determine the LED color using a smartphone camera by applying regression analysis. We employ a multivariate regression model to identify the LED color. After taking a picture of an LED array, we select the LED array region, and detect the LED using an image processing algorithm. We then apply the k-means clustering algorithm to determine the number of potential colors for feature extraction of each LED. Finally, we apply the multivariate regression model to predict the color of the transmitted LEDs. In this paper, we show our results for three types of environmental light condition: room environmental light, low environmental light (560 lux), and strong environmental light (2450 lux). We compare the results of our proposed algorithm in terms of training and test R-Square (%) values and the percentage of closeness of transmitted and predicted colors, and we also report the number of distorted test data points from an analysis of the distortion bar graph in CIE1931 color space.

  18. A general framework for multivariate multi-index drought prediction based on Multivariate Ensemble Streamflow Prediction (MESP)

    NASA Astrophysics Data System (ADS)

    Hao, Zengchao; Hao, Fanghua; Singh, Vijay P.

    2016-08-01

    Drought is among the costliest natural hazards worldwide and extreme drought events in recent years have caused huge losses to various sectors. Drought prediction is therefore critically important for providing early warning information to aid decision making to cope with drought. Due to the complicated nature of drought, it has been recognized that the univariate drought indicator may not be sufficient for drought characterization and hence multivariate drought indices have been developed for drought monitoring. Alongside the substantial effort in drought monitoring with multivariate drought indices, it is of equal importance to develop a drought prediction method with multivariate drought indices to integrate drought information from various sources. This study proposes a general framework for multivariate multi-index drought prediction that is capable of integrating complementary prediction skills from multiple drought indices. The Multivariate Ensemble Streamflow Prediction (MESP) is employed to sample from historical records for obtaining statistical prediction of multiple variables, which is then used as inputs to achieve multivariate prediction. The framework is illustrated with a linearly combined drought index (LDI), which is a commonly used multivariate drought index, based on climate division data in California and New York in the United States with different seasonality of precipitation. The predictive skill of LDI (represented with persistence) is assessed by comparison with the univariate drought index and results show that the LDI prediction skill is less affected by seasonality than the meteorological drought prediction based on SPI. Prediction results from the case study show that the proposed multivariate drought prediction outperforms the persistence prediction, implying a satisfactory performance of multivariate drought prediction. The proposed method would be useful for drought prediction to integrate drought information from various sources

  19. Multivariate Bias Correction Procedures for Improving Water Quality Predictions from the SWAT Model

    NASA Astrophysics Data System (ADS)

    Arumugam, S.; Libera, D.

    2017-12-01

    Water quality observations are usually not available on a continuous basis for longer than 1-2 years at a time over a decadal period given the labor requirements, which makes calibrating and validating mechanistic models difficult. Further, any physical model predictions inherently have bias (i.e., under/over estimation) and require post-simulation techniques to preserve the long-term mean monthly attributes. This study suggests a multivariate bias-correction technique and compares it with a common technique in improving the performance of the SWAT model in predicting daily streamflow and TN loads across the southeast based on split-sample validation. The approach is a dimension reduction technique, canonical correlation analysis (CCA), that regresses the observed multivariate attributes with the SWAT model simulated values. The common approach is a regression-based technique that uses an ordinary least squares regression to adjust model values. The observed cross-correlation between loadings and streamflow is better preserved when using canonical correlation while simultaneously reducing individual biases. Additionally, canonical correlation analysis does a better job in preserving the observed joint likelihood of observed streamflow and loadings. These procedures were applied to 3 watersheds chosen from the Water Quality Network in the Southeast Region; specifically, watersheds with sufficiently large drainage areas and number of observed data points. The performance of these two approaches is compared for the observed period and over a multi-decadal period using loading estimates from the USGS LOADEST model. Lastly, the CCA technique is applied in a forecasting sense by using 1-month ahead forecasts of P & T from ECHAM4.5 as forcings in the SWAT model. Skill in using the SWAT model for forecasting loadings and streamflow at the monthly and seasonal timescale is also discussed.

  20. Fresh Biomass Estimation in Heterogeneous Grassland Using Hyperspectral Measurements and Multivariate Statistical Analysis

    NASA Astrophysics Data System (ADS)

    Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.

    2014-12-01

    Accurate estimation of grassland biomass at its peak productivity can provide crucial information regarding the functioning and productivity of the rangelands. Hyperspectral remote sensing has proved to be valuable for estimation of vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to the large number of correlated hyperspectral reflectance measurements. The aim of this study was to examine the prospect of above ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above ground biomass for 170 sample plots. Multivariate calibrations including partial least squares regression (PLSR), principal component regression (PCR), and Least-Squared Support Vector Machine (LS-SVM) were used to estimate the above ground biomass. The prediction accuracy of the multivariate calibration methods was assessed using cross validated R2 and RMSE. The best model performance was obtained using LS-SVM and then PLSR, both calibrated with the first derivative reflectance dataset, with R2cv = 0.88 & 0.86 and RMSEcv = 1.15 & 1.07 respectively. The weakest prediction accuracy was obtained when PCR was used (R2cv = 0.31 and RMSEcv = 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.
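
    The cross-validated R2 and RMSE reported above can be illustrated with a minimal leave-one-out sketch. It uses a single-predictor least-squares line rather than PLSR, PCR, or LS-SVM, and it assumes the convention R2cv = 1 - PRESS/SST, which is one common definition and may not match the one used in the study. The helper names and the data values are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    return y_mean - slope * x_mean, slope


def loo_cv_metrics(x, y):
    """Leave-one-out cross-validated R2 (1 - PRESS/SST) and RMSE."""
    predictions = []
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(intercept + slope * x[i])
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    y_mean = sum(y) / len(y)
    sst = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - press / sst, math.sqrt(press / len(y))


# Hypothetical spectral index values and measured biomass for eight plots.
index = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
biomass = [1.2, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1]
r2_cv, rmse_cv = loo_cv_metrics(index, biomass)
```

    Because every prediction is made for a sample left out of the fit, both figures describe predictive rather than fitted accuracy, which is why they are the basis for comparing calibration methods.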

  1. A Multivariate Test of the Bott Hypothesis in an Urban Irish Setting

    ERIC Educational Resources Information Center

    Gordon, Michael; Downing, Helen

    1978-01-01

    Using a sample of 686 married Irish women in Cork City the Bott hypothesis was tested, and the results of a multivariate regression analysis revealed that neither network connectedness nor the strength of the respondent's emotional ties to the network had any explanatory power. (Author)

  2. Brief Report: Pregnant by Age 15 Years and Substance Use Initiation among US Adolescent Girls

    ERIC Educational Resources Information Center

    Cavazos-Rehg, Patricia A.; Krauss, Melissa J.; Spitznagel, Edward L.; Schootman, Mario; Cottler, Linda B.; Bierut, Laura Jean

    2012-01-01

    We examined substance use onset and associations with pregnancy by age 15 years. Participants were girls ages 15 years or younger (weighted n = 8319) from the 1999-2003 Youth Risk Behavior Surveillance System (YRBS). Multivariable logistic regression examined pregnancy as a function of substance use onset (i.e., age 10 years or younger, 11-12,…

  3. Multivariate Analysis and Prediction of Dioxin-Furan ...

    EPA Pesticide Factsheets

    Peer Review Draft of Regional Methods Initiative Final Report Dioxins, which are bioaccumulative and environmentally persistent, pose an ongoing risk to human and ecosystem health. Fish constitute a significant source of dioxin exposure for humans and fish-eating wildlife. Current dioxin analytical methods are costly, time-consuming, and produce hazardous by-products. A Danish team developed a novel, multivariate statistical methodology based on the covariance of dioxin-furan congener Toxic Equivalences (TEQs) and fatty acid methyl esters (FAMEs) and applied it to North Atlantic Ocean fishmeal samples. The goal of the current study was to attempt to extend this Danish methodology to 77 whole and composite fish samples from three trophic groups: predator (whole largemouth bass), benthic (whole flathead and channel catfish) and forage fish (composite bluegill, pumpkinseed and green sunfish) from two dioxin contaminated rivers (Pocatalico R. and Kanawha R.) in West Virginia, USA. Multivariate statistical analyses, including, Principal Components Analysis (PCA), Hierarchical Clustering, and Partial Least Squares Regression (PLS), were used to assess the relationship between the FAMEs and TEQs in these dioxin contaminated freshwater fish from the Kanawha and Pocatalico Rivers. These three multivariate statistical methods all confirm that the pattern of Fatty Acid Methyl Esters (FAMEs) in these freshwater fish covaries with and is predictive of the WHO TE

  4. Predictors of success after laparoscopic gastric bypass: a multivariate analysis of socioeconomic factors.

    PubMed

    Lutfi, R; Torquati, A; Sekhar, N; Richards, W O

    2006-06-01

    Laparoscopic gastric bypass (LGB) has proven efficacy in causing significant and durable weight loss. However, the degree of postoperative weight loss and metabolic improvement varies greatly among individuals. Our study is aimed to identify independent predictors of successful weight loss after LGB. Socioeconomic demographics were prospectively collected on patients undergoing LGB. Primary endpoint was percent of excess weight loss (EWL) at 1-year follow-up. Insufficient weight loss was defined as EWL regression was used in both univariate and multivariate models to identify independent preoperative demographics associated with successful weight loss. A total of 180 consecutive patients were enrolled over 30 months. Mean preoperative body mass index (BMI) was 48. Mean EWL was 70.1 +/- 17.3% (1 SD); therefore, success was defined as EWL ≥52.8%. According to this definition, 147 patients (81.7%) achieved successful weight loss 1 year after LGB. On univariate analysis, preoperative BMI had a significant effect on EWL, with patients with BMI <50 achieving a higher percentage of EWL (91.7% vs 61.6%; p = 0.001). Marriage status was also a significant predictor of successful outcome, with single patients achieving a higher percentage of EWL than married patients (89.8% vs 77.7%; p = 0.04). Race had a noticeable but not statistically significant effect, with Caucasian patients achieving a higher percentage of EWL than African Americans (82.9% vs 60%; p = 0.06). Marital status remained an independent predictor of success in the multivariate logistic regression model after adjusting for covariates. Married patients were at more than two times the risk of failure compared to those who were unmarried (OR 2.6; 95% CI: 1.1-6.5, p = 0.04). Weight loss achieved at 1 year after LGB is suboptimal in superobese patients. Single patients with BMI < 50 had the best chance of achieving greater weight loss.

  5. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  6. Using multiobjective tradeoff sets and Multivariate Regression Trees to identify critical and robust decisions for long term water utility planning

    NASA Astrophysics Data System (ADS)

    Smith, R.; Kasprzyk, J. R.; Balaji, R.

    2017-12-01

    In light of deeply uncertain factors like future climate change and population shifts, responsible resource management will require new types of information and strategies. For water utilities, this entails potential expansion and efficient management of water supply infrastructure systems for changes in overall supply; changes in frequency and severity of climate extremes such as droughts and floods; and variable demands, all while accounting for conflicting long and short term performance objectives. Multiobjective Evolutionary Algorithms (MOEAs) are emerging decision support tools that have been used by researchers and, more recently, water utilities to efficiently generate and evaluate thousands of planning portfolios. The tradeoffs between conflicting objectives are explored in an automated way to produce (often large) suites of portfolios that strike different balances of performance. Once generated, the sets of optimized portfolios are used to support relatively subjective assertions of priorities and human reasoning, leading to adoption of a plan. These large tradeoff sets contain information about complex relationships between decisions and between groups of decisions and performance that, until now, has not been quantitatively described. We present a novel use of Multivariate Regression Trees (MRTs) to analyze tradeoff sets to reveal these relationships and critical decisions. Additionally, when MRTs are applied to tradeoff sets developed for different realizations of an uncertain future, they can identify decisions that are robust across a wide range of conditions and produce fundamental insights about the system being optimized.

  7. Using Logistic Regression and Random Forests multivariate statistical methods for landslide spatial probability assessment in North-Est Sicily, Italy

    NASA Astrophysics Data System (ADS)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-04-01

    first phase of the work addressed to identify the spatial relationships between the landslides location and the 13 related factors by using the Frequency Ratio bivariate statistical method. The analysis was then carried out by adopting a multivariate statistical approach, according to the Logistic Regression technique and Random Forests technique that gave best results in terms of AUC. The models were run and evaluated with different sample sizes and also taking into account the temporal variation of input variables such as burned areas by wildfire. The most significant outcomes of this work are: the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.

  8. Multivariate time series analysis of neuroscience data: some challenges and opportunities.

    PubMed

    Pourahmadi, Mohsen; Noorbaloochi, Siamak

    2016-04-01

    Neuroimaging data may be viewed as high-dimensional multivariate time series, and analyzed using techniques from regression analysis, time series analysis and spatiotemporal analysis. We discuss issues related to data quality, model specification, estimation, interpretation, dimensionality and causality. Some recent research areas addressing aspects of some recurring challenges are introduced. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. Analysis of Forest Foliage Using a Multivariate Mixture Model

    NASA Technical Reports Server (NTRS)

    Hlavka, C. A.; Peterson, David L.; Johnson, L. F.; Ganapol, B.

    1997-01-01

    Data with wet chemical measurements and near infrared spectra of ground leaf samples were analyzed to test a multivariate regression technique for estimating component spectra which is based on a linear mixture model for absorbance. The resulting unmixed spectra for carbohydrates, lignin, and protein resemble the spectra of extracted plant starches, cellulose, lignin, and protein. The unmixed protein spectrum has prominent absorption spectra at wavelengths which have been associated with nitrogen bonds.

  10. The Covariance Adjustment Approaches for Combining Incomparable Cox Regressions Caused by Unbalanced Covariates Adjustment: A Multivariate Meta-Analysis Study.

    PubMed

    Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi

    2015-01-01

    Univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS) method as a multivariate meta-analysis approach. We evaluated the efficiency of four new approaches including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC) on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard models coefficients in a simulation study. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. This study highlights advantages of MGLS meta-analysis over the UM approach. The results suggested the use of the MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients.

  11. Discordance between net analyte signal theory and practical multivariate calibration.

    PubMed

    Brown, Christopher D

    2004-08-01

    Lorber's concept of net analyte signal is reviewed in the context of classical and inverse least-squares approaches to multivariate calibration. It is shown that, in the presence of device measurement error, the classical and inverse calibration procedures have radically different theoretical prediction objectives, and the assertion that the popular inverse least-squares procedures (including partial least squares, principal components regression) approximate Lorber's net analyte signal vector in the limit is disproved. Exact theoretical expressions for the prediction error bias, variance, and mean-squared error are given under general measurement error conditions, which reinforce the very discrepant behavior between these two predictive approaches, and Lorber's net analyte signal theory. Implications for multivariate figures of merit and numerous recently proposed preprocessing treatments involving orthogonal projections are also discussed.

  12. Multivariate Adaptive Regression Splines (Preprint)

    DTIC Science & Technology

    1990-08-01

    fold cross-validation would take about ten times as long, and MARS is not all that fast to begin with. Friedman has a number of examples showing...standardized mean squared error of prediction (MSEP), the generalized cross validation (GCV), and the number of selected terms (TERMS). In accordance with...and mi= 10 case were almost exclusively spurious cross product terms and terms involving the nuisance variables x6 through x10. This large number of

  13. Multivariate Cryptography Based on Clipped Hopfield Neural Network.

    PubMed

    Wang, Jia; Cheng, Lee-Ming; Su, Tong

    2018-02-01

    Designing secure and efficient multivariate public key cryptosystems [multivariate cryptography (MVC)] to strengthen the security of RSA and ECC in conventional and quantum computational environment continues to be a challenging research in recent years. In this paper, we will describe multivariate public key cryptosystems based on extended Clipped Hopfield Neural Network (CHNN) and implement it using the MVC (CHNN-MVC) framework operated in space. The Diffie-Hellman key exchange algorithm is extended into the matrix field, which illustrates the feasibility of its new applications in both classic and postquantum cryptography. The efficiency and security of our proposed new public key cryptosystem CHNN-MVC are simulated and found to be NP-hard. The proposed algorithm will strengthen multivariate public key cryptosystems and allows hardware realization practicality.

  14. Assessing risk factors for periodontitis using regression

    NASA Astrophysics Data System (ADS)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using regression, linear and logistic models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
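
    As a minimal sketch of the association measure named in the abstract, the code below computes an unadjusted odds ratio and its 95% confidence interval from a 2x2 table using the standard log-odds (Woolf) approximation. It is the single-binary-predictor special case, not the adjusted multi-variable logistic model of the study; the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed cases      b: exposed non-cases
    c: unexposed cases    d: unexposed non-cases
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: smokers vs non-smokers, cases defined by the AL criterion.
odds_ratio, lower, upper = odds_ratio_ci(30, 70, 15, 85)
```

    An interval that excludes 1, as it does for these made-up counts, is the usual indication that the factor is associated with case status at the 5% level.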

  15. [Multivariate analysis of the association between consumption of fried food and gastric cancer and precancerous lesions].

    PubMed

    Guo, L W; Liu, S Z; Zhang, M; Chen, Q; Zhang, S K; Sun, X B

    2018-02-06

    Objective: To investigate the effect of fried food intake on the pathogenesis of gastric cancer and precancerous lesions. Methods: From 2005 to 2013, residents aged 40-69 years from 11 counties/cities in rural areas of Henan province where screening for upper gastrointestinal cancer was conducted were enrolled as the subjects (82 367 cases). Information such as demography and lifestyle was collected. The residents were screened with endoscopic examination. The biopsy samples were diagnosed pathologically; according to pathological diagnosis criteria, the subjects with high risk were divided into groups with different pathological degrees. Multivariate ordinal logistic regression analysis was used to analyze the relationship between the frequency of fried food intake and gastric cancer and precancerous lesions. Results: The study covered 46 425 males and 35 942 females, with an age of (53.46±8.07) years. The study collected 6 707 cases of normal stomach, 2 325 cases of low grade intraepithelial neoplasia, 226 cases of high grade intraepithelial neoplasia and 331 cases of gastric cancer. Multivariate logistic regression analysis showed that, compared with those who eat fried food less than one time per week, fried food intake (<2 times/week: OR = 1.89, 95% CI: 1.57-2.28; ≥2 times/week: OR = 1.91, 95% CI: 1.66-2.20) was a risk factor for gastric cancer and precancerous lesions after adjustment for age, sex, marital status, educational level, body mass index (BMI), smoking and drinking status. Conclusion: The intake of fried food is a risk factor for gastric cancer and precancerous lesions. Therefore, reducing the intake of fried food can prevent the occurrence of gastric carcinoma and precancerous lesions.

  16. Introduction to multivariate discrimination

    NASA Astrophysics Data System (ADS)

    Kégl, Balázs

    2013-07-01

    Multivariate discrimination or classification is one of the best-studied problems in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written to an average engineering, computer science, or statistics graduate student; most of them are also accessible for an average physics student with some background on computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to a practitioner experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyperparameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms, neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview on the form of the functions these methods learn and on the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithms into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either

  17. Prematures with and without Regressed Retinopathy of Prematurity: Comparison of Long-Term (6-10 Years) Ophthalmological Morbidity.

    ERIC Educational Resources Information Center

    Cats, Bernard P.; Tan, Karel E. W. P.

    Reporting long-term ophthalmologic sequelae among ex-prematures at 6 to 10 years of age, this study compares 42 ex-premature infants who had had regressed forms of retinopathy of prematurity (ROP) during the neonatal period with 42 matched non-ROP ex-premature controls at 6 to 10 years of age. Subjects were subdivided into four groups: (1) ROP…

  18. Integrated environmental monitoring and multivariate data analysis-A case study.

    PubMed

    Eide, Ingvar; Westad, Frank; Nilssen, Ingunn; de Freitas, Felipe Sales; Dos Santos, Natalia Gomes; Dos Santos, Francisco; Cabral, Marcelo Montenegro; Bicego, Marcia Caruso; Figueira, Rubens; Johnsen, Ståle

    2017-03-01

    The present article describes integration of environmental monitoring and discharge data and interpretation using multivariate statistics, principal component analysis (PCA), and partial least squares (PLS) regression. The monitoring was carried out at the Peregrino oil field off the coast of Brazil. One sensor platform and 3 sediment traps were placed on the seabed. The sensors measured current speed and direction, turbidity, temperature, and conductivity. The sediment trap samples were used to determine suspended particulate matter that was characterized with respect to a number of chemical parameters (26 alkanes, 16 PAHs, N, C, calcium carbonate, and Ba). Data on discharges of drill cuttings and water-based drilling fluid were provided on a daily basis. The monitoring was carried out during 7 campaigns from June 2010 to October 2012, each lasting 2 to 3 months due to the capacity of the sediment traps. The data from the campaigns were preprocessed, combined, and interpreted using multivariate statistics. No systematic difference could be observed between campaigns or traps despite the fact that the first campaign was carried out before drilling, and 1 of 3 sediment traps was located in an area not expected to be influenced by the discharges. There was a strong covariation between suspended particulate matter and total N and organic C suggesting that the majority of the sediment samples had a natural and biogenic origin. Furthermore, the multivariate regression showed no correlation between discharges of drill cuttings and sediment trap or turbidity data taking current speed and direction into consideration. Because of this lack of correlation with discharges from the drilling location, a more detailed evaluation of chemical indicators providing information about origin was carried out in addition to numerical modeling of dispersion and deposition. The chemical indicators and the modeling of dispersion and deposition support the conclusions from the multivariate

  19. Comparison of Outcomes of Acute Coronary Syndrome in Patients ≥80 Years Versus Those <80 Years in Israel from 2000 to 2013.

    PubMed

    Shechter, Michael; Rubinstein, Roy; Goldenberg, Ilan; Matetzki, Shlomi

    2017-10-15

    Although patients ≥80 years old constitute the fastest-growing segment of the population and have a high prevalence of coronary artery disease, few data exist regarding the outcome of octogenarians with acute coronary syndrome (ACS). In a retrospective study based on data of 13,432 ACS patients who were enrolled in the ACS Israel Survey, we first evaluated the clinical outcome of 1,731 ACS patients ≥80 years (13%) compared with 11,701 ACS patients <80 years (87%) hospitalized during 2000 to 2013. Second, we evaluated the clinical outcome of patients ≥80 years hospitalized during the 2000 to 2006 ("early") period (n = 1,037) compared with those of the same age group of patients hospitalized during the 2008 to 2013 ("late") period (n = 694). Implementation of the ACS AHA/ACC/ESC therapeutic guidelines was lower in ACS patients ≥80 years compared with patients <80 years. Multivariate Cox regression analysis demonstrated a worse 1-year survival rate in the ACS patients ≥80 years compared with those <80 years. During the late period, patients ≥80 years were more frequently treated with guideline-recommended therapies compared with patients from the same age group who were hospitalized in the early period. Multivariate Cox regression analysis demonstrated a better 1-year survival rate of patients ≥80 years during the late period compared with the early period (hazard ratio 1.17, 95% confidence interval 1.15 to 1.61; p = 0.01). In addition, adverse outcome rates of ACS patients ≥80 years were significantly higher compared with those of patients <80 years. However, survival rates of ACS patients ≥80 years were improved over the 2000 to 2013 period. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. An Accurate VO[subscript 2]max Nonexercise Regression Model for 18-65-Year-Old Adults

    ERIC Educational Resources Information Center

    Bradshaw, Danielle I.; George, James D.; Hyde, Annette; LaMonte, Michael J.; Vehrs, Pat R.; Hager, Ronald L.; Yanowitz, Frank G.

    2005-01-01

    The purpose of this study was to develop a regression equation to predict maximal oxygen uptake (VO[subscript 2]max) based on nonexercise (N-EX) data. All participants (N = 100), ages 18-65 years, successfully completed a maximal graded exercise test (GXT) to assess VO[subscript 2]max (M = 39.96 mL[middle dot]kg[superscript -1][middle…

  1. Heritability of somatotype components: a multivariate analysis.

    PubMed

    Peeters, M W; Thomis, M A; Loos, R J F; Derom, C A; Fagard, R; Claessens, A L; Vlietinck, R F; Beunen, G P

    2007-08-01

    To study the genetic and environmental determination of variation in Heath-Carter somatotype (ST) components (endomorphy, mesomorphy and ectomorphy). Multivariate path analysis on twin data. Eight hundred and three members of 424 adult Flemish twin pairs (18-34 years of age). The results indicate the significance of sex differences and the significance of the covariation between the three ST components. After age-regression, variation of the population in ST components and their covariation is explained by additive genetic sources of variance (A), shared (familial) environment (C) and unique environment (E). In men, additive genetic sources of variance explain 28.0% (CI 8.7-50.8%), 86.3% (71.6-90.2%) and 66.5% (37.4-85.1%) for endomorphy, mesomorphy and ectomorphy, respectively. For women, corresponding values are 32.3% (8.9-55.6%), 82.0% (67.7-87.7%) and 70.1% (48.9-81.8%). For all components in men and women, more than 70% of the total variation was explained by sources of variance shared between the three components, emphasising the importance of analysing the ST in a multivariate way. The findings suggest that the high heritabilities for mesomorphy and ectomorphy reported in earlier twin studies in adolescence are maintained in adulthood. For endomorphy, which represents a relative measure of subcutaneous adipose tissue, however, the results suggest heritability may be considerably lower than most values reported in earlier studies on adolescent twins. The heritability is also lower than values reported for, for example, body mass index (BMI), which next to the weight of organs and adipose tissue also includes muscle and bone tissue. Considering the differences in heritability between musculoskeletal robustness (mesomorphy) and subcutaneous adipose tissue (endomorphy) it may be questioned whether studying the genetics of BMI will eventually lead to a better understanding of the genetics of fatness, obesity and overweight.

  2. Evaluation of functional outcome of the floating knee injury using multivariate analysis.

    PubMed

    Yokoyama, Kazuhiko; Tsukamoto, Tatsuro; Aoki, Shinichi; Wakita, Ryuji; Uchino, Masataka; Noumi, Takashi; Fukushima, Nobuaki; Itoman, Moritoshi

    2002-11-01

    The objective of this study is to evaluate significant contributing factors affecting the functional prognosis of floating knee injuries using multivariate analysis. A total of 68 floating knee injuries (67 patients) were treated at Kitasato University Hospital from 1986 to 1999. Both the femoral fractures and the tibial fractures were managed surgically by various methods. The functional results of these injuries were evaluated using the grading system of Karlström and Olerud. Follow-up periods ranged from 2 to 19 years (mean 50.2 months) after the original injury. We defined satisfactory (S) outcomes as those cases with excellent or good results and unsatisfactory (US) outcomes as those cases with acceptable or poor results. Logistic regression analysis was used as a multivariate analysis, and the dependent variables were defined as a satisfactory outcome or as an unsatisfactory outcome. The explanatory variables were predicting factors influencing the functional outcome such as age at trauma, gender, severity of soft-tissue injury in the femur and the tibia, AO fracture grade in the femur and the tibia, Fraser type (type I or type II), Injury Severity Score (ISS), and fixation time after injury (less than 1 week or more than 1 week) in the femur and the tibia. The final functional results were as follows: 25 cases had excellent results, 15 cases good results, 16 cases acceptable results, and 12 cases poor results. The predictive logistic regression equation was as follows: log[(1 - p)/p] = 3.12 - 1.52 × Fraser type - 1.65 × severity of soft-tissue injury in the tibia - 1.31 × fixation time after injury in the tibia - 0.821 × AO fracture grade in the tibia + 1.025 × fixation time after injury in the femur - 0.687 × AO fracture grade in the femur (p = 0.01). Among the variables, Fraser type and the severity of soft-tissue injury in the tibia were significantly related to the final result. The multivariate analysis showed that both the involvement of the knee joint and
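
    To show how a predictive equation of this form can be evaluated, the sketch below plugs the intercept and coefficients quoted in the abstract into a small function. The abstract does not state how each explanatory variable is coded or which outcome p refers to, so the variable names, the 0/1 coding of the example case, and the reading of the left-hand side as log[(1 - p)/p] are assumptions made for illustration only.

```python
import math

# Intercept and coefficients as quoted in the abstract.
INTERCEPT = 3.12
COEFS = {
    "fraser_type": -1.52,
    "soft_tissue_tibia": -1.65,
    "fixation_time_tibia": -1.31,
    "ao_grade_tibia": -0.821,
    "fixation_time_femur": 1.025,
    "ao_grade_femur": -0.687,
}


def linear_predictor(x):
    """Right-hand side of the equation for one case (dict of covariates)."""
    return INTERCEPT + sum(COEFS[name] * x[name] for name in COEFS)


def probability(x):
    """Solve log((1 - p) / p) = eta for p, giving p = 1 / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(linear_predictor(x)))


# Hypothetical case: every covariate coded 0 except Fraser type coded 1.
case = {name: 0 for name in COEFS}
case["fraser_type"] = 1
print(round(linear_predictor(case), 2), round(probability(case), 3))
```

    With a different coding scheme, or with the more common log[p/(1 - p)] convention, the numerical results would differ, so the output should not be read as a clinical prediction.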

  3. Multivariate Analysis and Machine Learning in Cerebral Palsy Research.

    PubMed

    Zhang, Jing

    2017-01-01

    Cerebral palsy (CP), a common pediatric movement disorder, causes the most severe physical disability in children. Early diagnosis in high-risk infants is critical for early intervention and possible early recovery. In recent years, multivariate analytic and machine learning (ML) approaches have been increasingly used in CP research. This paper aims to identify such multivariate studies and provide an overview of this relatively young field. Studies reviewed in this paper have demonstrated that multivariate analytic methods are useful in identification of risk factors, detection of CP, movement assessment for CP prediction, and outcome assessment, and ML approaches have made it possible to automatically identify movement impairments in high-risk infants. In addition, outcome predictors for surgical treatments have been identified by multivariate outcome studies. To make the multivariate and ML approaches useful in clinical settings, further research with large samples is needed to verify and improve these multivariate methods in risk factor identification, CP detection, movement assessment, and outcome evaluation or prediction. As multivariate analysis, ML and data processing technologies advance in the era of Big Data of this century, it is expected that multivariate analysis and ML will play a bigger role in improving the diagnosis and treatment of CP to reduce mortality and morbidity rates, and enhance patient care for children with CP.

  4. Perioperative factors predicting poor outcome in elderly patients following emergency general surgery: a multivariate regression analysis.

    PubMed

    Lees, Mackenzie C; Merani, Shaheed; Tauh, Keerit; Khadaroo, Rachel G

    2015-10-01

    Older adults (≥ 65 yr) are the fastest growing population and are presenting in increasing numbers for acute surgical care. Emergency surgery is frequently life threatening for older patients. Our objective was to identify predictors of mortality and poor outcome among elderly patients undergoing emergency general surgery. We conducted a retrospective cohort study of patients aged 65-80 years undergoing emergency general surgery between 2009 and 2010 at a tertiary care centre. Demographics, comorbidities, in-hospital complications, mortality and disposition characteristics of patients were collected. Logistic regression analysis was used to identify covariate-adjusted predictors of in-hospital mortality and discharge of patients home. Our analysis included 257 patients with a mean age of 72 years; 52% were men. In-hospital mortality was 12%. Mortality was associated with patients who had higher American Society of Anesthesiologists (ASA) class (odds ratio [OR] 3.85, 95% confidence interval [CI] 1.43-10.33, p = 0.008) and in-hospital complications (OR 1.93, 95% CI 1.32-2.83, p = 0.001). Nearly two-thirds of patients discharged home were younger (OR 0.92, 95% CI 0.85-0.99, p = 0.036), had lower ASA class (OR 0.45, 95% CI 0.27-0.74, p = 0.002) and fewer in-hospital complications (OR 0.69, 95% CI 0.53-0.90, p = 0.007). American Society of Anesthesiologists class and in-hospital complications are perioperative predictors of mortality and disposition in the older surgical population. Understanding the predictors of poor outcome and the importance of preventing in-hospital complications in older patients will have important clinical utility in terms of preoperative counselling, improving health care and discharging patients home.

  5. Predictors of unsuccessful outcome in cemented femoral revisions using bone impaction grafting; Cox regression analysis of 208 cases.

    PubMed

    Te Stroet, Martijn A J; Rijnen, Wim H C; Gardeniers, Jean W M; Schreurs, B Willem; Hannink, Gerjon

    2016-09-29

    Despite improvements in the technique of femoral impaction bone grafting, reconstruction failures still can occur. Therefore, the aim of our study was to determine risk factors for the endpoint re-revision for any reason. We used prospectively collected demographic, clinical and surgical data of all 202 patients who underwent 208 femoral revisions using the X-change Femoral Revision System (Stryker-Howmedica), fresh-frozen morcellised allograft and a cemented polished Exeter stem in our department from 1991 to 2007. Univariable and multivariable Cox regression analyses were performed to identify potential factors associated with re-revision. The mean follow-up was 10.6 (5-21) years. The cumulative re-revision rate was 6.3% (13/208). After univariable selection, sex, age, body mass index (BMI), American Society of Anesthesiologists (ASA) classification, type of removed femoral component, and mesh used for reconstruction were included in multivariable regression analysis. In the multivariable analysis, BMI was the only factor that was significantly associated with the risk of re-revision after bone impaction grafting (BMI ≥30 vs. BMI <30, HR = 6.54 [95% CI 1.89-22.65]; p = 0.003). BMI was the only factor associated with the risk of re-revision for any reason. Besides BMI, other factors such as Endoklinik score and the type of removed femoral component can also provide guidance in the process of preclinical decision making. With the knowledge obtained from this study, preoperative patient selection, informed consent, and treatment protocols can be better adjusted to the individual patient who needs to undergo a femoral revision with impaction bone grafting.

  6. Genetic analysis of body weights of individually fed beef bulls in South Africa using random regression models.

    PubMed

    Selapa, N W; Nephawe, K A; Maiwashe, A; Norris, D

    2012-02-08

    The aim of this study was to estimate genetic parameters for body weights of individually fed beef bulls measured at centralized testing stations in South Africa using random regression models. Weekly body weights of Bonsmara bulls (N = 2919) tested between 1999 and 2003 were available for the analyses. The model included a fixed regression of the body weights on fourth-order orthogonal Legendre polynomials of the actual days on test (7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, and 84) for starting age and contemporary group effects. Random regressions on fourth-order orthogonal Legendre polynomials of the actual days on test were included for additive genetic effects and additional uncorrelated random effects of the weaning-herd-year and the permanent environment of the animal. Residual effects were assumed to be independently distributed with heterogeneous variance for each test day. Variance ratios for additive genetic, permanent environment and weaning-herd-year for weekly body weights at different test days ranged from 0.26 to 0.29, 0.37 to 0.44 and 0.26 to 0.34, respectively. The weaning-herd-year was found to have a significant effect on the variation of body weights of bulls despite a 28-day adjustment period. Genetic correlations amongst body weights at different test days were high, ranging from 0.89 to 1.00. Heritability estimates were comparable to literature using multivariate models. Therefore, random regression models could be applied in the genetic evaluation of body weight of individually fed beef bulls in South Africa.
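
    The Legendre covariates used in such random regression models can be sketched as follows. This is a minimal illustration, assuming that "fourth-order" means polynomials up to degree 4 and that days on test are rescaled linearly to the interval [-1, 1] before evaluation; neither detail is stated in the abstract, and the normalization step is one common convention rather than necessarily the authors' choice.

```python
import numpy as np
from numpy.polynomial import legendre

# Days on test listed in the abstract: 7, 14, ..., 84.
days = np.arange(7, 85, 7)

# Assumed rescaling of days on test to [-1, 1], where Legendre
# polynomials are orthogonal.
t = 2.0 * (days - days.min()) / (days.max() - days.min()) - 1.0

# Columns are P0(t) .. P4(t); shape is (12 test days, 5 polynomials).
Phi = legendre.legvander(t, 4)

# Optional normalization so each polynomial has unit norm on [-1, 1].
norm = np.sqrt((2 * np.arange(5) + 1) / 2.0)
Phi_normalized = Phi * norm

print(Phi.shape)
```

    Each row of the resulting matrix would serve as the covariate vector for one test day in the fixed and random regressions.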

  7. Building a Multivariable Linear Regression Model of On-road Traffic for Creation of High Resolution Emission Inventories

    NASA Astrophysics Data System (ADS)

    Powell, James Eckhardt

    Emissions inventories are an important tool, often built by governments, and used to manage emissions. To build an inventory of urban CO2 emissions and other fossil fuel combustion products in the urban atmosphere, an inventory of on-road traffic is required. In particular, a high resolution inventory is necessary to capture the local characteristics of transport emissions. These emissions vary widely due to the local nature of the fleet, fuel, and roads. Here we show a new model of average daily traffic (ADT) for the Portland, OR metropolitan region. The backbone is traffic counter recordings made by the Portland Bureau of Transportation at 7,767 sites over 21 years (1986-2006), augmented with PORTAL (The Portland Regional Transportation Archive Listing) freeway traffic count data. We constructed a regression model to fill in traffic network gaps using GIS data such as road class and population density. An EPA-supplied emissions factor was used to estimate transportation CO2 emissions, which is compared to several other estimates for the city's CO2 footprint.

  8. Parental report of the early development of children with regressive autism: the delays-plus-regression phenotype.

    PubMed

    Ozonoff, Sally; Williams, Brenda J; Landa, Rebecca

    2005-12-01

    Most children with autism demonstrate developmental abnormalities in their first year, whereas others display regression after mostly normal development. Few studies have examined the early development of the latter group. This study developed a retrospective measure, the Early Development Questionnaire (EDQ), to collect specific, parent-reported information about development in the first 18 months. Based on their EDQ scores, 60 children with autism between the ages of 3 and 9 were divided into three groups: an early onset group (n = 29), a definite regression group (n = 23), and a heterogeneous mixed group (n = 8). Significant differences in early social development were found between the early onset and regression groups. However, over 50 percent of the children who experienced a regression demonstrated some early social deficits during the first year of life, long before regression and the apparent onset of autism. This group, tentatively labeled 'delays-plus-regression', deserves further study.

  9. Assessing the response of area burned to changing climate in western boreal North America using a Multivariate Adaptive Regression Splines (MARS) approach

    USGS Publications Warehouse

    Balshi, M. S.; McGuire, A.D.; Duffy, P.; Flannigan, M.; Walsh, J.; Melillo, J.

    2009-01-01

    Fire is a common disturbance in the North American boreal forest that influences ecosystem structure and function. The temporal and spatial dynamics of fire are likely to be altered as climate continues to change. In this study, we ask the question: how will area burned in boreal North America by wildfire respond to future changes in climate? To evaluate this question, we developed temporally and spatially explicit relationships between air temperature and fuel moisture codes derived from the Canadian Fire Weather Index System to estimate annual area burned at 2.5° (latitude × longitude) resolution using a Multivariate Adaptive Regression Spline (MARS) approach across Alaska and Canada. Burned area was substantially more predictable in the western portion of boreal North America than in eastern Canada. Burned area was also not very predictable in areas of substantial topographic relief and in areas along the transition between boreal forest and tundra. At the scale of Alaska and western Canada, the empirical fire models explain on the order of 82% of the variation in annual area burned for the period 1960-2002. July temperature was the most frequently occurring predictor across all models, but the fuel moisture codes for the months June through August (as a group) entered the models as the most important predictors of annual area burned. To predict changes in the temporal and spatial dynamics of fire under future climate, the empirical fire models used output from the Canadian Climate Center CGCM2 global climate model to predict annual area burned through the year 2100 across Alaska and western Canada. Relative to 1991-2000, the results suggest that average area burned per decade will double by 2041-2050 and will increase on the order of 3.5-5.5 times by the last decade of the 21st century. To improve the ability to better predict wildfire across Alaska and Canada, future research should focus on incorporating additional effects of long-term and successional

  10. Multi-variant study of obesity risk genes in African Americans: The Jackson Heart Study.

    PubMed

    Liu, Shijian; Wilson, James G; Jiang, Fan; Griswold, Michael; Correa, Adolfo; Mei, Hao

    2016-11-30

    Genome-wide association study (GWAS) has been successful in identifying obesity risk genes by single-variant association analysis. For this study, we designed steps of analysis strategy and aimed to identify multi-variant effects on obesity risk among candidate genes. Our analyses were focused on 2137 African American participants with body mass index measured in the Jackson Heart Study and 657 common single nucleotide polymorphisms (SNPs) genotyped at 8 GWAS-identified obesity risk genes. Single-variant association test showed that no SNPs reached significance after multiple testing adjustment. The following gene-gene interaction analysis, which was focused on SNPs with unadjusted p-value<0.10, identified 6 significant multi-variant associations. Logistic regression showed that SNPs in these associations did not have significant linear interactions; examination of genetic risk score evidenced that 4 multi-variant associations had significant additive effects of risk SNPs; and haplotype association test presented that all multi-variant associations contained one or several combinations of particular alleles or haplotypes, associated with increased obesity risk. Our study evidenced that obesity risk genes generated multi-variant effects, which can be additive or non-linear interactions, and multi-variant study is an important supplement to existing GWAS for understanding genetic effects of obesity risk genes. Copyright © 2016 Elsevier B.V. All rights reserved.

  11. A multivariate analysis of biophysical parameters of tallgrass prairie among land management practices and years

    USGS Publications Warehouse

    Griffith, J.A.; Price, K.P.; Martinko, E.A.

    2001-01-01

    Six treatments of eastern Kansas tallgrass prairie - native prairie, hayed, mowed, grazed, burned and untreated - were studied to examine the biophysical effects of land management practices on grasslands. On each treatment, measurements of plant biomass, leaf area index, plant cover, leaf moisture and soil moisture were collected. In addition, measurements were taken of the Normalized Difference Vegetation Index (NDVI), which is derived from spectral reflectance measurements. Measurements were taken in mid-June, mid-July and late summer of 1990 and 1991. Multivariate analysis of variance was used to determine whether there were differences in the set of variables among treatments and years. Follow-up tests included univariate t-tests to determine which variables were contributing to any significant difference. Results showed a significant difference (p < 0.0005) among treatments in the composite of parameters during each of the months sampled. In most treatment types, there was a significant difference between years within each month. The univariate tests showed, however, that only some variables, primarily soil moisture, were contributing to this difference. We conclude that biomass and % plant cover show the best potential to serve as long-term indicators of grassland condition as they generally were sensitive to effects of different land management practices but not to yearly change in weather conditions. NDVI was insensitive to precipitation differences between years in July for most treatments, but was not in the native prairie. Choice of sampling time is important for these parameters to serve effectively as indicators.

  12. Bayesian multivariate hierarchical transformation models for ROC analysis.

    PubMed

    O'Malley, A James; Zou, Kelly H

    2006-02-15

    A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
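
    For readers unfamiliar with the Box-Cox transformation mentioned in problem (1), a minimal sketch follows. It implements only the standard one-parameter transform for positive data; the function name is illustrative, and nothing here reproduces the hierarchical Bayesian model or its within-cluster estimation of the transformation parameter.

```python
import numpy as np


def box_cox(y, lam):
    """Standard one-parameter Box-Cox transform of positive values.

    Returns log(y) when lam == 0, otherwise (y**lam - 1) / lam.
    """
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(y)
    return (y**lam - 1.0) / lam


# Hypothetical positive test outcomes, for illustration only.
outcomes = [0.5, 1.0, 2.0, 4.0]
print(box_cox(outcomes, 0))    # log transform
print(box_cox(outcomes, 0.5))  # square-root-like transform
```

    In the setting described in the abstract, a separate transformation parameter would be associated with each cluster so that outcomes map to a common family of distributions.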

  13. Bayesian multivariate hierarchical transformation models for ROC analysis

    PubMed Central

    O'Malley, A. James; Zou, Kelly H.

    2006-01-01

    SUMMARY A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box–Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial. PMID:16217836

  14. Estuarial fingerprinting through multidimensional fluorescence and multivariate analysis.

    PubMed

    Hall, Gregory J; Clow, Kerin E; Kenny, Jonathan E

    2005-10-01

    As part of a strategy for preventing the introduction of aquatic nuisance species (ANS) to U.S. estuaries, ballast water exchange (BWE) regulations have been imposed. Enforcing these regulations requires a reliable method for determining the port of origin of water in the ballast tanks of ships entering U.S. waters. This study shows that a three-dimensional fluorescence fingerprinting technique, excitation emission matrix (EEM) spectroscopy, holds great promise as a ballast water analysis tool. In our technique, EEMs are analyzed by multivariate classification and curve resolution methods, such as N-way partial least squares regression-discriminant analysis (NPLS-DA) and parallel factor analysis (PARAFAC). We demonstrate that classification techniques can be used to discriminate among sampling sites less than 10 miles apart, encompassing Boston Harbor and two tributaries in the Mystic River Watershed. To our knowledge, this work is the first to use multivariate analysis to classify water as to location of origin. Furthermore, it is shown that curve resolution can show seasonal features within the multidimensional fluorescence data sets, which correlate with difficulty in classification.

  15. Reduced rank regression via adaptive nuclear norm penalization

    PubMed Central

    Chen, Kun; Dong, Hongbo; Chan, Kung-Sik

    2014-01-01

    Summary We propose an adaptive nuclear norm penalization approach for low-rank matrix approximation, and use it to develop a new reduced rank estimation method for high-dimensional multivariate regression. The adaptive nuclear norm is defined as the weighted sum of the singular values of the matrix, and it is generally non-convex under the natural restriction that the weight decreases with the singular value. However, we show that the proposed non-convex penalized regression method has a global optimal solution obtained from an adaptively soft-thresholded singular value decomposition. The method is computationally efficient, and the resulting solution path is continuous. The rank consistency of and prediction/estimation performance bounds for the estimator are established for a high-dimensional asymptotic regime. Simulation studies and an application in genetics demonstrate its efficacy. PMID:25045172
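
    The adaptively soft-thresholded singular value decomposition mentioned in this abstract can be sketched as follows. This is a minimal illustration for low-rank matrix approximation only, not the authors' estimator: the function name, the weight choice w_i = d_i^(-gamma), the value of gamma and the toy matrix are all assumptions made here for illustration.

```python
import numpy as np

def adaptive_soft_threshold_svd(Y, lam, gamma=2.0):
    """Shrink each singular value d_i by lam * w_i, then rebuild the matrix."""
    U, d, Vt = np.linalg.svd(Y, full_matrices=False)
    # Assumed weight choice: w_i = d_i**(-gamma), so the weight decreases
    # as the singular value grows and large values are penalized less.
    w = 1.0 / np.maximum(d, 1e-12) ** gamma
    d_shrunk = np.maximum(d - lam * w, 0.0)
    return (U * d_shrunk) @ Vt, d_shrunk

# Toy matrix built to have singular values 10, 5, 0.1 and 0.05 (hypothetical data).
rng = np.random.default_rng(0)
Qu, _ = np.linalg.qr(rng.standard_normal((6, 4)))
Qv, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Y = (Qu * np.array([10.0, 5.0, 0.1, 0.05])) @ Qv.T

Y_low_rank, d_shrunk = adaptive_soft_threshold_svd(Y, lam=1.0)
print(np.round(d_shrunk, 3), np.count_nonzero(d_shrunk))
```

    With these weights the two small singular values are removed entirely while the two large ones are barely shrunk, which is the behaviour the weighting is meant to produce.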

  16. Comparison of age estimation between 15-25 years using a modified form of Demirjian’s ten stage method and two teeth regression formula

    NASA Astrophysics Data System (ADS)

    Amiroh; Priaminiarti, M.; Syahraini, S. I.

    2017-08-01

    Age estimation of individuals, both dead and living, is important for victim identification and legal certainty. The Demirjian method uses the third molar for age estimation of individuals above 15 years old. The aim is to compare age estimation between 15-25 years using two Demirjian methods. Development stage of third molars in panoramic radiographs of 50 male and female samples were assessed by two observers using Demirjian’s ten stages and two teeth regression formula. Reliability was calculated using Cohen’s kappa coefficient and the significance of the observations was obtained from Wilcoxon tests. Deviations of age estimation were calculated using various methods. The deviation of age estimation with the two teeth regression formula was ±1.090 years; with ten stages, it was ±1.191 years. The deviation of age estimation using the two teeth regression formula was less than with the ten stages method. The age estimations using the two teeth regression formula or the ten stages method are significantly different until the age of 25, but they can be applied up to the age of 22.

  17. [Risk factors on the recurrence of ischemic stroke and the establishment of a Cox's regression model].

    PubMed

    An, Ya-chen; Chen, Yun-xia; Wang, Yu-xun; Zhao, Xiao-jing; Wang, Yan; Zhang, Jiang; Li, Chun-ling; Peng, Yan-bo; Gao, Su-ling; Chang, Li-sha; Zhang, Li; Xue, Xin-hong; Chen, Rui-ying; Wang, Da-li

    2011-08-01

    The aim was to investigate the risk factors for recurrence of ischemic stroke and to establish a Cox regression model of recurrence. We retrospectively reviewed consecutive patients with ischemic stroke admitted to the Neurology Department of the Hebei United University Affiliated Hospital between January 1, 2008 and December 31, 2009. Cases had been followed since the onset of ischemic stroke. The follow-up program ended on June 30, 2010. Kaplan-Meier methods were used to describe the recurrence rate. Univariate and multivariate Cox proportional hazards regression models were used to analyze the risk factors associated with episodes of recurrence, and a recurrence model was then set up. During the follow-up period, 79 cases relapsed, with recurrence rates of 12.75% at one year and 18.87% at two years. The univariate and multivariate Cox proportional hazards regression models showed that the independent risk factors associated with recurrence were age (X₁) (RR = 1.025, 95%CI: 1.003 - 1.048), history of hypertension (X₂) (RR = 1.976, 95%CI: 1.014 - 3.851), family history of stroke (X₃) (RR = 2.647, 95%CI: 1.175 - 5.961), total cholesterol (X₄) (RR = 1.485, 95%CI: 1.214 - 1.817), ESRS total score (X₅) (RR = 1.327, 95%CI: 1.057 - 1.666) and progression of the disease (X₆) (RR = 1.889, 95%CI: 1.123 - 3.178). The personal prognosis index (PI) of the recurrence model was as follows: PI = 0.025X₁ + 0.681X₂ + 0.973X₃ + 0.395X₄ + 0.283X₅ + 0.636X₆. The smaller the personal prognosis index, the lower the recurrence risk; the larger the index, the higher the recurrence risk. Age, history of hypertension, total cholesterol amount, total scores of ESRS, together with the disease progression were the independent risk factors associated with the recurrence episodes of ischemic stroke. Both recurrence model and the personal prognosis index equation were successful
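
    The prognosis index above can be evaluated directly, and each coefficient is the natural logarithm of the corresponding reported RR (for example, ln 1.976 ≈ 0.681). The sketch below is illustrative only: the variable names, the 0/1 coding of the binary factors, the units and the two example patients are assumptions, since the abstract does not state how X₁ to X₆ were coded.

```python
import math

# Coefficients and relative risks as reported in the abstract.
COEFS = {"age": 0.025, "hypertension": 0.681, "family_stroke": 0.973,
         "cholesterol": 0.395, "esrs": 0.283, "progression": 0.636}
RR = {"age": 1.025, "hypertension": 1.976, "family_stroke": 2.647,
      "cholesterol": 1.485, "esrs": 1.327, "progression": 1.889}

def prognostic_index(x):
    """Linear predictor PI = sum of coefficient * covariate value."""
    return sum(COEFS[k] * x[k] for k in COEFS)

def relative_hazard(x_a, x_b):
    """Hazard of patient a relative to patient b under a Cox model."""
    return math.exp(prognostic_index(x_a) - prognostic_index(x_b))

# Two hypothetical patients who differ only in hypertension history
# (binary factors coded 0/1 here, which is an assumption).
base = {"age": 60, "hypertension": 0, "family_stroke": 0,
        "cholesterol": 5.0, "esrs": 2, "progression": 0}
with_htn = dict(base, hypertension=1)
print(prognostic_index(base), relative_hazard(with_htn, base))
```

    Because only differences in PI enter the relative hazard, the comparison does not depend on the baseline hazard, which the abstract does not report.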

  18. Multivariate Analysis and Machine Learning in Cerebral Palsy Research

    PubMed Central

    Zhang, Jing

    2017-01-01

    Cerebral palsy (CP), a common pediatric movement disorder, causes the most severe physical disability in children. Early diagnosis in high-risk infants is critical for early intervention and possible early recovery. In recent years, multivariate analytic and machine learning (ML) approaches have been increasingly used in CP research. This paper aims to identify such multivariate studies and provide an overview of this relatively young field. Studies reviewed in this paper have demonstrated that multivariate analytic methods are useful in identification of risk factors, detection of CP, movement assessment for CP prediction, and outcome assessment, and ML approaches have made it possible to automatically identify movement impairments in high-risk infants. In addition, outcome predictors for surgical treatments have been identified by multivariate outcome studies. To make the multivariate and ML approaches useful in clinical settings, further research with large samples is needed to verify and improve these multivariate methods in risk factor identification, CP detection, movement assessment, and outcome evaluation or prediction. As multivariate analysis, ML and data processing technologies advance in the era of Big Data of this century, it is expected that multivariate analysis and ML will play a bigger role in improving the diagnosis and treatment of CP to reduce mortality and morbidity rates, and enhance patient care for children with CP. PMID:29312134

  19. A Multivariate Analysis of Personality, Values and Expectations as Correlates of Career Aspirations of Final Year Medical Students

    ERIC Educational Resources Information Center

    Rogers, Mary E.; Searle, Judy; Creed, Peter A.; Ng, Shu-Kay

    2010-01-01

    This study reports on the career intentions of 179 final year medical students who completed an online survey that included measures of personality, values, professional and lifestyle expectations, and well-being. Logistic regression analyses identified the determinants of preferred medical specialty, practice location and hours of work.…

  20. Perioperative factors predicting poor outcome in elderly patients following emergency general surgery: a multivariate regression analysis

    PubMed Central

    Lees, Mackenzie C.; Merani, Shaheed; Tauh, Keerit; Khadaroo, Rachel G.

    2015-01-01

    Background Older adults (≥ 65 yr) are the fastest growing population and are presenting in increasing numbers for acute surgical care. Emergency surgery is frequently life threatening for older patients. Our objective was to identify predictors of mortality and poor outcome among elderly patients undergoing emergency general surgery. Methods We conducted a retrospective cohort study of patients aged 65–80 years undergoing emergency general surgery between 2009 and 2010 at a tertiary care centre. Demographics, comorbidities, in-hospital complications, mortality and disposition characteristics of patients were collected. Logistic regression analysis was used to identify covariate-adjusted predictors of in-hospital mortality and discharge of patients home. Results Our analysis included 257 patients with a mean age of 72 years; 52% were men. In-hospital mortality was 12%. Mortality was associated with patients who had higher American Society of Anesthesiologists (ASA) class (odds ratio [OR] 3.85, 95% confidence interval [CI] 1.43–10.33, p = 0.008) and in-hospital complications (OR 1.93, 95% CI 1.32–2.83, p = 0.001). Nearly two-thirds of patients discharged home were younger (OR 0.92, 95% CI 0.85–0.99, p = 0.036), had lower ASA class (OR 0.45, 95% CI 0.27–0.74, p = 0.002) and fewer in-hospital complications (OR 0.69, 95% CI 0.53–0.90, p = 0.007). Conclusion American Society of Anesthesiologists class and in-hospital complications are perioperative predictors of mortality and disposition in the older surgical population. Understanding the predictors of poor outcome and the importance of preventing in-hospital complications in older patients will have important clinical utility in terms of preoperative counselling, improving health care and discharging patients home. PMID:26204143

  1. Study of cyanotoxins presence from experimental cyanobacteria concentrations using a new data mining methodology based on multivariate adaptive regression splines in Trasona reservoir (Northern Spain).

    PubMed

    Garcia Nieto, P J; Sánchez Lasheras, F; de Cos Juez, F J; Alonso Fernández, J R

    2011-11-15

    There is an increasing need to describe cyanobacteria blooms since some cyanobacteria produce toxins, termed cyanotoxins. These latter can be toxic and dangerous to humans as well as other animals and life in general. It must be remarked that the cyanobacteria are reproduced explosively under certain conditions. This results in algae blooms, which can become harmful to other species if the cyanobacteria involved produce cyanotoxins. In this research work, the evolution of cyanotoxins in Trasona reservoir (Principality of Asturias, Northern Spain) was studied with success using the data mining methodology based on multivariate adaptive regression splines (MARS) technique. The results of the present study are two-fold. On one hand, the importance of the different kind of cyanobacteria over the presence of cyanotoxins in the reservoir is presented through the MARS model and on the other hand a predictive model able to forecast the possible presence of cyanotoxins in a short term was obtained. The agreement of the MARS model with experimental data confirmed the good performance of the same one. Finally, conclusions of this innovative research are exposed. Copyright © 2011 Elsevier B.V. All rights reserved.

  2. Heterogeneity in drug abuse among juvenile offenders: is mixture regression more informative than standard regression?

    PubMed

    Montgomery, Katherine L; Vaughn, Michael G; Thompson, Sanna J; Howard, Matthew O

    2013-11-01

    Research on juvenile offenders has largely treated this population as a homogeneous group. However, recent findings suggest that this at-risk population may be considerably more heterogeneous than previously believed. This study compared mixture regression analyses with standard regression techniques in an effort to explain how known factors such as distress, trauma, and personality are associated with drug abuse among juvenile offenders. Researchers recruited 728 juvenile offenders from Missouri juvenile correctional facilities for participation in this study. Researchers investigated past-year substance use in relation to the following variables: demographic characteristics (gender, ethnicity, age, familial use of public assistance), antisocial behavior, and mental illness symptoms (psychopathic traits, psychiatric distress, and prior trauma). Results indicated that standard and mixed regression approaches identified significant variables related to past-year substance use among this population; however, the mixture regression methods provided greater specificity in results. Mixture regression analytic methods may help policy makers and practitioners better understand and intervene with the substance-related subgroups of juvenile offenders.

  3. Regression of left ventricular hypertrophy and microalbuminuria changes during antihypertensive treatment.

    PubMed

    Rodilla, Enrique; Pascual, Jose Maria; Costa, Jose Antonio; Martin, Joaquin; Gonzalez, Carmen; Redon, Josep

    2013-08-01

    The objective of the present study was to assess the regression of left ventricular hypertrophy (LVH) during antihypertensive treatment, and its relationship with the changes in microalbuminuria. One hundred and sixty-eight previously untreated patients with echocardiographic LVH, 46 (27%) with microalbuminuria, were followed during a median period of 13 months (range 6-23 months) and treated with lifestyle changes and antihypertensive drugs. Twenty-four-hour ambulatory blood pressure monitoring, echocardiography and urinary albumin excretion were assessed at the beginning and at the end of the study period. Left ventricular mass index (LVMI) was reduced from 137 [interquartile interval (IQI), 129-154] to 121 (IQI, 104-137) g/m (P < 0.001). Eighty-nine patients (53%) had a reduction in LVMI of at least 17.8 g/m, and an LVH regression rate of 43.8 per 100 patient-years [95% confidence interval (CI) 35.2-53.9]. The main factor related to LVH regression was the reduction in SBP24 h [multivariate odds ratio (ORm) 4.49; 95% CI 1.73-11.63; P = 0.005, highest tertile compared with lower tertiles]. Male sex (ORm 0.39; 95% CI 0.17-0.90; P = 0.04) and baseline glomerular filtration rate less than 90 ml/min per 1.73 m² (ORm 0.39; 95% CI 0.17-0.90; P = 0.03) were associated with a lower probability of LVH regression. Patients with microalbuminuria regression (urinary albumin excretion reduction >50%) had the same odds of achieving regression of LVH as patients with normoalbuminuria (ORm 1.1; 95% CI 0.38-3.25; P = 0.85). However, those with microalbuminuria at baseline, who did not regress, had less probability of achieving LVH regression than the normoalbuminuric patients (OR 0.26; 95% CI 0.07-0.90; P = 0.03) even when adjusted for age, sex, initial LVMI, GFR, blood pressure and angiotensin-converting enzyme inhibitor (ACE-I) or angiotensin receptor blocker (ARB) treatment during the follow-up. Patients who do not have a significant reduction in

  4. Multivariate analysis of prognostic factors in male breast cancer in Serbia.

    PubMed

    Sipetic-Grujicic, Sandra Branko; Murtezani, Zafir Hajdar; Neskovic-Konstatinovic, Zora Borivoje; Marinkovic, Jelena Milutin; Kovcin, Vladimir Nikola; Andric, Zoran Gojko; Kostic, Sanja Vladeta; Ratkov, Isidora Stojan; Maksimovic, Jadranka Milutin

    2014-01-01

    The aim of this study was to analyze the demographic and clinical characteristics of male breast cancer patients in Serbia, and furthermore to determine overall survival and predictive factors for prognosis. In the period of 1996-2006 histopathological diagnosis of breast cancer was made in 84 males at the Institute for Oncology and Radiology of Serbia. For statistical analyses the Kaplan-Meier method, log-rank test and Cox proportional hazards regression model were used. The mean age at diagnosis with breast cancer was 64.3±10.5 years with a range from 35-84 years. Nearly 80% of the tumors showed ductal histology. About 44% had early tumor stages (I and II) whereas 46.4% and 9.5% of the males exhibited stages III and IV, respectively. Only 7.1% of male patients were grade one. One-fifth of all patients had tumors measuring ≤2 cm, and 14.3% larger than 5 cm. Lymph node metastasis was recorded in 40.4% of patients and relapse in 47%. Estrogen and progesterone receptor expression was positive in 66.7% and 58.3%, respectively. In 14.3% of individuals the tumor was HER2 positive. About two-thirds of all male patients had radical mastectomy (66.7%). Adjuvant hormonal therapy (tamoxifen), systemic chemotherapy (CMF or FAC) and adjuvant radiotherapy were given to 59.5%, 35.7% and 29.8% of patients, respectively. Overall survival rates at five and ten years for male breast cancer were 55.0% and 43.9%, respectively. According to the multivariate Cox regression predictive model, a lower initial disease stage, a lower tumor grade, application of adjuvant hormone therapy and no relapse occurrence were significant independent predictors for good overall survival. Results of the treatment would be better if the disease were discovered earlier, and therefore health education and screening are imperative in solving this problem.

  5. Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis.

    PubMed

    Liu, Fei; Ye, Lanhan; Peng, Jiyu; Song, Kunlin; Shen, Tingting; Zhang, Chu; He, Yong

    2018-02-27

    Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² greater than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice.

  6. Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis

    PubMed Central

    Ye, Lanhan; Song, Kunlin; Shen, Tingting

    2018-01-01

    Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² greater than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice. PMID:29495445

  7. Deterioration of Speech Recognition Ability Over a Period of 5 Years in Adults Ages 18 to 70 Years: Results of the Dutch Online Speech-in-Noise Test.

    PubMed

    Stam, Mariska; Smits, Cas; Twisk, Jos W R; Lemke, Ulrike; Festen, Joost M; Kramer, Sophia E

    2015-01-01

    The first aim of the present study was to determine the change in speech recognition in noise over a period of 5 years in participants ages 18 to 70 years at baseline. The second aim was to investigate whether age, gender, educational level, the level of initial speech recognition in noise, and reported chronic conditions were associated with a change in speech recognition in noise. The baseline and 5-year follow-up data of 427 participants with and without hearing impairment participating in the National Longitudinal Study on Hearing (NL-SH) were analyzed. The ability to recognize speech in noise was measured twice with the online National Hearing Test, a digit-triplet speech-in-noise test. Speech-reception-threshold in noise (SRTn) scores were calculated, corresponding to 50% speech intelligibility. Unaided SRTn scores obtained with the same transducer (headphones or loudspeakers) at both test moments were included. Changes in SRTn were calculated as a raw shift (T1 - T0) and an adjusted shift for regression towards the mean. Paired t tests and multivariable linear regression analyses were applied. The mean increase (i.e., deterioration) in SRTn was 0.38-dB signal-to-noise ratio (SNR) over 5 years (p < 0.001). Results of the multivariable regression analyses showed that the age group of 50 to 59 years had a significantly larger deterioration in SRTn compared with the age group of 18 to 39 years (raw shift: beta: 0.64-dB SNR; 95% confidence interval: 0.07-1.22; p = 0.028, adjusted for initial speech recognition level - adjusted shift: beta: 0.82-dB SNR; 95% confidence interval: 0.27-1.34; p = 0.004). Gender, educational level, and the number of chronic conditions were not associated with a change in SRTn over time. No significant differences in increase of SRTn were found between the initial levels of speech recognition (i.e., good, insufficient, or poor) when taking into account the phenomenon regression towards the mean. The study results indicate that hearing

  8. Seasonal variation of benzo(a)pyrene in the Spanish airborne PM10. Multivariate linear regression model applied to estimate BaP concentrations.

    PubMed

    Callén, M S; López, J M; Mastral, A M

    2010-08-15

    The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view, especially with the introduction of Directive 2004/107/EC and because of the carcinogenic character of this pollutant. Particulate matter less than or equal to 10 microns (PM10) was collected during a 2008-2009 sampling campaign in four locations of Spain to determine BaP concentrations experimentally by gas chromatography-tandem mass spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R²=0.817, PRESS/SSY=0.183) included significant variables such as PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the last one by BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q²(CV)=0.813, PRESS/SSY=0.187) and with the maximal external prediction for the 2001-2002 campaign (Q²(ext)=0.679 and PRESS/SSY=0.321) versus the 2001-2004 campaign (Q²(ext)=0.551, PRESS/SSY=0.449). Copyright 2010 Elsevier B.V. All rights reserved.
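
    Each reported pair of statistics sums to one (0.813 + 0.187, 0.679 + 0.321, 0.551 + 0.449), which is consistent with the common definition of the cross-validated coefficient as one minus PRESS/SSY. A minimal sketch of that calculation with leave-one-out refitting is given below; the function names and the synthetic data are assumptions for illustration and do not reproduce the authors' model or measurements.

```python
import numpy as np

def loo_press(X, y):
    """Prediction error sum of squares from leave-one-out refits of an
    ordinary least-squares model with an intercept."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(Xd[keep], y[keep], rcond=None)
        press += (y[i] - Xd[i] @ beta) ** 2
    return press

def q2_cv(X, y):
    """Cross-validated Q^2 = 1 - PRESS / SSY, with SSY the total sum of
    squares of y about its mean."""
    ssy = np.sum((y - y.mean()) ** 2)
    return 1.0 - loo_press(X, y) / ssy

# Synthetic predictors and response (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 3))
y = 2.0 + X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.standard_normal(40)
print(q2_cv(X, y))
```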

  9. Genetic parameters for growth characteristics of free-range chickens under univariate random regression models.

    PubMed

    Rovadoscki, Gregori A; Petrini, Juliana; Ramirez-Diaz, Johanna; Pertile, Simone F N; Pertille, Fábio; Salvian, Mayara; Iung, Laiza H S; Rodriguez, Mary Ana P; Zampar, Aline; Gaya, Leila G; Carvalho, Rachel S B; Coelho, Antonio A D; Savino, Vicente J M; Coutinho, Luiz L; Mourão, Gerson B

    2016-09-01

    Repeated measures from the same individual have been analyzed by using repeatability and finite dimension models under univariate or multivariate analyses. However, in the last decade, the use of random regression models for genetic studies with longitudinal data has become more common. Thus, the aim of this research was to estimate genetic parameters for body weight of four experimental chicken lines by using univariate random regression models. Body weight data from hatching to 84 days of age (n = 34,730) from four experimental free-range chicken lines (7P, Caipirão da ESALQ, Caipirinha da ESALQ and Carijó Barbado) were used. The analysis model included the fixed effects of contemporary group (gender and rearing system), fixed regression coefficients for age at measurement, and random regression coefficients for permanent environmental effects and additive genetic effects. Heterogeneous variances for residual effects were considered, and one residual variance was assigned for each of six subclasses of age at measurement. Random regression curves were modeled by using Legendre polynomials of the second and third orders, with the best model chosen based on the Akaike Information Criterion, Bayesian Information Criterion, and restricted maximum likelihood. Multivariate analyses under the same animal mixed model were also performed for the validation of the random regression models. The Legendre polynomials of second order were better for describing the growth curves of the lines studied. Moderate to high heritabilities (h² = 0.15 to 0.98) were estimated for body weight between one and 84 days of age, suggesting that body weight at all ages can be used as a selection criterion. Genetic correlations among body weight records obtained through multivariate analyses ranged from 0.18 to 0.96, 0.12 to 0.89, 0.06 to 0.96, and 0.28 to 0.96 in 7P, Caipirão da ESALQ, Caipirinha da ESALQ, and Carijó Barbado chicken lines, respectively. Results indicate that

  10. Penalized spline estimation for functional coefficient regression models.

    PubMed

    Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z; Yu, Yan

    2010-04-01

    The functional coefficient regression models assume that the regression coefficients vary with some "threshold" variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called "curse of dimensionality" in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning different penalty λ accordingly. We demonstrate the proposed approach by both simulation examples and a real data application.

  11. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity
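
    The way a fitted logistic regression model turns basin characteristics into a probability can be sketched as below. The intercept, coefficients and predictor values are entirely hypothetical placeholders chosen for illustration; the abstract does not report the fitted values, and the predictor names only loosely follow the variables it mentions.

```python
import math

def debris_flow_probability(intercept, coefs, predictors):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_k * x_k))))."""
    z = intercept + sum(coefs[name] * predictors[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, NOT the published model.
intercept = -3.0
coefs = {"pct_slope_gt_30": 0.03, "pct_burned_med_high": 0.02}

# Hypothetical basin: 50% steep slopes, 60% burned at medium/high severity.
basin = {"pct_slope_gt_30": 50.0, "pct_burned_med_high": 60.0}
print(debris_flow_probability(intercept, coefs, basin))
```

    Evaluating such a function for every basin is, in outline, what producing a probability map in a GIS amounts to.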

  12. Penalized regression procedures for variable selection in the potential outcomes framework

    PubMed Central

    Ghosh, Debashis; Zhu, Yeying; Coffman, Donna L.

    2015-01-01

    A recent topic of much interest in causal inference is model selection. In this article, we describe a framework in which to consider penalized regression approaches to variable selection for causal effects. The framework leads to a simple ‘impute, then select’ class of procedures that is agnostic to the type of imputation algorithm as well as penalized regression used. It also clarifies how model selection involves a multivariate regression model for causal inference problems, and that these methods can be applied for identifying subgroups in which treatment effects are homogeneous. Analogies and links with the literature on machine learning methods, missing data and imputation are drawn. A difference LASSO algorithm is defined, along with its multiple imputation analogues. The procedures are illustrated using a well-known right heart catheterization dataset. PMID:25628185

  13. Application of two tests of multivariate discordancy to fisheries data sets

    USGS Publications Warehouse

    Stapanian, M.A.; Kocovsky, P.M.; Garner, F.C.

    2008-01-01

    The generalized (Mahalanobis) distance and multivariate kurtosis are two powerful tests of multivariate discordancies (outliers). Unlike the generalized distance test, the multivariate kurtosis test has not been applied as a test of discordancy to fisheries data heretofore. We applied both tests, along with published algorithms for identifying suspected causal variable(s) of discordant observations, to two fisheries data sets from Lake Erie: total length, mass, and age from 1,234 burbot, Lota lota; and 22 combinations of unique subsets of 10 morphometrics taken from 119 yellow perch, Perca flavescens. For the burbot data set, the generalized distance test identified six discordant observations and the multivariate kurtosis test identified 24 discordant observations. In contrast with the multivariate tests, the univariate generalized distance test identified no discordancies when applied separately to each variable. Removing discordancies had a substantial effect on length-versus-mass regression equations. For 500-mm burbot, the percent difference in estimated mass after removing discordancies in our study was greater than the percent difference in masses estimated for burbot of the same length in lakes that differed substantially in productivity. The number of discordant yellow perch detected ranged from 0 to 2 with the multivariate generalized distance test and from 6 to 11 with the multivariate kurtosis test. With the kurtosis test, 108 yellow perch (90.7%) were identified as discordant in zero to two combinations, and five (4.2%) were identified as discordant in either all or 21 of the 22 combinations. The relationship among the variables included in each combination determined which variables were identified as causal. The generalized distance test identified between zero and six discordancies when applied separately to each variable. Removing the discordancies found in at least one-half of the combinations (k=5) had a marked effect on a principal components
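
    The two statistics named in this abstract can be computed as sketched below: squared generalized (Mahalanobis) distances of each observation from the sample mean, and a Mardia-type multivariate kurtosis taken as the mean of the squared distances raised to the second power. This is a minimal sketch under assumptions: the function names and synthetic data are illustrative, and the critical values and causal-variable algorithms the authors used are not reproduced.

```python
import numpy as np

def mahalanobis_sq(X, ddof=1):
    """Squared generalized distance of each row from the sample mean."""
    Xc = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False, ddof=ddof))
    return np.einsum("ij,jk,ik->i", Xc, S_inv, Xc)

def mardia_kurtosis(X):
    """Mean of the squared distances squared, using the biased (ddof=0)
    covariance, as in Mardia's definition of multivariate kurtosis."""
    return np.mean(mahalanobis_sq(X, ddof=0) ** 2)

# Synthetic three-variable data with one planted discordant row
# (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
X[7] = [10.0, 10.0, -10.0]

d2 = mahalanobis_sq(X)
print(int(np.argmax(d2)), mardia_kurtosis(X))
```

    A discordancy test would compare the largest distance, or the kurtosis, against a critical value for the given sample size and number of variables.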

  14. Regression Model Optimization for the Analysis of Experimental Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.

    2009-01-01

    A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.
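
    As a minimal sketch of the search metric named in this abstract: the PRESS residual of observation i is the ordinary residual divided by (1 - h_ii), where h_ii is the leverage, and it equals the prediction error obtained when the model is refitted without that observation. The example uses a single regressor for brevity. The function names, the sample data, and the choice of the sample standard deviation as the metric are assumptions made for illustration, not details taken from the report.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y ~ a + b*x; returns (a, b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_mean - b * x_mean, b


def press_residuals(x, y):
    """PRESS residuals e_i / (1 - h_ii) from a single fit."""
    n = len(x)
    a, b = fit_line(x, y)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    out = []
    for xi, yi in zip(x, y):
        # Leverage of one observation in the one-regressor model.
        leverage = 1.0 / n + (xi - x_mean) ** 2 / sxx
        out.append((yi - (a + b * xi)) / (1.0 - leverage))
    return out


def press_residuals_refit(x, y):
    """Same quantity by brute force: refit with each point left out."""
    out = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        out.append(y[i] - (a + b * x[i]))
    return out


def press_std(residuals):
    """Sample standard deviation of the PRESS residuals (assumed metric)."""
    n = len(residuals)
    mean = sum(residuals) / n
    return math.sqrt(sum((r - mean) ** 2 for r in residuals) / (n - 1))


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.3]
print(press_std(press_residuals(x, y)))
```

    In a model search of the kind described, a value like this would be computed for each candidate model and the candidate with the smallest value preferred, subject to the constraints listed in the abstract.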

  15. Effects of univariate and multivariate regression on the accuracy of hydrogen quantification with laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Ytsma, Cai R.; Dyar, M. Darby

    2018-01-01

    Hydrogen (H) is a critical element to measure on the surface of Mars because its presence in mineral structures is indicative of past hydrous conditions. The Curiosity rover uses the laser-induced breakdown spectrometer (LIBS) on the ChemCam instrument to analyze rocks for their H emission signal at 656.6 nm, from which H can be quantified. Previous LIBS calibrations for H used small data sets measured on standards and/or manufactured mixtures of hydrous minerals and rocks and applied univariate regression to spectra normalized in a variety of ways. However, matrix effects common to LIBS make these calibrations of limited usefulness when applied to the broad range of compositions on the Martian surface. In this study, 198 naturally-occurring hydrous geological samples covering a broad range of bulk compositions with directly-measured H content are used to create more robust prediction models for measuring H in LIBS data acquired under Mars conditions. Both univariate and multivariate prediction models, including partial least square (PLS) and the least absolute shrinkage and selection operator (Lasso), are compared using several different methods for normalization of H peak intensities. Data from the ChemLIBS Mars-analog spectrometer at Mount Holyoke College are compared against spectra from the same samples acquired using a ChemCam-like instrument at Los Alamos National Laboratory and the ChemCam instrument on Mars. Results show that all current normalization and data preprocessing variations for quantifying H result in models with statistically indistinguishable prediction errors (accuracies) ca. ± 1.5 weight percent (wt%) H2O, limiting the applications of LIBS in these implementations for geological studies. This error is too large to allow distinctions among the most common hydrous phases (basalts, amphiboles, micas) to be made, though some clays (e.g., chlorites with ≈ 12 wt% H2O, smectites with 15-20 wt% H2O) and hydrated phases (e.g., gypsum with ≈ 20

  16. Risk factors for mortality before age 18 years in cystic fibrosis.

    PubMed

    McColley, Susanna A; Schechter, Michael S; Morgan, Wayne J; Pasta, David J; Craib, Marcia L; Konstan, Michael W

    2017-07-01

    Understanding early-life risk factors for childhood death in cystic fibrosis (CF) is important for clinical care, including the identification of effective interventions. Data from the Epidemiologic Study of Cystic Fibrosis (ESCF) collected 1994-2005 were linked with the Cystic Fibrosis Foundation Patient Registry (CFFPR) demographic and mortality data from 2013. Inclusion criteria were ≥1 visit annually at age 3-5 years and ≥1 FEV1 measurement at age 6-8 years. Demographic data, nutritional parameters, pulmonary signs and symptoms, microbiology, and FEV1 were evaluated as risk factors for death before age 18 years. Multivariable Cox proportional hazards regression was used to model the simultaneous effects of risk factors associated with death before age 18 years. Among 5365 patients enrolled in ESCF who met inclusion criteria, 3880 (72%) were linked to the CFFPR. Among these, 191 (5.7%) died before age 18 years; median age at death was 13.4 ± 3.1 years. Multivariable regression showed clubbing, crackles, female sex, unknown CFTR genotype, minority race or ethnicity, Medicaid insurance (a proxy of low socioeconomic status), Pseudomonas aeruginosa on 2 or more cultures, and weight-for-age <50th percentile were significant risk factors for death regardless of inclusion of FEV1 at age 6-8 years in the model. We identified multiple risk factors for childhood death of patients with CF, all of which remained important after incorporating FEV1 at age 6-8 years. Among the factors identified were the presence of clubbing or crackles at age 3-5 years, signs which are not routinely collected in registries. © 2017 Wiley Periodicals, Inc.

  17. Identifying Interacting Genetic Variations by Fish-Swarm Logic Regression

    PubMed Central

    Yang, Aiyuan; Yan, Chunxia; Zhu, Feng; Zhao, Zhongmeng; Cao, Zhi

    2013-01-01

    Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is that of identifying a set of interacting genetic variants, which might contribute to complex traits. Logic regression (LR) is a powerful multivariant association tool. Several LR-based approaches have been successfully applied to different datasets. However, these approaches are not adequate with regard to accuracy and efficiency. In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR), which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents are conducted in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. A swarm algorithm improves the accuracy and the efficiency by speeding up the convergence and preventing it from dropping into local optimums. We apply our approach on a real screening dataset and a series of simulation scenarios. Compared to three existing LR-based approaches, our approach outperforms them by having lower type I and type II error rates, being able to identify more preset causal sites, and performing at faster speeds. PMID:23984382

  18. Semiparametric regression during 2003–2007*

    PubMed Central

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2010-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800

  19. Optical scatterometry of quarter-micron patterns using neural regression

    NASA Astrophysics Data System (ADS)

    Bischoff, Joerg; Bauer, Joachim J.; Haak, Ulrich; Hutschenreuther, Lutz; Truckenbrodt, Horst

    1998-06-01

    With shrinking dimensions and increasing chip areas, a rapid and non-destructive full wafer characterization after every patterning cycle is an inevitable necessity. In former publications it was shown that Optical Scatterometry (OS) has the potential to push the attainable feature limits of optical techniques from 0.8 to 0.5 microns for imaging methods down to 0.1 micron and below. Thus the demands of future metrology can be met. Basically being a nonimaging method, OS combines light scatter (or diffraction) measurements with modern data analysis schemes to solve the inverse scatter issue. For very fine patterns with lambda-to-pitch ratios greater than one, the specularly reflected light versus the incidence angle is recorded. Usually, the data analysis comprises two steps: a training cycle connected to a rigorous forward modeling and the prediction itself. Until now, two data analysis schemes are usually applied: the multivariate regression based Partial Least Squares method (PLS) and a look-up-table technique which is also referred to as Minimum Mean Square Error approach (MMSE). Both methods are afflicted with serious drawbacks. On the one hand, the prediction accuracy of multivariate regression schemes degrades with larger parameter ranges due to the linearization properties of the method. On the other hand, look-up-table methods are rather time consuming during prediction thus prolonging the processing time and reducing the throughput. An alternate method is an Artificial Neural Network (ANN) based regression which combines the advantages of multivariate regression and MMSE. Due to the versatility of a neural network, not only can its structure be adapted more properly to the scatter problem, but also the nonlinearity of the neuronal transfer functions mimics the nonlinear behavior of optical diffraction processes more adequately. In spite of these pleasant properties, the prediction speed of ANN regression is comparable with that of the PLS-method. In

  20. Prognostic Relevance of Lymph Node Regression After Neoadjuvant Chemoradiation for Esophageal Cancer.

    PubMed

    Philippron, Annouck; Bollschweiler, Elfriede; Kunikata, Ayumi; Plum, Patrick; Schmidt, Claudia; Favi, Francesco; Drebber, Uta; Hölscher, Arnulf H

    2016-01-01

    Prognostic factors after preoperative chemoradiation for patients with advanced esophageal cancer are under discussion. Treatment response measured in the primary tumor is a well-defined prognostic marker. The prognostic relevance of tumor regression in lymph nodes (LNs), eg, histomorphologic characteristics must be evaluated in a larger series of patients. From 1997-2010, 403 patients with cT3N×M0 esophageal cancer underwent preoperative chemoradiation followed by transthoracic esophagectomy. Histopathologic response of the primary tumor was graded in resected specimens as "minor" (≥10% vital residual tumor cells) or "major." The LNs of all patients without LN metastases (ypN0 n = 222, adenocarcinoma n = 129, squamous cell carcinoma n = 93) were reevaluated for central fibrosis. Univariate and multivariate analyses were performed on histomorphologic criteria of examined LNs and used to correlate these with tumor response and prognosis. The 5-year survival rate (5YSR) for all patients was 30%. Overall, 5480 LNs were reevaluated for the existence of central fibrosis in ypN0 cases. The prognostic relevance of the LN regression (LNR) grading system was confirmed for all patients with univariate (P < 0.001) and multivariate (P = 0.02) analyses. In results, the 5YSR for ypN0 patients overall was 37%, for patients with major response by the primary tumor was 42%, and for minor responders was 19% (P < 0.001). Analyzing LNR in major responders, the group with less than 3 LNs with central fibrosis (n = 52) showed significantly better prognosis (5YSR = 63%) compared to those with more (5YSR = 34%), (P = 0.016). Conclusion includes morphologic signs of metastatic LNR after chemoradiation, such as central fibrosis, are of prognostic relevance for patients with advanced esophageal cancer, especially for those with major response of the primary tumor. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. Simple and Multivariate Relationships Between Spiritual Intelligence with General Health and Happiness.

    PubMed

    Amirian, Mohammad-Elyas; Fazilat-Pour, Masoud

    2016-08-01

    The present study examined simple and multivariate relationships of spiritual intelligence with general health and happiness. The employed method was descriptive and correlational. King's Spiritual Quotient scales, the GHQ-28 and the Oxford Happiness Inventory were completed by a sample of 384 students, selected using stratified random sampling from the students of Shahid Bahonar University of Kerman. Data were subjected to descriptive and inferential statistics including correlations and multivariate regressions. Bivariate correlations support a positive and significant predictive value of spiritual intelligence toward general health and happiness. Further analysis showed that, among the spiritual intelligence subscales, existential critical thinking predicted general health and happiness inversely. In addition, happiness was positively predicted by generation of personal meaning and transcendental awareness. The findings are discussed in line with the previous studies and the relevant theoretical background.

  2. Dental avoidance behaviour in parent and child as risk indicators for caries in 5-year-old children.

    PubMed

    Wigen, Tove I; Skaret, Erik; Wang, Nina J

    2009-11-01

    The aim of this study was to explore associations between avoidance behaviour and dental anxiety in both parents and children and caries experience in 5-year-old children. It was hypothesised that parents' dental avoidance behaviour and dental anxiety were related to dental caries in 5-year-old children. Data were collected from dental records and by clinical and radiographic examination of 523 children. The parents completed a questionnaire regarding education, national background, dental anxiety, dental attendance, and behaviour management problems. Bivariate and multivariate logistic regression was conducted. Children having one or more missed dental appointments (OR = 4.7), child behaviour management problems (OR = 3.3), child dental anxiety (OR = 3.1), and parents avoiding dental care (OR = 2.1) were bivariately associated with caries experience at the age of 5 years. In multivariate logistic regression, having one or more missed dental appointments (OR = 4.0) and child behaviour management problems (OR = 2.4) were indicators for dental caries in 5-year-old children, when controlling for parents education and national origin. Parents that avoid bringing their child to scheduled dental appointments and previous experiences of behaviour management problems for the child indicated risk for dental caries in 5-year-old children.

  3. A Comparison of Conventional Linear Regression Methods and Neural Networks for Forecasting Educational Spending.

    ERIC Educational Resources Information Center

    Baker, Bruce D.; Richards, Craig E.

    1999-01-01

    Applies neural network methods for forecasting 1991-95 per-pupil expenditures in U.S. public elementary and secondary schools. Forecasting models included the National Center for Education Statistics' multivariate regression model and three neural architectures. Regarding prediction accuracy, neural network results were comparable or superior to…

  4. A regularization corrected score method for nonlinear regression models with covariate error.

    PubMed

    Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna

    2013-03-01

    Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. Copyright © 2013, The International Biometric Society.

  5. Multivariate analysis of prognostic factors in synovial sarcoma.

    PubMed

    Koh, Kyoung Hwan; Cho, Eun Yoon; Kim, Dong Wook; Seo, Sung Wook

    2009-11-01

    Many studies have described the diversity of synovial sarcoma in terms of its biological characteristics and clinical features. Moreover, much effort has been expended on the identification of prognostic factors because of the unpredictable behavior of synovial sarcomas. However, with the exception of tumor size, published results have been inconsistent. We attempted to identify independent risk factors using survival analysis. Forty-one consecutive patients with synovial sarcoma were prospectively followed from January 1997 to March 2008. Overall and progression-free survival for age, sex, tumor size, tumor location, metastasis at presentation, histologic subtype, chemotherapy, radiation therapy, and resection margin were analyzed, and standard multivariate Cox proportional hazard regression analysis was used to evaluate potential prognostic factors. Tumor size (>5 cm), nonlimb-based tumors, metastasis at presentation, and a monophasic subtype were associated with poorer overall survival. Multivariate analysis showed metastasis at presentation and monophasic tumor subtype affected overall survival. For progression-free survival, monophasic subtype was found to be the only prognostic factor. The study confirmed that histologic subtype is the single most important independent prognostic factor of synovial sarcoma regardless of tumor stage.

  6. MULTIVARIATE ANALYSIS OF DRINKING BEHAVIOUR IN A RURAL POPULATION

    PubMed Central

    Mathrubootham, N.; Bashyam, V.S.P.; Shahjahan

    1997-01-01

    This study was carried out to determine the drinking pattern in a rural population, using multivariate techniques. 386 current users identified in a community were assessed with regard to their drinking behaviours using a structured interview. For purposes of the study the questions were condensed into 46 meaningful variables. In bivariate analysis, 14 variables including dependent variables such as dependence, MAST & CAGE (measuring alcoholic status), Q.F. Index and troubled drinking were found to be significant. Taking these variables, other multivariate techniques such as ANOVA, correlation, regression analysis and factor analysis were applied using both SPSS PC + and an HCL magnum mainframe computer with FOCUS package and UNIX systems. Results revealed that a number of factors such as drinking style, duration of drinking, pattern of abuse, Q.F. Index and various problems influenced drinking and some of them set up a vicious circle. Factor analysis revealed mainly 3 factors: abuse, dependence and social drinking. Dependence could be divided into low/moderate dependence. The implications and practical applications of these tests are also discussed. PMID:21584077

  7. Multivariate Cluster Analysis.

    ERIC Educational Resources Information Center

    McRae, Douglas J.

    Procedures for grouping students into homogeneous subsets have long interested educational researchers. The research reported in this paper is an investigation of a set of objective grouping procedures based on multivariate analysis considerations. Four multivariate functions that might serve as criteria for adequate grouping are given and…

  8. What is the best predictor of mortality in perforated peptic ulcer disease? A population-based, multivariable regression analysis including three clinical scoring systems.

    PubMed

    Thorsen, Kenneth; Søreide, Jon Arne; Søreide, Kjetil

    2014-07-01

    Mortality rates in perforated peptic ulcer (PPU) have remained unchanged. The aim of this study was to compare known clinical factors and three scoring systems (American Society of Anesthesiologists (ASA), Boey and peptic ulcer perforation (PULP)) in the ability to predict mortality in PPU. This is a consecutive, observational cohort study of patients surgically treated for perforated peptic ulcer over a decade (January 2001 through December 2010). Primary outcome was 30-day mortality. A total of 172 patients were included, of whom 28 (16 %) died within 30 days. Among the factors associated with mortality, the PULP score had an odds ratio (OR) of 18.6 and the ASA score had an OR of 11.6, both with an area under the curve (AUC) of 0.79. The Boey score had an OR of 5.0 and an AUC of 0.75. Hypoalbuminaemia alone (≤37 g/l) achieved an OR of 8.7 and an AUC of 0.78. In multivariable regression, mortality was best predicted by a combination of increasing age, presence of active cancer and delay from admission to surgery of >24 h, together with hypoalbuminaemia, hyperbilirubinaemia and increased creatinine values, for a model AUC of 0.89. Six clinical factors predicted 30-day mortality better than available risk scores. Hypoalbuminaemia was the strongest single predictor of mortality and may be included for improved risk estimation.
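
    The AUC values quoted in this abstract can be read as the probability that a randomly chosen patient who died received a higher score than a randomly chosen survivor, with ties counted as one half. A minimal sketch of that rank-based calculation follows; the function name and the toy scores are hypothetical and are not data from the study.

```python
def auc_from_scores(scores_events, scores_non_events):
    """Area under the ROC curve as the fraction of (event, non-event)
    pairs in which the event has the higher score; ties count one half."""
    wins = 0.0
    for s_event in scores_events:
        for s_other in scores_non_events:
            if s_event > s_other:
                wins += 1.0
            elif s_event == s_other:
                wins += 0.5
    return wins / (len(scores_events) * len(scores_non_events))


# Hypothetical model scores, for illustration only.
died = [0.9, 0.8, 0.4]
survived = [0.7, 0.3, 0.2, 0.1]
print(auc_from_scores(died, survived))
```

    The double loop is quadratic in the number of patients, which is acceptable for a cohort of this size; a sort-based rank sum would give the same value for larger data sets.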

  9. Multivariate flood risk assessment: reinsurance perspective

    NASA Astrophysics Data System (ADS)

    Ghizzoni, Tatiana; Ellenrieder, Tobias

    2013-04-01

    For insurance and re-insurance purposes the knowledge of the spatial characteristics of fluvial flooding is fundamental. The probability of simultaneous flooding at different locations during one event and the associated severity and losses have to be estimated in order to assess premiums and for accumulation control (Probable Maximum Losses calculation). Therefore, the identification of a statistical model able to describe the multivariate joint distribution of flood events at multiple locations is necessary. In this context, copulas can be viewed as alternative tools for dealing with multivariate simulations, as they allow the dependence structures of random vectors to be formalized. An application of copula functions for flood scenario generation is presented for Australia (Queensland, New South Wales and Victoria) where 100.000 possible flood scenarios covering approximately 15.000 years were simulated.
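
    To illustrate how a copula separates the dependence structure from the marginal distributions, the sketch below draws joint flood magnitudes at two locations. It assumes a Gaussian copula and Gumbel margins purely for illustration; the abstract does not state which copula family or margins were used, and the function names and parameter values are hypothetical.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gumbel_quantile(u, loc, scale):
    """Inverse CDF of the Gumbel (maximum) distribution."""
    return loc - scale * math.log(-math.log(u))


def simulate_events(n, rho, margins, seed=0):
    """Draw n joint events for two sites with a Gaussian copula.

    rho     -- correlation of the underlying normal variables
    margins -- [(loc, scale), (loc, scale)] Gumbel parameters per site
    """
    rng = random.Random(seed)
    events = []
    for _ in range(n):
        # Step 1: correlated standard normal pair.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        row = []
        for z, (loc, scale) in zip((z1, z2), margins):
            # Step 2: map to a uniform value (the copula scale),
            # clamped away from 0 and 1 to keep the logarithms finite.
            u = min(max(normal_cdf(z), 1e-12), 1.0 - 1e-12)
            # Step 3: apply the site's own marginal distribution.
            row.append(gumbel_quantile(u, loc, scale))
        events.append(row)
    return events


# Hypothetical parameters: two gauges with different flood magnitudes.
scenarios = simulate_events(1000, 0.8, [(100.0, 30.0), (250.0, 60.0)])
print(scenarios[:3])
```

    Changing `rho` alters how often the two sites flood together without changing either site's own distribution, which is the property that makes copulas convenient for accumulation control.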

  10. Comparative multivariate analyses of transient otoacoustic emissions and distorsion products in normal and impaired hearing.

    PubMed

    Stamate, Mirela Cristina; Todor, Nicolae; Cosgarea, Marcel

    2015-01-01

    The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has been long studied. Both transient otoacoustic emissions and distorsion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine transient otoacoustic emissions and distorsion products performance in identifying normal and impaired hearing loss, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, otoacoustic emission tests. We chose to use the logistic regression as a multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristics curve analysis. We used the logistic score to generate receivers operating curves and to estimate the areas under the curves in order to compare different multivariate analyses. We compared the performance of each otoacoustic emission (transient, distorsion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve proving the performance of the otoacoustic emissions. Each otoacoustic emission test presented high

  11. Comparative multivariate analyses of transient otoacoustic emissions and distorsion products in normal and impaired hearing

    PubMed Central

    STAMATE, MIRELA CRISTINA; TODOR, NICOLAE; COSGAREA, MARCEL

    2015-01-01

    Background and aim: The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has been long studied. Both transient otoacoustic emissions and distorsion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine transient otoacoustic emissions and distorsion products performance in identifying normal and impaired hearing loss, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. Methods: The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, otoacoustic emission tests. We chose to use the logistic regression as a multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristics curve analysis. We used the logistic score to generate receivers operating curves and to estimate the areas under the curves in order to compare different multivariate analyses. Results: We compared the performance of each otoacoustic emission (transient, distorsion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve proving the performance of the otoacoustic emissions. Each

  12. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    PubMed

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.

  13. Regression equations for estimating flood flows for the 2-, 10-, 25-, 50-, 100-, and 500-Year recurrence intervals in Connecticut

    USGS Publications Warehouse

    Ahearn, Elizabeth A.

    2004-01-01

    Multiple linear-regression equations were developed to estimate the magnitudes of floods in Connecticut for recurrence intervals ranging from 2 to 500 years. The equations can be used for nonurban, unregulated stream sites in Connecticut with drainage areas ranging from about 2 to 715 square miles. Flood-frequency data and hydrologic characteristics from 70 streamflow-gaging stations and the upstream drainage basins were used to develop the equations. The hydrologic characteristics (drainage area, mean basin elevation, and 24-hour rainfall) are used in the equations to estimate the magnitude of floods. Average standard errors of prediction for the equations are 31.8, 32.7, 34.4, 35.9, 37.6 and 45.0 percent for the 2-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals, respectively. Simplified equations using only one hydrologic characteristic (drainage area) also were developed. The regression analysis is based on generalized least-squares regression techniques. Observed flows (log-Pearson Type III analysis of the annual maximum flows) from five streamflow-gaging stations in urban basins in Connecticut were compared to flows estimated from national three-parameter and seven-parameter urban regression equations. The comparison shows that the three- and seven-parameter equations used in conjunction with the new statewide equations generally provide reasonable estimates of flood flows for urban sites in Connecticut, although a national urban flood-frequency study indicated that the three-parameter equations significantly underestimated flood flows in many regions of the country. Verification of the accuracy of the three-parameter or seven-parameter national regression equations using new data from Connecticut stations was beyond the scope of this study. A technique for calculating flood flows at streamflow-gaging stations using a weighted average also is described. Two estimates of flood flows, one estimate based on the log-Pearson Type III analyses of the annual

  14. MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.

    PubMed

    Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin

    2015-04-01

    Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with incorrectly specified genotype-phenotype models. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  15. Multivariate analysis of fMRI time series: classification and regression of brain responses using machine learning.

    PubMed

    Formisano, Elia; De Martino, Federico; Valente, Giancarlo

    2008-09-01

    Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.

  16. Multi-Target Regression via Robust Low-Rank Learning.

    PubMed

    Zhen, Xiantong; Yu, Mengyang; He, Xiaofei; Li, Shuo

    2018-02-01

    Multi-target regression has recently regained great popularity due to its capability of simultaneously learning multiple relevant regression tasks and its wide applications in data mining, computer vision and medical image analysis, while great challenges arise from jointly handling inter-target correlations and input-output relationships. In this paper, we propose Multi-layer Multi-target Regression (MMR) which enables simultaneously modeling intrinsic inter-target correlations and nonlinear input-output relationships in a general framework via robust low-rank learning. Specifically, the MMR can explicitly encode inter-target correlations in a structure matrix by matrix elastic nets (MEN); the MMR can work in conjunction with the kernel trick to effectively disentangle highly complex nonlinear input-output relationships; the MMR can be efficiently solved by a new alternating optimization algorithm with guaranteed convergence. The MMR leverages the strength of kernel methods for nonlinear feature learning and the structural advantage of multi-layer learning architectures for inter-target correlation modeling. More importantly, it offers a new multi-layer learning paradigm for multi-target regression which is endowed with high generality, flexibility and expressive ability. Extensive experimental evaluation on 18 diverse real-world datasets demonstrates that our MMR can achieve consistently high performance and outperforms representative state-of-the-art algorithms, which shows its great effectiveness and generality for multivariate prediction.

  17. Sparse partial least squares regression for simultaneous dimension reduction and variable selection

    PubMed Central

    Chun, Hyonho; Keleş, Sündüz

    2010-01-01

    Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data. PMID:20107611

  18. Multivariate classification of small order watersheds in the Quabbin Reservoir Basin, Massachusetts

    USGS Publications Warehouse

    Lent, R.M.; Waldron, M.C.; Rader, J.C.

    1998-01-01

    A multivariate approach was used to analyze hydrologic, geologic, geographic, and water-chemistry data from small order watersheds in the Quabbin Reservoir Basin in central Massachusetts. Eighty three small order watersheds were delineated and landscape attributes defining hydrologic, geologic, and geographic features of the watersheds were compiled from geographic information system data layers. Principal components analysis was used to evaluate 11 chemical constituents collected bi-weekly for 1 year at 15 surface-water stations in order to subdivide the basin into subbasins comprised of watersheds with similar water quality characteristics. Three principal components accounted for about 90 percent of the variance in water chemistry data. The principal components were defined as a biogeochemical variable related to wetland density, an acid-neutralization variable, and a road-salt variable related to density of primary roads. Three subbasins were identified. Analysis of variance and multiple comparisons of means were used to identify significant differences in stream water chemistry and landscape attributes among subbasins. All stream water constituents were significantly different among subbasins. Multiple regression techniques were used to relate stream water chemistry to landscape attributes. Important differences in landscape attributes were related to wetlands, slope, and soil type.

  19. [Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].

    PubMed

    Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao

    2016-03-01

    Leaf area index (LAI) is a dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively, providing a reference for monitoring tree growth and estimating yield. The study objects were Red Fuji apple trees in the full fruit-bearing period. Canopy spectral reflectance and LAI values of ninety apple trees were measured with an ASD Fieldspec3 spectrometer and an LAI-2200 in thirty orchards over two consecutive years in the Qixia research area of Shandong Province. The optimal vegetation indices were selected by correlation analysis of the original spectral reflectance and vegetation indices. Models for predicting LAI were built with the multivariate regression methods of support vector machine (SVM) and random forest (RF). The new vegetation indices GNDVI527, ND-VI676, RVI682, FD-NVI656 and GRVI517 and the two previously established vegetation indices NDVI670 and NDVI705 are in accordance with LAI. In the RF regression model, the calibration set coefficient of determination C-R2 of 0.920 and the validation set coefficient of determination V-R2 of 0.889 are higher than those of the SVM regression model by 0.045 and 0.033, respectively. The calibration set root mean square error C-RMSE of 0.249 and the validation set root mean square error V-RMSE of 0.236 are lower than those of the SVM regression model by 0.054 and 0.058, respectively. The relative analysis error of the calibration set, C-RPD, and of the validation set, V-RPD, reached 3.363 and 2.520, higher than those of the SVM regression model by 0.598 and 0.262, respectively. The slopes of the trend lines in the measured-versus-predicted scatterplots for the calibration and validation sets, C-S and V-S, are close to 1. The estimation result of the RF regression model is better than that of the SVM model. The RF regression model can be used to estimate the LAI of Red Fuji apple trees in the full fruit period.
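
    The abstract above compares models by R2, RMSE and RPD without giving formulas. A minimal sketch of one common set of definitions follows, assuming R2 is computed as 1 - SS_res/SS_tot and RPD as the sample standard deviation of the measured values divided by the RMSE; the paper may define them differently, and the LAI values are hypothetical.

```python
import math


def calibration_metrics(measured, predicted):
    """Return R2, RMSE and RPD for paired measured/predicted values.

    Assumed definitions (not confirmed by the abstract):
      R2   = 1 - SS_res / SS_tot
      RMSE = sqrt(SS_res / n)
      RPD  = sample standard deviation of measured values / RMSE
    """
    n = len(measured)
    if n < 2 or n != len(predicted):
        raise ValueError("need two equally long sequences with n >= 2")
    mean_measured = sum(measured) / n
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(ss_tot / (n - 1))
    return {"r2": 1.0 - ss_res / ss_tot, "rmse": rmse, "rpd": sd / rmse}


# Hypothetical LAI values, for illustration only.
measured_lai = [1.0, 2.0, 3.0, 4.0]
predicted_lai = [1.1, 1.9, 3.2, 3.8]
metrics = calibration_metrics(measured_lai, predicted_lai)
print(metrics)
```

    Under these definitions a higher RPD means the prediction error is small relative to the spread of the measured values.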

  20. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures

    NASA Astrophysics Data System (ADS)

    Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-01

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). The univariate CWT, by contrast, failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the CWT calculations, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed with both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.

  1. Regression and Sentinel Lymph Node Status in Melanoma Progression

    PubMed Central

    Letca, Alina Florentina; Ungureanu, Loredana; Şenilă, Simona Corina; Grigore, Lavinia Elena; Pop, Ştefan; Fechete, Oana; Vesa, Ştefan Cristian

    2018-01-01

    Background The purpose of this study was to assess the role of regression and other clinical and histological features for the prognosis and the progression of cutaneous melanoma. Material/Methods Between 2005 and 2016, 403 patients with melanoma were treated and followed at our Department of Dermatology. Of the 403 patients, 173 patients had cutaneous melanoma and underwent sentinel lymph node (SLN) biopsy and thus were included in this study. Results Histological regression was found in 37 cases of melanoma (21.3%). It was significantly associated with marked and moderate tumor-infiltrating lymphocyte (TIL) and with negative SLN. Progression of the disease occurred in 42 patients (24.2%). On multivariate analysis, we found that a positive lymph node and a Breslow index higher than 2 mm were independent variables associated with disease free survival (DFS). These variables together with a mild TIL were significantly correlated with overall survival (OS). The presence of regression was not associated with DFS or OS. Conclusions We could not demonstrate an association between regression and the outcome of patients with cutaneous melanoma. Tumor thickness greater than 2 mm and a positive SLN were associated with recurrence. Survival was influenced by a Breslow thickness >2 mm, the presence of a mild TIL and a positive SLN status. PMID:29507279

  2. Predicting seasonal influenza transmission using functional regression models with temporal dependence.

    PubMed

    Oviedo de la Fuente, Manuel; Febrero-Bande, Manuel; Muñoz, María Pilar; Domínguez, Àngela

    2018-01-01

    This paper proposes a novel approach that uses meteorological information to predict the incidence of influenza in Galicia (Spain). It extends the Generalized Least Squares (GLS) methods in the multivariate framework to functional regression models with dependent errors. These kinds of models are useful when the recent history of the incidence of influenza is not readily available (for instance, because of delays in communication with health informants) and the prediction must be constructed by correcting the temporal dependence of the residuals and using more accessible variables. A simulation study shows that the GLS estimators render better estimations of the parameters associated with the regression model than the classical models do. They obtain extremely good results from the predictive point of view and are competitive with the classical time series approach for the incidence of influenza. An iterative version of the GLS estimator (called iGLS) was also proposed that can help to model complicated dependence structures. For constructing the model, the distance correlation measure was employed to select relevant information for predicting the influenza rate from a mix of multivariate and functional variables. These kinds of models are extremely useful to health managers in allocating resources in advance to manage influenza epidemics.

  3. Generation of multivariate near shore extreme wave conditions based on an extreme value copula for offshore boundary conditions.

    NASA Astrophysics Data System (ADS)

    Leyssen, Gert; Mercelis, Peter; De Schoesitter, Philippe; Blanckaert, Joris

    2013-04-01

    calculated. For the remaining directions the univariate extreme wind velocity distribution is stratified, each class combined with 5 high water levels. The wave height at the model boundaries was taken into account by a regression with the extreme wind velocity at the offshore location. The regression line and the 95% confidence limits were combined with each class. Eventually the wave period is computed by a new regression with the significant wave height. This way 1103 synthetic events were selected and simulated with the SWAN wave model, and a frequency of occurrence is calculated for each of them. Hence near shore significant wave heights are obtained with corresponding frequencies. The statistical distribution of the near shore wave heights is determined by sorting the model results in descending order and accumulating the corresponding frequencies. This approach allows determination of conditional return periods. For example, for the imposed univariate design return periods of 100 years for significant wave height and 30 years for water level, the joint return period for a simultaneous exceedance of both conditions can be computed as 4000 years. Hence, this methodology allows for a probabilistic design of coastal defense structures.
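
    The sort-and-accumulate step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each simulated event is taken to carry a frequency in events per year, the return period is taken as the reciprocal of the accumulated frequency, and the three events are hypothetical.

```python
def exceedance_table(events):
    """Build an empirical exceedance table from simulated events.

    events: iterable of (wave_height_m, frequency_per_year) pairs.
    Returns rows of (wave_height_m, exceedance_frequency, return_period_years),
    sorted from the highest wave height downwards.
    """
    ordered = sorted(events, key=lambda event: event[0], reverse=True)
    table = []
    cumulative = 0.0
    for height, frequency in ordered:
        # Frequency with which this height or a larger one occurs.
        cumulative += frequency
        table.append((height, cumulative, 1.0 / cumulative))
    return table


# Hypothetical near shore results: (significant wave height in m, frequency per year).
simulated = [(3.0, 0.05), (5.0, 0.01), (4.0, 0.04)]
table = exceedance_table(simulated)
for height, freq, period in table:
    print(f"Hs >= {height:.1f} m: {freq:.3f} per year, return period {period:.0f} years")
```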

  4. Adulteration of Argentinean milk fats with animal fats: Detection by fatty acids analysis and multivariate regression techniques.

    PubMed

    Rebechi, S R; Vélez, M A; Vaira, S; Perotti, M C

    2016-02-01

    The aims of the present study were to test the accuracy of the fatty acid ratios established by the Argentinean Legislation to detect adulterations of milk fat with animal fats and to propose a regression model suitable to evaluate these adulterations. For this purpose, 70 milk fat, 10 tallow and 7 lard fat samples were collected and analyzed by gas chromatography. Data was utilized to simulate arithmetically adulterated milk fat samples at 0%, 2%, 5%, 10% and 15%, for both animal fats. The fatty acids ratios failed to distinguish adulterated milk fats containing less than 15% of tallow or lard. For each adulterant, Multiple Linear Regression (MLR) was applied, and a model was chosen and validated. For that, calibration and validation matrices were constructed employing genuine and adulterated milk fat samples. The models were able to detect adulterations of milk fat at levels greater than 10% for tallow and 5% for lard. Copyright © 2015 Elsevier Ltd. All rights reserved.
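
    The arithmetic simulation of adulterated samples mentioned above can be pictured as a weighted blend of two fatty acid profiles. The sketch below assumes a simple linear mix by mass fraction, which the abstract does not spell out; the fatty acid names and percentages are hypothetical placeholders, not data from the study.

```python
def blend_profile(milk_fat, adulterant, fraction):
    """Linearly mix two fatty acid profiles (values in g per 100 g of fat).

    fraction is the assumed mass fraction of adulterant, between 0 and 1.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return {
        acid: (1.0 - fraction) * milk_fat[acid] + fraction * adulterant[acid]
        for acid in milk_fat
    }


# Hypothetical profiles, for illustration only.
milk_fat = {"C4:0": 3.5, "C16:0": 30.0, "C18:1": 22.0}
tallow = {"C4:0": 0.0, "C16:0": 25.0, "C18:1": 38.0}

# Adulteration levels named in the abstract: 0%, 2%, 5%, 10% and 15%.
levels = [0.0, 0.02, 0.05, 0.10, 0.15]
simulated = {level: blend_profile(milk_fat, tallow, level) for level in levels}
print(simulated[0.10])
```

    Rows generated this way, together with their known adulteration level, could then serve as calibration data for a regression model such as the MLR described in the abstract.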

  5. Hot spots of multivariate extreme anomalies in Earth observations

    NASA Astrophysics Data System (ADS)

    Flach, M.; Sippel, S.; Bodesheim, P.; Brenning, A.; Denzler, J.; Gans, F.; Guanche, Y.; Reichstein, M.; Rodner, E.; Mahecha, M. D.

    2016-12-01

    Anomalies in Earth observations might indicate data quality issues, extremes or the change of underlying processes within a highly multivariate system. Thus, considering the multivariate constellation of variables for extreme detection yields crucial additional information over conventional univariate approaches. We highlight areas in which multivariate extreme anomalies are more likely to occur, i.e. hot spots of extremes in global atmospheric Earth observations that impact the Biosphere. In addition, we present the year of the most unusual multivariate extreme between 2001 and 2013 and show that these coincide with well known high impact extremes. Technically speaking, we account for multivariate extremes by using three sophisticated algorithms adapted from computer science applications, namely an ensemble of the k-nearest neighbours mean distance, a kernel density estimation and an approach based on recurrences. However, the impact of atmosphere extremes on the Biosphere might largely depend on what is considered to be normal, i.e. the shape of the mean seasonal cycle and its inter-annual variability. We identify regions with similar mean seasonality by means of dimensionality reduction in order to estimate in each region both the 'normal' variance and robust thresholds for detecting the extremes. In addition, we account for challenges like heteroscedasticity in Northern latitudes. Apart from hot spot areas, of particular interest are those anomalies in the atmosphere time series that can only be detected by a multivariate approach but not by a simple univariate approach. Such an anomalous constellation of atmosphere variables is of interest if it impacts the Biosphere. The multivariate constellation of such an anomalous part of a time series is shown in one case study indicating that multivariate anomaly detection can provide novel insights into Earth observations.
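
    One of the three scores named above, the k-nearest neighbours mean distance, can be sketched in a few lines. This is a generic brute-force version with Euclidean distance, assumed here for illustration; it does not reproduce the ensemble, the kernel density estimate or the recurrence-based score, and the observations are hypothetical.

```python
import math


def knn_mean_distance(points, k):
    """Anomaly score per point: mean Euclidean distance to its k nearest neighbours.

    Larger scores mark observations that lie far from the bulk of the data.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must be at least 1 and smaller than the number of points")
    scores = []
    for i, point in enumerate(points):
        distances = sorted(
            math.dist(point, other) for j, other in enumerate(points) if j != i
        )
        scores.append(sum(distances[:k]) / k)
    return scores


# Hypothetical two-variable observations; the last one is an obvious outlier.
observations = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_mean_distance(observations, k=2)
most_anomalous = scores.index(max(scores))
print(scores, most_anomalous)
```

    Because the score uses all variables jointly, a point can be flagged even when each of its coordinates looks unremarkable on its own.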

  6. A climate-based multivariate extreme emulator of met-ocean-hydrological events for coastal flooding

    NASA Astrophysics Data System (ADS)

    Camus, Paula; Rueda, Ana; Mendez, Fernando J.; Tomas, Antonio; Del Jesus, Manuel; Losada, Iñigo J.

    2015-04-01

    Atmosphere-ocean general circulation models (AOGCMs) are useful to analyze large-scale climate variability (long-term historical periods, future climate projections). However, applications such as coastal flood modeling require climate information at finer scale. Besides, flooding events depend on multiple climate conditions: waves, surge levels from the open-ocean and river discharge caused by precipitation. Therefore, a multivariate statistical downscaling approach is adopted, both to reproduce relationships between variables and because of its low computational cost. The proposed method can be considered as a hybrid approach which combines a probabilistic weather type downscaling model with a stochastic weather generator component. Predictand distributions are reproduced modeling the relationship with AOGCM predictors based on a physical division in weather types (Camus et al., 2012). The multivariate dependence structure of the predictand (extreme events) is introduced linking the independent marginal distributions of the variables by a probabilistic copula regression (Ben Ayala et al., 2014). This hybrid approach is applied for the downscaling of AOGCM data to daily precipitation and maximum significant wave height and storm-surge in different locations along the Spanish coast. Reanalysis data is used to assess the proposed method. A common predictor for the three variables involved is classified using a regression-guided clustering algorithm. The most appropriate statistical model (general extreme value distribution, Pareto distribution) for daily conditions is fitted. Stochastic simulation of the present climate is performed obtaining the set of hydraulic boundary conditions needed for high resolution coastal flood modeling. References: Camus, P., Menéndez, M., Méndez, F.J., Izaguirre, C., Espejo, A., Cánovas, V., Pérez, J., Rueda, A., Losada, I.J., Medina, R. (2014b). A weather-type statistical downscaling framework for ocean wave climate. Journal of

  7. Calculating the individual probability of successful ocriplasmin treatment in eyes with VMT syndrome: a multivariable prediction model from the EXPORT study.

    PubMed

    Paul, Christoph; Heun, Christine; Müller, Hans-Helge; Hoerauf, Hans; Feltgen, Nicolas; Wachtlin, Joachim; Kaymak, Hakan; Mennel, Stefan; Koss, Michael Janusz; Fauser, Sascha; Maier, Mathias M; Schumann, Ricarda G; Mueller, Simone; Chang, Petrus; Schmitz-Valckenberg, Steffen; Kazerounian, Sara; Szurman, Peter; Lommatzsch, Albrecht; Bertelmann, Thomas

    2017-10-31

    To evaluate predictive factors for the treatment success of ocriplasmin and to use these factors to generate a multivariate model to calculate the individual probability of successful treatment. Data were collected in a retrospective, multicentre cohort study. Patients with vitreomacular traction (VMT) syndrome without a full-thickness macular hole were included if they received an intravitreal injection (IVI) of ocriplasmin. Five factors (age, gender, lens status, presence of epiretinal membrane (ERM) formation and horizontal diameter of VMT) were assessed on their association with VMT resolution. A multivariable logistic regression model was employed to further analyse these factors and calculate the individual probability of successful treatment. 167 eyes of 167 patients were included. Univariate analysis revealed a significant correlation to VMT resolution for all analysed factors: age (years) (OR 0.9208; 95% CI 0.8845 to 0.9586; p<0.0001), gender (male) (OR 0.480; 95% CI 0.241 to 0.957; p=0.0371), lens status (phakic) (OR 2.042; 95% CI 1.054 to 3.958; p=0.0344), ERM formation (present) (OR 0.384; 95% CI 0.179 to 0.821; p=0.0136) and horizontal VMT diameter (µm) (OR 0.99812; 95% CI 0.99684 to 0.99941, p=0.0042). A significant multivariable logistic regression model was established with age and VMT diameter. Known predictive factors for VMT resolution after ocriplasmin IVI were confirmed in our study. We were able to combine them into a formula, ultimately allowing the calculation of an individual probability of treatment success with ocriplasmin in patients with VMT syndrome without FTHM. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
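
    To show how a per-unit odds ratio translates into a change in probability, the sketch below rescales the univariate odds ratio for age reported above (0.9208 per year) to a 10-year difference. It is not the published multivariable formula, whose coefficients are not given in the abstract; the 50% baseline probability is a hypothetical value chosen only for illustration.

```python
def scale_odds_ratio(odds_ratio_per_unit, delta):
    """Odds ratio for a change of `delta` units, given the per-unit odds ratio."""
    return odds_ratio_per_unit ** delta


def shift_probability(baseline_probability, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Univariate odds ratio for age from the abstract (per year of age).
OR_AGE_PER_YEAR = 0.9208

# Hypothetical baseline: a patient with a 50% chance of VMT resolution.
or_ten_years = scale_odds_ratio(OR_AGE_PER_YEAR, 10)
older_patient_probability = shift_probability(0.50, or_ten_years)
print(f"OR for +10 years: {or_ten_years:.3f}")
print(f"Probability for the older patient: {older_patient_probability:.3f}")
```

    A logistic regression model works the same way on the log-odds scale, adding one such term per predictor to an intercept.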

  8. Differentiating regressed melanoma from regressed lichenoid keratosis.

    PubMed

    Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A

    2017-04-01

    Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  9. A Logistic Regression Analysis of Turkey's 15-Year-Olds' Scoring above the OECD Average on the PISA'09 Reading Assessment

    ERIC Educational Resources Information Center

    Kasapoglu, Koray

    2014-01-01

    This study aims to investigate which factors are associated with Turkey's 15-year-olds' scoring above the OECD average (493) on the PISA'09 reading assessment. Collected from a total of 4,996 15-year-old students from Turkey, data were analyzed by logistic regression analysis in order to model the data of students who were split into two: (1)…

  10. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis

    PubMed Central

    Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760

  11. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis.

    PubMed

    Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.

  12. Regression of Moral Reasoning during Medical Education: Combined Design Study to Evaluate the Effect of Clinical Study Years

    PubMed Central

    Hren, Darko; Marušić, Matko; Marušić, Ana

    2011-01-01

    Background Moral reasoning is important for developing medical professionalism but current evidence for the relationship between education and moral reasoning does not clearly apply to medical students. We used a combined study design to test the effect of clinical teaching on moral reasoning. Methods We used the Defining Issues Test-2 as a measure of moral judgment, with 3 general moral schemas: Personal Interest, Maintaining Norms, and Postconventional Schema. The test was applied to 3 consecutive cohorts of second year students in 2002 (n = 207), 2003 (n = 192), and 2004 (n = 139), and to 707 students of all 6 study years in 2004 cross-sectional study. We also tested 298 age-matched controls without university education. Results In the cross-sectional study, there was significant main effect of the study year for Postconventional (F(5,679) = 3.67, P = 0.003) and Personal Interest scores (F(5,679) = 3.38, P = 0.005). There was no effect of the study year for Maintaining Norms scores. 3rd year medical students scored higher on Postconventional schema score than all other study years (p<0.001). There were no statistically significant differences among 3 cohorts of 2nd year medical students, demonstrating the absence of cohort or point-of-measurement effects. Longitudinal study of 3 cohorts demonstrated that students regressed from Postconventional to Maintaining Norms schema-based reasoning after entering the clinical part of the curriculum. Interpretation Our study demonstrated direct causative relationship between the regression in moral reasoning development and clinical teaching during medical curriculum. The reasons may include hierarchical organization of clinical practice, specific nature of moral dilemmas faced by medical students, and hidden medical curriculum. PMID:21479204

  13. Multivariate Formation Pressure Prediction with Seismic-derived Petrophysical Properties from Prestack AVO inversion and Poststack Seismic Motion Inversion

    NASA Astrophysics Data System (ADS)

    Yu, H.; Gu, H.

    2017-12-01

    A novel multivariate seismic formation pressure prediction methodology is presented, which incorporates high-resolution seismic velocity data from prestack AVO inversion, and petrophysical data (porosity and shale volume) derived from poststack seismic motion inversion. In contrast to traditional seismic formation prediction methods, the proposed methodology is based on a multivariate pressure prediction model and utilizes a trace-by-trace multivariate regression analysis on seismic-derived petrophysical properties to calibrate model parameters in order to make accurate predictions with higher resolution in both vertical and lateral directions. With prestack time migration velocity as initial velocity model, an AVO inversion was first applied to prestack dataset to obtain high-resolution seismic velocity with higher frequency that is to be used as the velocity input for seismic pressure prediction, and the density dataset to calculate accurate Overburden Pressure (OBP). Seismic Motion Inversion (SMI) is an inversion technique based on Markov Chain Monte Carlo simulation. Both structural variability and similarity of seismic waveform are used to incorporate well log data to characterize the variability of the property to be obtained. In this research, porosity and shale volume are first interpreted on well logs, and then combined with poststack seismic data using SMI to build porosity and shale volume datasets for seismic pressure prediction. A multivariate effective stress model is used to convert velocity, porosity and shale volume datasets to effective stress. After a thorough study of the regional stratigraphic and sedimentary characteristics, a regional normally compacted interval model is built, and then the coefficients in the multivariate prediction model are determined in a trace-by-trace multivariate regression analysis on the petrophysical data. 
The coefficients are used to convert velocity, porosity and shale volume datasets to effective stress and then

  14. MANCOVA for one way classification with homogeneity of regression coefficient vectors

    NASA Astrophysics Data System (ADS)

    Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.

    2017-11-01

    MANOVA and MANCOVA are the extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector-valued observations. In the statistical models of these techniques, the assumption of a Gaussian distribution is replaced with that of a multivariate Gaussian distribution for the data vectors and residual terms. The objective of MANCOVA is to determine whether statistically reliable mean differences can be demonstrated between groups after adjusting the newly created variable. When random assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, the MANCOVA technique is extended to a larger number of covariates, and the homogeneity of regression coefficient vectors is also tested.

  15. Multivariate Phylogenetic Comparative Methods: Evaluations, Comparisons, and Recommendations.

    PubMed

    Adams, Dean C; Collyer, Michael L

    2018-01-01

    Recent years have seen increased interest in phylogenetic comparative analyses of multivariate data sets, but to date the varied proposed approaches have not been extensively examined. Here we review the mathematical properties required of any multivariate method, and specifically evaluate existing multivariate phylogenetic comparative methods in this context. Phylogenetic comparative methods based on the full multivariate likelihood are robust to levels of covariation among trait dimensions and are insensitive to the orientation of the data set, but display increasing model misspecification as the number of trait dimensions increases. This is because the expected evolutionary covariance matrix (V) used in the likelihood calculations becomes more ill-conditioned as trait dimensionality increases, and as evolutionary models become more complex. Thus, these approaches are only appropriate for data sets with few traits and many species. Methods that summarize patterns across trait dimensions treated separately (e.g., SURFACE) incorrectly assume independence among trait dimensions, resulting in nearly a 100% model misspecification rate. Methods using pairwise composite likelihood are highly sensitive to levels of trait covariation, the orientation of the data set, and the number of trait dimensions. The consequences of these debilitating deficiencies are that a user can arrive at differing statistical conclusions, and therefore biological inferences, simply from a dataspace rotation, like principal component analysis. By contrast, algebraic generalizations of the standard phylogenetic comparative toolkit that use the trace of covariance matrices are insensitive to levels of trait covariation, the number of trait dimensions, and the orientation of the data set. Further, when appropriate permutation tests are used, these approaches display acceptable Type I error and statistical power. 
We conclude that methods summarizing information across trait dimensions, as well as

  16. Multivariate analysis in thoracic research.

    PubMed

    Mengual-Macenlle, Noemí; Marcos, Pedro J; Golpe, Rafael; González-Rivas, Diego

    2015-03-01

    Multivariate analysis is based on the observation and analysis of more than one statistical outcome variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest. Multivariate methods emerged to analyze large databases and increasingly complex data. Since the best way to represent the knowledge of reality is modeling, we should use multivariate statistical methods. Multivariate methods are designed to analyze data sets simultaneously, i.e., to analyze different variables for each person or object studied. Keep in mind at all times that all variables must be treated so that they accurately reflect the reality of the problem addressed. There are different types of multivariate analysis, and each one should be employed according to the type of variables to analyze: dependence, interdependence, and structural methods. In conclusion, multivariate methods are ideal for the analysis of large data sets and for finding cause-and-effect relationships between variables; there is a wide range of analysis types that we can use.

  17. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol lowering drugs

    PubMed Central

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin

    2013-01-01

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
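The Box-Cox transformation that this abstract applies separately to each response can be sketched in a few lines of Python. This is only an illustration of the standard transformation, not the authors' Bayesian model: the function name `box_cox`, the lambda values, and the patient values below are hypothetical and are not taken from the paper.

```python
import math

def box_cox(y: float, lam: float) -> float:
    """Standard Box-Cox transform of a strictly positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        # The limit of (y**lam - 1) / lam as lam -> 0 is log(y).
        return math.log(y)
    return (y ** lam - 1.0) / lam

# Hypothetical per-response transformation parameters (illustrative only).
lambdas = {"LDL-C": 0.5, "HDL-C": 0.0, "TG": -0.25}
# Hypothetical measurements for one patient (illustrative only).
patient = {"LDL-C": 130.0, "HDL-C": 45.0, "TG": 180.0}

# Each response gets its own transformation parameter.
transformed = {name: box_cox(patient[name], lambdas[name]) for name in patient}
print(transformed)
```

In the paper the transformation parameters are estimated from the data with prior distributions; here they are simply fixed to show how one parameter per response would be applied.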

  18. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol-lowering drugs.

    PubMed

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin

    2013-10-15

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.

  19. Diagnostic accuracy of atypical p-ANCA in autoimmune hepatitis using ROC- and multivariate regression analysis.

    PubMed

    Terjung, B; Bogsch, F; Klein, R; Söhne, J; Reichel, C; Wasmuth, J-C; Beuers, U; Sauerbruch, T; Spengler, U

    2004-09-29

    Antineutrophil cytoplasmic antibodies (atypical p-ANCA) are detected at high prevalence in sera from patients with autoimmune hepatitis (AIH), but their diagnostic relevance for AIH has not been systematically evaluated so far. Here, we studied sera from 357 patients with autoimmune (autoimmune hepatitis n=175, primary sclerosing cholangitis (PSC) n=35, primary biliary cirrhosis n=45), non-autoimmune chronic liver disease (alcoholic liver cirrhosis n=62; chronic hepatitis C virus infection (HCV) n=21) or healthy controls (n=19) for the presence of various non-organ specific autoantibodies. Atypical p-ANCA, antinuclear antibodies (ANA), antibodies against smooth muscles (SMA), antibodies against liver/kidney microsomes (anti-Lkm1) and antimitochondrial antibodies (AMA) were detected by indirect immunofluorescence microscopy, antibodies against the M2 antigen (anti-M2), antibodies against soluble liver antigen (anti-SLA/LP) and anti-Lkm1 by using enzyme linked immunosorbent assays. To define the diagnostic precision of the autoantibodies, results of autoantibody testing were analyzed by receiver operating characteristics (ROC) and forward conditional logistic regression analysis. Atypical p-ANCA were detected at high prevalence in sera from patients with AIH (81%) and PSC (94%). ROC- and logistic regression analysis revealed atypical p-ANCA and SMA, but not ANA as significant diagnostic seromarkers for AIH (atypical p-ANCA: AUC 0.754+/-0.026, odds ratio [OR] 3.4; SMA: 0.652+/-0.028, OR 4.1). Atypical p-ANCA also emerged as the only diagnostically relevant seromarker for PSC (AUC 0.690+/-0.04, OR 3.4). None of the tested antibodies yielded a significant diagnostic accuracy for patients with alcoholic liver cirrhosis, HCV or healthy controls. Atypical p-ANCA along with SMA represent a seromarker with high diagnostic accuracy for AIH and should be explicitly considered in a revised version of the diagnostic score for AIH.

  20. [Use of multiple regression models in observational studies (1970-2013) and requirements of the STROBE guidelines in Spanish scientific journals].

    PubMed

    Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M

    In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines of Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of the journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models such as logistic, linear, Cox and Poisson regression are increasingly used in published observational studies, both at the international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Publicado por Elsevier España, S.L.U. All rights reserved.

  1. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures.

    PubMed

    Hegazy, Maha A; Lotfy, Hayam M; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-05

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). The univariate CWT, by contrast, failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and its absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. Coronary artery aneurysm regression after Kawasaki disease and associated risk factors: a 3-year follow-up study in East China.

    PubMed

    Tang, Yunjia; Yan, Wenhua; Sun, Ling; Xu, Qiuqin; Ding, Yueyue; Lv, Haitao

    2018-01-12

    Kawasaki disease (KD) is the leading cause of acquired heart disease due to its complicated coronary artery lesions. Up to now, few studies have focused on the status of persistent coronary artery aneurysms (CAA) in KD patients. The present study was designed to identify the coronary artery outcomes and seek the risk factors associated with the regression of CAA in KD patients. One hundred and twenty KD patients with CAA hospitalized in Children's Hospital of Soochow University from Jan 2008 to Dec 2013 were prospectively studied in a 3-year follow-up. Data regarding demographic, clinical, laboratory, and echocardiographic characteristics were documented and further analyzed. It was estimated that 39.2% of the patients had complete regression of CAA within 4 weeks, 59.2% within 8 weeks, and 70.0% within 16 weeks. No fatal cardiac events occurred. We found that age ≤ 1 year, initial intravenous immunoglobulin (IVIG) treatment after the 10th day of illness, and IVIG non-response were associated with the regression of persistent CAA. The relative risks were 1.55, 1.87, and 1.88, respectively. Age, initial IVIG treatment, and IVIG response were risk factors of persistent CAA, and more attention should be paid to these patients.

  3. [Logistic regression model of noninvasive prediction for portal hypertensive gastropathy in patients with hepatitis B associated cirrhosis].

    PubMed

    Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo

    2015-05-12

    To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. Logistic regression model was established and odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of model were evaluated by the curve of receiver operating characteristic (ROC). According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major occurring factors for PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted Logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
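The regression equation reported in this abstract can be converted into a predicted probability with the inverse logit. The following Python sketch only illustrates that arithmetic. The abstract does not say how X1 to X4 are coded or scaled, so the function name, the argument names, and the input values used below are hypothetical.

```python
import math

# Coefficients as reported in the abstract:
# Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4
INTERCEPT = -2.667
COEFFICIENTS = (2.186, -2.167, 0.725, 0.976)

def phg_probability(x1: float, x2: float, x3: float, x4: float) -> float:
    """Predicted probability of PHG from the reported logit equation.

    x1: hepatic dysfunction, x2: TB, x3: PLT, x4: splenic vein diameter.
    The coding of each variable is an assumption; the abstract does not give it.
    """
    logit = INTERCEPT + sum(b * x for b, x in zip(COEFFICIENTS, (x1, x2, x3, x4)))
    # Inverse logit maps the linear predictor onto the (0, 1) interval.
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical inputs, assuming each variable is coded as a 0/1 indicator.
print(phg_probability(1, 0, 1, 1))
```

With all inputs at zero the prediction depends only on the intercept, which gives a quick check that the equation was transcribed correctly.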

  4. Application of Fluorescence Spectrometry With Multivariate Calibration to the Enantiomeric Recognition of Fluoxetine in Pharmaceutical Preparations.

    PubMed

    Poláček, Roman; Májek, Pavel; Hroboňová, Katarína; Sádecká, Jana

    2016-04-01

    Fluoxetine is the most prescribed antidepressant chiral drug worldwide. Its enantiomers have a different duration of serotonin inhibition. A novel simple and rapid method for determination of the enantiomeric composition of fluoxetine in pharmaceutical pills is presented. Specifically, emission, excitation, and synchronous fluorescence techniques were employed to obtain the spectral data, which with multivariate calibration methods, namely, principal component regression (PCR) and partial least square (PLS), were investigated. The chiral recognition of fluoxetine enantiomers in the presence of β-cyclodextrin was based on diastereomeric complexes. The results of the multivariate calibration modeling indicated good prediction abilities. The obtained results for tablets were compared with those from chiral HPLC and no significant differences are shown by Fisher's (F) test and Student's t-test. The smallest residuals between reference or nominal values and predicted values were achieved by multivariate calibration of synchronous fluorescence spectral data. This conclusion is supported by calculated values of the figure of merit.

  5. Site-specific estimation of peak-streamflow frequency using generalized least-squares regression for natural basins in Texas

    USGS Publications Warehouse

    Asquith, William H.; Slade, R.M.

    1999-01-01

    The U.S. Geological Survey, in cooperation with the Texas Department of Transportation, has developed a computer program to estimate peak-streamflow frequency for ungaged sites in natural basins in Texas. Peak-streamflow frequency refers to the peak streamflows for recurrence intervals of 2, 5, 10, 25, 50, and 100 years. Peak-streamflow frequency estimates are needed by planners, managers, and design engineers for flood-plain management; for objective assessment of flood risk; for cost-effective design of roads and bridges; and also for the design of culverts, dams, levees, and other flood-control structures. The program estimates peak-streamflow frequency using a site-specific approach and a multivariate generalized least-squares linear regression. A site-specific approach differs from a traditional regional regression approach by developing unique equations to estimate peak-streamflow frequency specifically for the ungaged site. The stations included in the regression are selected using an informal cluster analysis that compares the basin characteristics of the ungaged site to the basin characteristics of all the stations in the data base. The program provides several choices for selecting the stations. Selecting the stations using cluster analysis ensures that the stations included in the regression will have the most pertinent information about flooding characteristics of the ungaged site and therefore provide the basis for potentially improved peak-streamflow frequency estimation. An evaluation of the site-specific approach in estimating peak-streamflow frequency for gaged sites indicates that the site-specific approach is at least as accurate as a traditional regional regression approach.

  6. Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes

    PubMed Central

    2013-01-01

    Motivation Multivariate quantitative traits arise naturally in recent neuroimaging genetics studies, in which both structural and functional variability of the human brain is measured non-invasively through techniques such as magnetic resonance imaging (MRI). There is growing interest in detecting genetic variants associated with such multivariate traits, especially in genome-wide studies. Random forests (RFs) classifiers, which are ensembles of decision trees, are amongst the best performing machine learning algorithms and have been successfully employed for the prioritisation of genetic variants in case-control studies. RFs can also be applied to produce gene rankings in association studies with multivariate quantitative traits, and to estimate genetic similarities measures that are predictive of the trait. However, in studies involving hundreds of thousands of SNPs and high-dimensional traits, a very large ensemble of trees must be inferred from the data in order to obtain reliable rankings, which makes the application of these algorithms computationally prohibitive. Results We have developed a parallel version of the RF algorithm for regression and genetic similarity learning tasks in large-scale population genetic association studies involving multivariate traits, called PaRFR (Parallel Random Forest Regression). Our implementation takes advantage of the MapReduce programming model and is deployed on Hadoop, an open-source software framework that supports data-intensive distributed applications. Notable speed-ups are obtained by introducing a distance-based criterion for node splitting in the tree estimation process. PaRFR has been applied to a genome-wide association study on Alzheimer's disease (AD) in which the quantitative trait consists of a high-dimensional neuroimaging phenotype describing longitudinal changes in the human brain structure. PaRFR provides a ranking of SNPs associated to this trait, and produces pair-wise measures of genetic proximity

  7. Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes.

    PubMed

    Wang, Yue; Goh, Wilson; Wong, Limsoon; Montana, Giovanni

    2013-01-01

    Multivariate quantitative traits arise naturally in recent neuroimaging genetics studies, in which both structural and functional variability of the human brain is measured non-invasively through techniques such as magnetic resonance imaging (MRI). There is growing interest in detecting genetic variants associated with such multivariate traits, especially in genome-wide studies. Random forests (RFs) classifiers, which are ensembles of decision trees, are amongst the best performing machine learning algorithms and have been successfully employed for the prioritisation of genetic variants in case-control studies. RFs can also be applied to produce gene rankings in association studies with multivariate quantitative traits, and to estimate genetic similarities measures that are predictive of the trait. However, in studies involving hundreds of thousands of SNPs and high-dimensional traits, a very large ensemble of trees must be inferred from the data in order to obtain reliable rankings, which makes the application of these algorithms computationally prohibitive. We have developed a parallel version of the RF algorithm for regression and genetic similarity learning tasks in large-scale population genetic association studies involving multivariate traits, called PaRFR (Parallel Random Forest Regression). Our implementation takes advantage of the MapReduce programming model and is deployed on Hadoop, an open-source software framework that supports data-intensive distributed applications. Notable speed-ups are obtained by introducing a distance-based criterion for node splitting in the tree estimation process. PaRFR has been applied to a genome-wide association study on Alzheimer's disease (AD) in which the quantitative trait consists of a high-dimensional neuroimaging phenotype describing longitudinal changes in the human brain structure. 
PaRFR provides a ranking of SNPs associated to this trait, and produces pair-wise measures of genetic proximity that can be directly

  8. Power and sample size for multivariate logistic modeling of unmatched case-control studies.

    PubMed

    Gail, Mitchell H; Haneuse, Sebastien

    2017-01-01

    Sample size calculations are needed to design and assess the feasibility of case-control studies. Although such calculations are readily available for simple case-control designs and univariate analyses, there is limited theory and software for multivariate unconditional logistic analysis of case-control data. Here we outline the theory needed to detect scalar exposure effects or scalar interactions while controlling for other covariates in logistic regression. Both analytical and simulation methods are presented, together with links to the corresponding software.

  9. Work stress, sleep deficiency and predicted 10-year cardiometabolic risk in a female patient care worker population

    PubMed Central

    Jacobsen, Henrik Børsting; Reme, Silje Endresen; Sembajwe, Grace; Hopcia, Karen; Stiles, Tore C.; Sorensen, Glorian; Porter, James H.; Marino, Miguel; Buxton, Orfeu M.

    2014-01-01

    Objectives The aim of this study was to investigate the longitudinal effect of work-related stress, sleep deficiency and physical activity on 10-year cardiometabolic risk among an all-female worker population. Methods Data on patient care workers (n=99) was collected two years apart. Baseline measures included: job stress, physical activity, night work and sleep deficiency. Biomarkers and objective measurements were used to estimate 10-year cardiometabolic risk at follow-up. Significant associations (P<0.05) from baseline analyses were used to build a multivariable linear regression model. Results The participants were mostly white nurses with a mean age of 41 years. Adjusted linear regression showed that having sleep maintenance problems, a different occupation than nurse, and/or not exercising at recommended levels at baseline increased the 10-year cardiometabolic risk at follow-up. Conclusions In female workers prone to work-related stress and sleep deficiency, maintaining sleep and exercise patterns had a strong impact on modifiable 10-year cardiometabolic risk. PMID:24809311

  10. Work stress, sleep deficiency, and predicted 10-year cardiometabolic risk in a female patient care worker population.

    PubMed

    Jacobsen, Henrik B; Reme, Silje E; Sembajwe, Grace; Hopcia, Karen; Stiles, Tore C; Sorensen, Glorian; Porter, James H; Marino, Miguel; Buxton, Orfeu M

    2014-08-01

    The aim of this study was to investigate the longitudinal effect of work-related stress, sleep deficiency, and physical activity on 10-year cardiometabolic risk among an all-female worker population. Data on patient care workers (n=99) was collected 2 years apart. Baseline measures included: job stress, physical activity, night work, and sleep deficiency. Biomarkers and objective measurements were used to estimate 10-year cardiometabolic risk at follow-up. Significant associations (P<0.05) from baseline analyses were used to build a multivariable linear regression model. The participants were mostly white nurses with a mean age of 41 years. Adjusted linear regression showed that having sleep maintenance problems, a different occupation than nurse, and/or not exercising at recommended levels at baseline increased the 10-year cardiometabolic risk at follow-up. In female workers prone to work-related stress and sleep deficiency, maintaining sleep and exercise patterns had a strong impact on modifiable 10-year cardiometabolic risk. © 2014 Wiley Periodicals, Inc.

  11. Low-flow, base-flow, and mean-flow regression equations for Pennsylvania streams

    USGS Publications Warehouse

    Stuckey, Marla H.

    2006-01-01

    Low-flow, base-flow, and mean-flow characteristics are an important part of assessing water resources in a watershed. These streamflow characteristics can be used by watershed planners and regulators to determine water availability, water-use allocations, assimilative capacities of streams, and aquatic-habitat needs. Streamflow characteristics are commonly predicted by use of regression equations when a nearby streamflow-gaging station is not available. Regression equations for predicting low-flow, base-flow, and mean-flow characteristics for Pennsylvania streams were developed from data collected at 293 continuous- and partial-record streamflow-gaging stations with flow unaffected by upstream regulation, diversion, or mining. Continuous-record stations used in the regression analysis had 9 years or more of data, and partial-record stations used had seven or more measurements collected during base-flow conditions. The state was divided into five low-flow regions and regional regression equations were developed for the 7-day, 10-year; 7-day, 2-year; 30-day, 10-year; 30-day, 2-year; and 90-day, 10-year low flows using generalized least-squares regression. Statewide regression equations were developed for the 10-year, 25-year, and 50-year base flows using generalized least-squares regression. Statewide regression equations were developed for harmonic mean and mean annual flow using weighted least-squares regression. Basin characteristics found to be significant explanatory variables at the 95-percent confidence level for one or more regression equations were drainage area, basin slope, thickness of soil, stream density, mean annual precipitation, mean elevation, and the percentage of glaciation, carbonate bedrock, forested area, and urban area within a basin. Standard errors of prediction ranged from 33 to 66 percent for the n-day, T-year low flows; 21 to 23 percent for the base flows; and 12 to 38 percent for the mean annual flow and harmonic mean, respectively. 

  12. Drought forecasting in eastern Australia using multivariate adaptive regression spline, least square support vector machine and M5Tree model

    NASA Astrophysics Data System (ADS)

    Deo, Ravinesh C.; Kisi, Ozgur; Singh, Vijay P.

    2017-02-01

    Drought forecasting using standardized metrics of rainfall is a core task in hydrology and water resources management. The Standardized Precipitation Index (SPI) is a rainfall-based metric that caters for the different time-scales at which drought occurs and, owing to its standardization, is well suited to forecasting drought at different periods in climatically diverse regions. This study advances drought modelling using multivariate adaptive regression splines (MARS), least square support vector machine (LSSVM), and M5Tree models by forecasting SPI in eastern Australia. The MARS model incorporated rainfall as a mandatory predictor, with month (periodicity), Southern Oscillation Index, Pacific Decadal Oscillation Index and Indian Ocean Dipole, ENSO Modoki and Nino 3.0, 3.4 and 4.0 data added gradually. The performance was evaluated with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (r2). The best MARS model required different input combinations, where rainfall, sea surface temperature and periodicity were used for all stations, but ENSO Modoki and Pacific Decadal Oscillation indices were not required for Bathurst, Collarenebri and Yamba, and the Southern Oscillation Index was not required for Collarenebri. Inclusion of periodicity increased the r2 value by 0.5-8.1% and reduced RMSE by 3.0-178.5%. Comparisons showed that MARS outperformed its counterparts for three out of five stations with lower MAE by 15.0-73.9% and 7.3-42.2%, respectively. For the other stations, M5Tree was better than MARS/LSSVM with lower MAE by 13.8-13.4% and 25.7-52.2%, respectively, and for Bathurst, LSSVM yielded a more accurate result. For droughts identified by SPI ≤ -0.5, accurate forecasts were attained by MARS/M5Tree for Bathurst, Yamba and Peak Hill, whereas for Collarenebri and Barraba, M5Tree was better than LSSVM/MARS. Seasonal analysis revealed disparate results where MARS/M5Tree was better than LSSVM. 
The results highlight the
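
    The three evaluation metrics named in this abstract (RMSE, MAE and r2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the sample values are hypothetical, and r2 is assumed here to be the squared Pearson correlation between observed and forecast values.

```python
import math

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Squared Pearson correlation (assumed definition of r2 for this sketch)."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)

# Hypothetical observed and forecast SPI values (illustration only).
observed = [-1.2, -0.4, 0.3, 0.9, 1.5]
forecast = [-1.0, -0.6, 0.1, 1.1, 1.3]
print(rmse(observed, forecast), mae(observed, forecast), r_squared(observed, forecast))
```

    RMSE penalizes large errors more heavily than MAE, which is one reason studies of this kind tend to report both.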

  13. Multivariate multiscale entropy of financial markets

    NASA Astrophysics Data System (ADS)

    Lu, Yunfan; Wang, Jun

    2017-11-01

    In quantifying the dynamical properties of complex phenomena in financial market systems, multivariate financial time series are of wide interest. In this work, considering the shortcomings and limitations of univariate multiscale entropy in analyzing multivariate time series, the multivariate multiscale sample entropy (MMSE), which can evaluate the complexity in multiple data channels over different timescales, is applied to quantify the complexity of financial markets. Its effectiveness and advantages are verified by numerical simulations with two well-known synthetic noise signals. For the first time, the complexity of four generated trivariate return series for each stock trading hour in China stock markets is quantified thanks to the interdisciplinary application of this method. We find that the complexity of the trivariate return series in each hour shows a significant decreasing trend as the stock trading time progresses. Further, the shuffled multivariate return series and the absolute multivariate return series are also analyzed. As another new attempt, the complexity of global stock markets (Asia, Europe and America) is quantified by analyzing their multivariate returns. Finally, we utilize the multivariate multiscale entropy to assess the relative complexity of normalized multivariate return volatility series with different degrees.

  14. [Predicting the probability of development and progression of primary open angle glaucoma by regression modeling].

    PubMed

    Likhvantseva, V G; Sokolov, V A; Levanova, O N; Kovelenova, I V

    2018-01-01

    Prediction of the clinical course of primary open-angle glaucoma (POAG) is one of the main directions in solving the problem of vision loss prevention and stabilization of the pathological process. Simple statistical methods of correlation analysis show the extent of each risk factor's impact, but do not indicate the total impact of these factors in personalized combinations. The relationships between the risk factors are subject to correlation and regression analysis. The regression equation represents the dependence of the mathematical expectation of the resulting sign on the combination of factor signs. The aim was to develop a technique for predicting the probability of development and progression of primary open-angle glaucoma based on a personalized combination of risk factors by linear multivariate regression analysis. The study included 66 patients (23 female and 43 male; 132 eyes) with newly diagnosed primary open-angle glaucoma. The control group consisted of 14 patients (8 male and 6 female). Standard ophthalmic examination was supplemented with biochemical study of lacrimal fluid. Concentration of matrix metalloproteinase MMP-2 and MMP-9 in tear fluid in both eyes was determined using 'sandwich' enzyme-linked immunosorbent assay (ELISA) method. The study resulted in the development of regression equations and step-by-step multivariate logistic models that can help calculate the risk of development and progression of POAG. Those models are based on expert evaluation of clinical and instrumental indicators of hydrodynamic disturbances (coefficient of outflow ease - C, volume of intraocular fluid secretion - F, fluctuation of intraocular pressure), as well as personalized morphometric parameters of the retina (central retinal thickness in the macular area) and concentration of MMP-2 and MMP-9 in the tear film. The newly developed regression equations are highly informative and can be a reliable tool for studying of the influence vector and assessment of pathogenic

  15. Adjustment of geochemical background by robust multivariate statistics

    USGS Publications Warehouse

    Zhou, D.

    1985-01-01

    Conventional analyses of exploration geochemical data assume that the background is a constant or slowly changing value, equivalent to a plane or a smoothly curved surface. However, it is better to regard the geochemical background as a rugged surface, varying with changes in geology and environment. This rugged surface can be estimated from observed geological, geochemical and environmental properties by using multivariate statistics. A method of background adjustment was developed and applied to groundwater and stream sediment reconnaissance data collected from the Hot Springs Quadrangle, South Dakota, as part of the National Uranium Resource Evaluation (NURE) program. Source-rock lithology appears to be a dominant factor controlling the chemical composition of groundwater or stream sediments. The most efficacious adjustment procedure is to regress uranium concentration on selected geochemical and environmental variables for each lithologic unit, and then to delineate anomalies by a common threshold set as a multiple of the standard deviation of the combined residuals. Robust versions of regression and RQ-mode principal components analysis techniques were used rather than ordinary techniques to guard against distortion caused by outliers. Anomalies delineated by this background adjustment procedure correspond with uranium prospects much better than do anomalies delineated by conventional procedures. The procedure should be applicable to geochemical exploration at different scales for other metals. © 1985.
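
    The adjustment procedure described in this abstract (regress concentration on a covariate within each lithologic unit, pool the residuals, and flag samples above a common multiple of their standard deviation) can be sketched as follows. This is an illustration under stated assumptions: it uses ordinary least squares with a single covariate rather than the robust multivariate regression the abstract reports, and the unit names, data values and the multiplier k are hypothetical.

```python
import statistics

def ols_fit(xs, ys):
    """Ordinary least-squares intercept and slope for one covariate."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def flag_anomalies(samples, k=2.0):
    """samples: list of (unit, covariate, concentration) tuples.

    Fits a separate regression per unit, pools all residuals, and returns
    the samples whose residual exceeds k standard deviations of the pool.
    """
    by_unit = {}
    for unit, x, y in samples:
        by_unit.setdefault(unit, []).append((x, y))
    residuals = []
    for unit, points in by_unit.items():
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        intercept, slope = ols_fit(xs, ys)
        for x, y in points:
            residuals.append((unit, x, y, y - (intercept + slope * x)))
    threshold = k * statistics.pstdev([r[3] for r in residuals])
    return [r[:3] for r in residuals if r[3] > threshold]

# Hypothetical data: two lithologic units with different backgrounds.
samples = [("shale", 1, 2), ("shale", 2, 4), ("shale", 3, 6), ("shale", 4, 8),
           ("shale", 5, 10), ("shale", 3, 16),
           ("sandstone", 1, 10), ("sandstone", 2, 11), ("sandstone", 3, 12),
           ("sandstone", 4, 13), ("sandstone", 5, 14)]
print(flag_anomalies(samples))
```

    Because each unit gets its own fitted background, a value that is ordinary for one lithology can still be flagged in another. With real data a robust fit would be preferable, since a single outlier shifts the ordinary least-squares line toward itself.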

  16. Modified Regression Correlation Coefficient for Poisson Regression Model

    NASA Astrophysics Data System (ADS)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study examines indicators of the predictive power of the Generalized Linear Model (GLM), which are widely used but often subject to restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables with multicollinearity among them. The results show that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
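
    As a sketch of the traditional regression correlation coefficient described above, the code below fits a one-predictor Poisson regression by Newton-Raphson and then takes the Pearson correlation between Y and the fitted E(Y|X). The abstract does not specify the proposed modification, so it is not shown. The function names and the data are hypothetical, and a single predictor is assumed for brevity.

```python
import math

def fit_poisson(xs, ys, iters=50):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(ys, mu))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mu))
        # Fisher information matrix entries.
        h00 = sum(mu)
        h01 = sum(m * x for x, m in zip(xs, mu))
        h11 = sum(m * x * x for x, m in zip(xs, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))

def regression_correlation(xs, ys):
    """Traditional coefficient: corr(Y, E(Y|X)) under the fitted Poisson model."""
    b0, b1 = fit_poisson(xs, ys)
    fitted = [math.exp(b0 + b1 * x) for x in xs]
    return pearson(ys, fitted)

# Hypothetical count data (illustration only).
x_demo = [0, 1, 2, 3, 4, 5]
y_demo = [1, 2, 2, 4, 6, 9]
print(regression_correlation(x_demo, y_demo))
```

    Values near 1 indicate that the fitted means track the observed counts closely. With several correlated predictors the same idea applies, but the fit needs a general matrix solver rather than the explicit 2x2 step used here.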

  17. Simultaneous determination of rifampicin, isoniazid and pyrazinamide in tablet preparations by multivariate spectrophotometric calibration.

    PubMed

    Goicoechea, H C; Olivieri, A C

    1999-08-01

    The use of multivariate spectrophotometric calibration is presented for the simultaneous determination of the active components of tablets used in the treatment of pulmonary tuberculosis. The resolution of ternary mixtures of rifampicin, isoniazid and pyrazinamide has been accomplished by using partial least squares (PLS-1) regression analysis. Although the components show an important degree of spectral overlap, they have been simultaneously determined with high accuracy and precision, rapidly and with no need of nonaqueous solvents for dissolving the samples. No interference has been observed from the tablet excipients. A comparison is presented with the related multivariate method of classical least squares (CLS) analysis, which is shown to yield less reliable results due to the severe spectral overlap among the studied compounds. This is highlighted in the case of isoniazid, due to the small absorbances measured for this component.

  18. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles
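
    The two-stage idea in this abstract (train a logistic classifier on the ground, then evaluate it cheaply in real time) can be sketched as below. This is not the flight software or the authors' method: the single normalized feature, the training data, the learning rate and the 0.5 decision threshold are all hypothetical, and plain gradient descent stands in for whatever fitting procedure was actually used.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Ground-side step: fit weight and bias by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def should_trigger(x, w, b, threshold=0.5):
    """Onboard step: one multiply-add and a comparison per evaluation."""
    return sigmoid(w * x + b) >= threshold

# Hypothetical noisy training data: feature value vs. "deploy" label.
features = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
w, b = fit_logistic(features, labels)
print(should_trigger(2.0, w, b), should_trigger(-2.0, w, b))
```

    The expensive part (fitting) happens before flight, so the real-time cost is limited to evaluating the fitted function, which matches the division of labor the abstract describes.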

  19. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    PubMed

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  20. Multivariate meta-analysis: potential and promise.

    PubMed

    Jackson, Dan; Riley, Richard; White, Ian R

    2011-09-10

    The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day 'Multivariate meta-analysis' event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd.

  1. Multivariate meta-analysis: Potential and promise

    PubMed Central

    Jackson, Dan; Riley, Richard; White, Ian R

    2011-01-01

    The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052

  2. Hospital charges associated with motorcycle crash factors: a quantile regression analysis.

    PubMed

    Olsen, Cody S; Thomas, Andrea M; Cook, Lawrence J

    2014-08-01

    Previous studies of motorcycle crash (MC) related hospital charges use trauma registries and hospital records, and do not adjust for the number of motorcyclists not requiring medical attention. This may lead to conservative estimates of helmet use effectiveness. MC records were probabilistically linked with emergency department and hospital records to obtain total hospital charges. Missing data were imputed. Multivariable quantile regression estimated reductions in hospital charges associated with helmet use and other crash factors. Motorcycle helmets were associated with reduced median hospital charges of $256 (42% reduction) and reduced 98th percentile of $32,390 (33% reduction). After adjusting for other factors, helmets were associated with reductions in charges in all upper percentiles studied. Quantile regression models described homogenous and heterogeneous associations between other crash factors and charges. Quantile regression comprehensively describes associations between crash factors and hospital charges. Helmet use among motorcyclists is associated with decreased hospital charges. Published by the BMJ Publishing Group Limited.
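
    Quantile regression, as used in this abstract, replaces the squared-error loss with the asymmetric "pinball" (check) loss, which lets one model the median or an upper percentile instead of the mean. The sketch below shows only the simplest case, a model with no covariates, where minimizing the loss recovers a sample quantile. The function names and charge values are hypothetical.

```python
def pinball_loss(ys, q, tau):
    """Total check loss of predicting the constant q for quantile level tau."""
    total = 0.0
    for y in ys:
        u = y - q
        # Under-predictions cost tau per unit, over-predictions cost (1 - tau).
        total += tau * u if u >= 0 else (tau - 1.0) * u
    return total

def best_constant(ys, tau):
    """Return the observed value that minimizes the pinball loss."""
    return min(ys, key=lambda q: pinball_loss(ys, q, tau))

# Hypothetical, heavily skewed hospital charges (illustration only).
charges = [1, 2, 3, 4, 100]
print(best_constant(charges, 0.5))  # median
print(best_constant(charges, 0.9))  # upper quantile
```

    For skewed outcomes such as charges, the median and the upper percentiles can respond very differently to the same factor, which is why a study might report both rather than a single mean effect.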

  3. Cole-Cole, linear and multivariate modeling of capacitance data for on-line monitoring of biomass.

    PubMed

    Dabros, Michal; Dennewald, Danielle; Currie, David J; Lee, Mark H; Todd, Robert W; Marison, Ian W; von Stockar, Urs

    2009-02-01

    This work evaluates three techniques of calibrating capacitance (dielectric) spectrometers used for on-line monitoring of biomass: modeling of cell properties using the theoretical Cole-Cole equation, linear regression of dual-frequency capacitance measurements on biomass concentration, and multivariate (PLS) modeling of scanning dielectric spectra. The performance and robustness of each technique is assessed during a sequence of validation batches in two experimental settings of differing signal noise. In more noisy conditions, the Cole-Cole model had significantly higher biomass concentration prediction errors than the linear and multivariate models. The PLS model was the most robust in handling signal noise. In less noisy conditions, the three models performed similarly. Estimates of the mean cell size were done additionally using the Cole-Cole and PLS models, the latter technique giving more satisfactory results.

  4. Longitudinal Relationships Between Productive Activities and Functional Health in Later Years: A Multivariate Latent Growth Curve Modeling Approach.

    PubMed

    Choi, Eunhee; Tang, Fengyan; Kim, Sung-Geun; Turk, Phillip

    2016-10-01

    This study examined the longitudinal relationships between functional health in later years and three types of productive activities: volunteering, full-time, and part-time work. Using the data from five waves (2000-2008) of the Health and Retirement Study, we applied multivariate latent growth curve modeling to examine the longitudinal relationships among individuals 50 or over. Functional health was measured by limitations in activities of daily living. Individuals who volunteered, worked either full time or part time exhibited a slower decline in functional health than nonparticipants. Significant associations were also found between initial functional health and longitudinal changes in productive activity participation. This study provides additional support for the benefits of productive activities later in life; engagement in volunteering and employment are indeed associated with better functional health in middle and old age. © The Author(s) 2016.

  5. Negative Events in Childhood Predict Trajectories of Internalizing Symptoms Up to Young Adulthood: An 18-Year Longitudinal Study

    PubMed Central

    Melchior, Maria; Touchette, Évelyne; Prokofyeva, Elena; Chollet, Aude; Fombonne, Eric; Elidemir, Gulizar; Galéra, Cédric

    2014-01-01

    Background: Common negative events can precipitate the onset of internalizing symptoms. We studied whether their occurrence in childhood is associated with mental health trajectories over the course of development. Methods: Using data from the TEMPO study, a French community-based cohort study of youths, we studied the association between negative events in 1991 (when participants were aged 4–16 years) and internalizing symptoms, assessed by the ASEBA family of instruments in 1991, 1999, and 2009 (n = 1503). Participants' trajectories of internalizing symptoms were estimated with semi-parametric regression methods (PROC TRAJ). Data were analyzed using multinomial regression models controlled for participants' sex, age, parental family status, socio-economic position, and parental history of depression. Results: Negative childhood events were associated with an increased likelihood of concurrent internalizing symptoms which sometimes persisted into adulthood (multivariate ORs associated with ≥ 3 negative events, respectively: high and decreasing internalizing symptoms: 5.54, 95% CI: 3.20–9.58; persistently high internalizing symptoms: 8.94, 95% CI: 2.82–28.31). Specific negative events most strongly associated with youths' persistent internalizing symptoms included: school difficulties (multivariate OR: 5.31, 95% CI: 2.24–12.59), parental stress (multivariate OR: 4.69, 95% CI: 2.02–10.87), serious illness/health problems (multivariate OR: 4.13, 95% CI: 1.76–9.70), and social isolation (multivariate OR: 2.24, 95% CI: 1.00–5.08). Conclusions: Common negative events can contribute to the onset of children's lasting psychological difficulties. PMID:25485875

  6. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of

  7. Comparison of a Classical and Quantum Based Restricted Boltzmann Machine (RBM) for Application to Non-linear Multivariate Regression.

    NASA Astrophysics Data System (ADS)

    Dorband, J. E.; Tilak, N.; Radov, A.

    2016-12-01

    In this paper, a classical computer implementation of RBM is compared to a quantum annealing based RBM running on a D-Wave 2X (an adiabatic quantum computer). The codes for both are essentially identical. Only a flag is set to change the activation function from a classically computed logistic function to the D-Wave. To obtain greater understanding of the behavior of the D-Wave, a study of the stochastic properties of a virtual qubit (a 12 qubit chain) and a cell of qubits (an 8 qubit cell) was performed. We will present the results of comparing the D-Wave implementation with a theoretically errorless adiabatic quantum computer. The main purpose of this study is to develop a generic RBM regression tool in order to infer CO2 fluxes from the NASA satellite OCO-2 observed CO2 concentrations and predicted atmospheric states using regression models. The carbon fluxes will then be assimilated into a land surface model to predict the Net Ecosystem Exchange at globally distributed regional sites.

  8. Multivariate pattern dependence

    PubMed Central

    Saxe, Rebecca

    2017-01-01

    When we perform a cognitive task, multiple brain regions are engaged. Understanding how these regions interact is a fundamental step to uncover the neural bases of behavior. Most research on the interactions between brain regions has focused on the univariate responses in the regions. However, fine grained patterns of response encode important information, as shown by multivariate pattern analysis. In the present article, we introduce and apply multivariate pattern dependence (MVPD): a technique to study the statistical dependence between brain regions in humans in terms of the multivariate relations between their patterns of responses. MVPD characterizes the responses in each brain region as trajectories in region-specific multidimensional spaces, and models the multivariate relationship between these trajectories. We applied MVPD to the posterior superior temporal sulcus (pSTS) and to the fusiform face area (FFA), using a searchlight approach to reveal interactions between these seed regions and the rest of the brain. Across two different experiments, MVPD identified significant statistical dependence not detected by standard functional connectivity. Additionally, MVPD outperformed univariate connectivity in its ability to explain independent variance in the responses of individual voxels. In the end, MVPD uncovered different connectivity profiles associated with different representational subspaces of FFA: the first principal component of FFA shows differential connectivity with occipital and parietal regions implicated in the processing of low-level properties of faces, while the second and third components show differential connectivity with anterior temporal regions implicated in the processing of invariant representations of face identity. PMID:29155809

  9. Survival of aggressive variants of papillary thyroid carcinoma in patients under 55 years old: a SEER population-based retrospective analysis.

    PubMed

    Feng, Jianhua; Shen, Fei; Cai, Wensong; Gan, Xiaoxiong; Deng, Xingyan; Xu, Bo

    2018-06-16

    Patients younger than 55 years of age with papillary thyroid carcinoma (PTC) have excellent survival. Diffuse sclerosing variant (DSV) and tall cell variant (TCV) of PTC are associated with aggressiveness; the survival of patients <55 years of age with these variants is still unclear. We aim to investigate the clinicopathological features and survival of these variants in the age group <55 years. All adult patients (<55 years old) with DSV, TCV and conventional PTC (CPTC) came from the Surveillance, Epidemiology, and End Results program (1988-2013). Kaplan-Meier method and log-rank test were used to analyze the survival. Prognostic factors associated with survival were analyzed by Cox multivariate regression. There were 280 DSV, 615 TCV, and 56287 CPTC in the age group <55 years. DSV and TCV were associated with multifocality, extrathyroidal extension, lymph node and distant metastasis (all p < 0.05). The 10-year disease-specific survival (DSS) of TCV was worse than CPTC (96.3 vs. 99.4%, p < 0.01), but there was no significant difference between DSV and CPTC (99.5 vs. 99.4%, p > 0.05). Cox multivariate regression showed TCV was the independent predictor of DSS (HR: 5.39, p < 0.01). In the age group <55 years, DSV and TCV are more likely to exhibit aggressive characteristics than CPTC. Patients <55 years of age with DSV likewise have excellent survival, while patients <55 years of age with TCV have worse survival. Further investigation of the recurrence risk of patients <55 years with these variants would contribute to optimal clinical management.
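
    The Kaplan-Meier method mentioned in this abstract estimates survival as a running product over event times, with censored patients leaving the risk set without counting as events. The sketch below is a minimal illustration with hypothetical follow-up data. It does not reproduce the study's analysis and omits the log-rank test and the Cox model.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times:  follow-up time for each subject
    events: 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        deaths = sum(same_time)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at t (event or censored) leaves the risk set.
        at_risk -= len(same_time)
        i += len(same_time)
    return curve

# Hypothetical follow-up times in years; 0 marks a censored subject.
print(kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]))
```

    Reading the curve at a fixed horizon (for example 10 years) gives the kind of survival percentage the abstract reports for each variant.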

  10. Beer fermentation: monitoring of process parameters by FT-NIR and multivariate data analysis.

    PubMed

    Grassi, Silvia; Amigo, José Manuel; Lyndgaard, Christian Bøge; Foschino, Roberto; Casiraghi, Ernestina

    2014-07-15

    This work investigates the capability of Fourier-Transform near infrared (FT-NIR) spectroscopy to monitor and assess process parameters in beer fermentation at different operative conditions. For this purpose, the fermentation of wort with two different yeast strains and at different temperatures was monitored for nine days by FT-NIR. To correlate the collected spectra with °Brix, pH and biomass, different multivariate data methodologies were applied. Principal component analysis (PCA), partial least squares (PLS) and locally weighted regression (LWR) were used to assess the relationship between FT-NIR spectra and the abovementioned process parameters that define the beer fermentation. The accuracy and robustness of the obtained results clearly show the suitability of FT-NIR spectroscopy, combined with multivariate data analysis, to be used as a quality control tool in the beer fermentation process. FT-NIR spectroscopy, when combined with LWR, proves to be a suitable quantitative method to be implemented in the production of beer. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. The correlation between serum free thyroxine and regression of dyslipidemia in adult males: A 4.5-year prospective study.

    PubMed

    Wang, Haoyu; Liu, Aihua; Zhou, Yingying; Xiao, Yue; Yan, Yumeng; Zhao, Tong; Gong, Xun; Pang, Tianxiao; Fan, Chenling; Zhao, Jiajun; Teng, Weiping; Shan, Zhongyan; Lai, Yaxin

    2017-09-01

    Elevated free thyroxine (FT4) levels may play a protective role in development of dyslipidemia. However, few prospective studies have been performed to define the effects of thyroid hormones on the improvement of dyslipidemia and its components. Thus, this study aims to clarify the association between thyroid hormones within normal range and reversal of dyslipidemia in the absence of intervention. A prospective analysis including 134 adult males was performed between 2010 and 2014. Anthropometric parameters, thyroid function, and lipid profile were measured at baseline and during follow-up. Logistic regression and receiver operating characteristic (ROC) analysis were conducted to identify the variables in forecasting the reversal of dyslipidemia and its components. During 4.5-year follow-up, 36.6% (49/134) patients resolved their dyslipidemia status without drug intervention. Compared with the continuous dyslipidemia group, subjects in reversal group had elevated FT4 and high-density lipoprotein cholesterol (HDL-C) levels, as well as decreased total cholesterol (TC), triglycerides (TG), and low-density lipoprotein cholesterol (LDL-C) levels at baseline. Furthermore, baseline FT4 is negatively associated with the change percentages of TG (r = -0.286, P = .001), while positively associated with HDL-C (r = 0.227, P = .008). However, no correlation of lipid profile change percentages with FT3 and TSH was observed. Furthermore, the improving effects of baseline FT4 on dyslipidemia, high TG, and low HDL-C status were still observed after multivariable adjustment. In ROC analysis, areas under curve (AUCs) for FT4 in predicting the reversal of dyslipidemia, high TG, and low HDL-C were 0.666, 0.643, and 0.702, respectively (P = .001 for dyslipidemia, .018 for high TG, and .001 for low HDL-C). Higher FT4 value within normal range may ameliorate the dyslipidemia, especially high TG and low HDL-C status, in males without drug intervention. This suggests
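
    The AUC values quoted in this record can be read as the probability that a randomly chosen patient whose dyslipidemia reversed has a higher baseline FT4 than a randomly chosen patient whose dyslipidemia persisted. A minimal sketch of that pairwise (Mann-Whitney) calculation follows; the FT4 values and outcomes are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney pairwise formulation.

    AUC is the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical baseline FT4 values and outcomes (1 = dyslipidemia reversed).
ft4 = [14.1, 15.0, 15.8, 16.2, 16.9, 17.5, 18.0, 18.4]
reversed_flag = [0, 0, 1, 0, 1, 0, 1, 1]
auc = roc_auc(ft4, reversed_flag)
print(f"AUC = {auc:.4f}")
```

    An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which puts the reported values of 0.643 to 0.702 in the modest range.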

  12. Multivariate meta-analysis: a robust approach based on the theory of U-statistic.

    PubMed

    Ma, Yan; Mazumdar, Madhu

    2011-10-30

    Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously taking into account the correlation between the outcomes. Likelihood-based approaches, in particular restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods are illustrated by their application to data from two published meta-analysis from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.

  13. Risk factors for low receptive vocabulary abilities in the preschool and early school years in the longitudinal study of Australian children.

    PubMed

    Christensen, Daniel; Zubrick, Stephen R; Lawrence, David; Mitrou, Francis; Taylor, Catherine L

    2014-01-01

    Receptive vocabulary development is a component of the human language system that emerges in the first year of life and is characterised by onward expansion throughout life. Beginning in infancy, children's receptive vocabulary knowledge builds the foundation for oral language and reading skills. The foundations for success at school are built early, hence the public health policy focus on reducing developmental inequalities before children start formal school. The underlying assumption is that children's development is stable, and therefore predictable, over time. This study investigated this assumption in relation to children's receptive vocabulary ability. We investigated the extent to which low receptive vocabulary ability at 4 years was associated with low receptive vocabulary ability at 8 years, and the predictive utility of a multivariate model that included child, maternal and family risk factors measured at 4 years. The study sample comprised 3,847 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Multivariate logistic regression was used to investigate risks for low receptive vocabulary ability from 4-8 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. In the multivariate model, substantial risk factors for receptive vocabulary delay from 4-8 years, in order of descending magnitude, were low receptive vocabulary ability at 4 years, low maternal education, and low school readiness. Moderate risk factors, in order of descending magnitude, were low maternal parenting consistency, socio-economic area disadvantage, low temperamental persistence, and NESB status. The following risk factors were not significant: one or more siblings, low family income, not reading to the child, high maternal work hours, and Aboriginal or Torres Strait Islander ethnicity. The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model

  14. A High-Dimensional, Multivariate Copula Approach to Modeling Multivariate Agricultural Price Relationships and Tail Dependencies

    Treesearch

    Xuan Chi; Barry Goodwin

    2012-01-01

    Spatial and temporal relationships among agricultural prices have been an important topic of applied research for many years. Such research is used to investigate the performance of markets and to examine linkages up and down the marketing chain. This research has empirically evaluated price linkages by using correlation and regression models and, later, linear and...

  15. Exact and Approximate Statistical Inference for Nonlinear Regression and the Estimating Equation Approach.

    PubMed

    Demidenko, Eugene

    2017-09-01

    The exact density distribution of the nonlinear least squares estimator in the one-parameter regression model is derived in closed form and expressed through the cumulative distribution function of the standard normal variable. Several proposals to generalize this result are discussed. The exact density is extended to the estimating equation (EE) approach and the nonlinear regression with an arbitrary number of linear parameters and one intrinsically nonlinear parameter. For a very special nonlinear regression model, the derived density coincides with the distribution of the ratio of two normally distributed random variables previously obtained by Fieller (1932), unlike other approximations previously suggested by other authors. Approximations to the density of the EE estimators are discussed in the multivariate case. Numerical complications associated with the nonlinear least squares are illustrated, such as nonexistence and/or multiple solutions, as major factors contributing to poor density approximation. The nonlinear Markov-Gauss theorem is formulated based on the near exact EE density approximation.

  16. Development of a multivariate model to predict the likelihood of carcinoma in patients with indeterminate peripheral lung nodules after a nondiagnostic bronchoscopic evaluation.

    PubMed

    Voss, Jesse S; Iqbal, Seher; Jenkins, Sarah M; Henry, Michael R; Clayton, Amy C; Jett, James R; Kipp, Benjamin R; Halling, Kevin C; Maldonado, Fabien

    2014-01-01

    Studies have shown that fluorescence in situ hybridization (FISH) testing increases lung cancer detection on cytology specimens in peripheral nodules. The goal of this study was to determine whether a predictive model using clinical features and routine cytology with FISH results could predict lung malignancy after a nondiagnostic bronchoscopic evaluation. Patients with an indeterminate peripheral lung nodule that had a nondiagnostic bronchoscopic evaluation were included in this study (N = 220). FISH was performed on residual bronchial brushing cytology specimens diagnosed as negative (n = 195), atypical (n = 16), or suspicious (n = 9). FISH results included hypertetrasomy (n = 30) and negative (n = 190). Primary study end points included lung cancer status along with time to diagnosis of lung cancer or date of last clinical follow-up. Hazard ratios (HRs) were calculated using Cox proportional hazards regression model analyses, and P values < .05 were considered statistically significant. The mean age of the 220 patients was 66.7 years (range, 35-91), and most (58%) were men. Most patients (79%) were current or former smokers with a mean pack year history of 43.2 years (median, 40; range, 1-200). After multivariate analysis, hypertetrasomy FISH (HR = 2.96, P < .001), pack years (HR = 1.03 per pack year up to 50, P = .001), age (HR = 1.04 per year, P = .02), atypical or suspicious cytology (HR = 2.02, P = .04), and nodule spiculation (HR = 2.36, P = .003) were independent predictors of malignancy over time and were used to create a prediction model (C-statistic = 0.78). These results suggest that this multivariate model including test results and clinical features may be useful following a nondiagnostic bronchoscopic examination. © 2013.
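
    As a worked illustration of how the reported hazard ratios combine, the sketch below multiplies them under the usual log-linear Cox form. It is not the authors' prediction model: the baseline hazard is not given in the abstract, so only a hazard relative to a hypothetical reference patient can be computed, and reading "per pack year up to 50" as a cap at 50 pack years is an assumption. The function name and its arguments are introduced here for illustration.

```python
import math

# Hazard ratios reported in the abstract (multivariate Cox model).
HR_FISH = 2.96         # hypertetrasomy FISH result
HR_PACK_YEAR = 1.03    # per pack year (assumed capped at 50 pack years)
HR_AGE = 1.04          # per year of age
HR_CYTOLOGY = 2.02     # atypical or suspicious cytology
HR_SPICULATION = 2.36  # nodule spiculation


def relative_hazard(fish_positive, pack_years, age_diff_years,
                    abnormal_cytology, spiculated):
    """Hazard relative to a hypothetical reference patient.

    The reference patient has negative FISH, zero pack years, normal
    cytology, a non-spiculated nodule, and an age offset of zero.
    """
    log_hr = (
        math.log(HR_FISH) * int(fish_positive)
        + math.log(HR_PACK_YEAR) * min(pack_years, 50)
        + math.log(HR_AGE) * age_diff_years
        + math.log(HR_CYTOLOGY) * int(abnormal_cytology)
        + math.log(HR_SPICULATION) * int(spiculated)
    )
    return math.exp(log_hr)


# Example: FISH-positive patient, 30 pack years, 10 years older than the
# reference, normal cytology, spiculated nodule.
example = relative_hazard(True, 30, 10, False, True)
print(f"relative hazard = {example:.1f}")
```

    Converting such a relative hazard into an absolute probability of malignancy would additionally require the baseline survival function, which the abstract does not report.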

  17. Misspecification of Cox regression models with composite endpoints

    PubMed Central

    Wu, Longyang; Cook, Richard J

    2012-01-01

    Researchers routinely adopt composite endpoints in multicenter randomized trials designed to evaluate the effect of experimental interventions in cardiovascular disease, diabetes, and cancer. Despite their widespread use, relatively little attention has been paid to the statistical properties of estimators of treatment effect based on composite endpoints. We consider this here in the context of multivariate models for time to event data in which copula functions link marginal distributions with a proportional hazards structure. We then examine the asymptotic and empirical properties of the estimator of treatment effect arising from a Cox regression model for the time to the first event. We point out that even when the treatment effect is the same for the component events, the limiting value of the estimator based on the composite endpoint is usually inconsistent for this common value. We find that in this context the limiting value is determined by the degree of association between the events, the stochastic ordering of events, and the censoring distribution. Within the framework adopted, marginal methods for the analysis of multivariate failure time data yield consistent estimators of treatment effect and are therefore preferred. We illustrate the methods by application to a recent asthma study. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22736519

  18. Multivariate stochastic simulation with subjective multivariate normal distributions

    Treesearch

    P. J. Ince; J. Buongiorno

    1991-01-01

    In many applications of Monte Carlo simulation in forestry or forest products, it may be known that some variables are correlated. However, for simplicity, in most simulations it has been assumed that random variables are independently distributed. This report describes an alternative Monte Carlo simulation technique for subjectively assessed multivariate normal...

  19. Multivariate Meta-Analysis of Preference-Based Quality of Life Values in Coronary Heart Disease.

    PubMed

    Stevanović, Jelena; Pechlivanoglou, Petros; Kampinga, Marthe A; Krabbe, Paul F M; Postma, Maarten J

    2016-01-01

    There are numerous health-related quality of life (HRQoL) measurements used in coronary heart disease (CHD) in the literature. However, only values assessed with preference-based instruments can be directly applied in a cost-utility analysis (CUA). The objective was to summarize and synthesize instrument-specific preference-based values in CHD and the underlying disease-subgroups, stable angina and post-acute coronary syndrome (post-ACS), for developed countries, while accounting for study-level characteristics, and within- and between-study correlation. A systematic review was conducted to identify studies reporting preference-based values in CHD. A multivariate meta-analysis was applied to synthesize the HRQoL values. Meta-regression analyses examined the effect of study level covariates age, publication year, prevalence of diabetes and gender. A total of 40 studies providing preference-based values were detected. Synthesized estimates of HRQoL in post-ACS ranged from 0.64 (Quality of Well-Being) to 0.92 (EuroQol European "tariff"), while in stable angina they ranged from 0.64 (Short form 6D) to 0.89 (Standard Gamble). Similar findings were observed in estimates applying to general CHD. No significant improvement in model fit was found after adjusting for study-level covariates. Large between-study heterogeneity was observed in all the models investigated. The main finding of our study is the presence of large heterogeneity both within and between instrument-specific HRQoL values. Current economic models in CHD ignore this between-study heterogeneity. Multivariate meta-analysis can quantify this heterogeneity and offers the means for uncertainty around HRQoL values to be translated to uncertainty in CUAs.

  20. Value of Information Analysis for Time-lapse Seismic Data by Simulation-Regression

    NASA Astrophysics Data System (ADS)

    Dutta, G.; Mukerji, T.; Eidsvik, J.

    2016-12-01

    A novel method to estimate the Value of Information (VOI) of time-lapse seismic data in the context of reservoir development is proposed. VOI is a decision analytic metric quantifying the incremental value that would be created by collecting information prior to making a decision under uncertainty. The VOI has to be computed before collecting the information and can be used to justify its collection. Previous work on estimating the VOI of geophysical data has involved explicit approximation of the posterior distribution of reservoir properties given the data and then evaluating the prospect values for that posterior distribution of reservoir properties. Here, we propose to directly estimate the prospect values given the data by building a statistical relationship between them using regression. Various regression techniques such as Partial Least Squares Regression (PLSR), Multivariate Adaptive Regression Splines (MARS) and k-Nearest Neighbors (k-NN) are used to estimate the VOI, and the results compared. For a univariate Gaussian case, the VOI obtained from simulation-regression has been shown to be close to the analytical solution. Estimating VOI by simulation-regression is much less computationally expensive since the posterior distribution of reservoir properties given each possible dataset need not be modeled and the prospect values need not be evaluated for each such posterior distribution of reservoir properties. This method is flexible, since it does not require rigid model specification of posterior but rather fits conditional expectations non-parametrically from samples of values and data.
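
    The simulation-regression idea can be sketched for the univariate Gaussian case mentioned in this record, where a closed-form answer is available for comparison. The sketch assumes a simple two-option decision (develop and receive the prospect value, or do not develop and receive zero) and uses ordinary least squares in place of the PLSR, MARS and k-NN regressions named in the abstract; the function names, the parameters mu, sigma and tau, and the decision setup are all illustrative assumptions.

```python
import math
import random


def ols_fit(x, y):
    """Intercept and slope of the least-squares line y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def voi_simulation_regression(mu, sigma, tau, n=20_000, seed=0):
    """Estimate VOI by regressing simulated prospect values on simulated data."""
    rng = random.Random(seed)
    values = [rng.gauss(mu, sigma) for _ in range(n)]   # prospect values
    data = [v + rng.gauss(0.0, tau) for v in values]    # noisy observations
    a, b = ols_fit(data, values)                        # approximates E[value | data]
    posterior_value = sum(max(0.0, a + b * d) for d in data) / n
    prior_value = max(0.0, sum(values) / n)
    return posterior_value - prior_value


def voi_analytical(mu, sigma, tau):
    """Closed-form VOI for the same Gaussian setup."""
    s = sigma ** 2 / math.sqrt(sigma ** 2 + tau ** 2)   # std. dev. of E[value | data]
    z = mu / s
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return mu * cdf + s * pdf - max(0.0, mu)


estimate = voi_simulation_regression(mu=0.5, sigma=1.0, tau=1.0)
exact = voi_analytical(mu=0.5, sigma=1.0, tau=1.0)
print(f"simulation-regression: {estimate:.3f}, analytical: {exact:.3f}")
```

    The regression step replaces explicit modelling of the posterior for every possible dataset, which is the source of the computational saving the abstract describes.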

  1. Ordinary chondrites - Multivariate statistical analysis of trace element contents

    NASA Technical Reports Server (NTRS)

    Lipschutz, Michael E.; Samuels, Stephen M.

    1991-01-01

    The contents of mobile trace elements (Co, Au, Sb, Ga, Se, Rb, Cs, Te, Bi, Ag, In, Tl, Zn, and Cd) in Antarctic and non-Antarctic populations of H4-6 and L4-6 chondrites were compared using standard multivariate discriminant functions borrowed from linear discriminant analysis and logistic regression. A nonstandard randomization-simulation method was developed, making it possible to carry out probability assignments on a distribution-free basis. Compositional differences were found both between the Antarctic and non-Antarctic H4-6 chondrite populations and between two L4-6 chondrite populations. It is shown that, for various types of meteorites (in particular, for the H4-6 chondrites), the Antarctic/non-Antarctic compositional difference is due to preterrestrial differences in the genesis of their parent materials.

  2. Hierarchical cluster-based partial least squares regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models.

    PubMed

    Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald

    2011-06-01

    Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for

  3. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models

    PubMed Central

    2011-01-01

    Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. Conclusions: HC

  4. Applicability of the Ricketts' posteroanterior cephalometry for sex determination using logistic regression analysis in Hispano American Peruvians.

    PubMed

    Perez, Ivan; Chavez, Allison K; Ponce, Dario

    2016-01-01

    The Ricketts' posteroanterior (PA) cephalometry seems to be the most widely used and it has not been tested by multivariate statistics for sex determination. The objective was to determine the applicability of Ricketts' PA cephalometry for sex determination using the logistic regression analysis. The logistic models were estimated at distinct age cutoffs (all ages, 11 years, 13 years, and 15 years) in a database from 1,296 Hispano American Peruvians between 5 years and 44 years of age. The logistic models were composed of six cephalometric measurements; the accuracy achieved by resubstitution varied between 60% and 70% and all the variables, with one exception, exhibited a direct relationship with the probability of being classified as male; the nasal width exhibited an indirect relationship. The maxillary and facial widths were present in all models and may represent a sexual dimorphism indicator. The accuracy found was lower than that reported in the literature and the Ricketts' PA cephalometry may not be adequate for sex determination. The indirect relationship of the nasal width in models with data from patients of 12 years of age or less may be a trait related to age or a characteristic in the studied population, which could be better studied and confirmed.

  5. [Predictors of success among first-year medical students at the University of Parakou].

    PubMed

    Adoukonou, Thierry; Tognon-Tchegnonsi, Francis; Mensah, Emile; Allodé, Alexandre; Adovoekpe, Jean-Marie; Gandaho, Prosper; Akpona, Simon

    2016-01-01

    Several factors, including grades obtained in the Baccalaureate, can influence the academic performance of first-year medical students. The aim of this study was to evaluate the relationship between results achieved by students taking the Baccalaureate exam and student academic success during the first year of medical school. We conducted an analytical study that included all students regularly enrolled in their first year of medical school at the University of Parakou in the academic year 2010-2011. Data for the scores for each academic discipline and distinction obtained in the Baccalaureate were collected. Multivariate analysis using logistic regression and multiple linear regression made it possible to determine the best predictors of success and grade point average obtained by students at the end of the year. SPSS Statistics 17.0 was used to analyse data and a p value < 0.05 was considered significant. Among the 414 students regularly enrolled, we could exploit the data on 407 students. They were aged 15-31 years; 262 (64.4%) were male. 98 were enrolled with a success rate of 23.7%. Concerning men, the scores obtained in mathematics, in physical sciences, the grade point average obtained in the Baccalaureate and honors obtained in the Baccalaureate were associated with their success at the end of the year, but in multivariate analysis only a score in physical sciences > 15/20 was associated with success (OR: 2.8 [1.32-6.00]). Concerning the general average grade obtained at the end of the year, only an honor obtained in the Baccalaureate was associated (standard error of the correlation coefficient: 0.130, Beta = 0.370 and p = 0.00001). The best predictors of student academic success during the first year were a good grade point average in physical sciences during the Baccalaureate and an honor obtained in the Baccalaureate. The inclusion of these elements in the enrollment of first-year students could improve academic performance.

  6. A Multivariate Generalizability Analysis of the Multistate Bar Examination

    ERIC Educational Resources Information Center

    Yin, Ping

    2005-01-01

    The main purpose of this study is to examine the content structure of the Multistate Bar Examination (MBE) using the "table of specifications" model from the perspective of multivariate generalizability theory. Specifically, using MBE data collected over different years (six administrations: three from the February test and three from July test),…

  7. Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.

    PubMed

    Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei

    2013-12-03

    We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm(-1)); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm(-1))) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R(2) = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R(2) = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R(2) = 0.998) independent of laser excitation power variations. This work demonstrates that multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in
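
    The leave-one-subject-out validation used to report RMSECV in this record can be sketched as follows. For brevity the sketch fits a single-predictor least-squares line rather than the PLS model described in the abstract, so it illustrates only the cross-validation scheme; the intensity values, subject labels and function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope for y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def rmsecv_leave_one_subject_out(x, y, subjects):
    """Hold out all samples of one subject at a time, refit, and pool the errors."""
    squared_errors = []
    for held_out in sorted(set(subjects)):
        train_x = [xi for xi, s in zip(x, subjects) if s != held_out]
        train_y = [yi for yi, s in zip(y, subjects) if s != held_out]
        a, b = fit_line(train_x, train_y)
        for xi, yi, s in zip(x, y, subjects):
            if s == held_out:
                squared_errors.append((yi - (a + b * xi)) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: reference-peak intensity vs. laser power (mW), 3 subjects.
intensity = [1.0, 2.1, 3.0, 3.9, 5.2, 6.0]
power_mw = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
subject_id = ["s1", "s1", "s2", "s2", "s3", "s3"]
rmsecv = rmsecv_leave_one_subject_out(intensity, power_mw, subject_id)
print(f"RMSECV = {rmsecv:.2f} mW")
```

    Holding out whole subjects, rather than individual spectra, keeps spectra from the same person out of both the training and test folds, which gives a less optimistic error estimate.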

  8. Comparing near-infrared conventional diffuse reflectance spectroscopy and hyperspectral imaging for determination of the bulk properties of solid samples by multivariate regression: determination of Mooney viscosity and plasticity indices of natural rubber.

    PubMed

    Juliano da Silva, Carlos; Pasquini, Celio

    2015-01-21

    Conventional reflectance spectroscopy (NIRS) and hyperspectral imaging (HI) in the near-infrared region (1000-2500 nm) are evaluated and compared, using, as the case study, the determination of relevant properties related to the quality of natural rubber. Mooney viscosity (MV) and plasticity indices (PI) (PI0 - original plasticity, PI30 - plasticity after accelerated aging, and PRI - the plasticity retention index after accelerated aging) of rubber were determined using multivariate regression models. Two hundred and eighty six samples of rubber were measured using conventional and hyperspectral near-infrared imaging reflectance instruments in the range of 1000-2500 nm. The sample set was split into regression (n = 191) and external validation (n = 95) sub-sets. Three instruments were employed for data acquisition: a line scanning hyperspectral camera and two conventional FT-NIR spectrometers. Sample heterogeneity was evaluated using hyperspectral images obtained with a resolution of 150 × 150 μm and principal component analysis. The probed sample area (5 cm(2); 24,000 pixels) to achieve representativeness was found to be equivalent to the average of 6 spectra for a 1 cm diameter probing circular window of one FT-NIR instrument. The other spectrophotometer can probe the whole sample in only one measurement. The results show that the rubber properties can be determined with very similar accuracy and precision by Partial Least Square (PLS) regression models regardless of whether HI-NIR or conventional FT-NIR produce the spectral datasets. The best Root Mean Square Errors of Prediction (RMSEPs) of external validation for MV, PI0, PI30, and PRI were 4.3, 1.8, 3.4, and 5.3%, respectively. Though the quantitative results provided by the three instruments can be considered equivalent, the hyperspectral imaging instrument presents a number of advantages, being about 6 times faster than conventional bulk spectrometers, producing robust spectral data by ensuring sample

  9. Development of regression equations to revise estimates of historical streamflows for the St. Croix River at Stillwater, Minnesota (water years 1910-2011), and Prescott, Wisconsin (water years 1910-2007)

    USGS Publications Warehouse

    Ziegeweid, Jeffrey R.; Magdalene, Suzanne

    2015-01-01

    The new regression equations were used to calculate revised estimates of historical streamflows for Stillwater and Prescott starting in 1910 and ending when index-velocity streamgages were installed. Monthly, annual, 30-year, and period of record statistics were examined between previous and revised estimates of historical streamflows. The abilities of the new regression equations to estimate historical streamflows were evaluated by using percent differences to compare new estimates of historical daily streamflows to discrete streamflow measurements made at Stillwater and Prescott before the installation of index-velocity streamgages. Although less variability was observed between estimated and measured streamflows at Stillwater compared to Prescott, the percent difference data indicated that the new estimates closely approximated measured streamflows at both locations.
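
    The percent-difference comparison described in this record can be illustrated with a short sketch. The formula below (signed difference relative to the measured value) is a common convention and is assumed here, since the abstract does not state the exact definition used; the streamflow values are hypothetical.

```python
def percent_difference(estimated, measured):
    """Signed percent difference of an estimate relative to a measurement."""
    if measured == 0:
        raise ValueError("measured streamflow must be nonzero")
    return 100.0 * (estimated - measured) / measured


# Hypothetical (estimated, measured) daily streamflows in cubic feet per second.
pairs = [(4200.0, 4000.0), (3050.0, 3100.0), (5100.0, 5000.0)]
diffs = [percent_difference(e, m) for e, m in pairs]
mean_abs_diff = sum(abs(d) for d in diffs) / len(diffs)
print([round(d, 2) for d in diffs], round(mean_abs_diff, 2))
```

    Positive values indicate that the regression estimate exceeds the discrete measurement, and the spread of the values reflects the variability the abstract compares between the two sites.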

  10. Spatial assessment of air quality patterns in Malaysia using multivariate analysis

    NASA Astrophysics Data System (ADS)

    Dominick, Doreena; Juahir, Hafizan; Latif, Mohd Talib; Zain, Sharifuddin M.; Aris, Ahmad Zaharin

    2012-12-01

    This study aims to investigate possible sources of air pollutants and the spatial patterns within the eight selected Malaysian air monitoring stations based on a two-year database (2008-2009). Multivariate analysis was applied to the dataset. It incorporated Hierarchical Agglomerative Cluster Analysis (HACA) to assess the spatial patterns, Principal Component Analysis (PCA) to determine the major sources of the air pollution and Multiple Linear Regression (MLR) to assess the percentage contribution of each air pollutant. The HACA results grouped the eight monitoring stations into three different clusters, based on the characteristics of the air pollutants and meteorological parameters. The PCA analysis showed that the major sources of air pollution were emissions from motor vehicles, aircraft, industries and areas of high population density. The MLR analysis demonstrated that the main pollutant contributing to variability in the Air Pollutant Index (API) at all stations was particulate matter with a diameter of less than 10 μm (PM10). Further MLR analysis showed that the main air pollutant influencing the high concentration of PM10 was carbon monoxide (CO). This was due to combustion processes, particularly originating from motor vehicles. Meteorological factors such as ambient temperature, wind speed and humidity were also noted to influence the concentration of PM10.

  11. Creation of mortality risk charts using 123I meta-iodobenzylguanidine heart-to-mediastinum ratio in patients with heart failure: 2- and 5-year risk models.

    PubMed

    Nakajima, Kenichi; Nakata, Tomoaki; Matsuo, Shinro; Jacobson, Arnold F

    2016-10-01

    ¹²³I meta-iodobenzylguanidine (MIBG) imaging has been extensively used for prognostication in patients with chronic heart failure (CHF). The purpose of this study was to create mortality risk charts for short-term (2 years) and long-term (5 years) prediction of cardiac mortality. Using a pooled database of 1322 CHF patients, multivariate analysis, including ¹²³I-MIBG late heart-to-mediastinum ratio (HMR), left ventricular ejection fraction (LVEF), and clinical factors, was performed to determine optimal variables for the prediction of 2- and 5-year mortality risk using subsets of the patients (n = 1280 and 933, respectively). Multivariate logistic regression analysis was performed to create risk charts. Cardiac mortality was 10 and 22% for the sub-population of 2- and 5-year analyses. A four-parameter multivariate logistic regression model including age, New York Heart Association (NYHA) functional class, LVEF, and HMR was used. Annualized mortality rate was <1% in patients with NYHA Class I-II and HMR ≥ 2.0, irrespective of age and LVEF. In patients with NYHA Class III-IV, mortality rate was 4-6 times higher for HMR < 1.40 compared with HMR ≥ 2.0 in all LVEF classes. Among the subset of patients with B-type natriuretic peptide (BNP) results (n = 491 and 359 for 2- and 5-year models, respectively), the 5-year model showed incremental value of HMR in addition to BNP. Both 2- and 5-year risk prediction models with ¹²³I-MIBG HMR can be used to identify low-risk as well as high-risk patients, which can be effective for further risk stratification of CHF patients even when BNP is available. © The Author 2015. Published by Oxford University Press on behalf of the European Society of Cardiology.

  12. Regression equations to estimate seasonal flow duration, n-day high-flow frequency, and n-day low-flow frequency at sites in North Dakota using data through water year 2009

    USGS Publications Warehouse

    Williams-Sether, Tara; Gross, Tara A.

    2016-02-09

    Seasonal mean daily flow data from 119 U.S. Geological Survey streamflow-gaging stations in North Dakota; the surrounding states of Montana, Minnesota, and South Dakota; and the Canadian provinces of Manitoba and Saskatchewan with 10 or more years of unregulated flow record were used to develop regression equations for flow duration, n-day high flow and n-day low flow using ordinary least-squares and Tobit regression techniques. Regression equations were developed for seasonal flow durations at the 10th, 25th, 50th, 75th, and 90th percent exceedances; the 1-, 7-, and 30-day seasonal mean high flows for the 10-, 25-, and 50-year recurrence intervals; and the 1-, 7-, and 30-day seasonal mean low flows for the 2-, 5-, and 10-year recurrence intervals. Basin and climatic characteristics determined to be significant explanatory variables in one or more regression equations included drainage area, percentage of basin drainage area that drains to isolated lakes and ponds, ruggedness number, stream length, basin compactness ratio, minimum basin elevation, precipitation, slope ratio, stream slope, and soil permeability. The adjusted coefficient of determination for the n-day high-flow regression equations ranged from 55.87 to 94.53 percent. The Chi2 values for the duration regression equations ranged from 13.49 to 117.94, whereas the Chi2 values for the n-day low-flow regression equations ranged from 4.20 to 49.68.

  13. Risk prediction for myocardial infarction via generalized functional regression models.

    PubMed

    Ieva, Francesca; Paganoni, Anna M

    2016-08-01

    In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to the 118 Dispatch Center of Milan (118 being the Italian toll-free number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Then, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the Principal Components decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Branch Block). The performance of this classification method is evaluated and compared with other methods proposed in the literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.

  14. Risk factors for baclofen pump infection in children: a multivariate analysis.

    PubMed

    Spader, Heather S; Bollo, Robert J; Bowers, Christian A; Riva-Cambrin, Jay

    2016-06-01

    OBJECTIVE Intrathecal baclofen infusion systems to manage severe spasticity and dystonia are associated with higher infection rates in children than in adults. Factors unique to this population, such as poor nutrition and physical limitations for pump placement, have been hypothesized as the reasons for this disparity. The authors assessed potential risk factors for infection in a multivariate analysis. METHODS Patients who underwent implantation of a programmable pump and intrathecal catheter for baclofen infusion at a single center between January 1, 2000, and March 1, 2012, were identified in this retrospective cohort study. The primary end point was infection. Potential risk factors investigated included preoperative (i.e., demographics, body mass index [BMI], gastrostomy tube, tracheostomy, previous spinal fusion), intraoperative (i.e., surgeon, antibiotics, pump size, catheter location), and postoperative (i.e., wound dehiscence, CSF leak, and number of revisions) factors. Univariate analysis was performed, and a multivariate logistic regression model was created to identify independent risk factors for infection. RESULTS A total of 254 patients were evaluated. The overall infection rate was 9.8%. Univariate analysis identified young age, shorter height, lower weight, dehiscence, CSF leak, and number of revisions within 6 months of pump placement as significantly associated with infection. Multivariate analysis identified young age, dehiscence, and number of revisions as independent risk factors for infection. CONCLUSIONS Young age, wound dehiscence, and number of revisions were independent risk factors for infection in this pediatric cohort. A low BMI and the presence of either a gastrostomy or tracheostomy were not associated with infection and may not be contraindications for this procedure.

  15. Artificial neural network, genetic algorithm, and logistic regression applications for predicting renal colic in emergency settings.

    PubMed

    Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay

    2009-06-03

    Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models like an artificial neural network (ANN) and genetic algorithm (GA) may also be useful to interpret medical data. The purpose of this study was to apply artificial intelligence models to a medical data set and compare them with logistic regression. ANN, GA, and logistic regression analysis were carried out on the data set of a previously published article regarding patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules in predicting urinary stones. Rule 1 consisted of being male, pain not spreading to back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was found to be 0.867, 0.720, 0.715, and 0.713 for all applications, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and to constitute clinical decision rules. They may be an alternative to conventional multivariate analysis applications used in biostatistics.
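
    The likelihood ratios quoted in this abstract follow from the sensitivities and specificities by the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short Python sketch below recomputes them from the quoted percentages; the function and variable names are illustrative and do not come from the study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity/specificity pairs quoted in the abstract (percent converted to fractions).
reported = {
    "ANN": (0.949, 0.784),
    "GA rule 1": (0.676, 0.7647),
    "GA rule 2": (0.568, 0.863),
    "Logistic regression": (0.955, 0.471),
}

ratios = {name: likelihood_ratios(se, sp) for name, (se, sp) in reported.items()}
for name, (lr_pos, lr_neg) in ratios.items():
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

    The recomputed values agree with the reported ratios to within rounding, which is a quick consistency check on figures of this kind.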

  16. Multivariate methods for indoor PM10 and PM2.5 modelling in naturally ventilated schools buildings

    NASA Astrophysics Data System (ADS)

    Elbayoumi, Maher; Ramli, Nor Azam; Md Yusof, Noor Faizah Fitri; Yahaya, Ahmad Shukri Bin; Al Madhoun, Wesam; Ul-Saufie, Ahmed Zia

    2014-09-01

    In this study, concentrations of PM10, PM2.5, CO and CO2 and meteorological variables (wind speed, air temperature, and relative humidity) were employed to predict the annual and seasonal indoor concentrations of PM10 and PM2.5 using multivariate statistical methods. The data were collected in twelve naturally ventilated schools in the Gaza Strip (Palestine) from October 2011 to May 2012 (academic year). The bivariate correlation analysis showed that indoor PM10 and PM2.5 were highly positively correlated with outdoor concentrations of PM10 and PM2.5. Further, multiple linear regression (MLR) was used for modelling, and R2 values were determined as 0.62 and 0.84 for indoor PM10 and PM2.5, respectively. The performance indicators of the MLR models indicated that the predictions of the PM10 and PM2.5 annual models were better than those of the seasonal models. In order to reduce the number of input variables, principal component analysis (PCA) and principal component regression (PCR) were applied using annual data. The predicted R2 values were 0.40 and 0.73 for PM10 and PM2.5, respectively. The PM10 models (MLR and PCR) tend to underestimate indoor PM10 concentrations because they do not take into account the occupants' activities, which strongly affect indoor concentrations during class hours.
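
    The trade-off described here, where PCR uses fewer inputs than MLR and yields a lower R2, can be illustrated with a minimal Python sketch. It assumes NumPy and uses synthetic correlated predictors, not the study's data; `pcr_fit` and `r_squared` are hypothetical helper names.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Fit principal component regression; return (intercept, coefficients)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered predictors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                 # (p, k) loadings
    scores = Xc @ V                         # (n, k) component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma                        # coefficients on the original predictors
    intercept = y_mean - x_mean @ beta
    return intercept, beta


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat for observations y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic data (an assumption for illustration): four predictors built from
# two latent factors, so the predictors are strongly correlated in pairs.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 2))
X = np.column_stack([
    latent[:, 0],
    latent[:, 0] + 0.1 * rng.normal(size=n),
    latent[:, 1],
    latent[:, 1] + 0.1 * rng.normal(size=n),
])
y = 2.0 * latent[:, 0] - 1.0 * latent[:, 1] + 0.3 * rng.normal(size=n)

# Keeping all components reproduces ordinary multiple linear regression.
b0_full, b_full = pcr_fit(X, y, n_components=4)
# Keeping two components is the reduced PCR model.
b0_pcr, b_pcr = pcr_fit(X, y, n_components=2)

r2_full = r_squared(y, b0_full + X @ b_full)
r2_pcr = r_squared(y, b0_pcr + X @ b_pcr)
print(f"R2 with all components: {r2_full:.3f}, with two components: {r2_pcr:.3f}")
```

    On training data the reduced model can never exceed the in-sample R2 of the full regression, so a drop such as the one reported is expected; how large it is depends on how much predictive information sits in the discarded components.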

  17. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
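
    The workflow outlined in this abstract, in which a classifier is fitted on the ground and then evaluated in real time as a trigger, can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the authors' implementation: the two "sensor" features, the synthetic data, and the names `fit_logistic` and `trigger` are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit logistic regression weights (intercept first) by batch gradient descent."""
    A = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = sigmoid(A @ w)
        w -= lr * A.T @ (p - y) / len(y)   # gradient of the mean log-loss
    return w


def trigger(w, x, threshold=0.5):
    """Return True where the predicted probability of 'deploy' exceeds threshold."""
    return sigmoid(w[0] + x @ w[1:]) > threshold


# Synthetic training set (assumption): a standardized true state that is never
# observed directly; 'deploy' is the correct decision when it is negative.
rng = np.random.default_rng(1)
n = 2000
truth = rng.normal(size=n)
y = (truth < 0.0).astype(float)
# Two hypothetical noisy measurements of that state, with different accuracy.
X = np.column_stack([
    truth + 0.5 * rng.normal(size=n),
    truth + 1.0 * rng.normal(size=n),
])

w = fit_logistic(X, y)                     # "ground processing" step
pred = trigger(w, X)                       # what the flight-side check would evaluate
accuracy = np.mean(pred == (y == 1.0))
print(f"weights: {w}, training accuracy: {accuracy:.3f}")
```

    Once the weights are fixed, the run-time check is a single dot product and a comparison, which is one reason this kind of classifier is attractive for embedded use.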

  18. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter. In order to increase overall robustness, the vehicle also has an alternate method of triggering the drogue parachute deployment based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this velocity-based trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers excellent performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  19. The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China.

    PubMed

    Pei, Ling-Ling; Li, Qin; Wang, Zheng-Xin

    2018-03-08

    The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China's pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM(1, N)) model based on the nonlinear least squares (NLS) method. The Gauss-Seidel iterative algorithm was used to solve the parameters of the TNGM(1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM(1, N) and the NLS-based TNGM(1, N) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO₂ and dust, alongside GDP per capita in China during the period 1996-2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM(1, N) model presents greater precision when forecasting WDPC, SO₂ emissions and dust emissions per capita, compared to the traditional GM(1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO₂ and dust reduce accordingly.

  20. Association of sex hormones with incident 10-year cardiovascular disease and mortality in women.

    PubMed

    Schaffrath, Gotja; Kische, Hanna; Gross, Stefan; Wallaschofski, Henri; Völzke, Henry; Dörr, Marcus; Nauck, Matthias; Keevil, Brian G; Brabant, Georg; Haring, Robin

    2015-12-01

    The aims of this study were to ascertain whether women with high levels of serum total testosterone (TT) or low levels of sex hormone-binding globulin (SHBG) are more likely to develop cardiovascular disease (CVD), and to investigate potential associations between sex hormones and mortality (all-cause, as well as cause-specific) in the general population. Data on 2129 women with a mean age of 49.0 years were obtained from the population-based Study of Health in Pomerania over a median follow-up of 10.9 years. Associations of baseline levels of TT, SHBG, androstenedione (ASD), and free testosterone (fT), and of the free androgen index (FAI), with follow-up CVD morbidity, as well as all-cause and CVD mortality, were analyzed using multivariable regression modeling. At baseline the prevalence rate of CVD was 17.8% (378 women) and the incidence of CVD over the follow-up was 50.9 per 1000 person-years. We detected an inverse association between SHBG and baseline CVD in age-adjusted models (relative risk per standard deviation increase: 0.83; 95% confidence interval: 0.74-0.93). We did not detect any significant associations between sex hormone concentrations and incident CVD in age- and multivariable-adjusted Poisson regression models. Furthermore, none of the sex hormones (TT, SHBG, ASD, fT, FAI) were associated with all-cause mortality. This population-based cohort study did not yield any consistent associations between sex hormones in women and incident CVD or mortality risk. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  1. Spatially resolved regression analysis of pre-treatment FDG, FLT and Cu-ATSM PET from post-treatment FDG PET: an exploratory study

    PubMed Central

    Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert

    2012-01-01

    Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748

  2. A multivariate time series approach to modeling and forecasting demand in the emergency department.

    PubMed

    Jones, Spencer S; Evans, R Scott; Allen, Todd L; Thomas, Alun; Haug, Peter J; Welch, Shari J; Snow, Gregory L

    2009-02-01

    The goals of this investigation were to study the temporal relationships between the demands for key resources in the emergency department (ED) and the inpatient hospital, and to develop multivariate forecasting models. Hourly data were collected from three diverse hospitals for the year 2006. Descriptive analysis and model fitting were carried out using graphical and multivariate time series methods. Multivariate models were compared to a univariate benchmark model in terms of their ability to provide out-of-sample forecasts of ED census and the demands for diagnostic resources. Descriptive analyses revealed little temporal interaction between the demand for inpatient resources and the demand for ED resources at the facilities considered. Multivariate models provided more accurate forecasts of ED census and of the demands for diagnostic resources. Our results suggest that multivariate time series models can be used to reliably forecast ED patient census; however, forecasts of the demands for diagnostic resources were not sufficiently reliable to be useful in the clinical setting.

  3. Body mass index gain between ages 20-40 years and lifestyle characteristics of men at ages 40-60 years: The Adventist Health Study-2

    PubMed Central

    Japas, Claudio; Knutsen, Synnøve; Dehom, Salem; Dos Santos, Hildemar; Tonstad, Serena

    2014-01-01

    Background Obesity increases risk of premature disease, and may be associated with unfavorable lifestyle changes that add to risk. This study analyzed the association of midlife BMI change with current lifestyle patterns among multiethnic men. Methods Men aged 40-60 years (n=9864) retrospectively reported body weight between ages 20-40 years and current dietary, TV, physical activity and sleep practices in the Adventist Health Study II, a study of church-goers in the US and Canada. In multivariate logistic regression analysis, odds ratios for BMI gain were calculated for each lifestyle practice controlling for sociodemographic and other lifestyle factors and current BMI. Results Men with median or higher BMI gain (2.79 kg/m²) between ages 20-40 years were more likely to consume a non-vegetarian diet, and engage in excessive TV watching and little physical activity and had a shorter sleep duration compared to men with BMI gain below the median (all p<0.001). In multivariate logistic analysis, current BMI was significantly associated with all lifestyle factors (all p≤0.005). BMI gain was associated with lower odds of vegetarian diet (odds ratio [OR] 0.939; 95% confidence interval [CI] 0.921-0.957) and of physical activity ≥150 minutes/week (OR 0.979, 95% CI 0.960-0.999). Conclusions These findings imply that diet and less physical activity are associated with both gained and attained BMI, while inactivity (TV watching) and short sleep duration correlated only with attained BMI. Unhealthy lifestyle may add risk to that associated with BMI. Longitudinal and intervention studies are needed to infer causal relationships. PMID:25434910

  4. Deriving the Regression Equation without Using Calculus

    ERIC Educational Resources Information Center

    Gordon, Sheldon P.; Gordon, Florence S.

    2004-01-01

    Probably the one "new" mathematical topic that is most responsible for modernizing courses in college algebra and precalculus over the last few years is the idea of fitting a function to a set of data in the sense of a least squares fit. Whether it be simple linear regression or nonlinear regression, this topic opens the door to applying the…

  5. Axial cervical vertebrae-based multivariate regression model for the estimation of skeletal-maturation status.

    PubMed

    Yang, Y-M; Lee, J; Kim, Y-I; Cho, B-H; Park, S-B

    2014-08-01

    This study aimed to determine the viability of using axial cervical vertebrae (ACV) as biological indicators of skeletal maturation and to build models that estimate ossification level with improved explanatory power over models based only on chronological age. The study population comprised 74 female and 47 male patients with available hand-wrist radiographs and cone-beam computed tomography images. Generalized Procrustes analysis was used to analyze the shape, size, and form of the ACV regions of interest. The variabilities of these factors were analyzed by principal component analysis. Skeletal maturation was then estimated using a multiple regression model. Separate models were developed for male and female participants. For the female estimation model, the adjusted R² explained 84.8% of the variability of the Sempé maturation level (SML), representing a 7.9% increase in SML explanatory power over that using chronological age alone (76.9%). For the male estimation model, the adjusted R² was over 90%, representing a 1.7% increase relative to the reference model. The simplest possible ACV morphometric information provided a statistically significant explanation of the portion of skeletal-maturation variability not dependent on chronological age. These results verify that ACV is a strong biological indicator of ossification status. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  6. Multivariate Strategies in Functional Magnetic Resonance Imaging

    ERIC Educational Resources Information Center

    Hansen, Lars Kai

    2007-01-01

    We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a "mind reading" predictive multivariate fMRI model.

  7. Stolon regression

    PubMed Central

    Cherry Vogt, Kimberly S

    2008-01-01

    Many colonial organisms encrust surfaces with feeding and reproductive polyps connected by vascular stolons. Such colonies often show a dichotomy between runner-like forms, with widely spaced polyps and long stolon connections, and sheet-like forms, with closely spaced polyps and short stolon connections. Generative processes, such as rates of polyp initiation relative to rates of stolon elongation, are typically thought to underlie this dichotomy. Regressive processes, such as tissue regression and cell death, may also be relevant. In this context, we have recently characterized the process of stolon regression in a colonial cnidarian, Podocoryna carnea. Stolon regression occurs naturally in these colonies. To characterize this process in detail, high levels of stolon regression were induced in experimental colonies by treatment with reactive oxygen and reactive nitrogen species (ROS and RNS). Either treatment results in stolon regression and is accompanied by high levels of endogenous ROS and RNS as well as morphological indications of cell death in the regressing stolon. The initiating step in regression appears to be a perturbation of normal colony-wide gastrovascular flow. This suggests more general connections between stolon regression and a wide variety of environmental effects. Here we summarize our results and further discuss such connections. PMID:19704785

  8. Salting-out assisted liquid-liquid extraction and partial least squares regression to assay low molecular weight polycyclic aromatic hydrocarbons leached from soils and sediments

    NASA Astrophysics Data System (ADS)

    Bressan, Lucas P.; do Nascimento, Paulo Cícero; Schmidt, Marcella E. P.; Faccin, Henrique; de Machado, Leandro Carvalho; Bohrer, Denise

    2017-02-01

    A novel method was developed to determine low molecular weight polycyclic aromatic hydrocarbons in aqueous leachates from soils and sediments using a salting-out assisted liquid-liquid extraction, synchronous fluorescence spectrometry and a multivariate calibration technique. Several experimental parameters were controlled and the optimum conditions were: sodium carbonate as the salting-out agent at a concentration of 2 mol L⁻¹, 3 mL of acetonitrile as extraction solvent, 6 mL of aqueous leachate, vortexing for 5 min and centrifuging at 4000 rpm for 5 min. The partial least squares calibration was optimized to the lowest values of root mean squared error and five latent variables were chosen for each of the targeted compounds. The regression coefficients for the true versus predicted concentrations were higher than 0.99. Figures of merit for the multivariate method were calculated, namely sensitivity, multivariate detection limit and multivariate quantification limit. The selectivity was also evaluated and other polycyclic aromatic hydrocarbons did not interfere in the analysis. Likewise, high performance liquid chromatography was used as a comparative methodology, and the regression analysis between the methods showed no statistical difference (t-test). The proposed methodology was applied to soils and sediments of a Brazilian river and the recoveries ranged from 74.3% to 105.8%. Overall, the proposed methodology was suitable for the targeted compounds, showing that the extraction method can be applied to spectrofluorometric analysis and that the multivariate calibration is also suitable for these compounds in leachates from real samples.

  9. Multivariate Autoregressive Modeling and Granger Causality Analysis of Multiple Spike Trains

    PubMed Central

    Krumin, Michael; Shoham, Shy

    2010-01-01

    Recent years have seen the emergence of microelectrode arrays and optical methods allowing simultaneous recording of spiking activity from populations of neurons in various parts of the nervous system. The analysis of multiple neural spike train data could benefit significantly from existing methods for multivariate time-series analysis which have proven to be very powerful in the modeling and analysis of continuous neural signals like EEG signals. However, those methods have not generally been well adapted to point processes. Here, we use our recent results on correlation distortions in multivariate Linear-Nonlinear-Poisson spiking neuron models to derive generalized Yule-Walker-type equations for fitting “hidden” Multivariate Autoregressive models. We use this new framework to perform Granger causality analysis in order to extract the directed information flow pattern in networks of simulated spiking neurons. We discuss the relative merits and limitations of the new method. PMID:20454705
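
    For orientation, the classical Yule-Walker relation that the authors generalize can be sketched for a first-order model x[t] = A x[t-1] + noise, where the lag-1 covariance C1 and the lag-0 covariance C0 satisfy C1 = A C0. The Python sketch below covers only this ordinary continuous-signal case on simulated data, not the spike-train ("hidden") version derived in the paper; it assumes NumPy, and `fit_var1_yule_walker` and `A_true` are illustrative names.

```python
import numpy as np


def fit_var1_yule_walker(x):
    """Estimate A in x[t] = A @ x[t-1] + noise from a (T, d) array of observations."""
    xc = x - x.mean(axis=0)
    T = len(xc)
    c0 = xc[:-1].T @ xc[:-1] / (T - 1)    # lag-0 covariance, E[x[t-1] x[t-1]^T]
    c1 = xc[1:].T @ xc[:-1] / (T - 1)     # lag-1 covariance, E[x[t] x[t-1]^T]
    # Yule-Walker: C1 = A C0, so A = C1 C0^{-1}; solve rather than invert.
    return np.linalg.solve(c0.T, c1.T).T


# Simulated two-channel process (assumption): channel 0 drives channel 1,
# but channel 1 has no influence on channel 0.
rng = np.random.default_rng(2)
A_true = np.array([[0.5, 0.0],
                   [0.4, 0.3]])
T = 20000
x = np.zeros((T, 2))
for t in range(1, T):
    x[t] = A_true @ x[t - 1] + rng.normal(size=2)

A_hat = fit_var1_yule_walker(x)
print(np.round(A_hat, 3))
```

    In this simplified setting the directed structure shows up as an estimated A[1, 0] clearly different from zero and an A[0, 1] close to zero; a formal Granger causality analysis would instead compare prediction errors of models with and without the other channel's past.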

  10. Multivariate Longitudinal Analysis with Bivariate Correlation Test

    PubMed Central

    Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory

    2016-01-01

    In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions of the model’s parameters estimators. These estimators can be used in the framework of the multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated. PMID:27537692

  11. Multivariate Longitudinal Analysis with Bivariate Correlation Test.

    PubMed

    Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory

    2016-01-01

    In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions of the model's parameters estimators. These estimators can be used in the framework of the multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated.

  12. Multivariate statistical analysis of low-voltage EDS spectrum images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson, I.M.

    1998-03-01

    Whereas energy-dispersive X-ray spectrometry (EDS) has been used for compositional analysis in the scanning electron microscope for 30 years, the benefits of using low operating voltages for such analyses have been explored only during the last few years. This paper couples low-voltage EDS with two other emerging areas of characterization: spectrum imaging and multivariate statistical analysis. The specimen analyzed for this study was a finished Intel Pentium processor, with the polyimide protective coating stripped off to expose the final active layers.

  13. A multivariate ecogeographic analysis of macaque craniodental variation.

    PubMed

    Grunstra, Nicole D S; Mitteroecker, Philipp; Foley, Robert A

    2018-06-01

    To infer the ecogeographic conditions that underlie the evolutionary diversification of macaques, we investigated the within- and between-species relationships of craniodental dimensions, geography, and environment in extant macaque species. We studied evolutionary processes by contrasting macroevolutionary patterns, phylogeny, and within-species associations. Sixty-three linear measurements of the permanent dentition and skull along with data about climate, ecology (environment), and spatial geography were collected for 711 specimens of 12 macaque species and analyzed by a multivariate approach. Phylogenetic two-block partial least squares was used to identify patterns of covariance between craniodental and environmental variation. Phylogenetic reduced rank regression was employed to analyze spatial clines in morphological variation. Between-species associations consisted of two distinct multivariate patterns. The first represents overall craniodental size and is negatively associated with temperature and habitat, but positively with latitude. The second pattern shows an antero-posterior tooth size contrast related to diet, rainfall, and habitat productivity. After controlling for phylogeny, however, the latter dimension was diminished. Within-species analyses neither revealed significant association between morphology, environment, and geography, nor evidence of isolation by distance. We found evidence for environmental adaptation in macaque body and craniodental size, primarily driven by selection for thermoregulation. This pattern cannot be explained by the within-species pattern, indicating an evolved genetic basis for the between-species relationship. The dietary signal in relative tooth size, by contrast, can largely be explained by phylogeny. This cautions against adaptive interpretations of phenotype-environment associations when phylogeny is not explicitly modelled. © 2018 Wiley Periodicals, Inc.

  14. Lifetime risks for aneurysmal subarachnoid haemorrhage: multivariable risk stratification.

    PubMed

    Vlak, Monique H M; Rinkel, Gabriel J E; Greebe, Paut; Greving, Jacoba P; Algra, Ale

    2013-06-01

    The overall incidence of aneurysmal subarachnoid haemorrhage (aSAH) in western populations is around 9 per 100 000 person-years, which corresponds to a lifetime risk of around half a per cent. Risk factors for aSAH are usually expressed as relative risks and suggest that absolute risks vary considerably according to risk factor profiles, but such estimates are lacking. We aimed to estimate incidence and lifetime risks of aSAH according to risk factor profiles. We used data from 250 patients admitted with aSAH and 574 sex-matched and age-matched controls, who were randomly retrieved from general practitioners' files. We determined independent prognostic factors with multivariable logistic regression analyses and assessed discriminatory performance using the area under the receiver operating characteristic curve. Based on the prognostic model we predicted incidences and lifetime risks of aSAH for different risk factor profiles. The four strongest independent predictors for aSAH, namely current smoking (OR 6.0; 95% CI 4.1 to 8.6), a positive family history for aSAH (4.0; 95% CI 2.3 to 7.0), hypertension (2.4; 95% CI 1.5 to 3.8) and hypercholesterolaemia (0.2; 95% CI 0.1 to 0.4), were used in the final prediction model. This model had an area under the receiver operating characteristic curve of 0.73 (95% CI 0.69 to 0.76). Depending on sex, age and the four predictors, the incidence of aSAH ranged from 0.4/100 000 to 298/100 000 person-years and lifetime risk between 0.02% and 7.2%. The incidence and lifetime risk of aSAH in the general population varies widely according to risk factor profiles. Whether persons with high risks benefit from screening should be assessed in cost-effectiveness studies.
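
    As a rough check on how an incidence of about 9 per 100 000 person-years relates to a lifetime risk of about half a per cent, the sketch below uses the standard constant-rate formula, risk = 1 - exp(-rate × years). The abstract does not state how the authors made the conversion, so the formula, the function name `lifetime_risk` and the 60-year horizon are assumptions chosen only for illustration.

```python
import math


def lifetime_risk(rate_per_100k, years):
    """Cumulative risk over `years` at a constant incidence rate.

    rate_per_100k: events per 100 000 person-years (assumed constant over time).
    years: length of the risk horizon (an assumed value, not from the abstract).
    """
    rate = rate_per_100k / 100_000
    return 1.0 - math.exp(-rate * years)


# Hypothetical 60-year horizon at the overall incidence quoted in the abstract.
risk = lifetime_risk(9, 60)
print(f"{risk:.2%}")  # prints 0.54%
```

    Under these assumptions the result is close to the "around half a per cent" quoted above; a different horizon or an age-dependent rate would change the figure.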

  15. The incidence and prevalence of pterygium in South Korea: A 10-year population-based Korean cohort study.

    PubMed

    Rim, Tyler Hyungtaek; Kang, Min Jae; Choi, Moonjung; Seo, Kyoung Yul; Kim, Sung Soo

    2017-01-01

    Although numerous population-based studies have reported the prevalences and risk factors for pterygium, information regarding the incidence of pterygium is scarce. This population-based cohort study aimed to evaluate the South Korean incidence and prevalence of pterygium. We retrospectively obtained data from a nationally representative sample of 1,116,364 South Koreans in the Korea National Health Insurance Service National Sample Cohort (NHIS-NSC). The associated sociodemographic factors were evaluated using multivariable Cox regression analysis, and the hazard ratios and confidence intervals were calculated. Pterygium was defined based on the Korean Classification of Diseases code, and surgically removed pterygium was defined as cases that required surgical removal. We identified 21,465 pterygium cases and 8,338 surgically removed pterygium cases during the study period. The overall incidences were 2.1 per 1,000 person-years for pterygium and 0.8 per 1,000 person-years for surgically removed pterygium. Among subjects who were ≥40 years old, the incidences were 4.3 per 1,000 person-years for pterygium and 1.7 per 1,000 person-years for surgically removed pterygium. The overall prevalences were 1.9% for pterygium and 0.6% for surgically removed pterygium, and the prevalences increased to 3.8% for pterygium and 1.4% for surgically removed pterygium among subjects who were ≥40 years old. The incidences of pterygium decreased according to year. The incidence and prevalence of pterygium were highest among 60-79-year-old individuals. Increasing age, female sex, and living in a relatively rural area were associated with increased risks of pterygium and surgically removed pterygium in the multivariable Cox regression analysis. Our analyses of South Korean national insurance claims data revealed a decreasing trend in the incidence of pterygium during the study period.

  16. Comprehensive ripeness-index for prediction of ripening level in mangoes by multivariate modelling of ripening behaviour

    NASA Astrophysics Data System (ADS)

    Eyarkai Nambi, Vijayaram; Thangavel, Kuladaisamy; Manickavasagan, Annamalai; Shahir, Sultan

    2017-01-01

    Prediction of ripeness level in climacteric fruits is essential for post-harvest handling. An index capable of predicting ripening level with minimum inputs would be highly beneficial to handlers, processors and researchers in the fruit industry. A study was conducted with Indian mango cultivars to develop a ripeness index and associated model. Changes in physicochemical, colour and textural properties were measured throughout the ripening period, and the period was classified into five stages (unripe, early ripe, partially ripe, ripe and over ripe). Multivariate regression techniques such as partial least squares regression, principal component regression and multiple linear regression were compared and evaluated for their predictive performance. A multiple linear regression model with 12 parameters was found to be more suitable for ripening prediction. A scientific variable reduction method was adopted to simplify the developed model. Better prediction was achieved with either 2 or 3 variables (total soluble solids, colour and acidity). Cross validation was done to increase robustness, and the proposed ripening index was found to be more effective in predicting ripening stages. The three-variable model would be suitable for commercial applications where reasonable accuracy is sufficient. However, the 12-variable model can be used to obtain more precise results in research and development applications.

  17. Quasi-experimental evidence on tobacco tax regressivity.

    PubMed

    Koch, Steven F

    2018-01-01

    Tobacco taxes are known to reduce tobacco consumption and to be regressive, such that tobacco control policy may have the perverse effect of further harming the poor. However, if tobacco consumption falls faster amongst the poor than the rich, tobacco control policy can actually be progressive. We take advantage of persistent and committed tobacco control activities in South Africa to examine the household tobacco expenditure burden. For the analysis, we make use of two South African Income and Expenditure Surveys (2005/06 and 2010/11) that span a series of such tax increases and have been matched across the years, yielding 7806 matched pairs of tobacco consuming households and 4909 matched pairs of cigarette consuming households. By matching households across the surveys, we are able to examine both the regressivity of the household tobacco burden, and any change in that regressivity, and since tobacco taxes have been a consistent component of tobacco prices, our results also relate to the regressivity of tobacco taxes. Like previous research into cigarette and tobacco expenditures, we find that the tobacco burden is regressive; thus, so are tobacco taxes. However, we find that over the five-year period considered, the tobacco burden has decreased, and, most importantly, falls less heavily on the poor. Thus, the tobacco burden and the tobacco tax is less regressive in 2010/11 than in 2005/06. Thus, increased tobacco taxes can, in at least some circumstances, reduce the financial burden that tobacco places on households. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Uni- and multi-variable modelling of flood losses: experiences gained from the Secchia river inundation event.

    NASA Astrophysics Data System (ADS)

    Carisi, Francesca; Domeneghetti, Alessio; Kreibich, Heidi; Schröter, Kai; Castellarin, Attilio

    2017-04-01

    Flood risk is a function of flood hazard and vulnerability; therefore, its accurate assessment depends on a reliable quantification of both factors. The scientific literature proposes a number of objective and reliable methods for assessing flood hazard, yet it highlights a limited understanding of the fundamental damage processes. Loss modelling is associated with large uncertainty which is, among other factors, due to a lack of standard procedures; for instance, flood losses are often estimated based on damage models derived in completely different contexts (i.e. different countries or geographical regions) without checking their applicability, or by considering only one explanatory variable (i.e. typically water depth). We consider the Secchia river flood event of January 2014, when a sudden levee breach caused the inundation of nearly 200 km² in Northern Italy. In the aftermath of this event, local authorities collected flood loss data, together with additional information on affected private households and industrial activities (e.g. building surface area and economic value, number of company employees and others). Based on these data we implemented and compared a quadratic-regression damage function, with water depth as the only explanatory variable, and a multi-variable model that combines multiple regression trees and considers several explanatory variables (i.e. bagging decision trees). Our results show the importance of data collection, revealing that (1) a simple quadratic regression damage function based on empirical data from the study area can be significantly more accurate than literature damage models derived for a different context and (2) multi-variable modelling may outperform the uni-variable approach, yet it is more difficult to develop and apply due to a much higher demand for detailed data.

  19. Regression estimators for generic health-related quality of life and quality-adjusted life years.

    PubMed

    Basu, Anirban; Manca, Andrea

    2012-01-01

    To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.

  20. Coronary artery calcium for the prediction of mortality in young adults <45 years old and elderly adults >75 years old.

    PubMed

    Tota-Maharaj, Rajesh; Blaha, Michael J; McEvoy, John W; Blumenthal, Roger S; Muse, Evan D; Budoff, Matthew J; Shaw, Leslee J; Berman, Daniel S; Rana, Jamal S; Rumberger, John; Callister, Tracy; Rivera, Juan; Agatston, Arthur; Nasir, Khurram

    2012-12-01

    To determine if coronary artery calcium (CAC) scoring is independently predictive of mortality in young adults and in the elderly population and if a young person with high CAC has a higher mortality risk than an older person with less CAC. We studied a cohort of 44 052 asymptomatic patients referred for CAC scans for cardiovascular risk stratification. All-cause mortality rates (MRs) were calculated after stratifying by age groups (<45, 45-54, 55-64, 65-74, and ≥75) and CAC score (0, 1-100, 100-400, and >400). Multivariable Cox regression models were constructed to assess the independent value of CAC for predicting all-cause mortality in the <45- and ≥75-year-old age groups. The MR increased in both the <45- and ≥75-year-old age groups with an increasing CAC group. After multivariable adjustment, increasing CAC remained independently predictive of increased mortality compared with CAC = 0 [<45 age group, hazard ratio (95% confidence interval): CAC = 1-100, 2.3 (1.2-4.2); CAC = 100-400, 7.4 (3.3-16.6); CAC > 400, 34.6 (15.5-77.4); ≥75 age group: CAC = 1-100, 7.0 (2.4-20.8); CAC = 100-400, 9.2 (3.2-26.5); CAC > 400, 16.1 (5.8-45.1)]. Persons <45 years old with CAC = 100-400 and CAC > 400 had 2- and 10-fold increased MRs, respectively, compared with persons ≥75 with no CAC. Individuals ≥75 years old with CAC = 0 had a 5.6-year survival rate of 98%, similar to those in other age groups with CAC = 0 (5.6-year survival, 99%). The value of CAC for predicting mortality extends to both elderly patients and those <45 years old. Elderly persons with no CAC have a lower MR than younger persons with high CAC.

  1. Regression: A Bibliography.

    ERIC Educational Resources Information Center

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  2. Multilevel covariance regression with correlated random effects in the mean and variance structure.

    PubMed

    Quintero, Adrian; Lesaffre, Emmanuel

    2017-09-01

    Multivariate regression methods generally assume a constant covariance matrix for the observations. In case a heteroscedastic model is needed, the parametric and nonparametric covariance regression approaches in the literature can be restrictive. We propose a multilevel regression model for the mean and covariance structure, including random intercepts in both components and allowing for correlation between them. The implied conditional covariance function can be different across clusters as a result of the random effect in the variance structure. In addition, allowing for correlation between the random intercepts in the mean and covariance makes the model convenient for responses with skewed distributions. Furthermore, it permits us to analyse directly the relation between the mean response level and the variability in each cluster. Parameter estimation is carried out via Gibbs sampling. We compare the performance of our model to other covariance modelling approaches in a simulation study. Finally, the proposed model is applied to the RN4CAST dataset to identify the variables that impact burnout of nurses in Belgium. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Multivariate analysis in the pharmaceutical industry: enabling process understanding and improvement in the PAT and QbD era.

    PubMed

    Ferreira, Ana P; Tobyn, Mike

    2015-01-01

    In the pharmaceutical industry, chemometrics is rapidly establishing itself as a tool that can be used at every step of product development and beyond: from early development to commercialization. This set of multivariate analysis methods allows the extraction of information contained in large, complex data sets thus contributing to increase product and process understanding which is at the core of the Food and Drug Administration's Process Analytical Tools (PAT) Guidance for Industry and the International Conference on Harmonisation's Pharmaceutical Development guideline (Q8). This review is aimed at providing pharmaceutical industry professionals an introduction to multivariate analysis and how it is being adopted and implemented by companies in the transition from "quality-by-testing" to "quality-by-design". It starts with an introduction to multivariate analysis and the two methods most commonly used: principal component analysis and partial least squares regression, their advantages, common pitfalls and requirements for their effective use. That is followed with an overview of the diverse areas of application of multivariate analysis in the pharmaceutical industry: from the development of real-time analytical methods to definition of the design space and control strategy, from formulation optimization during development to the application of quality-by-design principles to improve manufacture of existing commercial products.

  4. Advanced statistics: linear regression, part I: simple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
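
    The method of least squares mentioned above can be sketched in a few lines for the single-predictor case. This is a minimal illustration using the textbook closed-form estimates, not code from the article; the function name `fit_line` and the toy data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Uses the closed-form estimates for one predictor:
    slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on y = 1 + 2x (hypothetical values).
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # prints 2.0 1.0
```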

  5. The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference: an application to longitudinal modeling.

    PubMed

    Heggeseth, Brianna C; Jewell, Nicholas P

    2013-07-20

    Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence, that is, given the mixture component label, the outcomes for subjects are independent. In this paper, we study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and mixing probability parameters are not consistent under misspecification, they have little asymptotic bias when mixture components are well separated or if the assumed correlation is close to the truth even when the covariance is misspecified. We also present a robust standard error estimator and show that it outperforms conventional estimators in simulations and can indicate that the model is misspecified. Body mass index data from a national longitudinal study are used to demonstrate the effects of misspecification on potential inferences made in practice. Copyright © 2013 John Wiley & Sons, Ltd.

  6. Multivariate normative comparisons using an aggregated database

    PubMed Central

    Murre, Jaap M. J.; Huizenga, Hilde M.

    2017-01-01

    In multivariate normative comparisons, a patient’s profile of test scores is compared to those in a normative sample. Recently, it has been shown that these multivariate normative comparisons enhance the sensitivity of neuropsychological assessment. However, multivariate normative comparisons require multivariate normative data, which are often unavailable. In this paper, we show how a multivariate normative database can be constructed by combining healthy control group data from published neuropsychological studies. We show that three issues should be addressed to construct a multivariate normative database. First, the database may have a multilevel structure, with participants nested within studies. Second, not all tests are administered in every study, so many data may be missing. Third, a patient should be compared to controls of similar age, gender and educational background rather than to the entire normative sample. To address these issues, we propose a multilevel approach for multivariate normative comparisons that accounts for missing data and includes covariates for age, gender and educational background. Simulations show that this approach controls the number of false positives and has high sensitivity to detect genuine deviations from the norm. An empirical example is provided. Implications for other domains than neuropsychology are also discussed. To facilitate broader adoption of these methods, we provide code implementing the entire analysis in the open source software package R. PMID:28267796

  7. Distributed Monitoring of the R² Statistic for Linear Regression

    NASA Technical Reports Server (NTRS)

    Bhaduri, Kanishka; Das, Kamalika; Giannella, Chris R.

    2011-01-01

    The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more dependent target variables. This problem becomes challenging for large scale data in a distributed computing environment when only a subset of instances is available at individual nodes and the local data changes frequently. Data centralization and periodic model recomputation can add high overhead to tasks like anomaly detection in such dynamic settings. Therefore, the goal is to develop techniques for monitoring and updating the model over the union of all nodes' data in a communication-efficient fashion. Correctness guarantees on such techniques are also often highly desirable, especially in safety-critical application scenarios. In this paper we develop DReMo, a distributed algorithm with very low resource overhead, for monitoring the quality of a regression model in terms of its coefficient of determination (R² statistic). When the nodes collectively determine that R² has dropped below a fixed threshold, the linear regression model is recomputed via a network-wide convergecast and the updated model is broadcast back to all nodes. We show empirically, using both synthetic and real data, that our proposed method is highly communication-efficient and scalable, and also provide theoretical guarantees on correctness.
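
    The quantity being monitored, the coefficient of determination, can be sketched as follows together with a simple threshold check. This is a centralized, single-machine illustration only; it is not the distributed DReMo algorithm, and the function names `r_squared` and `needs_refit`, the 0.8 threshold and the toy data are all assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("y_true is constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def needs_refit(y_true, y_pred, threshold=0.8):
    """Return True when model quality has dropped below the (assumed) threshold."""
    return r_squared(y_true, y_pred) < threshold


# Hypothetical observations and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(observed, predicted)
refit = needs_refit(observed, predicted)
```

    Here SS_res is 0.10 and SS_tot is 5.0, so R² is 0.98 and no refit would be triggered at the assumed threshold.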

  8. A FORTRAN program for multivariate survival analysis on the personal computer.

    PubMed

    Mulder, P G

    1988-01-01

    In this paper a FORTRAN program is presented for multivariate survival or life table regression analysis in a competing-risks situation. The relevant failure rate (for example, a particular disease or mortality rate) is modelled as a log-linear function of a vector of (possibly time-dependent) explanatory variables. The explanatory variables may also include the variable time itself, which is useful for parameterizing piecewise exponential time-to-failure distributions in a Gompertz-like or Weibull-like way as a more efficient alternative to Cox's proportional hazards model. Maximum likelihood estimates of the coefficients of the log-linear relationship are obtained from the iterative Newton-Raphson method. The program runs on a personal computer under DOS; running time is quite acceptable, even for large samples.

  9. The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China

    PubMed Central

    Pei, Ling-Ling; Li, Qin

    2018-01-01

    The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China’s pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N)) model based on the nonlinear least square (NLS) method. The Gauss–Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N) and the NLS-based TNGM (1, N) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO2 and dust, alongside GDP per capita in China during the period 1996–2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N) model presents greater precision when forecasting WDPC, SO2 emissions and dust emissions per capita, compared to the traditional GM (1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO2 and dust reduce accordingly. PMID:29517985

  10. Trends in one-year mortality for stroke in a tertiary academic center in Saudi Arabia: a 5-year retrospective analysis.

    PubMed

    Almekhlafi, Mohammed A

    2016-01-01

    Numerous studies have reported a decline in stroke-related mortality in developed countries. To assess trends in one-year mortality following a stroke diagnosis in Saudi Arabia. Retrospective longitudinal cohort study. Single tertiary care center from 2010 through 2014. All patients admitted with a primary admitting diagnosis of stroke. Demographic data (age, gender, nationality), risk factor profile, stroke subtypes, in-hospital complications and mortality data as well as cause of death were collected for all patients. A multivariable logistic regression model was used to assess factors associated with one-year mortality following a stroke admission. One-year mortality. In 548 patients with a mean age of 62.9 years (SD 16.9), the most frequent vascular risk factors were hypertension (90.6%), diabetes (65.5%), and hyperlipidemia (27.2%). Hemorrhagic stroke was diagnosed in 9.9%. The overall mortality risk was 26.9%. Non-Saudis had a significantly higher one-year mortality risk compared with Saudis (25% vs. 16.8%, respectively; P=.025). The most frequently reported causes of mortality were neurological and related to the underlying stroke (32%), sepsis (30%), and cardiac or other organ dysfunction-related (each 9%) in addition to other etiologies (collectively 9.5%) such as pulmonary embolism or an underlying malignancy. Significant predictors in the multivariate model were age (P < .0001), non-Saudi nationality (OR 1.8, 95% CI 1.1 to 2.9; P=.019), and hospital length of stay (OR 1.01, 95% CI 1 to 1.004; P=.001). We observed no decline in stroke mortality in our center over the 5-year span. The establishment of stroke systems of care, use of thrombolytic agents, and opening of a stroke unit should play an important role in a decline in stroke mortality. Retrospective single center study. Mortality data were available only for patients who died in our hospital.

  11. Predicting School Enrollments Using the Modified Regression Technique.

    ERIC Educational Resources Information Center

    Grip, Richard S.; Young, John W.

    This report is based on a study in which a regression model was constructed to increase accuracy in enrollment predictions. A model, known as the Modified Regression Technique (MRT), was used to examine K-12 enrollment over the past 20 years in 2 New Jersey school districts of similar size and ethnicity. To test the model's accuracy, MRT was…

  12. Time-series panel analysis (TSPA): multivariate modeling of temporal associations in psychotherapy process.

    PubMed

    Ramseyer, Fabian; Kupper, Zeno; Caspar, Franz; Znoj, Hansjörg; Tschacher, Wolfgang

    2014-10-01

    Processes occurring in the course of psychotherapy are characterized by the simple fact that they unfold in time and that the multiple factors engaged in change processes vary highly between individuals (idiographic phenomena). Previous research, however, has neglected the temporal perspective by its traditional focus on static phenomena, which were mainly assessed at the group level (nomothetic phenomena). To support a temporal approach, the authors introduce time-series panel analysis (TSPA), a statistical methodology explicitly focusing on the quantification of temporal, session-to-session aspects of change in psychotherapy. TSPA-models are initially built at the level of individuals and are subsequently aggregated at the group level, thus allowing the exploration of prototypical models. TSPA is based on vector auto-regression (VAR), an extension of univariate auto-regression models to multivariate time-series data. The application of TSPA is demonstrated in a sample of 87 outpatient psychotherapy patients who were monitored by postsession questionnaires. Prototypical mechanisms of change were derived from the aggregation of individual multivariate models of psychotherapy process. In a 2nd step, the associations between mechanisms of change (TSPA) and pre- to postsymptom change were explored. TSPA allowed a prototypical process pattern to be identified, where patient's alliance and self-efficacy were linked by a temporal feedback-loop. Furthermore, therapist's stability over time in both mastery and clarification interventions was positively associated with better outcomes. TSPA is a statistical tool that sheds new light on temporal mechanisms of change. Through this approach, clinicians may gain insight into prototypical patterns of change in psychotherapy. PsycINFO Database Record (c) 2014 APA, all rights reserved.
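
The vector auto-regression underlying TSPA can be sketched for the simplest case: a first-order, two-variable model x_t ≈ A·x_{t-1}, fitted by least squares. This is a minimal sketch under stated assumptions (centered series, no intercept, lag 1, two variables); the function names are hypothetical and this is not the authors' implementation.

```python
def simulate_var1(a, x0, steps):
    """Generate a noise-free bivariate series x_t = A x_{t-1}."""
    series = [list(x0)]
    for _ in range(steps):
        x = series[-1]
        series.append([a[0][0] * x[0] + a[0][1] * x[1],
                       a[1][0] * x[0] + a[1][1] * x[1]])
    return series

def fit_var1(series):
    """Least-squares estimate of the 2x2 matrix A in x_t = A x_{t-1}.

    Assumes the series is already centered (no intercept term)."""
    prev, curr = series[:-1], series[1:]
    # Cross-product sums: S_cp = sum x_t x_{t-1}^T, S_pp = sum x_{t-1} x_{t-1}^T
    s_cp = [[sum(c[i] * p[j] for c, p in zip(curr, prev)) for j in range(2)]
            for i in range(2)]
    s_pp = [[sum(p[i] * p[j] for p in prev) for j in range(2)]
            for i in range(2)]
    det = s_pp[0][0] * s_pp[1][1] - s_pp[0][1] * s_pp[1][0]
    if abs(det) < 1e-12:
        raise ValueError("lagged observations are collinear; A is not identified")
    inv = [[s_pp[1][1] / det, -s_pp[0][1] / det],
           [-s_pp[1][0] / det, s_pp[0][0] / det]]
    # A = S_cp * S_pp^{-1}
    return [[sum(s_cp[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical coefficients: the off-diagonal entries are the cross-lagged
# effects (e.g. one session's variable 2 predicting the next session's variable 1).
true_a = [[0.5, 0.2], [0.1, 0.4]]
estimate = fit_var1(simulate_var1(true_a, [1.0, 0.0], 10))
print(estimate)
```

In a TSPA-style workflow one such matrix would be estimated per patient and the matrices then aggregated across patients; that aggregation step is not shown here.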

  13. Order-restricted inference for multivariate longitudinal data with applications to the natural history of hearing loss.

    PubMed

    Rosen, Sophia; Davidov, Ori

    2012-07-20

    Multivariate outcomes are often measured longitudinally. For example, in hearing loss studies, hearing thresholds for each subject are measured repeatedly over time at several frequencies. Thus, each patient is associated with a multivariate longitudinal outcome. The multivariate mixed-effects model is a useful tool for the analysis of such data. There are situations in which the parameters of the model are subject to some restrictions or constraints. For example, it is known that hearing thresholds, at every frequency, increase with age. Moreover, this age-related threshold elevation is monotone in frequency, that is, the higher the frequency, the higher, on average, is the rate of threshold elevation. This means that there is a natural ordering among the different frequencies in the rate of hearing loss. In practice, this amounts to imposing a set of constraints on the different frequencies' regression coefficients modeling the mean effect of time and age at entry to the study on hearing thresholds. The aforementioned constraints should be accounted for in the analysis. The result is a multivariate longitudinal model with restricted parameters. We propose estimation and testing procedures for such models. We show that ignoring the constraints may lead to misleading inferences regarding the direction and the magnitude of various effects. Moreover, simulations show that incorporating the constraints substantially improves the mean squared error of the estimates and the power of the tests. We used this methodology to analyze a real hearing loss study. Copyright © 2012 John Wiley & Sons, Ltd.

  14. A Multivariate Model of Parent-Adolescent Relationship Variables in Early Adolescence

    ERIC Educational Resources Information Center

    McKinney, Cliff; Renk, Kimberly

    2011-01-01

    Given the importance of predicting outcomes for early adolescents, this study examines a multivariate model of parent-adolescent relationship variables, including parenting, family environment, and conflict. Participants, who completed measures assessing these variables, included 710 culturally diverse 11-14-year-olds who were attending a middle…

  15. Personal, social, and game-related correlates of active and non-active gaming among Dutch gaming adolescents: survey-based multivariable, multilevel logistic regression analyses.

    PubMed

    Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-04-04

    Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games, active games, seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<.001), a less positive attitude toward non-active games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008), having brothers/sisters (OR 6.7, CI 2.6-17.1; P<.001) and friends (OR 3.4, CI 1.4-8.4; P=.009) who spend more time on active gaming, and a slightly lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<.001), having friends who spend more time on non-active gaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2, CI 1.07-3.75; P=.03). Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most strongly (negatively) associated with attitude with

  16. Partial Least Squares Regression Models for the Analysis of Kinase Signaling.

    PubMed

    Bourgeois, Danielle L; Kreeger, Pamela K

    2017-01-01

    Partial least squares regression (PLSR) is a data-driven modeling approach that can be used to analyze multivariate relationships between kinase networks and cellular decisions or patient outcomes. In PLSR, a linear model relating an X matrix of independent variables and a Y matrix of dependent variables is generated by extracting the factors with the strongest covariation. While the identified relationship is correlative, PLSR models can be used to generate quantitative predictions for new conditions or perturbations to the network, allowing for mechanisms to be identified. This chapter will provide a brief explanation of PLSR and provide an instructive example to demonstrate the use of PLSR to analyze kinase signaling.
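
The idea of extracting the factor with the strongest covariation can be sketched for the smallest case: one latent component and a single response (often called PLS1). This is a minimal sketch, not the chapter's worked example; the function name and the toy data are assumptions, and a practical analysis would extract several components and cross-validate their number.

```python
import math

def pls1_one_component(X, y):
    """Fit a one-component PLS regression of response y on predictor rows X.

    Returns (coefficients, intercept) on the original, uncentered scale."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in predictor space with maximal covariance with y
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each sample onto the weight vector
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the response on the scores
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    coef = [q * wj for wj in w]
    intercept = y_mean - sum(coef[j] * x_mean[j] for j in range(p))
    return coef, intercept

# Toy data with two perfectly collinear predictors, a case where ordinary
# least squares has no unique solution but the PLS component is well defined.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y = [2.0, 4.0, 6.0]
coef, intercept = pls1_one_component(X, y)
print(coef, intercept)
```

Projecting onto a shared latent direction is what lets PLSR cope with correlated measurements, which is typical of signaling datasets where many readouts move together.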

  17. Blended learning in situated contexts: 3-year evaluation of an online peer review project.

    PubMed

    Bridges, S; Chang, J W W; Chu, C H; Gardner, K

    2014-08-01

    Situated and sociocultural perspectives on learning indicate that the design of complex tasks supported by educational technologies holds potential for dental education in moving novices towards closer approximation of the clinical outcomes of their expert mentors. A cross-faculty, student-centred, web-based project in operative dentistry was established within the Universitas 21 (U21) network of higher education institutions to support university goals for internationalisation in clinical learning by enabling distributed interactions across sites and institutions. This paper aims to present evaluation of one dental faculty's project experience of curriculum redesign for deeper student learning. A mixed-method case study approach was utilised. Three cohorts of second-year students from a 5-year bachelor of dental surgery (BDS) programme were invited to participate in annual surveys and focus group interviews on project completion. Survey data were analysed for differences between years using multivariate logistic regression analysis. Thematic analysis of questionnaire open responses and interview transcripts was conducted. Multivariate logistic regression analysis noted significant differences across items over time indicating learning improvements, attainment of university aims and the positive influence of redesign. Students perceived the enquiry-based project as stimulating and motivating, and building confidence in operative techniques. Institutional goals for greater understanding of others and lifelong learning showed improvement over time. Despite positive scores, students indicated global citizenship and intercultural understanding were conceptually challenging. Establishment of online student learning communities through a blended approach to learning stimulated motivation and intellectual engagement, thereby supporting a situated approach to cognition. Sociocultural perspectives indicate that novice-expert interactions supported student development of

  18. Dietary patterns derived by reduced rank regression (RRR) and depressive symptoms in Japanese employees: The Furukawa nutrition and health study.

    PubMed

    Miki, Takako; Kochi, Takeshi; Kuwahara, Keisuke; Eguchi, Masafumi; Kurotani, Kayo; Tsuruoka, Hiroko; Ito, Rie; Kabe, Isamu; Kawakami, Norito; Mizoue, Tetsuya; Nanri, Akiko

    2015-09-30

    Depression has been linked to the overall diet using both exploratory and pre-defined methods. However, neither of these methods incorporates specific knowledge on nutrient-disease associations. The aim of the present study was to empirically identify dietary patterns using reduced rank regression and to examine their relations to depressive symptoms. Participants were 2006 Japanese employees aged 19-69 years. Depressive symptoms were assessed using the Center for Epidemiologic Studies Depression Scale. Diet was assessed using a validated, self-administered diet history questionnaire. Dietary patterns were extracted by reduced rank regression with 6 depression-related nutrients as response variables. Logistic regression was used to estimate odds ratios of depressive symptoms adjusted for potential confounders. A dietary pattern characterized by a high intake of vegetables, mushrooms, seaweeds, soybean products, green tea, potatoes, fruits, and small fish with bones and a low intake of rice was associated with fewer depressive symptoms. The multivariable-adjusted odds ratios of having depressive symptoms were 0.62 (95% confidence interval, 0.48-0.81) in the highest versus lowest tertiles of dietary score. Results suggest that adherence to a diet rich in vegetables, fruits, and typical Japanese foods, including mushrooms, seaweeds, soybean products, and green tea, is associated with a lower probability of having depressive symptoms. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  19. Predictive factors for rebleeding and death in alcoholic cirrhotic patients with acute variceal bleeding: a multivariate analysis.

    PubMed

    Krige, Jake E J; Kotze, Urda K; Distiller, Greg; Shaw, John M; Bornman, Philippus C

    2009-10-01

    Bleeding from esophageal varices is a leading cause of death in alcoholic cirrhotic patients. The aim of the present single-center study was to identify risk factors predictive of variceal rebleeding and death within 6 weeks of initial treatment. Univariate and multivariate analyses were performed on 310 prospectively documented alcoholic cirrhotic patients with acute variceal hemorrhage (AVH) who underwent 786 endoscopic variceal injection treatments between January 1984 and December 2006. All injections were administered during the first 6 weeks after the patients were treated for their first variceal bleed. Seventy-five (24.2%) patients experienced a rebleed, 38 within 5 days of the initial treatment and 37 within 6 weeks of their initial treatment. Of the 15 variables studied and included in a multivariate analysis using a logistic regression model, a bilirubin level >51 mmol/l and transfusion of >6 units of blood during the initial hospital admission were predictors of variceal rebleeding within the first 6 weeks. Seventy-seven (24.8%) patients died, 29 (9.3%) within 5 days and 48 (15.4%) between 6 and 42 days after the initial treatment. Stepwise multivariate logistic regression analysis showed that six variables were predictors of death within the first 6 weeks: encephalopathy, ascites, bilirubin level >51 mmol/l, international normalized ratio (INR) >2.3, albumin <25 g/l, and the need for balloon tube tamponade. Survival was influenced by the severity of liver failure, with most deaths occurring in Child-Pugh grade C patients. Patients with AVH and encephalopathy, ascites, bilirubin levels >51 mmol/l, INR >2.3, albumin <25 g/l and who require balloon tube tamponade are at increased risk of dying within the first 6 weeks. Bilirubin levels >51 mmol/l and transfusion of >6 units of blood were predictors of variceal rebleeding.

  20. Multivariate Analysis As a Support for Diagnostic Flowcharts in Allergic Bronchopulmonary Aspergillosis: A Proof-of-Concept Study.

    PubMed

    Vitte, Joana; Ranque, Stéphane; Carsin, Ania; Gomez, Carine; Romain, Thomas; Cassagne, Carole; Gouitaa, Marion; Baravalle-Einaudi, Mélisande; Bel, Nathalie Stremler-Le; Reynaud-Gaubert, Martine; Dubus, Jean-Christophe; Mège, Jean-Louis; Gaudart, Jean

    2017-01-01

    Molecular-based allergy diagnosis yields multiple biomarker datasets. The classical diagnostic score for allergic bronchopulmonary aspergillosis (ABPA), a severe disease usually occurring in asthmatic patients and people with cystic fibrosis, comprises succinct immunological criteria formulated in 1977: total IgE, anti-Aspergillus fumigatus (Af) IgE, anti-Af "precipitins," and anti-Af IgG. Progress achieved over the last four decades led to multiple IgE and IgG(4) Af biomarkers available with quantitative, standardized, molecular-level reports. These newly available biomarkers have not been included in the current diagnostic criteria, either individually or in algorithms, despite persistent underdiagnosis of ABPA. Large numbers of individual biomarkers may hinder their use in clinical practice. Conversely, multivariate analysis using new tools may bring about a better chance of fewer diagnostic mistakes. We report here a proof-of-concept work consisting of a three-step multivariate analysis of Af IgE, IgG, and IgG4 biomarkers through a combination of principal component analysis, hierarchical ascendant classification, and classification and regression tree multivariate analysis. The resulting diagnostic algorithms might show the way for novel criteria and improved diagnostic efficiency in Af-sensitized patients at risk for ABPA.

  1. Square Root Graphical Models: Multivariate Generalizations of Univariate Exponential Families that Permit Positive Dependencies

    PubMed Central

    Inouye, David I.; Ravikumar, Pradeep; Dhillon, Inderjit S.

    2016-01-01

    We develop Square Root Graphical Models (SQR), a novel class of parametric graphical models that provides multivariate generalizations of univariate exponential family distributions. Previous multivariate graphical models (Yang et al., 2015) did not allow positive dependencies for the exponential and Poisson generalizations. However, in many real-world datasets, variables clearly have positive dependencies. For example, the airport delay time in New York—modeled as an exponential distribution—is positively related to the delay time in Boston. With this motivation, we give an example of our model class derived from the univariate exponential distribution that allows for almost arbitrary positive and negative dependencies with only a mild condition on the parameter matrix—a condition akin to the positive definiteness of the Gaussian covariance matrix. Our Poisson generalization allows for both positive and negative dependencies without any constraints on the parameter values. We also develop parameter estimation methods using node-wise regressions with ℓ1 regularization and likelihood approximation methods using sampling. Finally, we demonstrate our exponential generalization on a synthetic dataset and a real-world dataset of airport delay times. PMID:27563373

  2. Searching for New Biomarkers and the Use of Multivariate Analysis in Gastric Cancer Diagnostics.

    PubMed

    Kucera, Radek; Smid, David; Topolcan, Ondrej; Karlikova, Marie; Fiala, Ondrej; Slouka, David; Skalicky, Tomas; Treska, Vladislav; Kulda, Vlastimil; Simanek, Vaclav; Safanda, Martin; Pesta, Martin

    2016-04-01

    The first aim of this study was to search for new biomarkers to be used in gastric cancer diagnostics. The second aim was to verify the findings presented in literature on a sample of the local population and investigate the risk of gastric cancer in that population using a multivariate statistical analysis. We assessed a group of 36 patients with gastric cancer and 69 healthy individuals. We determined carcinoembryonic antigen, cancer antigen 19-9, cancer antigen 72-4, matrix metalloproteinases (-1, -2, -7, -8 and -9), osteoprotegerin, osteopontin, prothrombin induced by vitamin K absence-II, pepsinogen I, pepsinogen II, gastrin and Helicobacter pylori for each sample. The multivariate stepwise logistic regression identified the following biomarkers as the best gastric cancer predictors: CEA, CA72-4, pepsinogen I, Helicobacter pylori presence and MMP7. CEA and CA72-4 remain the best markers for gastric cancer diagnostics. We suggest a mathematical model for the assessment of risk of gastric cancer. Copyright © 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.

  3. ARMS2 variants may predict the 3-year outcome of photodynamic therapy for wet age-related macular degeneration

    PubMed Central

    Nakai, Shunichiro; Matsumiya, Wataru; Miki, Akiko; Nakamura, Makoto

    2017-01-01

    Purpose: To determine the association of age-related maculopathy susceptibility 2 (ARMS2) gene polymorphisms with the 3-year outcomes of photodynamic therapy (PDT) in wet age-related macular degeneration (wet AMD). Methods: The single nucleotide polymorphism (SNP) at rs10490924 in the ARMS2 gene of 65 patients with wet AMD who underwent PDT was genotyped using the TaqMan assay. The clinical characteristics and the outcomes of PDT were compared among the three genotypes at rs10490924. A multivariate regression analysis was performed to evaluate the influence of the clinical cofactors on the association of rs10490924 with the visual outcome at 36 months after the first PDT. Results: A significant difference was found among the genotypes in the age and the baseline lesion size. The patients with the GG genotype showed a significant improvement in vision, and the patients with the TT genotype showed a significant worsening of vision at all time points measured after the initial PDT. In the multivariate regression analysis, the number of the G allele at rs10490924 was associated with a significantly greater improvement in the baseline best-corrected visual acuity (BCVA) at 36 months after the first PDT. Conclusions: ARMS2 variants are likely associated with the 3-year outcomes of PDT in patients with wet AMD. PMID:28761324

  4. Multivariate Radiological-Based Models for the Prediction of Future Knee Pain: Data from the OAI

    PubMed Central

    Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Treviño, Victor; Tamez-Peña, José G.

    2015-01-01

    In this work, the potential of X-ray based multivariate prognostic models to predict the onset of chronic knee pain is presented. Using quantitative X-ray image assessments of joint-space-width (JSW) and paired semiquantitative central X-ray scores from the Osteoarthritis Initiative (OAI), a case-control study is presented. The pain assessments of the right knee at the baseline and the 60-month visits were used to screen for case/control subjects. Scores were analyzed at the time of pain incidence (T-0), the year prior to incidence (T-1), and two years before pain incidence (T-2). Multivariate models were created by a cross-validated elastic-net regularized generalized linear models feature selection tool. Univariate differences between cases and controls were reported by AUC, C-statistics, and odds ratios. Univariate analysis indicated that the medial osteophytes were significantly more prevalent in cases than controls: C-stat 0.62, 0.62, and 0.61, at T-0, T-1, and T-2, respectively. The multivariate JSW models significantly predicted pain: AUC = 0.695, 0.623, and 0.620, at T-0, T-1, and T-2, respectively. Semiquantitative multivariate models predicted pain with C-stat = 0.671, 0.648, and 0.645 at T-0, T-1, and T-2, respectively. Multivariate models derived from plain X-ray radiography assessments may be used to predict subjects that are at risk of developing knee pain. PMID:26504490
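
For readers unfamiliar with the AUC and C-statistic values quoted above: for a binary outcome both equal the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the scores are made up for illustration.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0   # case correctly ranked above control
            elif case == control:
                wins += 0.5   # ties count as half
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical model scores for subjects who did and did not develop pain.
print(auc([0.71, 0.64, 0.58, 0.40], [0.62, 0.45, 0.39, 0.30]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which puts the reported range of roughly 0.62 to 0.70 in context.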

  5. Evaluating differential effects using regression interactions and regression mixture models

    PubMed Central

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing their results with those obtained using an interaction term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903

  6. Sexual initiation and emotional/behavioral problems in Taiwanese adolescents: a multivariate response profile analysis.

    PubMed

    Chan, Chia-Hua; Ting, Te-Tien; Chen, Yen-Tyng; Chen, Chuan-Yu; Chen, Wei J

    2015-04-01

    This study aimed to investigate the relations of adolescent sexual experiences (particularly early initiation) to a spectrum of emotional/behavioral problems and to probe possible gender difference in such relationships. The 10th (N = 8,842) and 12th (N = 10,083) grade students, aged 16-19 years, participating in national surveys in 2005 and 2006 in Taiwan were included for this study. A self-administered web-based questionnaire was designed to collect information on sociodemographic characteristics, sexual experience, substance use, and the Youth Self-Report Form. For the sexually experienced adolescents, their sexual initiation was classified as early initiation (<16 years) or non-early initiation (16-19 years). Gender-specific multivariate response profile regression was used to examine the relationship between sexual experience and the behavioral syndromes. Externalizing problems, including Rule-breaking Behavior and Aggressive Behavior, were strongly associated with sexual initiation in adolescence; the magnitude of the association increased for earlier sexual initiation, especially for females. As to internalizing problems, the connection was rather heterogeneous. The scores on some syndromes, such as Somatic Complaints and Anxious/Depressed, were higher only for females with early or non-early sexual initiation whereas the score on Withdrawn, along with Social Problems that is neither internalizing nor externalizing, was lower for the sexually experienced adolescents than for the sexually inexperienced ones. We concluded that earlier sexual initiation was associated with a wider range of behavioral problems in adolescents for both genders, yet the increased risk with emotional problems was predominately found in females.

  7. Patterns and Predictors of Language and Literacy Abilities 4-10 Years in the Longitudinal Study of Australian Children

    PubMed Central

    Zubrick, Stephen R.; Taylor, Catherine L.; Christensen, Daniel

    2015-01-01

    Aims: Oral language is the foundation of literacy. Naturally, policies and practices to promote children’s literacy begin in early childhood and have a strong focus on developing children’s oral language, especially for children with known risk factors for low language ability. The underlying assumption is that children’s progress along the oral to literate continuum is stable and predictable, such that low language ability foretells low literacy ability. This study investigated patterns and predictors of children’s oral language and literacy abilities at 4, 6, 8 and 10 years. The study sample comprised 2,316 to 2,792 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Six developmental patterns were observed, a stable middle-high pattern, a stable low pattern, an improving pattern, a declining pattern, a fluctuating low pattern, and a fluctuating middle-high pattern. Most children (69%) fit a stable middle-high pattern. By contrast, less than 1% of children fit a stable low pattern. These results challenged the view that children’s progress along the oral to literate continuum is stable and predictable. Findings: Multivariate logistic regression was used to investigate risks for low literacy ability at 10 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. Predictors were modelled as risk variables with the lowest level of risk as the reference category. In the multivariate model, substantial risks for low literacy ability at 10 years, in order of descending magnitude, were: low school readiness, Aboriginal and/or Torres Strait Islander status and low language ability at 8 years. Moderate risks were high temperamental reactivity, low language ability at 4 years, and low language ability at 6 years. The following risk factors were not statistically significant in the multivariate model: Low maternal consistency, low family income, health care card

  8. Patterns and Predictors of Language and Literacy Abilities 4-10 Years in the Longitudinal Study of Australian Children.

    PubMed

    Zubrick, Stephen R; Taylor, Catherine L; Christensen, Daniel

    2015-01-01

    Oral language is the foundation of literacy. Naturally, policies and practices to promote children's literacy begin in early childhood and have a strong focus on developing children's oral language, especially for children with known risk factors for low language ability. The underlying assumption is that children's progress along the oral to literate continuum is stable and predictable, such that low language ability foretells low literacy ability. This study investigated patterns and predictors of children's oral language and literacy abilities at 4, 6, 8 and 10 years. The study sample comprised 2,316 to 2,792 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Six developmental patterns were observed, a stable middle-high pattern, a stable low pattern, an improving pattern, a declining pattern, a fluctuating low pattern, and a fluctuating middle-high pattern. Most children (69%) fit a stable middle-high pattern. By contrast, less than 1% of children fit a stable low pattern. These results challenged the view that children's progress along the oral to literate continuum is stable and predictable. Multivariate logistic regression was used to investigate risks for low literacy ability at 10 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. Predictors were modelled as risk variables with the lowest level of risk as the reference category. In the multivariate model, substantial risks for low literacy ability at 10 years, in order of descending magnitude, were: low school readiness, Aboriginal and/or Torres Strait Islander status and low language ability at 8 years. Moderate risks were high temperamental reactivity, low language ability at 4 years, and low language ability at 6 years. The following risk factors were not statistically significant in the multivariate model: Low maternal consistency, low family income, health care card, child not read to at home

  9. Application and validation of Cox regression models in a single-center series of double kidney transplantation.

    PubMed

    Santori, G; Fontana, I; Bertocchi, M; Gasloli, G; Magoni Rossi, A; Tagliamacco, A; Barocci, S; Nocera, A; Valente, U

    2010-05-01

    A useful approach to reduce the number of discarded marginal kidneys and to increase the nephron mass is double kidney transplantation (DKT). In this study, we retrospectively evaluated the potential predictors for patient and graft survival in a single-center series of 59 DKT procedures performed between April 21, 1999, and September 21, 2008. The kidney recipients of mean age 63.27 +/- 5.17 years included 16 women (27%) and 43 men (73%). The donors of mean age 69.54 +/- 7.48 years included 32 women (54%) and 27 men (46%). The mean posttransplant dialysis time was 2.37 +/- 3.61 days. The mean hospitalization was 20.12 +/- 13.65 days. Average serum creatinine (SCr) at discharge was 1.5 +/- 0.59 mg/dL. In view of the limited numbers of recipient deaths (n = 4) and graft losses (n = 8) that occurred in our series, the proportional hazards assumption for each Cox regression model with P < .05 was tested by using correlation coefficients between transformed survival times and scaled Schoenfeld residuals, and checked with smoothed plots of Schoenfeld residuals. For patient survival, the variables that reached statistical significance were donor SCr (P = .007), donor creatinine clearance (P = .023), and recipient age (P = .047). Each significant model passed the Schoenfeld test. By entering these variables into a multivariate Cox model for patient survival, no further significance was observed. In the univariate Cox models performed for graft survival, statistical significance was noted for donor SCr (P = .027), SCr 3 months post-DKT (P = .043), and SCr 6 months post-DKT (P = .017). All significant univariate models for graft survival passed the Schoenfeld test. A final multivariate model retained SCr at 6 months (beta = 1.746, P = .042) and donor SCr (beta = .767, P = .090). In our analysis, SCr at 6 months seemed to emerge from both univariate and multivariate Cox models as a potential predictor of graft survival among DKT. Multicenter studies with larger recipient

  10. Modelling infant mortality rate in Central Java, Indonesia use generalized poisson regression method

    NASA Astrophysics Data System (ADS)

    Prahutama, Alan; Sudarno

    2018-05-01

    The infant mortality rate is the number of deaths under one year of age occurring among the live births in a given geographical area during a given year, per 1,000 live births occurring among the population of the given geographical area during the same year. This problem needs to be addressed because it is an important element of a country's economic development. A high infant mortality rate will disrupt the stability of a country as it relates to the sustainability of the population in the country. One regression model that can be used to analyze the relationship between a discrete dependent variable Y and an independent variable X is the Poisson regression model. Regression models recently used for data with a discrete dependent variable include, among others, Poisson regression, negative binomial regression and generalized Poisson regression. In this research, generalized Poisson regression modeling gives a better AIC value than Poisson regression. The most significant variable is the number of health facilities (X1), while the variable that has the most influence on the infant mortality rate is the average breastfeeding (X9).
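
    The AIC comparison mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: it assumes intercept-only fits (no covariates), Consul's form of the generalized Poisson distribution, a crude grid search over the dispersion parameter, and made-up counts. The function names are hypothetical.

```python
import math


def poisson_loglik(counts, mu):
    """Log-likelihood of i.i.d. Poisson(mu) counts."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)


def generalized_poisson_loglik(counts, theta, lam):
    """Log-likelihood under Consul's generalized Poisson distribution:
    P(Y = y) = theta * (theta + lam*y)**(y - 1) * exp(-theta - lam*y) / y!
    with theta > 0 and 0 <= lam < 1. lam = 0 gives the ordinary Poisson.
    """
    return sum(
        math.log(theta)
        + (y - 1) * math.log(theta + lam * y)
        - theta
        - lam * y
        - math.lgamma(y + 1)
        for y in counts
    )


def aic(loglik, n_params):
    """Akaike information criterion: smaller is better."""
    return 2 * n_params - 2 * loglik


def compare_models(counts, grid_steps=95):
    """Return (AIC of Poisson, AIC of generalized Poisson) for intercept-only fits.

    Assumption: for each lam on the grid, theta is set so that the model
    mean theta / (1 - lam) equals the sample mean.
    """
    mean = sum(counts) / len(counts)
    ll_poisson = poisson_loglik(counts, mean)
    ll_gp = max(
        generalized_poisson_loglik(counts, mean * (1 - k / 100), k / 100)
        for k in range(grid_steps)
    )
    return aic(ll_poisson, 1), aic(ll_gp, 2)


# Hypothetical overdispersed counts (variance much larger than the mean).
example_counts = [0, 0, 0, 1, 1, 2, 3, 5, 8, 15]
aic_poisson, aic_gp = compare_models(example_counts)
print(f"AIC Poisson: {aic_poisson:.2f}, AIC generalized Poisson: {aic_gp:.2f}")
```

    With overdispersed counts such as these, the extra dispersion parameter typically more than pays for its AIC penalty, which is the kind of result the abstract reports.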

  11. Biological and Sociocultural Factors During the School Years Predicting Women's Lifetime Educational Attainment.

    PubMed

    Hendrick, C Emily; Cohen, Alison K; Deardorff, Julianna; Cance, Jessica D

    2016-03-01

    Lifetime educational attainment is an important predictor of health and well-being for women in the United States. In this study, we examine the roles of sociocultural factors in youth and an understudied biological life event, pubertal timing, in predicting women's lifetime educational attainment. Using data from the National Longitudinal Survey of Youth 1997 cohort (N = 3889), we conducted sequential multivariate linear regression analyses to investigate the influences of macro-level and family-level sociocultural contextual factors in youth (region of country, urbanicity, race/ethnicity, year of birth, household composition, mother's education, and mother's age at first birth) and early menarche, a marker of early pubertal development, on women's educational attainment after age 24. Pubertal timing and all sociocultural factors in youth, other than year of birth, predicted women's lifetime educational attainment in bivariate models. Family factors had the strongest associations. When family factors were added to multivariate models, geographic region in youth, and pubertal timing were no longer significant. Our findings provide additional evidence that family factors should be considered when developing comprehensive and inclusive interventions in childhood and adolescence to promote lifetime educational attainment among girls. © 2016, American School Health Association.

  12. Deconstructing multivariate decoding for the study of brain function.

    PubMed

    Hebart, Martin N; Baker, Chris I

    2017-08-04

    Multivariate decoding methods were developed originally as tools to enable accurate predictions in real-world applications. The realization that these methods can also be employed to study brain function has led to their widespread adoption in the neurosciences. However, prior to the rise of multivariate decoding, the study of brain function was firmly embedded in a statistical philosophy grounded on univariate methods of data analysis. In this way, multivariate decoding for brain interpretation grew out of two established frameworks: multivariate decoding for predictions in real-world applications, and classical univariate analysis based on the study and interpretation of brain activation. We argue that this led to two confusions, one reflecting a mixture of multivariate decoding for prediction or interpretation, and the other a mixture of the conceptual and statistical philosophies underlying multivariate decoding and classical univariate analysis. Here we attempt to systematically disambiguate multivariate decoding for the study of brain function from the frameworks it grew out of. After elaborating these confusions and their consequences, we describe six, often unappreciated, differences between classical univariate analysis and multivariate decoding. We then focus on how the common interpretation of what is signal and noise changes in multivariate decoding. Finally, we use four examples to illustrate where these confusions may impact the interpretation of neuroimaging data. We conclude with a discussion of potential strategies to help resolve these confusions in interpreting multivariate decoding results, including the potential departure from multivariate decoding methods for the study of brain function. Copyright © 2017. Published by Elsevier Inc.

  13. Multivariate calibration on NIR data: development of a model for the rapid evaluation of ethanol content in bakery products.

    PubMed

    Bello, Alessandra; Bianchi, Federica; Careri, Maria; Giannetto, Marco; Mori, Giovanni; Musci, Marilena

    2007-11-05

    A new NIR method based on multivariate calibration for determination of ethanol in industrially packed wholemeal bread was developed and validated. GC-FID was used as the reference method for the determination of the actual ethanol concentration of different samples of wholemeal bread with proper content of added ethanol, ranging from 0 to 3.5% (w/w). Stepwise discriminant analysis was carried out on the NIR dataset, in order to reduce the number of original variables by selecting those that were able to discriminate between the samples of different ethanol concentrations. With the variables so selected, a multivariate calibration model was then obtained by multiple linear regression. The prediction power of the linear model was optimized by a new "leave one out" method, so that the number of original variables was further reduced.

  14. Identifying Pleiotropic Genes in Genome-Wide Association Studies for Multivariate Phenotypes with Mixed Measurement Scales

    PubMed Central

    Williams, L. Keoki; Buu, Anne

    2017-01-01

    We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of the Fisher’s combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains the power at the level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches–dichotomizing all observed phenotypes or treating them as continuous variables–could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that may not be, otherwise, chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206

  15. Weight, socio-demographics, and health behaviour related correlates of academic performance in first year university students.

    PubMed

    Deliens, Tom; Clarys, Peter; De Bourdeaudhuij, Ilse; Deforche, Benedicte

    2013-12-17

    This study aimed to examine differences in socio-demographics and health behaviour between Belgian first year university students who attended all final course exams and those who did not. Secondly, this study aimed to identify weight and health behaviour related correlates of academic performance in those students who attended all course exams. Anthropometrics of 101 first year university students were measured at both the beginning of the first (T1) and second (T2) semester of the academic year. An on-line health behaviour questionnaire was filled out at T2. As a measure of academic performance student end-of-year Grade Point Averages (GPA) were obtained from the university's registration office. Independent samples t-tests and chi-square tests were executed to compare students who attended all course exams during the first year of university and students who did not carry through. Uni- and multivariate linear regression analyses were conducted to identify correlates of academic performance in students who attended all course exams during the first year of university. Students who did not attend all course exams were predominantly male, showed higher increases in waist circumference during the first semester and consumed more French fries than those who attended all final course exams. Being male, lower secondary school grades, increases in weight, Body Mass Index and waist circumference over the first semester, more gaming on weekdays, being on a diet, eating at the student restaurant more frequently, higher soda and French fries consumption, and higher frequency of alcohol use predicted lower GPAs in first year university students. When controlled for each other, being on a diet and higher frequency of alcohol use remained significant in the multivariate regression model, with frequency of alcohol use being the strongest correlate of GPA. This study, conducted in Belgian first year university students, showed that academic performance is associated with a wide range

  16. Multivariate missing data in hydrology - Review and applications

    NASA Astrophysics Data System (ADS)

    Ben Aissia, Mohamed-Aymen; Chebana, Fateh; Ouarda, Taha B. M. J.

    2017-12-01

    Water resources planning and management require complete data sets of a number of hydrological variables, such as flood peaks and volumes. However, hydrologists are often faced with the problem of missing data (MD) in hydrological databases. Several methods are used to deal with the imputation of MD. During the last decade, multivariate approaches have gained popularity in the field of hydrology, especially in hydrological frequency analysis (HFA). However, treating the MD remains neglected in the multivariate HFA literature whereas the focus has been mainly on the modeling component. For a complete analysis and in order to optimize the use of data, MD should also be treated in the multivariate setting prior to modeling and inference. Imputation of MD in the multivariate hydrological framework can have direct implications on the quality of the estimation. Indeed, the dependence between the series represents important additional information that can be included in the imputation process. The objective of the present paper is to highlight the importance of treating MD in multivariate hydrological frequency analysis by reviewing and applying multivariate imputation methods and by comparing univariate and multivariate imputation methods. An application is carried out for multiple flood attributes on three sites in order to evaluate the performance of the different methods based on the leave-one-out procedure. The results indicate that the performance of imputation methods can be improved by adopting the multivariate setting, compared to mean substitution and interpolation methods, especially when using the copula-based approach.

  17. Multivariate Regression Analysis of Winter Ozone Events in the Uinta Basin of Eastern Utah, USA

    NASA Astrophysics Data System (ADS)

    Mansfield, M. L.

    2012-12-01

    I report on a regression analysis of a number of variables that are involved in the formation of winter ozone in the Uinta Basin of Eastern Utah. One goal of the analysis is to develop a mathematical model capable of predicting the daily maximum ozone concentration from values of a number of independent variables. The dependent variable is the daily maximum ozone concentration at a particular site in the basin. Independent variables are (1) daily lapse rate, (2) daily "basin temperature" (defined below), (3) snow cover, (4) midday solar zenith angle, (5) monthly oil production, (6) monthly gas production, and (7) the number of days since the beginning of a multi-day inversion event. Daily maximum temperature and daily snow cover data are available at ten or fifteen different sites throughout the basin. The daily lapse rate is defined operationally as the slope of the linear least-squares fit to the temperature-altitude plot, and the "basin temperature" is defined as the value assumed by the same least-squares line at an altitude of 1400 m. A multi-day inversion event is defined as a set of consecutive days for which the lapse rate remains positive. The standard deviation in the accuracy of the model is about 10 ppb. The model has been combined with historical climate and oil & gas production data to estimate historical ozone levels.
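
    The operational definitions in the abstract above (lapse rate as a least-squares slope, "basin temperature" as the fitted value at 1400 m, and a day counter for inversion events) can be sketched as follows. This is an illustration only: it assumes temperature is regressed on altitude, that the day counter starts at 1 on the first day with a positive lapse rate, and the station values are invented. All function names are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lapse_rate_and_basin_temperature(altitudes_m, temperatures_c, reference_altitude_m=1400.0):
    """Fit temperature against altitude for one day.

    Returns (lapse_rate, basin_temperature): the slope of the fitted line and
    its value at the reference altitude (1400 m in the abstract).
    """
    slope, intercept = least_squares_line(altitudes_m, temperatures_c)
    return slope, slope * reference_altitude_m + intercept


def inversion_day_counts(daily_lapse_rates):
    """For each day, count the consecutive days (including that day) with a
    positive lapse rate; days outside an inversion event get 0.
    The counting convention is an assumption, not taken from the abstract.
    """
    counts = []
    run_length = 0
    for rate in daily_lapse_rates:
        run_length = run_length + 1 if rate > 0 else 0
        counts.append(run_length)
    return counts


# Hypothetical station data for one day: temperature rises with altitude (an inversion).
altitudes = [1400.0, 1600.0, 1800.0]
temperatures = [-10.0, -8.0, -6.0]
rate, basin_temp = lapse_rate_and_basin_temperature(altitudes, temperatures)
print(rate, basin_temp)
print(inversion_day_counts([0.01, 0.02, -0.005, 0.003]))
```

    The two fitted quantities and the day counter would then enter the regression as three of the seven independent variables listed in the abstract.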

  18. The effects of driving age, driver education, and curfew laws on traffic fatalities of 15-17 year olds.

    PubMed

    Levy, D T

    1988-12-01

    This study examines the effect of state driving age, learning permit, driver's education, and curfew laws on 15-17-year-old driver fatality rates. A multivariate regression model is estimated for 47 states and nine years. The minimum legal driving age and curfew laws are found to be important determinants of fatalities. Driver's education and learning permits have smaller effects. The relationship between rates of licensure and driving age, education, and curfew laws is also examined. In each case, a more restrictive policy is found to reduce licensure of 15-17 year olds. The results suggest that the imposition of curfew laws and higher minimum driving ages are particularly effective traffic safety policies.

  19. Regional regression models of watershed suspended-sediment discharge for the eastern United States

    NASA Astrophysics Data System (ADS)

    Roman, David C.; Vogel, Richard M.; Schwarz, Gregory E.

    2012-11-01

    Estimates of mean annual watershed sediment discharge, derived from long-term measurements of suspended-sediment concentration and streamflow, often are not available at locations of interest. The goal of this study was to develop multivariate regression models to enable prediction of mean annual suspended-sediment discharge from available basin characteristics useful for most ungaged river locations in the eastern United States. The models are based on long-term mean sediment discharge estimates and explanatory variables obtained from a combined dataset of 1201 US Geological Survey (USGS) stations derived from a SPAtially Referenced Regression on Watershed attributes (SPARROW) study and the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES) database. The resulting regional regression models summarized for major US water resources regions 1-8, exhibited prediction R2 values ranging from 76.9% to 92.7% and corresponding average model prediction errors ranging from 56.5% to 124.3%. Results from cross-validation experiments suggest that a majority of the models will perform similarly to calibration runs. The 36-parameter regional regression models also outperformed a 16-parameter national SPARROW model of suspended-sediment discharge and indicate that mean annual sediment loads in the eastern United States generally correlate with a combination of basin area, land use patterns, seasonal precipitation, soil composition, hydrologic modification, and to a lesser extent, topography.

  20. Regional regression models of watershed suspended-sediment discharge for the eastern United States

    USGS Publications Warehouse

    Roman, David C.; Vogel, Richard M.; Schwarz, Gregory E.

    2012-01-01

    Estimates of mean annual watershed sediment discharge, derived from long-term measurements of suspended-sediment concentration and streamflow, often are not available at locations of interest. The goal of this study was to develop multivariate regression models to enable prediction of mean annual suspended-sediment discharge from available basin characteristics useful for most ungaged river locations in the eastern United States. The models are based on long-term mean sediment discharge estimates and explanatory variables obtained from a combined dataset of 1201 US Geological Survey (USGS) stations derived from a SPAtially Referenced Regression on Watershed attributes (SPARROW) study and the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES) database. The resulting regional regression models summarized for major US water resources regions 1–8, exhibited prediction R2 values ranging from 76.9% to 92.7% and corresponding average model prediction errors ranging from 56.5% to 124.3%. Results from cross-validation experiments suggest that a majority of the models will perform similarly to calibration runs. The 36-parameter regional regression models also outperformed a 16-parameter national SPARROW model of suspended-sediment discharge and indicate that mean annual sediment loads in the eastern United States generally correlate with a combination of basin area, land use patterns, seasonal precipitation, soil composition, hydrologic modification, and to a lesser extent, topography.

  1. A regression-kriging model for estimation of rainfall in the Laohahe basin

    NASA Astrophysics Data System (ADS)

    Wang, Hong; Ren, Li L.; Liu, Gao H.

    2009-10-01

    This paper presents a multivariate geostatistical algorithm called regression-kriging (RK) for predicting the spatial distribution of rainfall by incorporating five topographic/geographic factors of latitude, longitude, altitude, slope and aspect. The technique is illustrated using rainfall data collected at 52 rain gauges from the Laohahe basin in northeast China during 1986-2005. Rainfall data from 44 stations were selected for modeling and the remaining 8 stations were used for model validation. To eliminate multicollinearity, the five explanatory factors were first transformed using factor analysis with three Principal Components (PCs) extracted. The rainfall data were then fitted using step-wise regression and residuals interpolated using SK. The regression coefficients were estimated by generalized least squares (GLS), which takes the spatial heteroskedasticity between rainfall and PCs into account. Finally, the rainfall prediction based on RK was compared with that predicted from ordinary kriging (OK) and ordinary least squares (OLS) multiple regression (MR). Because correlated topographic factors are taken into account, RK improves the efficiency of predictions. RK achieved a lower relative root mean square error (RMSE) (44.67%) than MR (49.23%) and OK (73.60%) and a lower bias than MR and OK (23.82 versus 30.89 and 32.15 mm) for annual rainfall. It is much more effective for the wet season than for the dry season. RK is suitable for estimation of rainfall in areas where there are no stations nearby and where topography has a major influence on rainfall.

  2. Cytologic regression in women with atypical squamous cells of unknown significance and negative human papillomavirus test.

    PubMed

    Wang, Shu; Lang, Jing He; Cheng, Xue Mei

    2009-12-01

    The aim of this study was to investigate the cytologic regression in women with atypical squamous cells of unknown significance and negative high-risk human papillomavirus test. The 45 women with atypical squamous cells of unknown significance and negative high-risk human papillomavirus at baseline were analyzed for cytologic regression during 2 years of follow-up. The cumulative rate of cytologic regression was calculated by Kaplan-Meier curves. Of 45 women, the cumulative rates were as follows: 55.6% obtained cytologic regression before 6 months, 84.4% by 1 year, and 95.6% at 2 years. Cytologic regression was not influenced by age, menopausal status, and baseline human papillomavirus load. However, the 1-year cumulative regression rate in women with previous cervical lesions was significantly lower than in those without (P=.02), and even lower in women with high-grade intraepithelial neoplasia or worse (P=.008). Most women with atypical squamous cells of unknown significance and negative high-risk human papillomavirus could obtain cytologic regression within 2 years. Women with antecedent cervical lesions need a longer time to reach this regression.
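
    The cumulative rates quoted in the abstract above come from Kaplan-Meier curves. A minimal sketch of the product-limit estimator is given below, where the "event" is cytologic regression and the cumulative rate is one minus the estimated event-free fraction. The follow-up times are invented for illustration and are not the study's data; the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate.

    times:  follow-up time for each subject
    events: True if the event was observed at that time, False if censored
    Returns a list of (time, event_free_fraction) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    event_free = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            event_free *= 1 - n_events / at_risk
            curve.append((t, event_free))
        at_risk -= n_leaving
    return curve


def cumulative_rate(curve, t):
    """Cumulative event rate by time t, i.e. 1 minus the event-free fraction."""
    event_free = 1.0
    for time, fraction in curve:
        if time > t:
            break
        event_free = fraction
    return 1 - event_free


# Hypothetical follow-up in months; False marks a censored observation.
months = [3, 5, 5, 8, 12]
regressed = [True, True, False, True, False]
curve = kaplan_meier(months, regressed)
print(curve)
print(cumulative_rate(curve, 6))
```

    Censored subjects contribute to the number at risk until they leave follow-up, which is why the estimate differs from a simple proportion of events.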

  3. Generating linear regression model to predict motor functions by use of laser range finder during TUG.

    PubMed

    Adachi, Daiki; Nishiguchi, Shu; Fukutani, Naoto; Hotta, Takayuki; Tashiro, Yuto; Morino, Saori; Shirooka, Hidehiko; Nozaki, Yuma; Hirata, Hinako; Yamaguchi, Moe; Yorozu, Ayanori; Takahashi, Masaki; Aoyama, Tomoki

    2017-05-01

    The purpose of this study was to investigate which spatial and temporal parameters of the Timed Up and Go (TUG) test are associated with motor function in elderly individuals. This study included 99 community-dwelling women aged 72.9 ± 6.3 years. Step length, step width, single support time, variability of the aforementioned parameters, gait velocity, cadence, reaction time from starting signal to first step, and minimum distance between the foot and a marker placed 3 m in front of the chair were measured using our analysis system. The 10-m walk test, five times sit-to-stand (FTSTS) test, and one-leg standing (OLS) test were used to assess motor function. Stepwise multivariate linear regression analysis was used to determine which TUG test parameters were associated with each motor function test. Finally, we calculated a predictive model for each motor function test using each regression coefficient. In stepwise linear regression analysis, step length and cadence were significantly associated with the 10-m walk test, FTSTS and OLS test. Reaction time was associated with the FTSTS test, and step width was associated with the OLS test. Each predictive model showed a strong correlation with the 10-m walk test and OLS test (P < 0.01), although this correlation was not significantly higher than that of the TUG test time. We showed which TUG test parameters were associated with each motor function test. Moreover, the TUG test time, regarded as a measure of lower extremity function and mobility, has strong predictive ability in each motor function test. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.

  4. The use of copulas to practical estimation of multivariate stochastic differential equation mixed effects models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rupšys, P.

    A system of stochastic differential equations (SDE) with mixed-effects parameters and a multivariate normal copula density function were used to develop a tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside bark diameter at breast height, and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameters SDE tree height model calculated during this research were compared to the regression tree height equations. The results are implemented in the symbolic computational language MAPLE.

  5. Multivariate Density Estimation and Remote Sensing

    NASA Technical Reports Server (NTRS)

    Scott, D. W.

    1983-01-01

    Current efforts to develop methods and computer algorithms to effectively represent multivariate data commonly encountered in remote sensing applications are described. While this may involve scatter diagrams, multivariate representations of nonparametric probability density estimates are emphasized. The density function provides a useful graphical tool for looking at data and a useful theoretical tool for classification. This approach is called a thunderstorm data analysis.

  6. Classical least squares multivariate spectral analysis

    DOEpatents

    Haaland, David M.

    2002-01-01

    An improved classical least squares multivariate spectral analysis method that adds spectral shapes describing non-calibrated components and system effects (other than baseline corrections) present in the analyzed mixture to the prediction phase of the method. These improvements decrease or eliminate many of the restrictions to the CLS-type methods and greatly extend their capabilities, accuracy, and precision. One new application of PACLS includes the ability to accurately predict unknown sample concentrations when new unmodeled spectral components are present in the unknown samples. Other applications of PACLS include the incorporation of spectrometer drift into the quantitative multivariate model and the maintenance of a calibration on a drifting spectrometer. Finally, the ability of PACLS to transfer a multivariate model between spectrometers is demonstrated.
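
    The idea in the abstract above, adding spectral shapes for non-calibrated components at the prediction step of classical least squares (CLS), can be sketched with a toy example. This is not the patented procedure: the four-point "spectra", the extra shape and the concentrations are invented, and the helper names are hypothetical. It only shows that fitting the unknown spectrum with the calibrated shapes plus the extra shape removes the bias that the extra component causes in a plain CLS fit.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def cls_predict(spectrum, shapes):
    """Least-squares coefficients c minimizing ||spectrum - sum_i c[i] * shapes[i]||.

    Solved through the normal equations (shapes shapes^T) c = shapes spectrum.
    """
    gram = [[sum(a * b for a, b in zip(si, sj)) for sj in shapes] for si in shapes]
    proj = [sum(a * b for a, b in zip(si, spectrum)) for si in shapes]
    return solve_linear_system(gram, proj)


# Hypothetical pure-component spectra from calibration (4 wavelengths each).
pure_1 = [1.0, 0.5, 0.0, 0.0]
pure_2 = [0.0, 0.5, 1.0, 0.2]
# Hypothetical shape of a component that was absent during calibration.
unmodeled_shape = [0.0, 0.2, 0.3, 1.0]

# Unknown sample: concentrations 2.0 and 3.0, plus 0.5 units of the unmodeled shape.
unknown = [2.0 * a + 3.0 * b + 0.5 * s for a, b, s in zip(pure_1, pure_2, unmodeled_shape)]

plain = cls_predict(unknown, [pure_1, pure_2])
augmented = cls_predict(unknown, [pure_1, pure_2, unmodeled_shape])

print("plain CLS:    ", plain)           # biased by the unmodeled component
print("augmented fit:", augmented[:2])   # first two entries are the concentrations
```

    In this noise-free example the augmented fit recovers the two concentrations exactly, while the plain fit absorbs part of the unmodeled shape into the second concentration.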

  7. Predictive factors for the regression of canine transmissible venereal tumor during vincristine therapy.

    PubMed

    Scarpelli, Karime C; Valladão, Maria L; Metze, Konradin

    2010-03-01

    Canine transmissible venereal tumor (CTVT) is a neoplasm transmitted by transplantation. Monochemotherapy with vincristine is considered to be effective, but treatment time until complete clinical remission may vary. The aim of this study was to determine which clinical data at diagnosis could predict the responsiveness of CTVT to vincristine chemotherapy. One hundred dogs with CTVT entered this prospective study. The animals were treated with vincristine sulfate (0.025 mg/kg) at weekly intervals until the tumor had macroscopically disappeared. The time to complete remission was recorded. A multivariate Cox regression model indicated that larger tumor mass, increased age and therapy during hot and rainy months were independent significant unfavorable predictive factors retarding remission, whereas sex, weight, status as owned dog or breed were of no predictive relevance. Further studies are necessary to investigate whether these results are due to changes in immunological response mechanisms in animals with a diminished immune surveillance, resulting in delays in tumor regression. 2008 Elsevier Ltd. All rights reserved.

  8. The severity of Minamata disease declined in 25 years: temporal profile of the neurological findings analyzed by multiple logistic regression model.

    PubMed

    Uchino, Makoto; Hirano, Teruyuki; Satoh, Hiroshi; Arimura, Kimiyoshi; Nakagawa, Masanori; Wakamiya, Jyunji

    2005-01-01

    Minamata disease (MD) was caused by ingestion of seafood from the methylmercury-contaminated areas. Although 50 years have passed since the discovery of MD, there have been only a few studies on the temporal profile of neurological findings in certified MD patients. Thus, we evaluated changes in neurological symptoms and signs of MD using discriminants by multiple logistic regression analysis. The severity of predictive index declined in 25 years in most of the patients. Only a few patients showed aggravation of neurological findings, which was due to complications such as spino-cerebellar degeneration. Patients with chronic MD aged over 45 years had several concomitant diseases so that their clinical pictures were complicated. It was difficult to differentiate chronic MD using statistically established discriminants based on sensory disturbance alone. In conclusion, the severity of MD declined in 25 years along with the modification by age-related concomitant disorders.

  9. Personal, Social, and Game-Related Correlates of Active and Non-Active Gaming Among Dutch Gaming Adolescents: Survey-Based Multivariable, Multilevel Logistic Regression Analyses

    PubMed Central

    de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-01-01

    Background Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<.001), a less positive attitude toward non-active games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; P<.001) and friends (OR 3.4, CI 1.4-8.4; P=.009) who spend more time on active gaming and a little bit lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<.001), having friends who spend more time on non-active gaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2, CI 1.07–3.75; P=.03). Conclusions Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most

  10. Nutrition knowledge, attitudes, behaviours and the influencing factors among non-parent caregivers of rural left-behind children under 7 years old in China.

    PubMed

    Tan, Cai; Luo, Jiayou; Zong, Rong; Fu, Chuhui; Zhang, Lingli; Mou, Jinsong; Duan, Danhui

    2010-10-01

    To explore and compare nutrition knowledge, attitudes and behaviours (KAB) between non-parent and parent caregivers of children under 7 years old in Chinese rural areas, and to identify the factors influencing their nutrition KAB. Face-to-face interviews were carried out with 1691 non-parent caregivers and 1670 parent caregivers in the selected study areas; multivariate logistic regression models were used to identify the factors influencing nutrition KAB in caregivers. The awareness rate of nutrition knowledge, the rate of positive attitudes and the rate of optimal behaviours in non-parent caregivers (52.2 %, 56.9 % and 37.7 %, respectively) were significantly lower than in the parent group (63.8 %, 62.1 % and 42.8 %, respectively). Multivariate logistic regression modelling showed that caregivers' family income and care will, and children's age and gender, were associated with caregivers' nutrition KAB after controlling the possible confounding variables (caregivers' age, gender, education and occupation). Non-parent caregivers had relatively poor nutrition KAB. Extra efforts and targeted education programmes aimed to improve rural non-parent caregivers' nutrition KAB are wanted and need to be emphasized.

  11. Synthesis of linear regression coefficients by recovering the within-study covariance matrix from summary statistics.

    PubMed

    Yoneoka, Daisuke; Henmi, Masayuki

    2017-06-01

    Recently, the number of regression models has dramatically increased in several academic fields. However, within the context of meta-analysis, synthesis methods for such models have not been developed in a commensurate trend. One of the difficulties hindering the development is the disparity in sets of covariates among literature models. If the sets of covariates differ across models, interpretation of coefficients will differ, thereby making it difficult to synthesize them. Moreover, previous synthesis methods for regression models, such as multivariate meta-analysis, often have problems because covariance matrix of coefficients (i.e. within-study correlations) or individual patient data are not necessarily available. This study, therefore, proposes a brief explanation regarding a method to synthesize linear regression models under different covariate sets by using a generalized least squares method involving bias correction terms. Especially, we also propose an approach to recover (at most) three correlations of covariates, which is required for the calculation of the bias term without individual patient data. Copyright © 2016 John Wiley & Sons, Ltd.

  12. Disentangling the Correlates of Drug Use in a Clinic and Community Sample: A Regression Analysis of the Associations between Drug Use, Years-of-School, Impulsivity, IQ, Working Memory, and Psychiatric Symptoms.

    PubMed

    Heyman, Gene M; Dunn, Brian J; Mignone, Jason

    2014-01-01

    Years-of-school is negatively correlated with illicit drug use. However, educational attainment is positively correlated with IQ and negatively correlated with impulsivity, two traits that are also correlated with drug use. Thus, the negative correlation between education and drug use may reflect the correlates of schooling, not schooling itself. To help disentangle these relations we obtained measures of working memory, simple memory, IQ, disposition (impulsivity and psychiatric status), years-of-school and frequency of illicit and licit drug use in methadone clinic and community drug users. We found strong zero-order correlations between all measures, including IQ, impulsivity, years-of-school, psychiatric symptoms, and drug use. However, multiple regression analyses revealed a different picture. The significant predictors of illicit drug use were gender, involvement in a methadone clinic, and years-of-school. That is, psychiatric symptoms, impulsivity, cognition, and IQ no longer predicted illicit drug use in the multiple regression analyses. Moreover, high risk subjects (low IQ and/or high impulsivity) who spent 14 or more years in school used stimulants and opiates less than did low risk subjects who had spent <14 years in school. Smoking and drinking had a different correlational structure. IQ and years-of-school predicted whether someone ever became a smoker, whereas impulsivity predicted the frequency of drinking bouts, but years-of-school did not. Many subjects reported no use of one or more drugs, resulting in a large number of "zeroes" in the data sets. Cragg's Double-Hurdle regression method proved the best approach for dealing with this problem. To our knowledge, this is the first report to show that years-of-school predicts lower levels of illicit drug use after controlling for IQ and impulsivity. This paper also highlights the advantages of Double-Hurdle regression methods for analyzing the correlates of drug use in community samples.

  13. Preoperative predictive factors of aneurysmal regression using the reporting standards for endovascular aortic aneurysm repair.

    PubMed

    Kaladji, Adrien; Cardon, Alain; Abouliatim, Issam; Campillo-Gimenez, Boris; Heautot, Jean François; Verhoye, Jean-Philippe

    2012-05-01

    Aneurysmal regression is a reliable marker for long-lasting success after endovascular aneurysm repair (EVAR). The aim of this study was to identify the preoperative factors that can predictably lead to aneurysmal sac regression after EVAR, according to the reporting standards of the Society for Vascular Surgery and the International Society of Cardiovascular Surgery (SVS/ISCVS). From 199 patients treated by EVAR between 2000 and 2009, 164 completed computed tomography angiographies and duplex scan follow-up images were available. All computed tomography angiographies for enrolled patients in this retrospective study were analyzed with Endosize software (Therenva, Rennes, France) to provide spatially correct 3-dimensional data in accordance with SVS/ISCVS recommendations. Anatomic parameters were graded according to the relevant severity grades. A severity score was calculated at the aortic neck, the abdominal aortic aneurysm, and the iliac arteries. Clinical and demographic factors were studied. Patients with aneurysmal regression >5 mm were assigned to group A (mean age, 71.4 ± 8.9 years) and the others to group B (76.3 ± 8.3 years). Aneurysmal regression occurred in 66 patients (40.2%; group A). Univariate analyses showed smaller severity scores at the aortic neck (P = .02) and the iliac arteries (P = .002) in group A and calcifications and thrombus were less significant at the aortic neck (P = .003 and P = .02) and at the iliac arteries (P = .001 and P = .02), and inferior mesenteric artery patency was less frequent (68.2% vs 82.7%, P = .04). Two multivariate analyses were done: one considered the scores and the other the variables included in the scores. In the first, the patients of group A were younger (P = .002) and aortic neck calcifications were less significant (P = .007). In the second, group A patients were younger (P < .001) and the aortic neck scores were smaller (P = .04). There was no difference between the two groups in the type of implanted

  14. Studying Resist Stochastics with the Multivariate Poisson Propagation Model

    DOE PAGES

    Naulleau, Patrick; Anderson, Christopher; Chao, Weilun; ...

    2014-01-01

    Progress in the ultimate performance of extreme ultraviolet resist has arguably decelerated in recent years suggesting an approach to stochastic limits both in photon counts and material parameters. Here we report on the performance of a variety of leading extreme ultraviolet resist both with and without chemical amplification. The measured performance is compared to stochastic modeling results using the Multivariate Poisson Propagation Model. The results show that the best materials are indeed nearing modeled performance limits.

  15. Are learning strategies linked to academic performance among adolescents in two States in India? A tobit regression analysis.

    PubMed

    Areepattamannil, Shaljan

    2014-01-01

    The results of the fourth cycle of the Program for International Student Assessment (PISA) revealed that an unacceptably large number of adolescent students in two states in India, Himachal Pradesh and Tamil Nadu, have failed to acquire basic skills in reading, mathematics, and science (Walker, 2011). Drawing on data from the PISA 2009 database and employing multivariate left-censored tobit regression as a data analytic strategy, the present study, therefore, examined whether or not the learning strategies (memorization, elaboration, and control strategies) of adolescent students in Himachal Pradesh (N = 1,616; Mean age = 15.81 years) and Tamil Nadu (N = 3,210; Mean age = 15.64 years) were linked to their performance on the PISA 2009 reading, mathematics, and science assessments. Tobit regression analyses, after accounting for student demographic characteristics, revealed that the self-reported use of control strategies was significantly positively associated with reading, mathematical, and scientific literacy of adolescents in Himachal Pradesh and Tamil Nadu. While the self-reported use of elaboration strategies was not significantly associated with reading literacy among adolescents in Himachal Pradesh and Tamil Nadu, it was significantly positively associated with mathematical literacy among adolescents in Himachal Pradesh and Tamil Nadu. Moreover, the self-reported use of elaboration strategies was significantly and positively linked to scientific literacy among adolescents in Himachal Pradesh alone. The self-reported use of memorization strategies was significantly negatively associated with reading, mathematical, and scientific literacy in Tamil Nadu, while it was significantly negatively associated with mathematical and scientific literacy alone in Himachal Pradesh. Implications of these findings are discussed.
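
    As an illustration of the left-censored tobit model named in this abstract, the following is a minimal sketch of its log-likelihood. The censoring point `lower`, the fitted means `mu`, and the scale `sigma` are hypothetical inputs; the abstract does not report them, and this is not the authors' code.

```python
import math


def normal_cdf(v):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def normal_logpdf(v):
    """Log density of the standard normal distribution."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * v * v


def tobit_loglik(y, mu, sigma, lower):
    """Log-likelihood of a left-censored tobit model.

    y     : observed outcomes (values at or below `lower` are censored)
    mu    : linear predictor x_i' beta for each observation (hypothetical)
    sigma : residual standard deviation (hypothetical)
    lower : left-censoring point (hypothetical)
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi <= lower:
            # Censored: only know the latent outcome fell at or below `lower`.
            total += math.log(normal_cdf((lower - mi) / sigma))
        else:
            # Uncensored: ordinary normal density of the residual.
            total += normal_logpdf((yi - mi) / sigma) - math.log(sigma)
    return total
```

    Maximizing this function over the regression coefficients and `sigma` yields the tobit estimates; the optimizer is left out to keep the sketch short.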

  16. Exploring emergency department 4-hour target performance and cancelled elective operations: a regression analysis of routinely collected and openly reported NHS trust data.

    PubMed

    Keogh, Brad; Culliford, David; Guerrero-Ludueña, Richard; Monks, Thomas

    2018-05-24

    To quantify the effect of intrahospital patient flow on emergency department (ED) performance targets and indicate if the expectations set by the National Health Service (NHS) England 5-year forward review are realistic in returning emergency services to previous performance levels. Linear regression analysis of routinely reported trust activity and performance data using a series of cross-sectional studies. NHS trusts in England submitting routine nationally reported measures to NHS England. 142 acute non-specialist trusts operating in England between 2012 and 2016. The primary outcome measures were proportion of 4-hour waiting time breaches and cancelled elective operations. Univariate and multivariate linear regression models were used to show relationships between the outcome measures and various measures of trust activity including empty day beds, empty night beds, day bed to night bed ratio, ED conversion ratio and delayed transfers of care. Univariate regression results using the outcome of 4-hour breaches showed clear relationships with empty night beds and ED conversion ratio between 2012 and 2016. The day bed to night bed ratio showed an increasing ability to explain variation in performance between 2015 and 2016. Delayed transfers of care showed little evidence of an association. Multivariate model results indicated that the ability of patient flow variables to explain 4-hour target performance had reduced between 2012 and 2016 (19% to 12%), and had increased in explaining cancelled elective operations (7% to 17%). The flow of patients through trusts is shown to influence ED performance; however, performance has become less explainable by intratrust patient flow between 2012 and 2016. Some commonly stated explanatory factors such as delayed transfers of care showed limited evidence of being related. The results indicate some of the measures proposed by NHS England to reduce pressure on EDs may not have the desired impact on returning services to previous

  17. Evaluation of a Multivariate Syndromic Surveillance System for West Nile Virus.

    PubMed

    Faverjon, Céline; Andersson, M Gunnar; Decors, Anouk; Tapprest, Jackie; Tritz, Pierre; Sandoz, Alain; Kutasi, Orsolya; Sala, Carole; Leblond, Agnès

    2016-06-01

    Various methods are currently used for the early detection of West Nile virus (WNV) but their outputs are not quantitative and/or do not take into account all available information. Our study aimed to test a multivariate syndromic surveillance system to evaluate if the sensitivity and the specificity of detection of WNV could be improved. Weekly time series data on nervous syndromes in horses and mortality in both horses and wild birds were used. Baselines were fitted to the three time series and used to simulate 100 years of surveillance data. WNV outbreaks were simulated and inserted into the baselines based on historical data and expert opinion. Univariate and multivariate syndromic surveillance systems were tested to gauge how well they detected the outbreaks; detection was based on an empirical Bayesian approach. The systems' performances were compared using measures of sensitivity, specificity, and area under receiver operating characteristic curve (AUC). When data sources were considered separately (i.e., univariate systems), the best detection performance was obtained using the data set of nervous symptoms in horses compared to those of bird and horse mortality (AUCs equal to 0.80, 0.75, and 0.50, respectively). A multivariate outbreak detection system that used nervous symptoms in horses and bird mortality generated the best performance (AUC = 0.87). The proposed approach is suitable for performing multivariate syndromic surveillance of WNV outbreaks. This is particularly relevant, given that a multivariate surveillance system performed better than a univariate approach. Such a surveillance system could be especially useful in serving as an alert for the possibility of human viral infections. This approach can be also used for other diseases for which multiple sources of evidence are available.
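
    The AUC used above to compare detection systems can be computed directly from alarm scores. The sketch below is a generic rank-based calculation, not the study's empirical Bayesian detector; the score lists are hypothetical.

```python
def auc(scores_outbreak, scores_baseline):
    """Area under the ROC curve as the probability that a randomly chosen
    outbreak week scores higher than a randomly chosen baseline week.
    Ties count as half a win (Mann-Whitney formulation)."""
    wins = 0.0
    for p in scores_outbreak:
        for n in scores_baseline:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_outbreak) * len(scores_baseline))


# Hypothetical alarm scores for weeks with and without a simulated outbreak.
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

    An AUC of 0.50, as reported for horse mortality alone, corresponds to scores that separate outbreak and baseline weeks no better than chance.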

  18. Domain-Invariant Partial-Least-Squares Regression.

    PubMed

    Nikzad-Langerodi, Ramin; Zellinger, Werner; Lughofer, Edwin; Saminger-Platz, Susanne

    2018-05-11

    Multivariate calibration models often fail to extrapolate beyond the calibration samples because of changes associated with the instrumental response, environmental condition, or sample matrix. Most of the current methods used to adapt a source calibration model to a target domain exclusively apply to calibration transfer between similar analytical devices, while generic methods for calibration-model adaptation are largely missing. To fill this gap, we here introduce domain-invariant partial-least-squares (di-PLS) regression, which extends ordinary PLS by a domain regularizer in order to align the source and target distributions in the latent-variable space. We show that a domain-invariant weight vector can be derived in closed form, which allows the integration of (partially) labeled data from the source and target domains as well as entirely unlabeled data from the latter. We test our approach on a simulated data set where the aim is to desensitize a source calibration model to an unknown interfering agent in the target domain (i.e., unsupervised model adaptation). In addition, we demonstrate unsupervised, semisupervised, and supervised model adaptation by di-PLS on two real-world near-infrared (NIR) spectroscopic data sets.

  19. Regression periods in infancy: a case study from Catalonia.

    PubMed

    Sadurní, Marta; Rostan, Carlos

    2002-05-01

    Based on Rijt-Plooij and Plooij's (1992) research on emergence of regression periods in the first two years of life, the presence of such periods in a group of 18 babies (10 boys and 8 girls, aged between 3 weeks and 14 months) from a Catalonian population was analyzed. The measurements were a questionnaire filled in by the infants' mothers, a semi-structured weekly tape-recorded interview, and observations in their homes. The procedure and the instruments used in the project follow those proposed by Rijt-Plooij and Plooij. Our results confirm the existence of the regression periods in the first year of children's life. Inter-coder agreement for trained coders was 78.2% and within-coder agreement was 90.1%. In the discussion, the possible meaning and relevance of regression periods in order to understand development from a psychobiological and social framework is commented upon.

  20. Workers' compensation costs among construction workers: a robust regression analysis.

    PubMed

    Friedman, Lee S; Forst, Linda S

    2009-11-01

    Workers' compensation data are an important source for evaluating costs associated with construction injuries. We describe the characteristics of injured construction workers filing claims in Illinois between 2000 and 2005 and the factors associated with compensation costs using a robust regression model. In the final multivariable model, the cumulative percent temporary and permanent disability (measures of severity of injury) explained 38.7% of the variance of cost. Attorney costs explained only 0.3% of the variance of the dependent variable. The model used in this study clearly indicated that percent disability was the most important determinant of cost, although the method and uniformity of percent impairment allocation could be better elucidated. There is a need to integrate analytical methods that are suitable for skewed data when analyzing claim costs.

  1. Some Recent Developments on Complex Multivariate Distributions

    ERIC Educational Resources Information Center

    Krishnaiah, P. R.

    1976-01-01

    In this paper, the author gives a review of the literature on complex multivariate distributions. Some new results on these distributions are also given. Finally, the author discusses the applications of the complex multivariate distributions in the area of the inference on multiple time series. (Author)

  2. Baseline biopsychosocial determinants of telomere length and 6-year attrition rate.

    PubMed

    Révész, Dóra; Milaneschi, Yuri; Terpstra, Erik M; Penninx, Brenda W J H

    2016-05-01

    Short leukocyte telomere length (TL) and accelerated telomere attrition have been associated with various deleterious health outcomes, although their determinants have not been explored collectively in a large-scale study. Leukocyte TL was measured (baseline N=2936; 6-year follow-up N=1860) in participants (18-65 years) from the NESDA study. Baseline determinants of TL included sociodemographics, lifestyle, chronic diseases, psychosocial stressors, and metabolic and physiological stress markers. Multivariate linear regression models were used to examine the associations between these determinants and (1) baseline TL, and (2) 6-year TL change. Multinomial logistic regression analyses were used to examine the predictors of telomere attrition and lengthening, as compared to stable TL. Short baseline TL was associated with older age, male sex, non-European ethnicity, cigarette smoking, recent life events, and higher triglycerides, glucose and pre-ejection period (R(2)=11.3%). The 6-year telomere attrition was inversely associated with baseline TL (R(2)=51.6%); also older age, long sleep, not having a partner, high childhood trauma index, and gastrointestinal disease were associated with 6-year TL attrition (additional R(2)=3.7%). Telomere attrition seemed to have slightly more predictors than lengthening. Sociodemographic, lifestyle, psychosocial stress and metabolic and physiological stress factors are cross-sectionally linked with TL. Telomere attrition over six years was strongly associated with baseline TL, suggesting an internal homeostatic influence. Modulation of the identified determinants may become target of future studies to promote telomere maintenance and healthy aging. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. POSTTRAUMATIC STRESS DISORDER AMONG INDONESIAN CHILDREN 5 YEARS AFTER THE TSUNAMI.

    PubMed

    Irwanto; Faisal; Zulfa, Hendra

    2015-09-01

    Children are at risk for developing posttraumatic stress disorder (PTSD) due to experiencing or living in a disaster area. The factors that increase the likelihood of a child developing PTSD need further clarification. We studied the factors associated with PTSD among children who experienced the tsunami in Sumatra, Indonesia. We conducted a cross sectional study in 2 subdistricts of Sumatra 5 years after experiencing a tsunami. Children aged 7-13 years were enrolled using stratified cluster sampling. A tsunami-modified version of The PsySTART Rapid Triage System was used to question children about their tsunami-specific traumatic experiences. Trauma symptoms were evaluated using the Trauma Symptom Checklist For Children (TSCC). The diagnosis of PTSD was made using the Child PTSD Symptom Scale (CPSS) and DSM-IV criteria. The data were analyzed with chi-square tests and multivariate logistic regression analysis with 95% confidence intervals (CI). A total of 262 children were enrolled in this study. The prevalence of PTSD in these children was 20.6%. On multivariate analysis, having experienced a delay in evacuation (PR = 4.5; 95% CI: 2.794-13.80; p < 0.001) and being unable to escape (PR = 13.07; 95% CI: 5.884-64; p < 0.001) were significantly associated with PTSD 5 years after the tsunami. Children who experienced a traumatic event in which they were unable to escape or when there is a delay in evacuation are at risk of developing PTSD and need appropriate treatment.

  4. Coping with matrix effects in headspace solid phase microextraction gas chromatography using multivariate calibration strategies.

    PubMed

    Ferreira, Vicente; Herrero, Paula; Zapata, Julián; Escudero, Ana

    2015-08-14

    SPME is extremely sensitive to experimental parameters affecting liquid-gas and gas-solid distribution coefficients. Our aims were to measure the weights of these factors and to design a multivariate strategy based on the addition of a pool of internal standards, to minimize matrix effects. Synthetic but real-like wines containing selected analytes and variable amounts of ethanol, non-volatile constituents and major volatile compounds were prepared following a factorial design. The ANOVA study revealed that even using a strong matrix dilution, matrix effects are important and additive with non-significant interaction effects and that it is the presence of major volatile constituents the most dominant factor. A single internal standard provided a robust calibration for 15 out of 47 analytes. Then, two different multivariate calibration strategies based on Partial Least Square Regression were run in order to build calibration functions based on 13 different internal standards able to cope with matrix effects. The first one is based in the calculation of Multivariate Internal Standards (MIS), linear combinations of the normalized signals of the 13 internal standards, which provide the expected area of a given unit of analyte present in each sample. The second strategy is a direct calibration relating concentration to the 13 relative areas measured in each sample for each analyte. Overall, 47 different compounds can be reliably quantified in a single fully automated method with overall uncertainties better than 15%. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Use of Longitudinal Regression in Quality Control. Research Report. ETS RR-14-31

    ERIC Educational Resources Information Center

    Lu, Ying; Yen, Wendy M.

    2014-01-01

    This article explores the use of longitudinal regression as a tool for identifying scoring inaccuracies. Student progression patterns, as evaluated through longitudinal regressions, typically are more stable from year to year than are scale score distributions and statistics, which require representative samples to conduct credibility checks.…

  6. Holocene coastal regression and facies patterns in a subtropical arid carbonate environment - The sabkha of Al-Zareq, Qatar

    NASA Astrophysics Data System (ADS)

    Engel, Max; Peis, Kim T.; Strohmenger, Christian J.; Pint, Anna; Rivers, John M.; Brückner, Helmut

    2017-04-01

    The Arabian Gulf is a semi-enclosed, shallow sea, which became flooded some 12,500 years ago. Current relative sea level was first reached c. 7000 to 6500 years ago, while a relative sea-level highstand of c. 2-4 m dates to around 6000-4500 years ago. Supratidal coastal sabkhas (former lagoons), stranded beach ridges and foredune sequences as well as abandoned tidal channels along the coasts of Qatar and the UAE witness this mid-Holocene peak in sea level. Regression since then triggered shoreline migration of up to several kilometers along the low-lying coasts of Qatar, for which, however, detailed reconstructions in space and time are scarce. This study presents facies changes and a scenario for the spatio-temporal evolution of the coastal area of Al Zareq in the inner Gulf of Salwa (SW Qatar), thereby also contributing to a better understanding of reservoirs that formed under arid climatic conditions. Ten vibracores (up to 8 m), two deep drillings (up to 20.5 m) and two trenches covering the entire transgression-regression cycle were investigated. In order to characterize and interpret facies types at Al-Zareq as well as to reconstruct sabkha formation in space and time, grain size and shape distribution (laser diffraction, camsizer), XRD, micro- and macrofossil contents and thin sections were analysed by applying qualitative interpretation, descriptive and multivariate statistics (PCA, MDA, end-member modelling), and RIR (XRD). Thirty-seven samples were radiocarbon dated and four samples were dated by optically stimulated luminescence (OSL). Depositional environments include the following types: eolian dune and interdune (in-situ or reworked), coastal sabkha (diagenetic), saline lake (salina), protected lagoon (sand- or carbonate-dominated), beach and beach spit, tidal channel and tidal bar, as well as open lagoon (low-energy, shallow-subtidal lagoon and low-energy deeper-subtidal).

  7. Chemiluminescence-based multivariate sensing of local equivalence ratios in premixed atmospheric methane-air flames

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.

    Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least square regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%; whereas, the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions ( > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.
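
    The leave-one-out cross validation mentioned in this abstract can be sketched as follows. For brevity the sketch uses a single-predictor straight-line calibration (in the spirit of the intensity-ratio model) rather than PLS-R, and the data values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_rmse(x, y):
    """Leave-one-out cross validation: refit without sample i,
    predict sample i, and return the root-mean-square prediction error."""
    squared_errors = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        squared_errors.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Hypothetical calibration data: intensity ratio vs. equivalence ratio.
ratio = [1.0, 1.5, 2.0, 2.5, 3.0]
phi = [0.7 + 0.2 * r for r in ratio]  # exactly linear toy relation
rmse = loo_rmse(ratio, phi)
```

    With exactly linear toy data the cross-validated error is zero up to rounding; real spectra would give a nonzero error that can be compared across candidate models.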

  8. THE REGRESSION MODEL OF IRAN LIBRARIES ORGANIZATIONAL CLIMATE.

    PubMed

    Jahani, Mohammad Ali; Yaminfirooz, Mousa; Siamian, Hasan

    2015-10-01

    The purpose of this study was to draw a regression model of the organizational climate of central libraries of Iran's universities. This study is applied research. The statistical population consisted of 96 employees of the central libraries of Iran's public universities, selected from the 117 universities affiliated to the Ministry of Health by stratified sampling (510 people). The localized ClimateQual questionnaire was used as the research tool. Multivariate linear regression and a track diagram were used to predict the organizational climate pattern of the libraries. Of the 9 variables affecting organizational climate, 5 variables (innovation, teamwork, customer service, psychological safety and deep diversity) play a major role in predicting the organizational climate of Iran's libraries. The results also indicate that each of these variables, with a different coefficient, has the power to predict organizational climate, but the climate score of psychological safety (0.94) plays a very crucial role in predicting the organizational climate. The track diagram showed that the five variables of teamwork, customer service, psychological safety, deep diversity and innovation directly affect the organizational climate variable, and that the contribution of teamwork to this influence is greater than that of any other variable. Among the ClimateQual indicators of organizational climate, teamwork contributes the most, so reinforcing teamwork in academic libraries can be more effective in improving the organizational climate of this type of library.

  9. A multivariate model for predicting segmental body composition.

    PubMed

    Tian, Simiao; Mioche, Laurence; Denis, Jean-Baptiste; Morio, Béatrice

    2013-12-01

    The aims of the present study were to propose a multivariate model for predicting simultaneously body, trunk and appendicular fat and lean masses from easily measured variables and to compare its predictive capacity with that of the available univariate models that predict body fat percentage (BF%). The dual-energy X-ray absorptiometry (DXA) dataset (52% men and 48% women) with White, Black and Hispanic ethnicities (1999-2004, National Health and Nutrition Examination Survey) was randomly divided into three sub-datasets: a training dataset (TRD), a test dataset (TED) and a validation dataset (VAD), comprising 3835, 1917 and 1917 subjects, respectively. For each sex, several multivariate prediction models were fitted from the TRD using age, weight, height and possibly waist circumference. The most accurate model was selected from the TED and then applied to the VAD and a French DXA dataset (French DB) (526 men and 529 women) to assess the prediction accuracy in comparison with that of five published univariate models, for which adjusted formulas were re-estimated using the TRD. Waist circumference was found to improve the prediction accuracy, especially in men. For BF%, the standard error of prediction (SEP) values were 3.26 (3.75) % for men and 3.47 (3.95)% for women in the VAD (French DB), as good as those of the adjusted univariate models. Moreover, the SEP values for the prediction of body and appendicular lean masses ranged from 1.39 to 2.75 kg for both the sexes. The prediction accuracy was best for age < 65 years, BMI < 30 kg/m2 and the Hispanic ethnicity. The application of our multivariate model to large populations could be useful to address various public health issues.

  10. Triglyceride and glucose (TyG) index as a predictor of incident hypertension: a 9-year longitudinal population-based study.

    PubMed

    Zheng, Rongjiong; Mao, Yushan

    2017-09-13

    Hypertension and the triglyceride and glucose index both have been associated with insulin resistance; however, the longitudinal association remains unclear. This study was designed to investigate the longitudinal association between the triglyceride and glucose index and incident hypertension among the Chinese population. We studied 4686 subjects (3177 males and 1509 females) and followed up for 9 years. The subjects were divided into four groups based on the triglyceride and glucose index. Univariate and multivariate Cox regression models were used to analyse the risk factors of hypertension. After 9 years of follow-up, 2047 subjects developed hypertension. The overall 9-year cumulative incidence of hypertension was 43.7%, ranging from 28.5% in quartile 1 to 36.9% in quartile 2, 49.2% in quartile 3 and 59.8% in quartile 4 (p for trend < 0.001). Cox regression analyses indicated that higher triglyceride and glucose index was associated with an increased risk of subsequent incident hypertension. The triglyceride and glucose index can predict the incident hypertension among the Chinese population.
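
    The abstract does not define the index. A commonly used definition, assumed here, is the natural log of fasting triglycerides (mg/dL) times fasting glucose (mg/dL) divided by 2. The sketch below computes it for hypothetical lab values and re-derives the reported overall cumulative incidence from the counts given in the abstract.

```python
import math


def tyg_index(triglycerides_mg_dl, glucose_mg_dl):
    """TyG index under the assumed definition ln(TG * glucose / 2),
    with both inputs as fasting values in mg/dL."""
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2.0)


# Hypothetical fasting values, for illustration only.
example_tyg = tyg_index(150.0, 100.0)

# Counts taken from the abstract: 2047 incident cases among 4686 subjects.
cumulative_incidence_pct = round(100.0 * 2047 / 4686, 1)
```

    The recomputed incidence agrees with the 43.7% stated in the abstract.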

  11. Moderation analysis using a two-level regression model.

    PubMed

    Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

    2014-10-01

    Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
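
    For a single predictor x and a single moderator z, the two-level idea described above can be written out as follows. This is an illustrative sketch: the notation is assumed here and is not taken from the paper.

```latex
% Level 1: regression of the criterion y on the predictor x,
% with subject-specific coefficients.
y_i = \beta_{0i} + \beta_{1i} x_i + e_i

% Level 2: each coefficient is regressed on the moderator z.
\beta_{0i} = \gamma_{00} + \gamma_{01} z_i + u_{0i}, \qquad
\beta_{1i} = \gamma_{10} + \gamma_{11} z_i + u_{1i}

% Substituting level 2 into level 1 gives the MMR product term x_i z_i
% plus a composite error whose variance depends on x_i.
y_i = \gamma_{00} + \gamma_{01} z_i + \gamma_{10} x_i + \gamma_{11} x_i z_i
      + \left( u_{0i} + u_{1i} x_i + e_i \right)

% Share of the variance of the slope that is due to the moderator.
\frac{\gamma_{11}^{2}\,\operatorname{Var}(z)}
     {\gamma_{11}^{2}\,\operatorname{Var}(z) + \operatorname{Var}(u_{1})}
```

    Under these assumptions the composite error is heteroscedastic whenever Var(u_1) > 0, which is consistent with the abstract's remark that the two-level model matters most when heteroscedasticity exists.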

  12. Analyzing Multiple Outcomes in Clinical Research Using Multivariate Multilevel Models

    PubMed Central

    Baldwin, Scott A.; Imel, Zac E.; Braithwaite, Scott R.; Atkins, David C.

    2014-01-01

    Objective: Multilevel models have become a standard data analysis approach in intervention research. Although the vast majority of intervention studies involve multiple outcome measures, few studies use multivariate analysis methods. The authors discuss multivariate extensions to the multilevel model that can be used by psychotherapy researchers. Method and Results: Using simulated longitudinal treatment data, the authors show how multivariate models extend common univariate growth models and how the multivariate model can be used to examine multivariate hypotheses involving fixed effects (e.g., does the size of the treatment effect differ across outcomes?) and random effects (e.g., is change in one outcome related to change in the other?). An online supplemental appendix provides annotated computer code and simulated example data for implementing a multivariate model. Conclusions: Multivariate multilevel models are flexible, powerful models that can enhance clinical research. PMID:24491071

  13. Multivariate prediction of motor diagnosis in Huntington's disease: 12 years of PREDICT‐HD

    PubMed Central

    Long, Jeffrey D.

    2015-01-01

    Background: It is well known in Huntington's disease that cytosine‐adenine‐guanine expansion and age at study entry are predictive of the timing of motor diagnosis. The goal of this study was to assess whether additional motor, imaging, cognitive, functional, psychiatric, and demographic variables measured at study entry increased the ability to predict the risk of motor diagnosis over 12 years. Methods: One thousand seventy‐eight Huntington's disease gene–expanded carriers (64% female) from the Neurobiological Predictors of Huntington's Disease study were followed up for up to 12 y (mean = 5, standard deviation = 3.3) covering 2002 to 2014. No one had a motor diagnosis at study entry, but 225 (21%) carriers prospectively received a motor diagnosis. Analysis was performed with random survival forests, which is a machine learning method for right‐censored data. Results: Adding 34 variables along with cytosine‐adenine‐guanine and age substantially increased predictive accuracy relative to cytosine‐adenine‐guanine and age alone. Adding six of the common motor and cognitive variables (total motor score, diagnostic confidence level, Symbol Digit Modalities Test, three Stroop tests) resulted in lower predictive accuracy than the full set, but still had twice the 5‐y predictive accuracy than when using cytosine‐adenine‐guanine and age alone. Additional analysis suggested interactions and nonlinear effects that were characterized in a post hoc Cox regression model. Conclusions: Measurement of clinical variables can substantially increase the accuracy of predicting motor diagnosis over and above cytosine‐adenine‐guanine and age (and their interaction). Estimated probabilities can be used to characterize progression level and aid in future studies' sample selection. © 2015 The Authors. Movement Disorders published by Wiley Periodicals, Inc. on behalf of International Parkinson and Movement Disorder Society. PMID:26340420

  14. Multivariate prediction of motor diagnosis in Huntington's disease: 12 years of PREDICT-HD.

    PubMed

    Long, Jeffrey D; Paulsen, Jane S

    2015-10-01

    It is well known in Huntington's disease that cytosine-adenine-guanine expansion and age at study entry are predictive of the timing of motor diagnosis. The goal of this study was to assess whether additional motor, imaging, cognitive, functional, psychiatric, and demographic variables measured at study entry increased the ability to predict the risk of motor diagnosis over 12 years. One thousand seventy-eight Huntington's disease gene-expanded carriers (64% female) from the Neurobiological Predictors of Huntington's Disease study were followed up for up to 12 y (mean = 5, standard deviation = 3.3) covering 2002 to 2014. No one had a motor diagnosis at study entry, but 225 (21%) carriers prospectively received a motor diagnosis. Analysis was performed with random survival forests, which is a machine learning method for right-censored data. Adding 34 variables along with cytosine-adenine-guanine and age substantially increased predictive accuracy relative to cytosine-adenine-guanine and age alone. Adding six of the common motor and cognitive variables (total motor score, diagnostic confidence level, Symbol Digit Modalities Test, three Stroop tests) resulted in lower predictive accuracy than the full set, but still had twice the 5-y predictive accuracy than when using cytosine-adenine-guanine and age alone. Additional analysis suggested interactions and nonlinear effects that were characterized in a post hoc Cox regression model. Measurement of clinical variables can substantially increase the accuracy of predicting motor diagnosis over and above cytosine-adenine-guanine and age (and their interaction). Estimated probabilities can be used to characterize progression level and aid in future studies' sample selection. © 2015 The Authors. Movement Disorders published by Wiley Periodicals, Inc. on behalf of International Parkinson and Movement Disorder Society.

  15. Estimation of Subpixel Snow-Covered Area by Nonparametric Regression Splines

    NASA Astrophysics Data System (ADS)

    Kuter, S.; Akyürek, Z.; Weber, G.-W.

    2016-10-01

    Measurement of the areal extent of snow cover with high accuracy plays an important role in hydrological and climate modeling. Remotely-sensed data acquired by earth-observing satellites offer great advantages for timely monitoring of snow cover. However, the main obstacle is the tradeoff between temporal and spatial resolution of satellite imageries. Soft or subpixel classification of low or moderate resolution satellite images is a preferred technique to overcome this problem. The most frequently employed snow cover fraction methods applied on Moderate Resolution Imaging Spectroradiometer (MODIS) data have evolved from spectral unmixing and empirical Normalized Difference Snow Index (NDSI) methods to latest machine learning-based artificial neural networks (ANNs). This study demonstrates the implementation of subpixel snow-covered area estimation based on the state-of-the-art nonparametric spline regression method, namely, Multivariate Adaptive Regression Splines (MARS). MARS models were trained by using MODIS top of atmospheric reflectance values of bands 1-7 as predictor variables. Reference percentage snow cover maps were generated from higher spatial resolution Landsat ETM+ binary snow cover maps. A multilayer feed-forward ANN with one hidden layer trained with backpropagation was also employed to estimate the percentage snow-covered area on the same data set. The results indicated that the developed MARS model performed better than th

  16. Rank estimation and the multivariate analysis of in vivo fast-scan cyclic voltammetric data

    PubMed Central

    Keithley, Richard B.; Carelli, Regina M.; Wightman, R. Mark

    2010-01-01

    Principal component regression has been used in the past to separate current contributions from different neuromodulators measured with in vivo fast-scan cyclic voltammetry. Traditionally, a percent cumulative variance approach has been used to determine the rank of the training set voltammetric matrix during model development, however this approach suffers from several disadvantages including the use of arbitrary percentages and the requirement of extreme precision of training sets. Here we propose that Malinowski’s F-test, a method based on a statistical analysis of the variance contained within the training set, can be used to improve factor selection for the analysis of in vivo fast-scan cyclic voltammetric data. These two methods of rank estimation were compared at all steps in the calibration protocol including the number of principal components retained, overall noise levels, model validation as determined using a residual analysis procedure, and predicted concentration information. By analyzing 119 training sets from two different laboratories amassed over several years, we were able to gain insight into the heterogeneity of in vivo fast-scan cyclic voltammetric data and study how differences in factor selection propagate throughout the entire principal component regression analysis procedure. Visualizing cyclic voltammetric representations of the data contained in the retained and discarded principal components showed that using Malinowski’s F-test for rank estimation of in vivo training sets allowed for noise to be more accurately removed. Malinowski’s F-test also improved the robustness of our criterion for judging multivariate model validity, even though signal-to-noise ratios of the data varied. In addition, pH change was the majority noise carrier of in vivo training sets while dopamine prediction was more sensitive to noise. PMID:20527815
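
    The percent cumulative variance approach criticised above can be sketched in a few lines. This is a minimal illustration, assuming the eigenvalues of the training-set matrix have already been obtained from a principal component analysis; the function name and the example thresholds are hypothetical. Malinowski's F-test itself is not implemented here.

```python
from itertools import accumulate


def rank_by_cumulative_variance(eigenvalues, threshold=0.995):
    """Smallest number of components whose cumulative variance share reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for rank, running in enumerate(accumulate(ordered), start=1):
        if running / total >= threshold:
            return rank
    # Fall back to the full rank if rounding keeps the share just below the threshold.
    return len(ordered)


# Hypothetical eigenvalues, for illustration only.
example_eigenvalues = [90.0, 9.0, 0.9, 0.1]
rank_at_95 = rank_by_cumulative_variance(example_eigenvalues, threshold=0.95)
rank_at_995 = rank_by_cumulative_variance(example_eigenvalues, threshold=0.995)
print(rank_at_95, rank_at_995)
```

    The retained rank changes with the chosen percentage, which illustrates the arbitrariness the abstract refers to.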

  17. F100 multivariable control synthesis program: Evaluation of a multivariable control using a real-time engine simulation

    NASA Technical Reports Server (NTRS)

    Szuch, J. R.; Soeder, J. F.; Seldner, K.; Cwynar, D. S.

    1977-01-01

    The design, evaluation, and testing of a practical, multivariable, linear quadratic regulator control for the F100 turbofan engine were accomplished. NASA evaluation of the multivariable control logic and implementation are covered. The evaluation utilized a real time, hybrid computer simulation of the engine. Results of the evaluation are presented, and recommendations concerning future engine testing of the control are made. Results indicated that the engine testing of the control should be conducted as planned.

  18. A Multivariate Investigation of Employee Absenteeism.

    DTIC Science & Technology

    1980-05-01

    A Multivariate Investigation of Employee Absenteeism. (U) May 1980. J. R. Terborg, G. A. Davis, F. J. Smith. N00014-78-C-0756. Unclassified. TR-80-5. ...Complex Organizations. Program in Industrial Organizational Psychology, Department of Psychology, University of Houston, Houston, Texas 77004 ... A Multivariate Investigation of Employee Absenteeism. James R. Terborg & Gregory A. Davis, University of

  19. Atmospheric conditions, lunar phases, and childbirth: a multivariate analysis

    NASA Astrophysics Data System (ADS)

    Ochiai, Angela Megumi; Gonçalves, Fabio Luiz Teixeira; Ambrizzi, Tercio; Florentino, Lucia Cristina; Wei, Chang Yi; Soares, Alda Valeria Neves; De Araujo, Natalucia Matos; Gualda, Dulce Maria Rosa

    2012-07-01

    Our objective was to assess extrinsic influences upon childbirth. In a cohort of 1,826 days containing 17,417 childbirths among them 13,252 spontaneous labor admissions, we studied the influence of environment upon the high incidence of labor (defined by 75th percentile or higher), analyzed by logistic regression. The predictors of high labor admission included increases in outdoor temperature (odds ratio: 1.742, P = 0.045, 95%CI: 1.011 to 3.001), and decreases in atmospheric pressure (odds ratio: 1.269, P = 0.029, 95%CI: 1.055 to 1.483). In contrast, increases in tidal range were associated with a lower probability of high admission (odds ratio: 0.762, P = 0.030, 95%CI: 0.515 to 0.999). Lunar phase was not a predictor of high labor admission ( P = 0.339). Using multivariate analysis, increases in temperature and decreases in atmospheric pressure predicted high labor admission, and increases of tidal range, as a measurement of the lunar gravitational force, predicted a lower probability of high admission.
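
    Odds ratios such as those reported above are usually obtained by exponentiating logistic regression coefficients. The sketch below shows that conversion with a Wald-type 95% interval; it is an assumption about the general technique, not a reconstruction of the authors' calculation, and the coefficient and standard error are hypothetical.

```python
import math


def odds_ratio_with_ci(coefficient, standard_error, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a logistic regression coefficient."""
    odds_ratio = math.exp(coefficient)
    lower = math.exp(coefficient - z * standard_error)
    upper = math.exp(coefficient + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical coefficient and standard error, for illustration only.
odds_ratio, lower, upper = odds_ratio_with_ci(coefficient=0.25, standard_error=0.10)
print(f"OR = {odds_ratio:.3f}, 95% CI: {lower:.3f} to {upper:.3f}")
```

    An interval built this way is symmetric on the log-odds scale, so it is not symmetric around the odds ratio itself.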

  20. Understanding adaptive gait in lower-limb amputees: insights from multivariate analyses

    PubMed Central

    2013-01-01

    Background: In this paper we use multivariate statistical techniques to gain insights into how adaptive gait involving obstacle crossing is regulated in lower-limb amputees compared to able-bodied controls, with the aim of identifying underlying characteristics that differ between the two groups and consequently highlighting gait deficits in the amputees. Methods: Eight unilateral trans-tibial amputees and twelve able-bodied controls completed adaptive gait trials involving negotiating various height obstacles, with amputees leading with their prosthetic limb. Spatiotemporal variables that are regularly used to quantify how gait is adapted when crossing obstacles were determined and subsequently analysed using multivariate statistical techniques. Results and discussion: There were fundamental differences in the adaptive gait between the two groups. Compared to controls, amputees had a reduced approach velocity, reduced foot placement distance before and after the obstacle and reduced foot clearance over it, and reduced lead-limb knee flexion during the step following crossing. Logistic regression analysis highlighted the variables that best distinguished between the gait of the two groups and multiple regression analysis (with approach velocity as a controlling factor) helped identify what gait adaptations were driving the differences seen in these variables. Getting closer to the obstacle before crossing it appeared to be a strategy to ensure the heel of the lead-limb foot passed over the obstacle prior to the foot being lowered to the ground. Despite adopting such a heel clearance strategy, the lead-foot was positioned closer to the obstacle following crossing, which was likely a result of a desire to attain a limb/foot angle and orientation at instant of landing that minimised loads on the residuum (as evidenced by the reduced lead-limb knee flexion during the step following crossing). These changes in foot placement meant the foot was in a different part of swing

  1. Sampling effort affects multivariate comparisons of stream assemblages

    USGS Publications Warehouse

    Cao, Y.; Larsen, D.P.; Hughes, R.M.; Angermeier, P.L.; Patton, T.M.

    2002-01-01

    Multivariate analyses are used widely for determining patterns of assemblage structure, inferring species-environment relationships and assessing human impacts on ecosystems. The estimation of ecological patterns often depends on sampling effort, so the degree to which sampling effort affects the outcome of multivariate analyses is a concern. We examined the effect of sampling effort on site and group separation, which was measured using a mean similarity method. Two similarity measures, the Jaccard Coefficient and Bray-Curtis Index were investigated with 1 benthic macroinvertebrate and 2 fish data sets. Site separation was significantly improved with increased sampling effort because the similarity between replicate samples of a site increased more rapidly than between sites. Similarly, the faster increase in similarity between sites of the same group than between sites of different groups caused clearer separation between groups. The strength of site and group separation completely stabilized only when the mean similarity between replicates reached 1. These results are applicable to commonly used multivariate techniques such as cluster analysis and ordination because these multivariate techniques start with a similarity matrix. Completely stable outcomes of multivariate analyses are not feasible. Instead, we suggest 2 criteria for estimating the stability of multivariate analyses of assemblage data: 1) mean within-site similarity across all sites compared, indicating sample representativeness, and 2) the SD of within-site similarity across sites, measuring sample comparability.
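
    The two similarity measures named above can be sketched as follows, assuming each sample is a mapping from taxon name to count and that at least one taxon is present. The function names and the example counts are hypothetical; the Bray-Curtis index is written in its similarity form, 2 × (sum of shared abundances) / (total abundance of both samples).

```python
def jaccard_coefficient(sample_a, sample_b):
    """Shared taxa divided by all taxa present in either sample (presence/absence only)."""
    present_a = {taxon for taxon, count in sample_a.items() if count > 0}
    present_b = {taxon for taxon, count in sample_b.items() if count > 0}
    return len(present_a & present_b) / len(present_a | present_b)


def bray_curtis_similarity(sample_a, sample_b):
    """Twice the summed minimum abundances divided by the total abundance of both samples."""
    taxa = set(sample_a) | set(sample_b)
    shared = sum(min(sample_a.get(t, 0), sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.values()) + sum(sample_b.values())
    return 2 * shared / total


# Hypothetical replicate samples from one site, for illustration only.
replicate_1 = {"mayfly": 10, "caddisfly": 5, "midge": 0}
replicate_2 = {"mayfly": 6, "caddisfly": 0, "stonefly": 4}

jaccard = jaccard_coefficient(replicate_1, replicate_2)
bray_curtis = bray_curtis_similarity(replicate_1, replicate_2)
print(f"Jaccard: {jaccard:.3f}, Bray-Curtis: {bray_curtis:.2f}")
```

    Averaging such values over all replicate pairs of a site would give a within-site mean similarity of the kind the abstract proposes as a stability criterion.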

  2. Discrimination and prediction of cultivation age and parts of Panax ginseng by Fourier-transform infrared spectroscopy combined with multivariate statistical analysis.

    PubMed

    Lee, Byeong-Ju; Kim, Hye-Youn; Lim, Sa Rang; Huang, Linfang; Choi, Hyung-Kyoon

    2017-01-01

    Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values.

  3. Discrimination and prediction of cultivation age and parts of Panax ginseng by Fourier-transform infrared spectroscopy combined with multivariate statistical analysis

    PubMed Central

    Lim, Sa Rang; Huang, Linfang

    2017-01-01

    Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values. PMID:29049369

  4. Functional mixture regression.

    PubMed

    Yao, Fang; Fu, Yuejiao; Lee, Thomas C M

    2011-04-01

    In functional linear models (FLMs), the relationship between the scalar response and the functional predictor process is often assumed to be identical for all subjects. Motivated by both practical and methodological considerations, we relax this assumption and propose a new class of functional regression models that allow the regression structure to vary for different groups of subjects. By projecting the predictor process onto its eigenspace, the new functional regression model is simplified to a framework that is similar to classical mixture regression models. This leads to the proposed approach named as functional mixture regression (FMR). The estimation of FMR can be readily carried out using existing software implemented for functional principal component analysis and mixture regression. The practical necessity and performance of FMR are illustrated through applications to a longevity analysis of female medflies and a human growth study. Theoretical investigations concerning the consistent estimation and prediction properties of FMR along with simulation experiments illustrating its empirical properties are presented in the supplementary material available at Biostatistics online. Corresponding results demonstrate that the proposed approach could potentially achieve substantial gains over traditional FLMs.

  5. Multivariate Time Series Decomposition into Oscillation Components.

    PubMed

    Matsuda, Takeru; Komaki, Fumiyasu

    2017-08-01

    Many time series are considered to be a superposition of several oscillation components. We have proposed a method for decomposing univariate time series into oscillation components and estimating their phases (Matsuda & Komaki, 2017). In this study, we extend that method to multivariate time series. We assume that several oscillators underlie the given multivariate time series and that each variable corresponds to a superposition of the projections of the oscillators. Thus, the oscillators superpose on each variable with amplitude and phase modulation. Based on this idea, we develop Gaussian linear state-space models and use them to decompose the given multivariate time series. The model parameters are estimated from data using the empirical Bayes method, and the number of oscillators is determined using the Akaike information criterion. Therefore, the proposed method extracts underlying oscillators in a data-driven manner and enables investigation of phase dynamics in a given multivariate time series. Numerical results show the effectiveness of the proposed method. From monthly mean north-south sunspot number data, the proposed method reveals an interesting phase relationship.

  6. F100 Multivariable Control Synthesis Program. Computer Implementation of the F100 Multivariable Control Algorithm

    NASA Technical Reports Server (NTRS)

    Soeder, J. F.

    1983-01-01

    As turbofan engines become more complex, the development of controls necessitates the use of multivariable control techniques. A control developed for the F100-PW-100(3) turbofan engine by using linear quadratic regulator theory and other modern multivariable control synthesis techniques is described. The assembly language implementation of this control on an SEL 810B minicomputer is described. This implementation was then evaluated by using a real-time hybrid simulation of the engine. The control software was modified to run with a real engine. These modifications, in the form of sensor and actuator failure checks and control executive sequencing, are discussed. Finally, recommendations for control software implementations are presented.

  7. The Neurosurgery Match: A Bibliometric Analysis of 206 First-Year Residents.

    PubMed

    Kashkoush, Ahmed; Prabhu, Arpan V; Tonetti, Daniel; Agarwal, Nitin

    2017-09-01

    An important component of the residency application for neurosurgery is research experience and the subsequent number of produced publications. Bibliometrics research has been developed to establish quantitative methods for the standardization of publishing impactful research. This study aims to quantify the research productivity of medical students who successfully matriculated into a Neurosurgery residency program. We initially identified first-year neurosurgery residents for the 2016-2017 academic year of all U.S. neurosurgical residency programs through departmental websites. The Scopus database was then queried for all articles published in the years 2006 to 2015 by first-year residents and bibliometric variables, such as publication count, journal impact factors, and author h-index, were extracted. The main outcome measured was residency program, tiered 1-5 by total departmental faculty research output. Two hundred six (206) Scopus records for first-year neurosurgery residents were identified in 99 programs nationwide. Multivariate ordinal regression demonstrated that only h-index was independently associated with tier of matriculation (P = 0.043). H-index was observed to strongly correlate with the number of original research articles (P = 0.005), years since first publication (P < 0.0001), and journal impact factor (P = 0.048) as assessed by multiple linear regression. Notably, h-index was observed to increase by approximately 1 point with every 3 original research articles (B = 0.368) and 4 years since first publication (B = 0.257). H-index is a powerful research predictor of matching into neurosurgical research institutions and can be improved by starting research early, targeting high impact journals, and participating in original clinical and laboratory investigations. Copyright © 2017 Elsevier Inc. All rights reserved.
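
    The h-index used as the main bibliometric variable above is the largest h such that an author has h papers with at least h citations each. The sketch below computes it from a list of citation counts (the example counts are hypothetical) and converts the reported regression slopes into articles and years per h-index point.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for position, citations in enumerate(ranked, start=1):
        if citations >= position:
            h = position
        else:
            break
    return h


# Hypothetical citation counts for one author, for illustration only.
example_h = h_index([10, 8, 5, 4, 3])

# Slopes reported in the abstract: B = 0.368 per article, B = 0.257 per year.
articles_per_point = 1 / 0.368
years_per_point = 1 / 0.257
print(example_h, round(articles_per_point, 1), round(years_per_point, 1))
```

    The reciprocals of the slopes, about 2.7 articles and 3.9 years, correspond to the abstract's rounded figures of roughly 3 articles and 4 years per point.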

  8. Statistical experiments using the multiple regression research for prediction of proper hardness in areas of phosphorus cast-iron brake shoes manufacturing

    NASA Astrophysics Data System (ADS)

    Kiss, I.; Cioată, V. G.; Ratiu, S. A.; Rackov, M.; Penčić, M.

    2018-01-01

    Multivariate research is important in the manufacturing of cast-iron brake shoes, because many variables interact with each other simultaneously. This article focuses on expressing a multiple linear regression model that relates hardness to the chemical composition of the phosphorus cast irons used for brake shoes, bearing in mind that the regression coefficients illustrate the separate contribution of each independent variable towards predicting the dependent variable. To establish the multiple correlations between the hardness of the cast-iron brake shoes and their chemical composition, several regression equations have been proposed. A mathematical solution is sought that can determine the optimum chemical composition for the desired hardness values. On this basis, two new statistical experiments were carried out on the contents of phosphorus [P], manganese [Mn] and silicon [Si]. The regression equations that describe the mathematical dependency between these elements and the hardness are thus determined. As a result, several correlation charts are presented.

  9. Simultaneous Force Regression and Movement Classification of Fingers via Surface EMG within a Unified Bayesian Framework.

    PubMed

    Baldacchino, Tara; Jacobs, William R; Anderson, Sean R; Worden, Keith; Rowson, Jennifer

    2018-01-01

    This contribution presents a novel methodology for myoelectric-based control using surface electromyographic (sEMG) signals recorded during finger movements. A multivariate Bayesian mixture of experts (MoE) model is introduced which provides a powerful method for modeling force regression at the fingertips, while also performing finger movement classification as a by-product of the modeling algorithm. Bayesian inference of the model allows uncertainties to be naturally incorporated into the model structure. This method is tested using data from the publicly released NinaPro database which consists of sEMG recordings for 6 degree-of-freedom force activations for 40 intact subjects. The results demonstrate that the MoE model achieves similar performance compared to the benchmark set by the authors of NinaPro for finger force regression. Additionally, inherent to the Bayesian framework is the inclusion of uncertainty in the model parameters, naturally providing confidence bounds on the force regression predictions. Furthermore, the integrated clustering step allows a detailed investigation into classification of the finger movements, without incurring any extra computational effort. Subsequently, a systematic approach to assessing the importance of the number of electrodes needed for accurate control is performed via sensitivity analysis techniques. A slight degradation in regression performance is observed for a reduced number of electrodes, while classification performance is unaffected.

  10. Grade of hypospadias is the only factor predicting for re-intervention after primary hypospadias repair: a multivariate analysis from a cohort of 474 patients.

    PubMed

    Spinoit, Anne-Françoise; Poelaert, Filip; Van Praet, Charles; Groen, Luitzen-Albert; Van Laecke, Erik; Hoebeke, Piet

    2015-04-01

    There is an ongoing quest on how to minimize complications in hypospadias surgery. There is, however, a lack of high-quality data on the following parameters that might influence the outcome of primary hypospadias repair: age at initial surgery, the type of suture material, the initial technique, and the type of hypospadias. The objective of this study was to identify independent predictors for re-intervention in primary hypospadias repair. We retrospectively analyzed our database of 474 children undergoing primary hypospadias surgery. Univariate and multivariate logistic regression were performed to identify variables associated with re-intervention. A p-value <0.05 was considered statistically significant, and variables meeting this threshold were considered prognostic factors for re-intervention. Distal penile hypospadias was reported in 77.2% (n = 366), midpenile in 11.4% (n = 54) and proximal in 11.4% (n = 54) of children. Initial repair was based on an incised plate technique in 39.9% (n = 189), meatal advancement in 36.0% (n = 171), an onlay flap in 17.3% (n = 82) and other or combined techniques in 5.3% (n = 25). Re-intervention was required in 114 patients (24.1%), of which 54 re-interventions (47.4%) were performed within the first year post-surgery, 17 (14.9%) in the second year and 43 (37.7%) later than 2 years after initial surgery. The reason for the first re-intervention was fistula in 52 patients (46.4%), meatal stenosis in 32 (28.6%), cosmesis in 35 (31.3%) and other in 14 (12.5%). The median time to re-intervention was 14 months after surgery [range 0-114]. Significant predictors for re-intervention on univariate logistic regression (polyglactin suture material versus poliglecaprone, proximal hypospadias, lower age at operation and other than meatal advancement repair) were put in a multivariate logistic regression model. Of all significant variables, only proximal hypospadias remained an independent predictor for re-intervention (OR 3.27; p = 0.012). The grade of

  11. Nonmedical Prescription Stimulant Use Among Girls 10–18 Years of Age: Associations With Other Risky Behavior

    PubMed Central

    Striley, Catherine Woodstock; Kelso-Chichetto, Natalie E.; Cottler, Linda B.

    2017-01-01

    Purpose Little is known about the risk factors for nonmedical use (NMU) of prescription stimulants among adolescent girls. We aimed to measure the association of nonmedical prescription stimulant use with empirically linked risk factors, including weight control behavior (WCB), gambling, and depressed mood, in pre-teen and teenaged girls. Methods We assessed the relationship between age and race, gambling, WCB, depressive mood, and nonmedical prescription stimulant use using multivariable logistic regression. The study sample included 5,585 females, aged 10–18 years, recruited via an entertainment venue intercept method in 10 U.S. metropolitan areas as part of the National Monitoring of Adolescent Prescription Stimulants Study (2008–2011). Results NMU of prescription stimulants was reported by 6.6% (n = 370) of the sample. In multivariable logistic regression, 1-year increase in age was associated with a 21% (95% confidence interval [CI]: .15, .28) increase in risk for NMU. Whites and other race/ethnicity girls had 2.67 (CI: 1.85, 3.87) and 1.71 (1.11, 2.65) times higher odds for NMU, compared to African-Americans. Depressive mood (adjusted odds ratio: 2.69, CI: 2.04, 5.57) and gambling (adjusted odds ratio: 1.90, 1.23, 2.92) were associated with increased odds for NMU. A dose-response was identified between WCB and NMU, where girls with unhealthy and extreme WCB were over five times more likely to endorse NMU. Conclusions We contribute to the literature linking WCB, depression, gambling, and the NMU of prescription stimulants in any population and uniquely do so among girls. PMID:27998704
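    The percentages and odds ratios quoted in the abstract above are related by simple arithmetic. The sketch below is illustrative only: it assumes the reported 21% per-year increase corresponds to an odds ratio of 1.21 per year of age, and that the log-odds are linear in age, as in a standard logistic regression. The function names are hypothetical and are not taken from the study.

```python
import math


def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in the odds implied by an odds ratio (1.21 -> about 21%)."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_over_units(odds_ratio_per_unit: float, units: float) -> float:
    """Odds ratio for a change of `units` in the predictor.

    Assumes a logistic model, where log-odds are linear in the predictor,
    so per-unit odds ratios multiply rather than add.
    """
    return math.exp(units * math.log(odds_ratio_per_unit))


if __name__ == "__main__":
    per_year = 1.21  # assumed reading of the reported 21% per-year increase
    print(round(percent_change_in_odds(per_year), 1))
    print(round(odds_ratio_over_units(per_year, 3), 2))
```

    Under these assumptions, a three-year age difference corresponds to the per-year odds ratio cubed, not to three times 21%.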

  12. Validated univariate and multivariate spectrophotometric methods for the determination of pharmaceuticals mixture in complex wastewater

    NASA Astrophysics Data System (ADS)

    Riad, Safaa M.; Salem, Hesham; Elbalkiny, Heba T.; Khattab, Fatma I.

    2015-04-01

    Five accurate, precise, and sensitive univariate and multivariate spectrophotometric methods were developed for the simultaneous determination of a ternary mixture containing Trimethoprim (TMP), Sulphamethoxazole (SMZ) and Oxytetracycline (OTC) in wastewater samples collected from different sites, either production wastewater or livestock wastewater, after their solid phase extraction using OASIS HLB cartridges. In univariate methods OTC was determined at its λmax 355.7 nm (0D), while (TMP) and (SMZ) were determined by three different univariate methods. Method (A) is based on successive spectrophotometric resolution technique (SSRT). The technique starts with the ratio subtraction method followed by ratio difference method for determination of TMP and SMZ. Method (B) is successive derivative ratio technique (SDR). Method (C) is mean centering of the ratio spectra (MCR). The developed multivariate methods are principal component regression (PCR) and partial least squares (PLS). The specificity of the developed methods is investigated by analyzing laboratory prepared mixtures containing different ratios of the three drugs. The obtained results are statistically compared with those obtained by the official methods, showing no significant difference with respect to accuracy and precision at p = 0.05.

  13. Validated univariate and multivariate spectrophotometric methods for the determination of pharmaceuticals mixture in complex wastewater.

    PubMed

    Riad, Safaa M; Salem, Hesham; Elbalkiny, Heba T; Khattab, Fatma I

    2015-04-05

    Five accurate, precise, and sensitive univariate and multivariate spectrophotometric methods were developed for the simultaneous determination of a ternary mixture containing Trimethoprim (TMP), Sulphamethoxazole (SMZ) and Oxytetracycline (OTC) in wastewater samples collected from different sites, either production wastewater or livestock wastewater, after their solid phase extraction using OASIS HLB cartridges. In univariate methods OTC was determined at its λmax 355.7 nm (0D), while (TMP) and (SMZ) were determined by three different univariate methods. Method (A) is based on successive spectrophotometric resolution technique (SSRT). The technique starts with the ratio subtraction method followed by ratio difference method for determination of TMP and SMZ. Method (B) is successive derivative ratio technique (SDR). Method (C) is mean centering of the ratio spectra (MCR). The developed multivariate methods are principal component regression (PCR) and partial least squares (PLS). The specificity of the developed methods is investigated by analyzing laboratory prepared mixtures containing different ratios of the three drugs. The obtained results are statistically compared with those obtained by the official methods, showing no significant difference with respect to accuracy and precision at p=0.05. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. Revisiting Regression in Autism: Heller's "Dementia Infantilis"

    ERIC Educational Resources Information Center

    Westphal, Alexander; Schelinski, Stefanie; Volkmar, Fred; Pelphrey, Kevin

    2013-01-01

    Theodor Heller first described a severe regression of adaptive function in normally developing children, something he termed dementia infantilis, over 100 years ago. Dementia infantilis is most closely related to the modern diagnosis, childhood disintegrative disorder. We translate Heller's paper, Über Dementia Infantilis, and discuss…

  15. A multivariate method for estimating mortality rates among children under 5 years from health and social indicators in Iraq.

    PubMed

    Garfield, R; Leu, C S

    2000-06-01

    Many reports on Iraq suggest that a rise in rates of death and disease has occurred since the Gulf War of January/February 1991 and the economic sanctions that followed it. Four preliminary models, based on unadjusted projections, were developed. A logistic regression model was then developed on the basis of six social variables in Iraq and comparable information from countries in the State of the World's Children report. Missing data were estimated for this model by a multiple imputation procedure. The final model depends on three socio-medical indicators: adult literacy, nutritional stunting of children under 5 years, and access to piped water. The model successfully predicted both the mortality rate in 1990, under stable conditions, and in 1991, following the Gulf War. For 1996, after 5 years of sanctions and prior to receipt of humanitarian food via the oil for food programme, this model shows mortality among children under 5 to have reached an estimated 87 per 1000, a rate last experienced more than 30 years ago. Accurate and timely estimates of mortality levels in developing countries are costly and require considerable methodological expertise. A rapid estimation technique like the one developed here may be a useful tool for quick and efficient estimation of mortality rates among under 5 year olds in countries where good mortality data are not routinely available. This is especially true for countries with complex humanitarian emergencies where information on mortality changes can guide interventions and the social stability to use standard demographic methods does not exist.

  16. Rejection of Multivariate Outliers.

    DTIC Science & Technology

    1983-05-01

    available in Gnanadesikan (1977). 2 The motivation for the present investigation lies in a recent paper of Schwager and Margolin (1982) who derive a... Gnanadesikan, R. (1977). Methods for Statistical Data Analysis of Multivariate Observations. Wiley, New York. [7] Hawkins, D.M. (1980). Identification of

  17. On a Family of Multivariate Modified Humbert Polynomials

    PubMed Central

    Aktaş, Rabia; Erkuş-Duman, Esra

    2013-01-01

    This paper attempts to present a multivariable extension of generalized Humbert polynomials. The results obtained here include various families of multilinear and multilateral generating functions, miscellaneous properties, and also some special cases for these multivariable polynomials. PMID:23935411

  18. Hostility among adolescents in Switzerland? multivariate relations between excessive media use and forms of violence.

    PubMed

    Kuntsche, Emmanuel N

    2004-03-01

    To determine what kind of violence-related behavior or opinion is directly related to excessive media use among adolescents in Switzerland. A national representative sample of 4222 schoolchildren (7th- and 8th-graders; mean age 13.9 years) answered questions on the frequency of television-viewing, electronic game-playing, feeling unsafe at school, bullying others, hitting others, and fighting with others, as part of the Health Behaviour in School-Aged Children (HBSC) international collaborative study protocol. The Chi-square tests and multiple logistic regression analyses were applied to high-risk groups of adolescents. For the total sample, all bivariate relationships between television-viewing/electronic game-playing and each violence-related variable are significant. In the multivariate comparison, physical violence among boys ceases to be significant. For girls, only television-viewing is linked to indirect violence. Against the hypothesis, females' electronic game-playing only had a bearing on hitting others. Experimental designs are needed that take into account gender, different forms of media, and violence to answer the question of whether excessive media use leads to violent behavior. With the exception of excessive electronic game-playing among girls, this study found that electronic media are not thought to lead directly to real-life violence but to hostility and indirect violence.

  19. Weight, socio-demographics, and health behaviour related correlates of academic performance in first year university students

    PubMed Central

    2013-01-01

    Background This study aimed to examine differences in socio-demographics and health behaviour between Belgian first year university students who attended all final course exams and those who did not. Secondly, this study aimed to identify weight and health behaviour related correlates of academic performance in those students who attended all course exams. Methods Anthropometrics of 101 first year university students were measured at both the beginning of the first (T1) and second (T2) semester of the academic year. An on-line health behaviour questionnaire was filled out at T2. As a measure of academic performance student end-of-year Grade Point Averages (GPA) were obtained from the university’s registration office. Independent samples t-tests and chi-square tests were executed to compare students who attended all course exams during the first year of university and students who did not carry through. Uni- and multivariate linear regression analyses were conducted to identify correlates of academic performance in students who attended all course exams during the first year of university. Results Students who did not attend all course exams were predominantly male, showed higher increases in waist circumference during the first semester and consumed more French fries than those who attended all final course exams. Being male, lower secondary school grades, increases in weight, Body Mass Index and waist circumference over the first semester, more gaming on weekdays, being on a diet, eating at the student restaurant more frequently, higher soda and French fries consumption, and higher frequency of alcohol use predicted lower GPAs in first year university students. When controlled for each other, being on a diet and higher frequency of alcohol use remained significant in the multivariate regression model, with frequency of alcohol use being the strongest correlate of GPA. Conclusions This study, conducted in Belgian first year university students, showed that

  20. Multivariate analysis: A statistical approach for computations

    NASA Astrophysics Data System (ADS)

    Michu, Sachin; Kaushik, Vandana

    2014-10-01

    Multivariate analysis is a statistical approach commonly used in automotive diagnosis, education, cluster evaluation in finance and, more recently, the health-related professions. The objective of the paper is to provide a detailed exploratory discussion of factor analysis (FA) in image retrieval and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis provides an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include various attacks on the network, such as DDoS attacks and network scanning.

  1. [Multivariate study of the psychosocial factors affecting public attitude towards organ donation].

    PubMed

    Conesa, C; Ríos, A; Ramírez, P; Canteras, M; Rodríguez, M M; Parrilla, P

    2005-01-01

    Organ transplantation is a therapy which depends on society for its development. The objectives here are: 1) to understand the structure of public opinion towards organ donation in the population over 15 years of age in our Community; 2) to analyse the psychosocial variables which affect this opinion and 3) to define the population profiles on this matter. Random sample (n = 2,000) stratified for age, sex and geographical location (error for 95.5%, e +/- 2.24) to whom we apply a questionnaire about the psychosocial aspects of organ donation. Descriptive statistics, Student's t-test, Chi-squared test and logistic regression analysis. 63% have a favourable attitude towards organ donation, of which 11% have a donor's card. A statistical association has been observed between favourable public opinion and different psychosocial variables (p < 0.05), with some independent variables persisting in the multivariate analysis such as age, level of education (OR = 1.78), information given by family members (OR = 1.62), health workers (OR = 2.01) and talks in educational centres (OR = 2.13); previous experience with donation and transplantation (OR = 2.02), knowledge of the concept of brain death (OR = 1.4); partner's favourable opinion towards donation (OR = 2.6), being a blood donor (OR = 3), taking part in prosocial activities (OR = 1.6) and attitude towards incineration of the cadaver after death (OR = 1.8). The profile of a person who is against donation is of a man or woman, > 50 years of age, with primary studies or below, with no previous experience of the matter, who does not understand the concept of brain death nor their partner's opinion towards donation, who has not found out any information about donation through specialised forums, with an unfavourable opinion towards blood donation or pro-social activities and who is fearful of manipulation of the cadaver after death.

  2. Extraction, isolation, and purification of analytes from samples of marine origin--a multivariate task.

    PubMed

    Liguori, Lucia; Bjørsvik, Hans-René

    2012-12-01

    The development of a multivariate study for a quantitative analysis of six different polybrominated diphenyl ethers (PBDEs) in tissue of Atlantic Salmo salar L. is reported. An extraction, isolation, and purification process based on an accelerated solvent extraction system was designed, investigated, and optimized by means of statistical experimental design and multivariate data analysis and regression. An accompanying gas chromatography-mass spectrometry analytical method was developed for the identification and quantification of the analytes, BDE 28, BDE 47, BDE 99, BDE 100, BDE 153, and BDE 154. These PBDEs have been used in commercial blends that were used as flame-retardants for a variety of materials, including electronic devices, synthetic polymers and textiles. The present study revealed that an extracting solvent mixture composed of hexane and CH₂Cl₂ (10:90) provided excellent recoveries of all of the six PBDEs studied herein. A somewhat lower polarity in the extracting solvent, hexane and CH₂Cl₂ (40:60), decreased the analyte %-recoveries, which still remain acceptable and satisfactory. The study demonstrates the necessity of performing a thorough investigation of the extraction and purification process in order to achieve quantitative isolation of the analytes from the specific matrix. Copyright © 2012 Elsevier B.V. All rights reserved.

  3. Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy.

    PubMed

    Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L

    2017-05-07

    In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
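    The abstract above compares models by area under the ROC curve (AUC). As a rough illustration of what that number measures, the sketch below computes AUC as the fraction of (case, control) pairs in which the case receives the higher predicted probability, with ties counted as half. The scores are made-up values, not data from the study, and the function name is hypothetical.

```python
def auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control contribute 0.5 to the count.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical predicted toxicity probabilities, for illustration only.
    patients_with_toxicity = [0.8, 0.4]
    patients_without_toxicity = [0.6, 0.2]
    print(auc(patients_with_toxicity, patients_without_toxicity))
```

    The pairwise loop is quadratic in the number of patients, which is fine for a cohort of this size; a rank-based formula would give the same value for larger data sets.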

  4. Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy

    NASA Astrophysics Data System (ADS)

    Dankers, Frank; Wijsman, Robin; Troost, Esther G. C.; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L.

    2017-05-01

    In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.

  5. Correlation of porous and functional properties of food materials by NMR relaxometry and multivariate analysis.

    PubMed

    Haiduc, Adrian Marius; van Duynhoven, John

    2005-02-01

    The porous properties of food materials are known to determine important macroscopic parameters such as water-holding capacity and texture. In conventional approaches, understanding is built from a long process of establishing macrostructure-property relations in a rational manner. Only recently, multivariate approaches were introduced for the same purpose. The model systems used here are oil-in-water emulsions, stabilised by protein, and form complex structures, consisting of fat droplets dispersed in a porous protein phase. NMR time-domain decay curves were recorded for emulsions with varied levels of fat, protein and water. Hardness, dry matter content and water drainage were determined by classical means and analysed for correlation with the NMR data with multivariate techniques. Partial least squares can calibrate and predict these properties directly from the continuous NMR exponential decays and yields regression coefficients higher than 82%. However, the calibration coefficients themselves belong to the continuous exponential domain and do little to explain the connection between NMR data and emulsion properties. Transformation of the NMR decays into a discrete domain with non-negative least squares permits the use of multilinear regression (MLR) on the resulting amplitudes as predictors and hardness or water drainage as responses. The MLR coefficients show that hardness is highly correlated with the components that have T2 distributions of about 20 and 200 ms whereas water drainage is correlated with components that have T2 distributions around 400 and 1800 ms. These T2 distributions very likely correlate with water populations present in pores with different sizes and/or wall mobility. The results for the emulsions studied demonstrate that NMR time-domain decays can be employed to predict properties and to provide insight into the underlying microstructural features.

  6. The role of middle-class status in payday loan borrowing: a multivariate approach.

    PubMed

    Lim, Younghee; Bickham, Trey; Broussard, Julia; Dinecola, Cassie M; Gregory, Alethia; Weber, Brittany E

    2014-10-01

    Payday loans refer to small-dollar, high-interest, short-term loans usually extended to lower-income consumers. Despite much research to the contrary, the payday loan industry asserts that it primarily serves middle-class Americans. This article discusses the authors' investigation of the industry's claim, by analyzing data from a U.S. bankruptcy court serving a Southern district. Results of the multivariate binary logistic regression analysis showed that, controlling for various sociodemographic and economic variables, two middle-class indicators--home-ownership and annual income at or greater than the median income--are associated with a decreased likelihood of using payday loans. The article concludes with a discussion of the implications of the results for social work practice and advocacy in regard to financial capability, particularly asset development, income maintenance, and payday loan regulation.

  7. THE REGRESSION MODEL OF IRAN LIBRARIES ORGANIZATIONAL CLIMATE

    PubMed Central

    Jahani, Mohammad Ali; Yaminfirooz, Mousa; Siamian, Hasan

    2015-01-01

    Background: The purpose of this study was to draw a regression model of the organizational climate of the central libraries of Iran’s universities. Methods: This study is applied research. The statistical population of this study consisted of 96 employees of the central libraries of Iran’s public universities, selected from among the 117 universities affiliated to the Ministry of Health by the stratified sampling method (510 people). The localized ClimateQUAL questionnaire was used as the research tool. Multivariate linear regression and a track diagram were used to predict the organizational climate pattern of the libraries. Results: Of the 9 variables affecting organizational climate, 5 variables (innovation, teamwork, customer service, psychological safety and deep diversity) play a major role in predicting the organizational climate of Iran’s libraries. The results also indicate that each of these variables, with a different coefficient, has the power to predict organizational climate, but the climate score of psychological safety (0.94) plays a very crucial role in predicting the organizational climate. The track diagram showed that the five variables of teamwork, customer service, psychological safety, deep diversity and innovation directly affect the organizational climate variable, and that the contribution of teamwork to this influence is greater than that of any other variable. Conclusions: Of the ClimateQUAL indicators of organizational climate, teamwork contributes more to this influence than any other variable, so reinforcing teamwork in academic libraries can be more effective in improving the organizational climate of this type of library. PMID:26622203

  8. Simulating Multivariate Nonnormal Data Using an Iterative Algorithm

    ERIC Educational Resources Information Center

    Ruscio, John; Kaczetow, Walter

    2008-01-01

    Simulating multivariate nonnormal data with specified correlation matrices is difficult. One especially popular method is Vale and Maurelli's (1983) extension of Fleishman's (1978) polynomial transformation technique to multivariate applications. This requires the specification of distributional moments and the calculation of an intermediate…

  9. Spatio-temporal variations of nitric acid total columns from 9 years of IASI measurements - a driver study

    NASA Astrophysics Data System (ADS)

    Ronsmans, Gaétane; Wespes, Catherine; Hurtmans, Daniel; Clerbaux, Cathy; Coheur, Pierre-François

    2018-04-01

    This study aims to understand the spatial and temporal variability of HNO3 total columns in terms of explanatory variables. To achieve this, multiple linear regressions are used to fit satellite-derived time series of HNO3 daily averaged total columns. First, an analysis of the IASI 9-year time series (2008-2016) is conducted based on various equivalent latitude bands. The strong and systematic denitrification of the southern polar stratosphere is observed very clearly. It is also possible to distinguish, within the polar vortex, three regions which are differently affected by the denitrification. Three exceptional denitrification episodes in 2011, 2014 and 2016 are also observed in the Northern Hemisphere, due to unusually low arctic temperatures. The time series are then fitted by multivariate regressions to identify what variables are responsible for HNO3 variability in global distributions and time series, and to quantify their respective influence. Out of an ensemble of proxies (annual cycle, solar flux, quasi-biennial oscillation, multivariate ENSO index, Arctic and Antarctic oscillations and volume of polar stratospheric clouds), only those defined as significant (p value < 0.05) by a selection algorithm are retained for each equivalent latitude band. Overall, the regression gives a good representation of HNO3 variability, with especially good results at high latitudes (60-80 % of the observed variability explained by the model). The regressions show the dominance of annual variability in all latitudinal bands, which is related to specific chemistry and dynamics depending on the latitudes. We find that the polar stratospheric clouds (PSCs) also have a major influence in the polar regions, and that their inclusion in the model improves the correlation coefficients and the residuals. However, there is still a relatively large portion of HNO3 variability that remains unexplained by the model, especially in the intertropical regions, where factors not

  10. Multivariate generalized multifactor dimensionality reduction to detect gene-gene interactions

    PubMed Central

    2013-01-01

    Background Recently, one of the greatest challenges in genome-wide association studies is to detect gene-gene and/or gene-environment interactions for common complex human diseases. Ritchie et al. (2001) proposed multifactor dimensionality reduction (MDR) method for interaction analysis. MDR is a combinatorial approach to reduce multi-locus genotypes into high-risk and low-risk groups. Although MDR has been widely used for case-control studies with binary phenotypes, several extensions have been proposed. One of these methods, a generalized MDR (GMDR) proposed by Lou et al. (2007), allows adjusting for covariates and applying to both dichotomous and continuous phenotypes. GMDR uses the residual score of a generalized linear model of phenotypes to assign either high-risk or low-risk group, while MDR uses the ratio of cases to controls. Methods In this study, we propose multivariate GMDR, an extension of GMDR for multivariate phenotypes. Jointly analysing correlated multivariate phenotypes may have more power to detect susceptible genes and gene-gene interactions. We construct generalized estimating equations (GEE) with multivariate phenotypes to extend generalized linear models. Using the score vectors from GEE we discriminate high-risk from low-risk groups. We applied the multivariate GMDR method to the blood pressure data of the 7,546 subjects from the Korean Association Resource study: systolic blood pressure (SBP) and diastolic blood pressure (DBP). We compare the results of multivariate GMDR for SBP and DBP to the results from separate univariate GMDR for SBP and DBP, respectively. We also applied the multivariate GMDR method to the repeatedly measured hypertension status from 5,466 subjects and compared its result with those of univariate GMDR at each time point. Results Results from the univariate GMDR and multivariate GMDR in two-locus model with both blood pressures and hypertension phenotypes indicate best combinations of SNPs whose interaction has

  11. Dirichlet Component Regression and its Applications to Psychiatric Data.

    PubMed

    Gueorguieva, Ralitza; Rosenheck, Robert; Zelterman, Daniel

    2008-08-15

    We describe a Dirichlet multivariable regression method useful for modeling data representing components as a percentage of a total. This model is motivated by the unmet need in psychiatry and other areas to simultaneously assess the effects of covariates on the relative contributions of different components of a measure. The model is illustrated using the Positive and Negative Syndrome Scale (PANSS) for assessment of schizophrenia symptoms which, like many other metrics in psychiatry, is composed of a sum of scores on several components, each in turn, made up of sums of evaluations on several questions. We simultaneously examine the effects of baseline socio-demographic and co-morbid correlates on all of the components of the total PANSS score of patients from a schizophrenia clinical trial and identify variables associated with increasing or decreasing relative contributions of each component. Several definitions of residuals are provided. Diagnostics include measures of overdispersion, Cook's distance, and a local jackknife influence metric.
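    The abstract above models component scores as shares of a total. The sketch below illustrates only that basic setup, under stated assumptions: three made-up component scores are converted to proportions, and the log-density of those proportions is evaluated under a Dirichlet distribution with given concentration parameters. The function names and numbers are hypothetical; the authors' regression link between covariates and the Dirichlet parameters is not reproduced here.

```python
import math


def to_proportions(component_scores):
    """Convert component scores into shares of their total (shares sum to 1)."""
    total = sum(component_scores)
    if total <= 0:
        raise ValueError("total score must be positive")
    return [score / total for score in component_scores]


def dirichlet_log_density(proportions, alphas):
    """Log-density of a Dirichlet(alphas) distribution at `proportions`.

    Every proportion must be strictly positive and every alpha positive.
    """
    if len(proportions) != len(alphas):
        raise ValueError("proportions and alphas must have the same length")
    log_norm = math.lgamma(sum(alphas)) - sum(math.lgamma(a) for a in alphas)
    log_kernel = sum((a - 1.0) * math.log(p) for a, p in zip(alphas, proportions))
    return log_norm + log_kernel


if __name__ == "__main__":
    # Hypothetical scores on three components of a summed scale.
    shares = to_proportions([20, 25, 35])
    print(shares)
    print(dirichlet_log_density(shares, [2.0, 2.5, 3.5]))
```

    In a regression setting the concentration parameters would depend on each patient's covariates; here they are fixed constants chosen for illustration.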

  12. Dirichlet Component Regression and its Applications to Psychiatric Data

    PubMed Central

    Gueorguieva, Ralitza; Rosenheck, Robert; Zelterman, Daniel

    2011-01-01

    Summary We describe a Dirichlet multivariable regression method useful for modeling data representing components as a percentage of a total. This model is motivated by the unmet need in psychiatry and other areas to simultaneously assess the effects of covariates on the relative contributions of different components of a measure. The model is illustrated using the Positive and Negative Syndrome Scale (PANSS) for assessment of schizophrenia symptoms which, like many other metrics in psychiatry, is composed of a sum of scores on several components, each in turn, made up of sums of evaluations on several questions. We simultaneously examine the effects of baseline socio-demographic and co-morbid correlates on all of the components of the total PANSS score of patients from a schizophrenia clinical trial and identify variables associated with increasing or decreasing relative contributions of each component. Several definitions of residuals are provided. Diagnostics include measures of overdispersion, Cook’s distance, and a local jackknife influence metric. PMID:22058582

  13. Investigating College and Graduate Students' Multivariable Reasoning in Computational Modeling

    ERIC Educational Resources Information Center

    Wu, Hsin-Kai; Wu, Pai-Hsing; Zhang, Wen-Xin; Hsu, Ying-Shao

    2013-01-01

    Drawing upon the literature in computational modeling, multivariable reasoning, and causal attribution, this study aims at characterizing multivariable reasoning practices in computational modeling and revealing the nature of understanding about multivariable causality. We recruited two freshmen, two sophomores, two juniors, two seniors, four…

  14. Modelling nitrate pollution pressure using a multivariate statistical approach: the case of Kinshasa groundwater body, Democratic Republic of Congo

    NASA Astrophysics Data System (ADS)

    Mfumu Kihumba, Antoine; Ndembo Longo, Jean; Vanclooster, Marnik

    2016-03-01

    A multivariate statistical modelling approach was applied to explain the anthropogenic pressure of nitrate pollution on the Kinshasa groundwater body (Democratic Republic of Congo). Multiple regression and regression tree models were compared and used to identify major environmental factors that control the groundwater nitrate concentration in this region. The analyses were made in terms of physical attributes related to the topography, land use, geology and hydrogeology in the capture zone of different groundwater sampling stations. For the nitrate data, groundwater datasets from two different surveys were used. The statistical models identified the topography, the residential area, the service land (cemetery), and the surface-water land-use classes as major factors explaining nitrate occurrence in the groundwater. Also, groundwater nitrate pollution depends not on one single factor but on the combined influence of factors representing nitrogen loading sources and aquifer susceptibility characteristics. The groundwater nitrate pressure was better predicted with the regression tree model than with the multiple regression model. Furthermore, the results elucidated the sensitivity of the model performance towards the method of delineation of the capture zones. For pollution modelling at the monitoring points, therefore, it is better to identify capture-zone shapes based on a conceptual hydrogeological model rather than to adopt arbitrary circular capture zones.

  15. Emergence and predictors of alcohol reference displays on Facebook during the first year of college

    PubMed Central

    Moreno, Megan A; D’Angelo, Jonathan; Kacvinsky, Lauren E.; Kerr, Bradley; Zhang, Chong; Eickhoff, Jens

    2013-01-01

    The purpose of this study was to investigate the emergence of displayed alcohol references on Facebook for first-year students from two universities. Graduated high school seniors who were planning to attend one of the two targeted study universities were recruited. Participants’ Facebook profiles were evaluated for displayed alcohol references at baseline and every four weeks throughout the first year of college. Profiles were categorized as Non-Displayers, Alcohol Displayers or Intoxication/Problem Drinking Displayers. Analyses included logistic regression, univariate and multivariate Cox proportional hazard analysis and multi-state Markov modeling. A total of 338 participants were recruited, 56.1% were female, 74.8% were Caucasian, and 58.8% were from University A. At baseline, 68 Facebook profiles (20.1%) included displayed alcohol references. During the first year of college, 135 (39.9%) profiles newly displayed alcohol. In multivariate Cox proportional hazard analysis, university (University B versus A, HR = 0.47, 95% CI: 0.28–0.77, p = 0.003), number of Facebook friends (HR = 1.19, 95% CI: 1.09–1.28, p < 0.001 for every 100 more friends), and average monthly status updates (HR = 1.03, 95% CI: 1.002–1.05, p = 0.033) were identified as independent predictors for new alcohol display. Findings contribute to understanding the patterns and predictors for displayed alcohol references on Facebook. PMID:24415846
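
    The hazard ratio for Facebook friends above is reported per 100 additional friends. A minimal sketch of how such a ratio can be rescaled to a different increment, assuming the usual log-linear covariate effect of a Cox model (the helper name rescale_hazard_ratio and the 300-friend increment are illustrative choices, not taken from the study):

```python
import math

def rescale_hazard_ratio(hr_per_unit, units):
    """Hazard ratio for a change of `units`, assuming a log-linear effect."""
    beta = math.log(hr_per_unit)   # Cox coefficient for one unit
    return math.exp(beta * units)  # equivalent to hr_per_unit ** units

# One unit is 100 friends in the abstract, so 300 more friends is 3 units.
hr_300_friends = rescale_hazard_ratio(1.19, 3)
print(round(hr_300_friends, 3))
```

    The rescaling only holds if the effect is linear on the log-hazard scale over the whole range, which the abstract does not state.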

  16. A new multivariate zero-adjusted Poisson model with applications to biomedicine.

    PubMed

    Liu, Yin; Tian, Guo-Liang; Tang, Man-Lai; Yuen, Kam Chuen

    2018-05-25

    Although advances have recently been made in modeling multivariate count data, existing models still have several limitations: (i) The multivariate Poisson log-normal model (Aitchison and Ho, ) cannot be used to fit multivariate count data with excess zero-vectors; (ii) The multivariate zero-inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero-truncated/deflated count data and it is difficult to apply to high-dimensional cases; (iii) The Type I multivariate zero-adjusted Poisson (ZAP) distribution (Tian et al., 2017) can only model multivariate count data with a special correlation structure for random components that are all positive or negative. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows a more flexible dependency structure between components; that is, some of the correlation coefficients can be positive while others are negative. We then develop its important distributional properties and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Parametric regression model for survival data: Weibull regression model as an example

    PubMed Central

    2016-01-01

    The Weibull regression model is one of the most popular parametric regression models because it provides an estimate of the baseline hazard function as well as coefficients for covariates. Because of technical difficulties, the Weibull regression model is seldom used in the medical literature compared with the semi-parametric proportional hazards model. To familiarize clinical investigators with the Weibull regression model, this article introduces some basic knowledge of the model and then illustrates how to fit it with R software. The SurvRegCensCov package is useful for converting estimated coefficients to clinically relevant statistics such as the hazard ratio (HR) and event time ratio (ETR). Model adequacy can be assessed by inspecting Kaplan-Meier curves stratified by a categorical variable. The eha package provides an alternative method for fitting the Weibull regression model. The check.dist() function helps to assess the goodness-of-fit of the model. Variable selection is based on the importance of a covariate, which can be tested using the anova() function. Alternatively, backward elimination starting from a full model is an efficient way to develop a model. Visualizing the Weibull regression model after model development provides another way to report findings. PMID:28149846
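
    The conversion from estimated coefficients to HR and ETR can be sketched in a few lines. The sketch below is written in Python rather than R and assumes the accelerated failure time parameterisation log T = mu + beta*x + sigma*W (W following a standard extreme value distribution); the function name and the numeric inputs are hypothetical and are not output from any package named in the abstract:

```python
import math

def weibull_aft_to_hr_etr(aft_coef, scale):
    """Convert a Weibull AFT coefficient into a hazard ratio and event time ratio.

    Assumes log T = mu + aft_coef * x + scale * W, so the Weibull shape is 1/scale.
    """
    shape = 1.0 / scale
    hazard_ratio = math.exp(-aft_coef * shape)  # proportional hazards scale
    event_time_ratio = math.exp(aft_coef)       # multiplicative effect on event time
    return hazard_ratio, event_time_ratio

# Hypothetical estimates: coefficient 0.5 on the log-time scale, scale 0.5.
hr, etr = weibull_aft_to_hr_etr(0.5, 0.5)
print(round(hr, 4), round(etr, 4))
```

    A positive coefficient on the log-time scale lengthens event times (ETR above 1) and therefore lowers the hazard (HR below 1), which is why the two statistics move in opposite directions.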

  18. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  19. The use of logistic regression to enhance risk assessment and decision making by mental health administrators.

    PubMed

    Menditto, Anthony A; Linhorst, Donald M; Coleman, James C; Beck, Niels C

    2006-04-01

    Development of policies and procedures to contend with the risks presented by elopement, aggression, and suicidal behaviors are long-standing challenges for mental health administrators. Guidance in making such judgments can be obtained through the use of a multivariate statistical technique known as logistic regression. This procedure can be used to develop a predictive equation that is mathematically formulated to use the best combination of predictors, rather than considering just one factor at a time. This paper presents an overview of logistic regression and its utility in mental health administrative decision making. A case example of its application is presented using data on elopements from Missouri's long-term state psychiatric hospitals. Ultimately, the use of statistical prediction analyses tempered with differential qualitative weighting of classification errors can augment decision-making processes in a manner that provides guidance and flexibility while wrestling with the complex problem of risk assessment and decision making.
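
    A predictive equation of this kind, together with unequal weighting of classification errors, can be sketched as follows. The coefficients, predictor values and error costs are hypothetical and are not taken from the Missouri data; the threshold rule is the standard expected-cost rule and is an assumption about how the weighting might be implemented:

```python
import math

def predicted_probability(intercept, coefs, x):
    """Logistic regression prediction: inverse logit of the linear predictor."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))

def flag_as_high_risk(prob, cost_false_negative, cost_false_positive):
    """Flag a case when the expected cost of a miss exceeds that of a false alarm."""
    threshold = cost_false_positive / (cost_false_positive + cost_false_negative)
    return prob >= threshold

# Hypothetical two-predictor model and one hypothetical patient.
p = predicted_probability(-2.0, [0.8, 1.1], [1.0, 0.5])
print(round(p, 3), flag_as_high_risk(p, cost_false_negative=4, cost_false_positive=1))
```

    Treating a missed event as four times as costly as a false alarm lowers the decision threshold from 0.5 to 0.2, so more cases are flagged.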

  20. Contribution of spoken language and socio-economic background to adolescents' educational achievement at age 16 years.

    PubMed

    Spencer, Sarah; Clegg, Judy; Stackhouse, Joy; Rush, Robert

    2017-03-01

    Well-documented associations exist between socio-economic background and language ability in early childhood, and between educational attainment and language ability in children with clinically referred language impairment. However, very little research has looked at the associations between language ability, educational attainment and socio-economic background during adolescence, particularly in populations without language impairment. To investigate: (1) whether adolescents with higher educational outcomes overall had higher language abilities; and (2) associations between adolescent language ability, socio-economic background and educational outcomes, specifically in relation to Mathematics, English Language and English Literature GCSE grade. A total of 151 participants completed five standardized language assessments measuring vocabulary, comprehension of sentences and spoken paragraphs, and narrative skills and one nonverbal assessment when between 13 and 14 years old. These data were compared with the participants' educational achievement obtained upon leaving secondary education (16 years old). Univariate logistic regressions were employed to identify those language assessments and demographic factors that were associated with achieving a targeted A*-C grade in English Language, English Literature and Mathematics General Certificate of Secondary Education (GCSE) at 16 years. Further logistic regressions were then conducted to examine further the contribution of socio-economic background and spoken language skills in the multivariate models. Vocabulary, comprehension of sentences and spoken paragraphs, and mean length of utterance in a narrative task along with socio-economic background contributed to whether participants achieved an A*-C grade in GCSE Mathematics and English Language and English Literature. Nonverbal ability contributed to English Language and Mathematics. The results of multivariate logistic regressions then found that vocabulary skills

  1. Diurnal salivary cortisol and regression status in MECP2 Duplication syndrome

    PubMed Central

    Peters, Sarika U.; Byiers, Breanne J.; Symons, Frank J.

    2015-01-01

    MECP2 duplication syndrome is an X-linked genomic disorder that is characterized by infantile hypotonia, intellectual disability, and recurrent respiratory infections. Regression affects a subset of individuals, and the etiology of regression has yet to be examined. In this study, alterations in the hypothalamus-pituitary-adrenal axis, including diurnal patterns in salivary cortisol, were examined in four males with MECP2 duplication syndrome who had regression, and four males with the same syndrome without regression (ages 3–22 years). Individuals who had experienced regression do not exhibit typical diurnal cortisol rhythms, and their profiles were flatter through the day. In contrast, individuals with MECP2 duplication syndrome who had not experienced regression showed more typical patterns of higher cortisol levels in the morning with linear decreases throughout the day. This study is the first to suggest a link between atypical diurnal cortisol rhythms and regression status in MECP2 duplication syndrome, and may have implications for treatment. PMID:25999300

  2. Cardiovascular reactivity patterns and pathways to hypertension: a multivariate cluster analysis.

    PubMed

    Brindle, R C; Ginty, A T; Jones, A; Phillips, A C; Roseboom, T J; Carroll, D; Painter, R C; de Rooij, S R

    2016-12-01

    Substantial evidence links exaggerated mental stress induced blood pressure reactivity to future hypertension, but the results for heart rate reactivity are less clear. For this reason multivariate cluster analysis was carried out to examine the relationship between heart rate and blood pressure reactivity patterns and hypertension in a large prospective cohort (age range 55-60 years). Four clusters emerged with statistically different systolic and diastolic blood pressure and heart rate reactivity patterns. Cluster 1 was characterised by a relatively exaggerated blood pressure and heart rate response while the blood pressure and heart rate responses of cluster 2 were relatively modest and in line with the sample mean. Cluster 3 was characterised by blunted cardiovascular stress reactivity across all variables and cluster 4, by an exaggerated blood pressure response and modest heart rate response. Membership to cluster 4 conferred an increased risk of hypertension at 5-year follow-up (hazard ratio=2.98 (95% CI: 1.50-5.90), P<0.01) that survived adjustment for a host of potential confounding variables. These results suggest that the cardiac reactivity plays a potentially important role in the link between blood pressure reactivity and hypertension and support the use of multivariate approaches to stress psychophysiology.
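
    The reported hazard ratio and confidence interval can be checked for internal consistency with the stated P value. A rough sketch, assuming the interval is a symmetric Wald interval on the log scale with critical value 1.96 (the abstract does not say how the interval was computed, and the function name is illustrative):

```python
import math

def wald_check(estimate, lower, upper, z_crit=1.96):
    """Recover the log-scale standard error, z statistic and two-sided p value."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

se, z, p = wald_check(2.98, 1.50, 5.90)
print(round(se, 3), round(z, 2), p < 0.01)
```

    Under these assumptions the recovered p value falls below 0.01, in line with the abstract.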

  3. Nonparametric regression applied to quantitative structure-activity relationships

    PubMed

    Constans; Hirst

    2000-03-01

    Several nonparametric regressors have been applied to modeling quantitative structure-activity relationship (QSAR) data. The simplest regressor, the Nadaraya-Watson, was assessed in a genuine multivariate setting. Other regressors, the local linear and the shifted Nadaraya-Watson, were implemented within additive models--a computationally more expedient approach, better suited for low-density designs. Performances were benchmarked against the nonlinear method of smoothing splines. A linear reference point was provided by multilinear regression (MLR). Variable selection was explored using systematic combinations of different variables and combinations of principal components. For the data set examined, 47 inhibitors of dopamine beta-hydroxylase, the additive nonparametric regressors have greater predictive accuracy (as measured by the mean absolute error of the predictions or the Pearson correlation in cross-validation trials) than MLR. The use of principal components did not improve the performance of the nonparametric regressors over use of the original descriptors, since the original descriptors are not strongly correlated. It remains to be seen if the nonparametric regressors can be successfully coupled with better variable selection and dimensionality reduction in the context of high-dimensional QSARs.
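
    For orientation, the Nadaraya-Watson regressor is a kernel-weighted average of the observed responses. The sketch below assumes a Gaussian kernel with a single shared bandwidth and uses a small made-up data set; the kernel, bandwidth and data used in the study are not given in the abstract:

```python
import math

def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Kernel-weighted average of y_train, evaluated at the point x_query."""
    weights = []
    for x in x_train:
        sq_dist = sum((q - xi) ** 2 for q, xi in zip(x_query, x))
        weights.append(math.exp(-0.5 * sq_dist / bandwidth ** 2))  # Gaussian kernel
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, y_train)) / total

# Made-up two-descriptor data: four training points on the corners of a square.
x_train = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
y_train = [1.0, 3.0, 3.0, 5.0]
estimate = nadaraya_watson(x_train, y_train, (1.0, 1.0), bandwidth=1.0)
print(estimate)
```

    The query point is equidistant from all four training points, so every weight is equal and the estimate reduces to the plain average of the responses.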

  4. Dental plaque, preventive care, and tooth brushing associated with dental caries in primary teeth in schoolchildren ages 6-9 years of Leon, Nicaragua.

    PubMed

    Herrera, Miriam del Socorro; Medina-Solís, Carlo Eduardo; Minaya-Sánchez, Mirna; Pontigo-Loyola, América Patricia; Villalobos-Rodelo, Juan José; Islas-Granillo, Horacio; de la Rosa-Santillana, Rubén; Maupomé, Gerardo

    2013-11-19

    Our study aimed to evaluate the effect of various risk indicators for dental caries on primary teeth of Nicaraguan children (from Leon, Nicaragua) ages 6 to 9, using the negative binomial regression model. A cross-sectional study was carried out to collect clinical, demographic, socioeconomic, and behavioral data from 794 schoolchildren ages 6 to 9 years, randomly selected from 25 schools in the city of León, Nicaragua. Clinical examinations for dental caries (dmft index) were performed by 2 trained and standardized examiners. Socio-demographic, socioeconomic, and behavioral data were self-reported using questionnaires. Multivariate negative binomial regression (NBR) analysis was used. Mean age was 7.49 ± 1.12 years. Boys accounted for 50.1% of the sample. Mean dmft was 3.54 ± 3.13 and caries prevalence (dmft >0) was 77.6%. In the NBR multivariate model (p<0.05), for each year of age, the expected mean dmft decreased by 7.5%. Brushing teeth at least once a day and having received preventive dental care in the last year before data collection were associated with declines in the expected mean dmft by 19.5% and 69.6%, respectively. Presence of dental plaque increased the expected mean dmft by 395.5%. The proportion of students with caries in this sample was high. We found associations between dental caries in the primary dentition and dental plaque, brushing teeth at least once a day, and having received preventive dental care. To improve oral health, school programs and/or age-appropriate interventions need to be developed based on the specific profile of caries experience and the associated risk indicators.
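
    The percentage changes in expected mean dmft correspond to rate ratios from the log-link model. A brief sketch of the conversion, assuming the reported percentages are 100 * (exp(beta) - 1) and that effects combine multiplicatively (the combined scenario at the end is an illustration, not a result from the study):

```python
import math

def percent_change_to_rate_ratio(percent_change):
    """Rate ratio implied by a percentage change in the expected mean count."""
    return 1.0 + percent_change / 100.0

def rate_ratio_to_coefficient(rate_ratio):
    """Regression coefficient on the log scale for a given rate ratio."""
    return math.log(rate_ratio)

plaque_rr = percent_change_to_rate_ratio(395.5)     # increase reported for plaque
brushing_rr = percent_change_to_rate_ratio(-19.5)   # decline reported for daily brushing

# Illustrative combination: a child with plaque who brushes at least once a day.
combined_rr = plaque_rr * brushing_rr
print(round(plaque_rr, 3), round(rate_ratio_to_coefficient(plaque_rr), 3), round(combined_rr, 3))
```

    Multiplying the ratios assumes no interaction between plaque and brushing, which the abstract does not report.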

  5. Risk factors for refractive errors in primary school children (6-12 years old) in Nakhon Pathom Province.

    PubMed

    Yingyong, Penpimol

    2010-11-01

    Refractive error is one of the leading causes of visual impairment in children. An analysis of risk factors for refractive error is required to reduce and prevent this common eye disease. To identify the risk factors associated with refractive errors in primary school children (6-12 years old) in Nakhon Pathom province. A population-based cross-sectional analytic study was conducted between October 2008 and September 2009 in Nakhon Pathom. Refractive error, parental refractive status, and hours per week of near activities (studying, reading books, watching television, playing video games, or working on the computer) were assessed in 377 children who participated in this study. The most common type of refractive error in primary school children was myopia. Myopic children were more likely to have parents with myopia. Children with myopia spent more time on near activities. The multivariate odds ratio (95% confidence interval) for two myopic parents was 6.37 (2.26-17.78) and for each diopter-hour per week of near work was 1.019 (1.005-1.033). Multivariate logistic regression models showed no confounding effects between parental myopia and near work, suggesting that each factor has an independent association with myopia. Statistical analysis by logistic regression revealed that family history of refractive error and hours of near work were significantly associated with refractive error in primary school children.

  6. Determinants of orphan drugs prices in France: a regression analysis.

    PubMed

    Korchagina, Daria; Millier, Aurelie; Vataire, Anne-Lise; Aballea, Samuel; Falissard, Bruno; Toumi, Mondher

    2017-04-21

    The introduction of the orphan drug legislation led to the increase in the number of available orphan drugs, but the access to them is often limited due to the high price. Social preferences regarding funding orphan drugs as well as the criteria taken into consideration while setting the price remain unclear. The study aimed at identifying the determinants of orphan drug prices in France using a regression analysis. All drugs with a valid orphan designation at the moment of launch for which the price was available in France were included in the analysis. The selection of covariates was based on a literature review and included drug characteristics (Anatomical Therapeutic Chemical (ATC) class, treatment line, age of target population), diseases characteristics (severity, prevalence, availability of alternative therapeutic options), health technology assessment (HTA) details (actual benefit (AB) and improvement in actual benefit (IAB) scores, delay between the HTA and commercialisation), and study characteristics (type of study, comparator, type of endpoint). The main data sources were European public assessment reports, HTA reports, summaries of opinion on orphan designation of the European Medicines Agency, and the French insurance database of drugs and tariffs. A generalized regression model was developed to test the association between the annual treatment cost and selected covariates. A total of 68 drugs were included. The mean annual treatment cost was €96,518. In the univariate analysis, the ATC class (p = 0.01), availability of alternative treatment options (p = 0.02) and the prevalence (p = 0.02) showed a significant correlation with the annual cost. The multivariate analysis demonstrated significant association between the annual cost and availability of alternative treatment options, ATC class, IAB score, type of comparator in the pivotal clinical trial, as well as commercialisation date and delay between the HTA and commercialisation. The

  7. Resting-state functional magnetic resonance imaging: the impact of regression analysis.

    PubMed

    Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi

    2015-01-01

    To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
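
    Global signal regression, the step singled out above, removes the across-voxel mean time course from every voxel by ordinary least squares. The toy sketch below uses three made-up voxel time series and is not the preprocessing pipeline of the study; it only illustrates why the step can introduce anticorrelations, since the residuals are forced to sum to zero across voxels at every time point:

```python
def regress_out(signal, regressor):
    """Residuals of an ordinary least-squares fit of signal on regressor plus intercept."""
    n = len(signal)
    mean_s = sum(signal) / n
    mean_r = sum(regressor) / n
    var_r = sum((r - mean_r) ** 2 for r in regressor)
    cov = sum((r - mean_r) * (s - mean_s) for r, s in zip(regressor, signal))
    slope = cov / var_r
    intercept = mean_s - slope * mean_r
    return [s - (intercept + slope * r) for s, r in zip(signal, regressor)]

def global_signal_regression(voxels):
    """Regress the across-voxel mean time course out of every voxel time series."""
    n_vox = len(voxels)
    global_mean = [sum(column) / n_vox for column in zip(*voxels)]
    return [regress_out(v, global_mean) for v in voxels]

# Made-up data: three voxels, four time points.
voxels = [[1.0, 2.0, 4.0, 3.0], [2.0, 1.0, 3.0, 5.0], [0.0, 1.0, 1.0, 2.0]]
cleaned = global_signal_regression(voxels)
column_sums = [sum(column) for column in zip(*cleaned)]
print([round(abs(s), 9) for s in column_sums])
```

    Because the cleaned signals must cancel at each time point, some voxel pairs necessarily become negatively correlated, which is one reason the abstract urges careful interpretation.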

  8. Multivariate functions for predicting the sorption of 2,4,6-trinitrotoluene (TNT) and 1,3,5-trinitro-1,3,5-tricyclohexane (RDX) among taxonomically distinct soils.

    PubMed

    Katseanes, Chelsea K; Chappell, Mark A; Hopkins, Bryan G; Durham, Brian D; Price, Cynthia L; Porter, Beth E; Miller, Lesley F

    2016-11-01

    After nearly a century of use in numerous munition platforms, TNT and RDX contamination in the environment has arisen largely from ammunition manufacturing or from releases during low-order detonations in training activities. Although the basic principles governing the environmental fate of TNT and RDX are known, accurate predictions of TNT and RDX persistence in soil remain elusive, particularly given the universal heterogeneity of pedomorphic soil types. In this work, we proposed a new solution for modeling the sorption and persistence of these munition constituents as multivariate mathematical functions correlating soil attribute data over a variety of taxonomically distinct soil types to contaminant behavior, instead of a single constant or parameter of a specific absolute value. To test this idea, we conducted experiments measuring the sorption of TNT and RDX on taxonomically different soil types that were extensively characterized physically and chemically. Statistical decomposition of the log-transformed and auto-scaled soil characterization data using the dimension-reduction technique PCA (principal component analysis) revealed a strong latent structure based on the multiple pairwise correlations among the soil properties. TNT and RDX sorption partitioning coefficients (KD-TNT and KD-RDX) were regressed against this latent structure using partial least squares regression (PLSR), generating 3-factor multivariate linear functions. Here, PLSR models predicted KD-TNT and KD-RDX values based on attributes contributing to endogenous alkaline/calcareous and soil fertility criteria, respectively, exhibited among the different soil types. We hypothesized that the latent structure arising from the strong covariance of the full multivariate geochemical matrix describing taxonomically distinguished soil types may provide the means for potentially predicting complex phenomena in soils. The development of predictive multivariate models tuned to a local soil

  9. Clinical correlates of hypoglycaemia over 4 years in people with type 2 diabetes starting insulin: An analysis from the CREDIT study

    PubMed Central

    Calvi‐Gries, Francoise; Blonde, Lawrence; Pilorget, Valerie; Berlingieri, Joseph; Freemantle, Nick

    2018-01-01

    Aim To identify factors associated with documented symptomatic and severe hypoglycaemia over 4 years in people with type 2 diabetes starting insulin therapy. Materials and methods CREDIT, a prospective international observational study, collected data over 4 years on people starting any insulin in 314 centres; 2729 and 2271 people had hypoglycaemia data during the last 6 months of years 1 and 4, respectively. Multivariable logistic regression was used to select the characteristics associated with documented symptomatic hypoglycaemia, and the model was tested against severe hypoglycaemia. Results The proportions of participants reporting ≥1 non‐severe event were 18.5% and 16.6% in years 1 and 4; the corresponding proportions of those achieving a glycated haemoglobin (HbA1c) concentration <7.0% (<53 mmol/mol) were 24.6% and 18.3%, and 16.5% and 16.2% of those who did not. For severe hypoglycaemia, the proportions were 3.0% and 4.6% of people reaching target vs 1.5% and 1.1% of those not reaching target. Multivariable analysis showed that, for documented symptomatic hypoglycaemia at both years 1 and 4, baseline lower body mass index and more physical activity were predictors, and lower HbA1c was an explanatory variable in the respective year. Models for documented symptomatic hypoglycaemia predicted severe hypoglycaemia. Insulin regimen was a univariate explanatory variable, and was not retained in the multivariable analysis. Conclusions Hypoglycaemia occurred at significant rates, but was stable over 4 years despite increased insulin doses. The association with insulin regimen and with oral agent use declined over that time. Associated predictors and explanatory variables for documented symptomatic hypoglycaemia conformed to clinical impressions and could be extended to severe hypoglycaemia. Better achieved HbA1c was associated with a higher risk of hypoglycaemia. PMID:29205734

  10. Clinical correlates of hypoglycaemia over 4 years in people with type 2 diabetes starting insulin: An analysis from the CREDIT study.

    PubMed

    Home, Philip; Calvi-Gries, Francoise; Blonde, Lawrence; Pilorget, Valerie; Berlingieri, Joseph; Freemantle, Nick

    2018-04-01

    To identify factors associated with documented symptomatic and severe hypoglycaemia over 4 years in people with type 2 diabetes starting insulin therapy. CREDIT, a prospective international observational study, collected data over 4 years on people starting any insulin in 314 centres; 2729 and 2271 people had hypoglycaemia data during the last 6 months of years 1 and 4, respectively. Multivariable logistic regression was used to select the characteristics associated with documented symptomatic hypoglycaemia, and the model was tested against severe hypoglycaemia. The proportions of participants reporting ≥1 non-severe event were 18.5% and 16.6% in years 1 and 4; the corresponding proportions of those achieving a glycated haemoglobin (HbA1c) concentration <7.0% (<53 mmol/mol) were 24.6% and 18.3%, and 16.5% and 16.2% of those who did not. For severe hypoglycaemia, the proportions were 3.0% and 4.6% of people reaching target vs 1.5% and 1.1% of those not reaching target. Multivariable analysis showed that, for documented symptomatic hypoglycaemia at both years 1 and 4, baseline lower body mass index and more physical activity were predictors, and lower HbA1c was an explanatory variable in the respective year. Models for documented symptomatic hypoglycaemia predicted severe hypoglycaemia. Insulin regimen was a univariate explanatory variable, and was not retained in the multivariable analysis. Hypoglycaemia occurred at significant rates, but was stable over 4 years despite increased insulin doses. The association with insulin regimen and with oral agent use declined over that time. Associated predictors and explanatory variables for documented symptomatic hypoglycaemia conformed to clinical impressions and could be extended to severe hypoglycaemia. Better achieved HbA1c was associated with a higher risk of hypoglycaemia. © 2017 The Authors. Diabetes, Obesity and Metabolism published by John Wiley & Sons Ltd.

  11. Multivariate Models for Normal and Binary Responses in Intervention Studies

    ERIC Educational Resources Information Center

    Pituch, Keenan A.; Whittaker, Tiffany A.; Chang, Wanchen

    2016-01-01

    Use of multivariate analysis (e.g., multivariate analysis of variance) is common when normally distributed outcomes are collected in intervention research. However, when mixed responses--a set of normal and binary outcomes--are collected, standard multivariate analyses are no longer suitable. While mixed responses are often obtained in…

  12. Methods for presentation and display of multivariate data

    NASA Technical Reports Server (NTRS)

    Myers, R. H.

    1981-01-01

    Methods for the presentation and display of multivariate data are discussed with emphasis placed on the multivariate analysis of variance problems and the Hotelling T(2) solution in the two-sample case. The methods utilize the concepts of stepwise discrimination analysis and the computation of partial correlation coefficients.
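
    For reference, the two-sample Hotelling statistic mentioned above is usually written as follows. The notation is the standard textbook form and is an assumption here, since the report's own symbols are not given in the abstract: sample sizes n_1 and n_2, mean vectors \bar{x}_1 and \bar{x}_2, sample covariance matrices S_1 and S_2, and p response variables.

```latex
% Pooled covariance matrix of the two samples
S_p = \frac{(n_1 - 1)\, S_1 + (n_2 - 1)\, S_2}{n_1 + n_2 - 2}

% Two-sample Hotelling statistic
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,
      (\bar{x}_1 - \bar{x}_2)^{\mathsf{T}} S_p^{-1} (\bar{x}_1 - \bar{x}_2)

% Under equal covariance matrices and multivariate normality
\frac{n_1 + n_2 - p - 1}{p\,(n_1 + n_2 - 2)}\; T^2 \sim F_{p,\; n_1 + n_2 - p - 1}
```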

  13. The Multivariate Regression Statistics Strategy to Investigate Content-Effect Correlation of Multiple Components in Traditional Chinese Medicine Based on a Partial Least Squares Method.

    PubMed

    Peng, Ying; Li, Su-Ning; Pei, Xuexue; Hao, Kun

    2018-03-01

    A multivariate regression statistics strategy was developed to clarify the multi-component content-effect correlation of panax ginseng saponins extract and to predict the pharmacological effect from component content. In example 1, firstly, we compared pharmacological effects between panax ginseng saponins extract and individual saponin combinations. Secondly, we examined the anti-platelet aggregation effect in seven different saponin combinations of ginsenoside Rb1, Rg1, Rh, Rd, Ra3 and notoginsenoside R1. Finally, the correlation between anti-platelet aggregation and the content of multiple components was analyzed by a partial least squares algorithm. In example 2, firstly, 18 common peaks were identified in ten different batches of panax ginseng saponins extracts from different origins. Then, we investigated the anti-myocardial ischemia reperfusion injury effects of the ten different panax ginseng saponins extracts. Finally, the correlation between the fingerprints and the cardioprotective effects was analyzed by a partial least squares algorithm. In both examples 1 and 2, the relationship between component content and pharmacological effect was modeled well by the partial least squares regression equations. Importantly, the predicted effect curve was close to the observed data points marked on the partial least squares regression model. This study provides evidence that multi-component content is promising information for predicting the pharmacological effects of traditional Chinese medicine.
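
    The partial least squares step described in this record, relating several component contents to one measured effect, can be sketched roughly as below. This is a generic single-response PLS (PLS1) fit with NIPALS-style deflation, not the authors' implementation. The function names, the number of latent components, and the synthetic "content" and "effect" values are all assumptions made for illustration.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (PLS1) by successive deflation.

    Returns (coef, x_mean, y_mean) such that
    y_hat = (X - x_mean) @ coef + y_mean.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean            # centered working residuals
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))   # weight vectors
    P = np.zeros((n_features, n_components))   # X loadings
    q = np.zeros(n_components)                 # y loadings

    for a in range(n_components):
        w = Xr.T @ yr                          # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xr @ w                             # latent score vector
        tt = t @ t
        P[:, a] = Xr.T @ t / tt
        q[a] = yr @ t / tt
        W[:, a] = w
        Xr = Xr - np.outer(t, P[:, a])         # deflate X
        yr = yr - q[a] * t                     # deflate y

    # Regression coefficients expressed in the original (centered) predictors.
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean


def pls1_predict(X, coef, x_mean, y_mean):
    """Predict the response for new rows of X."""
    return (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean


# Hypothetical data: 10 batches, 6 component contents, one measured effect.
rng = np.random.default_rng(0)
content = rng.uniform(0.0, 1.0, size=(10, 6))
effect = content @ np.array([0.8, 0.1, 0.0, 0.5, 0.0, 0.3]) + 0.05 * rng.normal(size=10)

coef, x_mean, y_mean = pls1_fit(content, effect, n_components=2)
predicted = pls1_predict(content, coef, x_mean, y_mean)
print("coefficients:", np.round(coef, 3))
```

    Using fewer latent components than predictors is what distinguishes the fit from ordinary least squares. When the number of components equals the rank of the centered predictor matrix, the two coincide.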

  14. Robust, Adaptive Functional Regression in Functional Mixed Model Framework.

    PubMed

    Zhu, Hongxiao; Brown, Philip J; Morris, Jeffrey S

    2011-09-01

    Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. It is efficient

  15. Robust, Adaptive Functional Regression in Functional Mixed Model Framework

    PubMed Central

    Zhu, Hongxiao; Brown, Philip J.; Morris, Jeffrey S.

    2012-01-01

    Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. It is efficient

  16. Multivariate analysis of longitudinal rates of change.

    PubMed

    Bryan, Matthew; Heagerty, Patrick J

    2016-12-10

    Longitudinal data allow direct comparison of the change in patient outcomes associated with treatment or exposure. Frequently, several longitudinal measures are collected that either reflect a common underlying health status, or characterize processes that are influenced in a similar way by covariates such as exposure or demographic characteristics. Statistical methods that can combine multivariate response variables into common measures of covariate effects have been proposed in the literature. Current methods for characterizing the relationship between covariates and the rate of change in multivariate outcomes are limited to select models. For example, 'accelerated time' methods have been developed which assume that covariates rescale time in longitudinal models for disease progression. In this manuscript, we detail an alternative multivariate model formulation that directly structures longitudinal rates of change and that permits a common covariate effect across multiple outcomes. We detail maximum likelihood estimation for a multivariate longitudinal mixed model. We show via asymptotic calculations the potential gain in power that may be achieved with a common analysis of multiple outcomes. We apply the proposed methods to the analysis of a trivariate outcome for infant growth and compare rates of change for HIV infected and uninfected infants. Copyright © 2016 John Wiley & Sons, Ltd.

  17. Predictors of somatic symptoms: a five year follow up of adolescents

    PubMed Central

    Poikolainen, K; Aalto-Setala, T; Marttunen, M; Tuulio-Henriksson, A; Lonnqvist, J

    2000-01-01

    BACKGROUND: Somatisation is common among adolescents.
    AIMS: To study factors predicting somatisation later in adulthood.
    METHODS: Self report questionnaires were administered at baseline examination in 1990 to students (mean age 16.8 years) in schools, and by mail five years later. Results are based on the 615 subjects with no serious disease or injury at baseline.
    RESULTS: Regression analyses showed that in men the level of somatic symptoms in 1995 was significantly predicted by the respective level in 1990 and by relief smoking. In women, the level of somatic symptoms in 1995 was significantly predicted by the respective level in 1990, self esteem, and the number of negative life events in 1990. After exclusion of cases with a long standing disease in 1995, the multivariate results remained materially similar except that self esteem was no longer significant among women.
    CONCLUSION: These findings may help in early identification of adolescents with somatisation persisting into early adulthood.

 PMID:11040143

  18. Biological and socio-cultural factors during the school years predicting women’s lifetime educational attainment

    PubMed Central

    Hendrick, C. Emily; Cohen, Alison K.; Deardorff, Julianna

    2015-01-01

    BACKGROUND Lifetime educational attainment is an important predictor of health and well-being for women in the United States. In the current study, we examine the roles of socio-cultural factors in youth and an understudied biological life event, pubertal timing, in predicting women’s lifetime educational attainment. METHODS Using data from the National Longitudinal Survey of Youth 1997 cohort (N = 3889), we conducted sequential multivariate linear regression analyses to investigate the influences of macro-level and family-level socio-cultural contextual factors in youth (region of country, urbanicity, race/ethnicity, year of birth, household composition, mother’s education, mother’s age at first birth) and early menarche, a marker of early pubertal development, on women’s educational attainment after age 24. RESULTS Pubertal timing and all socio-cultural factors in youth, other than year of birth, predicted women’s lifetime educational attainment in bivariate models. Family factors had the strongest associations. When family factors were added to multivariate models, geographic region in youth and pubertal timing were no longer significant. CONCLUSION Our findings provide additional evidence that family factors should be considered when developing comprehensive and inclusive interventions in childhood and adolescence to promote lifetime educational attainment among girls. PMID:26830508

  19. Fourier Transform Infrared Spectroscopy and Multivariate Analysis for Online Monitoring of Dibutyl Phosphate Degradation Product in Tributyl Phosphate/n-Dodecane/Nitric Acid Solvent

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tatiana G. Levitskaia; James M. Peterson; Emily L. Campbell

    2013-12-01

    In liquid–liquid extraction separation processes, accumulation of organic solvent degradation products is detrimental to the process robustness, and frequent solvent analysis is warranted. Our research explores the feasibility of online monitoring of the organic solvents relevant to used nuclear fuel reprocessing. This paper describes the first phase of developing a system for monitoring the tributyl phosphate (TBP)/n-dodecane solvent commonly used to separate used nuclear fuel. In this investigation, the effect of extraction of nitric acid from aqueous solutions of variable concentrations on the quantification of TBP and its major degradation product dibutylphosphoric acid (HDBP) was assessed. Fourier transform infrared (FTIR) spectroscopy was used to discriminate between HDBP and TBP in the nitric acid-containing TBP/n-dodecane solvent. Multivariate analysis of the spectral data facilitated the development of regression models for HDBP and TBP quantification in real time, enabling online implementation of the monitoring system. The predictive regression models were validated using TBP/n-dodecane solvent samples subjected to high-dose external γ-irradiation. The predictive models were translated to flow conditions using a hollow fiber FTIR probe installed in a centrifugal contactor extraction apparatus, demonstrating the applicability of the FTIR technique coupled with multivariate analysis for the online monitoring of the organic solvent degradation products.

  20. Information extraction from multivariate images

    NASA Technical Reports Server (NTRS)

    Park, S. K.; Kegley, K. A.; Schiess, J. R.

    1986-01-01

    An overview of several multivariate image processing techniques is presented, with emphasis on techniques based upon the principal component transformation (PCT). Multiimages in various formats have a multivariate pixel value, associated with each pixel location, which has been scaled and quantized into a gray level vector, and the bivariate of the extent to which two images are correlated. The PCT of a multiimage decorrelates the multiimage to reduce its dimensionality and reveal its intercomponent dependencies if some off-diagonal elements are not small, and for the purposes of display the principal component images must be postprocessed into multiimage format. The principal component analysis of a multiimage is a statistical analysis based upon the PCT whose primary application is to determine the intrinsic component dimensionality of the multiimage. Computational considerations are also discussed.
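
    The principal component transformation (PCT) of a multiimage described in this record can be sketched as below. This is a generic eigen-decomposition of the between-band covariance matrix, not code from the report. The function name, the (height, width, bands) array layout, and the synthetic three-band image are assumptions made for the example.

```python
import numpy as np


def principal_component_transform(multiimage):
    """Apply the PCT to a multiimage of shape (height, width, bands).

    Each pixel is treated as one multivariate observation. Returns
    (pc_images, eigvals, eigvecs) with components ordered by decreasing
    variance; pc_images has the same shape as the input.
    """
    h, w, b = multiimage.shape
    pixels = multiimage.reshape(-1, b).astype(float)

    # Center each band, then estimate the band-by-band covariance matrix.
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)

    # The covariance matrix is symmetric, so eigh applies; sort descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Project every pixel vector onto the eigenvectors (decorrelation).
    pc_images = (centered @ eigvecs).reshape(h, w, b)
    return pc_images, eigvals, eigvecs


# Hypothetical 3-band image in which band 1 is nearly a multiple of band 0.
rng = np.random.default_rng(0)
base = rng.normal(size=(16, 16))
image = np.stack(
    [base, 2.0 * base + 0.1 * rng.normal(size=(16, 16)), rng.normal(size=(16, 16))],
    axis=-1,
)
pc_images, eigvals, eigvecs = principal_component_transform(image)
print("fraction of variance per component:", np.round(eigvals / eigvals.sum(), 3))
```

    The share of variance carried by the leading components gives a rough indication of the intrinsic dimensionality of the multiimage. For display, the component images would still need to be rescaled and quantized back into gray levels, as the abstract notes.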