A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
NASA Astrophysics Data System (ADS)
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model that combines the multiple linear regression model with the fuzzy c-means method. This research examined the relationship between paddy yield at standard fertilizer rates and 20 topsoil variates analyzed prior to planting. The data came from the multi-location rice trials carried out by MARDI at major paddy granaries in Peninsular Malaysia from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using the multiple linear regression model alone and in combination with the fuzzy c-means method. Analysis of normality and multicollinearity indicates that the data are normally distributed, with no multicollinearity among the independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model alone, with a lower mean square error.
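The clustering step described in this abstract can be illustrated with a minimal one-dimensional fuzzy c-means sketch. It is not the authors' implementation: the yield values are made up, and the fuzzifier `m = 2`, the deterministic initialization, and the function names are all assumptions chosen for illustration.

```python
def memberships_for(values, centers, m):
    """Fuzzy membership of every value in every cluster (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        zero_hits = [d == 0.0 for d in dists]
        if any(zero_hits):
            # A value sitting exactly on a center belongs fully to it.
            share = 1.0 / sum(zero_hits)
            row = [share if hit else 0.0 for hit in zero_hits]
        else:
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                   for d_i in dists]
        table.append(row)
    return table


def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, max_iter=200, tol=1e-9):
    """Return (centers, membership table) for one-dimensional data."""
    lo, hi = min(values), max(values)
    # Assumed deterministic start: centers evenly spaced between min and max.
    centers = [lo + (hi - lo) * k / (n_clusters - 1) for k in range(n_clusters)]
    for _ in range(max_iter):
        table = memberships_for(values, centers, m)
        new_centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in table]
            weighted_sum = sum(w * x for w, x in zip(weights, values))
            new_centers.append(weighted_sum / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships_for(values, centers, m)


# Made-up yields (illustrative only, not data from the study).
yields = [3.1, 3.3, 2.9, 6.8, 7.1, 7.0]
centers, table = fuzzy_c_means_1d(yields)
# Hard label = cluster with the largest membership.
labels = [row.index(max(row)) for row in table]
```

In a hybrid scheme of the kind the abstract describes, the rows assigned to each label would then be fitted with a separate regression model; that step is omitted here.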
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
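As an illustration of the method of least squares mentioned in this abstract, the following minimal sketch computes the slope and intercept of a simple linear regression in closed form. The function name and the data are hypothetical and are not taken from the article.

```python
def fit_simple_linear(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and sum of cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data lying exactly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_simple_linear(x, y)
```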
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
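A multiple linear regression fit of the kind this abstract describes can be sketched by solving the normal equations (X'X)b = X'y. The code below is a small standard-library illustration under that assumption; the helper names and data are hypothetical, and numerical libraries typically prefer QR or SVD-based solvers because the normal equations can be ill-conditioned.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfectly collinear predictors?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_multiple_linear(rows, y):
    """Return [intercept, b1, ..., bp] for predictor rows and outcome y."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up data generated exactly from y = 1 + 2*x1 - 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1.0, 3.0, -2.0, 0.0, 2.0]
coefficients = fit_multiple_linear(rows, y)
```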
ERIC Educational Resources Information Center
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
Ling, Ru; Liu, Jiawang
2011-12-01
To construct a prediction model for the health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan with stratified random sampling according to uniform questionnaires, and multiple linear regression analysis was performed with 20 indicators selected by literature review. Independent variables in the multiple linear regression model on medical personnel in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model on county hospital beds included the population aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in the county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory power and fit, and may be used for short- and mid-term forecasting.
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...
Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
ERIC Educational Resources Information Center
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
Conjoint Analysis: A Study of the Effects of Using Person Variables.
ERIC Educational Resources Information Center
Fraas, John W.; Newman, Isadore
Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…
2013-01-01
application of the Hammett equation with the constants rph in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795–811. [13...of oximes and OP compounds and the ability of oximes to reactivate OP- inhibited AChE. Multiple linear regression equations were analyzed using...phosphonate pairs, 21 oxime/ phosphoramidate pairs and 12 oxime/phosphate pairs. The best linear regression equation resulting from multiple regression anal
ERIC Educational Resources Information Center
Richter, Tobias
2006-01-01
Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…
Some Applied Research Concerns Using Multiple Linear Regression Analysis.
ERIC Educational Resources Information Center
Newman, Isadore; Fraas, John W.
The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…
Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma
2016-01-01
Aim: This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666
Wavelet regression model in forecasting crude oil price
NASA Astrophysics Data System (ADS)
Hamid, Mohd Helmie; Shabri, Ani
2017-05-01
This study presents the performance of the wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. The WMLR model was developed by integrating the discrete wavelet transform (DWT) and the multiple linear regression (MLR) model. The original time series was decomposed into sub-time series at different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series was used in this study to test the prediction capability of the proposed model. The forecasting performance of the WMLR model was also compared with regular multiple linear regression (MLR), Autoregressive Integrated Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) models using root mean square error (RMSE) and mean absolute error (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting techniques tested in this study.
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga
2006-08-01
A quantitative structure-activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model ($R^2_{CV} = 0.8160$; $S_{PRESS} = 0.5680$) proved to be very accurate both in training and predictive stages.
Simple and multiple linear regression: sample size considerations.
Hanley, James A
2016-11-01
The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
A Constrained Linear Estimator for Multiple Regression
ERIC Educational Resources Information Center
Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.
2010-01-01
"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi
2013-09-01
Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). 
Furthermore, examples are given as how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
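The two-concentration calibration described in this abstract amounts to drawing a line through the lowest and highest calibrators and back-calculating unknowns from it. The sketch below illustrates that arithmetic only; the function names and all numbers are hypothetical, and no weighting scheme or acceptance criterion from the article is implied.

```python
def two_point_calibration(c_low, r_low, c_high, r_high):
    """Return (slope, intercept) of the line through two calibrators."""
    if c_high == c_low:
        raise ValueError("the two calibrator concentrations must differ")
    slope = (r_high - r_low) / (c_high - c_low)
    intercept = r_low - slope * c_low
    return slope, intercept


def back_calculate(response, slope, intercept):
    """Convert an instrument response into a concentration."""
    return (response - intercept) / slope


def percent_bias(measured, nominal):
    """Relative deviation of a measured concentration from its nominal value."""
    return 100.0 * (measured - nominal) / nominal


# Made-up calibrators: (concentration, response) at the lowest and highest levels.
slope, intercept = two_point_calibration(1.0, 0.05, 1000.0, 50.0)

# A mid-range QC sample with nominal concentration 500 and made-up response 25.0;
# its bias gives a rough check of linearity between the two calibrators.
qc_concentration = back_calculate(25.0, slope, intercept)
qc_bias = percent_bias(qc_concentration, 500.0)
```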
Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression
ERIC Educational Resources Information Center
Beckstead, Jason W.
2012-01-01
The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…
ERIC Educational Resources Information Center
Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael
2011-01-01
This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT® scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the R command lm to obtain coefficient estimates, the standard error of the error, $R^2$, residuals, … In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.
2009-01-01
In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman
2011-01-01
This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
ERIC Educational Resources Information Center
Baylor, Carolyn; Yorkston, Kathryn; Bamer, Alyssa; Britton, Deanna; Amtmann, Dagmar
2010-01-01
Purpose: To explore variables associated with self-reported communicative participation in a sample (n = 498) of community-dwelling adults with multiple sclerosis (MS). Method: A battery of questionnaires was administered online or on paper per participant preference. Data were analyzed using multiple linear backward stepwise regression. The…
Lorenzo-Seva, Urbano; Ferrando, Pere J
2011-03-01
We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.
Aqil, Muhammad; Kita, Ichiro; Yano, Akira; Nishiyama, Soichi
2007-10-01
Traditionally, the multiple linear regression technique has been one of the most widely used models in simulating hydrological time series. However, when the nonlinear phenomenon is significant, the multiple linear will fail to develop an appropriate predictive model. Recently, neuro-fuzzy systems have gained much popularity for calibrating the nonlinear relationships. This study evaluated the potential of a neuro-fuzzy system as an alternative to the traditional statistical regression technique for the purpose of predicting flow from a local source in a river basin. The effectiveness of the proposed identification technique was demonstrated through a simulation study of the river flow time series of the Citarum River in Indonesia. Furthermore, in order to provide the uncertainty associated with the estimation of river flow, a Monte Carlo simulation was performed. As a comparison, a multiple linear regression analysis that was being used by the Citarum River Authority was also examined using various statistical indices. The simulation results using 95% confidence intervals indicated that the neuro-fuzzy model consistently underestimated the magnitude of high flow while the low and medium flow magnitudes were estimated closer to the observed data. The comparison of the prediction accuracy of the neuro-fuzzy and linear regression methods indicated that the neuro-fuzzy approach was more accurate in predicting river flow dynamics. The neuro-fuzzy model was able to improve the root mean square error (RMSE) and mean absolute percentage error (MAPE) values of the multiple linear regression forecasts by about 13.52% and 10.73%, respectively. Considering its simplicity and efficiency, the neuro-fuzzy model is recommended as an alternative tool for modeling of flow dynamics in the study area.
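The RMSE and MAPE comparison reported in this abstract can be reproduced in outline with the short sketch below. It assumes that "improvement" means the relative reduction of the error measure against the baseline model, which the abstract does not state explicitly; the function names and numbers are illustrative only.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error (observed values must be nonzero)."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted))


def percent_improvement(baseline_error, candidate_error):
    """Assumed definition: relative reduction of the error versus the baseline."""
    return 100.0 * (baseline_error - candidate_error) / baseline_error


# Made-up flows and forecasts, not data from the study.
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
example_mape = mape([100.0, 200.0], [110.0, 180.0])
example_gain = percent_improvement(2.0, 1.5)
```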
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
Genetic Programming Transforms in Linear Regression Situations
NASA Astrophysics Data System (ADS)
Castillo, Flor; Kordon, Arthur; Villa, Carlos
The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transforms selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.
Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis
ERIC Educational Resources Information Center
Camilleri, Liberato; Cefai, Carmel
2013-01-01
Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…
Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki
2017-05-01
This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
1981-09-01
corresponds to the same square footage that consumed the electrical energy. 3. The basic assumptions of multiple linear regression, as enumerated in...7. Data related to the sample of bases is assumed to be representative of bases in the population. Limitations Basic limitations on this research were... Ratemaking --Overview. Rand Report R-5894, Santa Monica CA, May 1977. Chatterjee, Samprit, and Bertram Price. Regression Analysis by Example. New York: John
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional and unconditional logistic regression are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to the main-effect model; however, the generalized linear model differs from the general linear model, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction, and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta, and profile likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions.
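The three additive-interaction measures named in this abstract have standard point-estimate formulas, sketched below in Python rather than in the SAS macros the authors describe. The sketch assumes the "contrast ratio" is the interaction contrast ratio (also called RERI), takes odds or hazard ratios relative to the doubly unexposed group as input, and omits the confidence intervals; the function name and inputs are hypothetical.

```python
import math


def additive_interaction(ratio_10, ratio_01, ratio_11):
    """Return (RERI, AP, S) from three ratios relative to the unexposed group.

    ratio_10: exposed to factor A only
    ratio_01: exposed to factor B only
    ratio_11: exposed to both factors
    """
    # Interaction contrast ratio / relative excess risk due to interaction.
    reri = ratio_11 - ratio_10 - ratio_01 + 1.0
    # Attributable proportion due to interaction.
    ap = reri / ratio_11
    # Synergy index; undefined when the separate excess risks cancel out.
    denominator = (ratio_10 - 1.0) + (ratio_01 - 1.0)
    s = (ratio_11 - 1.0) / denominator if denominator != 0 else math.inf
    return reri, ap, s


# Made-up odds ratios for illustration.
reri, ap, s = additive_interaction(2.0, 3.0, 7.0)
```

With no additive interaction, RERI and AP equal 0 and S equals 1; the made-up inputs above give values above those null levels.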
Multiple regression for physiological data analysis: the problem of multicollinearity.
Slinker, B K; Glantz, S A
1985-07-01
Multiple linear regression, in which several predictor variables are related to a response variable, is a powerful statistical tool for gaining quantitative insight into complex in vivo physiological systems. For these insights to be correct, all predictor variables must be uncorrelated. However, in many physiological experiments the predictor variables cannot be precisely controlled and thus change in parallel (i.e., they are highly correlated). There is a redundancy of information about the response, a situation called multicollinearity, that leads to numerical problems in estimating the parameters in regression equations; the parameters are often of incorrect magnitude or sign or have large standard errors. Although multicollinearity can be avoided with good experimental design, not all interesting physiological questions can be studied without encountering multicollinearity. In these cases various ad hoc procedures have been proposed to mitigate multicollinearity. Although many of these procedures are controversial, they can be helpful in applying multiple linear regression to some physiological problems.
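One common way to quantify the redundancy described here is the variance inflation factor (VIF). The sketch below handles the two-predictor case, where VIF = 1 / (1 - r²) and r is the correlation between the predictors. The data and function names are made up for illustration; the abstract does not prescribe this particular diagnostic.

```python
def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors that change in parallel (highly correlated).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]
print(vif_two_predictors(x1, x2))
```

A VIF near 1 indicates little shared information between predictors; a frequently quoted rule of thumb treats values above about 10 as a sign of problematic multicollinearity.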
The Geometry of Enhancement in Multiple Regression
ERIC Educational Resources Information Center
Waller, Niels G.
2011-01-01
In linear multiple regression, "enhancement" is said to occur when $R^2 = b'r > r'r$, where $b$ is a $p \times 1$ vector of standardized regression coefficients and $r$ is a $p \times 1$ vector of correlations between a criterion $y$ and a set of standardized regressors, $x$. When $p = 1$ then $b \cong r$ and…
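The enhancement condition can be checked numerically for two standardized predictors, where the standardized coefficients have a closed form in terms of the three correlations. The sketch below is only an illustration with hypothetical correlation values; the function name is an assumption and nothing here comes from the article itself.

```python
def two_predictor_r2(r1, r2, r12):
    """Return (R^2, r'r) for two standardized predictors.

    r1, r2: correlations of each predictor with the criterion y.
    r12: correlation between the two predictors.
    """
    det = 1.0 - r12 * r12
    b1 = (r1 - r12 * r2) / det      # standardized coefficient of predictor 1
    b2 = (r2 - r12 * r1) / det      # standardized coefficient of predictor 2
    r_squared = b1 * r1 + b2 * r2   # R^2 = b'r
    sum_sq = r1 * r1 + r2 * r2      # r'r
    return r_squared, sum_sq


# Hypothetical correlations: predictor 2 is unrelated to y but related to predictor 1.
r_squared, sum_sq = two_predictor_r2(r1=0.5, r2=0.0, r12=0.5)
enhanced = r_squared > sum_sq
print(r_squared, sum_sq, enhanced)
```

With uncorrelated predictors (r12 = 0) the two quantities coincide, so enhancement in this sense requires correlated regressors.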
Agha, Salah R; Alnahhal, Mohammed J
2012-11-01
The current study investigates the possibility of obtaining the anthropometric dimensions, critical to school furniture design, without measuring all of them. The study first selects some anthropometric dimensions that are easy to measure. Two methods are then used to check if these easy-to-measure dimensions can predict the dimensions critical to the furniture design. These methods are multiple linear regression and neural networks. Each dimension that is deemed necessary to ergonomically design school furniture is expressed as a function of some other measured anthropometric dimensions. Results show that out of the five dimensions needed for chair design, four can be related to other dimensions that can be measured while children are standing. Therefore, the method suggested here would definitely save time and effort and avoid the difficulty of dealing with students while measuring these dimensions. In general, it was found that neural networks perform better than multiple linear regression in the current study. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
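To make the contrast between mean-based and quantile-based estimation concrete, the sketch below uses the check ("pinball") loss that quantile regression minimizes. With no predictors, the constant minimizing this loss is a sample quantile, whereas least squares yields the mean. The data and function names are hypothetical and unrelated to the databases used in the study.

```python
def pinball_loss(y, pred, tau):
    """Average check loss of a constant prediction at quantile level tau."""
    total = 0.0
    for yi in y:
        if yi >= pred:
            total += tau * (yi - pred)
        else:
            total += (1.0 - tau) * (pred - yi)
    return total / len(y)


def best_constant(y, tau):
    """Constant minimizing the pinball loss; a minimizer always lies among the data."""
    return min(sorted(y), key=lambda c: pinball_loss(y, c, tau))


# Hypothetical outcome with one extreme value.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
mean_y = sum(y) / len(y)
median_y = best_constant(y, 0.5)
lower_quartile = best_constant(y, 0.25)
print(mean_y, median_y, lower_quartile)
```

The mean is pulled toward the extreme value while the median and lower quartile are not, which is the kind of differential inference the abstract refers to.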
Applied Multiple Linear Regression: A General Research Strategy
ERIC Educational Resources Information Center
Smith, Brandon B.
1969-01-01
Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)
ERIC Educational Resources Information Center
Bates, Reid A.; Holton, Elwood F., III; Burnett, Michael F.
1999-01-01
A case study of learning transfer demonstrates the possible effect of influential observation on linear regression analysis. A diagnostic method that tests for violation of assumptions, multicollinearity, and individual and multiple influential observations helps determine which observation to delete to eliminate bias. (SK)
Partitioning sources of variation in vertebrate species richness
Boone, R.B.; Krohn, W.B.
2000-01-01
Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km²) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%).
Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.
Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage
NASA Astrophysics Data System (ADS)
Cepowski, Tomasz
2017-06-01
The paper presents mathematical relationships that allow us to forecast the estimated main engine power of new container ships, based on data concerning vessels built in 2005-2015. The presented approximations allow us to estimate the engine power based on the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimation of container ship engine power needed in preliminary parametric design of the ship. It follows from the above that the use of multiple linear regression to predict the main engine power of a container ship brings more accurate solutions than simple linear regression.
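The comparison between simple and multiple linear regression can be sketched with ordinary least squares solved through the normal equations. The data below are synthetic (power generated exactly as 1 + 2·length + 3·containers in arbitrary units), not the 2005-2015 ship data, and all names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients [intercept, b1, ...] from the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)


def sse(rows, y, beta):
    """Sum of squared residuals of the fitted model."""
    return sum((yi - (beta[0] + sum(b * v for b, v in zip(beta[1:], r)))) ** 2
               for r, yi in zip(rows, y))


# Synthetic predictors (length, containers) and response (power).
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
power = [1.0 + 2.0 * length + 3.0 * boxes for length, boxes in rows]

beta_multi = ols(rows, power)                 # both predictors
length_only = [(r[0],) for r in rows]
beta_simple = ols(length_only, power)         # length alone
sse_multi = sse(rows, power, beta_multi)
sse_simple = sse(length_only, power, beta_simple)
print(beta_multi, sse_multi, sse_simple)
```

On the fitting data the multiple model can never have a larger residual sum of squares than the nested simple model; whether it predicts new ships better is a separate question that needs validation data.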
Modification of the USLE K factor for soil erodibility assessment on calcareous soils in Iran
NASA Astrophysics Data System (ADS)
Ostovari, Yaser; Ghorbani-Dashtaki, Shoja; Bahrami, Hossein-Ali; Naderi, Mehdi; Dematte, Jose Alexandre M.; Kerry, Ruth
2016-11-01
The measurement of soil erodibility (K) in the field is tedious, time-consuming and expensive; therefore, its prediction through pedotransfer functions (PTFs) could be far less costly and time-consuming. The aim of this study was to develop new PTFs to estimate the K factor using multiple linear regression, Mamdani fuzzy inference systems, and artificial neural networks. For this purpose, K was measured in 40 erosion plots with natural rainfall. Various soil properties including the soil particle size distribution, calcium carbonate equivalent, organic matter, permeability, and wet-aggregate stability were measured. The results showed that the mean measured K was 0.014 t h MJ⁻¹ mm⁻¹ and 2.08 times less than the estimated mean K (0.030 t h MJ⁻¹ mm⁻¹) using the USLE model. Permeability, wet-aggregate stability, very fine sand, and calcium carbonate were selected as independent variables by forward stepwise regression in order to assess the ability of multiple linear regression, Mamdani fuzzy inference systems and artificial neural networks to predict K. The calcium carbonate equivalent, which is not accounted for in the USLE model, had a significant impact on K in multiple linear regression due to its strong influence on the stability of aggregates and soil permeability. Statistical indices in validation and calibration datasets determined that the artificial neural networks method with the highest R², lowest RMSE, and lowest ME was the best model for estimating the K factor. A strong correlation (R² = 0.81, n = 40, p < 0.05) between the estimated K from multiple linear regression and measured K indicates that the use of calcium carbonate equivalent as a predictor variable gives a better estimation of K in areas with calcareous soils.
Louys, Julien; Meloro, Carlo; Elton, Sarah; Ditchfield, Peter; Bishop, Laura C
2015-01-01
We test the performance of two models that use mammalian communities to reconstruct multivariate palaeoenvironments. While both models exploit the correlation between mammal communities (defined in terms of functional groups) and arboreal heterogeneity, the first uses a multiple multivariate regression of community structure and arboreal heterogeneity, while the second uses a linear regression of the principal components of each ecospace. The success of these methods means the palaeoenvironment of a particular locality can be reconstructed in terms of the proportions of heavy, moderate, light, and absent tree canopy cover. The linear regression is less biased, and more precisely and accurately reconstructs heavy tree canopy cover than the multiple multivariate model. However, the multiple multivariate model performs better than the linear regression for all other canopy cover categories. Both models consistently perform better than randomly generated reconstructions. We apply both models to the palaeocommunity of the Upper Laetolil Beds, Tanzania. Our reconstructions indicate that there was very little heavy tree cover at this site (likely less than 10%), with the palaeo-landscape instead comprising a mixture of light and absent tree cover. These reconstructions help resolve the previous conflicting palaeoecological reconstructions made for this site. Copyright © 2014 Elsevier Ltd. All rights reserved.
Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa
2008-01-01
This research's main goals were to build a predictor for a turnaround time (TAT) indicator for estimating its values and use a numerical clustering technique for finding possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction and insight characterisation. Building the TAT indicator multiple linear regression predictor and clustering techniques were used for improving corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used for building a predictive TAT value model. The variables contributing to such model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.
1979-01-01
The objective of this paper is to define optical physics and/or environmental conditions under which the linear multiple-regression should be applicable. An investigation of the signal-response equations is conducted and the concept is tested by application to actual remote sensing data from a laboratory experiment performed under controlled conditions. Investigation of the signal-response equations shows that the exact solution for a number of optical physics conditions is of the same form as a linearized multiple-regression equation, even if nonlinear contributions from surface reflections, atmospheric constituents, or other water pollutants are included. Limitations on achieving this type of solution are defined.
DOT National Transportation Integrated Search
2016-09-01
We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...
ℓ(p)-Norm multikernel learning approach for stock market price forecasting.
Shao, Xigao; Wu, Kun; Liao, Bifeng
2012-01-01
Linear multiple kernel learning model has been used for predicting financial time series. However, ℓ(1)-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓ(p)-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of Shanghai Stock Index in China. Experimental results show that our proposed model performs better than ℓ(1)-norm multiple support vector regression model.
NASA Astrophysics Data System (ADS)
Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said
2014-09-01
In the presence of multicollinearity and multiple outliers, statistical inference for a linear regression model using ordinary least squares (OLS) estimators is severely affected and produces misleading results. To overcome this, many approaches have been investigated. These include robust methods, which were reported to be less sensitive to the presence of outliers. In addition, the ridge regression technique was employed to tackle the multicollinearity problem. In order to mitigate both problems, a combination of ridge regression and robust methods was discussed in this study. The superiority of this approach was examined when multicollinearity and multiple outliers occurred simultaneously in multiple linear regression. This study aimed to look at the performance of several well-known estimators, namely M, MM, RIDGE and the robust ridge regression estimators Weighted Ridge M-estimator (WRM), Weighted Ridge MM (WRMM) and Ridge MM (RMM), in such a situation. Results of the study showed that in the presence of simultaneous multicollinearity and multiple outliers (in both x and y-direction), the RMM and RIDGE are more or less similar in terms of superiority over the other estimators, regardless of the number of observations, level of collinearity and percentage of outliers used. However, when outliers occurred in only a single direction (y-direction), the WRMM estimator is the most superior among the robust ridge regression estimators, producing the least variance. In conclusion, robust ridge regression is the best alternative as compared to robust and conventional least squares estimators when dealing with the simultaneous presence of multicollinearity and outliers.
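The ridge component of these estimators can be illustrated with the closed-form solution (X'X + kI)⁻¹X'y for two centered predictors. This is a minimal sketch on made-up, nearly collinear data; it covers plain ridge regression only, not the robust WRM, WRMM or RMM estimators studied in the paper, and the function name and the ridge constant k are assumptions.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Ridge coefficients for two centered predictors (no intercept); k = 0 gives OLS."""
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Hypothetical centered data in which x2 nearly duplicates x1.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 1.1, 1.9]
y = [-4.0, -2.1, 0.1, 1.9, 4.1]

beta_ols = ridge_two_predictors(x1, x2, y, k=0.0)
beta_ridge = ridge_two_predictors(x1, x2, y, k=1.0)
print(beta_ols, beta_ridge)
```

Adding k to the diagonal makes the near-singular cross-product matrix better conditioned and shrinks the coefficient vector, trading a little bias for lower variance.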
Automating approximate Bayesian computation by local linear regression.
Thornton, Kevin R
2009-07-07
In several biological contexts, parameter inference often relies on computationally-intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully-documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local-linear regression.
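The regression step behind this flavor of ABC can be sketched for one parameter and one summary statistic: accepted parameter draws are regressed on their simulated statistics, and each draw is then shifted to the observed statistic along the fitted line. This is a simplified, unweighted illustration with hypothetical names and data; it is not ABCreg's code, and it omits the kernel weighting and the parameter transformations mentioned in the abstract.

```python
def regression_adjust(thetas, stats, s_obs):
    """Adjust accepted draws: theta* = theta - slope * (s - s_obs)."""
    n = len(thetas)
    s_bar = sum(stats) / n
    t_bar = sum(thetas) / n
    slope = (sum((s - s_bar) * (t - t_bar) for s, t in zip(stats, thetas))
             / sum((s - s_bar) ** 2 for s in stats))
    return [t - slope * (s - s_obs) for t, s in zip(thetas, stats)]


# Hypothetical accepted simulations whose parameter is exactly 2*s + 1.
stats = [0.8, 0.9, 1.1, 1.2]
thetas = [2.0 * s + 1.0 for s in stats]
adjusted = regression_adjust(thetas, stats, s_obs=1.0)
print(adjusted)
```

When the relationship between parameter and statistic is exactly linear, as in this toy case, every adjusted draw collapses to the value implied by the observed statistic; with real simulations the adjusted draws retain scatter that approximates the posterior.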
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
NASA Astrophysics Data System (ADS)
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article is concerned with comparing the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation we employ data obtained from the tagi gas filling company during the period 2008-2010. The main result is that the method based on the condition number performs better than the other methods, since it has a smaller mean square error (MSE).
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc
2015-08-01
The third molar development (TMD) has been widely utilized as one of the radiographic methods for dental age estimation. By using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated into the TMD regression model. This study aims to evaluate the performance of dental age estimation in individual method models and the combined model (TMD and TME) based on the classic regressions of multiple linear and principal component analysis. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and Olze were employed to stage the TMD and TME, respectively. The data were divided to develop three respective models based on the two regressions of multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between each model. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both genders, adjusted R² yielded an increment in the linear regressions of the combined model as compared to the individual models. An overall decrease in RMSE was detected in the combined model as compared to TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, a low value of adjusted R² and high RMSE, except in males, were exhibited in the combined model. Dental age estimation is better predicted using the combined model in multiple linear regression models. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
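The two accuracy measures used to compare these models can be computed as in the sketch below. The ages, predictions and predictor count are invented for illustration and are unrelated to the radiograph sample; the adjusted R² uses the usual 1 - (1 - R²)(n - 1)/(n - p - 1) correction for p predictors.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_a = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - sse / sst


def adjusted_r_squared(actual, predicted, n_predictors):
    """R^2 penalized for the number of predictors in the model."""
    n = len(actual)
    r2 = r_squared(actual, predicted)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def rmse(actual, predicted):
    """Root mean square error of the predictions."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical observed and predicted values.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(actual, predicted),
      adjusted_r_squared(actual, predicted, n_predictors=1),
      rmse(actual, predicted))
```

Adjusted R² is the fairer yardstick when a combined model has more predictors than the individual models it is compared with, since plain R² cannot decrease when predictors are added.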
Simultaneous spectrophotometric determination of salbutamol and bromhexine in tablets.
Habib, I H I; Hassouna, M E M; Zaki, G A
2005-03-01
Typical anti-mucolytic drugs called salbutamol hydrochloride and bromhexine sulfate encountered in tablets were determined simultaneously either by using linear regression at zero-crossing wavelengths of the first derivation of UV-spectra or by application of multiple linear partial least squares regression method. The results obtained by the two proposed mathematical methods were compared with those obtained by the HPLC technique.
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
Brown, Angus M
2006-04-01
The objective of this present study was to demonstrate a method for fitting complex electrophysiological data with multiple functions using the SOLVER add-in of the ubiquitous spreadsheet Microsoft Excel. SOLVER minimizes the difference between the sum of the squares of the data to be fit and the function(s) describing the data using an iterative generalized reduced gradient method. While it is a straightforward procedure to fit data with linear functions, and we have previously demonstrated a method of non-linear regression analysis of experimental data based upon a single function, it is more complex to fit data with multiple functions, usually requiring specialized expensive computer software. In this paper we describe an easily understood program for fitting experimentally acquired data, in this case the stimulus-evoked compound action potential from the mouse optic nerve, with multiple Gaussian functions. The program is flexible and can be applied to describe data with a wide variety of user-input functions.
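The quantity such a fit drives toward zero can be written down directly: a model made of several Gaussian functions and the sum of squared differences between it and the data. The sketch below only evaluates that objective for a hypothetical two-component model; it does not reproduce the SOLVER workflow or its generalized reduced gradient search, and all names and parameter values are illustrative.

```python
import math


def gaussian_sum(t, components):
    """Sum of Gaussians; each component is (amplitude, center, width)."""
    return sum(a * math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2))
               for a, mu, sigma in components)


def sum_squared_error(times, values, components):
    """Objective an optimizer would minimize over the component parameters."""
    return sum((v - gaussian_sum(t, components)) ** 2 for t, v in zip(times, values))


# Synthetic noise-free trace built from two known components.
true_components = [(1.0, 1.0, 0.2), (0.6, 2.0, 0.3)]
times = [i * 0.1 for i in range(40)]
values = [gaussian_sum(t, true_components) for t in times]

initial_guess = [(0.8, 1.1, 0.25), (0.5, 1.9, 0.3)]
sse_guess = sum_squared_error(times, values, initial_guess)
sse_true = sum_squared_error(times, values, true_components)
print(sse_guess, sse_true)
```

An iterative optimizer would start from the initial guess and adjust the six parameters until the objective stops decreasing; adding a component to the list extends the model without changing the objective function.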
Weather Impact on Airport Arrival Meter Fix Throughput
NASA Technical Reports Server (NTRS)
Wang, Yao
2017-01-01
Time-based flow management provides arrival aircraft schedules based on arrival airport conditions, airport capacity, required spacing, and weather conditions. In order to meet a scheduled time at which arrival aircraft can cross an airport arrival meter fix prior to entering the airport terminal airspace, air traffic controllers make regulations on air traffic. Severe weather may create an airport arrival bottleneck if one or more of airport arrival meter fixes are partially or completely blocked by the weather and the arrival demand has not been reduced accordingly. Under these conditions, aircraft are frequently being put in holding patterns until they can be rerouted. A model that predicts the weather impacted meter fix throughput may help air traffic controllers direct arrival flows into the airport more efficiently, minimizing arrival meter fix congestion. This paper presents an analysis of air traffic flows across arrival meter fixes at the Newark Liberty International Airport (EWR). Several scenarios of weather impacted EWR arrival fix flows are described. Furthermore, multiple linear regression and regression tree ensemble learning approaches for translating multiple sector Weather Impacted Traffic Indexes (WITI) to EWR arrival meter fix throughputs are examined. These weather translation models are developed and validated using the EWR arrival flight and weather data for the period of April-September in 2014. This study also compares the performance of the regression tree ensemble with traditional multiple linear regression models for estimating the weather impacted throughputs at each of the EWR arrival meter fixes. For all meter fixes investigated, the results from the regression tree ensemble weather translation models show a stronger correlation between model outputs and observed meter fix throughputs than that produced from multiple linear regression method.
Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method.
Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza
2015-11-18
Birthweight is one of the most important predicting indicators of the health status in adulthood. Having a balanced birthweight is one of the priorities of the health system in most of the industrial and developed countries. This indicator is used to assess the growth and health status of the infants. The aim of this study was to assess the birthweight of the neonates by using quantile regression in Zanjan province. This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and quantile regression methods and SAS 9.2 statistical software. Of 8456 newborn babies, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of neonates (p<0.05), weight and educational level of the mothers (p<0.05) showed a significant linear relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when the response variable is asymmetric or the data contain outliers.
Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method
Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza
2016-01-01
Introduction: Birthweight is one of the most important predicting indicators of the health status in adulthood. Having a balanced birthweight is one of the priorities of the health system in most of the industrial and developed countries. This indicator is used to assess the growth and health status of the infants. The aim of this study was to assess the birthweight of the neonates by using quantile regression in Zanjan province. Methods: This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and quantile regression methods and SAS 9.2 statistical software. Results: Of 8456 newborn babies, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of neonates (p<0.05), weight and educational level of the mothers (p<0.05) showed a significant linear relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). Conclusion: This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when the response variable is asymmetric or the data contain outliers. PMID:26925889
NASA Astrophysics Data System (ADS)
Shastri, Niket; Pathak, Kamlesh
2018-05-01
The water vapor content of the atmosphere plays a very important role in climate. This paper discusses the application of GPS signals in meteorology, a useful technique for estimating the precipitable water vapor of the atmosphere. Various algorithms, namely artificial neural networks, support vector machines and multiple linear regression, are used to predict precipitable water vapor. Comparative studies in terms of root mean square error and mean absolute error are also carried out for all the algorithms.
Simultaneous multiple non-crossing quantile regression estimation using kernel constraints
Liu, Yufeng; Wu, Yichao
2011-01-01
Quantile regression (QR) is a very useful statistical tool for learning the relationship between the response variable and covariates. For many applications, one often needs to estimate multiple conditional quantile functions of the response variable given covariates. Although one can estimate multiple quantiles separately, it is of great interest to estimate them simultaneously. One advantage of simultaneous estimation is that multiple quantiles can share strength among them to gain better estimation accuracy than individually estimated quantile functions. Another important advantage of joint estimation is the feasibility of incorporating simultaneous non-crossing constraints of QR functions. In this paper, we propose a new kernel-based multiple QR estimation technique, namely simultaneous non-crossing quantile regression (SNQR). We use kernel representations for QR functions and apply constraints on the kernel coefficients to avoid crossing. Both unregularised and regularised SNQR techniques are considered. Asymptotic properties such as asymptotic normality of linear SNQR and oracle properties of the sparse linear SNQR are developed. Our numerical results demonstrate the competitive performance of our SNQR over the original individual QR estimation. PMID:22190842
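The crossing problem that motivates the joint constraints is easy to state for linear quantile functions: a lower-quantile line must not exceed an upper-quantile line anywhere in the covariate range of interest. The sketch below checks this for a pair of fitted lines; it is a diagnostic with hypothetical coefficients and names, not the kernel-constrained SNQR estimator proposed in the paper.

```python
def quantile_lines_cross(lower, upper, x_min, x_max):
    """True if the lower-quantile line exceeds the upper one on [x_min, x_max].

    lower, upper: (intercept, slope) of linear quantile functions.
    The gap between two lines is itself linear, so checking both endpoints suffices.
    """
    def gap(x):
        return (upper[0] + upper[1] * x) - (lower[0] + lower[1] * x)

    return min(gap(x_min), gap(x_max)) < 0.0


# Hypothetical separately estimated lines for the 0.25 and 0.75 quantiles.
q25 = (0.0, 1.0)
q75 = (1.0, 0.5)
print(quantile_lines_cross(q25, q75, 0.0, 1.0))
print(quantile_lines_cross(q25, q75, 0.0, 3.0))
```

Separately estimated lines with different slopes will always cross somewhere on the real line, so what matters in practice is whether the crossing falls inside the range where predictions are needed; simultaneous estimation with constraints rules that out by construction.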
ℓ p-Norm Multikernel Learning Approach for Stock Market Price Forecasting
Shao, Xigao; Wu, Kun; Liao, Bifeng
2012-01-01
Linear multiple kernel learning model has been used for predicting financial time series. However, ℓ 1-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓ p-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of Shanghai Stock Index in China. Experimental results show that our proposed model performs better than ℓ 1-norm multiple support vector regression model. PMID:23365561
Suppression Situations in Multiple Linear Regression
ERIC Educational Resources Information Center
Shieh, Gwowen
2006-01-01
This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…
BIODEGRADATION PROBABILITY PROGRAM (BIODEG)
The Biodegradation Probability Program (BIODEG) calculates the probability that a chemical under aerobic conditions with mixed cultures of microorganisms will biodegrade rapidly or slowly. It uses fragment constants developed using multiple linear and non-linear regressions and d...
The Use of Linear Programming for Prediction.
ERIC Educational Resources Information Center
Schnittjer, Carl J.
The purpose of the study was to develop a linear programming model to be used for prediction, test the accuracy of the predictions, and compare the accuracy with that produced by curvilinear multiple regression analysis. (Author)
Miozzo, Michele; Pulvermüller, Friedemann; Hauk, Olaf
2015-01-01
The time course of brain activation during word production has become an area of increasingly intense investigation in cognitive neuroscience. The predominant view has been that semantic and phonological processes are activated sequentially, at about 150 and 200–400 ms after picture onset. Although evidence from prior studies has been interpreted as supporting this view, these studies were arguably not ideally suited to detect early brain activation of semantic and phonological processes. We here used a multiple linear regression approach to magnetoencephalography (MEG) analysis of picture naming in order to investigate early effects of variables specifically related to visual, semantic, and phonological processing. This was combined with distributed minimum-norm source estimation and region-of-interest analysis. Brain activation associated with visual image complexity appeared in occipital cortex at about 100 ms after picture presentation onset. At about 150 ms, semantic variables became physiologically manifest in left frontotemporal regions. In the same latency range, we found an effect of phonological variables in the left middle temporal gyrus. Our results demonstrate that multiple linear regression analysis is sensitive to early effects of multiple psycholinguistic variables in picture naming. Crucially, our results suggest that access to phonological information might begin in parallel with semantic processing around 150 ms after picture onset. PMID:25005037
Dong, J Q; Zhang, X Y; Wang, S Z; Jiang, X F; Zhang, K; Ma, G W; Wu, M Q; Li, H; Zhang, H
2018-01-01
Plasma very low-density lipoprotein (VLDL) can be used to select for low body fat or abdominal fat (AF) in broilers, but its correlation with AF is limited. We investigated whether any other biochemical indicator can be used in combination with VLDL for a better selective effect. Nineteen plasma biochemical indicators were measured in male chickens from the Northeast Agricultural University broiler lines divergently selected for AF content (NEAUHLF) in the fed state at 46 and 48 d of age. The average concentration of every parameter for the 2 d was used for statistical analysis. Levels of these 19 plasma biochemical parameters were compared between the lean and fat lines. The phenotypic correlations between these plasma biochemical indicators and AF traits were analyzed. Then, multiple linear regression models were constructed to select the best model for selecting against AF content, and the heritabilities of the plasma indicators contained in the best models were estimated. The results showed that 11 plasma biochemical indicators (triglycerides, total bile acid, total protein, globulin, albumin/globulin, aspartate transaminase, alanine transaminase, gamma-glutamyl transpeptidase, uric acid, creatinine, and VLDL) differed significantly between the lean and fat lines (P < 0.01), and correlated significantly with AF traits (P < 0.05). The best multiple linear regression models, based on albumin/globulin, VLDL, triglycerides, globulin, total bile acid, and uric acid, had higher R² (0.73) than the model based only on VLDL (0.21). The plasma parameters included in the best models had moderate heritability estimates (0.21 ≤ h² ≤ 0.43). These results indicate that these multiple linear regression models can be used to select for lean broiler chickens. © 2017 Poultry Science Association Inc.
Regression Commonality Analysis: A Technique for Quantitative Theory Building
ERIC Educational Resources Information Center
Nimon, Kim; Reio, Thomas G., Jr.
2011-01-01
When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…
Krasikova, Dina V; Le, Huy; Bachura, Eric
2018-06-01
To address a long-standing concern regarding a gap between organizational science and practice, scholars called for more intuitive and meaningful ways of communicating research results to users of academic research. In this article, we develop a common language effect size index (CLβ) that can help translate research results to practice. We demonstrate how CLβ can be computed and used to interpret the effects of continuous and categorical predictors in multiple linear regression models. We also elaborate on how the proposed CLβ index is computed and used to interpret interactions and nonlinear effects in regression models. In addition, we test the robustness of the proposed index to violations of normality and provide means for computing standard errors and constructing confidence intervals around its estimates. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Schistosomiasis Breeding Environment Situation Analysis in Dongting Lake Area
NASA Astrophysics Data System (ADS)
Li, Chuanrong; Jia, Yuanyuan; Ma, Lingling; Liu, Zhaoyan; Qian, Yonggang
2013-01-01
Monitoring the environmental characteristics, such as vegetation and soil moisture, that shape the spatial and temporal distribution of Oncomelania hupensis (O. hupensis) is of vital importance to schistosomiasis prevention and control. In this study, the relationship between environmental factors derived from remotely sensed data and the density of O. hupensis was first analyzed with a multiple linear regression model. Secondly, the spatial structure of the regression residual was investigated by the semi-variogram method. Thirdly, spatial analysis of the regression residual and the multiple linear regression model were both employed to estimate the spatial variation of O. hupensis density. Finally, the approach was used to monitor and predict the spatial and temporal variations of Oncomelania in the Dongting Lake region, China. The areas of potential O. hupensis habitat were predicted, and the influence of the Three Gorges Dam (TGD) project on the density of O. hupensis was analyzed.
Optimized multiple linear mappings for single image super-resolution
NASA Astrophysics Data System (ADS)
Zhang, Kaibing; Li, Jie; Xiong, Zenggang; Liu, Xiuping; Gao, Xinbo
2017-12-01
Learning piecewise linear regression has been recognized in the literature as an effective approach to example learning-based single image super-resolution (SR). In this paper, we employ an expectation-maximization (EM) algorithm to further improve the SR performance of our previous multiple linear mappings (MLM) based SR method. In the training stage, the proposed method starts with a set of linear regressors obtained by the MLM-based method, and then jointly optimizes the clustering results and the low- and high-resolution subdictionary pairs for regression functions by using the metric of the reconstruction errors. In the test stage, we select the optimal regressor for SR reconstruction by accumulating the reconstruction errors of m-nearest neighbors in the training set. Thorough experiments carried out on six publicly available datasets demonstrate that the proposed SR method can yield high-quality images with finer details and sharper edges in terms of both quantitative and perceptual image quality assessments.
Adjusted variable plots for Cox's proportional hazards regression model.
Hall, C B; Zeger, S L; Bandeen-Roche, K J
1996-01-01
Adjusted variable plots are useful in linear regression for outlier detection and for qualitative evaluation of the fit of a model. In this paper, we extend adjusted variable plots to Cox's proportional hazards model for possibly censored survival data. We propose three different plots: a risk level adjusted variable (RLAV) plot in which each observation in each risk set appears, a subject level adjusted variable (SLAV) plot in which each subject is represented by one point, and an event level adjusted variable (ELAV) plot in which the entire risk set at each failure event is represented by a single point. The latter two plots are derived from the RLAV by combining multiple points. In each plot, the regression coefficient and standard error from a Cox proportional hazards regression are obtained by a simple linear regression through the origin fit to the coordinates of the pictured points. The plots are illustrated with a reanalysis of a dataset of 65 patients with multiple myeloma.
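The "simple linear regression through the origin" mentioned in this abstract has a closed-form solution, sketched below. It shows only the generic least-squares formulas; how the plotted coordinates are built from a Cox model is not covered, and the function name and sample points are hypothetical.

```python
import math


def fit_through_origin(x, y):
    """Least-squares slope and its standard error for the model y = b * x."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    slope = sxy / sxx
    # Residual variance uses n - 1 degrees of freedom (one fitted parameter).
    rss = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / ((len(x) - 1) * sxx))
    return slope, std_err


# Hypothetical plotted coordinates, not data from the paper.
slope, std_err = fit_through_origin([1.0, 2.0, 3.0], [2.0, 4.0, 7.0])
print(slope, std_err)
```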
NASA Astrophysics Data System (ADS)
Sahabiev, I. A.; Ryazanov, S. S.; Kolcova, T. G.; Grigoryan, B. R.
2018-03-01
The three most common techniques to interpolate soil properties at a field scale—ordinary kriging (OK), regression kriging with multiple linear regression drift model (RK + MLR), and regression kriging with principal component regression drift model (RK + PCR)—were examined. The results of the performed study were compiled into an algorithm of choosing the most appropriate soil mapping technique. Relief attributes were used as the auxiliary variables. When spatial dependence of a target variable was strong, the OK method showed more accurate interpolation results, and the inclusion of the auxiliary data resulted in an insignificant improvement in prediction accuracy. According to the algorithm, the RK + PCR method effectively eliminates multicollinearity of explanatory variables. However, if the number of predictors is less than ten, the probability of multicollinearity is reduced, and application of the PCR becomes irrational. In that case, the multiple linear regression should be used instead.
Energy expenditure estimation during daily military routine with body-fixed sensors.
Wyss, Thomas; Mäder, Urs
2011-05-01
The purpose of this study was to develop and validate an algorithm for estimating energy expenditure during the daily military routine on the basis of data collected using body-fixed sensors. First, 8 volunteers completed isolated physical activities according to an established protocol, and the resulting data were used to develop activity-class-specific multiple linear regressions for physical activity energy expenditure on the basis of hip acceleration, heart rate, and body mass as independent variables. Second, the validity of these linear regressions was tested during the daily military routine using indirect calorimetry (n = 12). Volunteers' mean estimated energy expenditure did not significantly differ from the energy expenditure measured with indirect calorimetry (p = 0.898, 95% confidence interval = -1.97 to 1.75 kJ/min). We conclude that the developed activity-class-specific multiple linear regressions applied to the acceleration and heart rate data allow estimation of energy expenditure in 1-minute intervals during daily military routine, with accuracy equal to indirect calorimetry.
Williams, D. Keith; Muddiman, David C.
2008-01-01
Fourier transform ion cyclotron resonance mass spectrometry has the ability to achieve unprecedented mass measurement accuracy (MMA); MMA is one of the most significant attributes of mass spectrometric measurements as it affords extraordinary molecular specificity. However, due to space-charge effects, the achievable MMA significantly depends on the total number of ions trapped in the ICR cell for a particular measurement. Even through the use of automatic gain control (AGC), the total ion population is not constant between spectra. Multiple linear regression calibration in conjunction with AGC is utilized in these experiments to formally account for the differences in total ion population in the ICR cell between the external calibration spectra and experimental spectra. This ability allows for the extension of dynamic range of the instrument while allowing mean MMA values to remain less than 1 ppm. In addition, multiple linear regression calibration is used to account for both differences in total ion population in the ICR cell as well as relative ion abundance of a given species, which also affords mean MMA values at the parts-per-billion level. PMID:17539605
Quantile Regression in the Study of Developmental Sciences
ERIC Educational Resources Information Center
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of…
Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)
1987-10-01
Repair FADAC Printed Circuit Board ............. 6 3. Data Analysis Techniques ............................. 6 a. Multiple Linear Regression... ANALYSIS/DISCUSSION ............................... 12 1. Example of Regression Analysis ..................... 12 2. Regression results for all tasks...6 TABLE 9. Task Grouping for Analysis ........................ 7 TABLE 10. Remove/Replace H60A3 Power Pack................. 8 TABLE
Simple linear and multivariate regression models.
Rodríguez del Águila, M M; Benítez-Parejo, N
2011-01-01
In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.
Advanced Statistics for Exotic Animal Practitioners.
Hodsoll, John; Hellier, Jennifer M; Ryan, Elizabeth G
2017-09-01
Correlation and regression assess the association between 2 or more variables. This article reviews the core knowledge needed to understand these analyses, moving from visual analysis in scatter plots through correlation, simple and multiple linear regression, and logistic regression. Correlation estimates the strength and direction of a relationship between 2 variables. Regression can be considered more general and quantifies the numerical relationships between an outcome and 1 or multiple variables in terms of a best-fit line, allowing predictions to be made. Each technique is discussed with examples and the statistical assumptions underlying their correct application. Copyright © 2017 Elsevier Inc. All rights reserved.
Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach
NASA Astrophysics Data System (ADS)
Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew
2017-05-01
This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations.
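The idea of combining clustering with regression can be illustrated with a simple alternating heuristic: fit one line per cluster, reassign each point to the line that fits it best, and repeat. This is only a sketch of the general clusterwise idea with a single predictor. It is not the incremental optimization algorithm developed in the paper, and all names and data are hypothetical.

```python
def fit_line(points):
    """Least-squares (intercept, slope) for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        return mean_y, 0.0  # degenerate cluster: fall back to a flat line
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return mean_y - slope * mean_x, slope


def clusterwise_regression(points, labels, n_clusters, max_iter=50):
    """Alternate between per-cluster line fits and best-fit reassignment."""
    labels = list(labels)
    lines = []
    for _ in range(max_iter):
        lines = []
        for k in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # An empty cluster is refit on all points so the loop can continue.
            lines.append(fit_line(members if members else points))
        new_labels = [
            min(range(n_clusters),
                key=lambda k: (y - lines[k][0] - lines[k][1] * x) ** 2)
            for x, y in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return lines, labels


# Hypothetical data drawn from two lines: y = 2x and y = 10 - x.
points = [(float(x), 2.0 * x) for x in range(5)] + [(float(x), 10.0 - x) for x in range(5)]
start = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]  # one point deliberately mislabelled
lines, labels = clusterwise_regression(points, start, n_clusters=2)
print(lines, labels)
```

Heuristics of this kind depend on the starting labels and can stop at a poor local solution, which is part of the motivation for the more careful optimization formulation described in the abstract.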
NASA Astrophysics Data System (ADS)
Shi, Jinfei; Zhu, Songqing; Chen, Ruwen
2017-12-01
An order selection method based on multiple stepwise regression is proposed for the General Expression of Nonlinear Autoregressive (GNAR) model, which converts the model order problem into variable selection for a multiple linear regression equation. The partial autocorrelation function is adopted to define the linear terms in the GNAR model. The result is set as the initial model, and the nonlinear terms are then introduced gradually. Statistics are chosen to study the improvement that both the newly introduced and the existing variables bring to the model characteristics, and these are used to decide which model variables to retain or eliminate. The optimal model is then obtained through a measurement of data-fitting effect or a significance test. Simulation and classic time-series data experiments show that the proposed method is simple, reliable and applicable to practical engineering.
Multiple regression technique for Pth degree polynomials with and without linear cross products
NASA Technical Reports Server (NTRS)
Davis, J. W.
1973-01-01
A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products; these programs evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered, and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent error are evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique; they show the output formats and typical plots comparing computer results to each set of input data.
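The two model forms can be pictured through the regressors generated for one observation. The sketch below assumes that "linear cross products" means the pairwise products x_i * x_j of the independent variables; that reading, and the function name, are assumptions rather than details taken from the report.

```python
from itertools import combinations


def polynomial_row(x, degree, cross_products=False):
    """Regressors for one observation: 1, then x_i, x_i^2, ..., x_i^degree
    for each variable, optionally followed by pairwise products x_i * x_j."""
    row = [1.0]
    for xi in x:
        row.extend(xi ** p for p in range(1, degree + 1))
    if cross_products:
        row.extend(xi * xj for xi, xj in combinations(x, 2))
    return row


print(polynomial_row([2.0, 3.0], degree=2))
print(polynomial_row([2.0, 3.0], degree=2, cross_products=True))
```

Under this assumption, a model with N variables and degree P has 1 + N*P constants without cross products and N*(N-1)/2 more with them, all of which can be estimated by ordinary least squares because the model stays linear in its coefficients.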
Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha
2012-05-01
Estimation of stature is an important parameter in identification of human remains in forensic examinations. The present study is aimed to compare the reliability and accuracy of stature estimation and to demonstrate the variability in estimated stature and actual stature using multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements; hand length, hand breadth, foot length and foot breadth taken on the left side in each subject were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. Derived multiplication factors and regression formula were applied to the hand and foot measurements in the study sample. The estimated stature from the multiplication factors and regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in estimation of stature from regression analysis method is less than that of multiplication factor method thus, confirming that the regression analysis method is better than multiplication factor analysis in stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
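The two estimation approaches compared in this abstract can be sketched side by side. The multiplication factor is taken here as the mean ratio of stature to a body dimension, which is the usual definition but is an assumption about this study; the measurements below are hypothetical, not the study's data.

```python
def mean_multiplication_factor(dimension, stature):
    """Average ratio of stature to a body dimension across subjects."""
    return sum(s / d for d, s in zip(dimension, stature)) / len(dimension)


def fit_simple_regression(x, y):
    """Least-squares (intercept, slope) for stature = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def sum_squared_error(actual, estimated):
    return sum((a - e) ** 2 for a, e in zip(actual, estimated))


# Hypothetical foot lengths and statures in cm.
foot_length = [23.0, 24.5, 25.0, 26.5, 27.0]
stature = [158.0, 165.0, 170.0, 174.0, 180.0]

mf = mean_multiplication_factor(foot_length, stature)
intercept, slope = fit_simple_regression(foot_length, stature)
mf_estimates = [mf * d for d in foot_length]
reg_estimates = [intercept + slope * d for d in foot_length]
print(sum_squared_error(stature, mf_estimates), sum_squared_error(stature, reg_estimates))
```

On the fitting sample the regression line can never have a larger squared error than the multiplication factor, because the factor estimate is itself a straight line (through the origin) and least squares picks the best straight line. A fair comparison of the two methods therefore needs error ranges or held-out subjects, as in the study's comparison of estimated and actual stature.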
NASA Technical Reports Server (NTRS)
Barrett, C. A.
1985-01-01
Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni base cast turbine alloys. The U transform (i.e., 1/sin (% A/100) to the 1/2) was shown to give the best estimate of the dependent variable, y. A complete second degree equation is described for the "centered" weight chemistries for the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added for a basic 47 term equation. The best reduced equation was determined by the stepwise selection method with essentially 13 terms. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.
Use of probabilistic weights to enhance linear regression myoelectric control
NASA Astrophysics Data System (ADS)
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
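The weighting step described under Approach can be sketched for a single DOF and a single scalar feature. This is a toy illustration under stated assumptions: equal class priors, one shared standard deviation, and a weighting rule that multiplies the regression output by the probability of the movement class matching its sign. The study itself used multi-dimensional EMG features, and all names and numbers here are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def class_probabilities(feature, class_means, std):
    """Posterior class probabilities under equal priors and a shared std."""
    likelihoods = {name: gaussian_pdf(feature, m, std) for name, m in class_means.items()}
    total = sum(likelihoods.values())
    return {name: value / total for name, value in likelihoods.items()}


def weighted_output(regression_output, probs):
    """Scale a regression output by the probability of the matching movement class."""
    direction = "positive" if regression_output >= 0 else "negative"
    return regression_output * probs[direction]


# Hypothetical class means for one DOF: movement either way, or rest.
class_means = {"negative": -1.0, "rest": 0.0, "positive": 1.0}
probs = class_probabilities(0.0, class_means, std=0.5)
print(probs, weighted_output(0.3, probs))
```

When the feature looks like rest, the probability of intended movement is small, so a spurious regression output at that DOF is shrunk toward zero. That is the mechanism the abstract credits with preventing extraneous movement.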
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
Multiple linear regression models are often used to predict levels of fecal indicator bacteria (FIB) in recreational swimming waters based on independent variables (IVs) such as meteorologic, hydrodynamic, and water-quality measures. The IVs used for these analyses are traditiona...
Ng, Kar Yong; Awang, Norhashidah
2018-01-06
Frequent haze occurrences in Malaysia have made the management of PM10 (particulate matter with aerodynamic diameter less than 10 μm) pollution a critical task. This requires knowledge of the factors associated with PM10 variation and good forecasts of PM10 concentrations. Hence, this paper demonstrates the prediction of 1-day-ahead daily average PM10 concentrations based on predictor variables including meteorological parameters and gaseous pollutants. Three different models were built: a multiple linear regression (MLR) model with lagged predictor variables (MLR1), an MLR model with lagged predictor variables and PM10 concentrations (MLR2), and a regression with time series error (RTSE) model. The findings revealed that humidity, temperature, wind speed, wind direction, carbon monoxide and ozone were the main factors explaining the PM10 variation in Peninsular Malaysia. Comparison among the three models showed that the MLR2 model was on the same level as the RTSE model in terms of forecasting accuracy, while the MLR1 model was the worst.
NASA Astrophysics Data System (ADS)
Gusriani, N.; Firdaniza
2018-03-01
The existence of outliers in multiple linear regression analysis causes the Gaussian assumption to be violated. If the least squares method is nevertheless applied to such data, it will produce a model that cannot represent most of the data. A regression method that is robust against outliers is therefore needed. This paper compares the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contain outliers. Based on the robust coefficient of determination, the MCD method produces a better model than the TELBS method.
Miele, Andrew; Thompson, Morgan; Jao, Nancy C; Kalhan, Ravi; Leone, Frank; Hogarth, Lee; Hitsman, Brian; Schnoll, Robert
2018-01-01
A substantial proportion of cancer patients continue to smoke after their diagnosis, but few studies have evaluated correlates of nicotine dependence and smoking rate in this population, which could help guide smoking cessation interventions. This study evaluated correlates of smoking rate and nicotine dependence among 207 cancer patients. A cross-sectional analysis using multiple linear regression evaluated disease, demographic, affective, and tobacco-seeking correlates of smoking rate and nicotine dependence. Smoking rate was assessed using a timeline follow-back method. The Fagerström Test for Nicotine Dependence measured levels of nicotine dependence. A multiple linear regression predicting nicotine dependence showed an association with smoking to alleviate a sense of addiction from the Reasons for Smoking scale and tobacco-seeking behavior from the concurrent choice task (p < .05), but not with affect measured by the HADS and PANAS (p > .05). Multiple linear regression predicting prequit smoking rate showed an association with smoking to alleviate addiction (p < .05). ANOVA showed that Caucasian participants reported greater rates of smoking compared to other races. The results suggest that behavioral smoking cessation interventions that focus on helping patients to manage tobacco-seeking behavior, rather than mood management interventions, could help cancer patients quit smoking.
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements used to track nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a combination of these factors may prevent accurate estimation or measurement; in those cases, height and weight can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations whose coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all cases, the predictions are insensitive to ethnicity, and to gender if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
A method for fitting regression splines with varying polynomial order in the linear mixed model.
Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W
2006-02-15
The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
Selection of higher order regression models in the analysis of multi-factorial transcription data.
Prazeres da Costa, Olivia; Hoffman, Arthur; Rey, Johannes W; Mansmann, Ulrich; Buch, Thorsten; Tresch, Achim
2014-01-01
Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding. Higher order linear regression models can account for these effects. We present a new methodology for linear model selection and apply it to microarray data of bone marrow-derived macrophages. This experiment investigates the influence of three variable factors: the genetic background of the mice from which the macrophages were obtained, Yersinia enterocolitica infection (two strains, and a mock control), and treatment/non-treatment with interferon-γ. We set up four different linear regression models in a hierarchical order. We introduce the eruption plot as a new practical tool for model selection complementary to global testing. It visually compares the size and significance of effect estimates between two nested models. Using this methodology we were able to select the most appropriate model by keeping only relevant factors showing additional explanatory power. Application to experimental data allowed us to qualify the interaction of factors as either neutral (no interaction), alleviating (co-occurring effects are weaker than expected from the single effects), or aggravating (stronger than expected). We find a biologically meaningful gene cluster of putative C2TA target genes that appear to be co-regulated with MHC class II genes. We introduced the eruption plot as a tool for visual model comparison to identify relevant higher order interactions in the analysis of expression data obtained under the influence of multiple factors. We conclude that model selection in higher order linear regression models should generally be performed for the analysis of multi-factorial microarray data.
NASA Technical Reports Server (NTRS)
Wilson, Edward (Inventor)
2006-01-01
The present invention is a method for identifying unknown parameters in a system having a set of governing equations describing its behavior that cannot be put into regression form with the unknown parameters linearly represented. In this method, the vector of unknown parameters is segmented into a plurality of groups where each individual group of unknown parameters may be isolated linearly by manipulation of said equations. Multiple concurrent and independent recursive least squares identification of each said group run, treating other unknown parameters appearing in their regression equation as if they were known perfectly, with said values provided by recursive least squares estimation from the other groups, thereby enabling the use of fast, compact, efficient linear algorithms to solve problems that would otherwise require nonlinear solution approaches. This invention is presented with application to identification of mass and thruster properties for a thruster-controlled spacecraft.
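For readers unfamiliar with the building block named above, a single recursive least squares update can be sketched as follows. This is a generic textbook-style sketch, not the patented method: it shows one group of linearly represented parameters only, and the function name `rls_update`, the forgetting factor `lam`, and the toy data are assumptions made for illustration. In the scheme described in the abstract, several such estimators would run concurrently, each treating the other groups' current estimates as known.

```python
import numpy as np

def rls_update(theta, P, x, y, lam=1.0):
    """One recursive least squares step for the linear model y ~ x @ theta."""
    x = x.reshape(-1, 1)
    gain = P @ x / (lam + (x.T @ P @ x).item())   # gain vector
    error = y - (x.T @ theta).item()              # prediction error
    theta = theta + gain * error                  # parameter update
    P = (P - gain @ x.T @ P) / lam                # covariance update
    return theta, P

# Hypothetical noise-free data generated from known parameters.
rng = np.random.default_rng(0)
true_theta = np.array([[2.0], [-1.0]])
theta = np.zeros((2, 1))
P = 1e6 * np.eye(2)  # large initial covariance: little trust in the initial guess
for _ in range(200):
    x = rng.normal(size=2)
    y = (x @ true_theta).item()
    theta, P = rls_update(theta, P, x, y)
print(theta.ravel())
```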
Mutter, Brigitte; Alcorn, Mark B; Welsh, Marilyn
2006-06-01
This study of the relationship between theory of mind and executive function examined whether on the false-belief task age differences between 3 and 5 years of age are related to development of working-memory capacity and inhibitory processes. 72 children completed tasks measuring false belief, working memory, and inhibition. Significant age effects were observed for false-belief and working-memory performance, as well as for the false-alarm and perseveration measures of inhibition. A simultaneous multiple linear regression specified the contribution of age, inhibition, and working memory to the prediction of false-belief performance. This model was significant, explaining a total of 36% of the variance. To examine the independent contributions of the working-memory and inhibition variables, after controlling for age, two hierarchical multiple linear regressions were conducted. These multiple regression analyses indicate that working memory and inhibition make small, overlapping contributions to false-belief performance after accounting for age, but that working memory, as measured in this study, is a somewhat better predictor of false-belief understanding than is inhibition.
Mapping diffuse photosynthetically active radiation from satellite data in Thailand
NASA Astrophysics Data System (ADS)
Choosri, P.; Janjai, S.; Nunez, M.; Buntoung, S.; Charuchittipan, D.
2017-12-01
In this paper, calculation of monthly average hourly diffuse photosynthetically active radiation (PAR) using satellite data is proposed. Diffuse PAR was analyzed at four stations in Thailand. A radiative transfer model was used for calculating the diffuse PAR for cloudless sky conditions. Differences between the diffuse PAR under all sky conditions obtained from the ground-based measurements and those from the model are representative of cloud effects. Two models are developed, one describing diffuse PAR only as a function of solar zenith angle, and the second one as a multiple linear regression with solar zenith angle and satellite reflectivity acting linearly and aerosol optical depth acting in logarithmic functions. When tested with an independent data set, the multiple regression model performed best with a higher coefficient of determination R2 (0.78 vs. 0.70), lower root mean square difference (RMSD) (12.92% vs. 13.05%) and the same mean bias difference (MBD) of -2.20%. Results from the multiple regression model are used to map diffuse PAR throughout the country as monthly averages of hourly data.
Food insecurity and CD4% Among HIV+ children in Gaborone, Botswana.
Mendoza, Jason A; Matshaba, Mogomotsi; Makhanda, Jeremiah; Liu, Yan; Boitshwarelo, Matshwenyego; Anabwani, Gabriel M
2014-08-01
We investigated the association between household food insecurity (HFI) and CD4% among 2-6-year old HIV+ outpatients (n = 78) at the Botswana-Baylor Children's Clinical Center of Excellence in Gaborone, Botswana. HFI was assessed by a validated survey. CD4% data were abstracted from the medical record. We used multiple linear regression with CD4% (dependent variable), HFI (independent variable), and controlled for sociodemographic and clinical covariates. Multiple linear regression showed a significant main effect for HFI [beta = -0.6, 95% confidence interval (CI): -1.0 to -0.1] and child gender (beta = 5.6, 95% CI: 1.3 to 9.8). Alleviating food insecurity may improve pediatric HIV outcomes in Botswana and similar Sub-Saharan settings.
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have been published to date. Two previously published examples using CART analysis in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Lucifredi, A.; Mazzieri, C.; Rossi, M.
2000-05-01
Since the operating conditions of a hydroelectric unit can vary within a wide range, the monitoring system must be able to distinguish between variations of the monitored variable caused by changes in the operating conditions and those due to the onset and progression of failures and misoperations. The paper aims to identify the best technique to be adopted for the monitoring system. Three different methods have been implemented and compared. Two of them use statistical techniques: the first, linear multiple regression, expresses the monitored variable as a linear function of the process parameters (independent variables), while the second, the dynamic kriging technique, is a modified form of multiple linear regression representing the monitored variable as a linear combination of the process variables in such a way as to minimize the variance of the estimate error. The third is based on neural networks. Tests have shown that the monitoring system based on the kriging technique is not affected by some problems common to the other two models, e.g. the requirement of a large amount of data for their tuning, both for training the neural network and for defining the optimum plane for the multiple regression, not only in the system starting phase but also after a trivial maintenance operation involving the substitution of machinery components having a direct impact on the observed variable, and the need for different models to describe satisfactorily the different operating ranges of the plant. The monitoring system based on the kriging statistical technique overcomes these difficulties: it does not require a large amount of data to be tuned and is immediately operational: given two points, the third can be immediately estimated; in addition the model follows the system without adapting itself to it.
The results of the experimentation performed seem to indicate that a model based on a neural network or on a linear multiple regression is not optimal, and that a different approach is necessary to reduce the amount of work during the learning phase using, when available, all the information stored during the initial phase of the plant to build the reference baseline, elaborating, if it is the case, the raw information available. A mixed approach using the kriging statistical technique and neural network techniques could optimise the result.
NASA Astrophysics Data System (ADS)
Leroux, Romain; Chatellier, Ludovic; David, Laurent
2018-01-01
This article is devoted to the estimation of time-resolved particle image velocimetry (TR-PIV) flow fields using time-resolved point measurements of a voltage signal obtained by hot-film anemometry. A multiple linear regression model is first defined to map the TR-PIV flow fields onto the voltage signal. Due to the high temporal resolution of the signal acquired by the hot-film sensor, the estimates of the TR-PIV flow fields are obtained with a multiple linear regression method called orthonormalized partial least squares regression (OPLSR). Subsequently, this model is incorporated as the observation equation in an ensemble Kalman filter (EnKF) applied on a proper orthogonal decomposition reduced-order model to stabilize it while reducing the effects of the hot-film sensor noise. This method is assessed for the reconstruction of the flow around a NACA0012 airfoil at a Reynolds number of 1000 and an angle of attack of 20°. Comparisons with multi-time delay-modified linear stochastic estimation show that both the OPLSR and EnKF combined with OPLSR are more accurate as they produce a much lower relative estimation error, and provide a faithful reconstruction of the time evolution of the velocity flow fields.
2016-01-01
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075
Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue
2016-01-01
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.
Zhou, Qing-he; Xiao, Wang-pin; Shen, Ying-yan
2014-07-01
The spread of spinal anesthesia is highly unpredictable. In patients with increased abdominal girth and short stature, a greater cephalad spread after a fixed amount of subarachnoidally administered plain bupivacaine is often observed. We hypothesized that there is a strong correlation between abdominal girth/vertebral column length and cephalad spread. Age, weight, height, body mass index, abdominal girth, and vertebral column length were recorded for 114 patients. The L3-L4 interspace was entered, and 3 mL of 0.5% plain bupivacaine was injected into the subarachnoid space. The cephalad spread (loss of temperature sensation and loss of pinprick discrimination) was assessed 30 minutes after intrathecal injection. Linear regression analysis was performed for age, weight, height, body mass index, abdominal girth, vertebral column length, and the spread of spinal anesthesia, and the combined linear contribution of age up to 55 years, weight, height, abdominal girth, and vertebral column length was tested by multiple regression analysis. Linear regression analysis showed that there was a significant univariate correlation among all 6 patient characteristics evaluated and the spread of spinal anesthesia (all P < 0.039) except for age and loss of temperature sensation (P > 0.068). Multiple regression analysis showed that abdominal girth and the vertebral column length were the key determinants for spinal anesthesia spread (both P < 0.0001), whereas age, weight, and height could be omitted without changing the results (all P > 0.059, all 95% confidence limits < 0.372). Multiple regression analysis revealed that the combination of a patient's 5 general characteristics, especially abdominal girth and vertebral column length, had a high predictive value for the spread of spinal anesthesia after a given dose of plain bupivacaine.
Røislien, Jo; Lossius, Hans Morten; Kristiansen, Thomas
2015-01-01
Background: Trauma is a leading global cause of death. Trauma mortality rates are higher in rural areas, constituting a challenge for quality and equality in trauma care. The aim of the study was to explore population density and transport time to hospital care as possible predictors of geographical differences in mortality rates, and to what extent choice of statistical method might affect the analytical results and accompanying clinical conclusions. Methods: Using data from the Norwegian Cause of Death registry, deaths from external causes 1998–2007 were analysed. Norway consists of 434 municipalities, and municipality population density and travel time to hospital care were entered as predictors of municipality mortality rates in univariate and multiple regression models of increasing model complexity. We fitted linear regression models with continuous and categorised predictors, as well as piecewise linear and generalised additive models (GAMs). Models were compared using Akaike's information criterion (AIC). Results: Population density was an independent predictor of trauma mortality rates, while the contribution of transport time to hospital care was highly dependent on choice of statistical model. A multiple GAM or piecewise linear model was superior, and similar, in terms of AIC. However, while transport time was statistically significant in multiple models with piecewise linear or categorised predictors, it was not in GAM or standard linear regression. Conclusions: Population density is an independent predictor of trauma mortality rates. The added explanatory value of transport time to hospital care is marginal and model-dependent, highlighting the importance of exploring several statistical models when studying complex associations in observational data. PMID:25972600
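The kind of AIC comparison described above, between a standard linear model and a piecewise linear one, can be sketched on synthetic data. Everything here is an assumption for illustration: the data are simulated rather than taken from the registry, the breakpoint at 5 is hypothetical, and the AIC is the usual Gaussian least-squares form n log(RSS/n) + 2k, which is defined only up to an additive constant.

```python
import numpy as np

def gaussian_aic(y, X):
    """AIC of an ordinary least squares fit, up to an additive constant."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n, k = X.shape
    return n * np.log(rss / n) + 2 * (k + 1)  # +1 counts the error variance

# Synthetic response with a change of slope at x = 5 (hypothetical breakpoint).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 200)
hinge = np.clip(x - 5.0, 0.0, None)
y = 2.0 - 0.5 * x + 0.8 * hinge + rng.normal(scale=0.2, size=x.size)

X_linear = np.column_stack([np.ones_like(x), x])
X_piecewise = np.column_stack([np.ones_like(x), x, hinge])
aic_linear = gaussian_aic(y, X_linear)
aic_piecewise = gaussian_aic(y, X_piecewise)
print(aic_linear, aic_piecewise)  # the lower value indicates the preferred model
```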
NASA Astrophysics Data System (ADS)
Kuchar, A.; Sacha, P.; Miksovsky, J.; Pisoft, P.
2015-06-01
This study focusses on the variability of temperature, ozone and circulation characteristics in the stratosphere and lower mesosphere with regard to the influence of the 11-year solar cycle. It is based on attribution analysis using multiple nonlinear techniques (support vector regression, neural networks) besides the multiple linear regression approach. The analysis was applied to several current reanalysis data sets for the 1979-2013 period, including MERRA, ERA-Interim and JRA-55, with the aim to compare how these types of data resolve especially the double-peaked solar response in temperature and ozone variables and the consequent changes induced by these anomalies. Equatorial temperature signals in the tropical stratosphere were found to be in qualitative agreement with previous attribution studies, although the agreement with observational results was incomplete, especially for JRA-55. The analysis also pointed to the solar signal in the ozone data sets (i.e. MERRA and ERA-Interim) not being consistent with the observed double-peaked ozone anomaly extracted from satellite measurements. The results obtained by linear regression were confirmed by the nonlinear approach through all data sets, suggesting that linear regression is a relevant tool to sufficiently resolve the solar signal in the middle atmosphere. The seasonal evolution of the solar response was also discussed in terms of dynamical causalities in the winter hemispheres. The hypothetical mechanism of a weaker Brewer-Dobson circulation at solar maxima was reviewed together with a discussion of polar vortex behaviour.
Correlation Weights in Multiple Regression
ERIC Educational Resources Information Center
Waller, Niels G.; Jones, Jeff A.
2010-01-01
A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…
Testing a single regression coefficient in high dimensional linear models
Zhong, Ping-Shou; Li, Runze; Wang, Hansheng; Tsai, Chih-Ling
2017-01-01
In linear regression models with high dimensional data, the classical z-test (or t-test) for testing the significance of each single regression coefficient is no longer applicable. This is mainly because the number of covariates exceeds the sample size. In this paper, we propose a simple and novel alternative by introducing the Correlated Predictors Screening (CPS) method to control for predictors that are highly correlated with the target covariate. Accordingly, the classical ordinary least squares approach can be employed to estimate the regression coefficient associated with the target covariate. In addition, we demonstrate that the resulting estimator is consistent and asymptotically normal even if the random errors are heteroscedastic. This enables us to apply the z-test to assess the significance of each covariate. Based on the p-value obtained from testing the significance of each covariate, we further conduct multiple hypothesis testing by controlling the false discovery rate at the nominal level. Then, we show that the multiple hypothesis testing achieves consistent model selection. Simulation studies and empirical examples are presented to illustrate the finite sample performance and the usefulness of the proposed method, respectively. PMID:28663668
Testing a single regression coefficient in high dimensional linear models.
Lan, Wei; Zhong, Ping-Shou; Li, Runze; Wang, Hansheng; Tsai, Chih-Ling
2016-11-01
In linear regression models with high dimensional data, the classical z -test (or t -test) for testing the significance of each single regression coefficient is no longer applicable. This is mainly because the number of covariates exceeds the sample size. In this paper, we propose a simple and novel alternative by introducing the Correlated Predictors Screening (CPS) method to control for predictors that are highly correlated with the target covariate. Accordingly, the classical ordinary least squares approach can be employed to estimate the regression coefficient associated with the target covariate. In addition, we demonstrate that the resulting estimator is consistent and asymptotically normal even if the random errors are heteroscedastic. This enables us to apply the z -test to assess the significance of each covariate. Based on the p -value obtained from testing the significance of each covariate, we further conduct multiple hypothesis testing by controlling the false discovery rate at the nominal level. Then, we show that the multiple hypothesis testing achieves consistent model selection. Simulation studies and empirical examples are presented to illustrate the finite sample performance and the usefulness of the proposed method, respectively.
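The final step described above, multiple testing with false discovery rate control applied to per-covariate p-values, can be sketched with the Benjamini-Hochberg step-up procedure. This is an assumption: the abstract does not say which FDR procedure the authors use, and the p-values below are made up for illustration.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices of p-values, ascending
    thresholds = q * np.arange(1, m + 1) / m    # step-up thresholds q*i/m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        cutoff = np.max(np.nonzero(passed)[0])  # largest rank meeting its threshold
        reject[order[: cutoff + 1]] = True      # reject everything up to that rank
    return reject

# Hypothetical p-values, one per covariate.
rejected = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.20, 0.74], q=0.05)
print(rejected)
```

With these made-up values only the two smallest p-values fall under their thresholds, so only the first two covariates are flagged.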
Predicting flight delay based on multiple linear regression
NASA Astrophysics Data System (ADS)
Ding, Yi
2017-08-01
Flight delay has been regarded as one of the toughest difficulties in aviation control. Establishing an effective model to handle the delay prediction problem is significant work. To address the difficulty of predicting flight delay, this study proposes a method to model arriving flights and a multiple linear regression algorithm to predict delay, and compares it with the Naive-Bayes and C4.5 approaches. Experiments based on a realistic dataset of domestic airports show that the accuracy of the proposed model is approximately 80%, an improvement over the Naive-Bayes and C4.5 approaches. Testing of the results shows that this method is convenient to calculate and can also predict flight delays effectively. It can provide a decision basis for airport authorities.
Forecasting daily patient volumes in the emergency department.
Jones, Spencer S; Thomas, Alun; Evans, R Scott; Welch, Shari J; Haug, Peter J; Snow, Gregory L
2008-02-01
Shifts in the supply of and demand for emergency department (ED) resources make the efficient allocation of ED resources increasingly important. Forecasting is a vital activity that guides decision-making in many areas of economic, industrial, and scientific planning, but has gained little traction in the health care industry. There are few studies that explore the use of forecasting methods to predict patient volumes in the ED. The goals of this study are to explore and evaluate the use of several statistical forecasting methods to predict daily ED patient volumes at three diverse hospital EDs and to compare the accuracy of these methods to the accuracy of a previously proposed forecasting method. Daily patient arrivals at three hospital EDs were collected for the period January 1, 2005, through March 31, 2007. The authors evaluated the use of seasonal autoregressive integrated moving average, time series regression, exponential smoothing, and artificial neural network models to forecast daily patient volumes at each facility. Forecasts were made for horizons ranging from 1 to 30 days in advance. The forecast accuracy achieved by the various forecasting methods was compared to the forecast accuracy achieved when using a benchmark forecasting method already available in the emergency medicine literature. All time series methods considered in this analysis provided improved in-sample model goodness of fit. However, post-sample analysis revealed that time series regression models that augment linear regression models by accounting for serial autocorrelation offered only small improvements in terms of post-sample forecast accuracy, relative to multiple linear regression models, while seasonal autoregressive integrated moving average, exponential smoothing, and artificial neural network forecasting models did not provide consistently accurate forecasts of daily ED volumes. 
This study confirms the widely held belief that daily demand for ED services is characterized by seasonal and weekly patterns. The authors compared several time series forecasting methods to a benchmark multiple linear regression model. The results suggest that the existing methodology proposed in the literature, multiple linear regression based on calendar variables, is a reasonable approach to forecasting daily patient volumes in the ED. However, the authors conclude that regression-based models that incorporate calendar variables, account for site-specific special-day effects, and allow for residual autocorrelation provide a more appropriate, informative, and consistently accurate approach to forecasting daily ED patient volumes.
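A multiple linear regression on calendar variables, the benchmark approach named above, can be sketched with day-of-week indicator variables. This is a minimal sketch under assumptions: the volumes are synthetic, only day of week is encoded, and the site-specific special-day effects and residual autocorrelation that the authors recommend accounting for are not modelled.

```python
import numpy as np

def calendar_design(day_of_week):
    """Intercept plus six weekday dummies; day 0 is the reference level."""
    dow = np.asarray(day_of_week)
    dummies = (dow[:, None] == np.arange(1, 7)).astype(float)
    return np.column_stack([np.ones(dow.size), dummies])

# Synthetic daily volumes: a base level plus a fixed weekday offset.
days = np.arange(28) % 7
weekday_offsets = np.array([20, 5, 0, -3, 2, -10, -8], dtype=float)
volumes = 100.0 + weekday_offsets[days]

X = calendar_design(days)
beta, *_ = np.linalg.lstsq(X, volumes, rcond=None)
print(beta)  # beta[0] is the mean volume on day 0; the rest are differences from it
```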
Li, Zhenghua; Cheng, Fansheng; Xia, Zhining
2011-01-01
The chemical structures of 114 polycyclic aromatic sulfur heterocycles (PASHs) have been studied by molecular electronegativity-distance vector (MEDV). The linear relationships between gas chromatographic retention index and the MEDV have been established by a multiple linear regression (MLR) model. The results of variable selection by stepwise multiple regression (SMR) and the powerful predictive abilities of the optimization model appraised by leave-one-out cross-validation showed that the optimization model with the correlation coefficient (R) of 0.9947 and the cross-validated correlation coefficient (Rcv) of 0.9940 possessed the best statistical quality. Furthermore, when the 114 PASH compounds were divided into calibration and test sets in the ratio of 2:1, the statistical analysis showed that our models possess almost equal statistical quality, very similar regression coefficients, and good robustness. The quantitative structure-retention relationship (QSRR) model established may provide a convenient and powerful method for predicting the gas chromatographic retention of PASHs.
Modeling Pan Evaporation for Kuwait by Multiple Linear Regression
Almedeij, Jaber
2012-01-01
Evaporation is an important parameter for many projects related to hydrology and water resources systems. This paper constitutes the first study conducted in Kuwait to obtain empirical relations for the estimation of daily and monthly pan evaporation as functions of available meteorological data of temperature, relative humidity, and wind speed. The data used here for the modeling are daily measurements of substantial continuity coverage, within a period of 17 years between January 1993 and December 2009, which can be considered representative of the desert climate of the urban zone of the country. Multiple linear regression technique is used with a procedure of variable selection for fitting the best model forms. The correlations of evaporation with temperature and relative humidity are also transformed in order to linearize the existing curvilinear patterns of the data by using power and exponential functions, respectively. The evaporation models suggested with the best variable combinations were shown to produce results that are in a reasonable agreement with observation values. PMID:23226984
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
Roy, Banibrata; Ripstein, Ira; Perry, Kyle; Cohen, Barry
2016-01-01
To determine whether the pre-medical Grade Point Average (GPA), Medical College Admission Test (MCAT), Internal examinations (Block) and National Board of Medical Examiners (NBME) scores are correlated with and predict the Medical Council of Canada Qualifying Examination Part I (MCCQE-1) scores. Data from 392 admitted students in the graduating classes of 2010-2013 at University of Manitoba (UofM), College of Medicine was considered. Pearson's correlation to assess the strength of the relationship, multiple linear regression to estimate MCCQE-1 score and stepwise linear regression to investigate the amount of variance were employed. Complete data from 367 (94%) students were studied. The MCCQE-1 had a moderate-to-large positive correlation with NBME scores and Block scores but a low correlation with GPA and MCAT scores. The multiple linear regression model gives a good estimate of the MCCQE-1 (R2 = 0.604). Stepwise regression analysis demonstrated that 59.2% of the variation in the MCCQE-1 was accounted for by the NBME, but only 1.9% by the Block exams, and negligible variation came from the GPA and the MCAT. Amongst all the examinations used at UofM, the NBME is most closely correlated with MCCQE-1.
Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru
2017-09-01
Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from mFIM effectiveness predicted by multiple regression analysis had a higher degree of predictive accuracy of mFIM at discharge than that directly predicted. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
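The formula quoted in the abstract can be written out directly. The sketch below assumes the predicted mFIM effectiveness has already been obtained from a regression model; the function name and the example numbers are hypothetical and are not data from the study.

```python
def predict_mfim_at_discharge(mfim_admission, mfim_effectiveness, ceiling=91.0):
    """mFIM at discharge = effectiveness * (91 - mFIM at admission) + mFIM at admission."""
    return mfim_effectiveness * (ceiling - mfim_admission) + mfim_admission

# Hypothetical patient: mFIM of 40 at admission, predicted effectiveness of 0.5.
example = predict_mfim_at_discharge(mfim_admission=40.0, mfim_effectiveness=0.5)
print(example)  # 65.5
```

An effectiveness of 0 returns the admission score unchanged, and an effectiveness of 1 returns the 91-point ceiling, which is why the predicted score stays within the scale whenever the predicted effectiveness lies between 0 and 1.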
Evaluation and prediction of shrub cover in coastal Oregon forests (USA)
Becky K. Kerns; Janet L. Ohmann
2004-01-01
We used data from regional forest inventories and research programs, coupled with mapped climatic and topographic information, to explore relationships and develop multiple linear regression (MLR) and regression tree models for total and deciduous shrub cover in the Oregon coastal province. Results from both types of models indicate that forest structure variables were...
ERIC Educational Resources Information Center
Bloom, Allan M.; And Others
In response to the increasing importance of student performance in required classes, research was conducted to compare two prediction procedures, linear modeling using multiple regression and nonlinear modeling using AID3. Performance in the first college math course (College Mathematics, Calculus, or Business Calculus Matrices) was the dependent…
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Cactus: An Introduction to Regression
ERIC Educational Resources Information Center
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
A New Sample Size Formula for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.; Barcikowski, Robert S.
The focus of this research was to determine the efficacy of a new method of selecting sample sizes for multiple linear regression. A Monte Carlo simulation was used to study both empirical predictive power rates and empirical statistical power rates of the new method and seven other methods: those of C. N. Park and A. L. Dudycha (1974); J. Cohen…
Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan
2012-12-01
A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Fernández-Manso, O.; Fernández-Manso, A.; Quintano, C.
2014-09-01
Aboveground biomass (AGB) estimation from optical satellite data is usually based on regression models of original or synthetic bands. To overcome the poor relation between AGB and spectral bands due to mixed pixels when a medium spatial resolution sensor is considered, we propose to base the AGB estimation on fraction images from Linear Spectral Mixture Analysis (LSMA). Our study area is a managed Mediterranean pine woodland (Pinus pinaster Ait.) in central Spain. A total of 1033 circular field plots were used to estimate AGB from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) optical data. We applied Pearson correlation statistics and stepwise multiple regression to identify suitable predictors from the set of variables of original bands, fraction imagery, Normalized Difference Vegetation Index and Tasselled Cap components. Four linear models and one nonlinear model were tested. A linear combination of ASTER band 2 (red, 0.630-0.690 μm), band 8 (short wave infrared 5, 2.295-2.365 μm) and green vegetation fraction (from LSMA) was the best AGB predictor (adjusted R² = 0.632; cross-validated root-mean-squared error of estimated AGB 13.3 Mg ha⁻¹, or 37.7%), outperforming other combinations of the above-cited independent variables. Results indicated that using ASTER fraction images in regression models improves the AGB estimation in Mediterranean pine forests. The spatial distribution of the estimated AGB, based on a multiple linear regression model, may be used as baseline information for forest managers in future studies, such as quantifying the regional carbon budget, fuel accumulation or monitoring of management practices.
NASA Astrophysics Data System (ADS)
Uca; Toriman, Ekhwan; Jaafar, Othman; Maru, Rosmini; Arfan, Amal; Saleh Ahmar, Ansari
2018-01-01
Prediction of suspended sediment discharge in a catchment area is very important because it can be used to evaluate the erosion hazard, to manage water resources, water quality and hydrology projects (dams, reservoirs, and irrigation), and to determine the extent of the damage that has occurred in the catchment. Multiple linear regression analysis and artificial neural networks can be used to predict the amount of daily suspended sediment discharge. The regression analysis used the least squares method, whereas the artificial neural networks used a Radial Basis Function (RBF) network and a feedforward multilayer perceptron with three learning algorithms, namely Levenberg-Marquardt (LM), Scaled Conjugate Gradient (SCG) and Broyden-Fletcher-Goldfarb-Shanno Quasi-Newton (BFGS). The number of neurons in the hidden layer ranged from three to sixteen, while the output layer had only one neuron because there is only one output target. In terms of the mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R²) and coefficient of efficiency (CE), multiple linear regression (MLRg) Model 2 (6 independent input variables) has the lowest MAE and RMSE (0.0000002 and 13.6039) and the highest R² and CE (0.9971 and 0.9971). Compared with LM, SCG and RBF, the BFGS model with structure 3-7-1 is the better and more accurate predictor of suspended sediment discharge in the Jenderam catchment. Its performance values in the testing process, MAE and RMSE (13.5769 and 17.9011), are the smallest, while R² and CE (0.9999 and 0.9998) are the highest when compared with the other BFGS Quasi-Newton models (6-3-1, 9-10-1 and 12-12-1). Based on the performance statistics, MLRg, LM, SCG, BFGS and RBF are suitable and accurate for prediction by modeling the non-linear complex behavior of suspended sediment responses to rainfall, water depth and discharge. In the comparison between artificial neural network (ANN) and MLRg, MLRg Model 2 accurately predicts suspended sediment discharge (kg/day) in the Jenderam catchment area.
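The four performance statistics named in the abstract above could be computed along the following lines. This is a sketch under stated assumptions: the abstract does not give definitions, so R² is taken here as the squared Pearson correlation and CE as the Nash-Sutcliffe efficiency, and the function name is hypothetical.

```python
import numpy as np


def regression_metrics(observed, predicted):
    """Return MAE, RMSE, R2 and CE for paired observed/predicted values."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = obs - pred
    mae = np.mean(np.abs(err))                 # mean absolute error
    rmse = np.sqrt(np.mean(err ** 2))          # root mean square error
    # Assumption: R2 as squared Pearson correlation of observed vs predicted.
    r2 = np.corrcoef(obs, pred)[0, 1] ** 2
    # Assumption: CE as Nash-Sutcliffe efficiency (1 = perfect, 0 = mean model).
    ce = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "CE": ce}


# Hypothetical check: predictions offset by a constant bias of +1.
stats = regression_metrics([1, 2, 3, 4], [2, 3, 4, 5])
```

Under these definitions a constant bias leaves R² at 1 but lowers CE, which is one reason the two are reported side by side.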
Liquid electrolyte informatics using an exhaustive search with linear regression.
Sodeyama, Keitaro; Igarashi, Yasuhiko; Nakayama, Tomofumi; Tateyama, Yoshitaka; Okada, Masato
2018-06-14
Exploring new liquid electrolyte materials is a fundamental target for developing new high-performance lithium-ion batteries. In contrast to solid materials, disordered liquid solution properties have been less studied by data-driven information techniques. Here, we examined the estimation accuracy and efficiency of three information techniques, multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), and exhaustive search with linear regression (ES-LiR), by using coordination energy and melting point as test liquid properties. We then confirmed that ES-LiR gives the most accurate estimation among the techniques. We also found that ES-LiR can provide the relationship between the "prediction accuracy" and "calculation cost" of the properties via a weight diagram of descriptors. This technique makes it possible to choose the balance between "accuracy" and "cost" when searching a huge number of new materials.
NASA Technical Reports Server (NTRS)
Banse, Karl; Yong, Marina
1990-01-01
As a proxy for satellite CZCS observations and concurrent measurements of primary production rates, data from 138 stations occupied seasonally during 1967-1968 in the offshore eastern tropical Pacific were analyzed in terms of six temporal groups and four current regimes. Multiple linear regressions on column production Pt show that simulated satellite pigment is generally weakly correlated, but sometimes not correlated with Pt, and that incident irradiance, sea surface temperature, nitrate, transparency, and depths of mixed layer or nitracline assume little or no importance. After a proxy for the light-saturated chlorophyll-specific photosynthetic rate P(max) is added, the coefficient of determination ranges from 0.55 to 0.91 (median of 0.85) for the 10 cases. In stepwise multiple linear regressions the P(max) proxy is the best predictor for Pt.
NASA Technical Reports Server (NTRS)
Whitlock, C. H., III
1977-01-01
Constituents with linear radiance gradients with concentration may be quantified from signals which contain nonlinear atmospheric and surface reflection effects for both homogeneous and non-homogeneous water bodies provided accurate data can be obtained and nonlinearities are constant with wavelength. Statistical parameters must be used which give an indication of bias as well as total squared error to insure that an equation with an optimum combination of bands is selected. It is concluded that the effect of error in upwelled radiance measurements is to reduce the accuracy of the least square fitting process and to increase the number of points required to obtain a satisfactory fit. The problem of obtaining a multiple regression equation that is extremely sensitive to error is discussed.
NASA Astrophysics Data System (ADS)
Grotti, Marco; Abelmoschi, Maria Luisa; Soggia, Francesco; Tiberiade, Christian; Frache, Roberto
2000-12-01
The multivariate effects of Na, K, Mg and Ca as nitrates on the electrothermal atomisation of manganese, cadmium and iron were studied by multiple linear regression modelling. Since the models proved to efficiently predict the effects of the considered matrix elements in a wide range of concentrations, they were applied to correct the interferences occurring in the determination of trace elements in seawater after pre-concentration of the analytes. In order to obtain a statistically significant number of samples, a large volume of the certified seawater reference materials CASS-3 and NASS-3 was treated with Chelex-100 resin; then, the chelating resin was separated from the solution, divided into several sub-samples, each of them was eluted with nitric acid and analysed by electrothermal atomic absorption spectrometry (for trace element determinations) and inductively coupled plasma optical emission spectrometry (for matrix element determinations). To minimise any other systematic error besides that due to matrix effects, accuracy of the pre-concentration step and contamination levels of the procedure were checked by inductively coupled plasma mass spectrometric measurements. Analytical results obtained by applying the multiple linear regression models were compared with those obtained with other calibration methods, such as external calibration using acid-based standards, external calibration using matrix-matched standards and the analyte addition technique. Empirical models proved to efficiently reduce interferences occurring in the analysis of real samples, allowing an improvement of accuracy better than for other calibration methods.
Ochi, H; Ikuma, I; Toda, H; Shimada, T; Morioka, S; Moriyama, K
1989-12-01
In order to determine whether isovolumic relaxation period (IRP) reflects left ventricular relaxation under different afterload conditions, 17 anesthetized, open chest dogs were studied, and the left ventricular pressure decay time constant (T) was calculated. In 12 dogs, angiotensin II and nitroprusside were administered, with the heart rate constant at 90 beats/min. Multiple linear regression analysis showed that the aortic dicrotic notch pressure (AoDNP) and T were major determinants of IRP, while left ventricular end-diastolic pressure was a minor determinant. Multiple linear regression analysis, correlating T with IRP and AoDNP, did not further improve the correlation coefficient compared with that between T and IRP. We concluded that correction of the IRP by AoDNP is not necessary to predict T from additional multiple linear regression. The effects of ascending aortic constriction or angiotensin II on IRP were examined in five dogs, after pretreatment with propranolol. Aortic constriction caused a significant decrease in IRP and T, while angiotensin II produced a significant increase in IRP and T. IRP was affected by the change of afterload. However, the IRP and T values were always altered in the same direction. These results demonstrate that IRP is substituted for T and it reflects left ventricular relaxation even in different afterload conditions. We conclude that IRP is a simple parameter easily used to evaluate left ventricular relaxation in clinical situations.
Aspects of porosity prediction using multivariate linear regression
DOE Office of Scientific and Technical Information (OSTI.GOV)
Byrnes, A.P.; Wilson, M.D.
1991-03-01
Highly accurate multiple linear regression models have been developed for sandstones of diverse compositions. Porosity reduction or enhancement processes are controlled by the fundamental variables, Pressure (P), Temperature (T), Time (t), and Composition (X), where composition includes mineralogy, size, sorting, fluid composition, etc. The multiple linear regression equation, of which all linear porosity prediction models are subsets, takes the generalized form: $\mathrm{Porosity} = C_{0} + C_{1}(P) + C_{2}(T) + C_{3}(X) + C_{4}(t) + C_{5}(PT) + C_{6}(PX) + C_{7}(Pt) + C_{8}(TX) + C_{9}(Tt) + C_{10}(Xt) + C_{11}(PTX) + C_{12}(PXt) + C_{13}(PTt) + C_{14}(TXt) + C_{15}(PTXt)$. The first four primary variables are often interactive, thus requiring terms involving two or more primary variables (the form shown implies interaction and not necessarily multiplication). The final terms used may also involve simple mathematical transforms such as $\log X$, $e^{T}$, $X^{2}$, or more complex transformations such as the Time-Temperature Index (TTI). The X term in the equation above represents a suite of compositional variables and, therefore, a fully expanded equation may include a series of terms incorporating these variables. Numerous published bivariate porosity prediction models involving P (or depth) or Tt (TTI) are effective to a degree, largely because of the high degree of collinearity between P and TTI. However, all such bivariate models ignore the unique contributions of P and Tt, as well as various X terms. These simpler models become poor predictors in regions where collinear relations change, where important variables have been ignored, or where the database does not include a sufficient range or weight distribution for the critical variables.
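The sixteen-term generalized form in the abstract above can be illustrated by building a design matrix with one column per term and fitting it by least squares. This is a sketch only: the abstract stresses that its interaction terms are not necessarily products, so modelling them as plain products is an assumption made here, X is treated as a single numeric variable, and the function names and synthetic data are hypothetical.

```python
from itertools import combinations

import numpy as np


def interaction_design_matrix(primary, names=("P", "T", "X", "t")):
    """Columns: intercept, each primary variable, and all 2-, 3-, 4-way products."""
    primary = np.asarray(primary, dtype=float)
    n_obs, n_var = primary.shape
    columns, labels = [np.ones(n_obs)], ["1"]
    for order in range(1, n_var + 1):
        for idx in combinations(range(n_var), order):
            # Assumption: an interaction term is the product of its variables.
            columns.append(np.prod(primary[:, list(idx)], axis=1))
            labels.append("".join(names[i] for i in idx))
    return np.column_stack(columns), labels


def fit_ols(design, y):
    """Ordinary least-squares coefficients for y ~ design."""
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef


# Synthetic, noise-free example: recover known coefficients C_0..C_15.
rng = np.random.default_rng(0)
design, labels = interaction_design_matrix(rng.uniform(-1, 1, size=(200, 4)))
true_coef = np.arange(16, dtype=float)
coef = fit_ols(design, design @ true_coef)
```

With four primary variables this yields 1 + 4 + 6 + 4 + 1 = 16 columns, matching the coefficient count in the quoted equation, although the three-way terms come out in a slightly different order (PTX, PTt, PXt, TXt).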
Schilling, K.E.; Wolter, C.F.
2005-01-01
Nineteen variables, including precipitation, soils and geology, land use, and basin morphologic characteristics, were evaluated to develop Iowa regression models to predict total streamflow (Q), base flow (Qb), storm flow (Qs) and base flow percentage (%Qb) in gauged and ungauged watersheds in the state. Discharge records from a set of 33 watersheds across the state for the 1980 to 2000 period were separated into Qb and Qs. Multiple linear regression found that 75.5 percent of long term average Q was explained by rainfall, sand content, and row crop percentage variables, whereas 88.5 percent of Qb was explained by these three variables plus permeability and floodplain area variables. Qs was explained by average rainfall and %Qb was a function of row crop percentage, permeability, and basin slope variables. Regional regression models developed for long term average Q and Qb were adapted to annual rainfall and showed good correlation between measured and predicted values. Combining the regression model for Q with an estimate of mean annual nitrate concentration, a map of potential nitrate loads in the state was produced. Results from this study have important implications for understanding geomorphic and land use controls on streamflow and base flow in Iowa watersheds and similar agriculture dominated watersheds in the glaciated Midwest. (JAWRA) (Copyright © 2005).
NASA Astrophysics Data System (ADS)
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by a Data Acquisition and Evaluation System. The authors coupled the collected TBM drive data with available information on rock mass properties; the data were then cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were built for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable, and the data were screened with different computational approaches, allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
NASA Technical Reports Server (NTRS)
Lo, Ching F.
1999-01-01
The integration of Radial Basis Function Networks and Back Propagation Neural Networks with Multiple Linear Regression has been accomplished to map nonlinear response surfaces over a wide range of independent variables in the process of the Modern Design of Experiments. The integrated method is capable of estimating precision intervals, including confidence and prediction intervals. The power of the innovative method has been demonstrated by applying it to a set of wind tunnel test data for the construction of a response surface and the estimation of precision intervals.
ERIC Educational Resources Information Center
Sigfusdottir, Inga-Dora; Silver, Eric
2009-01-01
This study examines the effects of negative life events on anger and depressed mood among a sample of 7,758 Icelandic adolescents, measured as part of the National Survey of Icelandic Adolescents (Thorlindsson, Sigfusdottir, Bernburg, & Halldorsson, 1998). Using multiple linear regression and multinomial logit regression, we find that (a)…
NASA Astrophysics Data System (ADS)
Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.
2008-04-01
Data of seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values have been used for statistical analysis. Meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 are analyzed jointly using both methods. For all periods, temperature dependent variables were highly correlated, but were all negatively correlated with relative humidity. Multiple regression analysis was used to fit the meteorological variables using the meteorological variables as predictors. A variable selection method based on high loading of varimax rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model of the meteorological variables. In 1999, 2001 and 2002 one of the meteorological variables was weakly influenced predominantly by the ozone concentrations. However, the model did not predict that the meteorological variables for the year 2000 were not influenced predominantly by the ozone concentrations that point to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.
A Study of the Effect of the Front-End Styling of Sport Utility Vehicles on Pedestrian Head Injuries
Qin, Qin; Chen, Zheng; Bai, Zhonghao; Cao, Libo
2018-01-01
Background: The number of sport utility vehicles (SUVs) on the Chinese market is continuously increasing. It is necessary to investigate the relationships between the front-end styling features of SUVs and head injuries at the styling design stage to improve pedestrian protection performance and product development efficiency. Methods: Styling feature parameters were extracted from the SUV side contour line, and simplified finite element models were established based on 78 SUV side contour lines. Pedestrian headform impact simulations were performed and validated. The head injury criterion of 15 ms (HIC15) at four wrap-around distances was obtained. A multiple linear regression analysis method was employed to describe the relationships between the styling feature parameters and the HIC15 at each impact point. Results: The relationship between the selected styling features and the HIC15 showed reasonable correlations, and the regression models and the selected independent variables showed statistical significance. Conclusions: The regression equations obtained by multiple linear regression can be used to assess the performance of SUV styling in protecting pedestrians' heads and provide styling designers with technical guidance regarding their artistic creations.
Predicting musically induced emotions from physiological inputs: linear and neural network models.
Russo, Frank A; Vempala, Naresh N; Sandstrom, Gillian M
2013-01-01
Listening to music often leads to physiological responses. Do these physiological responses contain sufficient information to infer emotion induced in the listener? The current study explores this question by attempting to predict judgments of "felt" emotion from physiological responses alone using linear and neural network models. We measured five channels of peripheral physiology from 20 participants: heart rate (HR), respiration, galvanic skin response, and activity in corrugator supercilii and zygomaticus major facial muscles. Using valence and arousal (VA) dimensions, participants rated their felt emotion after listening to each of 12 classical music excerpts. After extracting features from the five channels, we examined their correlation with VA ratings, and then performed multiple linear regression to see if a linear relationship between the physiological responses could account for the ratings. Although linear models predicted a significant amount of variance in arousal ratings, they were unable to do so with valence ratings. We then used a neural network to provide a non-linear account of the ratings. The network was trained on the mean ratings of eight of the 12 excerpts and tested on the remainder. Performance of the neural network confirms that physiological responses alone can be used to predict musically induced emotion. The non-linear model derived from the neural network was more accurate than linear models derived from multiple linear regression, particularly along the valence dimension. A secondary analysis allowed us to quantify the relative contributions of inputs to the non-linear model. The study represents a novel approach to understanding the complex relationship between physiological responses and musically induced emotion.
Building "e-rater"® Scoring Models Using Machine Learning Methods. Research Report. ETS RR-16-04
ERIC Educational Resources Information Center
Chen, Jing; Fife, James H.; Bejar, Isaac I.; Rupp, André A.
2016-01-01
The "e-rater"® automated scoring engine used at Educational Testing Service (ETS) scores the writing quality of essays. In the current practice, e-rater scores are generated via a multiple linear regression (MLR) model as a linear combination of various features evaluated for each essay and human scores as the outcome variable. This…
ERIC Educational Resources Information Center
Si, Yajuan; Reiter, Jerome P.
2013-01-01
In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian,…
Female Literacy Rate is a Better Predictor of Birth Rate and Infant Mortality Rate in India.
Saurabh, Suman; Sarkar, Sonali; Pandey, Dhruv K
2013-01-01
Educated women are known to take informed reproductive and healthcare decisions. These result in population stabilization and better infant care reflected by lower birth rates and infant mortality rates (IMRs), respectively. Our objective was to study the relationship of male and female literacy rates with crude birth rates (CBRs) and IMRs of the states and union territories (UTs) of India. The data were analyzed using linear regression. CBR and IMR were taken as the dependent variables; while the overall literacy rates, male, and female literacy rates were the independent variables. CBRs were inversely related to literacy rates (slope parameter = -0.402, P < 0.001). On multiple linear regression with male and female literacy rates, a significant inverse relationship emerged between female literacy rate and CBR (slope = -0.363, P < 0.001), while male literacy rate was not significantly related to CBR (P = 0.674). IMR of the states were also inversely related to their literacy rates (slope = -1.254, P < 0.001). Multiple linear regression revealed a significant inverse relationship between IMR and female literacy (slope = -0.816, P = 0.031), whereas male literacy rate was not significantly related (P = 0.630). Female literacy is relatively highly important for both population stabilization and better infant health.
NASA Technical Reports Server (NTRS)
Smith, Timothy D.; Steffen, Christopher J., Jr.; Yungster, Shaye; Keller, Dennis J.
1998-01-01
The all rocket mode of operation is shown to be a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. An axisymmetric RBCC engine was used to determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and multiple linear regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis, using both the Spalart-Allmaras and k-omega turbulence models, was performed with the NPARC code to obtain values of vacuum specific impulse. Results from the multiple linear regression analysis showed that for both the full flow and gas generator configurations increasing mixer-ejector area ratio and rocket area ratio increase performance, while increasing mixer-ejector inlet area ratio and mixer-ejector length-to-diameter ratio decrease performance. Increasing injected secondary flow increased performance for the gas generator analysis, but was not statistically significant for the full flow analysis. Chamber pressure was found to be not statistically significant.
Multiple imputation for cure rate quantile regression with censored data.
Wu, Yuanshan; Yin, Guosheng
2017-03-01
The main challenge in the context of cure rate analysis is that one never knows whether censored subjects are cured or uncured, or whether they are susceptible or insusceptible to the event of interest. Considering the susceptible indicator as missing data, we propose a multiple imputation approach to cure rate quantile regression for censored data with a survival fraction. We develop an iterative algorithm to estimate the conditionally uncured probability for each subject. By utilizing this estimated probability and Bernoulli sample imputation, we can classify each subject as cured or uncured, and then employ the locally weighted method to estimate the quantile regression coefficients with only the uncured subjects. Repeating the imputation procedure multiple times and taking an average over the resultant estimators, we obtain consistent estimators for the quantile regression coefficients. Our approach relaxes the usual global linearity assumption, so that we can apply quantile regression to any particular quantile of interest. We establish asymptotic properties for the proposed estimators, including both consistency and asymptotic normality. We conduct simulation studies to assess the finite-sample performance of the proposed multiple imputation method and apply it to a lung cancer study as an illustration. © 2016, The International Biometric Society.
do Prado, Mara Rúbia Maciel Cardoso; Oliveira, Fabiana de Cássia Carvalho; Assis, Karine Franklin; Ribeiro, Sarah Aparecida Vieira; do Prado, Pedro Paulo; Sant'Ana, Luciana Ferreira da Rocha; Priore, Silvia Eloiza; Franceschini, Sylvia do Carmo Castro
2015-01-01
Objective: To assess the prevalence of vitamin D deficiency and its associated factors in women and their newborns in the postpartum period. Methods: This cross-sectional study evaluated vitamin D deficiency/insufficiency in 226 women and their newborns in Viçosa (Minas Gerais, BR) between December 2011 and November 2012. Cord blood and venous maternal blood were collected to evaluate the following biochemical parameters: vitamin D, alkaline phosphatase, calcium, phosphorus and parathyroid hormone. Poisson regression analysis, with a confidence interval of 95%, was applied to assess vitamin D deficiency and its associated factors. Multiple linear regression analysis was performed to identify factors associated with 25(OH)D deficiency in the newborns and women from the study. The criterion for variable inclusion in the multiple linear regression model was association with the dependent variable in the simple linear regression analysis, considering p < 0.20. The significance level was α < 5%. Results: Of the 226 women included, 200 (88.5%) were 20-44 years old; the median age was 28 years. Deficient/insufficient levels of vitamin D were found in 192 (85%) women and in 182 (80.5%) neonates. The maternal 25(OH)D and alkaline phosphatase levels were independently associated with vitamin D deficiency in infants. Conclusions: This study identified a high prevalence of vitamin D deficiency and insufficiency in women and newborns and the association between maternal nutritional status of vitamin D and their infants' vitamin D status. PMID:26100593
Estimation of standard liver volume in Chinese adult living donors.
Fu-Gui, L; Lu-Nan, Y; Bo, L; Yong, Z; Tian-Fu, W; Ming-Qing, X; Wen-Tao, W; Zhe-Yu, C
2009-12-01
To determine a formula predicting the standard liver volume based on body surface area (BSA) or body weight in Chinese adults. A total of 115 consecutive right-lobe living donors not including the middle hepatic vein underwent right hemi-hepatectomy. No organs were used from prisoners, and no subjects were prisoners. Donor anthropometric data including age, gender, body weight, and body height were recorded prospectively. The weights and volumes of the right lobe liver grafts were measured at the back table. Liver weights and volumes were calculated from the right lobe graft weight and volume obtained at the back table, divided by the proportion of the right lobe on computed tomography. By simple linear regression analysis and stepwise multiple linear regression analysis, we correlated calculated liver volume and body height, body weight, or body surface area. The subjects had a mean age of 35.97 +/- 9.6 years, and a female-to-male ratio of 60:55. The mean volume of the right lobe was 727.47 +/- 136.17 mL, occupying 55.59% +/- 6.70% of the whole liver by computed tomography. The volume of the right lobe was 581.73 +/- 96.137 mL, and the estimated liver volume was 1053.08 +/- 167.56 mL. Females of the same body weight showed a slightly lower liver weight. By simple linear regression analysis and stepwise multiple linear regression analysis, a formula was derived based on body weight. All formulae except the Hong Kong formula overestimated liver volume compared to this formula. The formula of standard liver volume, SLV (mL) = 11.508 x body weight (kg) + 334.024, may be applied to estimate liver volumes in Chinese adults.
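The body-weight formula reported above can be applied directly. The following is a minimal Python sketch; the function name and the 60 kg example weight are illustrative assumptions, not taken from the study.

```python
def standard_liver_volume_ml(body_weight_kg: float) -> float:
    """Estimate standard liver volume (mL) from body weight (kg).

    Uses the coefficients reported in the abstract:
    SLV (mL) = 11.508 x body weight (kg) + 334.024
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return 11.508 * body_weight_kg + 334.024


# Hypothetical 60 kg donor (illustrative input, not a study subject)
slv = standard_liver_volume_ml(60.0)
print(f"Estimated SLV: {slv:.1f} mL")  # prints: Estimated SLV: 1024.5 mL
```

The abstract derives the formula from Chinese adult living donors, so the sketch should be read as an arithmetic illustration only.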
Koper, Olga Martyna; Kamińska, Joanna; Milewska, Anna; Sawicki, Karol; Mariak, Zenon; Kemona, Halina; Matowicka-Karna, Joanna
2018-05-18
The influence of isoform A of reticulon-4 (Nogo-A), also known as neurite outgrowth inhibitor, on primary brain tumor development was reported. Therefore the aim was the evaluation of Nogo-A concentrations in cerebrospinal fluid (CSF) and serum of brain tumor patients compared with non-tumoral individuals. All serum results, except for two cases, obtained both in brain tumors and non-tumoral individuals, were below the lower limit of ELISA detection. Cerebrospinal fluid Nogo-A concentrations were significantly lower in primary brain tumor patients compared to non-tumoral individuals. The univariate linear regression analysis found that if white blood cell count increases by 1 × 10^3/μL, the mean cerebrospinal fluid Nogo-A concentration value decreases 1.12 times. In the model of multiple linear regression analysis predictor variables influencing cerebrospinal fluid Nogo-A concentrations included: diagnosis, sex, and sodium level. The mean cerebrospinal fluid Nogo-A concentration value was 1.9 times higher for women in comparison to men. In the astrocytic brain tumor group higher sodium level occurs with lower cerebrospinal fluid Nogo-A concentrations. We found the opposite situation in non-tumoral individuals. Univariate linear regression analysis revealed that cerebrospinal fluid Nogo-A concentrations change in relation to white blood cell count. In the created model of multiple linear regression analysis we found that the predictor variables influencing CSF Nogo-A concentrations were diagnosis, sex, and sodium level. Results may be relevant to the search for cerebrospinal fluid biomarkers and potential therapeutic targets in primary brain tumor patients. Nogo-A concentrations were tested by means of enzyme-linked immunosorbent assay (ELISA).
NASA Astrophysics Data System (ADS)
Singh, S.; Jaishi, H. P.; Tiwari, R. P.; Tiwari, R. C.
2017-07-01
This paper reports the analysis of soil radon data recorded in the seismic zone-V, located in the northeastern part of India (latitude 23.73N, longitude 92.73E). Continuous measurements of soil-gas emission along Chite fault in Mizoram (India) were carried out with the replacement of solid-state nuclear track detectors at weekly intervals. The present study was done for the period from March 2013 to May 2015 using LR-115 Type II detectors, manufactured by Kodak Pathe, France. In order to reduce the influence of meteorological parameters, statistical analysis tools such as multiple linear regression and artificial neural network have been used. A decrease in radon concentration was recorded prior to some earthquakes that occurred during the observation period. Some false anomalies were also recorded, which may be attributed to ongoing crustal deformation that was not major enough to produce an earthquake.
NASA Astrophysics Data System (ADS)
Kiss, I.; Cioată, V. G.; Ratiu, S. A.; Rackov, M.; Penčić, M.
2018-01-01
Multivariate research is important in cast-iron brake shoe manufacturing because many variables interact simultaneously. This article focuses on a multiple linear regression model relating the hardness of phosphorous cast irons intended for brake shoes to their chemical composition, given that the regression coefficients illustrate the separate contribution of each independent variable to predicting the dependent variable. To establish the multiple correlations between the hardness of the cast-iron brake shoes and their chemical composition, several regression equations have been proposed. A mathematical solution is sought that can determine the optimum chemical composition for the desired hardness values. On this basis, two new statistical experiments are carried out on the values of phosphorus [P], manganese [Mn] and silicon [Si]. The regression equations describing the mathematical dependency between these elements and the hardness are then determined. As a result, several correlation charts are presented.
Specific factors for prenatal lead exposure in the border area of China.
Kawata, Kimiko; Li, Yan; Liu, Hao; Zhang, Xiao Qin; Ushijima, Hiroshi
2006-07-01
The objectives of this study are to examine the prevalence of increased blood lead concentrations in mothers and their umbilical cords, and to identify risk factors for prenatal lead exposure in Kunming city, Yunnan province, China. The study was conducted at two obstetrics departments, and 100 peripartum women were enrolled. The mean blood lead concentrations of the mothers and the umbilical cords were 67.3 μg/l and 53.1 μg/l, respectively. In multiple linear regression analysis, maternal occupational exposure, maternal consumption of homemade dehydrated vegetables and maternal habitation period in Kunming city were significantly associated with an increase of umbilical cord blood lead concentration. In addition, logistic regression analysis was used to assess the association of umbilical cord blood lead concentrations that possibly have adverse effects on brain development of newborns with each potential risk factor. Maternal frequent use of tableware with color patterns inside was significantly associated with higher cord blood lead concentration in addition to the three items in the multiple linear regression analysis. These points should be considered as specific recommendations for maternal and fetal lead exposure in this city.
1990-09-01
without the help from the DSXR staff. William Lyons, Charles Ramsey, and Martin Meeks went above and beyond to help complete this research. Special...develop a valid forecasting model that is significantly more accurate than the one presently used by DSXR and suggested the development and testing of a...method, Strom tested DSXR’s iterative linear regression forecasting technique by examining P1 in the simple regression equation to determine whether
Andrew T. Hudak; Nicholas L. Crookston; Jeffrey S. Evans; Michael K. Falkowski; Alistair M. S. Smith; Paul E. Gessler; Penelope Morgan
2006-01-01
We compared the utility of discrete-return light detection and ranging (lidar) data and multispectral satellite imagery, and their integration, for modeling and mapping basal area and tree density across two diverse coniferous forest landscapes in north-central Idaho. We applied multiple linear regression models subset from a suite of 26 predictor variables derived...
Use of AMMI and linear regression models to analyze genotype-environment interaction in durum wheat.
Nachit, M M; Nachit, G; Ketata, H; Gauch, H G; Zobel, R W
1992-03-01
The joint durum wheat (Triticum turgidum L var 'durum') breeding program of the International Maize and Wheat Improvement Center (CIMMYT) and the International Center for Agricultural Research in the Dry Areas (ICARDA) for the Mediterranean region employs extensive multilocation testing. Multilocation testing produces significant genotype-environment (GE) interaction that reduces the accuracy for estimating yield and selecting appropriate germ plasm. The sum of squares (SS) of GE interaction was partitioned by linear regression techniques into joint, genotypic, and environmental regressions, and by Additive Main effects and the Multiplicative Interactions (AMMI) model into five significant Interaction Principal Component Axes (IPCA). The AMMI model was more effective in partitioning the interaction SS than the linear regression technique. The SS contained in the AMMI model was 6 times higher than the SS for all three regressions. Postdictive assessment recommended the use of the first five IPCA axes, while predictive assessment AMMI1 (main effects plus IPCA1). After elimination of random variation, AMMI1 estimates for genotypic yields within sites were more precise than unadjusted means. This increased precision was equivalent to increasing the number of replications by a factor of 3.7.
Azadi, Sama; Karimi-Jashni, Ayoub
2016-02-01
Predicting the mass of solid waste generation plays an important role in integrated solid waste management plans. In this study, the performance of two predictive models, Artificial Neural Network (ANN) and Multiple Linear Regression (MLR) was verified to predict mean Seasonal Municipal Solid Waste Generation (SMSWG) rate. The accuracy of the proposed models is illustrated through a case study of 20 cities located in Fars Province, Iran. Four performance measures, MAE, MAPE, RMSE and R were used to evaluate the performance of these models. The MLR, as a conventional model, showed poor prediction performance. On the other hand, the results indicated that the ANN model, as a non-linear model, has a higher predictive accuracy when it comes to prediction of the mean SMSWG rate. As a result, in order to develop a more cost-effective strategy for waste management in the future, the ANN model could be used to predict the mean SMSWG rate. Copyright © 2015 Elsevier Ltd. All rights reserved.
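The four performance measures named above can be computed from paired observed and predicted values. The sketch below is illustrative only: the abstract does not define R, so it is assumed here to be the Pearson correlation coefficient, and the sample numbers are made up rather than taken from the Fars Province data.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, MAPE (%), RMSE and Pearson R for paired values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined if any observed value is zero
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation between observed and predicted values
    # (assumed meaning of "R"; undefined if either series is constant)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R": r}


# Made-up example values, for illustration only
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
for name, value in regression_metrics(observed, predicted).items():
    print(f"{name}: {value:.3f}")
```

Lower MAE, MAPE and RMSE and a higher R indicate better agreement, which is the sense in which the abstract ranks the ANN above the MLR model.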
NASA Astrophysics Data System (ADS)
Bonelli, Maria Grazia; Ferrini, Mauro; Manni, Andrea
2016-12-01
Assessing the contamination of agricultural soils by metals and organic micropollutants is difficult because of the extensive areas involved and the very large number of samples that must be collected and analyzed. For the measurement of dioxins and dioxin-like PCBs and the subsequent treatment of data, the European Community advises the development of low-cost and fast methods allowing routine analysis of a great number of samples, providing rapid measurement of these compounds in the environment, feeds and food. The aim of the present work has been to find a method suitable to describe the relations occurring between organic and inorganic contaminants and to use the value of the latter in order to forecast the former. In practice, the use of a portable metal soil analyzer coupled with an efficient statistical procedure enables the required objective to be achieved. Compared to Multiple Linear Regression, the Artificial Neural Networks technique proved to be an excellent forecasting method, even though there is no linear correlation between the variables to be analyzed.
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2014-06-01
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression.
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2014-01-01
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression. PMID:25598560
NASA Technical Reports Server (NTRS)
Stolzer, Alan J.; Halford, Carl
2007-01-01
In a previous study, multiple regression techniques were applied to Flight Operations Quality Assurance-derived data to develop parsimonious model(s) for fuel consumption on the Boeing 757 airplane. The present study examined several data mining algorithms, including neural networks, on the fuel consumption problem and compared them to the multiple regression results obtained earlier. Using regression methods, parsimonious models were obtained that explained approximately 85% of the variation in fuel flow. In general data mining methods were more effective in predicting fuel consumption. Classification and Regression Tree methods reported correlation coefficients of .91 to .92, and General Linear Models and Multilayer Perceptron neural networks reported correlation coefficients of about .99. These data mining models show great promise for use in further examining large FOQA databases for operational and safety improvements.
Athanasopoulos, Leonidas V; Dritsas, Athanasios; Doll, Helen A; Cokkinos, Dennis V
2010-08-01
This study was conducted to explain the variance in quality of life (QoL) and activity capacity of patients with congestive heart failure from pathophysiological changes as estimated by laboratory data. Peak oxygen consumption (peak VO2) and ventilation (VE)/carbon dioxide output (VCO2) slope derived from cardiopulmonary exercise testing, plasma N-terminal prohormone of B-type natriuretic peptide (NT-proBNP), and echocardiographic markers [left atrium (LA), left ventricular ejection fraction (LVEF)] were measured in 62 patients with congestive heart failure, who also completed the Minnesota Living with Heart Failure Questionnaire and the Specific Activity Questionnaire. All regression models were adjusted for age and sex. On linear regression analysis, peak VO2 with P value less than 0.001, VE/VCO2 slope with P value less than 0.01, LVEF with P value less than 0.001, LA with P=0.001, and logNT-proBNP with P value less than 0.01 were found to be associated with QoL. On stepwise multiple linear regression, peak VO2 and LVEF continued to be predictive, accounting for 40% of the variability in Minnesota Living with Heart Failure Questionnaire score. On linear regression analysis, peak VO2 with P value less than 0.001, VE/VCO2 slope with P value less than 0.001, LVEF with P value less than 0.05, LA with P value less than 0.001, and logNT-proBNP with P value less than 0.001 were found to be associated with activity capacity. On stepwise multiple linear regression, peak VO2 and LA continued to be predictive, accounting for 53% of the variability in Specific Activity Questionnaire score. Peak VO2 is independently associated both with QoL and activity capacity. In addition to peak VO2, LVEF is independently associated with QoL, and LA with activity capacity.
Balabin, Roman M; Smirnov, Sergey V
2011-04-29
During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm⁻¹) of a sample is typically measured by modern instruments at a few hundred wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented. The results of applying other spectroscopic techniques, such as Raman, ultraviolet-visible (UV-vis), or nuclear magnetic resonance (NMR) spectroscopy, can be greatly improved by an appropriate choice of feature selection method. Copyright © 2011 Elsevier B.V. All rights reserved.
Method and Excel VBA Algorithm for Modeling Master Recession Curve Using Trigonometry Approach.
Posavec, Kristijan; Giacopetti, Marco; Materazzi, Marco; Birk, Steffen
2017-11-01
A new method was developed and implemented into an Excel Visual Basic for Applications (VBA) algorithm utilizing trigonometry laws in an innovative way to overlap recession segments of time series and create master recession curves (MRCs). Based on a trigonometry approach, the algorithm horizontally translates succeeding recession segments of time series, placing their vertex, that is, the highest recorded value of each recession segment, directly onto the appropriate connection line defined by measurement points of a preceding recession segment. The new method and algorithm continue the development of methods and algorithms for the generation of MRCs, where the first published method was based on a multiple linear/nonlinear regression model approach (Posavec et al. 2006). The newly developed trigonometry-based method was tested on real case study examples and compared with the previously published multiple linear/nonlinear regression model-based method. The results show that in some cases, that is, for some time series, the trigonometry-based method creates narrower overlaps of the recession segments, resulting in higher coefficients of determination R^2, while in other cases the multiple linear/nonlinear regression model-based method remains superior. The Excel VBA algorithm for modeling MRC using the trigonometry approach is implemented into a spreadsheet tool (MRCTools v3.0 written by and available from Kristijan Posavec, Zagreb, Croatia) containing the previously published VBA algorithms for MRC generation and separation. All algorithms within the MRCTools v3.0 are open access and available free of charge, supporting the idea of running science on available, open, and free of charge software. © 2017, National Ground Water Association.
Impact of divorce on the quality of life in school-age children.
Eymann, Alfredo; Busaniche, Julio; Llera, Julián; De Cunto, Carmen; Wahren, Carlos
2009-01-01
To assess psychosocial quality of life in school-age children of divorced parents. A cross-sectional survey was conducted at the pediatric outpatient clinic of a community hospital. Children 5 to 12 years old from married families and divorced families were included. Child quality of life was assessed through maternal reports using a Child Health Questionnaire-Parent Form 50. A multiple linear regression model was constructed including clinically relevant variables significant on univariate analysis (beta coefficient and 95%CI). Three hundred and thirty families were invited to participate and 313 completed the questionnaire. Univariate analysis showed that quality of life was significantly associated with parental separation, child sex, time spent with the father, standard of living, and maternal education. In a multiple linear regression model, quality of life scores decreased in boys -4.5 (-6.8 to -2.3) and increased for time spent with the father 0.09 (0.01 to 0.2). In divorced families, multiple linear regression showed that quality of life scores increased when parents had separated by mutual agreement 6.1 (2.7 to 9.4), when the mother had university level education 5.9 (1.7 to 10.1) and for each year elapsed since separation 0.6 (0.2 to 1.1), whereas scores decreased in boys -5.4 (-9.5 to -1.3) and for each one-year increment of maternal age -0.4 (-0.7 to -0.05). Children's psychosocial quality of life was affected by divorce. The Child Health Questionnaire can be useful to detect a decline in the psychosocial quality of life.
Practical Session: Multiple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Three exercises are proposed to illustrate the simple linear regression. The first one investigates the influence of several factors on atmospheric pollution. It has been proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data coming from 20 U.S. cities. Exercise 2 is an introduction to model selection whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 have been proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).
Ergonomics study on mobile phones for thumb physiology discomfort
NASA Astrophysics Data System (ADS)
Bendero, J. M. S.; Doon, M. E. R.; Quiogue, K. C. A.; Soneja, L. C.; Ong, N. R.; Sauli, Z.; Vairavan, R.
2017-09-01
The study was conducted on Filipino undergraduate college students and aimed to identify the significant factors associated with mobile phone usage and their effect on thumb pain. Correlation-prediction analysis and Multiple Linear Regression were adopted as the main tools for determining the significant factors and developing predictive models of thumb-related pain. Using the Statistical Package for the Social Sciences (SPSS) software to conduct linear regression, 2 significant factors for thumb-related pain (percentage of time using portrait as screen orientation when text messaging, amount of time playing games using one hand in a day) were found.
A new linear least squares method for T1 estimation from SPGR signals with multiple TRs
NASA Astrophysics Data System (ADS)
Chang, Lin-Ching; Koay, Cheng Guan; Basser, Peter J.; Pierpaoli, Carlo
2009-02-01
The longitudinal relaxation time, T1, can be estimated from two or more spoiled gradient recalled echo (SPGR) images with two or more flip angles and one or more repetition times (TRs). The function relating signal intensity and the parameters is nonlinear; T1 maps can be computed from SPGR signals using nonlinear least squares regression. A widely-used linear method transforms the nonlinear model by assuming a fixed TR in SPGR images. This constraint is not desirable since multiple TRs are a clinically practical way to reduce the total acquisition time, to satisfy the required resolution, and/or to combine SPGR data acquired at different times. A new linear least squares method is proposed using the first order Taylor expansion. Monte Carlo simulations of SPGR experiments are used to evaluate the accuracy and precision of the estimated T1 from the proposed linear and the nonlinear methods. We show that the new linear least squares method provides T1 estimates comparable in both precision and accuracy to those from the nonlinear method, allowing multiple TRs and reducing computation time significantly.
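For context, the conventional fixed-TR linearization that the abstract contrasts with can be sketched as follows. This is not the Taylor-expansion method proposed in the paper. It assumes the standard SPGR signal model S = M0 sin(a) (1 - E1) / (1 - E1 cos(a)) with E1 = exp(-TR/T1), which rearranges to S/sin(a) = E1 * S/tan(a) + M0 (1 - E1), so that a straight-line fit yields E1 as the slope. Function names and the noise-free test values are illustrative assumptions.

```python
import math


def spgr_signal(m0, t1, tr, flip_deg):
    """Ideal (noise-free) SPGR signal for one flip angle; t1 and tr share units."""
    e1 = math.exp(-tr / t1)
    a = math.radians(flip_deg)
    return m0 * math.sin(a) * (1.0 - e1) / (1.0 - e1 * math.cos(a))


def fit_t1_fixed_tr(signals, flip_degs, tr):
    """Estimate (T1, M0) from SPGR signals acquired at a single, fixed TR.

    Fits y = slope * x + intercept with x = S/tan(a) and y = S/sin(a);
    then slope = E1 and intercept = M0 * (1 - E1).
    """
    xs = [s / math.tan(math.radians(a)) for s, a in zip(signals, flip_degs)]
    ys = [s / math.sin(math.radians(a)) for s, a in zip(signals, flip_degs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least squares for a straight line
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    t1 = -tr / math.log(slope)
    m0 = intercept / (1.0 - slope)
    return t1, m0


# Illustrative noise-free data: T1 = 1000 ms, M0 = 1000, TR = 15 ms
flips = [3.0, 10.0, 20.0]
signals = [spgr_signal(1000.0, 1000.0, 15.0, a) for a in flips]
t1_est, m0_est = fit_t1_fixed_tr(signals, flips, 15.0)
print(f"T1 = {t1_est:.1f} ms, M0 = {m0_est:.1f}")
```

Because the slope is tied to a single TR, this form cannot pool signals acquired at different TRs, which is the limitation the abstract's new method is meant to remove.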
ESTER HYDROLYSIS RATE CONSTANT PREDICTION FROM INFRARED INTERFEROGRAMS
A method for predicting reactivity parameters of organic chemicals from spectroscopic data is being developed to assist in assessing the environmental fate of pollutants. The prototype system, which employs multiple linear regression analysis using selected points from the Fourier...
ERIC Educational Resources Information Center
Games, Paul A.
1975-01-01
A brief introduction is presented on how multiple regression and linear model techniques can handle data analysis situations that most educators and psychologists think of as appropriate for analysis of variance. (Author/BJG)
NASA Astrophysics Data System (ADS)
Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried
2018-03-01
This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used were for a period of 36 years between 1980 and 2015. Similar to the observed decrease (P < 0.001) in rice yield, pan evaporation, solar radiation, and wind speed declined significantly. Eight principal components exhibited an eigenvalue > 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the PC1 scores are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods, thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimize planting dates for improved radiation use efficiency in the study area.
He, Dan; Kuhn, David; Parida, Laxmi
2016-06-15
Given a set of biallelic molecular markers, such as SNPs, with genotype values encoded numerically on a collection of plant, animal or human samples, the goal of genetic trait prediction is to predict the quantitative trait values by simultaneously modeling all marker effects. Genetic trait prediction is usually represented as linear regression models. In many cases, for the same set of samples and markers, multiple traits are observed. Some of these traits might be correlated with each other. Therefore, modeling all the multiple traits together may improve the prediction accuracy. In this work, we view the multitrait prediction problem from a machine learning angle: as either a multitask learning problem or a multiple output regression problem, depending on whether different traits share the same genotype matrix or not. We then adapted multitask learning algorithms and multiple output regression algorithms to solve the multitrait prediction problem. We proposed a few strategies to improve the least square error of the prediction from these algorithms. Our experiments show that modeling multiple traits together could improve the prediction accuracy for correlated traits. The programs we used are either public or directly from the referred authors, such as MALSAR (http://www.public.asu.edu/~jye02/Software/MALSAR/) package. The Avocado data set has not been published yet and is available upon request. dhe@us.ibm.com. © The Author 2016. Published by Oxford University Press.
Understanding logistic regression analysis.
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
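The link between a fitted logistic regression coefficient and an odds ratio can be illustrated with a short sketch. It assumes the usual model in which the log-odds of the event are a linear function of the explanatory variables, so the odds ratio for a one-unit increase in a variable is the exponential of its coefficient. The function names and coefficient values below are hypothetical and not taken from the article.

```python
import math


def odds_ratio(coefficient, delta=1.0):
    """Odds ratio for a change of `delta` units in one explanatory variable."""
    return math.exp(coefficient * delta)


def predicted_probability(intercept, coefficients, values):
    """Event probability from the linear predictor (log-odds)."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical fitted model with two explanatory variables
intercept = -1.0
coefficients = [0.7, -0.2]

for name, b in zip(["x1", "x2"], coefficients):
    print(f"{name}: odds ratio per unit increase = {odds_ratio(b):.2f}")

print(f"P(event | x1=1, x2=0) = {predicted_probability(intercept, coefficients, [1.0, 0.0]):.3f}")
```

An odds ratio above 1 indicates that the odds of the event rise as the variable increases, holding the other variables fixed, which is the adjustment for confounding the abstract refers to.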
Depressive disorder in pregnant Latin women: does intimate partner violence matter?
Fonseca-Machado, Mariana de Oliveira; Alves, Lisiane Camargo; Monteiro, Juliana Cristina Dos Santos; Stefanello, Juliana; Nakano, Ana Márcia Spanó; Haas, Vanderlei José; Gomes-Sponholz, Flávia
2015-05-01
To identify the association of antenatal depressive symptoms with intimate partner violence during the current pregnancy in Brazilian women. Intimate partner violence is an important risk factor for antenatal depression. To the authors' knowledge, there has been no study to date that assessed the association between intimate partner violence during pregnancy and antenatal depressive symptoms among Brazilian women. Cross-sectional study. Three hundred and fifty-eight pregnant women were enrolled in the study. The Edinburgh Postnatal Depression Scale and an adapted version of the instrument used in the World Health Organization Multi-country Study on Women's Health and Domestic Violence were used to measure antenatal depressive symptoms and psychological, physical and sexual acts of intimate partner violence during the current pregnancy, respectively. Multiple logistic regression and multiple linear regression were used for data analysis. The prevalence of antenatal depressive symptoms, as determined by the cut-off score of 12 in the Edinburgh Postnatal Depression Scale, was 28·2% (101). Of the participants, 63 (17·6%) reported some type of intimate partner violence during pregnancy. Among them, 60 (95·2%) reported suffering psychological violence, 23 (36·5%) physical violence and one (1·6%) sexual violence. Multiple logistic regression and multiple linear regression indicated that antenatal depressive symptoms are extremely associated with intimate partner violence during pregnancy. Among Brazilian women, exposure to intimate partner violence during pregnancy increases the chances of experiencing antenatal depressive symptoms. Clinical nurses and nurse midwives should pay attention to the particularities of Brazilian women, especially with regard to the occurrence of intimate partner violence, whose impacts on the mental health of this population are extremely significant, both during the gestational period and postpartum. © 2015 John Wiley & Sons Ltd.
Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija
2018-01-01
The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrPC) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Osborne, Jason W.
2013-01-01
Osborne and Waters (2002) focused on checking some of the assumptions of multiple linear regression. In a critique of that paper, Williams, Grajales, and Kurkiewicz correctly clarify that regression models estimated using ordinary least squares require the assumption of normally distributed errors, but not the assumption of normally distributed…
Changes in aerobic power of men, ages 25-70 yr
NASA Technical Reports Server (NTRS)
Jackson, A. S.; Beard, E. F.; Wier, L. T.; Ross, R. M.; Stuteville, J. E.; Blair, S. N.
1995-01-01
This study quantified and compared the cross-sectional and longitudinal influence of age, self-report physical activity (SR-PA), and body composition (%fat) on the decline of maximal aerobic power (VO2peak). The cross-sectional sample consisted of 1,499 healthy men ages 25-70 yr. The 156 men of the longitudinal sample were from the same population and were examined twice; the mean time between tests was 4.1 (+/- 1.2) yr. Peak oxygen uptake was determined by indirect calorimetry during a maximal treadmill exercise test. The zero-order correlations between VO2peak and %fat (r = -0.62) and SR-PA (r = 0.58) were significantly (P < 0.05) higher than the age correlation (r = -0.45). Linear regression defined the cross-sectional age-related decline in VO2peak at 0.46 ml.kg-1.min-1.yr-1. Multiple regression analysis (R = 0.79) showed that nearly 50% of this cross-sectional decline was due to %fat and SR-PA; adding these lifestyle variables to the multiple regression model reduced the age regression weight to -0.26 ml.kg-1.min-1.yr-1. Statistically controlling for time differences between tests, general linear models analysis showed that longitudinal changes in aerobic power were due to independent changes in %fat and SR-PA, confirming the cross-sectional results.
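The share of the age-related decline attributed to %fat and SR-PA can be checked from the two slopes quoted in the abstract. The sketch below only redoes that arithmetic; the variable and function names are illustrative, and writing the unadjusted decline as a negative slope is an assumption, since the abstract quotes it as a magnitude.

```python
# Age slopes for VO2peak (ml/kg/min per year) as quoted in the abstract.
# Assumption: the unadjusted decline of 0.46 is written here as a negative slope.
unadjusted_age_slope = -0.46  # age as the only predictor
adjusted_age_slope = -0.26    # age with %fat and SR-PA also in the model


def share_explained(unadjusted, adjusted):
    """Fraction of the unadjusted slope removed by adding the covariates."""
    return (unadjusted - adjusted) / unadjusted


share = share_explained(unadjusted_age_slope, adjusted_age_slope)
print(f"{share:.1%} of the cross-sectional age slope is removed by %fat and SR-PA")
```

On these rounded slopes the share comes out at roughly 43%, which is in line with the "nearly 50%" wording once rounding of the published coefficients is allowed for.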
Wheat flour dough Alveograph characteristics predicted by Mixolab regression models.
Codină, Georgiana Gabriela; Mironeasa, Silvia; Mironeasa, Costel; Popa, Ciprian N; Tamba-Berehoiu, Radiana
2012-02-01
In Romania, the Alveograph is the most used device to evaluate the rheological properties of wheat flour dough, but lately the Mixolab device has begun to play an important role in the breadmaking industry. These two instruments are based on different principles but there are some correlations that can be found between the parameters determined by the Mixolab and the rheological properties of wheat dough measured with the Alveograph. Statistical analysis on 80 wheat flour samples using the backward stepwise multiple regression method showed that Mixolab values using the ‘Chopin S’ protocol (40 samples) and ‘Chopin+’ protocol (40 samples) can be used to elaborate predictive models for estimating the value of the rheological properties of wheat dough: baking strength (W), dough tenacity (P) and extensibility (L). The correlation analysis confirmed significant findings (P < 0.05 and P < 0.01) between the parameters of wheat dough studied by the Mixolab and its rheological properties measured with the Alveograph. Six predictive linear equations were obtained. Linear regression models gave multiple regression coefficients with R²(adjusted) > 0.70 for P, R²(adjusted) > 0.70 for W and R²(adjusted) > 0.38 for L, at a 95% confidence interval. Copyright © 2011 Society of Chemical Industry.
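The R²(adjusted) values reported above penalize R² for the number of predictors retained by the stepwise procedure. A minimal sketch of the standard adjustment follows; the function name, the R² of 0.75 and the three predictors are hypothetical illustration values (the abstract does not report them), while 40 matches the per-protocol sample size.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical inputs: R^2 = 0.75 with 3 predictors on 40 samples.
example = adjusted_r2(0.75, 40, 3)
print(round(example, 4))
```

The adjustment grows with the number of predictors, which is why it is the usual yardstick when comparing models of different size.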
Transfer Student Success: Educationally Purposeful Activities Predictive of Undergraduate GPA
ERIC Educational Resources Information Center
Fauria, Renee M.; Fuller, Matthew B.
2015-01-01
Researchers evaluated the effects of Educationally Purposeful Activities (EPAs) on transfer and nontransfer students' cumulative GPAs. Hierarchical, linear, and multiple regression models yielded seven statistically significant educationally purposeful items that influenced undergraduate student GPAs. Statistically significant positive EPAs for…
Linkages between benthic macroinvertebrate assemblages and landscape stressors in the US Great Lakes
We used multiple linear regression analysis to investigate relationships between benthic macroinvertebrate assemblages in the nearshore region of the Laurentian Great Lakes and landscape characteristics in adjacent watersheds. Benthic invertebrate data were obtained from the 201...
Practical Assessment, Research & Evaluation, 2000-2001.
ERIC Educational Resources Information Center
Rudner, Lawrence M., Ed.; Schafer, William D., Ed.
2001-01-01
This document consists of papers published in the electronic journal "Practical Assessment, Research & Evaluation" during 2000-2001: (1) "Advantages of Hierarchical Linear Modeling" (Jason W. Osborne); (2) "Prediction in Multiple Regression" (Jason W. Osborne); (3) "Scoring Rubrics: What, When, and How?"…
Female Literacy Rate is a Better Predictor of Birth Rate and Infant Mortality Rate in India
Saurabh, Suman; Sarkar, Sonali; Pandey, Dhruv K.
2013-01-01
Background: Educated women are known to take informed reproductive and healthcare decisions. These result in population stabilization and better infant care reflected by lower birth rates and infant mortality rates (IMRs), respectively. Materials and Methods: Our objective was to study the relationship of male and female literacy rates with crude birth rates (CBRs) and IMRs of the states and union territories (UTs) of India. The data were analyzed using linear regression. CBR and IMR were taken as the dependent variables; while the overall literacy rates, male, and female literacy rates were the independent variables. Results: CBRs were inversely related to literacy rates (slope parameter = −0.402, P < 0.001). On multiple linear regression with male and female literacy rates, a significant inverse relationship emerged between female literacy rate and CBR (slope = −0.363, P < 0.001), while male literacy rate was not significantly related to CBR (P = 0.674). IMR of the states were also inversely related to their literacy rates (slope = −1.254, P < 0.001). Multiple linear regression revealed a significant inverse relationship between IMR and female literacy (slope = −0.816, P = 0.031), whereas male literacy rate was not significantly related (P = 0.630). Conclusion: Female literacy is relatively highly important for both population stabilization and better infant health. PMID:26664840
NASA Astrophysics Data System (ADS)
Setyaningsih, S.
2017-01-01
Building a leading university requires professional commitment from its lecturers. Commitment is measured through willpower, loyalty, pride, and integrity as a professional lecturer. A sample of 135 of 337 university lecturers was used to collect data. Data were analyzed using validity and reliability tests and multiple linear regression. Many studies have found a link with lecturer commitment, but the underlying cause of the causal relationship is generally neglected. The results indicate that the professional commitment of lecturers is affected by the variables empowerment, academic culture, and trust. The relationship model between variables is composed of three substructures. The first substructure consists of the endogenous variable professional commitment and three exogenous variables, namely academic culture, empowerment, and trust, as well as the residual variable ε_y. The second substructure consists of one endogenous variable, trust, and two exogenous variables, namely empowerment and academic culture, as well as the residual variable ε_3. The third substructure consists of one endogenous variable, academic culture, and one exogenous variable, empowerment, as well as the residual variable ε_2. Multiple linear regression was used in the path model for each substructure. The results showed that the hypothesis was supported, and these findings provide empirical evidence that increasing these variables will increase the professional commitment of the lecturers.
Stature estimation from the lengths of the growing foot-a study on North Indian adolescents.
Krishan, Kewal; Kanchan, Tanuj; Passi, Neelam; DiMaggio, John A
2012-12-01
Stature estimation is considered as one of the basic parameters of the investigation process in unknown and commingled human remains in medico-legal case work. Race, age and sex are the other parameters which help in this process. Stature estimation is of the utmost importance as it completes the biological profile of a person along with the other three parameters of identification. The present research is intended to formulate standards for stature estimation from foot dimensions in adolescent males from North India and study the pattern of foot growth during the growing years. 154 male adolescents from the Northern part of India were included in the study. Besides stature, five anthropometric measurements that included the length of the foot from each toe (T1, T2, T3, T4, and T5 respectively) to pternion were measured on each foot. The data was analyzed statistically using Student's t-test, Pearson's correlation, linear and multiple regression analysis for estimation of stature and growth of foot during ages 13-18 years. Correlation coefficients between stature and all the foot measurements were found to be highly significant and positively correlated. Linear regression models and multiple regression models (with age as a co-variable) were derived for estimation of stature from the different measurements of the foot. Multiple regression models (with age as a co-variable) estimate stature with greater accuracy than the regression models for 13-18 years age group. The study shows the growth pattern of feet in North Indian adolescents and indicates that anthropometric measurements of the foot and its segments are valuable in estimation of stature in growing individuals of that population. Copyright © 2012 Elsevier Ltd. All rights reserved.
do Prado, Mara Rúbia Maciel Cardoso; Oliveira, Fabiana de Cássia Carvalho; Assis, Karine Franklin; Ribeiro, Sarah Aparecida Vieira; do Prado Junior, Pedro Paulo; Sant'Ana, Luciana Ferreira da Rocha; Priore, Silvia Eloiza; Franceschini, Sylvia do Carmo Castro
2015-01-01
To assess the prevalence of vitamin D deficiency and its associated factors in women and their newborns in the postpartum period. This cross-sectional study evaluated vitamin D deficiency/insufficiency in 226 women and their newborns in Viçosa (Minas Gerais, BR) between December 2011 and November 2012. Cord blood and venous maternal blood were collected to evaluate the following biochemical parameters: vitamin D, alkaline phosphatase, calcium, phosphorus and parathyroid hormone. Poisson regression analysis, with a confidence interval of 95% was applied to assess vitamin D deficiency and its associated factors. Multiple linear regression analysis was performed to identify factors associated with 25(OH)D deficiency in the newborns and women from the study. The criteria for variable inclusion in the multiple linear regression model was the association with the dependent variable in the simple linear regression analysis, considering p<0.20. Significance level was α<5%. From 226 women included, 200 (88.5%) were 20 to 44 years old; the median age was 28 years. Deficient/insufficient levels of vitamin D were found in 192 (85%) women and in 182 (80.5%) neonates. The maternal 25(OH)D and alkaline phosphatase levels were independently associated with vitamin D deficiency in infants. This study identified a high prevalence of vitamin D deficiency and insufficiency in women and newborns and the association between maternal nutritional status of vitamin D and their infants' vitamin D status. Copyright © 2015 Sociedade de Pediatria de São Paulo. Publicado por Elsevier Editora Ltda. All rights reserved.
Kim, Seong-Gil
2018-01-01
Background The purpose of this study was to investigate the effect of ankle ROM and lower-extremity muscle strength on static balance control ability in young adults. Material/Methods This study was conducted with 65 young adults, but 10 young adults dropped out during the measurement, so 55 young adults (male: 19, female: 36) completed the study. Postural sway (length and velocity) was measured with eyes open and closed, and ankle ROM (AROM and PROM of dorsiflexion and plantarflexion) and lower-extremity muscle strength (flexor and extensor of hip, knee, and ankle joint) were measured. Pearson correlation coefficient was used to examine the correlation between variables and static balance ability. Simple linear regression analysis and multiple linear regression analysis were used to examine the effect of variables on static balance ability. Results In correlation analysis, plantarflexion ROM (AROM and PROM) and lower-extremity muscle strength (except hip extensor) were significantly correlated with postural sway (p<0.05). In simple correlation analysis, all variables that passed the correlation analysis procedure had significant influence (p<0.05). In multiple linear regression analysis, plantar flexion PROM with eyes open significantly influenced sway length (B=0.681) and sway velocity (B=0.011). Conclusions Lower-extremity muscle strength and ankle plantarflexion ROM influenced static balance control ability, with ankle plantarflexion PROM showing the greatest influence. Therefore, both contractile structures and non-contractile structures should be of interest when considering static balance control ability improvement. PMID:29760375
Kim, Seong-Gil; Kim, Wan-Soo
2018-05-15
BACKGROUND The purpose of this study was to investigate the effect of ankle ROM and lower-extremity muscle strength on static balance control ability in young adults. MATERIAL AND METHODS This study was conducted with 65 young adults, but 10 young adults dropped out during the measurement, so 55 young adults (male: 19, female: 36) completed the study. Postural sway (length and velocity) was measured with eyes open and closed, and ankle ROM (AROM and PROM of dorsiflexion and plantarflexion) and lower-extremity muscle strength (flexor and extensor of hip, knee, and ankle joint) were measured. Pearson correlation coefficient was used to examine the correlation between variables and static balance ability. Simple linear regression analysis and multiple linear regression analysis were used to examine the effect of variables on static balance ability. RESULTS In correlation analysis, plantarflexion ROM (AROM and PROM) and lower-extremity muscle strength (except hip extensor) were significantly correlated with postural sway (p<0.05). In simple correlation analysis, all variables that passed the correlation analysis procedure had significant influence (p<0.05). In multiple linear regression analysis, plantar flexion PROM with eyes open significantly influenced sway length (B=0.681) and sway velocity (B=0.011). CONCLUSIONS Lower-extremity muscle strength and ankle plantarflexion ROM influenced static balance control ability, with ankle plantarflexion PROM showing the greatest influence. Therefore, both contractile structures and non-contractile structures should be of interest when considering static balance control ability improvement.
Steen, Paul J.; Passino-Reader, Dora R.; Wiley, Michael J.
2006-01-01
As a part of the Great Lakes Regional Aquatic Gap Analysis Project, we evaluated methodologies for modeling associations between fish species and habitat characteristics at a landscape scale. To do this, we created brook trout Salvelinus fontinalis presence and absence models based on four different techniques: multiple linear regression, logistic regression, neural networks, and classification trees. The models were tested in two ways: by application to an independent validation database and cross-validation using the training data, and by visual comparison of statewide distribution maps with historically recorded occurrences from the Michigan Fish Atlas. Although differences in the accuracy of our models were slight, the logistic regression model predicted with the least error, followed by multiple regression, then classification trees, then the neural networks. These models will provide natural resource managers a way to identify habitats requiring protection for the conservation of fish species.
Criteria for the use of regression analysis for remote sensing of sediment and pollutants
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.; Lecroy, S. R.
1982-01-01
An examination of limitations, requirements, and precision of the linear multiple-regression technique for quantification of marine environmental parameters is conducted. Both environmental and optical physics conditions have been defined for which an exact solution to the signal response equations is of the same form as the multiple regression equation. Various statistical parameters are examined to define criteria for selection of an unbiased fit when upwelled radiance values contain error and are correlated with each other. Field experimental data are examined to define data smoothing requirements in order to satisfy the criteria of Daniel and Wood (1971). Recommendations are made concerning improved selection of ground-truth locations to maximize variance and to minimize physical errors associated with the remote sensing experiment.
Buchvold, Hogne Vikanes; Pallesen, Ståle; Waage, Siri; Bjorvatn, Bjørn
2018-05-01
Objectives The aim of this study was to investigate changes in body mass index (BMI) between different work schedules and different average number of yearly night shifts over a four-year follow-up period. Methods A prospective study of Norwegian nurses (N=2965) with different work schedules was conducted: day only, two-shift rotation (day and evening shifts), three-shift rotation (day, evening and night shifts), night only, those who changed towards night shifts, and those who changed away from schedules containing night shifts. Paired Student's t-tests were used to evaluate within-subgroup changes in BMI. Multiple linear regression analysis was used to evaluate between-group effects on BMI when adjusting for BMI at baseline, sex, age, marital status, children living at home, and years since graduation. The same regression model was used to evaluate the effect of average number of yearly night shifts on BMI change. Results We found that night workers [mean difference (MD) 1.30 (95% CI 0.70-1.90)], two shift workers [MD 0.48 (95% CI 0.20-0.75)], three shift workers [MD 0.46 (95% CI 0.30-0.62)], and those who changed work schedule away from [MD 0.57 (95% CI 0.17-0.84)] or towards night work [MD 0.63 (95% CI 0.20-1.05)] all had significant BMI gain (P<0.01) during the follow-up period. However, day workers had a non-significant BMI gain. Using adjusted multiple linear regressions, we found that night workers had significantly larger BMI gain compared to day workers [B=0.89 (95% CI 0.06-1.72), P<0.05]. We did not find any significant association between average number of yearly night shifts and BMI change using our multiple linear regression model. Conclusions After adjusting for possible confounders, we found that BMI increased significantly more among night workers compared to day workers.
Emission and distribution of phosphine in paddy fields and its relationship with greenhouse gases.
Chen, Weiyi; Niu, Xiaojun; An, Shaorong; Sheng, Hong; Tang, Zhenghua; Yang, Zhiquan; Gu, Xiaohong
2017-12-01
Phosphine (PH3), as a gaseous phosphide, plays an important role in the phosphorus cycle in ecosystems. In this study, the emission and distribution of phosphine, carbon dioxide (CO2) and methane (CH4) in paddy fields were investigated by Pearson correlation analysis and multiple linear regression analysis, in order to assess the potential future impacts of an enhanced greenhouse effect on the part of the phosphorus cycle that involves phosphine. During the whole period of rice growth, there was a significant positive correlation between CO2 emission flux and PH3 emission flux (r=0.592, p=0.026, n=14). Similarly, a significant positive correlation of emission flux was also observed between CH4 and PH3 (r=0.563, p=0.036, n=14). The linear regression relationship was determined as [PH3]flux = 0.007[CO2]flux + 0.063[CH4]flux - 4.638. No significant differences were observed for all values of matrix-bound phosphine (MBP), soil carbon dioxide (SCO2), and soil methane (SCH4) in paddy soils. However, there was a significant positive correlation between MBP and SCO2 at the heading, flowering and ripening stages. The correlation coefficients were 0.909, 0.890 and 0.827, respectively. In vertical distribution, MBP had a variation trend analogous to those of SCO2 and SCH4. Through Pearson correlation analysis and multiple stepwise linear regression analysis, pH, redox potential (Eh), total phosphorus (TP) and acid phosphatase (ACP) were identified as the principal factors affecting MBP levels, with correlative rankings of Eh>pH>TP>ACP. The multiple stepwise regression model ([MBP] = 0.456*[ACP] + 0.235*[TP] - 1.458*[Eh] - 36.547*[pH] + 352.298) was obtained. The findings of this study are a useful reference for future work on the global biogeochemical cycling of phosphorus. Copyright © 2017 Elsevier B.V. All rights reserved.
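The two fitted equations quoted in this abstract can be written as plain functions, which makes the sign and relative size of each coefficient easier to inspect. This is only a sketch: the function and argument names are illustrative, the abstract does not state the units of any variable, and the input values in the usage lines are hypothetical rather than measured data.

```python
def ph3_flux(co2_flux, ch4_flux):
    """Fitted PH3 emission flux from CO2 and CH4 fluxes (coefficients from the abstract)."""
    return 0.007 * co2_flux + 0.063 * ch4_flux - 4.638


def matrix_bound_phosphine(acp, tp, eh, ph):
    """Fitted MBP from acid phosphatase, total phosphorus, redox potential and pH."""
    return 0.456 * acp + 0.235 * tp - 1.458 * eh - 36.547 * ph + 352.298


# Hypothetical inputs, chosen only to exercise the functions.
print(ph3_flux(co2_flux=1000.0, ch4_flux=100.0))
print(matrix_bound_phosphine(acp=10.0, tp=100.0, eh=50.0, ph=6.5))
```

Raw coefficient sizes depend on each variable's units, so they cannot be read directly as the importance ranking Eh>pH>TP>ACP given in the abstract.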
Empirical Modeling of Microbial Indicators at a South Carolina Beach
Public concerns about water quality at beaches have prompted the development of multiple linear regression and other models that can be used to "nowcast" levels of bacterial indicators. Hydrometeorological and biogeochemical data from summer, 2009 were used to develop empirical m...
MULTIPLE LINEAR REGRESSION FOR LAKE ICE AND LAKE TEMPERATURE CHARACTERISTICS. (R824801)
The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...
DEVELOPMENT OF THE VIRTUAL BEACH MODEL, PHASE 1: AN EMPIRICAL MODEL
With increasing attention focused on the use of multiple linear regression (MLR) modeling of beach fecal bacteria concentration, the validity of the entire statistical process should be carefully evaluated to assure satisfactory predictions. This work aims to identify pitfalls an...
Mathematics Readiness of First-Year University Students
ERIC Educational Resources Information Center
Atuahene, Francis; Russell, Tammy A.
2016-01-01
The majority of high school students, particularly underrepresented minorities (URMs) from low socioeconomic backgrounds are graduating from high school less prepared academically for advanced-level college mathematics. Using 2009 and 2010 course enrollment data, several statistical analyses (multiple linear regression, Cochran Mantel Haenszel…
Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760
Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study. PMID:20376286
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.
Efficacy of Social Media Adoption on Client Growth for Independent Management Consultants
2017-02-01
design, a linear multiple regression with three predictor variables and one dependent variable per testing were used. Under those circumstances...regression test was used to compare the social media adoption of two groups on a single measure to determine if there was a statistical difference...number and types of social media platforms used and their influence on client growth was examined in this research design that used a descriptive
TI-59 Programs for Multiple Regression.
1980-05-01
general linear hypothesis model of full rank [Graybill, 1961] can be written as Y = Xβ + ε, ε ~ N(0, σ²I), with Y (n×1), X (n×k), β (k×1) and ε (n×1), where Y is the vector of n...a "reduced model" solution, and confidence intervals for linear functions of the coefficients can be obtained using (X'X) and σ², based on the t... PROGRAM DESCRIPTION: For the general linear hypothesis model Y = Xβ + ε, calculates
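For a full-rank model Y = Xβ + ε, the least-squares coefficients solve the normal equations (X'X)β = X'Y. The sketch below illustrates that calculation in plain Python. It is not the TI-59 program described in the report; all function names and the toy data are hypothetical, and dividing the residual sum of squares by n − k is the usual unbiased variance estimate, which the excerpt does not confirm the report uses.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; the model is not of full rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(x_rows, y):
    """Return (beta, sigma2) for Y = X beta + eps via the normal equations."""
    columns = transpose(x_rows)
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    beta = solve(xtx, xty)
    residuals = [yi - sum(b * xij for b, xij in zip(beta, row))
                 for row, yi in zip(x_rows, y)]
    n, k = len(x_rows), len(beta)
    sigma2 = sum(e * e for e in residuals) / (n - k)  # assumed estimator: RSS / (n - k)
    return beta, sigma2


# Toy data generated exactly from y = 1 + 2*x1 + 3*x2 (first column is the intercept).
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
Y = [1, 3, 4, 6, 8]
beta_hat, sigma2_hat = ols(X, Y)
print(beta_hat, sigma2_hat)
```

Because the toy data contain no noise, the recovered coefficients should match the generating values up to floating-point error and the variance estimate should be essentially zero.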
Ventura, Cristina; Latino, Diogo A R S; Martins, Filomena
2013-01-01
The performance of two QSAR methodologies, namely Multiple Linear Regressions (MLR) and Neural Networks (NN), towards the modeling and prediction of antitubercular activity was evaluated and compared. A data set of 173 potentially active compounds belonging to the hydrazide family and represented by 96 descriptors was analyzed. Models were built with Multiple Linear Regressions (MLR), single Feed-Forward Neural Networks (FFNNs), ensembles of FFNNs and Associative Neural Networks (AsNNs) using four different data sets and different types of descriptors. The predictive ability of the different techniques used was assessed and discussed on the basis of different validation criteria, and results show in general a better performance of AsNNs in terms of learning ability and prediction of antitubercular behaviors when compared with all other methods. MLR have, however, the advantage of pinpointing the most relevant molecular characteristics responsible for the behavior of these compounds against Mycobacterium tuberculosis. The best results for the larger data set (94 compounds in training set and 18 in test set) were obtained with AsNNs using seven descriptors (R² of 0.874 and RMSE of 0.437 against R² of 0.845 and RMSE of 0.472 in MLRs, for test set). Counter-Propagation Neural Networks (CPNNs) were trained with the same data sets and descriptors. From the scrutiny of the weight levels in each CPNN and the information retrieved from MLRs, a rational design of potentially active compounds was attempted. Two new compounds were synthesized and tested against M. tuberculosis showing an activity close to that predicted by the majority of the models. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
NASA Astrophysics Data System (ADS)
Chiong, W. L.; Omar, A. F.
2017-07-01
A non-destructive technique based on visible (VIS) spectroscopy with light-emitting diode (LED) lighting was used to evaluate the internal quality of mango fruit. The objective of this study was to investigate the feasibility of white LEDs as the light source in spectroscopic instrumentation to predict the acidity and soluble solids content of intact Sala mango. The reflectance spectra of the mango samples were measured in the visible range (400-700 nm) using VIS spectroscopy under different white LEDs and a tungsten-halogen lamp (pro lamp). Regression models were developed by multiple linear regression to establish the relationship between spectra and internal quality. A direct calibration transfer procedure was then applied between master and slave lighting to check the acidity prediction results after transfer. Mango acidity was successfully determined under white LED lighting through VIS spectroscopy using multiple linear regression, but soluble solids content was not. Satisfactory results were obtained for calibration transfer between LEDs with different correlated colour temperatures, indicating that this technique can be used in spectroscopic measurement between two similar light sources for predicting the internal quality of mango.
Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara
2017-01-01
In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most common models for analyzing such complex longitudinal data are based on mean regression, which fails to provide efficient estimates in the presence of outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish Bayesian joint models that account for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
Yang, Xiaowei; Nie, Kun
2008-03-15
Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.
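The final testing step of the three-step procedure can be sketched as follows. This is a minimal sketch of one commonly cited form of the adaptive Neyman statistic, assuming the transformed-domain coefficient estimates have already been standardized to unit variance under the null hypothesis and are ordered from low to high frequency; the paper's exact normalization may differ, and the function name and inputs are hypothetical.

```python
import math


def adaptive_neyman(z):
    """Return max over m of sum_{j<=m}(z_j**2 - 1) / sqrt(2*m).

    z: standardized coefficient estimates in the transformed domain
       (assumed approximately N(0, 1) under the null hypothesis).
    """
    best = -math.inf
    running = 0.0
    for m, zj in enumerate(z, start=1):
        running += zj * zj - 1.0  # centered squared coefficient
        best = max(best, running / math.sqrt(2.0 * m))
    return best


# Hypothetical input: signal concentrated in the first (lowest-frequency) coefficient.
stat = adaptive_neyman([3.0, 0.0, 0.0])
```

Large values of the statistic suggest that some leading block of coefficients departs from zero; the reference distribution needed for a p-value is not reproduced here.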
Application of stepwise multiple regression techniques to inversion of Nimbus 'IRIS' observations.
NASA Technical Reports Server (NTRS)
Ohring, G.
1972-01-01
Exploratory studies with Nimbus-3 infrared interferometer-spectrometer (IRIS) data indicate that, in addition to temperature, such meteorological parameters as geopotential heights of pressure surfaces, tropopause pressure, and tropopause temperature can be inferred from the observed spectra with the use of simple regression equations. The technique of screening the IRIS spectral data by means of stepwise regression to obtain the best radiation predictors of meteorological parameters is validated. The simplicity of application of the technique and the simplicity of the derived linear regression equations - which contain only a few terms - suggest usefulness for this approach. Based upon the results obtained, suggestions are made for further development and exploitation of the stepwise regression analysis technique.
Construction of mathematical model for measuring material concentration by colorimetric method
NASA Astrophysics Data System (ADS)
Liu, Bing; Gao, Lingceng; Yu, Kairong; Tan, Xianghua
2018-06-01
This paper uses multiple linear regression to analyze the data of Problem C of mathematical modeling in 2017. First, we established regression models for the concentrations of five substances, but only the regression model for the concentration of urea in milk passed the significance test. The regression model established from the second set of data also passed the significance test, but it suffers from serious multicollinearity. We improved the model by principal component analysis. The improved model is used to control the system so that the concentration of a material can be measured by the direct colorimetric method.
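The idea of replacing collinear predictors with principal components can be sketched for the simplest case of two predictors. For two standardized variables with correlation r, the correlation matrix has eigenvalues 1 + r and 1 - r, so the first principal component is (z1 ± z2)/√2. The function name and data below are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pstdev


def standardize(v):
    """Center to mean 0 and scale to (population) standard deviation 1."""
    m, s = mean(v), pstdev(v)
    return [(x - m) / s for x in v]


def pcr_two_predictors(x1, x2, y):
    """Regress y on the first principal component of two standardized predictors."""
    z1, z2 = standardize(x1), standardize(x2)
    r = mean(a * b for a, b in zip(z1, z2))  # Pearson correlation
    sign = 1.0 if r >= 0 else -1.0
    # Leading eigenvector of [[1, r], [r, 1]] is (1, sign) / sqrt(2).
    pc1 = [(a + sign * b) / math.sqrt(2.0) for a, b in zip(z1, z2)]
    intercept = mean(y)  # pc1 has mean 0, so the intercept is the mean of y
    slope = sum(p * (yi - intercept) for p, yi in zip(pc1, y)) / sum(p * p for p in pc1)
    return r, intercept, slope, pc1


# Hypothetical, strongly collinear predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.0]
y = [2.0, 4.1, 6.2, 7.9, 10.1]
r, intercept, slope, pc1 = pcr_two_predictors(x1, x2, y)
```

With more than two predictors the components would come from an eigendecomposition or SVD of the correlation matrix, which this closed-form sketch does not cover.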
Trend Analysis Using Microcomputers.
ERIC Educational Resources Information Center
Berger, Carl F.
A trend analysis statistical package and additional programs for the Apple microcomputer are presented. They illustrate strategies of data analysis suitable to the graphics and processing capabilities of the microcomputer. The programs analyze data sets using examples of: (1) analysis of variance with multiple linear regression; (2) exponential…
RRegrs: an R package for computer-aided model selection with multiple regression models.
Tsiliki, Georgia; Munteanu, Cristian R; Seoane, Jose A; Fernandez-Lozano, Carlos; Sarimveis, Haralambos; Willighagen, Egon L
2015-01-01
Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the produced predictive models and therefore raise model reproducibility and comparison issues. Cheminformatics and bioinformatics make extensive use of predictive modelling and exhibit a need for standardization of these methodologies in order to assist model selection and speed up the process of predictive model development. A tool accessible to all users, irrespective of their statistical knowledge, would be valuable if it tests several simple and complex regression models and validation schemes, produces unified reports, and offers the option to be integrated into more extensive studies. Additionally, such methodology should be implemented as a free programming package, in order to be continuously adapted and redistributed by others. We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending the caret package. The universality of the new methodology is demonstrated using five standard data sets from different scientific fields.
Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance as well as its adaptability in terms of parameter optimization could make RRegrs a popular framework to assist the initial exploration of predictive models, and with that, the design of more comprehensive in silico screening applications. Graphical abstract: RRegrs is a computer-aided model selection framework for R multiple regression models; this is a fully validated procedure with application to QSAR modelling.
NASA Astrophysics Data System (ADS)
Soares dos Santos, T.; Mendes, D.; Rodrigues Torres, R.
2016-01-01
Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANNs) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon; northeastern Brazil; and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used general circulation model (GCM) experiments for the 20th century (RCP historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANNs significantly outperform the MLR downscaling of monthly precipitation variability.
NASA Astrophysics Data System (ADS)
dos Santos, T. S.; Mendes, D.; Torres, R. R.
2015-08-01
Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANN) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon, Northeastern Brazil and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used GCM experiments for the 20th century (RCP Historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANN significantly outperforms the MLR downscaling of monthly precipitation variability.
Correlates and Predictors of Psychological Distress Among Older Asian Immigrants in California.
Chang, Miya; Moon, Ailee
2016-01-01
Psychological distress occurs frequently in older minority immigrants because many have limited social resources and undergo a difficult process related to immigration and acculturation. Despite a rapid increase in the number of Asian immigrants, relatively little research has focused on subgroup mental health comparisons. This study examines the prevalence of psychological distress and its relationship with socio-demographic factors and health care utilization among older Asian immigrants. Weighted data from Asian immigrants 65 and older from 5 countries (n = 1,028) who participated in the California Health Interview Survey (CHIS) were analyzed descriptively and in multiple linear regressions. The prevalence of psychological distress varied significantly across the 5 ethnic groups, from Filipinos (4.83%) to Chinese (1.64%). General health status, cognitive and physical impairment, and health care utilization are all associated (p < .05) with psychological distress in multiple linear regressions. These findings are similar to those from previous studies. The findings reinforce the need to develop more culturally effective mental health services and outreach programs.
Introduction to the use of regression models in epidemiology.
Bender, Ralf
2009-01-01
Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs, so they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed depending on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrative examples from cancer research.
Assessing risk factors for periodontitis using regression
NASA Astrophysics Data System (ADS)
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
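The odds ratios and 95% confidence intervals mentioned above are usually obtained by exponentiating a logistic regression coefficient and its Wald interval. The following is a minimal sketch of that arithmetic; the coefficient and standard error are hypothetical values, not results from the study.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio exp(beta) with a Wald interval exp(beta -/+ z * se).

    beta: logistic regression coefficient (log-odds scale)
    se:   its standard error
    z:    normal quantile, 1.96 for an approximate 95% interval
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient for a binary risk factor such as smoking status.
or_value, lower, upper = odds_ratio_ci(beta=0.8, se=0.3)
```

An interval that excludes 1 corresponds to a coefficient that differs from zero at roughly the 5% level under the Wald approximation.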
Pfeiffer, R M; Riedl, R
2015-08-15
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS), are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yields unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.
2013-01-01
Background: Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective: We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design: Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results: At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions: Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A
2013-01-01
Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.
Shen, Minxue; Tan, Hongzhuan; Zhou, Shujin; Retnakaran, Ravi; Smith, Graeme N.; Davidge, Sandra T.; Trasler, Jacquetta; Walker, Mark C.; Wen, Shi Wu
2016-01-01
Background: It has been reported that higher folate intake from food and supplementation is associated with decreased blood pressure (BP). The association between serum folate concentration and BP has been examined in few studies. We aim to examine the association between serum folate and BP levels in a cohort of young Chinese women. Methods: We used the baseline data from a pre-conception cohort of women of childbearing age in Liuyang, China, for this study. Demographic data were collected by structured interview. Serum folate concentration was measured by immunoassay, and homocysteine, blood glucose, triglyceride and total cholesterol were measured through standardized clinical procedures. Multiple linear regression and principal component regression model were applied in the analysis. Results: A total of 1,532 healthy normotensive non-pregnant women were included in the final analysis. The mean concentration of serum folate was 7.5 ± 5.4 nmol/L and 55% of the women presented with folate deficiency (< 6.8 nmol/L). Multiple linear regression and principal component regression showed that serum folate levels were inversely associated with systolic and diastolic BP, after adjusting for demographic, anthropometric, and biochemical factors. Conclusions: Serum folate is inversely associated with BP in non-pregnant women of childbearing age with high prevalence of folate deficiency. PMID:27182603
NASA Astrophysics Data System (ADS)
Delbari, Masoomeh; Sharifazari, Salman; Mohammadi, Ehsan
2018-02-01
Knowledge of soil temperature at different depths is important for the agricultural industry and for understanding climate change. The aim of this study is to evaluate the performance of a support vector regression (SVR)-based model in estimating daily soil temperature at 10, 30 and 100 cm depth under different climate conditions over Iran. The obtained results were compared to those obtained from a more classical multiple linear regression (MLR) model. The correlation sensitivity for the input combinations and periodicity effect were also investigated. Climatic data used as inputs to the models were minimum and maximum air temperature, solar radiation, relative humidity, dew point, and the atmospheric pressure (reduced to sea level), collected from five synoptic stations, Kerman, Ahvaz, Tabriz, Saghez, and Rasht, located respectively in the hyper-arid, arid, semi-arid, Mediterranean, and hyper-humid climate conditions. According to the results, both the MLR and SVR models performed quite well at the surface layer, i.e., 10-cm depth. However, SVR performed better than MLR in estimating soil temperature at deeper layers, especially at 100 cm depth. Moreover, both models performed better in humid climate conditions than in arid and hyper-arid areas. Further, adding a periodicity component into the modeling process considerably improved the models' performance, especially in the case of SVR.
Association between Personality Traits and Sleep Quality in Young Korean Women
Kim, Han-Na; Cho, Juhee; Chang, Yoosoo; Ryu, Seungho
2015-01-01
Personality is a trait that affects behavior and lifestyle, and sleep quality is an important component of a healthy life. We analyzed the association between personality traits and sleep quality in a cross-section of 1,406 young women (from 18 to 40 years of age) who were not reporting clinically meaningful depression symptoms. Surveys were carried out from December 2011 to February 2012, using the Revised NEO Personality Inventory and the Pittsburgh Sleep Quality Index (PSQI). All analyses were adjusted for demographic and behavioral variables. We considered beta weights, structure coefficients, unique effects, and common effects when evaluating the importance of sleep quality predictors in multiple linear regression models. Neuroticism was the most important contributor to PSQI global scores in the multiple regression models. By contrast, despite being strongly correlated with sleep quality, conscientiousness had a near-zero beta weight in linear regression models, because most variance was shared with other personality traits. However, conscientiousness was the most noteworthy predictor of poor sleep quality status (PSQI≥6) in logistic regression models, and individuals high in conscientiousness were least likely to have poor sleep quality (OR of 0.813), indicating that conscientiousness is protective against poor sleep quality. Personality may be a factor in poor sleep quality and should be considered in sleep interventions targeting young women. PMID:26030141
Kumar, Rajesh; Dogra, Vishal; Rani, Khushbu; Sahu, Kanti
2017-01-01
District level determinants of total fertility rate in the Empowered Action Group states of India can help in ongoing population stabilization programs in India. The present study assesses the role of district level determinants in predicting total fertility rate among districts of the Empowered Action Group states of India. Data from the Annual Health Survey (2011-12) were analysed using STATA and R software packages. Multiple linear regression models were built and evaluated using the Akaike Information Criterion. For further understanding, recursive partitioning was used to prepare a regression tree. Female married illiteracy was positively associated with total fertility rate and explained more than half (53%) of the variance. Under the multiple linear regression model, married illiteracy, infant mortality rate, antenatal care registration, household size, median age of live birth and sex ratio explained 70% of the total variance in total fertility rate. In the regression tree, female married illiteracy was the root node, and a split at 42% determined TFR <= 2.7. The next left side branch was again married illiteracy, with a split at 23% to determine TFR <= 2.1. We conclude that female married illiteracy is one of the most important determinants explaining total fertility rate among the districts of the Empowered Action Group states. A focus on female literacy is required to stabilize population growth in the long run.
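Comparing candidate regression models by the Akaike Information Criterion can be sketched as below. This uses the least-squares form AIC = n·ln(RSS/n) + 2k, which equals the Gaussian likelihood form up to an additive constant shared by models fitted to the same data; the model names, residual sums of squares and parameter counts are hypothetical, not the study's values.

```python
import math


def aic_least_squares(n, rss, k):
    """AIC for a Gaussian linear model, up to an additive constant.

    n:   number of observations
    rss: residual sum of squares of the fitted model
    k:   number of estimated parameters
    """
    return n * math.log(rss / n) + 2 * k


# Hypothetical candidates fitted to the same n = 100 observations.
candidates = {
    "illiteracy only": aic_least_squares(100, 50.0, 2),
    "illiteracy + infant mortality": aic_least_squares(100, 40.0, 3),
    "all six predictors": aic_least_squares(100, 39.5, 7),
}
best = min(candidates, key=candidates.get)  # lowest AIC is preferred
```

The 2k term penalizes extra predictors, so a model with a slightly lower residual sum of squares is not automatically preferred.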
Predicting daily use of urban forest recreation sites
John F. Dwyer
1988-01-01
A multiple linear regression model explains 90% of the variance in daily use of an urban recreation site. Explanatory variables include season, day of the week, and weather. The results offer guides for recreation site planning and management as well as suggestions for improving the model.
Mckay, Garrett; Huang, Wenxi; Romera-Castillo, Cristina; Crouch, Jenna E; Rosario-Ortiz, Fernando L; Jaffé, Rudolf
2017-05-16
The antioxidant capacity and formation of photochemically produced reactive intermediates (RI) was studied for water samples collected from the Florida Everglades with different spatial (marsh versus estuarine) and temporal (wet versus dry season) characteristics. Measured RI included triplet excited states of dissolved organic matter (³DOM*), singlet oxygen (¹O₂), and the hydroxyl radical (•OH). Single and multiple linear regression modeling were performed using a broad range of extrinsic (to predict RI formation rates, R_RI) and intrinsic (to predict RI quantum yields, Φ_RI) parameters. Multiple linear regression models consistently led to better predictions of R_RI and Φ_RI for our data set but poor prediction of Φ_RI for a previously published data set,1 probably because the predictors are intercorrelated (Pearson's r > 0.5). Single linear regression models were built with data compiled from previously published studies (n ≈ 120) in which E2:E3, S, and Φ_RI values were measured, which revealed a high degree of similarity between RI-optical property relationships across DOM samples of diverse sources. This study reveals that •OH formation is, in general, decoupled from ³DOM* and ¹O₂ formation, providing supporting evidence that ³DOM* is not a •OH precursor. Finally, Φ_RI for ¹O₂ and ³DOM* correlated negatively with antioxidant activity (a surrogate for electron donating capacity) for the collected samples, which is consistent with intramolecular oxidation of DOM moieties by ³DOM*.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadat Hayatshahi, Sayyed Hamed; Abdolmaleki, Parviz; Safarian, Shahrokh
2005-12-16
Logistic regression and artificial neural networks have been developed as two non-linear models to establish quantitative structure-activity relationships between structural descriptors and biochemical activity of adenosine based competitive inhibitors, toward adenosine deaminase. The training set included 24 compounds with known k_i values. The models were trained to solve two-class problems. Unlike the previous work in which multiple linear regression was used, the highest of positive charge on the molecules was recognized to be in close relation with their inhibition activity, while the electric charge on atom N1 of adenosine was found to be a poor descriptor. Consequently, the previously developed equation was improved and the newly formed one could predict the class of 91.66% of compounds correctly. Also optimized 2-3-1 and 3-4-1 neural networks could increase this rate to 95.83%.
Linear and nonlinear models for predicting fish bioconcentration factors for pesticides.
Yuan, Jintao; Xie, Chun; Zhang, Ting; Sun, Jinfang; Yuan, Xuejie; Yu, Shuling; Zhang, Yingbiao; Cao, Yunyuan; Yu, Xingchen; Yang, Xuan; Yao, Wu
2016-08-01
This work is devoted to the application of multiple linear regression (MLR), multilayer perceptron neural network (MLP NN) and projection pursuit regression (PPR) to quantitative structure-property relationship analysis of bioconcentration factors (BCFs) of pesticides tested on Bluegill (Lepomis macrochirus). Molecular descriptors of a total of 107 pesticides were calculated with the DRAGON Software and selected by inverse enhanced replacement method. Based on the selected DRAGON descriptors, a linear model was built by MLR, and nonlinear models were developed using MLP NN and PPR. The robustness of the obtained models was assessed by cross-validation and by external validation using a test set. Outliers were also examined and deleted to improve predictive power. Comparative results revealed that PPR achieved the most accurate predictions. This study offers useful models and information for BCF prediction, risk assessment, and pesticide formulation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Byun, Bo-Ram; Kim, Yong-Il; Yamaguchi, Tetsutaro; Maki, Koutaro; Son, Woo-Sung
2015-01-01
This study aimed to examine the correlation between skeletal maturation status and parameters from the odontoid process/body of the second vertebra and the bodies of the third and fourth cervical vertebrae, and simultaneously to build multiple regression models able to estimate skeletal maturation status in Korean girls. Hand-wrist radiographs and cone beam computed tomography (CBCT) images were obtained from 74 Korean girls (6-18 years of age). CBCT-generated cervical vertebral maturation (CVM) was used to demarcate the odontoid process and the body of the second cervical vertebra, based on the dentocentral synchondrosis. Correlation coefficient analysis and multiple linear regression analysis were used for each parameter of the cervical vertebrae (P < 0.05). Forty-seven of 64 parameters from CBCT-generated CVM (independent variables) exhibited statistically significant correlations (P < 0.05). The multiple regression model with the greatest R² had six parameters (PH2/W2, UW2/W2, (OH+AH2)/LW2, UW3/LW3, D3, and H4/W4) as independent variables with a variance inflation factor (VIF) of <2. CBCT-generated CVM was able to include parameters from the second cervical vertebral body and odontoid process, respectively, for the multiple regression models. This suggests that quantitative analysis might be used to estimate skeletal maturation status.
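The variance inflation factor mentioned above can be illustrated with a minimal sketch. It assumes the usual definition VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the remaining predictors; the function names and the toy data are hypothetical and are not taken from the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(X):
    """Variance inflation factor of every column of X (a list of rows)."""
    factors = []
    for j in range(len(X[0])):
        target = [row[j] for row in X]
        others = [[v for i, v in enumerate(row) if i != j] for row in X]
        factors.append(1.0 / (1.0 - r_squared(others, target)))
    return factors


# Hypothetical, nearly collinear predictors: both VIFs come out large.
toy = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]]
print(vif(toy))
```

A VIF close to 1 indicates a predictor that is nearly uncorrelated with the others, which is why a threshold such as the VIF < 2 quoted in the abstract is read as low multicollinearity.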
Robbins, Blaine
2013-01-01
Sociologists, political scientists, and economists all suggest that culture plays a pivotal role in the development of large-scale cooperation. In this study, I used generalized trust as a measure of culture to explore if and how culture impacts intentional homicide, my operationalization of cooperation. I compiled multiple cross-national data sets and used pooled time-series linear regression, single-equation instrumental-variables linear regression, and fixed- and random-effects estimation techniques on an unbalanced panel of 118 countries and 232 observations spread over a 15-year time period. Results suggest that culture and large-scale cooperation form a tenuous relationship, while economic factors such as development, inequality, and geopolitics appear to drive large-scale cooperation.
School Climate, Principal Support and Collaboration among Portuguese Teachers
ERIC Educational Resources Information Center
Castro Silva, José; Amante, Lúcia; Morgado, José
2017-01-01
This article analyses the relationship between school principal support and teacher collaboration among Portuguese teachers. Data were collected from a random sample of 234 teachers in middle and secondary schools. The use of a combined approach using linear and multiple regression tests concluded that the school principal support, through the…
Statistical considerations in the analysis of data from replicated bioassays
USDA-ARS's Scientific Manuscript database
Multiple-dose bioassay is generally the preferred method for characterizing virulence of insect pathogens. Linear regression of probit mortality on log dose enables estimation of LD50/LC50 and slope, the latter having substantial effect on LD90/95s (doses of considerable interest in pest management)...
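As a rough illustration of the probit approach described above, the sketch below regresses probit-transformed mortality on log10 dose and reads lethal doses off the fitted line. It is an unweighted least-squares simplification, not the full maximum-likelihood probit analysis, and the function names and dose-mortality figures are hypothetical.

```python
import math
from statistics import NormalDist


def probit_line(doses, mortality):
    """Fit probit(mortality) = intercept + slope * log10(dose) by simple least squares."""
    nd = NormalDist()
    x = [math.log10(d) for d in doses]
    y = [nd.inv_cdf(p) for p in mortality]  # probit transform (no +5 offset)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    intercept = my - slope * mx
    return intercept, slope


def lethal_dose(intercept, slope, fraction):
    """Dose at which the fitted line predicts the given mortality fraction."""
    z = NormalDist().inv_cdf(fraction)
    return 10 ** ((z - intercept) / slope)


# Hypothetical bioassay: four doses and the observed proportion killed at each.
doses = [10.0, 100.0, 1000.0, 10000.0]
mortality = [0.10, 0.35, 0.70, 0.95]
a, b = probit_line(doses, mortality)
print("LD50:", lethal_dose(a, b, 0.5), "LD90:", lethal_dose(a, b, 0.9))
```

Because the lethal dose is 10 raised to (z - intercept) / slope, a shallower slope pushes LD90 and LD95 much further from LD50, which is the sensitivity the abstract points to.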
Exploring Race Differences in Correlates of Seniors' Satisfaction with Undergraduate Education
ERIC Educational Resources Information Center
Einarson, Marne K.; Matier, Michael W.
2005-01-01
This study employed multiple linear regression and decision tree analysis to examine the correlates of overall satisfaction with undergraduate education for white, Asian American, Latino and African American seniors enrolled at 17 doctoral/research universities. Satisfaction with the overall quality of instruction and social involvement were the…
Exploring Race Differences in Correlates of Seniors' Satisfaction with Undergraduate Education
ERIC Educational Resources Information Center
Einarson, Marne K.; Matier, Michael W.
2004-01-01
This study employed multiple linear regression and decision tree analysis to examine the correlates of overall satisfaction with undergraduate education for white, Asian American, Hispanic and African American seniors enrolled at 17 research-extensive universities. Satisfaction with the overall quality of instruction and social involvement were…
ERIC Educational Resources Information Center
Everson, Howard T.; And Others
This paper explores the feasibility of neural computing methods such as artificial neural networks (ANNs) and abductory induction mechanisms (AIM) for use in educational measurement. ANNs and AIMS methods are contrasted with more traditional statistical techniques, such as multiple regression and discriminant function analyses, for making…
Predictors of Quality Verbal Engagement in Third-Grade Literature Discussions
ERIC Educational Resources Information Center
Young, Chase
2014-01-01
This study investigates how reading ability and personality traits predict the quality of verbal discussions in peer-led literature circles. Third grade literature discussions were recorded, transcribed, and coded. The coded statements and questions were quantified into a quality of engagement score. Through multiple linear regression, the…
ERIC Educational Resources Information Center
Ostman, Ronald E.; Wagner, Graham A.
1987-01-01
Describes a survey of 724 management students in New Zealand's Technical Correspondence Institute which was conducted to determine whether the introduction of educational technologies could decrease the dropout rate. The multiple linear regression model that was used to analyze the questionnaire responses is presented, and predictor variables are…
Epistemological Predictors of Prospective Biology Teachers' Nature of Science Understandings
ERIC Educational Resources Information Center
Köseoglu, Pinar; Köksal, Mustafa Serdar
2015-01-01
The purpose of this study was to investigate epistemological predictors of nature of science understandings of 281 prospective biology teachers surveyed using the Epistemological Beliefs Scale Regarding Science and the Nature of Science Scale. The findings on multiple linear regression showed that understandings about definition of science and…
Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M
In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines of Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models such as logistic, linear, Cox and Poisson regression are increasingly used in published observational studies, both at international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Publicado por Elsevier España, S.L.U. All rights reserved.
The prediction of intelligence in preschool children using alternative models to regression.
Finch, W Holmes; Chang, Mei; Davis, Andrew S; Holden, Jocelyn E; Rothlisberg, Barbara A; McIntosh, David E
2011-12-01
Statistical prediction of an outcome variable using multiple independent variables is a common practice in the social and behavioral sciences. For example, neuropsychologists are sometimes called upon to provide predictions of preinjury cognitive functioning for individuals who have suffered a traumatic brain injury. Typically, these predictions are made using standard multiple linear regression models with several demographic variables (e.g., gender, ethnicity, education level) as predictors. Prior research has shown conflicting evidence regarding the ability of such models to provide accurate predictions of outcome variables such as full-scale intelligence (FSIQ) test scores. The present study had two goals: (1) to demonstrate the utility of a set of alternative prediction methods that have been applied extensively in the natural sciences and business but have not been frequently explored in the social sciences and (2) to develop models that can be used to predict premorbid cognitive functioning in preschool children. Predictions of Stanford-Binet 5 FSIQ scores for preschool-aged children are used to compare the performance of a multiple regression model with several of these alternative methods. Results demonstrate that classification and regression trees provided more accurate predictions of FSIQ scores than did the more traditional regression approach. Implications of these results are discussed.
Single Image Super-Resolution Using Global Regression Based on Multiple Local Linear Mappings.
Choi, Jae-Seok; Kim, Munchurl
2017-03-01
Super-resolution (SR) has become more vital because of its capability to generate high-quality ultra-high definition (UHD) high-resolution (HR) images from low-resolution (LR) input images. Conventional SR methods entail high computational complexity, which makes them difficult to implement for up-scaling of full-high-definition input images into UHD-resolution images. Nevertheless, our previous super-interpolation (SI) method showed a good compromise between peak signal-to-noise ratio (PSNR) performance and computational complexity. However, since SI only utilizes simple linear mappings, it may fail to precisely reconstruct HR patches with complex texture. In this paper, we present a novel SR method, which inherits the large-to-small patch conversion scheme from SI but uses global regression based on local linear mappings (GLM). Thus, our new SR method is called GLM-SI. In GLM-SI, each LR input patch is divided into 25 overlapped subpatches. Next, based on the local properties of these subpatches, 25 different local linear mappings are applied to the current LR input patch to generate 25 HR patch candidates, which are then regressed into one final HR patch using a global regressor. The local linear mappings are learned cluster-wise in our off-line training phase. The main contribution of this paper is as follows: Previously, linear-mapping-based conventional SR methods, including SI, applied only one simple yet coarse linear mapping to each patch to reconstruct its HR version. In contrast, for each LR input patch, our GLM-SI is the first to apply a combination of multiple local linear mappings, where each local linear mapping is found according to local properties of the current LR patch. Therefore, it can better approximate nonlinear LR-to-HR mappings for HR patches with complex texture.
Experimental results show that the proposed GLM-SI method outperforms most of the state-of-the-art methods, and shows comparable PSNR performance with much lower computational complexity when compared with a super-resolution method based on convolutional neural nets (SRCNN15). Compared with the previous SI method, which is limited to a scale factor of 2, GLM-SI shows superior performance, with PSNR higher by 0.79 dB on average, and can be used for scale factors of 3 or higher.
Survival Data and Regression Models
NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time models (AFT), which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
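For orientation, the two model families described above are commonly written as follows. These are the standard textbook forms, given here as an assumption; the chapter itself may use different notation, and the symbols x, β, γ, σ, ε and λ₀ are not taken from the source.

```latex
% Accelerated failure time (AFT) model: the log survival time is linear in the covariates x,
% with a scale parameter sigma and an error term epsilon from a chosen parametric family.
\log T = \mathbf{x}^{\top}\boldsymbol{\beta} + \sigma\,\varepsilon

% Semiparametric proportional hazards model: the covariates act multiplicatively
% on an unspecified baseline hazard lambda_0(t).
\lambda(t \mid \mathbf{x}) = \lambda_0(t)\,\exp\!\left(\mathbf{x}^{\top}\boldsymbol{\gamma}\right)
```

In the first form the error distribution is fully specified, which makes the model parametric; in the second, only the regression part is parametric while the baseline hazard is left free.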
Locomotive syndrome is associated not only with physical capacity but also degree of depression.
Ikemoto, Tatsunori; Inoue, Masayuki; Nakata, Masatoshi; Miyagawa, Hirofumi; Shimo, Kazuhiro; Wakabayashi, Toshiko; Arai, Young-Chang P; Ushida, Takahiro
2016-05-01
Reports of locomotive syndrome (LS) have recently been increasing. Although physical performance measures for LS have been well investigated to date, studies including psychiatric assessment are still scarce. Hence, the aim of this study was to investigate both physical and mental parameters in relation to presence and severity of LS using a 25-question geriatric locomotive function scale (GLFS-25) questionnaire. 150 elderly people aged over 60 years who were members of our physical-fitness center and displayed well-being were enrolled in this study. Firstly, using the previously determined GLFS-25 cutoff value (=16 points), subjects were divided into two groups accordingly: an LS and non-LS group in order to compare each parameter (age, grip strength, timed-up-and-go test (TUG), one-leg standing with eye open, back muscle and leg muscle strength, degree of depression and cognitive impairment) between the groups using the Mann-Whitney U-test followed by multiple logistic regression analysis. Secondly, a multiple linear regression was conducted to determine which variables showed the strongest correlation with severity of LS. We confirmed 110 people for non-LS (73%) and 40 people for LS using the GLFS-25 cutoff value. Comparative analysis between LS and non-LS revealed significant differences in parameters in age, grip strength, TUG, one-leg standing, back muscle strength and degree of depression (p < 0.006, after Bonferroni correction). Multiple logistic regression revealed that functional decline in grip strength, TUG and one-leg standing and degree of depression were significantly associated with LS. On the other hand, we observed that the significant contributors towards the GLFS-25 score were TUG and degree of depression in multiple linear regression analysis. The results indicate that LS is associated with not only the capacity of physical performance but also the degree of depression although most participants fell under the criteria of LS. 
Copyright © 2016 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
Madarang, Krish J; Kang, Joo-Hyon
2014-06-01
Stormwater runoff has been identified as a source of pollution for the environment, especially for receiving waters. In order to quantify and manage the impacts of stormwater runoff on the environment, predictive models and mathematical models have been developed. Predictive tools such as regression models have been widely used to predict stormwater discharge characteristics. Storm event characteristics, such as antecedent dry days (ADD), have been related to response variables, such as pollutant loads and concentrations. However, whether ADD is an important variable in predicting stormwater discharge characteristics remains controversial across studies. In this study, we examined the accuracy of general linear regression models in predicting discharge characteristics of roadway runoff. A total of 17 storm events were monitored in two highway segments, located in Gwangju, Korea. Data from the monitoring were used to calibrate the United States Environmental Protection Agency's Storm Water Management Model (SWMM). The calibrated SWMM was simulated for 55 storm events, and the results of total suspended solid (TSS) discharge loads and event mean concentrations (EMC) were extracted. From these data, linear regression models were developed. R² and p-values of the regression of ADD for both TSS loads and EMCs were investigated. Results showed that pollutant loads were better predicted than pollutant EMC in the multiple regression models. Regression may not provide the true effect of site-specific characteristics, due to uncertainty in the data. Copyright © 2014 The Research Centre for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. All rights reserved.
Detection of epistatic effects with logic regression and a classical linear regression model.
Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata
2014-02-01
To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, Cockerham's approach is often not capable of identifying them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though a larger number of models has to be considered with the logic regression approach (requiring more stringent multiple testing correction), the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with a simulation study and real data analysis.
Sowande, O S; Oyewale, B F; Iyasere, O S
2010-06-01
The relationships between live weight and eight body measurements of West African Dwarf (WAD) goats were studied using 211 animals under farm conditions. The animals were categorized based on age and sex. Data obtained on height at withers (HW), heart girth (HG), body length (BL), head length (HL), and length of hindquarter (LHQ) were fitted into simple linear, allometric, and multiple-regression models to predict live weight from the body measurements according to age group and sex. Results showed that live weight, HG, BL, LHQ, HL, and HW increased with the age of the animals. In the multiple-regression model, HG and HL best fit the model for goat kids; HG, HW, and HL for goats aged 13-24 months; while HG, LHQ, HW, and HL best fit the model for goats aged 25-36 months. Coefficient of determination (R²) values for linear and allometric models for predicting the live weight of WAD goats increased with age in all the body measurements, with HG being the most satisfactory single measurement in predicting the live weight of WAD goats. Sex had significant influence on the model, with R² values consistently higher in females except in the models for LHQ and HW.
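The allometric model named above is usually taken to be a power law, weight = a × measurement^b, which becomes a straight line after taking logarithms of both sides. The sketch below fits it that way; the power-law form is an assumption about what the authors fitted, and the function name and the heart-girth and weight values are hypothetical.

```python
import math


def fit_allometric(measurement, weight):
    """Fit weight = a * measurement**b by least squares on the log-log scale."""
    x = [math.log(m) for m in measurement]
    y = [math.log(w) for w in weight]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = math.exp(my - b * mx)
    return a, b


# Hypothetical heart girth (cm) and live weight (kg) for five animals.
heart_girth = [45.0, 50.0, 56.0, 61.0, 66.0]
live_weight = [8.5, 11.0, 15.2, 19.0, 23.5]
a, b = fit_allometric(heart_girth, live_weight)
print(f"weight ~ {a:.5f} * HG^{b:.2f}")
```

Fitting on the log scale minimizes relative rather than absolute error, so the coefficients can differ slightly from a direct nonlinear fit of the same power law.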
Mathur, Praveen; Sharma, Sarita; Soni, Bhupendra
2010-01-01
In the present work, an attempt is made to formulate multiple regression equations using the all-possible-regressions method for groundwater quality assessment of the Ajmer-Pushkar railway line region in pre- and post-monsoon seasons. Correlation studies revealed the existence of linear relationships (r > 0.7) for electrical conductivity (EC), total hardness (TH) and total dissolved solids (TDS) with other water quality parameters. The highest correlation was found between EC and TDS (r = 0.973). EC showed highly significant positive correlation with Na, K, Cl, TDS and total solids (TS). TH showed the highest correlation with Ca and Mg. TDS showed significant correlation with Na, K, SO4, PO4 and Cl. The study indicated that most of the contamination present was water soluble or ionic in nature. Mg was present as MgCl2; K mainly as KCl and K2SO4; and Na was present as the salts of Cl, SO4 and PO4. On the other hand, F and NO3 showed no significant correlations. The r² values and F values (at the 95% confidence limit, alpha = 0.05) for the modelled equations indicated a high degree of linearity among independent and dependent variables. Also, the percentage error between calculated and experimental values was contained within the ±15% limit.
NASA Astrophysics Data System (ADS)
Chen, Hua-cai; Chen, Xing-dan; Lu, Yong-jun; Cao, Zhi-qiang
2006-01-01
Near infrared (NIR) reflectance spectroscopy was used to develop a fast determination method for total ginsenosides in Ginseng (Panax Ginseng) powder. The spectra were analyzed with the multiplicative signal correction (MSC) correlation method. The spectral regions best correlated with the total ginsenosides content were 1660 nm~1880 nm and 2230 nm~2380 nm. The NIR calibration models of ginsenosides were built with multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) regression respectively. The results showed that the calibration model built with PLS combined with MSC and the optimal spectrum region was the best one. The correlation coefficient and the root mean square error of correction validation (RMSEC) of the best calibration model were 0.98 and 0.15% respectively. The optimal spectrum region for calibration was 1204 nm~2014 nm. The result suggested that using NIR to rapidly determine the total ginsenosides content in ginseng powder is feasible.
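The MSC pre-processing step mentioned above can be sketched as follows, assuming the common formulation in which each spectrum is regressed on a reference spectrum (often the mean spectrum) and then corrected by removing the fitted offset and dividing by the fitted slope. The function name and the numbers are hypothetical; the paper's own implementation is not described in the abstract.

```python
def msc(spectrum, reference):
    """Correct one spectrum against a reference: fit spectrum = a + b * reference,
    then return (spectrum - a) / b."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_spec = sum(spectrum) / n
    b = sum(
        (r - mean_ref) * (s - mean_spec) for r, s in zip(reference, spectrum)
    ) / sum((r - mean_ref) ** 2 for r in reference)
    a = mean_spec - b * mean_ref
    return [(s - a) / b for s in spectrum]


# Hypothetical reference spectrum and a measured spectrum distorted by an
# additive offset and a multiplicative scatter factor.
reference = [0.20, 0.40, 0.90, 0.50, 0.30]
measured = [0.05 + 1.3 * r for r in reference]
print(msc(measured, reference))
```

After correction, spectra that differ only by offset and scale collapse onto the reference, so a subsequent MLR, PCR or PLS calibration sees mainly the chemical variation.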
On the Stationarity of Multiple Autoregressive Approximants: Theory and Algorithms
1976-08-01
a I (3.4) Hannan and Terrell (1972) consider problems of a similar nature. Efficient estimates A(1), ..., A(p), and i of A(1), ..., A(p) and ... "Autoregressive model fitting for control," Ann. Inst. Statist. Math., 23, 163-180. Hannan, E. J. (1970), Multiple Time Series, New York, John Wiley ... Hannan, E. J. and Terrell, R. D. (1972), "Time series regression with linear constraints," International Economic Review, 13, 189-200. Masani, P
Hsu, Ruey-Fen; Ho, Chi-Kung; Lu, Sheng-Nan; Chen, Shun-Sheng
2010-10-01
An objective investigation is needed to verify the existence and severity of hearing impairments resulting from work-related, noise-induced hearing loss in arbitration of medicolegal aspects. We investigated the accuracy of multiple-frequency auditory steady-state responses (Mf-ASSRs) between subjects with sensorineural hearing loss (SNHL) with and without occupational noise exposure. Cross-sectional study. Tertiary referral medical centre. Pure-tone audiometry and Mf-ASSRs were recorded in 88 subjects (34 patients had occupational noise-induced hearing loss [NIHL], 36 patients had SNHL without noise exposure, and 18 volunteers were normal controls). Inter- and intragroup comparisons were made. A predicting equation was derived using multiple linear regression analysis. ASSRs and pure-tone thresholds (PTTs) showed a strong correlation for all subjects (r = .77 to .94). The relationship is demonstrated by a regression equation. The differences between the ASSR and PTT were significantly higher for the NIHL group than for the subjects with non-noise-induced SNHL (p < .001). Mf-ASSR is a promising tool for objectively evaluating hearing thresholds. Predictive value may be lower in subjects with occupational hearing loss. Regardless of carrier frequencies, the severity of hearing loss affects the steady-state response. Moreover, the ASSR may assist in detecting noise-induced injury of the auditory pathway. A multiple linear regression equation that accurately predicts thresholds and takes all effect factors into consideration was presented.
Olsson, A; Oturai, D B; Sørensen, P S; Oturai, P S; Oturai, A B
2015-10-01
Patients with multiple sclerosis (MS) are at increased risk of reduced bone mineral density (BMD). A contributing factor might be treatment with high-dose glucocorticoids (GCs). The objective of this paper is to assess bone mass in patients with MS and evaluate the importance of short-term, high-dose GC treatment and other risk factors that affect BMD in patients with MS. A total of 260 patients with MS received short-term high-dose GC treatment and had their BMD measured by dual x-ray absorptiometry. BMD was compared to a healthy age-matched reference population (Z-scores). Data regarding GCs, age, body mass index (BMI), serum 25(OH)D, disease duration and severity were collected retrospectively and analysed in a multiple linear regression analysis to evaluate the association between each risk factor and BMD. Osteopenia was present in 38% and osteoporosis in 7% of the study population. Mean Z-score was significantly below zero, indicating a decreased BMD in our MS patients. Multiple linear regression analysis showed no significant association between GCs and BMD. In contrast, age, BMI and disease severity were independently associated with both lumbar and femoral BMD. Reduced BMD was prevalent in patients with MS. GC treatment appears not to be the primary underlying cause of secondary osteoporosis in MS patients. © The Author(s), 2015.
Fernández-Fernández, Mario; Rodríguez-González, Pablo; García Alonso, J Ignacio
2016-10-01
We have developed a novel, rapid and easy calculation procedure for Mass Isotopomer Distribution Analysis based on multiple linear regression which allows the simultaneous calculation of the precursor pool enrichment and the fraction of newly synthesized labelled proteins (fractional synthesis) using linear algebra. To test this approach, we used the peptide RGGGLK as a model tryptic peptide containing three subunits of glycine. We selected glycine labelled in two ¹³C atoms (¹³C₂-glycine) as labelled amino acid to demonstrate that spectral overlap is not a problem in the proposed methodology. The developed methodology was tested first in vitro by changing the precursor pool enrichment from 10 to 40% of ¹³C₂-glycine. Secondly, a simulated in vivo synthesis of proteins was designed by combining the natural abundance RGGGLK peptide and 10 or 20% ¹³C₂-glycine at 1:1, 1:3 and 3:1 ratios. Precursor pool enrichments and fractional synthesis values were calculated with satisfactory precision and accuracy using a simple spreadsheet. This novel approach can provide a relatively rapid and easy means to measure protein turnover based on stable isotope tracers. Copyright © 2016 John Wiley & Sons, Ltd.
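The general idea of unmixing isotope patterns by linear regression can be sketched in a simplified two-component form: the observed isotopomer distribution is modelled as a weighted sum of a natural-abundance pattern and a labelled pattern, and the two weights are the regression coefficients. This is an illustrative simplification, not the authors' full procedure (which also recovers the precursor pool enrichment); the function name and all abundances are hypothetical.

```python
def unmix(observed, natural, labelled):
    """Least-squares fit of observed = a * natural + b * labelled (no intercept),
    solved through the 2x2 normal equations."""
    s_nn = sum(n * n for n in natural)
    s_ll = sum(l * l for l in labelled)
    s_nl = sum(n * l for n, l in zip(natural, labelled))
    s_no = sum(n * o for n, o in zip(natural, observed))
    s_lo = sum(l * o for l, o in zip(labelled, observed))
    det = s_nn * s_ll - s_nl ** 2
    a = (s_no * s_ll - s_lo * s_nl) / det
    b = (s_lo * s_nn - s_no * s_nl) / det
    return a, b


# Hypothetical relative abundances over four mass channels.
natural = [0.70, 0.22, 0.06, 0.02]
labelled = [0.10, 0.15, 0.45, 0.30]
observed = [0.55, 0.20, 0.16, 0.09]

a, b = unmix(observed, natural, labelled)
fraction_labelled = b / (a + b)  # share of the signal attributed to labelled peptide
print(fraction_labelled)
```

Because the fit uses the whole pattern rather than single peaks, overlapping mass channels between the two components do not prevent their separation, provided the two patterns are not proportional to each other.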
Genomic prediction based on data from three layer lines using non-linear regression models.
Huang, Heyun; Windig, Jack J; Vereijken, Addie; Calus, Mario P L
2014-11-06
Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values. When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction. Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. 
This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multi-line training dataset, non-linear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous.
NASA Astrophysics Data System (ADS)
Bradshaw, Tyler; Fu, Rau; Bowen, Stephen; Zhu, Jun; Forrest, Lisa; Jeraj, Robert
2015-07-01
Dose painting relies on the ability of functional imaging to identify resistant tumor subvolumes to be targeted for additional boosting. This work assessed the ability of FDG, FLT, and Cu-ATSM PET imaging to predict the locations of residual FDG PET in canine tumors following radiotherapy. Nineteen canines with spontaneous sinonasal tumors underwent PET/CT imaging with radiotracers FDG, FLT, and Cu-ATSM prior to hypofractionated radiotherapy. Therapy consisted of 10 fractions of 4.2 Gy to the sinonasal cavity with or without an integrated boost of 0.8 Gy to the GTV. Patients had an additional FLT PET/CT scan after fraction 2, a Cu-ATSM PET/CT scan after fraction 3, and follow-up FDG PET/CT scans after radiotherapy. Following image registration, simple and multiple linear and logistic voxel regressions were performed to assess how well pre- and mid-treatment PET imaging predicted post-treatment FDG uptake. R² and pseudo R² were used to assess the goodness of fits. For simple linear regression models, regression coefficients for all pre- and mid-treatment PET images were significantly positive across the population (P < 0.05). However, there was large variability among patients in goodness of fits: R² ranged from 0.00 to 0.85, with a median of 0.12. Results for logistic regression models were similar. Multiple linear regression models resulted in better fits (median R² = 0.31), but there was still large variability between patients in R². The R² from regression models for different predictor variables were highly correlated across patients (R ≈ 0.8), indicating tumors that were poorly predicted with one tracer were also poorly predicted by other tracers. In conclusion, the high inter-patient variability in goodness of fits indicates that PET was able to predict locations of residual tumor in some patients, but not others. This suggests not all patients would be good candidates for dose painting based on a single biological target.
Kumar, Rajesh; Dogra, Vishal; Rani, Khushbu; Sahu, Kanti
2017-01-01
Background: Knowledge of district-level determinants of total fertility rate in the Empowered Action Group states of India can help ongoing population stabilization programs in India. Objective: The present study assesses the role of district-level determinants in predicting total fertility rate among districts of the Empowered Action Group states of India. Material and Methods: Data from the Annual Health Survey (2011-12) were analysed using STATA and R software packages. Multiple linear regression models were built and evaluated using the Akaike Information Criterion. For further understanding, recursive partitioning was used to prepare a regression tree. Results: Female married illiteracy was positively associated with total fertility rate and explained more than half (53%) of the variance. Under the multiple linear regression model, married illiteracy, infant mortality rate, antenatal care registration, household size, median age of live birth and sex ratio explained 70% of the total variance in total fertility rate. In the regression tree, female married illiteracy was the root node, and splits at 42% determined TFR <= 2.7. The next left-side branch was again married illiteracy, with splits at 23% to determine TFR <= 2.1. Conclusion: We conclude that female married illiteracy is one of the most important determinants explaining total fertility rate among the districts of the Empowered Action Group states. A focus on female literacy is required to stabilize population growth in the long run. PMID:29416999
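The abstract above evaluates competing multiple linear regression models with the Akaike Information Criterion. As a minimal sketch of that idea, and not the authors' STATA/R workflow, the Python below compares three candidate models on synthetic data. The predictors `x1` and `x2`, the helper `gaussian_aic`, and all numbers are hypothetical; the AIC is computed for a Gaussian linear model only up to an additive constant, which is enough for ranking models fitted to the same response.

```python
import numpy as np

def gaussian_aic(X, y):
    """AIC (up to an additive constant) for an OLS fit of y on X.

    X must already contain an intercept column if one is wanted.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    # k regression coefficients plus one error-variance parameter
    return n * np.log(rss / n) + 2 * (k + 1)

# Synthetic data: x1 drives the response, x2 is pure noise (both hypothetical).
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(scale=0.5, size=n)

ones = np.ones(n)
candidates = {
    "intercept only": np.column_stack([ones]),
    "x1": np.column_stack([ones, x1]),
    "x1 + x2": np.column_stack([ones, x1, x2]),
}
aic = {name: gaussian_aic(X, y) for name, X in candidates.items()}
best = min(aic, key=aic.get)  # candidate with the lowest AIC
print(aic, best)
```

Lower AIC is better; each extra coefficient costs 2 units, so a predictor is retained only if it reduces the residual sum of squares enough to pay for itself.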
Should a First Course in ANOVA Be Taught Through MLR?
ERIC Educational Resources Information Center
Williams, John D.
Before implementing a course in the analysis of variance (ANOVA) taught through multiple linear regression, several concerns must be addressed. Adequate computer facilities that are available to students on a low-cost or cost-free basis are necessary; also students must be able to meaningfully communicate with their major advisor regarding their…
Linda H. Geiser; Sarah E. Jovan; Doug A. Glavich; Matthew K. Porter
2010-01-01
Critical loads (CLs) define maximum atmospheric deposition levels apparently preventative of ecosystem harm. We present first nitrogen CLs for northwestern North America's maritime forests. Using multiple linear regression, we related epiphytic-macrolichen community composition to: 1) wet deposition from the National Atmospheric Deposition Program, 2) wet, dry,...
Do Nondomestic Undergraduates Choose a Major Field in Order to Maximize Grade Point Averages?
ERIC Educational Resources Information Center
Bergman, Matthew E.; Fass-Holmes, Barry
2016-01-01
The authors investigated whether undergraduates attending an American West Coast public university who were not U.S. citizens (nondomestic) maximized their grade point averages (GPA) through their choice of major field. Multiple regression hierarchical linear modeling analyses showed that major field's effect size was small for these…
ERIC Educational Resources Information Center
Romero, Andrea J.; Ruiz, Myrna
2007-01-01
We examined coping with risky behaviors (cigarettes, alcohol/drugs, yelling/ hitting, and anger), familism (family proximity and parental closeness) and parental monitoring (knowledge and discipline) in a sample of 56 adolescents (11-15 years old) predominantly of Mexican descent at two time points. Multiple linear regression analysis indicated…
Modeling Success: Using Preenrollment Data to Identify Academically At-Risk Students
ERIC Educational Resources Information Center
Gansemer-Topf, Ann M.; Compton, Jonathan; Wohlgemuth, Darin; Forbes, Greg; Ralston, Ekaterina
2015-01-01
Improving student success and degree completion is one of the core principles of strategic enrollment management. To address this principle, institutional data were used to develop a statistical model to identify academically at-risk students. The model employs multiple linear regression techniques to predict students at risk of earning below a…
The Prediction of Achievement and Time Spent in Instruction in a Self-Paced Individualized Course.
ERIC Educational Resources Information Center
Franklin, Thomas E.
Multiple linear regressions were employed to determine the relative contributions of cognitive and affective variables accounting for variance in college students' achievement and amount of time taken to complete a self-paced, individualized course. Study habits and attitudes (SSHA) made greater relative contributions to explaining total course…
Touch Processing and Social Behavior in ASD
ERIC Educational Resources Information Center
Miguel, Helga O.; Sampaio, Adriana; Martínez-Regueiro, Rocío; Gómez-Guerrero, Lorena; López-Dóriga, Cristina Gutiérrez; Gómez, Sonia; Carracedo, Ángel; Fernández-Prieto, Montse
2017-01-01
Abnormal patterns of touch processing have been linked to core symptoms in ASD. This study examined the relation between tactile processing patterns and social problems in 44 children and adolescents with ASD, aged 6-14 (M = 8.39 ± 2.35). Multiple linear regression indicated significant associations between touch processing and social problems. No…
ERIC Educational Resources Information Center
Roulette-McIntyre, Ovella; Bagaka's, Joshua G.; Drake, Daniel D.
2005-01-01
This study identified parental practices that relate positively to high school students' academic performance. Parents of 643 high school students participated in the study. Data analysis, using a multiple linear regression model, shows parent-school connection, student gender, and race are significant predictors of student academic performance.…
ERIC Educational Resources Information Center
Dubnjakovic, Ana
2012-01-01
The current study investigates factors influencing increase in reference transactions in a typical week in academic libraries across the United States of America. Employing multiple regression analysis and general linear modeling, variables of interest from the "Academic Library Survey (ALS) 2006" survey (sample size 3960 academic libraries) were…
Patterns of Library Use by Undergraduate Students in a Chilean University
ERIC Educational Resources Information Center
Jara, Magdalena; Clasing, Paula; Gonzalez, Carlos; Montenegro, Maximiliano; Kelly, Nick; Alarcón, Rosa; Sandoval, Augusto; Saurina, Elvira
2017-01-01
This paper explores the patterns of use of print materials and digital resources in an undergraduate library in a Chilean university, by the students' discipline and year of study. A quantitative analysis was carried out, including descriptive analysis of contingency tables, chi-squared tests, t-tests, and multiple linear regressions. The results…
Gender/racial Differences in Jock Identity, Dating, and Adolescent Sexual Risk.
ERIC Educational Resources Information Center
Miller, Kathleen E.; Farrell, Michael P.; Barnes, Grace M.; Melnick, Merrill J.; Sabo, Don
2005-01-01
Despite recent declines in overall sexual activity, sexual risk-taking remains a substantial danger to US youth. Existing research points to athletic participation as a promising venue for reducing these risks. Linear regressions and multiple analyses of covariance were performed on a longitudinal sample of nearly 600 Western New York adolescents…
Predictors of Career Adaptability Skill among Higher Education Students in Nigeria
ERIC Educational Resources Information Center
Ebenehi, Amos Shaibu; Rashid, Abdullah Mat; Bakar, Ab Rahim
2016-01-01
This paper examined predictors of career adaptability skill among higher education students in Nigeria. A sample of 603 higher education students randomly selected from six colleges of education in Nigeria participated in this study. A set of self-reported questionnaire was used for data collection, and multiple linear regression analysis was used…
Penalized nonparametric scalar-on-function regression via principal coordinates
Reiss, Philip T.; Miller, David L.; Wu, Pei-Shien; Hua, Wen-Yu
2016-01-01
A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, principal coordinate ridge regression, with dynamic time warping distance used to define the principal coordinates, is shown to outperform a functional generalized linear model. PMID:29217963
Shabri, Ani; Samsudin, Ruhaidah
2014-01-01
Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelets and multiple linear regression (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first used to decompose an original time series into several subseries at different scales. Then, principal component analysis (PCA) is used to process the subseries data in MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to select the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily West Texas Intermediate (WTI) crude oil market has been used as the case study. The time series prediction performance of the WMLR model is compared with that of the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series. PMID:24895666
NASA Astrophysics Data System (ADS)
George, Anna Ray Bayless
A study was conducted to determine the relationship between the credentials held by science teachers at schools that administered the Science Texas Assessment of Knowledge and Skills (Science TAKS), the state standardized exam in science given at grade 11, and student performance on that exam. Years of teaching experience, teacher certification type(s), highest degree level held, teacher and school demographic information, and the percentage of students who met the passing standard on the Science TAKS were obtained through a public records request to the Texas Education Agency (TEA) and the State Board for Educator Certification (SBEC). Analysis was performed using canonical correlation analysis and multiple linear regression analysis. The results of the multiple linear regression analysis indicate that a larger percentage of students met the passing standard on the Science TAKS at schools in which a large portion of the high school science teachers held post-baccalaureate degrees, held elementary and physical science certifications, and had 11-20 years of teaching experience.
Mental ability and psychological work performance in Chinese workers.
Zhong, Fei; Yano, Eiji; Lan, Yajia; Wang, Mianzhen; Wang, Zhiming; Wang, Xiaorong
2006-10-01
This study explored the relationship among mental ability, occupational stress, and psychological work performance in Chinese workers, and sought to identify relevant modifiers of mental ability and psychological work performance. Psychological Stress Intensity (PSI), psychological work performance, and mental ability (Mental Function Index, MFI) were determined among 485 Chinese workers (aged 33 to 62 yr, 65% men) in varied occupations. The Occupational Stress Questionnaire (OSQ) and 3 tests of mental ability (immediate memory, digit span, and cipher decoding) were used. The relationship between mental ability and psychological work performance was analyzed with a multiple linear regression approach. PSI, MFI, and psychological work performance differed significantly among work types and educational level groups (p<0.01). Multiple linear regression analysis showed that MFI was significantly related to gender, age, educational level, and work type. Higher MFI and lower PSI predicted better psychological work performance, even after adjustment for gender, age, educational level, and work type. The study suggests that occupational stress and low mental ability are important predictors of poor psychological work performance, which is modified by both gender and educational level.
Association of Alimentary Factors and Nutritional Status with Caries in Children of Leon, Mexico.
Guizar, Juan Manuel; Muñoz, Nathalie; Amador, Norma; Garcia, Gabriela
To determine the association between types of food consumed, nutritional status (BMI) and caries in schoolchildren. A cross-sectional study was performed with 224 schoolchildren 6 to 12 years of age. DMFT/dmft indices, level of oral hygiene, nutritional status as quantified by BMI and types of food consumed were determined in all participants. Data were analysed using multiple linear regression with significance set at p < 0.05. Caries prevalence was 36%. In the multiple linear regression analysis adjusted for BMI, variables related to a higher number of caries were younger age and lower intake of vitamin D, calcium and fiber, with higher consumption of phosphorus and carbohydrates (R² = 0.30; p < 0.0001 for the model). Sweetened soft drinks and chewy candy were risk factors for higher caries prevalence, while consumption of milk and carrots was protective. Caries in schoolchildren is highly prevalent in this community and is related to younger age and lower intake of vitamin D, calcium and fiber, but a higher consumption of phosphorus and carbohydrates. No relationship was found between caries and nutritional status.
Golmohammadi, Hassan
2009-11-30
A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structure of 141 organic compounds to their octanol-water partition coefficients (log P(o/w)). A genetic algorithm was applied as a variable selection tool. Modeling of log P(o/w) of these compounds as a function of theoretically derived descriptors was established by multiple linear regression (MLR), partial least squares (PLS), and artificial neural network (ANN). The best selected descriptors that appear in the models are: atomic charge weighted partial positively charged surface area (PPSA-3), fractional atomic charge weighted partial positive surface area (FPSA-3), minimum atomic partial charge (Qmin), molecular volume (MV), total dipole moment of the molecule (mu), maximum antibonding contribution of a molecular orbital in the molecule (MAC), and maximum free valency of a C atom in the molecule (MFV). The results showed the ability of the developed artificial neural network to predict partition coefficients of organic compounds. The results also revealed the superiority of ANN over the MLR and PLS models. Copyright 2009 Wiley Periodicals, Inc.
Modeling Longitudinal Data Containing Non-Normal Within Subject Errors
NASA Technical Reports Server (NTRS)
Feiveson, Alan; Glenn, Nancy L.
2013-01-01
The mission of the National Aeronautics and Space Administration's (NASA) human research program is to advance safe human spaceflight. This involves conducting experiments, collecting data, and analyzing data. The data are longitudinal and come from relatively few subjects, typically 10 to 20. A longitudinal study is an investigation in which participant outcomes and possibly treatments are collected at multiple follow-up times. Standard statistical designs such as mean regression with random effects and mixed-effects regression are inadequate for such data because the population is typically not approximately normally distributed. Hence, more advanced data analysis methods are necessary. This research focuses on four such methods for longitudinal data analysis: the recently proposed linear quantile mixed models (lqmm) by Geraci and Bottai (2013), quantile regression, multilevel mixed-effects linear regression, and robust regression. This research also provides computational algorithms for longitudinal data that scientists can directly use for human spaceflight and other longitudinal data applications, then presents statistical evidence that verifies which method is best for specific situations. This advances the study of longitudinal data in a broad range of applications, including applications in the science, technology, engineering and mathematics fields.
Application of XGBoost algorithm in hourly PM2.5 concentration prediction
NASA Astrophysics Data System (ADS)
Pan, Bingyue
2018-02-01
To address the prediction of hourly PM2.5 concentration in China, this paper applies the XGBoost (Extreme Gradient Boosting) algorithm. Air quality monitoring data from Tianjin city were analyzed using the XGBoost algorithm. The prediction performance of the XGBoost method is evaluated by comparing observed and predicted PM2.5 concentrations using three measures of forecast accuracy. The XGBoost method is also compared with the random forest algorithm, multiple linear regression, decision tree regression and support vector machines for regression. The results demonstrate that the XGBoost algorithm outperforms the other data mining methods.
Chaurasia, Ashok; Harel, Ofer
2015-02-10
Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data. Copyright © 2014 John Wiley & Sons, Ltd.
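The abstract above builds its tests on the coefficient of determination. For orientation only, the sketch below shows the textbook complete-data relationship between R² and the global and partial F statistics in ordinary least squares; it is not the authors' multiple-imputation procedure, and the function names and arguments are illustrative assumptions.

```python
def global_f(r2, n, p):
    """Global F statistic for an OLS model with p predictors and n observations.

    Tests H0: all p slope coefficients are zero; df = (p, n - p - 1).
    """
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

def partial_f(r2_full, r2_reduced, n, p_full, q):
    """Partial F statistic for dropping q predictors from a full model.

    r2_full and r2_reduced are the R-squared values of the nested models;
    p_full is the number of predictors in the full model;
    df = (q, n - p_full - 1).
    """
    return ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / (n - p_full - 1))

# Worked example: R^2 = 0.5 with n = 23 observations and p = 2 predictors.
print(global_f(0.5, 23, 2))           # (0.5/2) / (0.5/20)
print(partial_f(0.5, 0.4, 23, 2, 1))  # (0.1/1) / (0.5/20)
```

Because both statistics are scalar functions of R², no matrix inversion is needed once the R² values of the nested models are known, which is the computational appeal the abstract alludes to.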
Robbins, Blaine
2013-01-01
Sociologists, political scientists, and economists all suggest that culture plays a pivotal role in the development of large-scale cooperation. In this study, I used generalized trust as a measure of culture to explore if and how culture impacts intentional homicide, my operationalization of cooperation. I compiled multiple cross-national data sets and used pooled time-series linear regression, single-equation instrumental-variables linear regression, and fixed- and random-effects estimation techniques on an unbalanced panel of 118 countries and 232 observations spread over a 15-year time period. Results suggest that culture and large-scale cooperation form a tenuous relationship, while economic factors such as development, inequality, and geopolitics appear to drive large-scale cooperation. PMID:23527211
Almalik, Osama; Nijhuis, Michiel B; van den Heuvel, Edwin R
2014-01-01
Shelf-life estimation usually requires that at least three registration batches are tested for stability at multiple storage conditions. The shelf-life estimates are often obtained by linear regression analysis per storage condition, an approach implicitly suggested by ICH guideline Q1E. A linear regression analysis combining all data from multiple storage conditions was recently proposed in the literature when variances are homogeneous across storage conditions. The combined analysis is expected to perform better than the separate analysis per storage condition, since pooling data would lead to an improved estimate of the variation and higher numbers of degrees of freedom, but this is not evident for shelf-life estimation. Indeed, the two approaches treat the observed initial batch results, the intercepts in the model, and poolability of batches differently, which may eliminate or reduce the expected advantage of the combined approach with respect to the separate approach. Therefore, a simulation study was performed to compare the distribution of simulated shelf-life estimates on several characteristics between the two approaches and to quantify the difference in shelf-life estimates. In general, the combined statistical analysis does estimate the true shelf life more consistently and precisely than the analysis per storage condition, but it did not outperform the separate analysis in all circumstances.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jahandideh, Sepideh; Jahandideh, Samad; Asadabadi, Ebrahim Barzegari
2009-11-15
Predicting the amount of hospital waste produced is helpful for the storage, transportation and disposal stages of hospital waste management. Accordingly, two predictive models, artificial neural networks (ANNs) and multiple linear regression (MLR), were applied to predict the rate of medical waste generation, both in total and for the separate categories of sharp, infectious and general waste. In this study, a 5-fold cross-validation procedure on a database of 50 hospitals in Fars province (Iran) was used to verify the performance of the models. Three performance measures, MAR, RMSE and R², were used to evaluate the models. The MLR, as a conventional model, obtained poor values on the performance measures. However, MLR identified hospital capacity and bed occupancy as the more significant parameters. On the other hand, ANNs, a more powerful model that had not previously been applied to predicting the rate of medical waste generation, showed high performance measure values, in particular an R² of 0.99, confirming the good fit to the data. Such satisfactory results could be attributed to the non-linear nature of ANNs, which allows independent variables to be related to dependent ones non-linearly. In conclusion, the results showed that our ANN-based modeling approach is very promising and may play a useful role in developing a more cost-effective strategy for waste management in the future.
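The evaluation protocol described above (5-fold cross-validation scored with RMSE and R²) can be sketched for the MLR baseline as follows. This is an illustrative outline under stated assumptions, not the authors' code: the data are synthetic, the two predictors merely stand in for quantities such as hospital capacity and bed occupancy, and `kfold_mlr` is a hypothetical helper name.

```python
import numpy as np

def kfold_mlr(X, y, k=5, seed=0):
    """Pooled out-of-fold RMSE and R^2 for an OLS model with intercept."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    pred = np.empty(len(y))
    for test_idx in np.array_split(idx, k):
        train_idx = np.setdiff1d(idx, test_idx)
        # Fit on the training folds only
        A = np.column_stack([np.ones(train_idx.size), X[train_idx]])
        beta, *_ = np.linalg.lstsq(A, y[train_idx], rcond=None)
        # Predict the held-out fold
        B = np.column_stack([np.ones(test_idx.size), X[test_idx]])
        pred[test_idx] = B @ beta
    rmse = float(np.sqrt(np.mean((y - pred) ** 2)))
    r2 = float(1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2))
    return rmse, r2

# Synthetic stand-in for a 50-row data set with two predictors (hypothetical).
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(50, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.2, size=50)
rmse, r2 = kfold_mlr(X, y, k=5)
print(f"5-fold RMSE = {rmse:.3f}, R^2 = {r2:.3f}")
```

Scoring only held-out predictions keeps the comparison between a flexible model such as an ANN and a linear baseline from rewarding overfitting, which matters with as few as 50 observations.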
Normality of raw data in general linear models: The most widespread myth in statistics
Kery, Marc; Hatfield, Jeff S.
2003-01-01
In years of statistical consulting for ecologists and wildlife biologists, by far the most common misconception we have come across has been the one about normality in general linear models. These comprise a very large part of the statistical models used in ecology and include t tests, simple and multiple linear regression, polynomial regression, and analysis of variance (ANOVA) and covariance (ANCOVA). There is a widely held belief that the normality assumption pertains to the raw data rather than to the model residuals. We suspect that this error may also occur in countless published studies, whenever the normality assumption is tested prior to analysis. This may lead to the use of nonparametric alternatives (if there are any), when parametric tests would indeed be appropriate, or to use of transformations of raw data, which may introduce hidden assumptions such as multiplicative effects on the natural scale in the case of log-transformed data. Our aim here is to dispel this myth. We very briefly describe relevant theory for two cases of general linear models to show that the residuals need to be normally distributed if tests requiring normality are to be used, such as t and F tests. We then give two examples demonstrating that the distribution of the response variable may be nonnormal, and yet the residuals are well behaved. We do not go into the issue of how to test normality; instead we display the distributions of response variables and residuals graphically.
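The central point of the abstract above, that the normality assumption concerns the residuals rather than the raw response, can be illustrated with a small simulation. The sketch below is an assumption-laden toy example and not one of the authors' two examples: a two-group design with a large group effect gives a strongly bimodal response, yet the residuals from the fitted linear model are close to normal. Excess kurtosis (0 for a normal distribution) is used here only as a convenient numeric summary.

```python
import numpy as np

def excess_kurtosis(v):
    """Sample excess kurtosis; approximately 0 for normally distributed data."""
    z = (v - v.mean()) / v.std()
    return float(np.mean(z ** 4) - 3.0)

rng = np.random.default_rng(1)
n_per_group = 1000

# Two groups whose means differ by 10 error standard deviations (hypothetical).
group = np.repeat([0.0, 1.0], n_per_group)
y = 10.0 * group + rng.normal(size=group.size)

# Fit the general linear model y = b0 + b1 * group by least squares.
X = np.column_stack([np.ones_like(group), group])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

kurt_y = excess_kurtosis(y)              # far from 0: raw response is bimodal
kurt_resid = excess_kurtosis(residuals)  # near 0: residuals look normal
print(f"response: {kurt_y:.2f}, residuals: {kurt_resid:.2f}")
```

A normality test applied to `y` would reject decisively here even though the model assumptions hold, which is exactly the misdiagnosis the abstract warns against.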
Byun, Bo-Ram; Kim, Yong-Il; Maki, Koutaro; Son, Woo-Sung
2015-01-01
This study aimed to examine the correlation between skeletal maturation status and parameters from the odontoid process/body of the second vertebra and the bodies of the third and fourth cervical vertebrae, and simultaneously to build multiple regression models able to estimate skeletal maturation status in Korean girls. Hand-wrist radiographs and cone beam computed tomography (CBCT) images were obtained from 74 Korean girls (6–18 years of age). CBCT-generated cervical vertebral maturation (CVM) was used to demarcate the odontoid process and the body of the second cervical vertebra, based on the dentocentral synchondrosis. Correlation coefficient analysis and multiple linear regression analysis were used for each parameter of the cervical vertebrae (P < 0.05). Forty-seven of 64 parameters from CBCT-generated CVM (independent variables) exhibited statistically significant correlations (P < 0.05). The multiple regression model with the greatest R² had six parameters (PH2/W2, UW2/W2, (OH+AH2)/LW2, UW3/LW3, D3, and H4/W4) as independent variables with a variance inflation factor (VIF) of <2. CBCT-generated CVM was able to include parameters from the second cervical vertebral body and odontoid process, respectively, for the multiple regression models. This suggests that quantitative analysis might be used to estimate skeletal maturation status. PMID:25878721
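The abstract above screens predictors with a variance inflation factor (VIF) below 2. As a hedged illustration of how such a check can be computed, the sketch below uses the standard identity that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix, equivalently 1/(1 - R_j²) where R_j² comes from regressing predictor j on the others. The toy matrices are made up and are unrelated to the study's measurements.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (no intercept column).

    Uses the identity VIF_j = [inverse of the correlation matrix]_jj.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Uncorrelated predictors: every VIF equals 1.
X_orthogonal = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])

# Nearly collinear predictors: VIFs far above a cutoff of 2.
X_collinear = np.column_stack([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 1.9, 3.2, 3.8, 5.1],
])

print(vif(X_orthogonal))
print(vif(X_collinear))
```

A VIF of 1 means a predictor is uncorrelated with the others; a cutoff as strict as 2 tolerates at most an R_j² of 0.5 between any predictor and the rest.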
Statistical Methods for Generalized Linear Models with Covariates Subject to Detection Limits.
Bernhardt, Paul W; Wang, Huixia J; Zhang, Daowen
2015-05-01
Censored observations are a common occurrence in biomedical data sets. Although a large amount of research has been devoted to estimation and inference for data with censored responses, very little research has focused on proper statistical procedures when predictors are censored. In this paper, we consider statistical methods for dealing with multiple predictors subject to detection limits within the context of generalized linear models. We investigate and adapt several conventional methods and develop a new multiple imputation approach for analyzing data sets with predictors censored due to detection limits. We establish the consistency and asymptotic normality of the proposed multiple imputation estimator and suggest a computationally simple and consistent variance estimator. We also demonstrate that the conditional mean imputation method often leads to inconsistent estimates in generalized linear models, while several other methods are either computationally intensive or lead to parameter estimates that are biased or more variable compared to the proposed multiple imputation estimator. In an extensive simulation study, we assess the bias and variability of different approaches within the context of a logistic regression model and compare variance estimation methods for the proposed multiple imputation estimator. Lastly, we apply several methods to analyze the data set from a recently-conducted GenIMS study.
Modeling thermal sensation in a Mediterranean climate—a comparison of linear and ordinal models
NASA Astrophysics Data System (ADS)
Pantavou, Katerina; Lykoudis, Spyridon
2014-08-01
A simple thermo-physiological model of outdoor thermal sensation adjusted with psychological factors is developed aiming to predict thermal sensation in Mediterranean climates. Microclimatic measurements simultaneously with interviews on personal and psychological conditions were carried out in a square, a street canyon and a coastal location of the greater urban area of Athens, Greece. Multiple linear and ordinal regression were applied in order to estimate thermal sensation making allowance for all the recorded parameters or specific, empirically selected, subsets producing so-called extensive and empirical models, respectively. Meteorological, thermo-physiological and overall models - considering psychological factors as well - were developed. Predictions were improved when personal and psychological factors were taken into account as compared to meteorological models. The model based on ordinal regression reproduced extreme values of thermal sensation vote more adequately than the linear regression one, while the empirical model produced satisfactory results in relation to the extensive model. The effects of adaptation and expectation on thermal sensation vote were introduced in the models by means of the exposure time, season and preference related to air temperature and irradiation. The assessment of thermal sensation could be a useful criterion in decision making regarding public health, outdoor spaces planning and tourism.
Multiplicative Forests for Continuous-Time Processes
Weiss, Jeremy C.; Natarajan, Sriraam; Page, David
2013-01-01
Learning temporal dependencies between variables over continuous time is an important and challenging task. Continuous-time Bayesian networks effectively model such processes but are limited by the number of conditional intensity matrices, which grows exponentially in the number of parents per variable. We develop a partition-based representation using regression trees and forests whose parameter spaces grow linearly in the number of node splits. Using a multiplicative assumption we show how to update the forest likelihood in closed form, producing efficient model updates. Our results show multiplicative forests can be learned from few temporal trajectories with large gains in performance and scalability. PMID:25284967
Chen, Chen; Xie, Yuanchang
2016-06-01
Annual Average Daily Traffic (AADT) is often considered as a main covariate for predicting crash frequencies at urban and suburban intersections. A linear functional form is typically assumed for the Safety Performance Function (SPF) to describe the relationship between the natural logarithm of expected crash frequency and covariates derived from AADTs. Such a linearity assumption has been questioned by many researchers. This study applies Generalized Additive Models (GAMs) and Piecewise Linear Negative Binomial (PLNB) regression models to fit intersection crash data. Various covariates derived from minor-and major-approach AADTs are considered. Three different dependent variables are modeled, which are total multiple-vehicle crashes, rear-end crashes, and angle crashes. The modeling results suggest that a nonlinear functional form may be more appropriate. Also, the results show that it is important to take into consideration the joint safety effects of multiple covariates. Additionally, it is found that the ratio of minor to major-approach AADT has a varying impact on intersection safety and deserves further investigations. Copyright © 2016 Elsevier Ltd. All rights reserved.
Modeling non-linear growth responses to temperature and hydrology in wetland trees
NASA Astrophysics Data System (ADS)
Keim, R.; Allen, S. T.
2016-12-01
Growth responses of wetland trees to flooding and climate variations are difficult to model because they depend on multiple, apparently interacting factors, but are a critical link in hydrological control of wetland carbon budgets. To more generally understand tree growth to hydrological forcing, we modeled non-linear responses of tree ring growth to flooding and climate at sub-annual time steps, using Vaganov-Shashkin response functions. We calibrated the model to six baldcypress tree-ring chronologies from two hydrologically distinct sites in southern Louisiana, and tested several hypotheses of plasticity in wetlands tree responses to interacting environmental variables. The model outperformed traditional multiple linear regression. More importantly, optimized response parameters were generally similar among sites with varying hydrological conditions, suggesting generality to the functions. Model forms that included interacting responses to multiple forcing factors were more effective than were single response functions, indicating the principle of a single limiting factor is not correct in wetlands and both climatic and hydrological variables must be considered in predicting responses to hydrological or climate change.
Malomane, Dorcus Kholofelo; Norris, David; Banga, Cuthbert B; Ngambi, Jones W
2014-02-01
Body weight and weight of body parts are of economic importance. It is difficult to directly predict body weight from highly correlated morphological traits through multiple regression. Factor analysis was carried out to examine the relationship between body weight and five linear body measurements (body length, body girth, wing length, shank thickness, and shank length) in South African Venda (VN), Naked neck (NN), and Potchefstroom koekoek (PK) indigenous chicken breeds, with a view to identify those factors that define body conformation. Multiple regression was subsequently performed to predict body weight, using orthogonal traits derived from the factor analysis. Measurements were obtained from 210 chickens, 22 weeks of age, 70 chickens per breed. High correlations were obtained between body weight and all body measurements except for wing length in PK. Two factors extracted after varimax rotation explained 91, 95, and 83% of total variation in VN, NN, and PK, respectively. Factor 1 explained 73, 90, and 64% in VN, NN, and PK, respectively, and was loaded on all body measurements except for wing length in VN and PK. In a multiple regression, these two factors accounted for 72% variation in body weight in VN, while only factor 1 accounted for 83 and 74% variation in body weight in NN and PK, respectively. The two factors could be used to define body size and conformation of these breeds. Factor 1 could predict body weight in all three breeds. Body measurements can be better selected jointly to improve body weight in these breeds.
Theobald, Roddy; Freeman, Scott
2014-01-01
Although researchers in undergraduate science, technology, engineering, and mathematics education are currently using several methods to analyze learning gains from pre- and posttest data, the most commonly used approaches have significant shortcomings. Chief among these is the inability to distinguish whether differences in learning gains are due to the effect of an instructional intervention or to differences in student characteristics when students cannot be assigned to control and treatment groups at random. Using pre- and posttest scores from an introductory biology course, we illustrate how the methods currently in wide use can lead to erroneous conclusions, and how multiple linear regression offers an effective framework for distinguishing the impact of an instructional intervention from the impact of student characteristics on test score gains. In general, we recommend that researchers always use student-level regression models that control for possible differences in student ability and preparation to estimate the effect of any nonrandomized instructional intervention on student performance. PMID:24591502
Macrocell path loss prediction using artificial intelligence techniques
NASA Astrophysics Data System (ADS)
Usman, Abraham U.; Okereke, Okpo U.; Omizegba, Elijah E.
2014-04-01
The prediction of propagation loss is a practical non-linear function approximation problem which linear regression or auto-regression models are limited in their ability to handle. However, some computational intelligence techniques such as artificial neural networks (ANNs) and adaptive neuro-fuzzy inference systems (ANFISs) have been shown to have great ability to handle non-linear function approximation and prediction problems. In this study, the multiple layer perceptron neural network (MLP-NN), radial basis function neural network (RBF-NN) and an ANFIS network were trained using actual signal strength measurements taken at certain suburban areas of Bauchi metropolis, Nigeria. The trained networks were then used to predict propagation losses at the stated areas under differing conditions. The predictions were compared with the prediction accuracy of the popular Hata model. It was observed that the ANFIS model gave a better fit in all cases, with higher R2 values in each case, and on average is more robust than the MLP and RBF models as it generalises better to different data.
NASA Astrophysics Data System (ADS)
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes the overestimation of the regression parameters and an increase in the variance of these parameters. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; then the dependent variables are regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of the RPCR and RSIMPLS methods on an econometric data set, making a comparison of the two methods on an inflation model of Turkey. The considered methods have been compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.
Enders, Felicity
2013-12-01
Although regression is widely used for reading and publishing in the medical literature, no instruments were previously available to assess students' understanding. The goal of this study was to design and assess such an instrument for graduate students in Clinical and Translational Science and Public Health. A 27-item REsearch on Global Regression Expectations in StatisticS (REGRESS) quiz was developed through an iterative process. Consenting students taking a course on linear regression in a Clinical and Translational Science program completed the quiz pre- and postcourse. Student results were compared to practicing statisticians with a master's or doctoral degree in statistics or a closely related field. Fifty-two students responded precourse, 59 postcourse, and 22 practicing statisticians completed the quiz. The mean (SD) score was 9.3 (4.3) for students precourse and 19.0 (3.5) postcourse (P < 0.001). Postcourse students had similar results to practicing statisticians (mean (SD) of 20.1 (3.5); P = 0.21). Students also showed significant improvement pre/postcourse in each of six domain areas (P < 0.001). The REGRESS quiz was internally reliable (Cronbach's alpha 0.89). The initial validation is quite promising with statistically significant and meaningful differences across time and study populations. Further work is needed to validate the quiz across multiple institutions. © 2013 Wiley Periodicals, Inc.
Learning accurate and interpretable models based on regularized random forests regression
2014-01-01
Background Many biology related research works combine data from multiple sources in an effort to understand the underlying problems. It is important to find and interpret the most important information from these sources. Thus it will be beneficial to have an effective algorithm that can simultaneously extract decision rules and select critical features for good interpretation while preserving the prediction performance. Methods In this study, we focus on regression problems for biological data where target outcomes are continuous. In general, models constructed from linear regression approaches are relatively easy to interpret. However, many practical biological applications are nonlinear in essence where we can hardly find a direct linear relationship between input and output. Nonlinear regression techniques can reveal nonlinear relationship of data, but are generally hard for human to interpret. We propose a rule based regression algorithm that uses 1-norm regularized random forests. The proposed approach simultaneously extracts a small number of rules from generated random forests and eliminates unimportant features. Results We tested the approach on some biological data sets. The proposed approach is able to construct a significantly smaller set of regression rules using a subset of attributes while achieving prediction performance comparable to that of random forests regression. Conclusion It demonstrates high potential in aiding prediction and interpretation of nonlinear relationships of the subject being studied. PMID:25350120
Gender Performance Differences in Biochemistry
ERIC Educational Resources Information Center
Rauschenberger, Matthew M.; Sweeder, Ryan D.
2010-01-01
This study examined the historical performance of students at Michigan State University in a two-part biochemistry series Biochem I (n = 5,900) and Biochem II (n = 5,214) for students enrolled from 1997 to 2009. Multiple linear regressions predicted 54.9-87.5% of the variance in student from Biochem I grade and 53.8-76.1% of the variance in…
Curriculum-Based Measurement of Oral Reading: Quality of Progress Monitoring Outcomes
ERIC Educational Resources Information Center
Christ, Theodore J.; Zopluoglu, Cengiz; Long, Jeffery D.; Monaghen, Barbara D.
2012-01-01
Curriculum-based measurement of oral reading (CBM-R) is frequently used to set student goals and monitor student progress. This study examined the quality of growth estimates derived from CBM-R progress monitoring data. The authors used a linear mixed effects regression (LMER) model to simulate progress monitoring data for multiple levels of…
Beyond the Black-White Test Score Gap: Latinos' Early School Experiences and Literacy Outcomes
ERIC Educational Resources Information Center
Delgado, Enilda A.; Stoll, Laurie Cooper
2015-01-01
Data from the Early Childhood Longitudinal Survey-Birth Cohort are used to analyze the factors that lead to the reading readiness of children who participate in nonparental care the year prior to kindergarten (N = 4,550), with a specific focus on Latino children (N = 800). Stepwise multiple linear regression analysis demonstrates that reading…
Ecological and Topographic Features of Volcanic Ash-Influenced Forest Soils
Mark Kimsey; Brian Gardner; Alan Busacca
2007-01-01
Volcanic ash distribution and thickness were determined for a forested region of north-central Idaho. Mean ash thickness and multiple linear regression analyses were used to model the effect of environmental variables on ash thickness. Slope and slope curvature relationships with volcanic ash thickness varied on a local spatial scale across the study area. Ash...
ERIC Educational Resources Information Center
Mizzelle, Sylvia Jean
2012-01-01
The purpose of this study was to examine the relationship between teachers' and principals' perceptions on the North Carolina Teacher Working Conditions Survey (TWC) and the influence this relationship had on student achievement. A quantitative research design using a Multiple Linear Regression investigated the relationship between teachers' and…
Helping Students Assess the Relative Importance of Different Intermolecular Interactions
ERIC Educational Resources Information Center
Jasien, Paul G.
2008-01-01
A semi-quantitative model has been developed to estimate the relative effects of dispersion, dipole-dipole interactions, and H-bonding on the normal boiling points ("T[subscript b]") for a subset of simple organic systems. The model is based upon a statistical analysis using multiple linear regression on a series of straight-chain organic…
Developing a predictive tropospheric ozone model for Tabriz
NASA Astrophysics Data System (ADS)
Khatibi, Rahman; Naghipour, Leila; Ghorbani, Mohammad A.; Smith, Michael S.; Karimi, Vahid; Farhoudi, Reza; Delafrouz, Hadi; Arvanaghi, Hadi
2013-04-01
Predictive ozone models are becoming indispensable tools by providing a capability for pollution alerts to serve people who are vulnerable to the risks. We have developed a tropospheric ozone prediction capability for Tabriz, Iran, by using the following five modeling strategies: three regression-type methods: Multiple Linear Regression (MLR), Artificial Neural Networks (ANNs), and Gene Expression Programming (GEP); and two auto-regression-type models: Nonlinear Local Prediction (NLP) to implement chaos theory and Auto-Regressive Integrated Moving Average (ARIMA) models. The regression-type modeling strategies explain the data in terms of: temperature, solar radiation, dew point temperature, and wind speed, by regressing present ozone values to their past values. The ozone time series are available at various time intervals, including hourly intervals, from August 2010 to March 2011. The results for the MLR, ANN and GEP models are not overly good, but those produced by NLP and ARIMA are promising for establishing a forecasting capability.
Fisz, Jacek J
2006-12-07
The optimization approach based on the genetic algorithm (GA) combined with the multiple linear regression (MLR) method is discussed. The GA-MLR optimizer is designed for nonlinear least-squares problems in which the model functions are linear combinations of nonlinear functions. GA optimizes the nonlinear parameters, and the linear parameters are calculated from MLR. GA-MLR is an intuitive optimization approach and it exploits all advantages of the genetic algorithm technique. This optimization method results from an appropriate combination of two well-known optimization methods. The MLR method is embedded in the GA optimizer and linear and nonlinear model parameters are optimized in parallel. The MLR method is the only strictly mathematical "tool" involved in GA-MLR. The GA-MLR approach considerably simplifies and accelerates the optimization process because the linear parameters are not the fitted ones. Its properties are exemplified by the analysis of the kinetic biexponential fluorescence decay surface corresponding to a two-excited-state interconversion process. A short discussion of the variable projection (VP) algorithm, designed for the same class of optimization problems, is presented. VP is a very advanced mathematical formalism that involves the methods of nonlinear functionals, the algebra of linear projectors, and the formalism of Fréchet derivatives and pseudo-inverses. Additional explanatory comments are added on the application of the recently introduced GA-NR optimizer to the simultaneous recovery of linear and weakly nonlinear parameters occurring in the same optimization problem together with nonlinear parameters. The GA-NR optimizer combines the GA method with the NR method, in which the minimum-value condition for the quadratic approximation to χ², obtained from the Taylor series expansion of χ², is recovered by means of the Newton-Raphson algorithm. The application of the GA-NR optimizer to model functions which are multi-linear combinations of nonlinear functions is indicated. The VP algorithm does not distinguish the weakly nonlinear parameters from the nonlinear ones and it does not apply to model functions which are multi-linear combinations of nonlinear functions.
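The split between nonlinear and linear parameters described in this abstract can be illustrated with a minimal Python sketch. It is not the author's GA-MLR code: the function name `solve_linear_amplitudes`, the synthetic decay data, and the trial rate values are all assumptions made for illustration, and the outer genetic search is omitted. The sketch only shows the inner step, in which the linear amplitudes are obtained by ordinary least squares once trial values of the nonlinear decay rates are fixed.

```python
import numpy as np

def solve_linear_amplitudes(t, y, rates):
    """For fixed nonlinear decay rates, fit the linear amplitudes by least squares."""
    # Design matrix: one column exp(-k_j * t) per trial decay rate k_j.
    basis = np.exp(-np.outer(t, rates))
    amplitudes, *_ = np.linalg.lstsq(basis, y, rcond=None)
    residual = y - basis @ amplitudes
    # The sum of squared residuals is the score an outer search would minimize.
    return amplitudes, float(residual @ residual)

# Synthetic, noise-free biexponential decay (illustrative values only).
t = np.linspace(0.0, 10.0, 200)
y = 2.0 * np.exp(-1.0 * t) + 0.5 * np.exp(-0.1 * t)

# Score two candidate rate pairs: the generating rates and a deliberately wrong pair.
amps_true, sse_true = solve_linear_amplitudes(t, y, [1.0, 0.1])
amps_off, sse_off = solve_linear_amplitudes(t, y, [2.0, 0.3])
```

An outer optimizer of any kind (a genetic algorithm in the abstract's setting) would then search only over the rates, calling this function to score each candidate, so the amplitudes never enter the search space.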
SU-F-R-20: Image Texture Features Correlate with Time to Local Failure in Lung SBRT Patients
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrews, M; Abazeed, M; Woody, N
Purpose: To explore possible correlation between CT image-based texture and histogram features and time-to-local-failure in early stage non-small cell lung cancer (NSCLC) patients treated with stereotactic body radiotherapy (SBRT). Methods and Materials: From an IRB-approved lung SBRT registry for patients treated between 2009–2013 we selected 48 (20 male, 28 female) patients with local failure. Median patient age was 72.3±10.3 years. Mean time to local failure was 15 ± 7.1 months. Physician-contoured gross tumor volumes (GTV) on the planning CT images were processed and 3D gray-level co-occurrence matrix (GLCM) based texture and histogram features were calculated in Matlab. Data were exported to R and a multiple linear regression model was used to examine the relationship between texture features and time-to-local-failure. Results: Multiple linear regression revealed that entropy (p=0.0233, multiple R2=0.60) from GLCM-based texture analysis and the standard deviation (p=0.0194, multiple R2=0.60) from the histogram-based features were statistically significantly correlated with the time-to-local-failure. Conclusion: Image-based texture analysis can be used to predict certain aspects of treatment outcomes of NSCLC patients treated with SBRT. We found entropy and standard deviation calculated for the GTV on the CT images displayed a statistically significant correlation with time-to-local-failure in lung SBRT patients.
NASA Astrophysics Data System (ADS)
Arantes Camargo, Livia; Marques, José, Jr.
2015-04-01
The prediction of erodibility using indirect methods such as diffuse reflectance spectroscopy could facilitate the characterization of spatial variability in large areas and optimize the implementation of conservation practices. The aim of this study was to evaluate the prediction of interrill erodibility (Ki) and rill erodibility (Kr) from iron oxide content and soil color using multiple linear regression, and from diffuse reflectance spectroscopy (DRS) using partial least squares regression (PLSR). The soils were collected from three geomorphic surfaces and analyzed for chemical, physical and mineralogical properties, and scanned in the visible and infrared spectral range. Maps of the spatial distribution of Ki and Kr were built using geostatistics with the values calculated by the calibrated models that obtained the best accuracy. Interrill and rill erodibility presented a negative correlation with iron extracted by dithionite-citrate-bicarbonate, hematite, and chroma, confirming the influence of iron oxides on soil structural stability. Hematite and hue were the attributes that contributed most to the multiple linear regression calibration models for the prediction of Ki (R2 = 0.55) and Kr (R2 = 0.53). Diffuse reflectance spectroscopy via PLSR allowed interrill and rill erodibility to be predicted with high accuracy (R2adj = 0.76 and 0.81, respectively, and RPD > 2.0) in the range of the visible spectrum (380-800 nm), and allowed the spatial variability of these attributes to be characterized by geostatistics.
Pistonesi, Marcelo F; Di Nezio, María S; Centurión, María E; Lista, Adriana G; Fragoso, Wallace D; Pontes, Márcio J C; Araújo, Mário C U; Band, Beatriz S Fernández
2010-12-15
In this study, a novel, simple, and efficient spectrofluorimetric method to determine directly and simultaneously five phenolic compounds (hydroquinone, resorcinol, phenol, m-cresol and p-cresol) in air samples is presented. For this purpose, variable selection by the successive projections algorithm (SPA) is used in order to obtain simple multiple linear regression (MLR) models based on a small subset of wavelengths. For comparison, partial least square (PLS) regression is also employed in full-spectrum. The concentrations of the calibration matrix ranged from 0.02 to 0.2 mg L(-1) for hydroquinone, from 0.05 to 0.6 mg L(-1) for resorcinol, and from 0.05 to 0.4 mg L(-1) for phenol, m-cresol and p-cresol; incidentally, such ranges are in accordance with the Argentinean environmental legislation. To verify the accuracy of the proposed method a recovery study on real air samples of smoking environment was carried out with satisfactory results (94-104%). The advantage of the proposed method is that it requires only spectrofluorimetric measurements of samples and chemometric modeling for simultaneous determination of five phenols. With it, air is simply sampled and no pre-treatment sample is needed (i.e., separation steps and derivatization reagents are avoided) that means a great saving of time. Copyright © 2010 Elsevier B.V. All rights reserved.
Modeling Laterality of the Globus Pallidus Internus in Patients With Parkinson's Disease.
Sharim, Justin; Yazdi, Daniel; Baohan, Amy; Behnke, Eric; Pouratian, Nader
2017-04-01
Neurosurgical interventions such as deep brain stimulation surgery of the globus pallidus internus (GPi) play an important role in the treatment of medically refractory Parkinson's disease (PD), and require high targeting accuracy. Variability in the laterality of the GPi across patients with PD has not been well characterized. The aim of this report is to identify factors that may contribute to differences in position of the motor region of GPi. The charts and operative reports of 101 PD patients following deep brain stimulation surgery (70 males, aged 11-78 years) representing 201 GPi were retrospectively reviewed. Data extracted for each subject include age, gender, anterior and posterior commissures (AC-PC) distance, and third ventricular width. Multiple linear regression, stepwise regression, and relative importance of regressors analysis were performed to assess the predictive ability of these variables on GPi laterality. Multiple linear regression for target vs. third ventricular width, gender, AC-PC distance, and age were significant for normalized linear regression coefficients of 0.333 (p < 0.0001), 0.206 (p = 0.00219), 0.168 (p = 0.0119), and 0.159 (p = 0.0136), respectively. Third ventricular width, gender, AC-PC distance, and age each account for 44.06% (21.38-65.69%, 95% CI), 20.82% (10.51-35.88%), 21.46% (8.28-37.05%), and 13.66% (2.62-28.64%) of the R2 value, respectively. Effect size calculation was significant for a change in the GPi laterality of 0.19 mm per mm of ventricular width, 0.11 mm per mm of AC-PC distance, 0.017 mm per year in age, and 0.54 mm increase for male gender. This variability highlights the limitations of indirect targeting alone, and argues for the continued use of MRI as well as intraoperative physiological testing to account for such factors that contribute to patient-specific variability in GPi localization. © 2016 International Neuromodulation Society.
Refractive Status at Birth: Its Relation to Newborn Physical Parameters at Birth and Gestational Age
Varghese, Raji Mathew; Sreenivas, Vishnubhatla; Puliyel, Jacob Mammen; Varughese, Sara
2009-01-01
Background Refractive status at birth is related to gestational age. Preterm babies have myopia which decreases as gestational age increases and term babies are known to be hypermetropic. This study looked at the correlation of refractive status with birth weight in term and preterm babies, and with physical indicators of intra-uterine growth such as the head circumference and length of the baby at birth. Methods All babies delivered at St. Stephens Hospital and admitted in the nursery were eligible for the study. Refraction was performed within the first week of life. 0.8% tropicamide with 0.5% phenylephrine was used to achieve cycloplegia and paralysis of accommodation. 599 newborn babies participated in the study. Data pertaining to the right eye is utilized for all the analyses except that for anisometropia where the two eyes were compared. Growth parameters were measured soon after birth. Simple linear regression analysis was performed to see the association of refractive status, (mean spherical equivalent (MSE), astigmatism and anisometropia) with each of the study variables, namely gestation, length, weight and head circumference. Subsequently, multiple linear regression was carried out to identify the independent predictors for each of the outcome parameters. Results Simple linear regression showed a significant relation between all 4 study variables and refractive error but in multiple regression only gestational age and weight were related to refractive error. The partial correlation of weight with MSE adjusted for gestation was 0.28 and that of gestation with MSE adjusted for weight was 0.10. Birth weight had a higher correlation to MSE than gestational age. Conclusion This is the first study to look at refractive error against all these growth parameters, in preterm and term babies at birth. 
It would appear from this study that birth weight rather than gestation should be used as criteria for screening for refractive error, especially in developing countries where the incidence of intrauterine malnutrition is higher. PMID:19214228
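The partial correlations reported in this abstract (for example, weight with MSE adjusted for gestation) can be illustrated with a short Python sketch. The data below are synthetic and are not the study's measurements; the variable names, coefficients, and both function names are assumptions made for illustration. The sketch computes a first-order partial correlation in two equivalent ways: from the three pairwise Pearson correlations, and by correlating the residuals left after regressing each variable on the adjusting variable.

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, adjusting for z."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

def partial_corr_residuals(x, y, z):
    """Same quantity, computed by correlating residuals of x and y regressed on z."""
    design = np.column_stack([np.ones_like(z), z])  # intercept plus adjusting variable
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]

# Synthetic data for illustration only (hypothetical effect sizes).
rng = np.random.default_rng(0)
gestation = rng.normal(38.0, 2.0, 500)                      # weeks
weight = 0.2 * gestation + rng.normal(0.0, 0.5, 500)        # correlated with gestation
mse = 0.3 * weight + 0.1 * gestation + rng.normal(0.0, 1.0, 500)

r_formula = partial_corr(weight, mse, gestation)
r_resid = partial_corr_residuals(weight, mse, gestation)
```

The two routes agree up to floating-point error, which is why a partial correlation can be read as the association that remains once the adjusting variable's linear contribution has been removed from both variables.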
Hao, Xu; Yujun, Sun; Xinjie, Wang; Jin, Wang; Yao, Fu
2015-01-01
A multiple linear model was developed for individual tree crown width of Cunninghamia lanceolata (Lamb.) Hook in Fujian province, southeast China. Data were obtained from 55 sample plots of pure China-fir plantation stands. An Ordinary Linear Least Squares (OLS) regression was used to establish the crown width model. To adjust for correlations between observations from the same sample plots, we developed one level linear mixed-effects (LME) models based on the multiple linear model, which take into account the random effects of plots. The best random effects combinations for the LME models were determined by the Akaike's information criterion, the Bayesian information criterion and the -2logarithm likelihood. Heteroscedasticity was reduced by three residual variance functions: the power function, the exponential function and the constant plus power function. The spatial correlation was modeled by three correlation structures: the first-order autoregressive structure [AR(1)], a combination of first-order autoregressive and moving average structures [ARMA(1,1)], and the compound symmetry structure (CS). Then, the LME model was compared to the multiple linear model using the absolute mean residual (AMR), the root mean square error (RMSE), and the adjusted coefficient of determination (adj-R2). For individual tree crown width models, the one level LME model showed the best performance. An independent dataset was used to test the performance of the models and to demonstrate the advantage of calibrating LME models.
Association of dentine hypersensitivity with different risk factors - a cross sectional study.
Vijaya, V; Sanjay, Venkataraam; Varghese, Rana K; Ravuri, Rajyalakshmi; Agarwal, Anil
2013-12-01
This study was done to assess the prevalence of dentine hypersensitivity (DH) and its associated risk factors. This epidemiological study of the prevalence of DH was conducted among patients attending a dental college. A self-structured questionnaire along with a clinical examination was used for assessment. Descriptive statistics were obtained and frequency distributions were assessed using the chi-square test at p value <0.05. Stepwise multiple linear regression was also done to assess the frequency of DH with different factors. The study population comprised 655 participants from different age groups. Our study showed a prevalence of 55%, and DH was more common among males. Similarly, smokers and those who used a hard toothbrush had more cases of DH. Stepwise multiple linear regression showed that the best predictor for DH was age, followed by the habit of smoking and the type of toothbrush. The most common aggravating factors were cold water (15.4%) and sweet foods (14.7%), whereas only 5% of the patients experienced it while brushing. A high level of dentine hypersensitivity was found in this study, and it was more common among males. A linear relationship was shown with age, smoking and type of toothbrush. How to cite this article: Vijaya V, Sanjay V, Varghese RK, Ravuri R, Agarwal A. Association of Dentine Hypersensitivity with Different Risk Factors - A Cross Sectional Study. J Int Oral Health 2013;5(6):88-92.
Pre-natal exposures to cocaine and alcohol and physical growth patterns to age 8 years
Lumeng, Julie C.; Cabral, Howard J.; Gannon, Katherine; Heeren, Timothy; Frank, Deborah A.
2007-01-01
Two hundred and two primarily African American/Caribbean children (classified by maternal report and infant meconium as 38 heavier, 74 lighter and 89 not cocaine-exposed) were measured repeatedly from birth to age 8 years to assess whether there is an independent effect of prenatal cocaine exposure on physical growth patterns. Children with fetal alcohol syndrome identifiable at birth were excluded. At birth, cocaine and alcohol exposures were significantly and independently associated with lower weight, length and head circumference in cross-sectional multiple regression analyses. The relationship over time of pre-natal exposures to weight, height, and head circumference was then examined by multiple linear regression using mixed linear models including covariates: child’s gestational age, gender, ethnicity, age at assessment, current caregiver, birth mother’s use of alcohol, marijuana and tobacco during the pregnancy and pre-pregnancy weight (for child’s weight) and height (for child’s height and head circumference). The cocaine effects did not persist beyond infancy in piecewise linear mixed models, but a significant and independent negative effect of pre-natal alcohol exposure persisted for weight, height, and head circumference. Catch-up growth in cocaine-exposed infants occurred primarily by 6 months of age for all growth parameters, with some small fluctuations in growth rates in the preschool age range but no detectable differences between heavier versus unexposed nor lighter versus unexposed thereafter. PMID:17412558
Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo
2011-03-04
Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. Copyright © 2010 Elsevier B.V. All rights reserved.
Socio-economic factors associated with infant mortality in Italy: an ecological study.
Dallolio, Laura; Di Gregori, Valentina; Lenzi, Jacopo; Franchino, Giuseppe; Calugi, Simona; Domenighetti, Gianfranco; Fantini, Maria Pia
2012-08-16
One issue that continues to attract the attention of public health researchers is the possible relationship in high-income countries between income, income inequality and infant mortality (IM). The aim of this study was to assess the associations between IM and major socio-economic determinants in Italy. Associations between infant mortality rates in the 20 Italian regions (2006-2008) and the Gini index of income inequality, mean household income, percentage of women with at least 8 years of education, and percentage of unemployed aged 15-64 years were assessed using Pearson correlation coefficients. Univariate linear regression and multiple stepwise linear regression analyses were performed to determine the magnitude and direction of the effect of the four socio-economic variables on IM. The Gini index and the total unemployment rate showed a positive strong correlation with IM (r = 0.70; p < 0.001 and r = 0.84; p < 0.001 respectively), mean household income showed a strong negative correlation (r = -0.78; p < 0.001), while female educational attainment presented a weak negative correlation (r = -0.45; p < 0.05). Using a multiple stepwise linear regression model, only unemployment rate was independently associated with IM (b = 0.15, p < 0.001). In Italy, a high-income country where health care is universally available, variations in IM were strongly associated with relative and absolute income and unemployment rate. These results suggest that in Italy IM is not only related to income distribution, as demonstrated for other developed countries, but also to economic factors such as absolute income and unemployment. In order to reduce IM and the existing inequalities, the challenge for Italian decision makers is to promote economic growth and enhance employment levels.
An open-access CMIP5 pattern library for temperature and precipitation: Description and methodology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lynch, Cary D.; Hartin, Corinne A.; Bond-Lamberty, Benjamin
Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squared regression methods. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90°N/S). Bias and mean errors between modeled and pattern predicted output from the linear regression method were smaller than patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5°C, but choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. As a result, this paper describes our library of least squared regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns.
An open-access CMIP5 pattern library for temperature and precipitation: Description and methodology
Lynch, Cary D.; Hartin, Corinne A.; Bond-Lamberty, Benjamin; ...
2017-05-15
Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squared regression methods. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90°N/S). Bias and mean errors between modeled and pattern predicted output from the linear regression method were smaller than patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5°C, but choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. As a result, this paper describes our library of least squared regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns.
Bomfim, Rafael Aiello; Crosato, Edgard; Mazzilli, Luiz Eugênio Nigro; Frias, Antonio Carlos
2015-01-01
This study evaluates the prevalence and risk factors of non-carious cervical lesions (NCCLs) in a Brazilian population of workers exposed and non-exposed to acid mists and chemical products. One hundred workers (46 exposed and 54 non-exposed) were evaluated in a Centro de Referência em Saúde do Trabalhador - CEREST (Worker's Health Reference Center). The workers responded to questionnaires regarding their personal information and about alcohol consumption and tobacco use. A clinical examination was conducted to evaluate the presence of NCCLs, according to WHO parameters. Statistical analyses were performed by unconditional logistic regression and multiple linear regression, with the critical level of p < 0.05. NCCLs were significantly associated with age groups (18-34, 35-44, 45-68 years). The unconditional logistic regression showed that the presence of NCCLs was better explained by age group (OR = 4.04; CI 95% 1.77-9.22) and occupational exposure to acid mists and chemical products (OR = 3.84; CI 95% 1.10-13.49), whereas the linear multiple regression revealed that NCCLs were better explained by years of smoking (p = 0.01) and age group (p = 0.04). The prevalence of NCCLs in the study population was particularly high (76.84%), and the risk factors for NCCLs were age, exposure to acid mists and smoking habit. Controlling risk factors through preventive and educative measures, allied to the use of personal protective equipment to prevent the occupational exposure to acid mists, may contribute to minimizing the prevalence of NCCLs.
Zhou, Qing-he; Zhu, Bo; Wei, Chang-na; Yan, Min
2016-03-24
Studies have shown that abdominal girth and vertebral column length have high predictive value for spinal spread after administering a dose of plain bupivacaine. We designed a study to identify the specific correlations between abdominal girth, vertebral column length and the dose of 0.5% plain bupivacaine that provides a minimum upper block level (T12) and a suitable upper block level (T10) for lower limb surgeries. A suitable dose of 0.5% plain bupivacaine was administered intrathecally between the L3 and L4 vertebrae for lower limb surgeries. If the upper cephalad spread of the patient by loss of pinprick discrimination was T12 or T10, the patient was enrolled in this study. Five patient variables and intrathecal plain bupivacaine dose were recorded. Linear regression and multiple regression analyses were performed. Totals of 111 patients and 121 patients who lost pinprick discrimination at T12 and T10, respectively, were analyzed in this study. Linear regression analysis showed that only abdominal girth and plain bupivacaine dose were strongly correlated (r = -0.827 for T12, r = -0.806 for T10; both p < 0.0001). Multiple linear regression analysis showed that both abdominal girth and vertebral column length were the key determinants of plain bupivacaine dose (both p < 0.0001). R² was 0.874 and 0.860 for the loss of pinprick discrimination at T12 and T10, respectively. Our data indicated that vertebral column length and abdominal girth were strongly correlated with the dosage of intrathecal plain bupivacaine for the loss of pinprick discrimination at T12 and T10. The two regression equations were YT12 = 3.547 + 0.045 X1 - 0.044 X2 and YT10 = 3.848 + 0.047 X1 - 0.046 X2 (Y, 0.5% plain bupivacaine volume; X1, vertebral column length; and X2, abdominal girth), which can, to a great extent, accurately predict the minimum and the suitable intrathecal bupivacaine dose, respectively, for lower limb surgery.
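As a purely illustrative sketch (not clinical guidance), the two regression equations reported in this abstract can be evaluated directly. The function name `bupivacaine_volume` and the example inputs are hypothetical, and the units (centimetres for both body measurements, millilitres for the volume) are an assumption, since the abstract does not state them.

```python
# Coefficients copied from the abstract: Y = b0 + b1 * X1 + b2 * X2,
# where X1 is vertebral column length and X2 is abdominal girth.
COEFFS = {
    "T12": (3.547, 0.045, -0.044),  # minimum upper block level
    "T10": (3.848, 0.047, -0.046),  # suitable upper block level
}


def bupivacaine_volume(level, vertebral_length, abdominal_girth):
    """Evaluate the reported regression equation for the given block level.

    Units are assumed (cm in, mL out); the abstract does not specify them.
    """
    b0, b1, b2 = COEFFS[level]
    return b0 + b1 * vertebral_length + b2 * abdominal_girth


# Hypothetical inputs chosen only to show the arithmetic.
example_t12 = bupivacaine_volume("T12", 60.0, 80.0)
example_t10 = bupivacaine_volume("T10", 60.0, 80.0)
print(round(example_t12, 3), round(example_t10, 3))
```

The negative coefficient on abdominal girth matches the negative correlation reported in the abstract: a larger girth lowers the predicted volume, while a longer vertebral column raises it.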
Seaman, Shaun R; Hughes, Rachael A
2018-06-01
Estimating the parameters of a regression model of interest is complicated by missing data on the variables in that model. Multiple imputation is commonly used to handle these missing data. Joint model multiple imputation and full-conditional specification multiple imputation are known to yield imputed data with the same asymptotic distribution when the conditional models of full-conditional specification are compatible with that joint model. We show that this asymptotic equivalence of imputation distributions does not imply that joint model multiple imputation and full-conditional specification multiple imputation will also yield asymptotically equally efficient inference about the parameters of the model of interest, nor that they will be equally robust to misspecification of the joint model. When the conditional models used by full-conditional specification multiple imputation are linear, logistic and multinomial regressions, these are compatible with a restricted general location joint model. We show that multiple imputation using the restricted general location joint model can be substantially more asymptotically efficient than full-conditional specification multiple imputation, but this typically requires very strong associations between variables. When associations are weaker, the efficiency gain is small. Moreover, full-conditional specification multiple imputation is shown to be potentially much more robust than joint model multiple imputation using the restricted general location model to misspecification of that model when there is substantial missingness in the outcome variable.
Kim, Sungjin; Jinich, Adrián; Aspuru-Guzik, Alán
2017-04-24
We propose a multiple descriptor multiple kernel (MultiDK) method for efficient molecular discovery using machine learning. We show that the MultiDK method improves both the speed and accuracy of molecular property prediction. We apply the method to the discovery of electrolyte molecules for aqueous redox flow batteries. Using multiple-type (as opposed to single-type) descriptors, we obtain more relevant features for machine learning. Following the principle of "wisdom of the crowds", the combination of multiple-type descriptors significantly boosts prediction performance. Moreover, by employing multiple kernels (more than one kernel function for a set of the input descriptors), MultiDK exploits nonlinear relations between molecular structure and properties better than a linear regression approach. The multiple kernels consist of a Tanimoto similarity kernel and a linear kernel for a set of binary descriptors and a set of nonbinary descriptors, respectively. Using MultiDK, we achieve an average performance of r² = 0.92 with a test set of molecules for solubility prediction. We also extend MultiDK to predict pH-dependent solubility and apply it to a set of quinone molecules with different ionizable functional groups to assess their performance as flow battery electrolytes.
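A minimal sketch of the kernel pairing described in this abstract: a Tanimoto similarity kernel on binary descriptors plus a linear kernel on nonbinary descriptors. The weighted sum, the kernel ridge fit, the function names and the toy data are all assumptions made for illustration; the abstract does not say how MultiDK combines or trains its kernels.

```python
import numpy as np


def tanimoto_kernel(a, b):
    """Tanimoto similarity <x, y> / (|x|^2 + |y|^2 - <x, y>) for every row pair."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    inner = a @ b.T
    denom = (a * a).sum(axis=1)[:, None] + (b * b).sum(axis=1)[None, :] - inner
    # Two all-zero fingerprints give 0/0; define that similarity as 0.
    return np.divide(inner, denom, out=np.zeros_like(inner), where=denom > 0)


def linear_kernel(a, b):
    """Plain dot-product kernel for nonbinary descriptors."""
    return np.asarray(a, dtype=float) @ np.asarray(b, dtype=float).T


def combined_kernel(bin_a, cont_a, bin_b, cont_b, weight=0.5):
    """Assumed combination: a convex sum of the two kernels."""
    return (weight * tanimoto_kernel(bin_a, bin_b)
            + (1.0 - weight) * linear_kernel(cont_a, cont_b))


# Hypothetical toy data: 3 molecules, 4 binary bits and 1 continuous feature.
fingerprints = np.array([[1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 1, 1]])
continuous = np.array([[0.2], [0.5], [0.9]])
targets = np.array([1.0, 2.0, 3.0])

T = tanimoto_kernel(fingerprints, fingerprints)
K = combined_kernel(fingerprints, continuous, fingerprints, continuous)
# Kernel ridge regression with a small (assumed) regularisation strength.
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(targets)), targets)
fitted = K @ alpha
```

Summing two positive semidefinite kernels yields another valid kernel, which is why a sum is a common way to merge descriptor types; other combinations are equally possible.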
ERIC Educational Resources Information Center
Eidietis, L.; Jewkes, A. M.
2011-01-01
This study examined teachers' dispositions toward and choices to teach ocean science using a survey design. A sample of 89 in-service K-8 teachers in the United States reported their (1) feelings of preparedness to teach about ocean literacy and (2) attitudes toward ocean science on three measures. Results of multiple linear regression showed that…
Mediating Effects of Social Support on Quality of Life for Parents of Adults with Autism
ERIC Educational Resources Information Center
Marsack, Christina N.; Samuel, Preethy S.
2017-01-01
The aim of this study was to examine the mediating effect of formal and informal social support on the relationship of caregiver burden and quality of life (QOL), using a sample of 320 parents (aged 50 or older) of adult children with autism spectrum disorder (ASD). Multiple linear regression and mediation analyses indicated that caregiver burden…
ERIC Educational Resources Information Center
Caldwell, Dale G.
2017-01-01
This correlational, explanatory study utilized multiple linear and hierarchical regression to examine the predictive power of socioeconomic, parental and district factors on the total percentage of students who scored Proficient or Advanced Proficient on the 2013 MCAS Grade 4 language arts and mathematics test. The population for this study…
Impact of pine tip moth attack on loblolly pine
Roy Hedden
1999-01-01
Data on the impact of Nantucket pine tip moth, Rhyacionia frustrana, attack on the height of loblolly pine, Pinus taeda, in the first three growing seasons after planting from three locations in eastern North Carolina (U.S.A.) was used to develop multiple linear regression models relating tree height to tip moth infestation level in each growing season. These models...
Estimating VO2max Using a Personalized Step Test
ERIC Educational Resources Information Center
Webb, Carrie; Vehrs, Pat R.; George, James D.; Hager, Ronald
2014-01-01
The purpose of this study was to develop a step test with a personalized step rate and step height to predict cardiorespiratory fitness in 80 college-aged males and females using the self-reported perceived functional ability scale and data collected during the step test. Multiple linear regression analysis yielded a model (R = 0.90, SEE = 3.43…
ERIC Educational Resources Information Center
Al-Maamari, Faisal
2015-01-01
It is important to consider the question of whether teacher-, course-, and student-related factors affect student ratings of instructors in Student Evaluation of Teaching (SET) in English Language Teaching (ELT). This paper reports on a statistical analysis of SET in two large EFL programmes at a university setting in the Sultanate of Oman. I…
ERIC Educational Resources Information Center
Stratton, Beverly D.; And Others
Demographic data on 92 subjects identified as having reading problems were used to develop equations useful in identifying high risk, reading disabled students. Multiple linear regression analysis of the data indicated that reading disability (1) had a significant positive relationship with birth order and number of siblings; (2) had a positive…
3D Mapping of Language Networks in Clinical and Pre-Clinical Alzheimer's Disease
ERIC Educational Resources Information Center
Apostolova, Liana G.; Lu, Po; Rogers, Steve; Dutton, Rebecca A.; Hayashi, Kiralee M.; Toga, Arthur W.; Cummings, Jeffrey L.; Thompson, Paul M.
2008-01-01
We investigated the associations between Boston naming and the animal fluency tests and cortical atrophy in 19 probable AD and 5 multiple domain amnestic mild cognitive impairment patients who later converted to AD. We applied a surface-based computational anatomy technique to MRI scans of the brain and then used linear regression models to detect…
ERIC Educational Resources Information Center
Deering, Pamela Rose
2014-01-01
This research compares and contrasts two approaches to predictive analysis of three years of school district data to investigate relationships between student and teacher characteristics and math achievement as measured by the state-mandated Maryland School Assessment mathematics exam. The sample for the study consisted of 3,514 students taught…
ERIC Educational Resources Information Center
Drewery, David; Nevison, Colleen; Pretti, T. Judene; Cormier, Lauren; Barclay, Sage; Pennaforte, Antoine
2016-01-01
This study discusses and tests a conceptual model of co-op work-term quality from a student perspective. Drawing from an earlier exploration of co-op students' perceptions of work-term quality, variables related to role characteristics, interpersonal dynamics, and organizational elements were used in a multiple linear regression analysis to…
The impact of menopausal symptoms on work ability.
Geukes, Marije; van Aalst, Mariëlle P; Nauta, Mary C E; Oosterhof, Henk
2012-03-01
Menopause is an important life event that may have a negative influence on quality of life. Work ability, a concept widely used in occupational health, can predict both future impairment and duration of sickness absence. The aim of this study was to examine the impact of menopausal symptoms on work ability. This was a cross-sectional study that used a sample of healthy working Dutch women aged 44 to 60 years. Work ability was measured using the Work Ability Index, and menopausal symptoms were measured using the Greene Climacteric Scale. Stepwise multiple linear regression models were used to examine the relationship between menopausal symptoms and work ability. A total of 208 women were included in this study. There was a significant negative correlation between total Greene Climacteric Scale score and Work Ability Index score. Total Greene Climacteric Scale score predicted 33.8% of the total variance in the Work Ability Index score. Only the psychological and somatic subscales of the Greene Climacteric Scale were significant predictors in multiple linear regression analysis. Together, they accounted for 36.5% of total variance in Work Ability Index score. Menopausal symptoms are negatively associated with work ability and may increase the risk of sickness absence.
NASA Astrophysics Data System (ADS)
Sahoo, Sasmita; Jha, Madan K.
2013-12-01
The potential of multiple linear regression (MLR) and artificial neural network (ANN) techniques for predicting transient water levels over a groundwater basin was compared. MLR and ANN modeling was carried out at 17 sites in Japan, considering all significant inputs: rainfall, ambient temperature, river stage, 11 seasonal dummy variables, and influential lags of rainfall, ambient temperature, river stage and groundwater level. Seventeen site-specific ANN models were developed, using multi-layer feed-forward neural networks trained with Levenberg-Marquardt backpropagation algorithms. The performance of the models was evaluated using statistical and graphical indicators. Comparison of the goodness-of-fit statistics of the MLR models with those of the ANN models indicated that the ANN-predicted groundwater levels agreed better with the observed groundwater levels than the MLR predictions did at all the sites. This finding was supported by the graphical indicators and the residual analysis. Thus, it is concluded that the ANN technique is superior to the MLR technique in predicting the spatio-temporal distribution of groundwater levels in a basin. However, considering the practical advantages of the MLR technique, it is recommended as an alternative and cost-effective groundwater modeling tool.
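The MLR input structure described in this abstract (current drivers, lagged values and 11 seasonal dummy variables) could be assembled roughly as follows. This is a sketch under stated assumptions: monthly time steps, a single lag of one step, January as the baseline month, and synthetic data are all hypothetical choices, not details from the study.

```python
import numpy as np


def build_design_matrix(rain, temp, stage, level, months, lag=1):
    """Stack intercept, current inputs, lagged inputs and 11 monthly dummies."""
    rows = np.arange(lag, len(level))
    # Months 2..12 get a dummy column each; month 1 is the baseline.
    dummies = (months[rows, None] == np.arange(2, 13)[None, :]).astype(float)
    X = np.column_stack([
        np.ones(len(rows)),
        rain[rows], temp[rows], stage[rows],
        rain[rows - lag], temp[rows - lag], stage[rows - lag],
        level[rows - lag],
        dummies,
    ])
    return X, level[rows]


# Synthetic series standing in for one site (10 years of monthly values).
rng = np.random.default_rng(0)
n = 120
months = np.arange(n) % 12 + 1
rain = rng.gamma(2.0, 50.0, size=n)
temp = 15.0 + 10.0 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1, n)
stage = 2.0 + 0.01 * rain + rng.normal(0, 0.1, n)
level = 5.0 + 0.5 * stage + 0.002 * rain + rng.normal(0, 0.05, n)

X, y = build_design_matrix(rain, temp, stage, level, months)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares fit
predicted = X @ coef
```

Using 11 rather than 12 dummies avoids perfect collinearity with the intercept, which is presumably why the abstract reports 11 seasonal dummy variables.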
Language and hope in schizophrenia-spectrum disorders.
Bonfils, Kelsey A; Luther, Lauren; Firmin, Ruth L; Lysaker, Paul H; Minor, Kyle S; Salyers, Michelle P
2016-11-30
Hope is integral to recovery for those with schizophrenia. Considering recent advancements in the examination of clients' lexical qualities, we were interested in how clients' words reflect hope. Using computerized lexical analysis, we examined social, emotion, and future words' relations to hope and its pathways and agency components. Forty-five clients provided detailed narratives about their life and mental illness. Transcripts were analyzed using the Linguistic Inquiry and Word Count program (LIWC), which assigns words to categories (e.g., "anxiety") based on a pre-existing dictionary. Correlations and linear multiple regression were used to examine relationships between lexical qualities and hope. Hope and its subcomponents had significant or trending bivariate correlations in expected directions with several emotion-related word categories (anger and sadness) but were not associated with expected categories such as social words, positive emotions, optimism, achievement, and future words. In linear multiple regressions, no LIWC variable significantly predicted hope agency, but anger words significantly predicted both total hope and hope pathways. Our findings indicate lexical analysis tools can be used to investigate recovery-oriented concepts such as hope, and results may inform clinical practice. Future research should aim to replicate our findings in larger samples. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Factors associated with arterial stiffness in children aged 9-10 years
Batista, Milena Santos; Mill, José Geraldo; Pereira, Taisa Sabrina Silva; Fernandes, Carolina Dadalto Rocha; Molina, Maria del Carmen Bisi
2015-01-01
OBJECTIVE To analyze the factors associated with stiffness of the great arteries in prepubertal children. METHODS This study used a convenience sample of 231 schoolchildren aged 9-10 years enrolled in public and private schools in Vitória, ES, Southeastern Brazil, in 2010-2011. Anthropometric and hemodynamic data, blood pressure, and pulse wave velocity in the carotid-femoral segment were obtained. Data on current and previous health conditions were obtained by questionnaire and from notes on the child’s health card. Multiple linear regression was applied to identify the partial and total contribution of the factors in determining the pulse wave velocity values. RESULTS Among the students, 50.2% were female and 55.4% were 10 years old. Among those classified in the last tertile of pulse wave velocity, 60.0% were overweight, with higher mean blood pressure, waist circumference, and waist-to-height ratio. Birth weight was not associated with pulse wave velocity. After multiple linear regression analysis, body mass index (BMI) and diastolic blood pressure remained in the model. CONCLUSIONS BMI was the most important factor in determining arterial stiffness in children aged 9-10 years. PMID:25902563
Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.
Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu
2015-06-01
High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.
Forecasting on the total volumes of Malaysia's imports and exports by multiple linear regression
NASA Astrophysics Data System (ADS)
Beh, W. L.; Yong, M. K. Au
2017-04-01
This study gives insight into the importance of the macroeconomic variables affecting the total volumes of Malaysia's imports and exports, using multiple linear regression (MLR) analysis. The study uses quarterly data on the total volumes of Malaysia's imports and exports covering the period 2000-2015. The macroeconomic variables are limited to eleven: the exchange rate of the US Dollar against the Malaysian Ringgit (USD-MYR), the exchange rate of the Chinese Yuan against the Malaysian Ringgit (RMB-MYR), the exchange rate of the Euro against the Malaysian Ringgit (EUR-MYR), the exchange rate of the Singapore Dollar against the Malaysian Ringgit (SGD-MYR), crude oil prices, gold prices, producer price index (PPI), interest rate, consumer price index (CPI), industrial production index (IPI) and gross domestic product (GDP). This study applied the Johansen co-integration test to investigate the relationship among the total volumes of Malaysia's imports and exports. The results show that crude oil prices, RMB-MYR, EUR-MYR and IPI play important roles in the total volume of Malaysia's imports, while crude oil prices, USD-MYR and GDP play important roles in the total volume of Malaysia's exports.
Huppert, Theodore J
2016-01-01
Functional near-infrared spectroscopy (fNIRS) is a noninvasive neuroimaging technique that uses low levels of light to measure changes in cerebral blood oxygenation levels. In the majority of NIRS functional brain studies, analysis of this data is based on a statistical comparison of hemodynamic levels between a baseline and task or between multiple task conditions by means of a linear regression model: the so-called general linear model. Although these methods are similar to their implementation in other fields, particularly for functional magnetic resonance imaging, the specific application of these methods in fNIRS research differs in several key ways related to the sources of noise and artifacts unique to fNIRS. In this brief communication, we discuss the application of linear regression models in fNIRS and the modifications needed to generalize these models in order to deal with structured (colored) noise due to systemic physiology and noise heteroscedasticity due to motion artifacts. The objective of this work is to present an overview of these noise properties in the context of the linear model as it applies to fNIRS data. This work is aimed at explaining these mathematical issues to the general fNIRS experimental researcher but is not intended to be a complete mathematical treatment of these concepts.
NASA Astrophysics Data System (ADS)
Khazaei, Ardeshir; Sarmasti, Negin; Seyf, Jaber Yousefi
2016-03-01
Quantitative structure-activity relationship (QSAR) models were used to study a series of curcumin-related compounds with inhibitory effects on prostate cancer PC-3 cells, pancreas cancer Panc-1 cells, and colon cancer HT-29 cells. The sphere exclusion method was used to split the data set into training and test sets. Multiple linear regression, principal component regression and partial least squares were used as the regression methods. In addition, to investigate the effect of feature selection methods, stepwise selection, a genetic algorithm, and simulated annealing were used. In two cases (PC-3 cells and Panc-1 cells), the best models were generated by a combination of multiple linear regression and stepwise selection (PC-3 cells: r2 = 0.86, q2 = 0.82, pred_r2 = 0.93, and r2m (test) = 0.43; Panc-1 cells: r2 = 0.85, q2 = 0.80, pred_r2 = 0.71, and r2m (test) = 0.68). For the HT-29 cells, principal component regression with stepwise selection (r2 = 0.69, q2 = 0.62, pred_r2 = 0.54, and r2m (test) = 0.41) was the best method. The QSAR study reveals the descriptors that play a crucial role in the inhibitory property of curcumin-like compounds. 6ChainCount, T_C_C_1, and T_O_O_7 are the most important descriptors and have the greatest effect. To design and optimize novel, efficient curcumin-related compounds, it is useful to introduce heteroatoms such as nitrogen, oxygen, and sulfur into the chemical structure (reducing the contribution of the T_C_C_1 descriptor) and to increase the contribution of the 6ChainCount and T_O_O_7 descriptors. The models can be useful in the better design of novel curcumin-related compounds for the treatment of prostate, pancreas, and colon cancers.
Wang, Yubo; Veluvolu, Kalyana C
2017-06-14
It is often difficult to analyze biological signals because of their nonlinear and non-stationary characteristics. This necessitates the use of time-frequency decomposition methods for analyzing the subtle changes in these signals that are often connected to underlying phenomena. This paper presents a new approach to analyzing the time-varying characteristics of such signals by employing a simple truncated Fourier series model, namely the band-limited multiple Fourier linear combiner (BMFLC). In contrast to the earlier designs, we first identified the sparsity imposed on the signal model in order to reformulate the model as a sparse linear regression model. The coefficients of the proposed model are then estimated by a convex optimization algorithm. The performance of the proposed method was analyzed with benchmark test signals. An energy ratio metric is employed to quantify the spectral performance, and results show that the proposed method, Sparse-BMFLC, has a high mean energy ratio (0.9976) and outperforms existing methods such as the short-time Fourier transform (STFT), continuous wavelet transform (CWT) and BMFLC Kalman Smoother. Furthermore, the proposed method provides an overall 6.22% in reconstruction error.
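To make the "truncated Fourier series as linear regression" idea concrete, the sketch below builds a band-limited sine/cosine design matrix and fits its coefficients. It uses ordinary least squares in place of the paper's sparse convex optimisation, which the abstract does not detail; the band limits, frequency spacing, sampling rate and test signal are hypothetical.

```python
import numpy as np


def bmflc_design(t, f_low, f_high, df):
    """Columns are sin(2*pi*f*t) then cos(2*pi*f*t) for each band frequency."""
    freqs = np.arange(f_low, f_high + df / 2, df)
    phase = 2 * np.pi * np.outer(t, freqs)
    return np.hstack([np.sin(phase), np.cos(phase)]), freqs


# Hypothetical test signal: two components inside a 6-14 Hz band.
fs = 200.0
t = np.arange(200) / fs
signal = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.5 * np.cos(2 * np.pi * 12 * t)

X, freqs = bmflc_design(t, 6.0, 14.0, 1.0)
# Plain least squares; a sparse variant would add an L1 penalty here.
weights, *_ = np.linalg.lstsq(X, signal, rcond=None)
reconstruction = X @ weights

# Per-frequency amplitude from the paired sine and cosine weights.
n_f = len(freqs)
amplitude = np.hypot(weights[:n_f], weights[n_f:])
```

Because only two of the nine band frequencies carry energy, most fitted weights are near zero, which is the kind of sparsity the abstract says the reformulated model exploits.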
Regional flow duration curves: Geostatistical techniques versus multivariate regression
Pugliese, Alessio; Farmer, William H.; Castellarin, Attilio; Archfield, Stacey A.; Vogel, Richard M.
2016-01-01
A period-of-record flow duration curve (FDC) represents the relationship between the magnitude and frequency of daily streamflows. Prediction of FDCs is of great importance for locations characterized by sparse or missing streamflow observations. We present a detailed comparison of two methods which are capable of predicting an FDC at ungauged basins: (1) an adaptation of the geostatistical method, Top-kriging, employing a linear weighted average of dimensionless empirical FDCs, standardised with a reference streamflow value; and (2) regional multiple linear regression of streamflow quantiles, perhaps the most common method for the prediction of FDCs at ungauged sites. In particular, Top-kriging relies on a metric for expressing the similarity between catchments computed as the negative deviation of the FDC from a reference streamflow value, which we termed total negative deviation (TND). Comparisons of these two methods are made in 182 largely unregulated river catchments in the southeastern U.S. using a three-fold cross-validation algorithm. Our results reveal that the two methods perform similarly throughout flow regimes, with average Nash-Sutcliffe efficiencies of 0.566 and 0.662 (0.883 and 0.829 on log-transformed quantiles) for the geostatistical and the linear regression models, respectively. The differences between the reproduction of FDCs occurred mostly for low flows with exceedance probability (i.e. duration) above 0.98.
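The Nash-Sutcliffe efficiency used to score both methods above can be illustrated with a minimal sketch. The function name and the five observed/predicted quantiles below are hypothetical and are not data from the study; the formula is the standard definition (one minus the ratio of the squared prediction error to the variance of the observations about their mean).

```python
def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of observations."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the predictions
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Squared deviation of the observations about their own mean
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


# Hypothetical streamflow quantiles (illustrative values only)
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.8, 5.0]
nse = nash_sutcliffe(obs, pred)
```

A value of 1 indicates a perfect match, and a value of 0 means the prediction is no better than the mean of the observations, which is why efficiencies such as 0.566 and 0.662 indicate moderate skill.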
Kwon, Jin-Woo; Choi, Jin A; La, Tae Yoon
2016-11-01
The aim of this article was to assess the associations of serum 25-hydroxyvitamin D [25(OH)D] and daily sun exposure time with myopia in Korean adults. This study is based on the Korea National Health and Nutrition Examination Survey (KNHANES) of Korean adults in 2010-2012; multiple logistic regression analyses were performed to examine the associations of serum 25(OH)D levels and daily sun exposure time with myopia, defined as spherical equivalent ≤-0.5D, after adjustment for age, sex, household income, body mass index (BMI), exercise, intraocular pressure (IOP), and education level. Also, multiple linear regression analyses were performed to examine the relationship between serum 25(OH)D levels and spherical equivalent after adjustment for daily sun exposure time in addition to the confounding factors above. Between the nonmyopic and myopic groups, spherical equivalent, age, IOP, BMI, waist circumference, education level, household income, and area of residence differed significantly (all P < 0.05). Compared with subjects with daily sun exposure time <2 hours, subjects with sun exposure time ≥2 to <5 hours and those with sun exposure time ≥5 hours had significantly less myopia (P < 0.001). In addition, when subjects were categorized into quartiles of serum 25(OH)D, the higher quartiles had gradually lower prevalences of myopia after adjustment for confounding factors (P < 0.001). In multiple linear regression analyses, spherical equivalent was significantly associated with serum 25(OH)D concentration after adjustment for confounding factors (P = 0.002). Low serum 25(OH)D levels and shorter daily sun exposure time may be independently associated with a high prevalence of myopia in Korean adults. These data suggest a direct role for vitamin D in the development of myopia.
Evaluation of Relationship between Trunk Muscle Endurance and Static Balance in Male Students
Barati, Amirhossein; SafarCherati, Afsaneh; Aghayari, Azar; Azizi, Faeze; Abbasi, Hamed
2013-01-01
Purpose Fatigue of trunk muscles contributes to spinal instability over strenuous and prolonged physical tasks and therefore may lead to injury; however, from a performance perspective, the relation between endurance-efficient core muscles and optimal balance control is not well understood. The purpose of this study was to examine the relationship of trunk muscle endurance and static balance. Methods Fifty male students residing in a Tehran university dormitory (age 23.9±2.4, height 173.0±4.5, weight 70.7±6.3) took part in the study. Trunk muscle endurance was assessed using the Sørensen test of trunk extensor endurance, the trunk flexor endurance test, and the side bridge endurance test, and static balance was measured using the single-limb stance test. A multiple linear regression analysis was applied to test whether the trunk muscle endurance measures significantly predicted static balance. Results There were positive correlations between static balance level and trunk flexor, extensor and lateral endurance measures (Pearson correlation test, r=0.80 and P<0.001; r=0.71 and P<0.001; r=0.84 and P<0.001, respectively). According to multiple regression analysis for variables predicting static balance, the linear combination of trunk muscle endurance measures was significantly related to static balance (F (3,46) = 66.60, P<0.001). Endurance of trunk flexor, extensor and lateral muscles was significantly associated with the static balance level. The regression model which included these factors had a sample multiple correlation coefficient of 0.902, indicating that approximately 81% of the variance of the static balance is explained by the model. Conclusion There is a significant relationship between trunk muscle endurance and static balance. PMID:24800004
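The step from a multiple correlation coefficient of 0.902 to "approximately 81% of the variance" is simply squaring R, and the degrees of freedom in F(3,46) follow from 50 participants and 3 predictors. A minimal sketch of that arithmetic is given below; the function names are illustrative. Because it starts from the rounded R = 0.902, the F value it produces (about 66.9) is close to, but not identical with, the reported 66.60, presumably because of rounding in the published figures.

```python
def r_squared_from_multiple_r(multiple_r):
    """Proportion of variance explained is the square of the multiple correlation R."""
    return multiple_r ** 2


def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F test of a linear regression, computed from R^2 alone."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    f_value = (r_squared / df_model) / ((1.0 - r_squared) / df_resid)
    return f_value, df_model, df_resid


# Values taken from the abstract: R = 0.902, 50 students, 3 endurance measures
r2 = r_squared_from_multiple_r(0.902)
f_value, df1, df2 = f_statistic(r2, n_obs=50, n_predictors=3)
```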
Martínez-Moyá, María; Navarrete-Muñoz, Eva M; García de la Hera, Manuela; Giménez-Monzo, Daniel; González-Palacios, Sandra; Valera-Gran, Desirée; Sempere-Orts, María; Vioque, Jesús
2014-01-01
To explore the association between excess weight or body mass index (BMI) and the time spent watching television, self-reported physical activity and sleep duration in a young adult population. We analyzed cross-sectional baseline data of 1,135 participants (17-35 years old) from the project Dieta, salud y antropometría en población universitaria (Diet, Health and Anthropometric Variables in University Students). Information about time spent watching television, sleep duration, self-reported physical activity and self-reported height and weight was provided by a baseline questionnaire. BMI was calculated as kg/m² and excess weight was defined as ≥25. We used multiple logistic regression to explore the association between excess weight (no/yes) and independent variables, and multiple linear regression for BMI. The prevalence of excess weight was 13.7% (11.2% were overweight and 2.5% were obese). A significant positive association was found between excess weight and a greater amount of time spent watching television. Participants who reported watching television >2h a day had a higher risk of excess weight than those who watched television ≤1h a day (OR=2.13; 95%CI: 1.37-3.36; p-trend: 0.002). A lower level of physical activity was associated with an increased risk of excess weight, although the association was statistically significant only in multiple linear regression (p=0.037). No association was observed with sleep duration. A greater number of hours spent watching television and lower physical activity were significantly associated with a higher BMI in young adults. Both factors are potentially modifiable with preventive strategies. Copyright © 2013 SESPAS. Published by Elsevier Espana. All rights reserved.
Malignant testicular tumour incidence and mortality trends
Wojtyła-Buciora, Paulina; Więckowska, Barbara; Krzywinska-Wiewiorowska, Małgorzata; Gromadecka-Sutkiewicz, Małgorzata
2016-01-01
Aim of the study In Poland testicular tumours are the most frequent cancer among men aged 20–44 years. Testicular tumour incidence since the 1980s and 1990s has been diversified geographically, with an increased risk of mortality in Wielkopolska Province, which was highlighted at the turn of the 1980s and 1990s. The aim of the study was the comparative analysis of the tendencies in incidence and death rates due to malignant testicular tumours observed among men in Poland and in Wielkopolska Province. Material and methods Data from the National Cancer Registry were used for calculations. The incidence/mortality rates among men due to malignant testicular cancer as well as the tendencies in incidence/death ratio observed in Poland and Wielkopolska were established based on regression equation. The analysis was deepened by adopting the multiple linear regression model. A p-value < 0.05 was arbitrarily adopted as the criterion of statistical significance, and for multiple comparisons it was modified according to the Bonferroni adjustment to a value of p < 0.0028. Calculations were performed with the use of PQStat v1.4.8 package. Results The incidence of malignant testicular neoplasms observed among men in Poland and in Wielkopolska Province indicated a significant rising tendency. The multiple linear regression model confirmed that the year variable is a strong incidence forecast factor only within the territory of Poland. A corresponding analysis of mortality rates among men in Poland and in Wielkopolska Province did not show any statistically significant correlations. Conclusions Late diagnosis of Polish patients calls for undertaking appropriate educational activities that would facilitate earlier reporting of the patients, thus increasing their chances for recovery. Introducing preventive examinations in the regions of increased risk of testicular tumour may allow earlier diagnosis. PMID:27095941
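The Bonferroni-adjusted threshold quoted above can be reproduced by dividing the nominal significance level by the number of comparisons. The abstract does not state how many comparisons were made; the value of 18 used in this sketch is an assumption, chosen only because 0.05 / 18 rounds to the reported 0.0028.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under the Bonferroni adjustment."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


# ASSUMPTION: 18 comparisons (not stated in the abstract, inferred from p < 0.0028)
threshold = bonferroni_threshold(0.05, 18)
```

Any individual p-value would then be judged against this stricter threshold rather than against 0.05.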
Qing, Si-han; Chang, Yun-feng; Dong, Xiao-ai; Li, Yuan; Chen, Xiao-gang; Shu, Yong-kang; Deng, Zhen-hua
2013-10-01
To establish mathematical models of stature estimation for Sichuan Han females from X-ray measurements of the lumbar vertebrae, in order to provide essential data for forensic anthropology research. The samples, 206 Sichuan Han females, were divided by age into three groups: A, B and C. Group A (206 samples) consisted of all ages, group B (116 samples) were 20-45 years old, and the 90 samples over 45 years old formed group C. The lumbar vertebrae of all samples were examined by CR technology, and the parameters recorded were the anterior border, posterior border and central heights of the five centrums (L1-L5) (x1-x15), the total central height of the lumbar spine (x16), and the real height of every sample. Linear regression analysis was performed on these parameters to establish the mathematical models of stature estimation. Sixty-two trained subjects were tested to verify the accuracy of the mathematical models. The established mathematical models were statistically significant by hypothesis test of the linear regression equation (P<0.05). The standard errors of the equations were 2.982-5.004 cm, while correlation coefficients were 0.370-0.779 and multiple correlation coefficients were 0.533-0.834. The return tests of the equations with the highest correlation coefficient and multiple correlation coefficient in each group showed that the accuracy of the multiple regression equation in group A, y = 100.33 + 1.489 x3 - 0.548 x6 + 0.772 x9 + 0.058 x12 + 0.645 x15, was the highest: 80.6% (±1 SE) and 100% (±2 SE). The mathematical models established in this study could be applied to stature estimation for Sichuan Han females.
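The group A regression equation quoted above can be evaluated directly. The sketch below copies the intercept and coefficients from the abstract; the function name and the input measurements are hypothetical, and the abstract does not state the units of x3 to x15, so the example values are purely illustrative.

```python
# Coefficients of the group A equation as printed in the abstract
INTERCEPT = 100.33
COEFFICIENTS = {"x3": 1.489, "x6": -0.548, "x9": 0.772, "x12": 0.058, "x15": 0.645}


def estimate_stature(measurements):
    """Evaluate y = 100.33 + 1.489*x3 - 0.548*x6 + 0.772*x9 + 0.058*x12 + 0.645*x15."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * measurements[name] for name in COEFFICIENTS
    )


# HYPOTHETICAL measurements: every vertebral height set to 25.0 (units assumed)
example = {name: 25.0 for name in COEFFICIENTS}
stature = estimate_stature(example)
```

With these illustrative inputs the equation gives about 160.73, and the reported standard errors of 2.982-5.004 cm indicate how wide an interval around such an estimate would need to be.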
Byg, Blaire; Bazzi, Angela Robertson; Funk, Danielle; James, Bonface; Potter, Jennifer
2016-12-01
Syndemic theory posits that epidemics of multiple physical and psychosocial problems co-occur among disadvantaged groups due to adverse social conditions. Although sexual minority populations are often stigmatized and vulnerable to multiple health problems, the syndemic perspective has been underutilized in understanding chronic disease. To assess the potential utility of this perspective in understanding the management of co-occurring HIV and Type 2 diabetes, we used linear regression to examine glycemic control (A1c) among men who have sex with men (MSM) with both HIV and Type 2 diabetes (n = 88). Bivariable linear regression explored potential syndemic correlates of inadequate glycemic control. Compared to those with adequate glycemic control (A1c ≤ 7.5 %), more men with inadequate glycemic control (A1c > 7.5 %) had hypertension (70 vs. 46 %, p = 0.034), high triglycerides (93 vs. 61 %, p = 0.002), depression (67 vs. 39 %, p = 0.018), current substance abuse (15 vs. 2 %, p = 0.014), and detectable levels of HIV (i.e., viral load ≥75 copies per ml blood; 30 vs. 10 %, p = 0.019). In multivariable regression controlling for age, the factors that were independently associated with higher A1c were high triglycerides, substance use, and detectable HIV viral load, suggesting that chronic disease management among MSM is complex and challenging for patients and providers. Findings also suggest that syndemic theory can be a clarifying lens for understanding chronic disease management among sexual minority stigmatized populations. Interventions targeting single conditions may be inadequate when multiple conditions co-occur; thus, research using a syndemic framework may be helpful in identifying intervention strategies that target multiple co-occurring conditions.
Šabanagić-Hajrić, Selma; Alajbegović, Azra
2015-02-01
To evaluate the impacts of education level and employment status on health-related quality of life (HRQoL) in multiple sclerosis patients. This study included 100 multiple sclerosis patients treated at the Department of Neurology, Clinical Center of the University of Sarajevo. Inclusion criteria were an Expanded Disability Status Scale (EDSS) score between 1.0 and 6.5, age between 18 and 65 years, and stable disease on enrollment. Quality of life (QoL) was evaluated by the Multiple Sclerosis Quality of Life-54 questionnaire (MSQoL-54). Mann-Whitney and Kruskal-Wallis tests were used for comparisons. Linear regression analyses were performed to evaluate the value of educational level and employment status in predicting MSQoL-54 physical and mental composite scores. Full employment status had a positive impact on physical health (54.85 vs. 37.90; p < 0.001) and mental health (59.55 vs. 45.90; p < 0.001) composite scores. Employment status retained its independent predictability for both physical (r² = 0.105) and mental (r² = 0.076) composite scores in linear regression analysis. Patients with a college degree had slightly higher median values of physical (49.36 vs. 45.30) and mental health composite scores (66.74 vs. 55.62) compared with others, without a statistically significant difference. Employment proved to be an important factor in predicting quality of life in multiple sclerosis patients. Higher education level may determine better QoL but without significant predictive value. Sustained employment and development of vocational rehabilitation programs for MS patients living in a country with a high unemployment level are important factors in improving both physical and mental health outcomes in MS patients.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun Wei; Huang, Guo H., E-mail: huang@iseis.org; Institute for Energy, Environment and Sustainable Communities, University of Regina, Regina, Saskatchewan, S4S 0A2
2012-06-15
Highlights: ► Inexact piecewise-linearization-based fuzzy flexible programming is proposed. ► It's the first application to waste management under multiple complexities. ► It tackles nonlinear economies-of-scale effects in interval-parameter constraints. ► It estimates costs more accurately than the linear-regression-based model. ► Uncertainties are decreased and more satisfactory interval solutions are obtained. Abstract: To tackle nonlinear economies-of-scale (EOS) effects in interval-parameter constraints for a representative waste management problem, an inexact piecewise-linearization-based fuzzy flexible programming (IPFP) model is developed. In IPFP, interval parameters for waste amounts and transportation/operation costs can be quantified; aspiration levels for net system costs, as well as tolerance intervals for both capacities of waste treatment facilities and waste generation rates, can be reflected; and the nonlinear EOS effects transformed from objective function to constraints can be approximated. An interactive algorithm is proposed for solving the IPFP model, which in nature is an interval-parameter mixed-integer quadratically constrained programming model. To demonstrate the IPFP's advantages, two alternative models are developed to compare their performances. One is a conventional linear-regression-based inexact fuzzy programming model (IPFP2) and the other is an IPFP model with all right-hand sides of fuzzy constraints being the corresponding interval numbers (IPFP3). The comparison results between IPFP and IPFP2 indicate that the optimized waste amounts would have similar patterns in both models. However, when dealing with EOS effects in constraints, the IPFP2 may underestimate the net system costs while the IPFP can estimate the costs more accurately. The comparison results between IPFP and IPFP3 indicate that their solutions would be significantly different. The decreased system uncertainties in IPFP's solutions demonstrate its effectiveness for providing more satisfactory interval solutions than IPFP3. Following its first application to waste management, the IPFP can be potentially applied to other environmental problems under multiple complexities.
Effect of pencil grasp on the speed and legibility of handwriting in children.
Schwellnus, Heidi; Carnahan, Heather; Kushki, Azadeh; Polatajko, Helene; Missiuna, Cheryl; Chau, Tom
2012-01-01
Pencil grasps other than the dynamic tripod may be functional for handwriting. This study examined the impact of grasp on handwriting speed and legibility. We videotaped 120 typically developing fourth-grade students while they performed a writing task. We categorized the grasps they used and evaluated their writing for speed and legibility using a handwriting assessment. Using linear regression analysis, we examined the relationship between grasp and handwriting. We documented six categories of pencil grasp: four mature grasp patterns, one immature grasp pattern, and one alternating grasp pattern. Multiple linear regression results revealed no significant effect for mature grasp on either legibility or speed. Pencil grasp patterns did not influence handwriting speed or legibility in this sample of typically developing children. This finding adds to the mounting body of evidence that alternative grasps may be acceptable for fast and legible handwriting. Copyright © 2012 by the American Occupational Therapy Association, Inc.
Crawford, John R; Garthwaite, Paul H; Denham, Annie K; Chelune, Gordon J
2012-12-01
Regression equations have many useful roles in psychological assessment. Moreover, there is a large reservoir of published data that could be used to build regression equations; these equations could then be employed to test a wide variety of hypotheses concerning the functioning of individual cases. This resource is currently underused because (a) not all psychologists are aware that regression equations can be built not only from raw data but also using only basic summary data for a sample, and (b) the computations involved are tedious and prone to error. In an attempt to overcome these barriers, Crawford and Garthwaite (2007) provided methods to build and apply simple linear regression models using summary statistics as data. In the present study, we extend this work to set out the steps required to build multiple regression models from sample summary statistics and the further steps required to compute the associated statistics for drawing inferences concerning an individual case. We also develop, describe, and make available a computer program that implements these methods. Although there are caveats associated with the use of the methods, these need to be balanced against pragmatic considerations and against the alternative of either entirely ignoring a pertinent data set or using it informally to provide a clinical "guesstimate." Upgraded versions of earlier programs for regression in the single case are also provided; these add the point and interval estimates of effect size developed in the present article.
Solar cycle in current reanalyses: (non)linear attribution study
NASA Astrophysics Data System (ADS)
Kuchar, A.; Sacha, P.; Miksovsky, J.; Pisoft, P.
2014-12-01
This study focusses on the variability of temperature, ozone and circulation characteristics in the stratosphere and lower mesosphere with regard to the influence of the 11 year solar cycle. It is based on attribution analysis using multiple nonlinear techniques (Support Vector Regression, Neural Networks) besides the traditional linear approach. The analysis was applied to several current reanalysis datasets for the 1979-2013 period, including MERRA, ERA-Interim and JRA-55, with the aim to compare how this type of data resolves especially the double-peaked solar response in temperature and ozone variables and the consequent changes induced by these anomalies. Equatorial temperature signals in the lower and upper stratosphere were found to be sufficiently robust and in qualitative agreement with previous observational studies. The analysis also pointed to the solar signal in the ozone datasets (i.e. MERRA and ERA-Interim) not being consistent with the observed double-peaked ozone anomaly extracted from satellite measurements. Consequently the results obtained by linear regression were confirmed by the nonlinear approach through all datasets, suggesting that linear regression is a relevant tool to sufficiently resolve the solar signal in the middle atmosphere. Furthermore, the seasonal dependence of the solar response was also discussed, mainly as a source of dynamical causalities in the wave propagation characteristics in the zonal wind and the induced meridional circulation in the winter hemispheres. The hypothetical mechanism of a weaker Brewer Dobson circulation was reviewed together with discussion of polar vortex stability.
NASA Astrophysics Data System (ADS)
Wang, Xuntao; Feng, Jianhu; Wang, Hu; Hong, Shidi; Zheng, Supei
2018-03-01
A three-dimensional finite element model of a box girder bridge and its asphalt concrete deck pavement was established with ANSYS software, and the interlayer bonding condition of the asphalt concrete deck pavement was assumed to be a contact bonding condition. Orthogonal experimental design was used to arrange the testing plans of material parameters, and the effect of different material parameters on the mechanical response of the asphalt concrete surface layer was evaluated with a multiple linear regression model using the results from the finite element analysis. Results indicated that the stress regression equations can predict the stress of the asphalt concrete surface layer well, and that the elastic modulus of the waterproof layer has a significant influence on the stress values of the asphalt concrete surface layer.
Heddam, Salim
2014-11-01
The prediction of colored dissolved organic matter (CDOM) using artificial neural network approaches has received little attention in the past few decades. In this study, CDOM was modeled using generalized regression neural network (GRNN) and multiple linear regression (MLR) models as a function of water temperature (TE), pH, specific conductance (SC), and turbidity (TU). Evaluation of the prediction accuracy of the models is based on the root mean square error (RMSE), mean absolute error (MAE), coefficient of correlation (CC), and Willmott's index of agreement (d). The results indicated that GRNN can be applied successfully for prediction of CDOM.
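The four accuracy measures named above can be sketched in a few lines. The function names and the observed/predicted values below are hypothetical and are not CDOM data from the study; the formulas are the standard definitions, with Willmott's index computed as one minus the squared error divided by the summed squared "potential error" about the observed mean.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def correlation(obs, pred):
    """Pearson coefficient of correlation (CC)."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def willmott_d(obs, pred):
    """Willmott's index of agreement d, between 0 and 1."""
    mo = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


# Hypothetical values: predictions are the observations shifted by a constant bias of +1
obs = [2.0, 4.0, 6.0, 8.0]
pred = [3.0, 5.0, 7.0, 9.0]
scores = (rmse(obs, pred), mae(obs, pred), correlation(obs, pred), willmott_d(obs, pred))
```

In this illustrative case the correlation is 1 even though every prediction is off by one unit, while RMSE, MAE and d all register the bias, which is one reason to report several measures together.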
Marston, Louise; Peacock, Janet L; Yu, Keming; Brocklehurst, Peter; Calvert, Sandra A; Greenough, Anne; Marlow, Neil
2009-07-01
Studies of prematurely born infants contain a relatively large percentage of multiple births, so the resulting data have a hierarchical structure with small clusters of size 1, 2 or 3. Ignoring the clustering may lead to incorrect inferences. The aim of this study was to compare statistical methods which can be used to analyse such data: generalised estimating equations, multilevel models, multiple linear regression and logistic regression. Four datasets which differed in total size and in percentage of multiple births (n = 254, multiple 18%; n = 176, multiple 9%; n = 10 098, multiple 3%; n = 1585, multiple 8%) were analysed. With the continuous outcome, two-level models produced similar results in the larger dataset, while generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) produced divergent estimates using the smaller dataset. For the dichotomous outcome, most methods, except generalised least squares multilevel modelling (ML GH 'xtlogit' in Stata) gave similar odds ratios and 95% confidence intervals within datasets. For the continuous outcome, our results suggest using multilevel modelling. We conclude that generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) should be used with caution when the dataset is small. Where the outcome is dichotomous and there is a relatively large percentage of non-independent data, it is recommended that these are accounted for in analyses using logistic regression with adjusted standard errors or multilevel modelling. If, however, the dataset has a small percentage of clusters greater than size 1 (e.g. a population dataset of children where there are few multiples) there appears to be less need to adjust for clustering.
Fernandes, David Douglas Sousa; Gomes, Adriano A; Costa, Gean Bezerra da; Silva, Gildo William B da; Véras, Germano
2011-12-15
This work evaluates the use of the visible and near-infrared (NIR) ranges, separately and combined, to determine the biodiesel content in biodiesel/diesel blends using Multiple Linear Regression (MLR) and variable selection by the Successive Projections Algorithm (SPA). Full-spectrum models employing Partial Least Squares (PLS), variable selection by Stepwise (SW) regression coupled with MLR, and PLS models with variable selection by Jack-Knife (Jk) were compared with the proposed methodology. Several preprocessing methods were evaluated; the Savitzky-Golay derivative with a second-order polynomial and a 17-point window was chosen for the NIR and visible-NIR ranges, with offset correction. A total of 100 blends with biodiesel content between 5 and 50% (v/v) were prepared from ten biodiesel samples. In the NIR and visible regions the best model was the SPA-MLR using only two and eight wavelengths, with RMSEP of 0.6439% (v/v) and 0.5741 respectively, while in the visible-NIR region the best model was the SW-MLR using five wavelengths, with RMSEP of 0.9533% (v/v). Results indicate that both spectral ranges evaluated showed potential for developing a rapid and nondestructive method to quantify biodiesel in blends with mineral diesel. Finally, the improvement in prediction error obtained with the variable selection procedure was significant. Copyright © 2011 Elsevier B.V. All rights reserved.
Salazar, Edwin; Buitrago, Carolina; Molina, Federico; Alzate, Catalina Arango
2015-05-01
Determine the trend in mortality from external causes in pregnant and postpartum women and its relationship to socioeconomic factors. Descriptive study, based on the official registries of deaths reported by the National Statistics Agency, 1998-2010. The trend was analyzed using Poisson regressions. Bivariate correlations and multiple linear regression models were constructed to explore the relationship between mortality and socioeconomic factors: human development index, Gini index, gross domestic product, unsatisfied basic needs, unemployment rate, poverty, extreme poverty, quality of life index, illiteracy rate, and percentage of affiliation to the Social Security System. A total of 2 223 female deaths from external causes were recorded, of which 1 429 occurred during pregnancy and 794 in the postpartum period. The gross mortality rate dropped from 30.7 per 100 000 live births plus fetal deaths in 1998 to 16.7 in 2010. A downward curve with no significant inflection points was shown in the risk of dying from this cause. The multiple linear regression model showed a correlation between mortality and extreme poverty and the illiteracy rate, suggesting that these indicators could explain 89.4% of the change in mortality from external causes in pregnant and postpartum women each year in Colombia. Mortality from external causes in pregnant and postpartum women showed a significant downward trend that may be explained by important socioeconomic changes in the country, including a decrease in extreme poverty and in the illiteracy rate.
Inflammation, homocysteine and carotid intima-media thickness.
Baptista, Alexandre P; Cacdocar, Sanjiva; Palmeiro, Hugo; Faísca, Marília; Carrasqueira, Herménio; Morgado, Elsa; Sampaio, Sandra; Cabrita, Ana; Silva, Ana Paula; Bernardo, Idalécio; Gome, Veloso; Neves, Pedro L
2008-01-01
Cardiovascular disease is the main cause of morbidity and mortality in chronic renal patients. Carotid intima-media thickness (CIMT) is one of the most accurate markers of atherosclerosis risk. In this study, the authors set out to evaluate a population of chronic renal patients to determine which factors are associated with an increase in intima-media thickness. We included 56 patients (F=22, M=34), with a mean age of 68.6 years, and an estimated glomerular filtration rate of 15.8 ml/min (calculated by the MDRD equation). Various laboratory and inflammatory parameters (hsCRP, IL-6 and TNF-alpha) were evaluated. All subjects underwent measurement of internal carotid artery intima-media thickness by high-resolution real-time B-mode ultrasonography using a 10 MHz linear transducer. Intima-media thickness was used as a dependent variable in a simple linear regression model, with the various laboratory parameters as independent variables. Only parameters showing a significant correlation with CIMT were evaluated in a multiple regression model: age (p=0.001), hemoglobin (p=00.3), logCRP (p=0.042), logIL-6 (p=0.004) and homocysteine (p=0.002). In the multiple regression model we found that age (p=0.001) and homocysteine (p=0.027) were independently correlated with CIMT. LogIL-6 did not reach statistical significance (p=0.057), probably due to the small population size. The authors conclude that age and homocysteine correlate with carotid intima-media thickness, and thus can be considered as markers/risk factors in chronic renal patients.
Rodriguez-Sabate, Clara; Morales, Ingrid; Sanchez, Alberto; Rodriguez, Manuel
2017-01-01
The complexity of basal ganglia (BG) interactions is often condensed into simple models, mainly based on animal data, that present the BG in closed-loop cortico-subcortical circuits of excitatory/inhibitory pathways which analyze the incoming cortical data and return the processed information to the cortex. This study was aimed at identifying functional relationships in the BG motor loop of 24 healthy subjects who provided written, informed consent and whose BOLD activity was recorded by MRI methods. The analysis of the functional interaction between these centers by correlation techniques and multiple linear regression showed non-linear relationships which cannot be suitably addressed with these methods. Multiple correspondence analysis (MCA), an unsupervised multivariable procedure which can identify non-linear interactions, was used to study the functional connectivity of the BG when subjects were at rest. Linear methods showed different functional interactions expected according to current BG models. MCA showed additional functional interactions which were not evident when using linear methods. Seven functional configurations of the BG were identified with MCA: two involving the primary motor and somatosensory cortex, one involving the deepest BG (external-internal globus pallidum, subthalamic nucleus and substantia nigra), one with the input-output BG centers (putamen and motor thalamus), two linking the input-output centers with other BG (external pallidum and subthalamic nucleus), and one linking the external pallidum and the substantia nigra. The results provide evidence that the non-linear MCA and linear methods are complementary and are best used in conjunction to more fully understand the nature of functional connectivity of brain centers.
ERIC Educational Resources Information Center
Nielson, David E.; George, James D.; Vehrs, Pat R.; Hager, Ron L.; Webb, Carrie V.
2010-01-01
The purpose of this study was to develop a multiple linear regression model to predict treadmill VO2max scores using both exercise and non-exercise data. One hundred five college-aged participants (53 male, 52 female) successfully completed a submaximal cycle ergometer test and a maximal graded exercise test on a motorized treadmill.…
ERIC Educational Resources Information Center
Khan, Wasi Z.; Al Zubaidy, Sarim
2017-01-01
The variance in students' academic performance in a civilian institute and in a military technological institute could be linked to the environment of the competition available to the students. The magnitude of talent, domain of skills and volume of effort students put in are identical in both types of institutes. The significant factor is the…
ERIC Educational Resources Information Center
Siweya, Hlengani J.; Letsoalo, Peter
2014-01-01
This study investigated whether formative assessment is a predictor of summative assessment in a university first-year chemistry class. The sample comprised a total of 1687 first-year chemistry students chosen from the 2011 and 2012 cohorts. Both simple and multiple linear regression (SLR and MLR) techniques were applied to perform the primary aim…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Yangho; Lee, Byung-Kook, E-mail: bklee@sch.ac.kr
Introduction: The objective of this study was to evaluate associations between blood lead, cadmium, and mercury levels with estimated glomerular filtration rate in a general population of South Korean adults. Methods: This was a cross-sectional study based on data obtained in the Korean National Health and Nutrition Examination Survey (KNHANES) (2008-2010). The final analytical sample consisted of 5924 participants. Estimated glomerular filtration rate (eGFR) was calculated using the MDRD Study equation as an indicator of glomerular function. Results: In multiple linear regression analysis of log2-transformed blood lead as a continuous variable on eGFR, after adjusting for covariates including cadmium and mercury, the difference in eGFR levels associated with doubling of blood lead was -2.624 mL/min per 1.73 m² (95% CI: -3.803 to -1.445). In multiple linear regression analysis using quartiles of blood lead as the independent variable, the difference in eGFR levels comparing participants in the highest versus the lowest quartiles of blood lead was -3.835 mL/min per 1.73 m² (95% CI: -5.730 to -1.939). In a multiple linear regression analysis using blood cadmium and mercury, as continuous or categorical variables, as independent variables, neither metal was a significant predictor of eGFR. Odds ratios (ORs) and 95% CI values for reduced eGFR calculated for log2-transformed blood metals and quartiles of the three metals showed similar trends after adjustment for covariates. Discussion: In this large, representative sample of South Korean adults, elevated blood lead level was consistently associated with lower eGFR levels and with the prevalence of reduced eGFR even in blood lead levels below 10 μg/dL. In conclusion, elevated blood lead level was associated with lower eGFR in a Korean general population, supporting the role of lead as a risk factor for chronic kidney disease.
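The "per doubling" reading of a log2-transformed predictor can be illustrated with a minimal sketch. It assumes a single predictor and synthetic values generated from an exact log2 relationship; the lead and eGFR numbers and the helper names are hypothetical, and the covariate adjustment used in the study is not reproduced.

```python
import math

def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical blood lead values (ug/dL); eGFR is generated from an exact
# log2 relationship purely for illustration. These are not study data.
lead = [1.0, 2.0, 4.0, 8.0]
egfr = [95.0 - 2.624 * math.log2(v) for v in lead]

intercept, slope = fit_simple_ols([math.log2(v) for v in lead], egfr)

def predict(lead_value):
    """Predicted eGFR for a given blood lead value under the fitted line."""
    return intercept + slope * math.log2(lead_value)

# Doubling the predictor (3 -> 6) shifts the prediction by exactly the slope,
# because log2(2 * v) = log2(v) + 1.
doubling_effect = predict(6.0) - predict(3.0)
```

Because the predictor enters as log2, the slope is the expected difference in the outcome for any doubling, regardless of the starting value.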
Relationships between use of television during meals and children's food consumption patterns.
Coon, K A; Goldberg, J; Rogers, B L; Tucker, K L
2001-01-01
We examined relationships between the presence of television during meals and children's food consumption patterns to test whether children's overall food consumption patterns, including foods not normally advertised, vary systematically with the extent to which television is part of normal mealtime routines. Ninety-one parent-child pairs from suburbs adjacent to Washington, DC, recruited via advertisements and word of mouth, participated. Children were in the fourth, fifth, or sixth grades. Socioeconomic data and information on television use were collected during survey interviews. Three nonconsecutive 24-hour dietary recalls, conducted with each child, were used to construct nutrient and food intake outcome variables. Independent sample t tests were used to compare mean food and nutrient intakes of children from families in which the television was usually on during 2 or more meals (n = 41) to those of children from families in which the television was either never on or only on during one meal (n = 50). Multiple linear regression models, controlling for socioeconomic factors and other covariates, were used to test strength of associations between television and children's consumption of food groups and nutrients. Children from families with high television use derived, on average, 6% more of their total daily energy intake from meats; 5% more from pizza, salty snacks, and soda; and nearly 5% less of their energy intake from fruits, vegetables, and juices than did children from families with low television use. Associations between television and children's consumption of food groups remained statistically significant in multiple linear regression models that controlled for socioeconomic factors and other covariates. Children from high television families derived less of their total energy from carbohydrate and consumed twice as much caffeine as children from low television families. 
There continued to be a significant association between television and children's consumption of caffeine when these relationships were tested in multiple linear regression models. The dietary patterns of children from families in which television viewing is a normal part of meal routines may include fewer fruits and vegetables and more pizzas, snack foods, and sodas than the dietary patterns of children from families in which television viewing and eating are separate activities.
An open-access CMIP5 pattern library for temperature and precipitation: description and methodology
NASA Astrophysics Data System (ADS)
Lynch, Cary; Hartin, Corinne; Bond-Lamberty, Ben; Kravitz, Ben
2017-05-01
Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squares regression methods. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90° N/S). Bias and mean errors between modeled and pattern-predicted output from the linear regression method were smaller than patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5 °C, but the choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. This paper describes our library of least squares regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns. The dataset and netCDF data generation code are available at doi:10.5281/zenodo.495632.
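The two pattern-generation approaches named above can be sketched for a single grid cell as follows. This is an illustrative simplification, not the code published with the library: the series are hypothetical, and the function names and the choice of epoch length are assumptions.

```python
def mean(values):
    return sum(values) / len(values)

def delta_pattern(local, global_mean, n_epoch):
    """Delta method: local change between the last and first epochs,
    divided by the global mean change between the same epochs."""
    d_local = mean(local[-n_epoch:]) - mean(local[:n_epoch])
    d_global = mean(global_mean[-n_epoch:]) - mean(global_mean[:n_epoch])
    return d_local / d_global

def regression_pattern(local, global_mean):
    """Least squares method: slope of local values regressed on the
    global mean over the whole series."""
    g_bar = mean(global_mean)
    l_bar = mean(local)
    sxx = sum((g - g_bar) ** 2 for g in global_mean)
    sxy = sum((g - g_bar) * (l - l_bar) for g, l in zip(global_mean, local))
    return sxy / sxx

# Hypothetical series: global mean temperature anomaly (degrees C) and a
# grid cell that warms 1.8 times as fast as the global mean.
global_anomaly = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
local_anomaly = [1.8 * g for g in global_anomaly]

pattern_delta = delta_pattern(local_anomaly, global_anomaly, n_epoch=2)
pattern_ols = regression_pattern(local_anomaly, global_anomaly)
```

For an exactly linear response the two methods agree; they diverge when the local response is noisy or non-linear, which is the situation the comparison in the abstract addresses.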
NASA Astrophysics Data System (ADS)
Zhao, Wei; Fan, Shaojia; Guo, Hai; Gao, Bo; Sun, Jiaren; Chen, Laiguo
2016-11-01
The quantile regression (QR) method has been increasingly introduced to atmospheric environmental studies to explore the non-linear relationship between local meteorological conditions and ozone mixing ratios. In this study, we applied QR for the first time, together with multiple linear regression (MLR), to analyze the dominant meteorological parameters influencing the mean, 10th percentile, 90th percentile and 99th percentile of maximum daily 8-h average (MDA8) ozone concentrations in 2000-2015 in Hong Kong. Dominance analysis (DA) was used to assess the relative importance of meteorological variables in the regression models. Results showed that the MLR models worked better at suburban and rural sites than at urban sites, and worked better in winter than in summer. QR models performed better in summer for the 99th and 90th percentiles and performed better in autumn and winter for the 10th percentile. QR models also performed better in suburban and rural areas for the 10th percentile. The top 3 dominant variables associated with MDA8 ozone concentrations, changing with seasons and regions, were frequently associated with the six meteorological parameters: boundary layer height, humidity, wind direction, surface solar radiation, total cloud cover and sea level pressure. Temperature rarely became a significant variable in any season, which could partly explain the peak of monthly average ozone concentrations in October in Hong Kong. We also found that the effect of solar radiation was enhanced during extreme ozone pollution episodes (i.e., the 99th percentile). Finally, meteorological effects on MDA8 ozone showed no significant change before and after the 2010 Asian Games.
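Quantile regression replaces the squared-error loss of MLR with the "pinball" (check) loss, whose minimizer is a conditional quantile rather than the mean. The sketch below shows only the intercept-only case, where the minimizer is a sample quantile; the data values are hypothetical and the function names are illustrative.

```python
def pinball_loss(y, prediction, tau):
    """Total check loss of a constant prediction at quantile level tau."""
    total = 0.0
    for value in y:
        residual = value - prediction
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total

def best_constant(y, tau):
    """Search the observed values for the constant minimizing the check loss."""
    return min(sorted(y), key=lambda candidate: pinball_loss(y, candidate, tau))

# Hypothetical daily values with one extreme episode.
observations = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(observations, tau=0.5)
upper_fit = best_constant(observations, tau=0.9)
```

With `tau=0.5` the fit lands on the median and ignores the size of the extreme value, while `tau=0.9` moves to the upper tail; this is why separate QR fits can describe high-percentile episodes that a mean-based MLR smooths over.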
Association of Dentine Hypersensitivity with Different Risk Factors – A Cross Sectional Study
Vijaya, V; Sanjay, Venkataraam; Varghese, Rana K; Ravuri, Rajyalakshmi; Agarwal, Anil
2013-01-01
Background: This study was done to assess the prevalence of dentine hypersensitivity (DH) and its associated risk factors. Materials & Methods: This epidemiological study of the prevalence of DH was done among patients attending a dental college. A self-structured questionnaire along with a clinical examination was used for assessment. Descriptive statistics were obtained and frequency distribution was calculated using the Chi-square test at p value <0.05. Stepwise multiple linear regression was also done to assess the frequency of DH with different factors. Results: The study population comprised 655 participants of different age groups. Our study showed a prevalence of 55%, and DH was more common among males. Similarly, smokers and those who used a hard toothbrush had more cases of DH. Stepwise multiple linear regression showed that the best predictor for DH was age, followed by the habit of smoking and the type of toothbrush. The most aggravating factors were cold water (15.4%) and sweet foods (14.7%), whereas only 5% of the patients had it while brushing. Conclusion: A high level of dentine hypersensitivity was found in this study, and it was more common among males. A linear relationship was shown with age, smoking and type of toothbrush. How to cite this article: Vijaya V, Sanjay V, Varghese RK, Ravuri R, Agarwal A. Association of Dentine Hypersensitivity with Different Risk Factors – A Cross Sectional Study. J Int Oral Health 2013;5(6):88-92. PMID:24453451
Development of quantitative screen for 1550 chemicals with GC-MS.
Bergmann, Alan J; Points, Gary L; Scott, Richard P; Wilson, Glenn; Anderson, Kim A
2018-05-01
With hundreds of thousands of chemicals in the environment, effective monitoring requires high-throughput analytical techniques. This paper presents a quantitative screening method for 1550 chemicals based on statistical modeling of responses with identification and integration performed using deconvolution reporting software. The method was evaluated with representative environmental samples. We tested biological extracts, low-density polyethylene, and silicone passive sampling devices spiked with known concentrations of 196 representative chemicals. A multiple linear regression (R² = 0.80) was developed with molecular weight, logP, polar surface area, and fractional ion abundance to predict chemical responses within a factor of 2.5. Linearity beyond the calibration had R² > 0.97 for three orders of magnitude. Median limits of quantitation were estimated to be 201 pg/μL (1.9× standard deviation). The number of detected chemicals and the accuracy of quantitation were similar for environmental samples and standard solutions. To our knowledge, this is the most precise method for the largest number of semi-volatile organic chemicals lacking authentic standards. Accessible instrumentation and software make this method cost effective in quantifying a large, customizable list of chemicals. When paired with silicone wristband passive samplers, this quantitative screen will be very useful for epidemiology where binning of concentrations is common. Graphical abstract: A multiple linear regression of chemical responses measured with GC-MS allowed quantitation of 1550 chemicals in samples such as silicone wristbands.
Depuydt, Christophe E; Thys, Sofie; Beert, Johan; Jonckheere, Jef; Salembier, Geert; Bogers, Johannes J
2016-11-01
Persistent high-risk human papillomavirus (HPV) infection is strongly associated with development of high-grade cervical intraepithelial neoplasia or cancer (CIN3+). In single type infections, serial type-specific viral-load measurements predict the natural history of the infection. In infections with multiple HPV-types, the individual type-specific viral-load profile could distinguish progressing HPV-infections from regressing infections. A case-cohort natural history study was established using samples from untreated women with multiple HPV-infections who developed CIN3+ (n = 57) or cleared infections (n = 88). Enriched cell pellets from liquid based cytology samples were subjected to a clinically validated real-time qPCR-assay (18 HPV-types). Using serial type-specific viral-load measurements (≥3) we calculated HPV-specific slopes and coefficients of determination (R²) by linear regression. For each woman, slopes and R² were used to calculate which HPV-induced processes were ongoing (progression, regression, serial transient, transient). In transient infections with multiple HPV-types, each single HPV-type generated similar increasing (0.27 copies/cell/day) and decreasing (-0.27 copies/cell/day) viral-load slopes. In CIN3+, at least one of the HPV-types had a clonal progressive course (R² ≥ 0.85; 0.0025 copies/cell/day). In selected CIN3+ cases (n = 6), immunostaining detecting type-specific HPV 16, 31, 33, 58 and 67 RNA showed an even staining in clonal populations (CIN3+), whereas in transient virion-producing infections the RNA-staining was less in the basal layer compared to the upper layer where cells were ready to desquamate and release newly-formed virions. RNA-hybridization patterns matched the calculated ongoing processes measured by R² and slope in serial type-specific viral-load measurements preceding the biopsy.
In women with multiple HPV-types, serial type-specific viral-load measurements predict the natural history of the different HPV-types and elucidate HPV-genotype attribution. © 2016 UICC.
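The slope and R² computation described above can be sketched for one HPV type as follows. The measurement values are hypothetical, the function name is illustrative, and the final flag is a deliberately simplified stand-in for the authors' classification, which the abstract does not spell out in full.

```python
def slope_and_r2(days, loads):
    """Least-squares slope (copies/cell/day) and coefficient of
    determination for serial viral-load measurements."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(loads) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, loads))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(days, loads))
    ss_tot = sum((y - mean_y) ** 2 for y in loads)
    return slope, 1.0 - ss_res / ss_tot

# Hypothetical serial measurements for a single HPV type (at least 3 points).
sample_days = [0, 100, 200, 300]
sample_loads = [0.5, 0.75, 1.0, 1.25]

slope, r2 = slope_and_r2(sample_days, sample_loads)

# Simplified flag, assumed for illustration: a steady, well-fitted increase.
steady_increase = slope > 0 and r2 >= 0.85
```

Running the same function once per HPV type gives the per-type slope and R² pairs that the study then combines to decide which process is ongoing in each woman.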
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
Almalki, Mohammed J; FitzGerald, Gerry; Clark, Michele
2012-09-12
Quality of work life (QWL) has been found to influence the commitment of health professionals, including nurses. However, reliable information on QWL and turnover intention of primary health care (PHC) nurses is limited. The aim of this study was to examine the relationship between QWL and turnover intention of PHC nurses in Saudi Arabia. A cross-sectional survey was used in this study. Data were collected using Brooks' survey of Quality of Nursing Work Life, the Anticipated Turnover Scale and demographic data questions. A total of 508 PHC nurses in the Jazan Region, Saudi Arabia, completed the questionnaire (RR = 87%). Descriptive statistics, t-test, ANOVA, General Linear Model (GLM) univariate analysis, standard multiple regression, and hierarchical multiple regression were applied for analysis using SPSS v17 for Windows. Findings suggested that the respondents were dissatisfied with their work life, with almost 40% indicating a turnover intention from their current PHC centres. Turnover intention was significantly related to QWL. Using standard multiple regression, 26% of the variance in turnover intention was explained by QWL, p < 0.001, with R² = 0.263. Further analysis using hierarchical multiple regression found that the total variance explained by the model as a whole (demographics and QWL) was 32.1%, p < 0.001. QWL explained an additional 19% of the variance in turnover intention, after controlling for demographic variables. Creating and maintaining a healthy work life for PHC nurses is very important to improve their work satisfaction, reduce turnover, enhance productivity and improve nursing care outcomes.
2012-01-01
Background: Quality of work life (QWL) has been found to influence the commitment of health professionals, including nurses. However, reliable information on QWL and turnover intention of primary health care (PHC) nurses is limited. The aim of this study was to examine the relationship between QWL and turnover intention of PHC nurses in Saudi Arabia. Methods: A cross-sectional survey was used in this study. Data were collected using Brooks’ survey of Quality of Nursing Work Life, the Anticipated Turnover Scale and demographic data questions. A total of 508 PHC nurses in the Jazan Region, Saudi Arabia, completed the questionnaire (RR = 87%). Descriptive statistics, t-test, ANOVA, General Linear Model (GLM) univariate analysis, standard multiple regression, and hierarchical multiple regression were applied for analysis using SPSS v17 for Windows. Results: Findings suggested that the respondents were dissatisfied with their work life, with almost 40% indicating a turnover intention from their current PHC centres. Turnover intention was significantly related to QWL. Using standard multiple regression, 26% of the variance in turnover intention was explained by QWL, p < 0.001, with R² = 0.263. Further analysis using hierarchical multiple regression found that the total variance explained by the model as a whole (demographics and QWL) was 32.1%, p < 0.001. QWL explained an additional 19% of the variance in turnover intention, after controlling for demographic variables. Conclusions: Creating and maintaining a healthy work life for PHC nurses is very important to improve their work satisfaction, reduce turnover, enhance productivity and improve nursing care outcomes. PMID:22970764
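The hierarchical regression figures above fit together through the increment in explained variance. As a worked sketch, assuming the reported percentages are unadjusted R² values and writing the demographics-only block as $R^2_{\text{demo}}$ (a figure the abstract does not report directly):

```latex
\[
\Delta R^2_{\text{QWL}} = R^2_{\text{demo+QWL}} - R^2_{\text{demo}}
\quad\Longrightarrow\quad
R^2_{\text{demo}} \approx 0.321 - 0.19 = 0.131 .
\]
```

The implied value of roughly 13% for demographics alone is an inference from the rounded figures, not a reported result. It also explains why the 19% increment is smaller than the 26% that QWL explains on its own: part of the variance QWL accounts for overlaps with the demographic variables entered first.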
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Investigating bias in squared regression structure coefficients
Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce
2015-01-01
The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273
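For orientation, the quantity under study can be written out explicitly. Using the standard definition of a regression structure coefficient as the correlation between predictor $x_j$ and the predicted scores $\hat{y}$ (the notation is assumed here, not taken from the article):

```latex
\[
r_{s_j} = r_{x_j \hat{y}} = \frac{r_{x_j y}}{R},
\qquad
r_{s_j}^{2} = \frac{r_{x_j y}^{2}}{R^{2}},
\]
```

where $r_{x_j y}$ is the zero-order Pearson correlation between the predictor and the outcome and $R^2$ is the coefficient of determination. Since the squared structure coefficient is a ratio of two sample statistics that are each biased, it is plausible that bias-correcting both terms, as the study does, changes the estimate.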
NASA Astrophysics Data System (ADS)
Shih, C. Y.; Tsuei, Y. G.; Allemang, R. J.; Brown, D. L.
1988-10-01
A method of using the matrix Auto-Regressive Moving Average (ARMA) model in the Laplace domain for multiple-reference global parameter identification is presented. This method is particularly applicable to the area of modal analysis where high modal density exists. The method is also applicable when multiple reference frequency response functions are used to characterise linear systems. In order to facilitate the mathematical solution, the Forsythe orthogonal polynomial is used to reduce the ill-conditioning of the formulated equations and to decouple the normal matrix into two reduced matrix blocks. A Complex Mode Indicator Function (CMIF) is introduced, which can be used to determine the proper order of the rational polynomials.
Ribbons, Karen; Lea, Rodney; Schofield, Peter W; Lechner-Scott, Jeannette
2017-01-01
Neurological and psychological symptoms in multiple sclerosis can affect cognitive function. The objective of this study was to explore the relationship between psychological measures and cognitive performance in a patient cohort. In 322 multiple sclerosis patients, psychological symptoms were measured using the Depression Anxiety and Stress Scale, and cognitive function was evaluated using the Audio Recorded Cognitive Screen. Multifactor linear regression analysis, accounting for all clinical covariates, found that anxiety was the only psychological measure to remain a significant predictor of cognitive performance (p<0.001), particularly memory function (p<0.001). Further prospective studies are required to determine whether treatment of anxiety improves cognitive impairment.
Wan, Chao; Hao, Zhixiu; Wen, Shizhu; Leng, Huijie
2014-01-01
The mechanical properties of ligaments are key contributors to the stability and function of musculoskeletal joints. Ligaments are generally composed of ground substance, collagen (mainly type I and III collagen), and minimal elastin fibers. However, no consensus has been reached about whether the distribution of different types of collagen correlates with the mechanical behaviors of ligaments. The main objective of this study was to determine whether the collagen type distribution is correlated with the mechanical properties of ligaments. Using axial tensile tests and picrosirius red staining-polarization observations, the mechanical behaviors and the ratios of the various types of collagen were investigated for twenty-four rabbit medial collateral ligaments from twenty-four rabbits of different ages, respectively. One-way analysis of variance was used in the comparison of the Young's modulus in the linear region of the stress-strain curves and the ratios of type I and III collagen for the specimens (the mid-substance specimens of the ligaments) with different ages. A multiple linear regression was performed using the collagen contents (the ratios of type I and III collagen) and the Young's modulus of the specimens. During the maturation of the ligaments, the type I collagen content increased, and the type III collagen content decreased. A significant and strong correlation (R² = 0.839, P < 0.05) was identified by multiple linear regression between the collagen contents (i.e., the ratios of type I and type III collagen) and the mechanical properties of the specimens. The collagen content of ligaments might provide a new perspective for evaluating the linear modulus of global stress-strain curves for ligaments and open a new door for studying the mechanical behaviors and functions of connective tissues. PMID:25062068
Wan, Chao; Hao, Zhixiu; Wen, Shizhu; Leng, Huijie
2014-01-01
The mechanical properties of ligaments are key contributors to the stability and function of musculoskeletal joints. Ligaments are generally composed of ground substance, collagen (mainly type I and III collagen), and minimal elastin fibers. However, no consensus has been reached about whether the distribution of different types of collagen correlates with the mechanical behaviors of ligaments. The main objective of this study was to determine whether the collagen type distribution is correlated with the mechanical properties of ligaments. Using axial tensile tests and picrosirius red staining-polarization observations, the mechanical behaviors and the ratios of the various types of collagen were investigated for twenty-four rabbit medial collateral ligaments from twenty-four rabbits of different ages, respectively. One-way analysis of variance was used in the comparison of the Young's modulus in the linear region of the stress-strain curves and the ratios of type I and III collagen for the specimens (the mid-substance specimens of the ligaments) with different ages. A multiple linear regression was performed using the collagen contents (the ratios of type I and III collagen) and the Young's modulus of the specimens. During the maturation of the ligaments, the type I collagen content increased, and the type III collagen content decreased. A significant and strong correlation (R² = 0.839, P < 0.05) was identified by multiple linear regression between the collagen contents (i.e., the ratios of type I and type III collagen) and the mechanical properties of the specimens. The collagen content of ligaments might provide a new perspective for evaluating the linear modulus of global stress-strain curves for ligaments and open a new door for studying the mechanical behaviors and functions of connective tissues.
Meteorological adjustment of yearly mean values for air pollutant concentration comparison
NASA Technical Reports Server (NTRS)
Sidik, S. M.; Neustadter, H. E.
1976-01-01
Using multiple linear regression analysis, models which estimate mean concentrations of Total Suspended Particulate (TSP), sulfur dioxide, and nitrogen dioxide as a function of several meteorologic variables, two rough economic indicators, and a simple trend in time are studied. Meteorologic data were obtained and do not include inversion heights. The goodness of fit of the estimated models is partially reflected by the squared coefficient of multiple correlation which indicates that, at the various sampling stations, the models accounted for about 23 to 47 percent of the total variance of the observed TSP concentrations. If the resulting model equations are used in place of simple overall means of the observed concentrations, there is about a 20 percent improvement in either: (1) predicting mean concentrations for specified meteorological conditions; or (2) adjusting successive yearly averages to allow for comparisons devoid of meteorological effects. An application to source identification is presented using regression coefficients of wind velocity predictor variables.
Howley, Donna; Howley, Peter; Oxenham, Marc F
2018-06-01
Stature and a further 8 anthropometric dimensions were recorded from the arms and hands of a sample of 96 staff and students from the Australian National University and The University of Newcastle, Australia. These dimensions were used to create simple and multiple logistic regression models for sex estimation and simple and multiple linear regression equations for stature estimation of a contemporary Australian population. Overall sex classification accuracies using the models created were comparable to similar studies. The stature estimation models achieved standard errors of estimates (SEE) which were comparable to and in many cases lower than those achieved in similar research. Generic, non-sex-specific models achieved similar SEEs and R² values to the sex-specific models, indicating stature may be accurately estimated when sex is unknown. Copyright © 2018 Elsevier B.V. All rights reserved.
Optimizing methods for linking cinematic features to fMRI data.
Kauttonen, Janne; Hlushchuk, Yevhen; Tikka, Pia
2015-04-15
One of the challenges of naturalistic neurosciences using movie-viewing experiments is how to interpret observed brain activations in relation to the multiplicity of time-locked stimulus features. As previous studies have shown less inter-subject synchronization across viewers of random video footage than story-driven films, new methods need to be developed for analysis of less story-driven contents. To optimize the linkage between our fMRI data collected during viewing of a deliberately non-narrative silent film 'At Land' by Maya Deren (1944) and its annotated content, we combined the method of elastic-net regularization with the model-driven linear regression and the well-established data-driven independent component analysis (ICA) and inter-subject correlation (ISC) methods. In the linear regression analysis, both IC and region-of-interest (ROI) time-series were fitted with time-series of a total of 36 binary-valued and one real-valued tactile annotation of film features. The elastic-net regularization and cross-validation were applied in the ordinary least-squares linear regression in order to avoid over-fitting due to the multicollinearity of regressors, the results were compared against both the partial least-squares (PLS) regression and the un-regularized full-model regression. Non-parametric permutation testing scheme was applied to evaluate the statistical significance of regression. We found statistically significant correlation between the annotation model and 9 ICs out of 40 ICs. Regression analysis was also repeated for a large set of cubic ROIs covering the grey matter. Both IC- and ROI-based regression analyses revealed activations in parietal and occipital regions, with additional smaller clusters in the frontal lobe. Furthermore, we found elastic-net based regression more sensitive than PLS and un-regularized regression since it detected a larger number of significant ICs and ROIs. 
Along with the ISC ranking methods, our regression analysis proved a feasible method for ordering the ICs based on their functional relevance to the annotated cinematic features. The novelty of our method is - in comparison to the hypothesis-driven manual pre-selection and observation of some individual regressors biased by choice - in applying data-driven approach to all content features simultaneously. We found especially the combination of regularized regression and ICA useful when analyzing fMRI data obtained using non-narrative movie stimulus with a large set of complex and correlated features. Copyright © 2015. Published by Elsevier Inc.
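The non-parametric permutation testing mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify the authors' scheme, so everything here is an assumption: the function names `pearson_r` and `permutation_p_value` are hypothetical, a single regressor is correlated with a single time-series, and the time-series is shuffled freely, which ignores the temporal autocorrelation that a real fMRI analysis would need to respect.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the correlation of x and y.

    Assumption: observations are exchangeable, so y can be shuffled freely.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement as one permutation,
    # so the estimate can never be exactly zero.
    return (hits + 1) / (n_perm + 1)
```

With a regressor that tracks the time-series closely, few or no shuffles reach the observed correlation and the returned p-value approaches 1/(n_perm + 1).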
Steinmann, Zoran J N; Venkatesh, Aranya; Hauck, Mara; Schipper, Aafke M; Karuppiah, Ramkumar; Laurenzi, Ian J; Huijbregts, Mark A J
2014-05-06
One of the major challenges in life cycle assessment (LCA) is the availability and quality of data used to develop models and to make appropriate recommendations. Approximations and assumptions are often made if appropriate data are not readily available. However, these proxies may introduce uncertainty into the results. A regression model framework may be employed to assess missing data in LCAs of products and processes. In this study, we develop such a regression-based framework to estimate CO2 emission factors associated with coal power plants in the absence of reported data. Our framework hypothesizes that emissions from coal power plants can be explained by plant-specific factors (predictors) that include steam pressure, total capacity, plant age, fuel type, and gross domestic product (GDP) per capita of the resident nations of those plants. Using reported emission data for 444 plants worldwide, plant level CO2 emission factors were fitted to the selected predictors by a multiple linear regression model and a local linear regression model. The validated models were then applied to 764 coal power plants worldwide, for which no reported data were available. Cumulatively, available reported data and our predictions together account for 74% of the total world's coal-fired power generation capacity.
Anodic microbial community diversity as a predictor of the power output of microbial fuel cells.
Stratford, James P; Beecroft, Nelli J; Slade, Robert C T; Grüning, André; Avignone-Rossa, Claudio
2014-03-01
The relationship between the diversity of mixed-species microbial consortia and their electrogenic potential in the anodes of microbial fuel cells was examined using different diversity measures as predictors. Identical microbial fuel cells were sampled at multiple time-points. Biofilm and suspension communities were analysed by denaturing gradient gel electrophoresis to calculate the number and relative abundance of species. Shannon and Simpson indices and richness were examined for association with power using bivariate and multiple linear regression, with biofilm DNA as an additional variable. In simple bivariate regressions, the correlation of Shannon diversity of the biofilm and power is stronger (r=0.65, p=0.001) than between power and richness (r=0.39, p=0.076), or between power and the Simpson index (r=0.5, p=0.018). Using Shannon diversity and biofilm DNA as predictors of power, a regression model can be constructed (r=0.73, p<0.001). Ecological parameters such as the Shannon index are predictive of the electrogenic potential of microbial communities. Copyright © 2014 Elsevier Ltd. All rights reserved.
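The diversity measures named in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, the input is a list of per-species abundances (for example band intensities), the Shannon index uses the natural logarithm, and the Simpson index is taken in its Gini-Simpson form, 1 minus the sum of squared proportions. The abstract does not state which variants were used.

```python
import math

def relative_abundances(counts):
    """Convert raw abundances to proportions, dropping absent species."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one species must have a positive abundance")
    return [c / total for c in counts if c > 0]

def richness(counts):
    """Number of species with a positive abundance."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon index H = -sum(p * ln p), natural log assumed."""
    return -sum(p * math.log(p) for p in relative_abundances(counts))

def simpson_index(counts):
    """Gini-Simpson form, 1 - sum(p^2) (an assumed variant)."""
    return 1.0 - sum(p * p for p in relative_abundances(counts))
```

For a perfectly even community of S species the Shannon index equals ln S, which gives a quick sanity check on any implementation.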
Covariate Selection for Multilevel Models with Missing Data
Marino, Miguel; Buxton, Orfeu M.; Li, Yi
2017-01-01
Missing covariate data hampers variable selection in multilevel regression settings. Current variable selection techniques for multiply-imputed data commonly address missingness in the predictors through list-wise deletion and stepwise-selection methods which are problematic. Moreover, most variable selection methods are developed for independent linear regression models and do not accommodate multilevel mixed effects regression models with incomplete covariate data. We develop a novel methodology that is able to perform covariate selection across multiply-imputed data for multilevel random effects models when missing data is present. Specifically, we propose to stack the multiply-imputed data sets from a multiple imputation procedure and to apply a group variable selection procedure through group lasso regularization to assess the overall impact of each predictor on the outcome across the imputed data sets. Simulations confirm the advantageous performance of the proposed method compared with the competing methods. We applied the method to reanalyze the Healthy Directions-Small Business cancer prevention study, which evaluated a behavioral intervention program targeting multiple risk-related behaviors in a working-class, multi-ethnic population. PMID:28239457
High school science enrollment of black students
NASA Astrophysics Data System (ADS)
Goggins, Ellen O.; Lindbeck, Joy S.
How can the high school science enrollment of black students be increased? School and home counseling and classroom procedures could benefit from variables identified as predictors of science enrollment. The problem in this study was to identify a set of variables which characterize science course enrollment by black secondary students. The population consisted of a subsample of 3963 black high school seniors from The High School and Beyond 1980 Base-Year Survey. Using multiple linear regression, backward regression, and correlation analyses, the US Census regions and grades mostly As and Bs in English were found to be significant predictors of the number of science courses scheduled by black seniors.
NASA Technical Reports Server (NTRS)
Stankiewicz, N.
1982-01-01
The multiple channel input signal to a soft limiter amplifier such as a traveling wave tube is represented as a finite, linear sum of Gaussian functions in the frequency domain. Linear regression is used to fit the channel shapes to a least squares residual error. Distortions in output signal, namely intermodulation products, are produced by the nonlinear gain characteristic of the amplifier and constitute the principal noise analyzed in this study. The signal to noise ratios are calculated for various input powers from saturation to 10 dB below saturation for two specific distributions of channels. A criterion for the truncation of the series expansion of the nonlinear transfer characteristic is given. It is found that the signal to noise ratios are very sensitive to the coefficients used in this expansion. Improper or incorrect truncation of the series leads to ambiguous results in the signal to noise ratios.
Kumar, K Vasanth
2007-04-02
Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to the pseudo-second order models of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie by linear and non-linear regression methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo-second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo-second order model but with different assumptions. The best fit of the experimental data to Ho's pseudo-second order expression by the linear and non-linear regression methods showed that Ho's pseudo-second order model was a better kinetic expression when compared to the other pseudo-second order kinetic expressions.
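The linear-regression route to pseudo-second order parameters can be sketched as below. The sketch assumes the commonly cited linearized form of Ho's model, t/q_t = 1/(k q_e^2) + t/q_e, so that a straight-line fit of t/q_t against t has slope 1/q_e and intercept 1/(k q_e^2). The function names and the synthetic data are hypothetical; the abstract does not give the equations or the data actually used.

```python
def simple_linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_pseudo_second_order(times, uptake):
    """Estimate q_e and k from the linearized form t/q_t = 1/(k q_e^2) + t/q_e.

    times must be strictly positive so that t/q_t is defined.
    """
    ratio = [t / q for t, q in zip(times, uptake)]
    slope, intercept = simple_linear_fit(times, ratio)
    q_e = 1.0 / slope
    k = slope * slope / intercept
    return q_e, k

def pseudo_second_order_uptake(t, q_e, k):
    """Integrated rate law: q_t = k q_e^2 t / (1 + k q_e t)."""
    return k * q_e * q_e * t / (1.0 + k * q_e * t)
```

Because the transformation to t/q_t changes the error structure of the data, parameters from the linear fit can differ from those of a non-linear fit on real, noisy measurements, which is the comparison the abstract describes.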
Brian K. Via; Todd F. Shupe; Leslie H. Groom; Michael Stine; Chi-Leung So
2003-01-01
In manufacturing, monitoring the mechanical properties of wood with near infrared spectroscopy (NIR) is an attractive alternative to more conventional methods. However, no attention has been given to see if models differ between juvenile and mature wood. Additionally, it would be convenient if multiple linear regression (MLR) could perform well in the place of more...
We applied a multiple linear regression model to understand the relationships of PM2.5 with meteorological variables in the contiguous US and from there to infer the sensitivity of PM2.5 to climate change. We used 2004-2008 PM2.5 observations fro...
ERIC Educational Resources Information Center
Emerson, Natacha D.; Morrell, Holly E. R.; Neece, Cameron
2016-01-01
Having a consistent source of medical care may facilitate diagnosis of autism spectrum disorders (ASD). This study examined predictors of age of ASD diagnosis using data from the 2011-2012 National Survey of Children's Health. Using multiple linear regression analysis, age of diagnosis was predicted by race, ASD severity, having a consistent…
Statistical Tutorial | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. ST is designed as a follow up to Statistical Analysis of Research Data (SARD) held in April 2018. The tutorial will apply the general principles of statistical analysis of research data including descriptive statistics, z- and t-tests of means and mean differences, simple and multiple linear regression, ANOVA tests, and Chi-Squared distribution.
ERIC Educational Resources Information Center
Beauchamp, Guy
2005-01-01
A study to present specific hypothesis that satisfactorily explain the boiling point of a number of molecules, CH[subscript w]F[subscript x]Cl[subscript y]Br[subscript z] having similar structure, and then analyze the model with the help of multiple linear regression (MLR), a data analysis tool. The MLR analysis was useful in selecting the…
The dynamic model of enterprise revenue management
NASA Astrophysics Data System (ADS)
Mitsel, A. A.; Kataev, M. Yu; Kozlov, S. V.; Korepanov, K. V.
2017-01-01
The article presents the dynamic model of enterprise revenue management. This model is based on the quadratic criterion and linear control law. The model is founded on multiple regression that links revenues with the financial performance of the enterprise. As a result, optimal management is obtained so as to provide the given enterprise revenue, namely, the values of financial indicators that ensure the planned profit of the organization are acquired.
Catalog of Air Force Weather Technical Documents, 1941-2006
2006-05-19
radiosondes in current use in USA. Elementary discussion of statistical terms and concepts used for expressing accuracy or error is discussed. AWS TR 105...Techniques, Appendix B: Vorticity—An Elementary Discussion of the Concept, August 1956, 27pp. Formerly AWSM 105– 50/1A. Provides the necessary back...steps involved in ordinary multiple linear regression. Conditional probability is calculated using transnormalized variables in the multivariate normal
Prediction system of hydroponic plant growth and development using algorithm Fuzzy Mamdani method
NASA Astrophysics Data System (ADS)
Sudana, I. Made; Purnawirawan, Okta; Arief, Ulfa Mediaty
2017-03-01
Hydroponics is a method of farming without soil. One of the hydroponic plants is watercress (Nasturtium officinale). The development and growth of hydroponic watercress is influenced by the levels of nutrients, acidity and temperature. These independent variables can be used as the input variables of a system that predicts the level of plant growth and development. The prediction system uses the Mamdani fuzzy algorithm. The system was built to implement the Fuzzy Inference System (FIS) function of the Fuzzy Logic Toolbox (FLT) in MATLAB R2007b. FIS is a computing system that works on the principle of fuzzy reasoning, which is similar to human reasoning. Basically, FIS consists of four units: a fuzzification unit, a fuzzy logic reasoning unit, a knowledge base unit and a defuzzification unit. In addition, the effect of the independent variables on plant growth and development can be visualized with the three-dimensional FIS output surface diagram, and statistical tests based on the data from the prediction system were carried out using the multiple linear regression method, including multiple linear regression analysis, the t test, the F test, the coefficient of determination and the predictor contributions, calculated using SPSS (Statistical Product and Service Solutions) software.
Linard, Joshua I.
2013-01-01
Mitigating the effects of salt and selenium on water quality in the Grand Valley and lower Gunnison River Basin in western Colorado is a major concern for land managers. Previous modeling indicated means to improve the models by including more detailed geospatial data and a more rigorous method for developing the models. After evaluating all possible combinations of geospatial variables, four multiple linear regression models resulted that could estimate irrigation-season salt yield, nonirrigation-season salt yield, irrigation-season selenium yield, and nonirrigation-season selenium yield. The adjusted r-squared and the residual standard error (in units of log-transformed yield) of the models were, respectively, 0.87 and 2.03 for the irrigation-season salt model, 0.90 and 1.25 for the nonirrigation-season salt model, 0.85 and 2.94 for the irrigation-season selenium model, and 0.93 and 1.75 for the nonirrigation-season selenium model. The four models were used to estimate yields and loads from contributing areas corresponding to 12-digit hydrologic unit codes in the lower Gunnison River Basin study area. Each of the 175 contributing areas was ranked according to its estimated mean seasonal yield of salt and selenium.
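The two fit statistics reported in this abstract, adjusted r-squared and residual standard error, can be computed from observed and fitted values as in the sketch below. The function name `fit_statistics` and its arguments are hypothetical, and the formulas are the standard textbook definitions for a model with an intercept and `n_predictors` explanatory variables; the report's own computation is not shown in the abstract.

```python
import math

def fit_statistics(observed, fitted, n_predictors):
    """Return (r_squared, adjusted_r_squared, residual_standard_error).

    Assumes a linear model with an intercept, so the residual degrees of
    freedom are n - n_predictors - 1.
    """
    n = len(observed)
    dof = n - n_predictors - 1
    if dof <= 0:
        raise ValueError("not enough observations for this many predictors")
    mean_obs = sum(observed) / n
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    r_squared = 1.0 - sse / sst
    adjusted = 1.0 - (1.0 - r_squared) * (n - 1) / dof
    rse = math.sqrt(sse / dof)
    return r_squared, adjusted, rse
```

When the response is log-transformed, as stated in the abstract, the residual standard error is in log units and cannot be read directly as an error in yield.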
Yamakado, Minoru; Tanaka, Takayuki; Nagao, Kenji; Imaizumi, Akira; Komatsu, Michiharu; Daimon, Takashi; Miyano, Hiroshi; Tani, Mizuki; Toda, Akiko; Yamamoto, Hiroshi; Horimoto, Katsuhisa; Ishizaka, Yuko
2017-11-03
Fatty liver disease (FLD) increases the risk of diabetes, cardiovascular disease, and steatohepatitis, which leads to fibrosis, cirrhosis, and hepatocellular carcinoma. Thus, the early detection of FLD is necessary. We aimed to find a quantitative and feasible model for discriminating the FLD, based on plasma free amino acid (PFAA) profiles. We constructed models of the relationship between PFAA levels in 2,000 generally healthy Japanese subjects and the diagnosis of FLD by abdominal ultrasound scan by multiple logistic regression analysis with variable selection. The performance of these models for FLD discrimination was validated using an independent data set of 2,160 subjects. The generated PFAA-based model was able to identify FLD patients. The area under the receiver operating characteristic curve for the model was 0.83, which was higher than those of other existing liver function-associated markers ranging from 0.53 to 0.80. The value of the linear discriminant in the model yielded the adjusted odds ratio (with 95% confidence intervals) for a 1 standard deviation increase of 2.63 (2.14-3.25) in the multiple logistic regression analysis with known liver function-associated covariates. Interestingly, the linear discriminant values were significantly associated with the progression of FLD, and patients with nonalcoholic steatohepatitis also exhibited higher values.
Li, Siyue; Zhang, Quanfa
2011-06-15
Water samples were collected for determination of dissolved trace metals in 56 sampling sites throughout the upper Han River, China. Multivariate statistical analyses including correlation analysis, stepwise multiple linear regression models, and principal component and factor analysis (PCA/FA) were employed to examine the land use influences on trace metals, and a receptor model of factor analysis-multiple linear regression (FA-MLR) was used for source identification/apportionment of anthropogenic heavy metals in the surface water of the River. Our results revealed that land use was an important factor in water metals in the snow melt flow period and land use in the riparian zone was not a better predictor of metals than land use away from the river. Urbanization in a watershed and vegetation along river networks could better explain metals, and agriculture, regardless of its relative location, however slightly explained metal variables in the upper Han River. FA-MLR analysis identified five source types of metals, and mining, fossil fuel combustion, and vehicle exhaust were the dominant pollutions in the surface waters. The results demonstrated great impacts of human activities on metal concentrations in the subtropical river of China. Copyright © 2011 Elsevier B.V. All rights reserved.
2014-01-01
Background: It is not well established how psychosocial factors like social support and depression affect health-related quality of life in multimorbid and elderly patients. We investigated whether depressive mood mediates the influence of social support on health-related quality of life. Methods: Cross-sectional data of 3,189 multimorbid patients from the baseline assessment of the German MultiCare cohort study were used. Mediation was tested using the approach described by Baron and Kenny based on multiple linear regression, and controlling for socioeconomic variables and burden of multimorbidity. Results: Mediation analyses confirmed that depressive mood mediates the influence of social support on health-related quality of life (Sobel's p < 0.001). Multiple linear regression showed that the influence of depressive mood (β = −0.341, p < 0.01) on health-related quality of life is greater than the influence of multimorbidity (β = −0.234, p < 0.01). Conclusion: Social support influences health-related quality of life, but this association is strongly mediated by depressive mood. Depression should be taken into consideration in research on multimorbidity, and clinicians should be aware of its importance when caring for multimorbid patients. Trial registration: ISRCTN89818205. PMID:24708815
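The Sobel test reported in this abstract can be sketched as follows. In the Baron and Kenny framework, a is the regression coefficient of the mediator on the predictor and b is the coefficient of the outcome on the mediator with the predictor held fixed; the Sobel statistic is z = ab / sqrt(b^2 s_a^2 + a^2 s_b^2). The function names and the example numbers in the usage note are hypothetical and are not taken from the study.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """Sobel z statistic for the indirect effect a * b."""
    se_ab = math.sqrt(b * b * se_a * se_a + a * a * se_b * se_b)
    return (a * b) / se_ab

def sobel_p_value(a, se_a, b, se_b):
    """Two-sided p-value from the standard normal distribution."""
    z = sobel_z(a, se_a, b, se_b)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))
```

For illustration, with made-up values a = 0.5, b = 0.4 and standard errors of 0.1 each, `sobel_z(0.5, 0.1, 0.4, 0.1)` is about 3.12. The test relies on a normal approximation for the product ab, which can be poor in small samples.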
Five-Hole Flow Angle Probe Calibration for the NASA Glenn Icing Research Tunnel
NASA Technical Reports Server (NTRS)
Gonsalez, Jose C.; Arrington, E. Allen
1999-01-01
A spring 1997 test section calibration program is scheduled for the NASA Glenn Research Center Icing Research Tunnel following the installation of new water injecting spray bars. A set of new five-hole flow angle pressure probes was fabricated to properly calibrate the test section for total pressure, static pressure, and flow angle. The probes have nine pressure ports: five total pressure ports on a hemispherical head and four static pressure ports located 14.7 diameters downstream of the head. The probes were calibrated in the NASA Glenn 3.5-in.-diameter free-jet calibration facility. After completing calibration data acquisition for two probes, two data prediction models were evaluated. Prediction errors from a linear discrete model proved to be no worse than those from a full third-order multiple regression model. The linear discrete model only required calibration data acquisition according to an abridged test matrix, thus saving considerable time and financial resources over the multiple regression model that required calibration data acquisition according to a more extensive test matrix. Uncertainties in calibration coefficients and predicted values of flow angle, total pressure, static pressure, Mach number, and velocity were examined. These uncertainties consider the instrumentation that will be available in the Icing Research Tunnel for future test section calibration testing.
Association between the Type of Workplace and Lung Function in Copper Miners
Gruszczyński, Leszek; Wojakowska, Anna; Ścieszka, Marek; Turczyn, Barbara; Schmidt, Edward
2016-01-01
The aim of the analysis was to retrospectively assess changes in lung function in copper miners depending on the type of workplace. In the groups of 225 operators, 188 welders, and 475 representatives of other jobs, spirometry was performed at the start of employment and subsequently after 10, 20, and 25 years of work. Spirometry Longitudinal Data Analysis software was used to estimate changes in group means for FEV1 and FVC. Multiple linear regression analysis was used to assess an association between workplace and lung function. Lung function assessed on the basis of calculation of longitudinal FEV1 (FVC) decline was similar in all studied groups. However, the multiple linear regression model used in cross-sectional analysis revealed an association between workplace and lung function. In the group of welders, FEF75 was lower in comparison to operators and other miners as early as after 10 years of work. Simultaneously, in smoking welders, the FEV1/FVC ratio was lower than in nonsmokers (p < 0.05). Interactions between type of workplace and smoking (p < 0.05) in their effect on FVC, FEV1, PEF, and FEF50 were shown. Among underground working copper miners, the group of smoking welders is especially threatened by impairment of lung ventilatory function. PMID:27274987
NASA Astrophysics Data System (ADS)
Laborda, Francisco; Medrano, Jesús; Castillo, Juan R.
2004-06-01
The quality of the quantitative results obtained from transient signals in high-performance liquid chromatography-inductively coupled plasma mass spectrometry (HPLC-ICPMS) and flow injection-inductively coupled plasma mass spectrometry (FI-ICPMS) was investigated under multielement conditions. Quantification methods were based on multiple-point calibration by simple and weighted linear regression, and double-point calibration (measurement of the baseline and one standard). An uncertainty model, which includes the main sources of uncertainty from FI-ICPMS and HPLC-ICPMS (signal measurement, sample flow rate and injection volume), was developed to estimate peak area uncertainties and statistical weights used in weighted linear regression. The behaviour of the ICPMS instrument was characterized in order to be considered in the model, concluding that the instrument works as a concentration detector when it is used to monitor transient signals from flow injection or chromatographic separations. Proper quantification by the three calibration methods was achieved when compared to reference materials, and the double-point calibration gave results of the same quality as the multiple-point calibration while shortening the calibration time. Relative expanded uncertainties ranged from 10-20% for concentrations around the LOQ to 5% for concentrations higher than 100 times the LOQ.
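Multiple-point calibration by weighted linear regression, as described in this abstract, can be sketched as below. The sketch assumes weights equal to the inverse squared uncertainty of each peak area, w_i = 1/u_i^2, which is a common convention; the abstract does not state how the authors derived their weights from the uncertainty model. All function and variable names are hypothetical.

```python
def weighted_linear_fit(conc, area, uncertainty):
    """Weighted least-squares line area = slope * conc + intercept.

    Weights are 1 / u_i^2 (an assumed weighting scheme).
    """
    w = [1.0 / (u * u) for u in uncertainty]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, conc))
    swy = sum(wi * y for wi, y in zip(w, area))
    swxx = sum(wi * x * x for wi, x in zip(w, conc))
    swxy = sum(wi * x * y for wi, x, y in zip(w, conc, area))
    denom = sw * swxx - swx * swx
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return slope, intercept

def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to estimate an unknown concentration."""
    return (area - intercept) / slope
```

Setting all uncertainties to the same value reduces this to simple (unweighted) linear regression, the other multiple-point method named in the abstract.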
Zhao, Rui; Catalano, Paul; DeGruttola, Victor G.; Michor, Franziska
2017-01-01
The dynamics of tumor burden, secreted proteins or other biomarkers over time, is often used to evaluate the effectiveness of therapy and to predict outcomes for patients. Many methods have been proposed to investigate longitudinal trends to better characterize patients and to understand disease progression. However, most approaches assume a homogeneous patient population and a uniform response trajectory over time and across patients. Here, we present a mixture piecewise linear Bayesian hierarchical model, which takes into account both population heterogeneity and nonlinear relationships between biomarkers and time. Simulation results show that our method was able to classify subjects according to their patterns of treatment response with greater than 80% accuracy in the three scenarios tested. We then applied our model to a large randomized controlled phase III clinical trial of multiple myeloma patients. Analysis results suggest that the longitudinal tumor burden trajectories in multiple myeloma patients are heterogeneous and nonlinear, even among patients assigned to the same treatment cohort. In addition, between cohorts, there are distinct differences in terms of the regression parameters and the distributions among categories in the mixture. Those results imply that longitudinal data from clinical trials may harbor unobserved subgroups and nonlinear relationships; accounting for both may be important for analyzing longitudinal data. PMID:28723910
Akbar, Jamshed; Iqbal, Shahid; Batool, Fozia; Karim, Abdul; Chan, Kim Wei
2012-01-01
Quantitative structure-retention relationships (QSRRs) have successfully been developed for naturally occurring phenolic compounds in a reversed-phase liquid chromatographic (RPLC) system. A total of 1519 descriptors were calculated from the optimized structures of the molecules using MOPAC2009 and DRAGON software. The data set of 39 molecules was divided into training and external validation sets. For feature selection and mapping we used step-wise multiple linear regression (SMLR), unsupervised forward selection followed by step-wise multiple linear regression (UFS-SMLR) and artificial neural networks (ANN). Stable and robust models with significant predictive abilities in terms of validation statistics were obtained with negation of any chance correlation. ANN models were found to be better than the remaining two approaches. HNar, IDM, Mp, GATS2v, DISP and 3D-MoRSE (signals 22, 28 and 32) descriptors based on van der Waals volume, electronegativity, mass and polarizability, at atomic level, were found to have significant effects on the retention times. The possible implications of these descriptors in RPLC have been discussed. All the models are proven to be quite able to predict the retention times of phenolic compounds and have shown remarkable validation, robustness, stability and predictive performance. PMID:23203132
Interaction Models for Functional Regression.
Usset, Joseph; Staicu, Ana-Maria; Maity, Arnab
2016-02-01
A functional regression model with a scalar response and multiple functional predictors is proposed that accommodates two-way interactions in addition to their main effects. The proposed estimation procedure models the main effects using penalized regression splines, and the interaction effect by a tensor product basis. Extensions to generalized linear models and data observed on sparse grids or with measurement error are presented. A hypothesis testing procedure for the functional interaction effect is described. The proposed method can be easily implemented through existing software. Numerical studies show that fitting an additive model in the presence of interaction leads to both poor estimation performance and lost prediction power, while fitting an interaction model where there is in fact no interaction leads to negligible losses. The methodology is illustrated on the AneuRisk65 study data.
NASA Technical Reports Server (NTRS)
Jacobsen, R. T.; Stewart, R. B.; Crain, R. W., Jr.; Rose, G. L.; Myers, A. F.
1976-01-01
A method was developed for establishing a rational choice of the terms to be included in an equation of state with a large number of adjustable coefficients. The methods presented were developed for use in the determination of an equation of state for oxygen and nitrogen. However, a general application of the methods is possible in studies involving the determination of an optimum polynomial equation for fitting a large number of data points. The data considered in the least squares problem are experimental thermodynamic pressure-density-temperature data. Attention is given to a description of stepwise multiple regression and the use of stepwise regression in the determination of an equation of state for oxygen and nitrogen.
Akimoto, Yuki; Yugi, Katsuyuki; Uda, Shinsuke; Kudo, Takamasa; Komori, Yasunori; Kubota, Hiroyuki; Kuroda, Shinya
2013-01-01
Cells use common signaling molecules for the selective control of downstream gene expression and cell-fate decisions. The relationship between signaling molecules and downstream gene expression and cellular phenotypes is a multiple-input and multiple-output (MIMO) system and is difficult to understand due to its complexity. For example, it has been reported that, in PC12 cells, different types of growth factors activate MAP kinases (MAPKs) including ERK, JNK, and p38, and CREB, for selective protein expression of immediate early genes (IEGs) such as c-FOS, c-JUN, EGR1, JUNB, and FOSB, leading to cell differentiation, proliferation and cell death; however, how multiple-inputs such as MAPKs and CREB regulate multiple-outputs such as expression of the IEGs and cellular phenotypes remains unclear. To address this issue, we employed a statistical method called partial least squares (PLS) regression, which involves a reduction of the dimensionality of the inputs and outputs into latent variables and a linear regression between these latent variables. We measured 1,200 data points for MAPKs and CREB as the inputs and 1,900 data points for IEGs and cellular phenotypes as the outputs, and we constructed the PLS model from these data. The PLS model highlighted the complexity of the MIMO system and growth factor-specific input-output relationships of cell-fate decisions in PC12 cells. Furthermore, to reduce the complexity, we applied a backward elimination method to the PLS regression, in which 60 input variables were reduced to 5 variables, including the phosphorylation of ERK at 10 min, CREB at 5 min and 60 min, AKT at 5 min and JNK at 30 min. The simple PLS model with only 5 input variables demonstrated a predictive ability comparable to that of the full PLS model. The 5 input variables effectively extracted the growth factor-specific simple relationships within the MIMO system in cell-fate decisions in PC12 cells.
Kennen, Jonathan G.; Ayers, Mark A.
2002-01-01
Community data from 36 watersheds were used to evaluate the response of fish, invertebrate, and algal assemblages in New Jersey streams to environmental characteristics along a gradient of urban land use that ranged from 3 to 96 percent. Aquatic assemblages were sampled at 36 sites during 1996-98, and more than 400 environmental attributes at multiple spatial scales were summarized. Data matrices were reduced to 43, 170, and 103 species of fish, invertebrates, and algae, respectively, by means of a predetermined joint frequency and relative abundance approach. White sucker (Catostomus commersoni) and Tessellated darter (Etheostoma olmstedi) were the most abundant fishes, accounting for more than 20 and 17 percent, respectively, of the mean abundance. Net-spinning caddisflies (Hydropsychidae) were the most commonly occurring benthic invertebrates and were found at all but one of the 36 sampling sites. Blue-green (for example, Calothrix sp. and Oscillatoria sp.) and green (for example, Protoderma viride) algae were the most widely distributed algae; however, more than 81 percent of the algal taxa collected were diatoms. Principal-component and correlation analyses were used to reduce the dimensionality of the environmental data. Multiple linear regression analysis of extracted ordination axes then was used to develop models that expressed effects of increasing urban land use on the structure of aquatic assemblages. Significant environmental variables identified by using multiple linear regression analysis then were included in a direct gradient analysis. Partial canonical correspondence analysis of relativized abundance data was used to restrict further the effects of residual natural variability, and to identify relations among the environmental variables and the structure of fish, invertebrate, and algal assemblages along an urban land-use gradient.
Results of this approach, combined with the results of the multiple linear regression analyses, were used to identify human population density (311-37,594 persons/km2), amount and type of impervious surface cover (0.12-1,350 km2), nutrient concentrations (for example, 0.01-0.29 mg/L of phosphorus), hydrologic instability (for example, 100-8,955 ft3/s for 2-year peak flow), the amount of forest and wetlands in a basin (0.01-6.25 km2), and substrate quality (0-87 percent cobble substrate) as variables that are highly correlated with aquatic-assemblage structure. Species distributions in ordination space clearly indicate that tolerant species are more abundant in the streams impaired by urbanization and sensitive taxa are more closely associated with the least impaired basins. The distinct differences in aquatic assemblages along the urban land-use gradient demonstrate the deleterious effects of urbanization on assemblage structure and indicate that conserving landscape attributes that mitigate anthropogenic influences (for example, stormwater-management practices emphasizing infiltration and preservation of existing forests, wetlands, and riparian corridors) will help to maintain the relative abundance of sensitive taxa. Complementary multiple linear regression models indicate that aquatic community indices were correlated with many of the anthropogenic factors that were found to be significant along the urban land-use gradient. These indices appear to be effective in differentiating the moderately and severely impaired streams from the minimally impaired streams. Evaluation of disturbance thresholds for aquatic assemblages indicates that moderate to severe impairment is detectable in New Jersey streams when impervious surface cover in the drainage basin reaches approximately 18 percent.
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Depression is a predictor for balance in people with multiple sclerosis.
Alghwiri, Alia A; Khalil, Hanan; Al-Sharman, Alham; El-Salem, Khalid
2018-05-26
Balance impairments are common and multifactorial among people with multiple sclerosis (MS). Depression is the most common psychological disorder in the MS population and is strongly correlated with MS disease. Depression might be one of the factors that contribute to balance deficits in this population. However, the relationship between depression and balance impairments has not been explored in people with MS. The aim of this study was to investigate the association between depression and balance impairments in people with MS. A cross-sectional design was used in patients with MS. The Activities-specific Balance Confidence scale (ABC) and Berg Balance Scale (BBS) were used to assess balance. The Beck Depression Inventory (BDI-II) was used to quantify depression, and the Kurtzke Expanded Disability Status Scale (EDSS) was used to evaluate MS disability severity. The Pearson correlation coefficient was used to examine the association between depression and balance measurements. Multiple linear stepwise regressions were also conducted to determine whether depression is a potential predictor of balance deficits. Seventy-five individuals with MS (Female = 69%) with a mean age (SD) of 38.8 (10) and a mean (SD) EDSS score of 3.0 (1.4) were recruited in this study. Depression was present in 53% of the patients. Depression was significantly correlated with balance measurements and EDSS. However, multiple linear stepwise regressions found that only depression and age significantly predicted balance. Depression and balance impairments were found to be frequent and associated in people with MS. Importantly, depression was a significant predictor of balance impairments in individuals with MS. Balance rehabilitation may be hindered by depression. Therefore, depression should be evaluated and treated properly in individuals with MS. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhang, Ying; Bi, Peng; Hiller, Janet
2008-01-01
This is the first study to identify appropriate regression models for the association between climate variation and salmonellosis transmission. A comparison between different regression models was conducted using surveillance data in Adelaide, South Australia. By using notified salmonellosis cases and climatic variables from the Adelaide metropolitan area over the period 1990-2003, four regression methods were examined: standard Poisson regression, autoregressive adjusted Poisson regression, multiple linear regression, and a seasonal autoregressive integrated moving average (SARIMA) model. Notified salmonellosis cases in 2004 were used to test the forecasting ability of the four models. Parameter estimation, goodness-of-fit and forecasting ability of the four regression models were compared. Temperatures occurring 2 weeks prior to cases were positively associated with cases of salmonellosis. Rainfall was also inversely related to the number of cases. The comparison of the goodness-of-fit and forecasting ability suggest that the SARIMA model is better than the other three regression models. Temperature and rainfall may be used as climatic predictors of salmonellosis cases in regions with climatic characteristics similar to those of Adelaide. The SARIMA model could, thus, be adopted to quantify the relationship between climate variations and salmonellosis transmission.
Quantile regression models of animal habitat relationships
Cade, Brian S.
2003-01-01
Typically, all factors that limit an organism are not measured and included in statistical models used to investigate relationships with their environment. If important unmeasured variables interact multiplicatively with the measured variables, the statistical models often will have heterogeneous response distributions with unequal variances. Quantile regression is an approach for estimating the conditional quantiles of a response variable distribution in the linear model, providing a more complete view of possible causal relationships between variables in ecological processes. Chapter 1 introduces quantile regression and discusses the ordering characteristics, interval nature, sampling variation, weighting, and interpretation of estimates for homogeneous and heterogeneous regression models. Chapter 2 evaluates performance of quantile rankscore tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). A permutation F test maintained better Type I errors than the Chi-square T test for models with smaller n, greater number of parameters p, and more extreme quantiles τ. Both versions of the test required weighting to maintain correct Type I errors when there was heterogeneity under the alternative model. An example application related trout densities to stream channel width:depth. Chapter 3 evaluates a drop in dispersion, F-ratio like permutation test for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). Chapter 4 simulates from a large (N = 10,000) finite population representing grid areas on a landscape to demonstrate various forms of hidden bias that might occur when the effect of a measured habitat variable on some animal was confounded with the effect of another unmeasured variable (spatially and not spatially structured). 
Depending on whether interactions of the measured habitat and unmeasured variable were negative (interference interactions) or positive (facilitation interactions), either upper (τ > 0.5) or lower (τ < 0.5) quantile regression parameters were less biased than mean rate parameters. Sampling (n = 20 - 300) simulations demonstrated that confidence intervals constructed by inverting rankscore tests provided valid coverage of these biased parameters. Quantile regression was used to estimate effects of physical habitat resources on a bivalve mussel (Macomona liliana) in a New Zealand harbor by modeling the spatial trend surface as a cubic polynomial of location coordinates.
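The conditional quantiles described above are usually estimated by minimizing an asymmetric "check" loss rather than squared error. The following is a minimal sketch of that loss, not code from the dissertation; the function names and the data are hypothetical. For an intercept-only model, the minimizer of the check loss is a sample quantile, which is the simplest case of the regression quantiles discussed in the abstract.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(y, fitted, tau):
    """Sum of check losses over all observations."""
    return sum(check_loss(yi - fi, tau) for yi, fi in zip(y, fitted))


def intercept_only_quantile(y, tau):
    """Fit a constant by minimizing the check loss.

    A minimizer always exists among the observed values, so it is
    enough to search over the data points themselves.
    """
    return min(sorted(y), key=lambda c: total_loss(y, [c] * len(y), tau))


# Hypothetical response values, for illustration only.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = intercept_only_quantile(y, 0.5)  # central tendency, robust to the outlier
upper_fit = intercept_only_quantile(y, 0.9)   # upper quantile, tracks the limiting edge
```

Fitting upper quantiles (large tau) in this way is what lets the approach describe the upper edge of a heterogeneous response distribution when unmeasured factors are also limiting.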
Rodríguez, A; Reyes, L F; Monclou, J; Suberviola, B; Bodí, M; Sirgo, G; Solé-Violán, J; Guardiola, J; Barahona, D; Díaz, E; Martín-Loeches, I; Restrepo, M I
2018-02-09
Serum procalcitonin (PCT) concentration could be increased in patients with renal dysfunction in the absence of bacterial infection. To determine the interactions among serum renal biomarkers of acute kidney injury (AKI) and serum PCT concentration, in patients admitted to the intensive care unit (ICU) due to lung influenza infection. Secondary analysis of a prospective multicentre observational study. 148 Spanish ICUs. ICU patients admitted with influenza infection without bacterial co-infection. Clinical, laboratory and hemodynamic variables were recorded. AKI was classified as AKI I or II based on creatinine (Cr) concentrations (≥1.60-2.50mg/dL and Cr≥2.51-3.99mg/dL, respectively). Patients with chronic renal disease, receiving renal replacement treatment or with Cr>4mg/dL were excluded. Spearman's correlation, simple and multiple linear regression analysis were performed. None. Out of 663 patients included in the study, 52 (8.2%) and 10 (1.6%) developed AKI I and II, respectively. Patients with AKI were significantly older, had more comorbid conditions and were more severely ill. PCT concentrations were higher in patients with AKI (2.62 [0.60-10.0]ng/mL vs. 0.40 [0.13-1.20]ng/mL, p=0.002). Weak correlations between Cr/PCT (rho=0.18) and Urea (U)/PCT (rho=0.19) were identified. Simple linear regression showed poor interaction between Cr/U and PCT concentrations (Cr R2=0.03 and U R2=0.018). Similar results were observed during multiple linear regression analysis (Cr R2=0.046 and U R2=0.013). Although PCT concentrations were slightly higher in patients with AKI, high PCT concentrations are not explained by AKI and could be a warning sign of a potential bacterial infection. Copyright © 2018 Elsevier España, S.L.U. y SEMICYUC. All rights reserved.
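The two statistics reported above, Spearman's rho and the R2 of a simple linear regression, can be sketched in a few lines. This is an illustrative sketch only: the function names and the five data pairs are hypothetical and are not taken from the study. It relies on two standard facts: Spearman's rho is the Pearson correlation of the ranks, and in a simple linear regression with an intercept, R2 equals the squared Pearson correlation.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical creatinine (mg/dL) and PCT (ng/mL) values, for illustration only.
creatinine = [0.8, 1.1, 1.7, 2.0, 3.2]
pct = [0.2, 0.1, 1.5, 0.9, 4.0]
rho = spearman_rho(creatinine, pct)
r_squared = pearson_r(creatinine, pct) ** 2  # R2 of the simple regression of pct on creatinine
```

A low R2 alongside a weak rho, as reported in the abstract, means that very little of the variation in PCT is accounted for by the renal marker.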
Can Functional Cardiac Age be Predicted from ECG in a Normal Healthy Population
NASA Technical Reports Server (NTRS)
Schlegel, Todd; Starc, Vito; Leban, Manja; Sinigoj, Petra; Vrhovec, Milos
2011-01-01
In a normal healthy population, we desired to determine the most age-dependent conventional and advanced ECG parameters. We hypothesized that changes in several ECG parameters might correlate with age and together reliably characterize the functional age of the heart. Methods: An initial study population of 313 apparently healthy subjects was ultimately reduced to 148 subjects (74 men, 84 women, in the range from 10 to 75 years of age) after exclusion criteria. In all subjects, ECG recordings (resting 5-minute 12-lead high frequency ECG) were evaluated via custom software programs to calculate up to 85 different conventional and advanced ECG parameters including beat-to-beat QT and RR variability, waveform complexity, and signal-averaged, high-frequency and spatial/spatiotemporal ECG parameters. The prediction of functional age was evaluated by multiple linear regression analysis using the best 5 univariate predictors. Results: Ignoring what were ultimately small differences between males and females, the functional age was found to be predicted (R2= 0.69, P < 0.001) from a linear combination of 5 independent variables: QRS elevation in the frontal plane (p<0.001), a new repolarization parameter QTcorr (p<0.001), mean high frequency QRS amplitude (p=0.009), the variability parameter % VLF of RRV (p=0.021) and the P-wave width (p=0.10). Here, QTcorr represents the correlation between the calculated QT and the measured QT signal. Conclusions: In apparently healthy subjects with normal conventional ECGs, functional cardiac age can be estimated by multiple linear regression analysis of mostly advanced ECG results. Because some parameters in the regression formula, such as QTcorr, high frequency QRS amplitude and P-wave width also change with disease in the same direction as with increased age, increased functional age of the heart may reflect subtle age-related pathologies in cardiac electrical function that are usually hidden on conventional ECG.
Socio-economic factors associated with infant mortality in Italy: an ecological study
2012-01-01
Introduction One issue that continues to attract the attention of public health researchers is the possible relationship in high-income countries between income, income inequality and infant mortality (IM). The aim of this study was to assess the associations between IM and major socio-economic determinants in Italy. Methods Associations between infant mortality rates in the 20 Italian regions (2006–2008) and the Gini index of income inequality, mean household income, percentage of women with at least 8 years of education, and percentage of unemployed aged 15–64 years were assessed using Pearson correlation coefficients. Univariate linear regression and multiple stepwise linear regression analyses were performed to determine the magnitude and direction of the effect of the four socio-economic variables on IM. Results The Gini index and the total unemployment rate showed a positive strong correlation with IM (r = 0.70; p < 0.001 and r = 0.84; p < 0.001 respectively), mean household income showed a strong negative correlation (r = −0.78; p < 0.001), while female educational attainment presented a weak negative correlation (r = −0.45; p < 0.05). Using a multiple stepwise linear regression model, only unemployment rate was independently associated with IM (b = 0.15, p < 0.001). Conclusions In Italy, a high-income country where health care is universally available, variations in IM were strongly associated with relative and absolute income and unemployment rate. These results suggest that in Italy IM is not only related to income distribution, as demonstrated for other developed countries, but also to economic factors such as absolute income and unemployment. In order to reduce IM and the existing inequalities, the challenge for Italian decision makers is to promote economic growth and enhance employment levels. PMID:22898293
González Costa, J J; Reigosa, M J; Matías, J M; Covelo, E F
2017-09-01
The aim of this study was to model the sorption and retention of Cd, Cu, Ni, Pb and Zn in soils. To that extent, the sorption and retention of these metals were studied and the soil characterization was performed separately. Multiple stepwise regression was used to produce multivariate models with linear techniques and with support vector machines, all of which included 15 explanatory variables characterizing soils. When the R-squared values are represented, two different groups are noticed. Cr, Cu and Pb sorption and retention show a higher R-squared; the most explanatory variables being humified organic matter, Al oxides and, in some cases, cation-exchange capacity (CEC). The other group of metals (Cd, Ni and Zn) shows a lower R-squared, and clays are the most explanatory variables, including a percentage of vermiculite and slime. In some cases, quartz, plagioclase or hematite percentages also show some explanatory capacity. Support Vector Machine (SVM) regression shows that the different models are not as regular as in multiple regression in terms of number of variables, the regression for nickel adsorption being the one with the highest number of variables in its optimal model. On the other hand, there are cases where the most explanatory variables are the same for two metals, as it happens with Cd and Cr adsorption. A similar adsorption mechanism is thus postulated. These patterns of the introduction of variables in the model allow us to create explainability sequences. Those which are the most similar to the selectivity sequences obtained by Covelo (2005) are Mn oxides in multiple regression and change capacity in SVM. Among all the variables, the only one that is explanatory for all the metals after applying the maximum parsimony principle is the percentage of sand in the retention process. 
In the competitive model arising from the aforementioned sequences, the most intense competitiveness for the adsorption and retention of different metals appears between Cr and Cd, Cu and Zn in multiple regression; and between Cr and Cd in SVM regression. Copyright © 2017 Elsevier B.V. All rights reserved.
Linear regression crash prediction models : issues and proposed solutions.
DOT National Transportation Integrated Search
2010-05-01
The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novice data aggregation to satisfy linear regression assumptions; namely error structure normality ...
Lee, L.; Helsel, D.
2005-01-01
Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.
Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment
ERIC Educational Resources Information Center
Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos
2013-01-01
In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…
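The bias mentioned above can be seen in a short derivation. The exponential model below is an assumed example chosen for illustration; the abstract does not state which model the authors used. Suppose the measurement error is additive on the original scale and the model is linearized by taking logarithms:

```latex
\begin{align}
y &= a\,e^{bx} + \varepsilon, \qquad \mathbb{E}[\varepsilon] = 0, \\
\ln y &= \ln a + b\,x + \ln\!\left(1 + \frac{\varepsilon}{a\,e^{bx}}\right).
\end{align}
```

The last term plays the role of the error in the linearized fit. By Jensen's inequality its expectation is negative whenever the error is not identically zero, and its spread depends on x, so ordinary least squares on (x, ln y) no longer satisfies the zero-mean, constant-variance assumptions and generally yields different estimates of a and b than nonlinear least squares on the original scale.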
Which symptoms contribute the most to patients' perception of health in multiple sclerosis?
Green, Rivka; Cutter, Gary; Friendly, Michael; Kister, Ilya
2017-01-01
Multiple sclerosis is a polysymptomatic disease. Little is known about relative contributions of the different multiple sclerosis symptoms to self-perception of health. To investigate the relationship between symptom severity in 11 domains affected by multiple sclerosis and self-rated health. Multiple sclerosis patients in two multiple sclerosis centers assessed self-rated health with a validated instrument and symptom burden with symptoMScreen, a validated battery of Likert scales for 11 domains commonly affected by multiple sclerosis. Pearson correlations and multivariate linear regressions were used to investigate the relationship between symptoMScreen scores and self-rated health. Among 1865 multiple sclerosis outpatients (68% women, 78% with relapsing-remitting multiple sclerosis, mean age 46.38 ± 12.47 years, disease duration 13.43 ± 10.04 years), average self-rated health score was 2.30 ('moderate to good'). Symptom burden (composite symptoMScreen score) highly correlated with self-rated health ( r = 0.68, P < 0.0001) as did each of the symptoMScreen domain subscores. In regression analysis, pain ( t = 7.00), ambulation ( t = 6.91), and fatigue ( t = 5.85) contributed the highest amount of variance in self-rated health ( P < 0.001). Pain contributed the most to multiple sclerosis outpatients' perception of health, followed by gait dysfunction and fatigue. These findings suggest that 'invisible disability' may be more important to patients' sense of wellbeing than physical disability, and challenge the notion that physical disability should be the primary outcome measure in multiple sclerosis.
Ghasemi, Jahan B; Safavi-Sohi, Reihaneh; Barbosa, Euzébio G
2012-02-01
A quasi 4D-QSAR has been carried out on a series of potent Gram-negative LpxC inhibitors. This approach makes use of the molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package. This new methodology is based on the generation of a conformational ensemble profile, CEP, for each compound instead of only one conformation, followed by the calculation of intermolecular interaction energies at each grid point considering probes and all aligned conformations resulting from MD simulations. These interaction energies are independent variables employed in a QSAR analysis. The proposed methodology was compared with the comparative molecular field analysis (CoMFA) formalism. This methodology explores jointly the main features of CoMFA and 4D-QSAR models. Step-wise multiple linear regression was used for the selection of the most informative variables. After variable selection, multiple linear regression (MLR) and partial least squares (PLS) methods were used for building the regression models. Leave-N-out cross-validation (LNO) and Y-randomization were performed in order to confirm the robustness of the model, in addition to analysis of the independent test set. Best models provided the following statistics: [Formula in text] (PLS) and [Formula in text] (MLR). A docking study with the CDOCKER algorithm was applied to investigate the major interactions in the protein-ligand complex. Visualization of the descriptors of the best model helps us to interpret the model from the chemical point of view, supporting the applicability of this new approach in rational drug design.
Ma, Jing; Yu, Jiong; Hao, Guangshu; Wang, Dan; Sun, Yanni; Lu, Jianxin; Cao, Hongcui; Lin, Feiyan
2017-02-20
The prevalence of hyperlipemia is increasing around the world. Our aims are to analyze the relationship of triglyceride (TG) and cholesterol (TC) with indexes of liver function and kidney function, and to develop a prediction model of TG and TC in overweight people. A total of 302 healthy adult subjects and 273 overweight subjects were enrolled in this study. The fasting levels of TG (fs-TG), TC (fs-TC), blood glucose, and indexes of liver function and kidney function were measured and analyzed by correlation analysis and multiple linear regression (MLR). The back propagation artificial neural network (BP-ANN) was applied to develop prediction models of fs-TG and fs-TC. The results showed significant differences in biochemical indexes between healthy people and overweight people. The correlation analysis showed that fs-TG was related to weight, height, blood glucose, and indexes of liver and kidney function, while fs-TC was correlated with age and indexes of liver function (P < 0.01). The MLR analysis indicated that the regression equations of fs-TG and fs-TC were both statistically significant (P < 0.01) when the independent indexes were included. The BP-ANN model of fs-TG reached its training goal at 59 epochs, while the fs-TC model achieved high prediction accuracy after training for 1000 epochs. In conclusion, fs-TG and fs-TC were strongly related to weight, height, age, blood glucose, and indexes of liver function and kidney function. Based on related variables, fs-TG and fs-TC can be predicted by BP-ANN models in overweight people.
Iserbyt, Peter; Schouppe, Gilles; Charlier, Nathalie
2015-04-01
Research investigating lifeguards' performance of Basic Life Support (BLS) with Automated External Defibrillator (AED) is limited. Assessing simulated BLS/AED performance in Flemish lifeguards and identifying factors affecting this performance. Six hundred and sixteen (217 female and 399 male) certified Flemish lifeguards (aged 16-71 years) performed BLS with an AED on a Laerdal ResusciAnne manikin simulating an adult victim of drowning. Stepwise multiple linear regression analysis was conducted with BLS/AED performance as outcome variable and demographic data as explanatory variables. Mean BLS/AED performance for all lifeguards was 66.5%. Compression rate and depth adhered closely to ERC 2010 guidelines. Ventilation volume and flow rate exceeded the guidelines. A significant regression model, F(6, 415)=25.61, p<.001, ES=.38, explained 27% of the variance in BLS performance (R2=.27). Significant predictors were age (beta=-.31, p<.001), years of certification (beta=-.41, p<.001), time on duty per year (beta=-.25, p<.001), practising BLS skills (beta=.11, p=.011), and being a professional lifeguard (beta=-.13, p=.029). 71% of lifeguards reported not practising BLS/AED. Being young, recently certified, few days of employment per year, practising BLS skills and not being a professional lifeguard are factors associated with higher BLS/AED performance. Measures should be taken to prevent BLS/AED performances from decaying with age and longer certification. Refresher courses could include a formal skills test and lifeguards should be encouraged to practise their BLS/AED skills. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Age estimation standards for a Western Australian population using the coronal pulp cavity index.
Karkhanis, Shalmira; Mack, Peter; Franklin, Daniel
2013-09-10
Age estimation is a vital aspect in creating a biological profile and aids investigators by narrowing down potentially matching identities from the available pool. In addition to routine casework, in the present global political scenario, age estimation in living individuals is required in cases of refugees, asylum seekers, human trafficking and to ascertain age of criminal responsibility. Thus robust methods that are simple, non-invasive and ethically viable are required. The aim of the present study is, therefore, to test the reliability and applicability of the coronal pulp cavity index method, for the purpose of developing age estimation standards for an adult Western Australian population. A total of 450 orthopantomograms (220 females and 230 males) of Australian individuals were analyzed. Crown and coronal pulp chamber heights were measured in the mandibular left and right premolars, and the first and second molars. These measurements were then used to calculate the tooth coronal index. Data was analyzed using paired sample t-tests to assess bilateral asymmetry followed by simple linear and multiple regressions to develop age estimation models. The most accurate age estimation based on simple linear regression model was with mandibular right first molar (SEE ±8.271 years). Multiple regression models improved age prediction accuracy considerably and the most accurate model was with bilateral first and second molars (SEE ±6.692 years). This study represents the first investigation of this method in a Western Australian population and our results indicate that the method is suitable for forensic application. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
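The simple linear regression step and the standard error of the estimate (SEE) quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the index is assumed to follow the commonly cited definition (coronal pulp chamber height divided by crown height, times 100), which the abstract does not spell out, and all function names and measurements are hypothetical.

```python
import math


def tooth_coronal_index(pulp_chamber_height, crown_height):
    """Assumed definition: pulp chamber height as a percentage of crown height."""
    return 100.0 * pulp_chamber_height / crown_height


def fit_simple_ols(x, y):
    """Least-squares line y = intercept + slope * x, plus the SEE.

    SEE is the residual standard deviation, sqrt(SSE / (n - 2)),
    expressed in the units of y (years, when y is age).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    see = math.sqrt(sse / (n - 2))
    return intercept, slope, see


# Hypothetical index values and known ages (years), for illustration only.
tci = [40.0, 35.0, 30.0, 25.0, 20.0]
age = [20.0, 30.0, 42.0, 50.0, 63.0]
intercept, slope, see = fit_simple_ols(tci, age)
estimated_age = intercept + slope * 28.0  # point estimate for a new index value
```

A smaller SEE means a tighter age estimate, which is how the abstract ranks the single-tooth and multiple-tooth models.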
Estimation of stature from the foot and its segments in a sub-adult female population of North India
2011-01-01
Background Establishing personal identity is one of the main concerns in forensic investigations. Estimation of stature forms a basic domain of the investigation process in unknown and co-mingled human remains in forensic anthropology case work. The objective of the present study was to set up standards for estimation of stature from the foot and its segments in a sub-adult female population. Methods The sample for the study constituted 149 young females from the Northern part of India. The participants were aged between 13 and 18 years. Besides stature, seven anthropometric measurements that included length of the foot from each toe (T1, T2, T3, T4, and T5 respectively), foot breadth at ball (BBAL) and foot breadth at heel (BHEL) were measured on both feet in each participant using standard methods and techniques. Results The results indicated that statistically significant differences (p < 0.05) between left and right feet occur in both the foot breadth measurements (BBAL and BHEL). Foot length measurements (T1 to T5 lengths) did not show any statistically significant bilateral asymmetry. The correlation between stature and all the foot measurements was found to be positive and statistically significant (p-value < 0.001). Linear regression models and multiple regression models were derived for estimation of stature from the measurements of the foot. The present study indicates that anthropometric measurements of foot and its segments are valuable in the estimation of stature. Foot length measurements estimate stature with greater accuracy when compared to foot breadth measurements. Conclusions The present study concluded that foot measurements have a strong relationship with stature in the sub-adult female population of North India. Hence, the stature of an individual can be successfully estimated from the foot and its segments using different regression models derived in the study. 
The regression models derived in the study may be applied successfully for the estimation of stature in sub-adult females, whenever foot remains are brought for forensic examination. Stepwise multiple regression models tend to estimate stature more accurately than linear regression models in female sub-adults. PMID:22104433
Krishan, Kewal; Kanchan, Tanuj; Passi, Neelam
2011-11-21
Establishing personal identity is one of the main concerns in forensic investigations. Estimation of stature forms a basic domain of the investigation process in unknown and co-mingled human remains in forensic anthropology case work. The objective of the present study was to set up standards for estimation of stature from the foot and its segments in a sub-adult female population. The sample for the study constituted 149 young females from the Northern part of India. The participants were aged between 13 and 18 years. Besides stature, seven anthropometric measurements that included length of the foot from each toe (T1, T2, T3, T4, and T5 respectively), foot breadth at ball (BBAL) and foot breadth at heel (BHEL) were measured on both feet in each participant using standard methods and techniques. The results indicated that statistically significant differences (p < 0.05) between left and right feet occur in both the foot breadth measurements (BBAL and BHEL). Foot length measurements (T1 to T5 lengths) did not show any statistically significant bilateral asymmetry. The correlation between stature and all the foot measurements was found to be positive and statistically significant (p-value < 0.001). Linear regression models and multiple regression models were derived for estimation of stature from the measurements of the foot. The present study indicates that anthropometric measurements of foot and its segments are valuable in the estimation of stature. Foot length measurements estimate stature with greater accuracy when compared to foot breadth measurements. The present study concluded that foot measurements have a strong relationship with stature in the sub-adult female population of North India. Hence, the stature of an individual can be successfully estimated from the foot and its segments using different regression models derived in the study. 
The regression models derived in the study may be applied successfully for the estimation of stature in sub-adult females, whenever foot remains are brought for forensic examination. Stepwise multiple regression models tend to estimate stature more accurately than linear regression models in female sub-adults.
Guo, Ying; Little, Roderick J; McConnell, Daniel S
2012-01-01
Covariate measurement error is common in epidemiologic studies. Current methods for correcting measurement error with information from external calibration samples are insufficient to provide valid adjusted inferences. We consider the problem of estimating the regression of an outcome Y on covariates X and Z, where Y and Z are observed, X is unobserved, but a variable W that measures X with error is observed. Information about measurement error is provided in an external calibration sample where data on X and W (but not Y and Z) are recorded. We describe a method that uses summary statistics from the calibration sample to create multiple imputations of the missing values of X in the regression sample, so that the regression coefficients of Y on X and Z and associated standard errors can be estimated using simple multiple imputation combining rules, yielding valid statistical inferences under the assumption of a multivariate normal distribution. The proposed method is shown by simulation to provide better inferences than existing methods, namely the naive method, classical calibration, and regression calibration, particularly for correction for bias and achieving nominal confidence levels. We also illustrate our method with an example using linear regression to examine the relation between serum reproductive hormone concentrations and bone mineral density loss in midlife women in the Michigan Bone Health and Metabolism Study. Existing methods fail to adjust appropriately for bias due to measurement error in the regression setting, particularly when measurement error is substantial. The proposed method corrects this deficiency.
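The "simple multiple imputation combining rules" referred to above are assumed here to be the standard Rubin's rules; the abstract does not name them. The sketch below pools one regression coefficient across m imputed datasets. The function name and the numbers are hypothetical, and the imputation step itself (drawing the missing X values) is not shown.

```python
import math
from statistics import mean, variance


def combine_imputations(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    estimates: the coefficient estimated on each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical coefficient estimates and squared standard errors from m = 3 imputations.
beta_hat, se = combine_imputations([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the extra uncertainty due to X being unobserved; ignoring it, as a single imputation would, gives standard errors that are too small.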
Functional capacity following univentricular repair--midterm outcome.
Sen, Supratim; Bandyopadhyay, Biswajit; Eriksson, Peter; Chattopadhyay, Amitabha
2012-01-01
Previous studies have seldom compared functional capacity in children following Fontan procedure alongside those with Glenn operation as destination therapy. We hypothesized that Fontan circulation enables better midterm submaximal exercise capacity as compared to Glenn physiology and evaluated this using the 6-minute walk test. Fifty-seven children aged 5-18 years with Glenn (44) or Fontan (13) operations were evaluated with standard 6-minute walk protocols. Baseline SpO(2) was significantly lower in Glenn patients younger than 10 years compared to Fontan counterparts and similar in the two groups in older children. Postexercise SpO(2) fell significantly in Glenn patients compared to the Fontan group. There was no statistically significant difference in baseline, postexercise, or postrecovery heart rates (HRs), or 6-minute walk distances in the two groups. Multiple regression analysis revealed lower resting HR, higher resting SpO(2) , and younger age at latest operation to be significant determinants of longer 6-minute walk distance. Multiple regression analysis also established that younger age at operation, higher resting SpO(2) , Fontan operation, lower resting HR, and lower postexercise HR were significant determinants of higher postexercise SpO(2) . Younger age at operation and exercise, lower resting HR and postexercise HR, higher resting SpO(2) and postexercise SpO(2) , and dominant ventricular morphology being left ventricular or indeterminate/mixed had significant association with better 6-minute work on multiple regression analysis. Lower resting HR had linear association with longer 6-minute walk distances in the Glenn patients. Compared to Glenn physiology, Fontan operation did not have better submaximal exercise capacity assessed by walk distance or work on multiple regression analysis. Lower resting HR, higher resting SpO(2) , and younger age at operation were factors uniformly associated with better submaximal exercise capacity. 
© 2012 Wiley Periodicals, Inc.
Lifespan development of pro- and anti-saccades: multiple regression models for point estimates.
Klein, Christoph; Foerster, Friedrich; Hartnegg, Klaus; Fischer, Burkhart
2005-12-07
The comparative study of anti- and pro-saccade task performance contributes to our functional understanding of the frontal lobes, their alterations in psychiatric or neurological populations, and their changes during the life span. In the present study, we apply regression analysis to model life span developmental effects on various pro- and anti-saccade task parameters, using data of a non-representative sample of 327 participants aged 9 to 88 years. Development up to the age of about 27 years was dominated by curvilinear rather than linear effects of age. Furthermore, the largest developmental differences were found for intra-subject variability measures and the anti-saccade task parameters. Ageing, by contrast, had the shape of a global linear decline of the investigated saccade functions, lacking the differential effects of age observed during development. While these results do support the assumption that frontal lobe functions can be distinguished from other functions by their strong and protracted development, they do not confirm the assumption of disproportionate deterioration of frontal lobe functions with ageing. We finally show that the regression models applied here to quantify life span developmental effects can also be used for individual predictions in applied research contexts or clinical practice.
NASA Technical Reports Server (NTRS)
Jones, Harrison P.; Branston, Detrick D.; Jones, Patricia B.; Popescu, Miruna D.
2002-01-01
An earlier study compared NASA/NSO Spectromagnetograph (SPM) data with spacecraft measurements of total solar irradiance (TSI) variations over a 1.5 year period in the declining phase of solar cycle 22. This paper extends the analysis to an eight-year period which also spans the rising and early maximum phases of cycle 23. The conclusions of the earlier work appear to be robust: three factors (sunspots, strong unipolar regions, and strong mixed polarity regions) describe most of the variation in the SPM record, but only the first two are associated with TSI. Additionally, the residuals of a linear multiple regression of TSI against SPM observations over the entire eight-year period show an unexplained, increasing, linear time variation with a rate of about 0.05 W m^-2 per year. Separate regressions for the periods before and after 1996 January 01 show no unexplained trends but differ substantially in regression parameters. This behavior may reflect a solar source of TSI variations beyond sunspots and faculae but more plausibly results from uncompensated non-solar effects in one or both of the TSI and SPM data sets.
Shatat, Ibrahim F; Abdallah, Rany T; Sas, David J; Hailpern, Susan M
2012-07-01
Despite being associated with multiple disease processes and cardiovascular outcomes, uric acid (UA) reference ranges for adolescents are lacking. We sought to describe the distribution of UA and its relationship to demographic, clinical, socioeconomic, and dietary factors among U.S. adolescents. A nationally representative subsample of 1,912 adolescents aged 13-18 years in NHANES 2005-2008 representing 19,888,299 adolescents was used for this study. Percentiles of the distribution of UA were estimated using quantile regression. Linear regression models examined the association of UA and demographic, socioeconomic, and dietary factors. Mean UA level was 5.14 ± 1.45 mg/dl. Mean UA increased with increasing age and was higher in non-Hispanic white race, male sex, higher body mass index (BMI) Z-score, and with higher systolic blood pressure. In fully adjusted linear regression models, sex, age, race, and BMI were independent determinants of higher UA. This study defines serum UA reference ranges for adolescents. Also, it reveals some intriguing relationships between UA and demographic and clinical characteristics that warrant further studies to examine the pathophysiological role of UA in different disease processes.
Factors associated with parasite dominance in fishes from Brazil.
Amarante, Cristina Fernandes do; Tassinari, Wagner de Souza; Luque, Jose Luis; Pereira, Maria Julia Salim
2016-06-14
The present study used regression models to evaluate the existence of factors that may influence the numerical parasite dominance with an epidemiological approximation. A database including 3,746 fish specimens and their respective parasites was used to evaluate the relationship between parasite dominance and biotic characteristics inherent to the studied hosts and the parasite taxa. Multivariate, classical, and mixed effects linear regression models were fitted. The calculations were performed using R software (95% CI). In the fitting of the classical multiple linear regression model, freshwater and planktivorous fish species and body length, as well as the species of the taxa Trematoda, Monogenea, and Hirudinea, were associated with parasite dominance. However, the fitting of the mixed effects model showed that the body length of the host and the species of the taxa Nematoda, Trematoda, Monogenea, Hirudinea, and Crustacea were significantly associated with parasite dominance. Studies that consider specific biological aspects of the hosts and parasites should expand the knowledge regarding factors that influence the numerical dominance of fish in Brazil. The use of a mixed model shows, once again, the importance of the appropriate use of a model correlated with the characteristics of the data to obtain consistent results.
Breakfast intake among adults with type 2 diabetes: is bigger better?
Jarvandi, Soghra; Schootman, Mario; Racette, Susan B.
2015-01-01
Objective: To assess the association between breakfast energy and total daily energy intake among individuals with type 2 diabetes. Design: Cross-sectional study. Daily energy intake was computed from a 24-h dietary recall. Multiple regression models were used to estimate the association between daily energy intake (dependent variable) and quartiles of energy intake at breakfast (independent variable) expressed as either absolute or relative (% of total daily energy intake) terms. Orthogonal polynomial contrasts were used to test for linear and quadratic trends. Models were controlled for sex, age, race/ethnicity, body mass index, physical activity and smoking. In addition, we used separate multiple regression models to test the effect of quartiles of absolute and relative breakfast energy on intake at lunch, dinner, and snacks. Setting: The 1999–2004 National Health and Nutrition Examination Survey (NHANES). Subjects: Participants aged ≥ 30 years with self-reported history of diabetes (N = 1,146). Results: Daily energy intake increased as absolute breakfast energy intake increased (linear trend, P < 0.0001; quadratic trend, P = 0.02), but decreased as relative breakfast energy intake increased (linear trend, P < 0.0001). In addition, while higher quartiles of absolute breakfast intake had no associations with energy intake at subsequent meals, higher quartiles of relative breakfast intake were associated with lower energy intake during all subsequent meals and snacks (P < 0.05). Conclusions: Consuming a breakfast that provided less energy or comprised a greater proportion of daily energy intake was associated with lower total daily energy intake in adults with type 2 diabetes. PMID:25529061
Stroop Color-Word Interference Test: Normative data for Spanish-speaking pediatric population.
Rivera, D; Morlett-Paredes, A; Peñalver Guia, A I; Irías Escher, M J; Soto-Añari, M; Aguayo Arelis, A; Rute-Pérez, S; Rodríguez-Lorenzana, A; Rodríguez-Agudelo, Y; Albaladejo-Blázquez, N; García de la Cadena, C; Ibáñez-Alfonso, J A; Rodriguez-Irizarry, W; García-Guerrero, C E; Delgado-Mejía, I D; Padilla-López, A; Vergara-Moragues, E; Barrios Nevado, M D; Saracostti Schwartzman, M; Arango-Lasprilla, J C
2017-01-01
To generate normative data for the Stroop Word-Color Interference test in Spanish-speaking pediatric populations. The sample consisted of 4,373 healthy children from nine countries in Latin America (Chile, Cuba, Ecuador, Guatemala, Honduras, Mexico, Paraguay, Peru, and Puerto Rico) and Spain. Each participant was administered the Stroop Word-Color Interference test as part of a larger neuropsychological battery. The Stroop Word, Stroop Color, Stroop Word-Color, and Stroop Interference scores were normed using multiple linear regressions and standard deviations of residual values. Age, age², sex, and mean level of parental education (MLPE) were included as predictors in the analyses. The final multiple linear regression models showed main effects for age on all scores, except on Stroop Interference for Guatemala, such that scores increased linearly as a function of age. Age² affected Stroop Word scores for all countries; Stroop Color scores for Ecuador, Mexico, Peru, and Spain; Stroop Word-Color scores for Ecuador, Mexico, and Paraguay; and Stroop Interference scores for Cuba, Guatemala, and Spain. MLPE affected Stroop Word scores for Chile, Mexico, and Puerto Rico; Stroop Color scores for Mexico, Puerto Rico, and Spain; Stroop Word-Color scores for Ecuador, Guatemala, Mexico, Puerto Rico and Spain; and Stroop-Interference scores for Ecuador, Mexico, and Spain. Sex affected Stroop Word scores for Spain, Stroop Color scores for Mexico, and Stroop Interference for Honduras. This is the largest Spanish-speaking pediatric normative study in the world, and it will allow neuropsychologists from these countries to have a more accurate approach to interpret the Stroop Word-Color Interference test in pediatric populations.
Van de Voorde, Tim; Vlaeminck, Jeroen; Canters, Frank
2008-01-01
Urban growth and its related environmental problems call for sustainable urban management policies to safeguard the quality of urban environments. Vegetation plays an important part in this as it provides ecological, social, health and economic benefits to a city's inhabitants. Remotely sensed data are of great value to monitor urban green and despite the clear advantages of contemporary high resolution images, the benefits of medium resolution data should not be discarded. The objective of this research was to estimate fractional vegetation cover from a Landsat ETM+ image with sub-pixel classification, and to compare accuracies obtained with multiple stepwise regression analysis, linear spectral unmixing and multi-layer perceptrons (MLP) at the level of meaningful urban spatial entities. Despite the small, but nevertheless statistically significant differences at pixel level between the alternative approaches, the spatial pattern of vegetation cover and estimation errors is clearly distinctive at neighbourhood level. At this spatially aggregated level, a simple regression model appears to attain sufficient accuracy. For mapping at a spatially more detailed level, the MLP seems to be the most appropriate choice. Brightness normalisation only appeared to affect the linear models, especially the linear spectral unmixing. PMID:27879914
Abnormal dynamics of language in schizophrenia.
Stephane, Massoud; Kuskowski, Michael; Gundel, Jeanette
2014-05-30
Language could be conceptualized as a dynamic system that includes multiple interactive levels (sub-lexical, lexical, sentence, and discourse) and components (phonology, semantics, and syntax). In schizophrenia, abnormalities are observed at all language elements (levels and components) but the dynamic between these elements remains unclear. We hypothesize that the dynamics between language elements in schizophrenia is abnormal and explore how this dynamic is altered. We, first, investigated language elements with comparable procedures in patients and healthy controls. Second, using measures of reaction time, we performed multiple linear regression analyses to evaluate the inter-relationships among language elements and the effect of group on these relationships. Patients significantly differed from controls with respect to sub-lexical/lexical, lexical/sentence, and sentence/discourse regression coefficients. The intercepts of the regression slopes increased in the same order above (from lower to higher levels) in patients but not in controls. Regression coefficients between syntax and both sentence level and discourse level semantics did not differentiate patients from controls. This study indicates that the dynamics between language elements is abnormal in schizophrenia. In patients, top-down flow of linguistic information might be reduced, and the relationship between phonology and semantics but not between syntax and semantics appears to be altered. Published by Elsevier Ireland Ltd.
Satellite remote sensing of fine particulate air pollutants over Indian mega cities
NASA Astrophysics Data System (ADS)
Sreekanth, V.; Mahesh, B.; Niranjan, K.
2017-11-01
Against the backdrop of the need for high spatio-temporal resolution data on PM2.5 mass concentrations for health and epidemiological studies over India, empirical relations between Aerosol Optical Depth (AOD) and PM2.5 mass concentrations are established over five Indian mega cities. These relations are sought to predict the surface PM2.5 mass concentrations from high resolution columnar AOD datasets. The current study utilizes multi-city public domain PM2.5 data (from the US Consulate and Embassy's air monitoring program) and MODIS AOD, spanning almost four years. PM2.5 is found to be positively correlated with AOD. Station-wise linear regression analysis has shown spatially varying regression coefficients. A similar analysis was repeated after eliminating data from the elevated aerosol prone seasons, which improved the correlation coefficient. The impact of the day to day variability in the local meteorological conditions on the AOD-PM2.5 relationship has been explored by performing a multiple regression analysis. A cross-validation approach for the multiple regression analysis, using three years of data as the training dataset and one year of data as the validation dataset, yielded an R value of ∼0.63. The study concludes by discussing the factors which can improve the relationship.
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
de Freitas, Mariana V; Marquez-Bernardes, Liandra F; de Arvelos, Letícia R; Paraíso, Lara F; Gonçalves E Oliveira, Ana Flávia M; Mascarenhas Netto, Rita de C; Neto, Morun Bernardino; Garrote-Filho, Mario S; de Souza, Paulo César A; Penha-Silva, Nilson
2014-10-01
To evaluate the influence of age on the relationships between biochemical and hematological variables and the stability of the erythrocyte membrane in relation to sodium dodecyl sulfate (SDS) in a population of 105 female volunteers between 20 and 90 years. The stability of the RBC membrane was determined by non-linear regression of the dependency of the absorbance of hemoglobin released as a function of SDS concentration, represented by the half-transition point of the curve (D50) and the variation in the concentration of the detergent to promote lysis (dD). There was an age-dependent increase in the membrane stability in relation to SDS. Analyses by multiple linear regression showed that this stability increase is significantly related to the hematological variable red cell distribution width (RDW) and the biochemical variables blood albumin and cholesterol. The positive association between erythrocyte stability and RDW may reflect one possible mechanism involved in the clinical meaning of this hematological index.
Meng, Yilin; Roux, Benoît
2015-08-11
The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost. PMID:26574437
Estimating Required Contingency Funds for Construction Projects using Multiple Linear Regression
2006-03-01
Breusch-Pagan test, in which the null hypothesis states that the residuals have constant variance. The alternate hypothesis is that the residuals do not ... variance, the Breusch-Pagan test provides statistical evidence that the assumption is justified. For the proposed model, the p-value is 0.173 ... entire test sample. Acknowledgments: First, I would like to acknowledge the influence and help of Greg Hoffman. His work served as the
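The constant-variance check described in this excerpt can be sketched as follows. This is a minimal illustration, not the report's own computation. It assumes a single predictor and the studentized (n times R-squared) form of the Breusch-Pagan statistic, so the chi-square p-value with one degree of freedom can be obtained from `math.erfc`. The function names and the data are hypothetical. A real analysis would normally rely on a statistics package.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(x, y):
    """Coefficient of determination of the simple regression of y on x."""
    a, b = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    if ss_total == 0.0:
        return 0.0
    ss_resid = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return max(0.0, 1.0 - ss_resid / ss_total)  # clamp rounding noise


def breusch_pagan(x, y):
    """Return (LM statistic, p-value); H0: residual variance is constant."""
    a, b = fit_line(x, y)
    squared_resid = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    # Auxiliary regression of squared residuals on the predictor.
    lm = len(x) * r_squared(x, squared_resid)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


# Hypothetical data: same trend, constant spread versus spread growing with x.
x = list(range(1, 21))
y_constant = [2 * xi + (-1) ** xi for xi in x]
y_growing = [2 * xi + (-1) ** xi * 0.5 * xi for xi in x]

lm_constant, p_constant = breusch_pagan(x, y_constant)
lm_growing, p_growing = breusch_pagan(x, y_growing)
```

A large p-value, such as the 0.173 quoted in the excerpt, means the null hypothesis of constant variance is not rejected at conventional significance levels.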
NASA Astrophysics Data System (ADS)
Prahutama, Alan; Suparti; Wahyu Utami, Tiani
2018-03-01
Regression analysis models the relationship between response variables and predictor variables. The parametric approach to regression imposes strict assumptions, whereas nonparametric regression does not require an assumed model form. Time series data are observations of a variable recorded over time, so to model a time series by regression, the response and predictor variables must first be determined. The response variable is the value at time t (yt), while the predictor variables are the significant lags. In nonparametric regression modeling, one developing approach is the Fourier series approach. One advantage of nonparametric regression using Fourier series is its ability to handle data having a trigonometric distribution. Fourier series modeling requires a parameter K, whose value can be determined by the Generalized Cross Validation method. Modeling inflation for the transportation, communication and financial services sector using Fourier series yields an optimal K of 120 parameters with an R-square of 99%, whereas multiple linear regression yields an R-square of 90%.
NASA Astrophysics Data System (ADS)
Tang, Jie; Liu, Rong; Zhang, Yue-Li; Liu, Mou-Ze; Hu, Yong-Fang; Shao, Ming-Jie; Zhu, Li-Jun; Xin, Hua-Wen; Feng, Gui-Wen; Shang, Wen-Jun; Meng, Xiang-Guang; Zhang, Li-Rong; Ming, Ying-Zi; Zhang, Wei
2017-02-01
Tacrolimus has a narrow therapeutic window and considerable variability in clinical use. Our goal was to compare the performance of multiple linear regression (MLR) and eight machine learning techniques in pharmacogenetic algorithm-based prediction of tacrolimus stable dose (TSD) in a large Chinese cohort. A total of 1,045 renal transplant patients were recruited, 80% of which were randomly selected as the “derivation cohort” to develop dose-prediction algorithm, while the remaining 20% constituted the “validation cohort” to test the final selected algorithm. MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied and their performances were compared in this work. Among all the machine learning models, RT performed best in both derivation [0.71 (0.67-0.76)] and validation cohorts [0.73 (0.63-0.82)]. In addition, the ideal rate of RT was 4% higher than that of MLR. To our knowledge, this is the first study to use machine learning models to predict TSD, which will further facilitate personalized medicine in tacrolimus administration in the future.
NASA Astrophysics Data System (ADS)
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
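The scheme described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm. A pattern here is only the interval into which the window's slope falls, whereas the abstract also folds in significance-test results and several window sizes. Self-transitions are counted as edges, which is an assumption. The function names, the bin width and the data are hypothetical.

```python
import math
from collections import Counter


def window_slope(x, y):
    """Least-squares slope of y on x for one window."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)


def transmission_network(x, y, window, width):
    """Slide a window over two series and count pattern-to-pattern moves.

    A pattern is the integer k of the slope interval [k*width, (k+1)*width).
    Returns (patterns, edges); edges maps (from_pattern, to_pattern) to the
    number of times that transition occurs between adjacent windows.
    """
    patterns = []
    for start in range(len(x) - window + 1):
        slope = window_slope(x[start:start + window], y[start:start + window])
        patterns.append(math.floor(slope / width))
    edges = Counter(zip(patterns, patterns[1:]))
    return patterns, edges


# Hypothetical series whose slope changes from 1 to 2 partway through.
x = list(range(8))
y = [0, 1, 2, 3, 5, 7, 9, 11]
patterns, edges = transmission_network(x, y, window=3, width=0.5)
```

The `edges` counter is a weighted, directed edge list, so quantities such as weighted out-degree can be derived by summing the counts per source pattern.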
Navarrete-Muñoz, Eva María; Valera-Gran, Desirée; Garcia-de-la-Hera, Manuela; Gonzalez-Palacios, Sandra; Riaño, Isolina; Murcia, Mario; Lertxundi, Aitana; Guxens, Mònica; Tardón, Adonina; Amiano, Pilar; Vrijheid, Martine; Rebagliato, Marisa; Vioque, Jesus
2017-11-27
We investigated the association between maternal use of folic acid (FA) during pregnancy and child anthropometric measures at birth. We included 2302 mother-child pairs from a population-based birth cohort in Spain (INMA Project). FA dosages at first and third trimester of pregnancy were assessed using a specific battery questionnaire and were categorized in non-user, < 1000, 1000-4999, and ≥ 5000 µg/day. Anthropometric measures at birth (weight in grams, length and head circumference in centimetres) were obtained from medical records. Small for gestational age according to weight (SGA-w), length (SGA-l) and head circumference (SGA-hc) were defined using the 10th percentile based on Spanish standardized growth reference charts. Multiple linear and logistic regression analyses were used to explore the association between FA dosages in different stages of pregnancy and child anthropometric measures at birth. In the multiple linear regression analysis, we found a tendency for a negative association between the use of high dosages of FA (≥ 5000 µg/day) in the periconceptional period of pregnancy and weight at birth compared to mothers who were non-users of FA (β = - 73.83; 95% CI - 151.71, 4.06). In the multiple logistic regression, a greater risk of SGA-w was also evident among children whose mothers took FA dosages of 1000-4999 (OR = 2.21; 95% CI 1.17, 4.19) and of ≥ 5000 µg/day (OR = 2.32; 95% CI 1.06, 5.08) compared to mothers non-users of FA in the periconceptional period of pregnancy. Our findings suggest that a high dosage of FA (≥ 1000 µg/day) may be associated with an increased risk of SGA-w at birth.
Agier, Lydiane; Portengen, Lützen; Chadeau-Hyam, Marc; Basagaña, Xavier; Giorgis-Allemand, Lise; Siroux, Valérie; Robinson, Oliver; Vlaanderen, Jelle; González, Juan R; Nieuwenhuijsen, Mark J; Vineis, Paolo; Vrijheid, Martine; Slama, Rémy; Vermeulen, Roel
2016-12-01
The exposome constitutes a promising framework to improve understanding of the effects of environmental exposures on health by explicitly considering multiple testing and avoiding selective reporting. However, exposome studies are challenged by the simultaneous consideration of many correlated exposures. We compared the performances of linear regression-based statistical methods in assessing exposome-health associations. In a simulation study, we generated 237 exposure covariates with a realistic correlation structure and with a health outcome linearly related to 0 to 25 of these covariates. Statistical methods were compared primarily in terms of false discovery proportion (FDP) and sensitivity. On average over all simulation settings, the elastic net and sparse partial least-squares regression showed a sensitivity of 76% and an FDP of 44%; Graphical Unit Evolutionary Stochastic Search (GUESS) and the deletion/substitution/addition (DSA) algorithm revealed a sensitivity of 81% and an FDP of 34%. The environment-wide association study (EWAS) underperformed these methods in terms of FDP (average FDP, 86%) despite a higher sensitivity. Performances decreased considerably when assuming an exposome exposure matrix with high levels of correlation between covariates. Correlation between exposures is a challenge for exposome research, and the statistical methods investigated in this study were limited in their ability to efficiently differentiate true predictors from correlated covariates in a realistic exposome context. Although GUESS and DSA provided a marginally better balance between sensitivity and FDP, they did not outperform the other multivariate methods across all scenarios and properties examined, and computational complexity and flexibility should also be considered when choosing between these methods. 
Citation: Agier L, Portengen L, Chadeau-Hyam M, Basagaña X, Giorgis-Allemand L, Siroux V, Robinson O, Vlaanderen J, González JR, Nieuwenhuijsen MJ, Vineis P, Vrijheid M, Slama R, Vermeulen R. 2016. A systematic comparison of linear regression-based statistical methods to assess exposome-health associations. Environ Health Perspect 124:1848-1856; http://dx.doi.org/10.1289/EHP172.
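The two comparison metrics named in the abstract above, false discovery proportion (FDP) and sensitivity, can be computed from a set of selected exposures and the set of true predictors. The sketch below is illustrative only. The function name and exposure labels are hypothetical, and returning `None` for sensitivity when there are no true predictors is an assumption about how that case is handled.

```python
def fdp_and_sensitivity(selected, true_predictors):
    """Return (FDP, sensitivity) for one simulated dataset.

    FDP         -- share of selected exposures that are not true predictors
                   (0.0 when nothing is selected)
    sensitivity -- share of true predictors that were selected
                   (None when there are no true predictors)
    """
    selected = set(selected)
    truth = set(true_predictors)
    true_positives = len(selected & truth)
    fdp = (len(selected) - true_positives) / len(selected) if selected else 0.0
    sensitivity = true_positives / len(truth) if truth else None
    return fdp, sensitivity


# Hypothetical example: four exposures selected, three truly associated.
fdp, sensitivity = fdp_and_sensitivity({"a", "b", "c", "d"}, {"a", "b", "e"})
```

Averaging these two values over many simulated datasets gives summary figures of the kind reported in the abstract.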
Biostatistics Series Module 10: Brief Overview of Multivariate Methods.
Hazra, Avijit; Gogtay, Nithya
2017-01-01
Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
The calculation intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with wider availability, and increasing sophistication of statistical software and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.
Arsenyev, P A; Trezvov, V V; Saratovskaya, N V
1997-01-01
This work presents a method for determining the phase composition of calcium hydroxylapatite from its infrared spectrum. The method applies factor analysis to the spectral data of a calibration set of samples to determine the minimum number of factors required to reproduce the spectra within experimental error. Multiple linear regression is then used to relate the factor scores of the calibration standards to their properties. The resulting regression equations can be used to predict the property value of an unknown sample. A regression model was built for determining the beta-tricalcium phosphate content in hydroxylapatite, and the quality of the model was assessed statistically. Applying factor analysis to the spectral data increases the accuracy of beta-tricalcium phosphate determination and extends the range of determination toward lower concentrations. Reproducibility of results is retained.
Resistance of nickel-chromium-aluminum alloys to cyclic oxidation at 1100 C and 1200 C
NASA Technical Reports Server (NTRS)
Barrett, C. A.; Lowell, C. E.
1976-01-01
Nickel-rich alloys in the Ni-Cr-Al system were evaluated for cyclic oxidation resistance in still air at 1,100 and 1,200 C. A first approximation oxidation attack parameter Ka was derived from specific weight change data involving both a scaling growth constant and a spalling constant. An estimating equation was derived with Ka as a function of the Cr and Al content by multiple linear regression and translated into contour ternary diagrams showing regions of minimum attack. An additional factor inferred from the regression analysis was that alloys melted in zirconia crucibles had significantly greater oxidation resistance than comparable alloys melted otherwise.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Penna, M.L.; Duchiade, M.P.
The authors report the results of an investigation into the possible association between air pollution and infant mortality from pneumonia in the Rio de Janeiro Metropolitan Area. This investigation employed multiple linear regression analysis (stepwise method) for infant mortality from pneumonia in 1980, including the study population's areas of residence, incomes, and pollution exposure as independent variables. With the income variable included in the regression, a statistically significant association was observed between the average annual level of particulates and infant mortality from pneumonia. While this finding should be accepted with caution, it does suggest a biological association between these variables. The authors' conclusion is that air quality indicators should be included in studies of acute respiratory infections in developing countries.
Estimating standard errors in feature network models.
Frank, Laurence E; Heiser, Willem J
2007-05-01
Feature network models are graphical structures that represent proximity data in a discrete space while using the same formalism that is the basis of least squares methods employed in multidimensional scaling. Existing methods to derive a network model from empirical data only give the best-fitting network and yield no standard errors for the parameter estimates. The additivity properties of networks make it possible to consider the model as a univariate (multiple) linear regression problem with positivity restrictions on the parameters. In the present study, both theoretical and empirical standard errors are obtained for the constrained regression parameters of a network model with known features. The performance of both types of standard error is evaluated using Monte Carlo techniques.
NASA Astrophysics Data System (ADS)
Martínez-Fernández, J.; Chuvieco, E.; Koutsias, N.
2013-02-01
Humans are responsible for most forest fires in Europe, but anthropogenic factors behind these events are still poorly understood. We tried to identify the driving factors of human-caused fire occurrence in Spain by applying two different statistical approaches. Firstly, assuming stationary processes for the whole country, we created models based on multiple linear regression and binary logistic regression to find factors associated with fire density and fire presence, respectively. Secondly, we used geographically weighted regression (GWR) to better understand and explore the local and regional variations of those factors behind human-caused fire occurrence. The number of human-caused fires occurring within a 25-yr period (1983-2007) was computed for each of the 7638 Spanish mainland municipalities, creating a binary variable (fire/no fire) to develop logistic models, and a continuous variable (fire density) to build standard linear regression models. A total of 383 657 fires were registered in the study dataset. The binary logistic model, which estimates the probability of having/not having a fire, successfully classified 76.4% of the total observations, while the ordinary least squares (OLS) regression model explained 53% of the variation of the fire density patterns (adjusted R2 = 0.53). Both approaches confirmed, in addition to forest and climatic variables, the importance of variables related with agrarian activities, land abandonment, rural population exodus and developmental processes as underlying factors of fire occurrence. For the GWR approach, the explanatory power of the GW linear model for fire density using an adaptive bandwidth increased from 53% to 67%, while for the GW logistic model the correctly classified observations improved only slightly, from 76.4% to 78.4%, but significantly according to the corrected Akaike Information Criterion (AICc), from 3451.19 to 3321.19. 
The results from GWR indicated a significant spatial variation in the local parameter estimates for all the variables and an important reduction of the autocorrelation in the residuals of the GW linear model. Despite the fitting improvement of local models, GW regression, more than an alternative to "global" or traditional regression modelling, seems to be a valuable complement to explore the non-stationary relationships between the response variable and the explanatory variables. The synergy of global and local modelling provides insights into fire management and policy and helps further our understanding of the fire problem over large areas while at the same time recognizing its local character.
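The contrast between a global fit and a geographically weighted fit can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single predictor, a fixed Gaussian distance kernel (the study used an adaptive bandwidth), and invented toy data; the names `weighted_line` and `gwr_fits` are hypothetical.

```python
import math

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def gwr_fits(coords, x, y, bandwidth):
    """Fit one local line per site, weighting observations by distance."""
    fits = []
    for (u0, v0) in coords:
        # Gaussian kernel: nearby sites count fully, distant ones barely.
        w = [math.exp(-0.5 * (math.hypot(u - u0, v - v0) / bandwidth) ** 2)
             for (u, v) in coords]
        fits.append(weighted_line(x, y, w))
    return fits

# Toy data (assumed): the x-y slope is 1 in the west and 3 in the east.
coords = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 0), (10, 1), (11, 0), (11, 1)]
x = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 4.0, 3.0, 6.0, 9.0, 12.0]

global_a, global_b = weighted_line(x, y, [1.0] * len(x))  # one slope for all sites
local = gwr_fits(coords, x, y, bandwidth=2.0)             # one slope per site
print("global slope:", round(global_b, 3))
print("local slopes:", [round(b, 3) for _, b in local])
```

The global model averages the two regimes into a single slope, while the local fits recover a different slope on each side of the map, which is the kind of spatial non-stationarity GWR is meant to expose.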
Posa, Mihalj; Pilipović, Ana; Lalić, Mladena; Popović, Jovan
2011-02-15
A linear dependence between temperature (t) and the retention coefficient (k, reversed phase HPLC) of bile acids is obtained. The parameters (a, intercept and b, slope) of the linear function k=f(t) correlate highly with the bile acids' structures. On a principal component (calculated from k=f(t)) score plot, the investigated bile acids form linear congeneric groups that are in accordance with the conformations of the hydroxyl and oxo groups in the bile acid steroid skeleton. The partition coefficient (K(p)) of nitrazepam in bile acid micelles is investigated. Nitrazepam molecules incorporated in micelles show modified bioavailability (depo effect, higher permeability, etc.). QSAR models of the nitrazepam partition coefficient, K(p), are derived by multiple linear regression at 25°C and 37°C. The linear regression models at both temperatures include experimentally obtained lipophilicity parameters (PC1 from data k=f(t)) and in silico descriptors of molecular shape, while at the higher temperature molecular polarisation is also introduced. This indicates that the mechanism by which nitrazepam is incorporated in BA micelles changes at the higher temperature. QSAR models are derived using the partial least squares method as well. The experimental parameters k=f(t) are shown to be significant predictive variables. Both QSAR models are validated using cross validation and an internal validation method. PLS models have slightly higher predictive capability than MLR models. Copyright © 2010 Elsevier B.V. All rights reserved.
Reducing the number of reconstructions needed for estimating channelized observer performance
NASA Astrophysics Data System (ADS)
Pineda, Angel R.; Miedema, Hope; Brenner, Melissa; Altaf, Sana
2018-03-01
A challenge for task-based optimization is the time required for each reconstructed image in applications where reconstructions are time consuming. Our goal is to reduce the number of reconstructions needed to estimate the area under the receiver operating characteristic curve (AUC) of the infinitely-trained optimal channelized linear observer. We explore the use of classifiers which either do not invert the channel covariance matrix or do feature selection. We also study the assumption that multiple low contrast signals in the same image of a non-linear reconstruction do not significantly change the estimate of the AUC. We compared the AUC of several classifiers (Hotelling, logistic regression, logistic regression using Firth bias reduction and the least absolute shrinkage and selection operator (LASSO)) with a small number of observations both for normal simulated data and images from a total variation reconstruction in magnetic resonance imaging (MRI). We used 10 Laguerre-Gauss channels and the Mann-Whitney estimator for AUC. For this data, our results show that at small sample sizes feature selection using the LASSO technique can decrease bias of the AUC estimation with increased variance and that for large sample sizes the difference between these classifiers is small. We also compared the use of multiple signals in a single reconstructed image to reduce the number of reconstructions in a total variation reconstruction for accelerated imaging in MRI. We found that AUC estimation using multiple low contrast signals in the same image resulted in similar AUC estimates as doing a single reconstruction per signal leading to a 13x reduction in the number of reconstructions needed.
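The Mann-Whitney estimator of AUC mentioned above can be written in a few lines. This is a generic sketch of that estimator, not the authors' code; the observer scores are invented for illustration and the function name `mann_whitney_auc` is hypothetical.

```python
def mann_whitney_auc(signal_scores, noise_scores):
    """Fraction of (signal, noise) pairs ranked correctly; ties count one half."""
    wins = 0.0
    for s in signal_scores:
        for n in noise_scores:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(signal_scores) * len(noise_scores))

# Assumed toy observer outputs for signal-present and signal-absent images.
signal_present = [0.9, 0.8, 0.4]
signal_absent = [0.7, 0.4, 0.1]
auc = mann_whitney_auc(signal_present, signal_absent)
print(round(auc, 4))
```

With only a handful of images per class the estimate moves in coarse steps (here multiples of 1/18), which is one reason small-sample bias and variance matter when reconstructions are expensive.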
Quality of search strategies reported in systematic reviews published in stereotactic radiosurgery.
Faggion, Clovis M; Wu, Yun-Chun; Tu, Yu-Kang; Wasiak, Jason
2016-06-01
Systematic reviews require comprehensive literature search strategies to avoid publication bias. This study aimed to assess and evaluate the reporting quality of search strategies within systematic reviews published in the field of stereotactic radiosurgery (SRS). Three electronic databases (Ovid MEDLINE®, Ovid EMBASE® and the Cochrane Library) were searched to identify systematic reviews addressing SRS interventions, with the last search performed in October 2014. Manual searches of the reference lists of included systematic reviews were conducted. The search strategies of the included systematic reviews were assessed using a standardized nine-question form based on the Cochrane Collaboration guidelines and Assessment of Multiple Systematic Reviews checklist. Multiple linear regression analyses were performed to identify the important predictors of search quality. A total of 85 systematic reviews were included. The median quality score of search strategies was 2 (interquartile range = 2). Whilst 89% of systematic reviews reported the use of search terms, only 14% of systematic reviews reported searching the grey literature. Multiple linear regression analyses identified publication year (continuous variable), meta-analysis performance and journal impact factor (continuous variable) as predictors of higher mean quality scores. This study identified the urgent need to improve the quality of search strategies within systematic reviews published in the field of SRS. This study is the first to address how authors performed searches to select clinical studies for inclusion in their systematic reviews. Comprehensive and well-implemented search strategies are pivotal to reduce the chance of publication bias and consequently generate more reliable systematic review findings.
Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F
2018-06-01
This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
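The idea behind iteratively re-weighted least squares can be sketched on a calibration line with one gross outlier. The abstract does not state which weight function was used, so the Huber weights, the MAD-based scale, the tuning constant 1.345 and the toy data below are all assumptions, and `irls_line` is a hypothetical name.

```python
import statistics

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def irls_line(x, y, k=1.345, iterations=50):
    """Refit repeatedly, shrinking the weight of points with large residuals."""
    w = [1.0] * len(x)
    a, b = weighted_line(x, y, w)
    for _ in range(iterations):
        a, b = weighted_line(x, y, w)
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust residual scale from the median absolute residual.
        scale = statistics.median([abs(r) for r in resid]) / 0.6745
        if scale == 0:
            break  # at least half the points are fitted exactly
        # Huber weights: 1 inside the threshold, k*scale/|r| outside it.
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in resid]
    return a, b

# Assumed calibration data on y = 1 + 2x, with the last reading corrupted.
conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
signal = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 40.0]

ols_a, ols_b = weighted_line(conc, signal, [1.0] * len(conc))
robust_a, robust_b = irls_line(conc, signal)
print("OLS:   ", round(ols_a, 3), round(ols_b, 3))
print("robust:", round(robust_a, 3), round(robust_b, 3))
```

On this toy set the ordinary fit is pulled strongly toward the corrupted reading, while the re-weighted fit settles close to the line followed by the other seven points.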
NASA Astrophysics Data System (ADS)
Kargoll, Boris; Omidalizarandi, Mohammad; Loth, Ina; Paffenholz, Jens-André; Alkhatib, Hamza
2018-03-01
In this paper, we investigate a linear regression time series model of possibly outlier-afflicted observations and autocorrelated random deviations. This colored noise is represented by a covariance-stationary autoregressive (AR) process, in which the independent error components follow a scaled (Student's) t-distribution. This error model allows for the stochastic modeling of multiple outliers and for an adaptive robust maximum likelihood (ML) estimation of the unknown regression and AR coefficients, the scale parameter, and the degree of freedom of the t-distribution. This approach is meant to be an extension of known estimators, which tend to focus only on the regression model, or on the AR error model, or on normally distributed errors. For the purpose of ML estimation, we derive an expectation conditional maximization either algorithm, which leads to an easy-to-implement version of iteratively reweighted least squares. The estimation performance of the algorithm is evaluated via Monte Carlo simulations for a Fourier as well as a spline model in connection with AR colored noise models of different orders and with three different sampling distributions generating the white noise components. We apply the algorithm to a vibration dataset recorded by a high-accuracy, single-axis accelerometer, focusing on the evaluation of the estimated AR colored noise model.
Liu, Chaoqun; Zhong, Chunrong; Zhou, Xuezhen; Chen, Renjuan; Wu, Jiangyue; Wang, Weiye; Li, Xiating; Ding, Huisi; Guo, Yanfang; Gao, Qin; Hu, Xingwen; Xiong, Guoping; Yang, Xuefeng; Hao, Liping; Xiao, Mei; Yang, Nianhong
2017-01-01
Bilirubin concentrations have been recently reported to be negatively associated with type 2 diabetes mellitus. We examined the association between bilirubin concentrations and gestational diabetes mellitus. In a prospective cohort study, 2969 pregnant women were recruited prior to 16 weeks of gestation and were followed up until delivery. Serum bilirubin was measured and an oral glucose tolerance test was conducted to screen for gestational diabetes mellitus. The relationship between serum bilirubin concentration and gestational weeks was studied by two-piecewise linear regression. A subsample of 1135 participants with a serum bilirubin test at 16-18 weeks of gestation was analyzed by logistic regression to examine the association between serum bilirubin levels and risk of gestational diabetes mellitus. Gestational diabetes mellitus developed in 8.5 % of the participants (223 of 2969). Two-piecewise linear regression analyses demonstrated that bilirubin levels decreased with gestational week up to a turning point at week 23, after which they increased slightly. In multiple logistic regression analysis, the relative risk of developing gestational diabetes mellitus was lower in the highest tertile of direct bilirubin than in the lowest tertile (RR 0.60; 95 % CI, 0.35-0.89). The results suggested that women with higher serum direct bilirubin levels during the second trimester of pregnancy have lower risk for development of gestational diabetes mellitus.
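A two-piecewise linear regression with an unknown turning point can be sketched as a search over candidate split points. The abstract does not describe the fitting procedure, so this sketch assumes two independently fitted segments and a brute-force search, and it uses invented noise-free data; `line_sse` and `best_breakpoint` are hypothetical names.

```python
def line_sse(x, y):
    """Sum of squared residuals of the least-squares line through (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def best_breakpoint(x, y, min_points=3):
    """Try each x as the last point of segment 1; keep the lowest total SSE."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    best = None
    for cut in range(min_points, len(xs) - min_points + 1):
        sse = line_sse(xs[:cut], ys[:cut]) + line_sse(xs[cut:], ys[cut:])
        if best is None or sse < best[1]:
            best = (xs[cut - 1], sse)
    return best

# Assumed toy series: values fall until week 23, then rise slowly.
weeks = list(range(10, 37))
values = [10.0 - 0.2 * (w - 10) if w <= 23 else 7.6 + 0.05 * (w - 24)
          for w in weeks]

turning_point, total_sse = best_breakpoint(weeks, values)
print("estimated turning point: week", turning_point)
```

Requiring at least three points per segment avoids trivially perfect two-point fits at the ends of the series.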
The allometry of coarse root biomass: log-transformed linear regression or nonlinear regression?
Lai, Jiangshan; Yang, Bo; Lin, Dunmei; Kerkhoff, Andrew J; Ma, Keping
2013-01-01
Precise estimation of root biomass is important for understanding carbon stocks and dynamics in forests. Traditionally, biomass estimates are based on allometric scaling relationships between stem diameter and coarse root biomass calculated using linear regression (LR) on log-transformed data. Recently, it has been suggested that nonlinear regression (NLR) is a preferable fitting method for scaling relationships. But while this claim has been contested on both theoretical and empirical grounds, and statistical methods have been developed to aid in choosing between the two methods in particular cases, few studies have examined the ramifications of erroneously applying NLR. Here, we use direct measurements of 159 trees belonging to three locally dominant species in east China to compare the LR and NLR models of diameter-root biomass allometry. We then contrast model predictions by estimating stand coarse root biomass based on census data from the nearby 24-ha Gutianshan forest plot and by testing the ability of the models to predict known root biomass values measured on multiple tropical species at the Pasoh Forest Reserve in Malaysia. Based on likelihood estimates for model error distributions, as well as the accuracy of extrapolative predictions, we find that LR on log-transformed data is superior to NLR for fitting diameter-root biomass scaling models. More importantly, inappropriately using NLR leads to grossly inaccurate stand biomass estimates, especially for stands dominated by smaller trees.
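The log-transformed linear regression approach discussed above fits the power law B = a * D^b as a straight line, ln B = ln a + b ln D. The following is only a sketch of that step with invented, noise-free measurements; the coefficient values and the name `fit_loglog` are assumptions, not results from the paper.

```python
import math

def fit_loglog(diameter, biomass):
    """Fit ln(B) = ln(a) + b*ln(D) by least squares; return (a, b)."""
    lx = [math.log(d) for d in diameter]
    ly = [math.log(m) for m in biomass]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((v - mx) ** 2 for v in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(my - b * mx)  # back-transform the intercept
    return a, b

# Assumed toy trees following B = 0.02 * D**2.5 exactly (D in cm, B in kg).
diameter = [5.0, 10.0, 20.0, 40.0]
biomass = [0.02 * d ** 2.5 for d in diameter]

a_hat, b_hat = fit_loglog(diameter, biomass)
print(round(a_hat, 4), round(b_hat, 4))
```

Fitting on the log scale treats errors as multiplicative, so small and large trees contribute comparably; a nonlinear fit on the original scale assumes additive errors and is dominated by the largest trees. With real, noisy data a bias correction is often applied when back-transforming predictions, which this sketch omits.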
Age and mortality after injury: is the association linear?
Friese, R S; Wynne, J; Joseph, B; Hashmi, A; Diven, C; Pandit, V; O'Keeffe, T; Zangbar, B; Kulvatunyou, N; Rhee, P
2014-10-01
Multiple studies have demonstrated a linear association between advancing age and mortality after injury. An inflection point, or an age at which outcomes begin to differ, has not been previously described. We hypothesized that the relationship between age and mortality after injury is non-linear and an inflection point exists. We performed a retrospective cohort analysis at our urban level I center from 2007 through 2009. All patients aged 65 years and older with the admission diagnosis of injury were included. Non-parametric logistic regression was used to identify the functional form between mortality and age. Multivariate logistic regression was utilized to explore the association between age and mortality. Age 65 years was used as the reference. Significance was defined as p < 0.05. A total of 1,107 patients were included in the analysis. One-third required intensive care unit (ICU) admission and 48 % had traumatic brain injury. 229 patients (20.6 %) were 84 years of age or older. The overall mortality was 7.2 %. Our model indicates that mortality is a quadratic function of age. After controlling for confounders, age is associated with mortality with a regression coefficient of 1.08 for the linear term (p = 0.02) and a regression coefficient of -0.006 for the quadratic term (p = 0.03). The model identified 84.4 years of age as the inflection point at which mortality rates begin to decline. The risk of death after injury varies linearly with age until 84 years. After 84 years of age, the mortality rates decline. These findings may reflect the varying severity of comorbidities and differences in baseline functional status in elderly trauma patients. 
Specifically, a proportion of our injured patient population less than 84 years old may be more frail, contributing to increased mortality after trauma, whereas a larger proportion of our injured patients over 84 years old, by virtue of reaching this advanced age, may, in fact, be less frail, contributing to less risk of death.
Improved determination of particulate absorption from combined filter pad and PSICAM measurements.
Lefering, Ina; Röttgers, Rüdiger; Weeks, Rebecca; Connor, Derek; Utschig, Christian; Heymann, Kerstin; McKee, David
2016-10-31
Filter pad light absorption measurements are subject to two major sources of experimental uncertainty: the so-called pathlength amplification factor, β, and scattering offsets, o, for which previous null-correction approaches are limited by recent observations of non-zero absorption in the near infrared (NIR). A new filter pad absorption correction method is presented here which uses linear regression against point-source integrating cavity absorption meter (PSICAM) absorption data to simultaneously resolve both β and the scattering offset. The PSICAM has previously been shown to provide accurate absorption data, even in highly scattering waters. Comparisons of PSICAM and filter pad particulate absorption data reveal linear relationships that vary on a sample by sample basis. This regression approach provides significantly better agreement with PSICAM data (3.2% RMS%E) than previously published filter pad absorption corrections. Results show that direct transmittance (T-method) filter pad absorption measurements perform effectively at the same level as more complex geometrical configurations based on integrating cavity measurements (IS-method and QFT-ICAM) because the linear regression correction compensates for the sensitivity to scattering errors in the T-method. This approach produces accurate filter pad particulate absorption data for wavelengths in the blue/UV and in the NIR where sensitivity issues with PSICAM measurements limit performance. The combination of the filter pad absorption and PSICAM is therefore recommended for generating full spectral, best quality particulate absorption data as it enables correction of multiple error sources across both measurements.
Thieler, E. Robert; Himmelstoss, Emily A.; Zichichi, Jessica L.; Ergul, Ayhan
2009-01-01
The Digital Shoreline Analysis System (DSAS) version 4.0 is a software extension to ESRI ArcGIS v.9.2 and above that enables a user to calculate shoreline rate-of-change statistics from multiple historic shoreline positions. A user-friendly interface of simple buttons and menus guides the user through the major steps of shoreline change analysis. Components of the extension and user guide include (1) instruction on the proper way to define a reference baseline for measurements, (2) automated and manual generation of measurement transects and metadata based on user-specified parameters, and (3) output of calculated rates of shoreline change and other statistical information. DSAS computes shoreline rates of change using four different methods: (1) endpoint rate, (2) simple linear regression, (3) weighted linear regression, and (4) least median of squares. The standard error, correlation coefficient, and confidence interval are also computed for the simple and weighted linear-regression methods. The results of all rate calculations are output to a table that can be linked to the transect file by a common attribute field. DSAS is intended to facilitate the shoreline change-calculation process and to provide rate-of-change information and the statistical data necessary to establish the reliability of the calculated results. The software is also suitable for any generic application that calculates positional change over time, such as assessing rates of change of glacier limits in sequential aerial photos, river edge boundaries, land-cover changes, and so on.
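Two of the four rate methods listed above are simple enough to sketch directly. This is not DSAS code: the survey years, the shoreline positions along one transect and the function names are invented for illustration.

```python
def endpoint_rate(years, positions):
    """Net movement between oldest and most recent shoreline, per year."""
    pairs = sorted(zip(years, positions))
    (t0, p0), (t1, p1) = pairs[0], pairs[-1]
    return (p1 - p0) / (t1 - t0)

def linear_regression_rate(years, positions):
    """Slope of the least-squares line of position against time."""
    n = len(years)
    mt, mp = sum(years) / n, sum(positions) / n
    stt = sum((t - mt) ** 2 for t in years)
    stp = sum((t - mt) * (p - mp) for t, p in zip(years, positions))
    return stp / stt

# Assumed transect: distance from the baseline (m) at four survey dates.
years = [1950, 1970, 1990, 2010]
positions = [100.0, 90.0, 85.0, 70.0]

epr = endpoint_rate(years, positions)
lrr = linear_regression_rate(years, positions)
print("endpoint rate:", epr, "m/yr")
print("linear regression rate:", lrr, "m/yr")
```

The endpoint rate uses only the first and last shorelines, whereas the regression rate uses every position, so the two generally differ when intermediate shorelines do not fall on a straight line.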
The Effects of Climate Change on Harp Seals (Pagophilus groenlandicus)
Johnston, David W.; Bowers, Matthew T.; Friedlaender, Ari S.; Lavigne, David M.
2012-01-01
Harp seals (Pagophilus groenlandicus) have evolved life history strategies to exploit seasonal sea ice as a breeding platform. As such, individuals are prepared to deal with fluctuations in the quantity and quality of ice in their breeding areas. It remains unclear, however, how shifts in climate may affect seal populations. The present study assesses the effects of climate change on harp seals through three linked analyses. First, we tested the effects of short-term climate variability on young-of-the year harp seal mortality using a linear regression of sea ice cover in the Gulf of St. Lawrence against stranding rates of dead harp seals in the region during 1992 to 2010. A similar regression of stranding rates and North Atlantic Oscillation (NAO) index values was also conducted. These analyses revealed negative correlations between both ice cover and NAO conditions and seal mortality, indicating that lighter ice cover and lower NAO values result in higher mortality. A retrospective cross-correlation analysis of NAO conditions and sea ice cover from 1978 to 2011 revealed that NAO-related changes in sea ice may have contributed to the depletion of seals on the east coast of Canada during 1950 to 1972, and to their recovery during 1973 to 2000. This historical retrospective also reveals opposite links between neonatal mortality in harp seals in the Northeast Atlantic and NAO phase. Finally, an assessment of the long-term trends in sea ice cover in the breeding regions of harp seals across the entire North Atlantic during 1979 through 2011 using multiple linear regression models and mixed effects linear regression models revealed that sea ice cover in all harp seal breeding regions has been declining by as much as 6 percent per decade over the time series of available satellite data. PMID:22238591
Gordon, Evan M.; Stollstorff, Melanie; Vaidya, Chandan J.
2012-01-01
Many researchers have noted that the functional architecture of the human brain is relatively invariant during task performance and the resting state. Indeed, intrinsic connectivity networks (ICNs) revealed by resting-state functional connectivity analyses are spatially similar to regions activated during cognitive tasks. This suggests that patterns of task-related activation in individual subjects may result from the engagement of one or more of these ICNs; however, this has not been tested. We used a novel analysis, spatial multiple regression, to test whether the patterns of activation during an N-back working memory task could be well described by a linear combination of ICNs delineated using Independent Components Analysis at rest. We found that across subjects, the cingulo-opercular Set Maintenance ICN, as well as right and left Frontoparietal Control ICNs, were reliably activated during working memory, while Default Mode and Visual ICNs were reliably deactivated. Further, involvement of Set Maintenance, Frontoparietal Control, and Dorsal Attention ICNs was sensitive to varying working memory load. Finally, the degree of left Frontoparietal Control network activation predicted response speed, while activation in both left Frontoparietal Control and Dorsal Attention networks predicted task accuracy. These results suggest that a close relationship between resting-state networks and task-evoked activation is functionally relevant for behavior, and that spatial multiple regression analysis is a suitable method for revealing that relationship. PMID:21761505
Kim, Yoonsang; Choi, Young-Ku; Emery, Sherry
2013-08-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages (SAS GLIMMIX Laplace and SuperMix Gaussian quadrature) perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes. PMID:24288415
NASA Astrophysics Data System (ADS)
Caimmi, R.
2011-08-01
Concerning bivariate least squares linear regression, the classical approach pursued for functional models in earlier attempts (York, 1966, 1969) is reviewed using a new formalism in terms of deviation (matrix) traces which, for unweighted data, reduce to usual quantities leaving aside an unessential (but dimensional) multiplicative factor. Within the framework of classical error models, the dependent variable relates to the independent variable according to the usual additive model. The classes of linear models considered are regression lines in the general case of correlated errors in X and in Y for weighted data, and in the opposite limiting situations of (i) uncorrelated errors in X and in Y, and (ii) completely correlated errors in X and in Y. The special case of (C) generalized orthogonal regression is considered in detail together with well known subcases, namely: (Y) errors in X negligible (ideally null) with respect to errors in Y; (X) errors in Y negligible (ideally null) with respect to errors in X; (O) genuine orthogonal regression; (R) reduced major-axis regression. In the limit of unweighted data, the results determined for functional models are compared with their counterparts related to extreme structural models i.e. the instrumental scatter is negligible (ideally null) with respect to the intrinsic scatter (Isobe et al., 1990; Feigelson and Babu, 1992). While regression line slope and intercept estimators for functional and structural models necessarily coincide, the contrary holds for related variance estimators even if the residuals obey a Gaussian distribution, with the exception of Y models. An example of astronomical application is considered, concerning the [O/H]-[Fe/H] empirical relations deduced from five samples related to different stars and/or different methods of oxygen abundance determination.
For selected samples and assigned methods, different regression models yield consistent results within the errors (∓σ) for both heteroscedastic and homoscedastic data. Conversely, samples related to different methods produce discrepant results, due to the presence of (still undetected) systematic errors, which implies no definitive statement can be made at present. A comparison is also made between different expressions of regression line slope and intercept variance estimators, where fractional discrepancies are found not to exceed a few percent, growing to about 20% for data with large dispersion. An extension of the formalism to structural models is left to a forthcoming paper.
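For readers who want to see how the (Y), (X), (O) and (R) subcases differ in practice, the following sketch computes the four slopes for unweighted data using the familiar textbook expressions in terms of the sums of squares Sxx, Syy and Sxy. These closed forms are standard for the unweighted case; they are not the trace formalism of the paper, and the function name and toy data are assumptions for illustration only.

```python
import math


def regression_slopes(xs, ys):
    """Slopes dy/dx of four classical regression lines for unweighted data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # assumed non-zero
    sign = 1.0 if sxy > 0 else -1.0
    d = syy - sxx
    return {
        "Y": sxy / sxx,                          # errors in X negligible
        "X": syy / sxy,                          # errors in Y negligible
        "O": (d + math.sqrt(d * d + 4.0 * sxy * sxy)) / (2.0 * sxy),  # orthogonal
        "R": sign * math.sqrt(syy / sxx),        # reduced major axis
    }


# Made-up data points with some scatter.
slopes = regression_slopes([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 5.0, 4.0, 5.0])
```

For positively correlated data the (Y) slope is the shallowest and the (X) slope the steepest, with the orthogonal and reduced major-axis slopes in between; the four coincide only when the points lie exactly on a line.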
1974-01-01
REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January 1974 Nelson Delfino d’Avila Mascarenhas Image...Report 520 DIGITAL IMAGE RESTORATION UNDER A REGRESSION MODEL THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January...a two-dimensional form adequately describes the linear model. A discretization is performed by using quadrature methods. By trans
Element enrichment factor calculation using grain-size distribution and functional data regression.
Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R
2015-01-01
In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.
Van Looy, Stijn; Verplancke, Thierry; Benoit, Dominique; Hoste, Eric; Van Maele, Georges; De Turck, Filip; Decruyenaere, Johan
2007-01-01
Tacrolimus is an important immunosuppressive drug for organ transplantation patients. It has a narrow therapeutic range, toxic side effects, and a blood concentration with wide intra- and interindividual variability. Hence, it is of the utmost importance to monitor tacrolimus blood concentration, thereby ensuring clinical effect and avoiding toxic side effects. Prediction models for tacrolimus blood concentration can improve clinical care by optimizing monitoring of these concentrations, especially in the initial phase after transplantation during intensive care unit (ICU) stay. This is the first study in the ICU in which support vector machines, as a new data modeling technique, are investigated and tested in their prediction capabilities of tacrolimus blood concentration. Linear support vector regression (SVR) and nonlinear radial basis function (RBF) SVR are compared with multiple linear regression (MLR). Tacrolimus blood concentrations, together with 35 other relevant variables from 50 liver transplantation patients, were extracted from our ICU database. This resulted in a dataset of 457 blood samples, on average between 9 and 10 samples per patient, finally resulting in a database of more than 16,000 data values. Nonlinear RBF SVR, linear SVR, and MLR were performed after selection of clinically relevant input variables and model parameters. Differences between observed and predicted tacrolimus blood concentrations were calculated. Prediction accuracy of the three methods was compared after fivefold cross-validation (Friedman test and Wilcoxon signed rank analysis). Linear SVR and nonlinear RBF SVR had mean absolute differences between observed and predicted tacrolimus blood concentrations of 2.31 ng/ml (standard deviation [SD] 2.47) and 2.38 ng/ml (SD 2.49), respectively. MLR had a mean absolute difference of 2.73 ng/ml (SD 3.79). The difference between linear SVR and MLR was statistically significant (p < 0.001). 
RBF SVR had the advantage of requiring only 2 input variables to perform this prediction in comparison to 15 and 16 variables needed by linear SVR and MLR, respectively. This is an indication of the superior prediction capability of nonlinear SVR. Prediction of tacrolimus blood concentration with linear and nonlinear SVR was excellent, and accuracy was superior in comparison with an MLR model.
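The evaluation scheme in this abstract, fivefold cross-validation scored by the mean absolute difference between observed and predicted values, can be sketched independently of the model being evaluated. In the example below a one-variable least-squares line stands in for the SVR and MLR models, which are not reproduced here; the data are synthetic, and the fold assignment by index modulo k is an assumption, not the study's protocol.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def kfold_mean_absolute_difference(xs, ys, k=5):
    """Mean absolute difference between observed and predicted y, per fold."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        intercept, slope = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        diffs = [abs(ys[i] - (intercept + slope * xs[i])) for i in test_idx]
        fold_errors.append(sum(diffs) / len(diffs))
    return fold_errors


# Synthetic data: a straight line plus alternating +/-0.5 noise.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]
fold_errors = kfold_mean_absolute_difference(xs, ys, k=5)
```

Comparing two models then amounts to computing these per-fold errors for each model on the same folds and testing the paired differences, as the abstract does with the Friedman and Wilcoxon signed rank tests.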
Who Will Win?: Predicting the Presidential Election Using Linear Regression
ERIC Educational Resources Information Center
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
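The paper-and-pencil route mentioned in this abstract, solving the normal equations for a least-squares line, can be written out in a few lines. The sketch below solves the two normal equations by Cramer's rule; the function name and the four data points are invented for illustration and are not election data.

```python
def least_squares_line(xs, ys):
    """Solve the normal equations for y = b0 + b1 * x by Cramer's rule.

    Normal equations:
        n * b0      + sum(x) * b1   = sum(y)
        sum(x) * b0 + sum(x^2) * b1 = sum(x * y)
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # zero only if all x values are equal
    b0 = (sy * sxx - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Toy points lying exactly on y = 1 + 2x.
b0, b1 = least_squares_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

A graphing calculator or spreadsheet regression routine should return the same intercept and slope for the same points, which makes this a convenient cross-check for a classroom activity.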
Watkins, Nicholas; Kennedy, Mary; Lee, Nelson; O'Neill, Michael; Peavey, Erin; Ducharme, Maria; Padula, Cynthia
2012-05-01
This study explored the impact of unit design and healthcare information technology (HIT) on nursing workflow and patient-centered care (PCC). Healthcare information technology and unit layout-related predictors of nursing workflow and PCC were measured during a 3-phase study involving questionnaires and work sampling methods. Stepwise multiple linear regressions demonstrated several HIT and unit layout-related factors that impact nursing workflow and PCC.
Lee, Eun Jee; Ogbolu, Yolanda
The purposes of this study were to (a) examine the relationship of personal characteristics (age, gender), psychological factors (depression), and physical factors (sleep time) to smartphone addiction in children and (b) determine whether parental control is associated with a lower incidence of smartphone addiction. Data were collected from children aged 10-12 years (N = 208) by a self-report questionnaire in two elementary schools and were analyzed using t test, one-way analysis of variance, correlation, and multiple linear regression. Most of the participants (73.3%) owned a smartphone, and the percentage of risky smartphone users was 12%. The multiple linear regression model explained 25.4% (adjusted R² = .239) of the variance in the smartphone addiction score (SAS). Three variables were significantly associated with the SAS (age, depression, and parental control), and three variables were excluded (gender, geographic region, and parental control software). Teens, aged 10-12 years, with higher depression scores had higher SASs. The more parental control perceived by the student, the higher the SAS. There was no significant relationship between parental control software and smartphone addiction. This is one of the first studies to examine smartphone addiction in teens. Control-oriented managing by parents of children's smartphone use is not very effective and may exacerbate smartphone addiction. Future research should identify additional strategies, beyond parental control software, that have the potential to prevent, reduce, and eliminate smartphone addiction.
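The abstract quotes an explained variance of 25.4% next to an adjusted value of .239. The usual adjustment for the number of predictors can be checked with a short calculation. The sample size N = 208 comes from the abstract; the number of predictors in the final model is not stated there, so the value 4 used below is purely an assumption chosen to show how the formula behaves.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / (
        n_observations - n_predictors - 1
    )


# n = 208 is reported in the abstract; p = 4 is a hypothetical predictor count.
adj = adjusted_r_squared(0.254, 208, 4)
```

The adjustment shrinks R² more as predictors are added relative to the sample size, which is why it is the preferred figure when comparing models with different numbers of variables.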
Hasani Sangani, Mohammad; Jabbarian Amiri, Bahman; Alizadeh Shabani, Afshin; Sakieh, Yousef; Ashrafi, Sohrab
2015-04-01
Increasing land utilization through diverse forms of human activities, such as agriculture, forestry, urban growth, and industrial development, has led to negative impacts on the water quality of rivers. To find out how catchment attributes, such as land use, hydrologic soil groups, and lithology, can affect water quality variables (Ca(2+), Mg(2+), Na(+), Cl(-), HCO3(-), pH, TDS, EC, SAR), a spatio-statistical approach was applied to 23 catchments in southern basins of the Caspian Sea. All input data layers (digital maps of land use, soil, and lithology) were prepared using geographic information system (GIS) and spatial analysis. Relationships between water quality variables and catchment attributes were then examined by Spearman rank correlation tests and multiple linear regression. Stepwise approach-based multiple linear regressions were developed to examine the relationship between catchment attributes and water quality variables. The areas (%) of marl, tuff, or diorite, as well as those of good-quality rangeland and bare land had negative effects on all water quality variables, while those of basalt and forest land cover were found to contribute to improved river water quality. Moreover, lithological variables showed the greatest potential for predicting the mean concentration values of water quality variables, and EC and TDS were inversely associated with the area (%) of urban land use.
Fu, Chang; Li, Zhen; Mao, Zongfu
2018-01-30
Participation in social activities is one of the important factors for older adults' health. The present study aims to examine the cross-sectional association between social activities and cognitive function among Chinese elderly. A total of 8966 individuals aged 60 and older from the 2015 China Health and Retirement Longitudinal Study were obtained for this study. Telephone interviews of cognitive status, episodic memory, and visuospatial abilities were assessed by questionnaire. We used the sum of all three of the above measures to represent the respondent's cognitive status as a whole. Types and frequencies of participation in social groups were used to measure social activities. Multiple linear regression analysis was used to explore the relationship between social activities and cognitive function. After adjustment for demographics, smoking, drinking, depression, hypertension, diabetes, basic activities of daily living, instrumental activities of daily living, and self-rated health, multiple linear regression analysis revealed that interaction with friends, participating in hobby groups, and sports groups were associated with better cognitive function among both men and women (p < 0.05); doing volunteer work was associated with better cognitive function among women but not among men (p < 0.05). These findings suggest that there is a cross-sectional association between participation in social activities and cognitive function among Chinese elderly. Longitudinal studies are needed to examine the effects of social activities on cognitive function.
Syrengelas, Dimitrios; Kalampoki, Vassiliki; Kleisiouni, Paraskevi; Konstantinou, Dimitrios; Siahanidou, Tania
2014-07-01
The aims of this study were to investigate gross motor development in Greek infants and establish AIMS percentile curves and to examine possible association of AIMS scores with socioeconomic parameters. Mean AIMS scores of 1068 healthy Greek full-term infants were compared at monthly age level with the respective mean scores of the Canadian normative sample. In a subgroup of 345 study participants, parents provided, via interview, information about family socioeconomic status. Multiple linear regression analysis was performed to evaluate the relationship of infant motor development with socioeconomic parameters. Mean AIMS scores did not differ significantly between Greek and Canadian infants in any of the 19 monthly levels of age. In multiple linear regression analysis, the educational level of the mother and also whether the infant was being raised by grandparents/babysitter were significantly associated with gross motor development (p=0.02 and p<0.001, respectively), whereas there was no significant correlation of mean AIMS scores with gender, birth order, maternal age, paternal educational level and family monthly income. Gross motor development of healthy Greek full-term infants, assessed by AIMS during the first 19 months of age, follows a similar course to that of the original Canadian sample. Specific socioeconomic factors are associated with the infants' motor development. Copyright © 2014 Elsevier Ltd. All rights reserved.
Fu, Chang; Li, Zhen; Mao, Zongfu
2018-01-01
Participation in social activities is one of the important factors for older adults’ health. The present study aims to examine the cross-sectional association between social activities and cognitive function among Chinese elderly. A total of 8966 individuals aged 60 and older from the 2015 China Health and Retirement Longitudinal Study were obtained for this study. Telephone interviews of cognitive status, episodic memory, and visuospatial abilities were assessed by questionnaire. We used the sum of all three of the above measures to represent the respondent’s cognitive status as a whole. Types and frequencies of participation in social groups were used to measure social activities. Multiple linear regression analysis was used to explore the relationship between social activities and cognitive function. After adjustment for demographics, smoking, drinking, depression, hypertension, diabetes, basic activities of daily living, instrumental activities of daily living, and self-rated health, multiple linear regression analysis revealed that interaction with friends, participating in hobby groups, and sports groups were associated with better cognitive function among both men and women (p < 0.05); doing volunteer work was associated with better cognitive function among women but not among men (p < 0.05). These findings suggest that there is a cross-sectional association between participation in social activities and cognitive function among Chinese elderly. Longitudinal studies are needed to examine the effects of social activities on cognitive function. PMID:29385773
Serrano-Gallardo, Pilar; Martínez-Marcos, Mercedes; Espejo-Matorrales, Flora; Arakawa, Tiemi; Magnabosco, Gabriela Tavares; Pinto, Ione Carvalho
2016-01-01
ABSTRACT Objective: to identify the students' perception about the quality of clinical placements and assess the influence of the different tutoring processes in clinical learning. Methods: analytical cross-sectional study on second and third year nursing students (n=122) about clinical learning in primary health care. The Clinical Placement Evaluation Tool and a synthetic index of attitudes and skills were computed to give scores to the clinical learning (scale 0-10). Univariate, bivariate and multivariate (multiple linear regression) analyses were performed. Results: the response rate was 91.8%. The most commonly identified tutoring process was "preceptor-professor" (45.2%). The clinical placement was assessed as "optimal" by 55.1%, relationship with team-preceptor was considered good by 80.4% of the cases and the average grade for clinical learning was 7.89. The multiple linear regression model with more explanatory capacity included the variables "Academic year" (beta coefficient = 1.042 for third-year students), "Primary Health Care Area (PHC)" (beta coefficient = 0.308 for Area B) and "Clinical placement perception" (beta coefficient = -0.204 for a suboptimal perception). Conclusions: timeframe within the academic program, location and clinical placement perception were associated with students' clinical learning. Students' perceptions of setting quality were positive and a good team-preceptor relationship is a matter of relevance. PMID:27627124
[Application of artificial neural networks on the prediction of surface ozone concentrations].
Shen, Lu-Lu; Wang, Yu-Xuan; Duan, Lei
2011-08-01
Ozone is an important secondary air pollutant in the lower atmosphere. In order to predict the hourly maximum ozone one day in advance based on the meteorological variables for the Wanqingsha site in Guangzhou, Guangdong province, a neural network model (Multi-Layer Perceptron) and a multiple linear regression model were used and compared. Model inputs are meteorological parameters (wind speed, wind direction, air temperature, relative humidity, barometric pressure and solar radiation) of the next day and hourly maximum ozone concentration of the previous day. The OBS (optimal brain surgeon) was adopted to prune the neural network, to reduce its complexity and to improve its generalization ability. We find that the pruned neural network has the capacity to predict the peak ozone, with an agreement index of 92.3%, the root mean square error of 0.0428 mg/m3, the R-square of 0.737 and the success index of threshold exceedance 77.0% (the threshold O3 mixing ratio of 0.20 mg/m3). When the neural classifier was added to the neural network model, the success index of threshold exceedance increased to 83.6%. Through comparison of the performance indices between the multiple linear regression model and the neural network model, we conclude that the neural network is a better choice to predict peak ozone from meteorological forecast, which may be applied to practical prediction of ozone concentration.
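Two of the performance indices quoted in this abstract can be computed with a few lines of code. The sketch below implements the root mean square error and an index of agreement in Willmott's form; the abstract does not say which definition of "agreement index" was used, so that choice is an assumption, and the observed and predicted values are invented.

```python
import math


def root_mean_square_error(observed, predicted):
    """Square root of the mean squared difference between the two series."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def index_of_agreement(observed, predicted):
    """Willmott's index: 1 - sum((P - O)^2) / sum((|P - Obar| + |O - Obar|)^2)."""
    o_mean = sum(observed) / len(observed)
    numerator = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    denominator = sum(
        (abs(p - o_mean) + abs(o - o_mean)) ** 2 for o, p in zip(observed, predicted)
    )
    return 1.0 - numerator / denominator


# Made-up observed and predicted values (no units implied).
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
rmse = root_mean_square_error(obs, pred)
agreement = index_of_agreement(obs, pred)
```

The index equals 1 for a perfect prediction and falls toward 0 as the prediction errors grow relative to the spread of the observations.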
An Application of Robust Method in Multiple Linear Regression Model toward Credit Card Debt
NASA Astrophysics Data System (ADS)
Amira Azmi, Nur; Saifullah Rusiman, Mohd; Khalid, Kamil; Roslan, Rozaini; Sufahani, Suliadi; Mohamad, Mahathir; Salleh, Rohayu Mohd; Hamzah, Nur Shamsidah Amir
2018-04-01
A credit card is a convenient alternative to cash or cheque, and it is an essential component of electronic and internet commerce. In this study, the researchers attempt to determine the relationship between credit card debt and demographic variables such as age, household income, education level, years with current employer, years at current address, debt to income ratio and other debt, and to identify which of these variables are significant. The data cover information on 850 customers. Three methods were applied to the credit card debt data: multiple linear regression (MLR) models, MLR models with the least quartile difference (LQD) method, and MLR models with the mean absolute deviation method. Comparing the three methods, the MLR model with the LQD method was the best model, with the lowest mean square error (MSE). The final model shows that the years with current employer, years at current address, household income in thousands and debt to income ratio are positively associated with the amount of credit debt. Meanwhile, age, level of education and other debt are negatively associated with the amount of credit debt. This study may serve as a reference for banks on using robust methods, so that they can better understand their options and choose the one best aligned with their goals for inference regarding credit card debt.
Matsuba, Ikuro; Saito, Kazumi; Takai, Masahiko; Hirao, Koichi; Sone, Hirohito
2012-09-01
To investigate the relationship between fasting insulin levels and metabolic risk factors (MRFs) in type 2 diabetic patients at the first clinic/hospital visit in Japan over the years 2000 to 2009. In total, 4,798 drug-naive Japanese patients with type 2 diabetes were registered on their first clinic/hospital visits. Conventional clinical factors and fasting insulin levels were observed at baseline within the Japan Diabetes Clinical Data Management (JDDM) study between consecutive 2-year groups. Multiple linear regression analysis was performed using a model in which the dependent variable was fasting insulin values using various clinical explanatory variables. Fasting insulin levels were found to be decreasing from 2000 to 2009. Multiple linear regression analysis with the fasting insulin levels as the dependent variable showed that waist circumference (WC), BMI, mean blood pressure, triglycerides, and HDL cholesterol were significant, with WC and BMI as the main factors. ANCOVA after adjustment for age and fasting plasma glucose clearly shows the decreasing trend in fasting insulin levels and the increasing trend in BMI. During the 10-year observation period, the decreasing trend in fasting insulin was related to the slight increase in WC/BMI in type 2 diabetes. Low pancreatic β-cell reserve on top of a lifestyle background might be dependent on an increase in MRFs.
Matsuba, Ikuro; Saito, Kazumi; Takai, Masahiko; Hirao, Koichi; Sone, Hirohito
2012-01-01
OBJECTIVE To investigate the relationship between fasting insulin levels and metabolic risk factors (MRFs) in type 2 diabetic patients at the first clinic/hospital visit in Japan over the years 2000 to 2009. RESEARCH DESIGN AND METHODS In total, 4,798 drug-naive Japanese patients with type 2 diabetes were registered on their first clinic/hospital visits. Conventional clinical factors and fasting insulin levels were observed at baseline within the Japan Diabetes Clinical Data Management (JDDM) study between consecutive 2-year groups. Multiple linear regression analysis was performed using a model in which the dependent variable was fasting insulin values using various clinical explanatory variables. RESULTS Fasting insulin levels were found to be decreasing from 2000 to 2009. Multiple linear regression analysis with the fasting insulin levels as the dependent variable showed that waist circumference (WC), BMI, mean blood pressure, triglycerides, and HDL cholesterol were significant, with WC and BMI as the main factors. ANCOVA after adjustment for age and fasting plasma glucose clearly shows the decreasing trend in fasting insulin levels and the increasing trend in BMI. CONCLUSIONS During the 10-year observation period, the decreasing trend in fasting insulin was related to the slight increase in WC/BMI in type 2 diabetes. Low pancreatic β-cell reserve on top of a lifestyle background might be dependent on an increase in MRFs. PMID:22665215
Schwab, Bianca; Daniel, Heloisa Silveira; Lutkemeyer, Carine; Neves, João Arthur Lange Lins; Zilli, Louise Nassif; Guarnieri, Ricardo; Diaz, Alexandre Paim; Michels, Ana Maria Maykot Prates
2015-01-01
Health-related quality of life (HRQOL) assessment tools have been broadly used in the medical context. These tools are used to measure the subjective impact of the disease on patients. The objective of this study was to evaluate the variables associated with HRQOL in a Brazilian sample of patients followed up in a tertiary outpatient clinic for depression and anxiety disorders. Cross-sectional study. Independent variables were those included in a sociodemographic questionnaire and the Hospital Anxiety and Depression Scale (HADS) scores. Dependent variables were those included in the short version of the World Health Organization Quality of Life (WHOQOL-BREF) and the scores for its subdomains (overall quality of life and general health, physical health, psychological health, social relationships, and environment). A multiple linear regression analysis was used to find the variables independently associated with each outcome. Seventy-five adult patients were evaluated. After multiple linear regression analysis, the HADS scores were associated with all outcomes, except social relationships (p = 0.08). Female gender was associated with poor total scores, as well as psychological health and environment. Unemployment was associated with poor physical health. Identifying the factors associated with HRQOL and recognizing that depression and anxiety are major factors are essential to improve the care of patients.
Analysis of methods to estimate spring flows in a karst aquifer
Sepulveda, N.
2009-01-01
Hydraulically and statistically based methods were analyzed to identify the most reliable method to predict spring flows in a karst aquifer. Measured water levels at nearby observation wells, measured spring pool altitudes, and the distance between observation wells and the spring pool were the parameters used to match measured spring flows. Measured spring flows at six Upper Floridan aquifer springs in central Florida were used to assess the reliability of these methods to predict spring flows. Hydraulically based methods involved the application of the Theis, Hantush-Jacob, and Darcy-Weisbach equations, whereas the statistically based methods were the multiple linear regressions and the technology of artificial neural networks (ANNs). Root mean square errors between measured and predicted spring flows using the Darcy-Weisbach method ranged between 5% and 15% of the measured flows, lower than the 7% to 27% range for the Theis or Hantush-Jacob methods. Flows at all springs were estimated to be turbulent based on the Reynolds number derived from the Darcy-Weisbach equation for conduit flow. The multiple linear regression and the Darcy-Weisbach methods had similar spring flow prediction capabilities. The ANNs provided the lowest residuals between measured and predicted spring flows, ranging from 1.6% to 5.3% of the measured flows. The model prediction efficiency criteria also indicated that the ANNs were the most accurate method predicting spring flows in a karst aquifer. © 2008 National Ground Water Association.
Noh, J-W; Kwon, Y-D; Yoon, S-J; Hwang, J-I
2011-06-01
Numerous studies on HNC services have been carried out by signifying their needs, efficiency and effectiveness. However, no study has ever been performed to determine the critical factors associated with HNC's positive results despite the deluge of positive studies on the service. This study included all of the 89 training hospitals that were practising HNC service in Korea as of November 2006. The input factors affecting the performance were classified as either internal or external environmental factors. This analysis was conducted to understand the impact that the corresponding factors had on performance. Data were analysed by using multiple linear regressions. The internal and external environment variables affected the performance of HNC based on univariate analysis. The meaningful variables were internal environmental factors. Specifically, managerial resource (the number of operating beds and the outpatient/inpatient ratio) were meaningful when the multiple linear regression analysis was performed. Indeed, the importance of organizational culture (the passion of HNC nurses) was significant. This study, considering the limited market size of Korea, illustrates that the critical factor for the development of hospital-led HNC lies with internal environmental factors rather than external ones. Among the internal environmental factors, the hospitals' managerial resource-related factors (specifically, the passion of nurses) were the most important contributing element. © 2011 The Authors. International Nursing Review © 2011 International Council of Nurses.
Bianconi, André; Zuben, Cláudio J. Von; Serapião, Adriane B. de S.; Govone, José S.
2010-01-01
Bionomic features of blowflies may be clarified and detailed by the deployment of appropriate modelling techniques such as artificial neural networks, which are mathematical tools widely applied to the resolution of complex biological problems. The principal aim of this work was to use three well-known neural networks, namely Multi-Layer Perceptron (MLP), Radial Basis Function (RBF), and Adaptive Neural Network-Based Fuzzy Inference System (ANFIS), to ascertain whether these tools would be able to outperform a classical statistical method (multiple linear regression) in the prediction of the number of resultant adults (survivors) of experimental populations of Chrysomya megacephala (F.) (Diptera: Calliphoridae), based on initial larval density (number of larvae), amount of available food, and duration of immature stages. The coefficient of determination (R2) derived from the RBF was the lowest in the testing subset in relation to the other neural networks, even though its R2 in the training subset exhibited virtually a maximum value. The ANFIS model permitted the achievement of the best testing performance. Hence this model was deemed to be more effective in relation to MLP and RBF for predicting the number of survivors. All three networks outperformed the multiple linear regression, indicating that neural models could be taken as feasible techniques for predicting bionomic variables concerning the nutritional dynamics of blowflies. PMID:20569135
Kang, Kun-Tai; Chiu, Shuenn-Nan; Weng, Wen-Chin; Lee, Pei-Lin; Hsu, Wei-Chung
2017-03-01
To compare office blood pressure (BP) and 24-hour ambulatory BP (ABP) monitoring to facilitate the diagnosis and management of hypertension in children with obstructive sleep apnea (OSA). Children aged 4-16 years with OSA-related symptoms were recruited from a tertiary referral medical center. All children underwent overnight polysomnography, office BP, and 24-hour ABP studies. Multiple linear regression analyses were applied to elucidate the association between the apnea-hypopnea index and BP. Correlation and consistency between office BP and 24-hour ABP were measured by Pearson correlation, intraclass correlation, and Bland-Altman analyses. In the 163 children enrolled (mean age, 8.2 ± 3.3 years; 67% male), the prevalence of systolic hypertension at night was significantly higher in children with moderate-to-severe OSA than in those with primary snoring (44.9% vs 16.1%, P = .006). Pearson correlation and intraclass correlation analyses revealed associations between office BP and 24-hour BP, and Bland-Altman analysis indicated an agreement between office and 24-hour BP measurements. However, multiple linear regression analyses demonstrated that 24-hour BP (nighttime systolic BP and mean arterial pressure), unlike office BP, was independently associated with the apnea-hypopnea index, after adjustment for adiposity variables. Twenty-four-hour ABP is more strongly correlated with OSA in children, compared with office BP. Copyright © 2016 Elsevier Inc. All rights reserved.
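The Bland-Altman agreement analysis mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the conventional definition (bias = mean of the paired differences, limits of agreement = bias ± 1.96 sample standard deviations); the function name `bland_altman` and all readings are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    # Difference between the two methods for each subject.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    # Conventional 95% limits of agreement.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical systolic readings in mmHg (illustration only).
office = [110.0, 118.0, 105.0, 122.0, 115.0]
ambulatory = [108.0, 121.0, 104.0, 119.0, 117.0]
bias, lower, upper = bland_altman(office, ambulatory)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A small bias with narrow limits would suggest that the two methods agree on average; it says nothing about which method relates more strongly to a third variable such as the apnea-hypopnea index.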
Analysis of methods to estimate spring flows in a karst aquifer.
Sepúlveda, Nicasio
2009-01-01
Hydraulically and statistically based methods were analyzed to identify the most reliable method to predict spring flows in a karst aquifer. Measured water levels at nearby observation wells, measured spring pool altitudes, and the distance between observation wells and the spring pool were the parameters used to match measured spring flows. Measured spring flows at six Upper Floridan aquifer springs in central Florida were used to assess the reliability of these methods to predict spring flows. Hydraulically based methods involved the application of the Theis, Hantush-Jacob, and Darcy-Weisbach equations, whereas the statistically based methods were the multiple linear regressions and the technology of artificial neural networks (ANNs). Root mean square errors between measured and predicted spring flows using the Darcy-Weisbach method ranged between 5% and 15% of the measured flows, lower than the 7% to 27% range for the Theis or Hantush-Jacob methods. Flows at all springs were estimated to be turbulent based on the Reynolds number derived from the Darcy-Weisbach equation for conduit flow. The multiple linear regression and the Darcy-Weisbach methods had similar spring flow prediction capabilities. The ANNs provided the lowest residuals between measured and predicted spring flows, ranging from 1.6% to 5.3% of the measured flows. The model prediction efficiency criteria also indicated that the ANNs were the most accurate method predicting spring flows in a karst aquifer.
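The error figures quoted above are root mean square errors expressed as a percentage of measured flow. The abstract does not say how the normalization was done, so the sketch below assumes division by the mean measured flow; `rmse_percent` and the flow values are hypothetical.

```python
import math


def rmse_percent(measured, predicted):
    """RMSE between two series as a percentage of the mean measured value.

    Normalizing by the mean is an assumption; the abstract only says
    "percent of the measured flows".
    """
    if len(measured) != len(predicted):
        raise ValueError("series must have equal length")
    n = len(measured)
    squared_error = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    rmse = math.sqrt(squared_error / n)
    return 100.0 * rmse / (sum(measured) / n)


# Made-up spring flows (arbitrary units), not data from the study.
measured_flow = [10.0, 10.0, 10.0, 10.0]
predicted_flow = [11.0, 9.0, 11.0, 9.0]
print(rmse_percent(measured_flow, predicted_flow))
```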
Age is no barrier: predictors of academic success in older learners
NASA Astrophysics Data System (ADS)
Imlach, Abbie-Rose; Ward, David D.; Stuart, Kimberley E.; Summers, Mathew J.; Valenzuela, Michael J.; King, Anna E.; Saunders, Nichole L.; Summers, Jeffrey; Srikanth, Velandai K.; Robinson, Andrew; Vickers, James C.
2017-11-01
Although predictors of academic success have been identified in young adults, such predictors are unlikely to translate directly to an older student population, where such information is scarce. The current study aimed to examine cognitive, psychosocial, lifetime, and genetic predictors of university-level academic performance in older adults (50-79 years old). Participants were mostly female (71%) and had a greater than high school education level (M = 14.06 years, SD = 2.76), on average. Two multiple linear regression analyses were conducted. The first examined all potential predictors of grade point average (GPA) in the subset of participants who had volunteered samples for genetic analysis (N = 181). Significant predictors of GPA were then re-examined in a second multiple linear regression using the full sample (N = 329). Our data show that the cognitive domains of episodic memory and language processing, in conjunction with midlife engagement in cognitively stimulating activities, have a role in predicting academic performance as measured by GPA in the first year of study. In contrast, it was determined that age, IQ, gender, working memory, psychosocial factors, and common brain gene polymorphisms linked to brain function, plasticity and degeneration (APOE, BDNF, COMT, KIBRA, SERT) did not influence academic performance. These findings demonstrate that ageing does not impede academic achievement, and that discrete cognitive skills as well as lifetime engagement in cognitively stimulating activities can promote academic success in older adults.
Vasomotor and physical menopausal symptoms are associated with sleep quality.
Kim, Min-Ju; Yim, Gyeyoon; Park, Hyun-Young
2018-01-01
Sleep disturbance is one of the common complaints in menopause. This study investigated the relationship between menopausal symptoms and sleep quality in middle-aged women. This cross-sectional observational study involved 634 women aged 44-56 years attending a healthcare center at Kangbuk Samsung Hospitals. Sleep quality was measured using the Pittsburgh Sleep Quality Index (PSQI). Multiple linear regression analysis was performed to assess the associations between PSQI scores and Menopause-specific Quality of Life (MENQOL) scores. The mean PSQI score was 3.6 ± 2.3, and the rates of poor sleep quality (PSQI score > 5) in premenopausal, perimenopausal, and postmenopausal women were 14.4%, 18.2%, and 30.2%, respectively. The total PSQI score and, specifically, the sleep latency, habitual sleep efficiency and sleep disturbances scores were significantly increased in postmenopausal women. Multiple linear regression analysis adjusted for age, BMI, hypertension, diabetes, smoking, marital status, family income, education, employment status, parity, physical activity, depression symptoms, perceived stress and menopausal status showed that higher PSQI score was positively correlated with higher vasomotor (β = 0.240, P = 0.020) and physical (β = 0.572, P < 0.001) scores. Vasomotor and physical menopause symptoms were related to poor sleep quality. Effective management strategies aimed at reducing menopausal symptoms may improve sleep quality among women around the time of menopause.
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
A Landsat study of water quality in Lake Okeechobee
NASA Technical Reports Server (NTRS)
Gervin, J. C.; Marshall, M. L.
1976-01-01
This paper uses multiple regression techniques to investigate the relationship between Landsat radiance values and water quality measurements. For a period of over one year, the Central and Southern Florida Flood Control District sampled the water of Lake Okeechobee for chlorophyll, carotenoids, turbidity, and various nutrients at the time of Landsat overpasses. Using an overlay map of the sampling stations, Landsat radiance values were measured from computer compatible tapes using a GE image 100 and averaging over a 22-acre area at each station. These radiance values in four bands were used to form a number of functions (powers, logarithms, exponentials, and ratios), which were then compared with the ground measurements using multiple linear regression techniques. Several dates were used to provide generality and to study possible seasonal variations. Individual correlations were presented for the various water quality parameters and best fit equations were examined for chlorophyll and turbidity. The results and their relationship to past hydrological research were discussed.
Elfering, Achim; Häfliger, Evelyne; Celik, Zehra; Grebner, Simone
2018-07-01
In industrial countries home care services for elderly people living in the community are growing rapidly. Home care nursing is intensive and the nurses often suffer from musculoskeletal pain. Time pressure and job control are job-related factors linked to the risk of experiencing lower back pain (LBP) and LBP-related work impairment. This survey investigated whether work-family conflict (WFC), emotional dissonance and being appreciated at work have incremental predictive value. Responses were obtained from 125 home care nurses (63% response rate). Multiple linear regression showed that emotional dissonance and being appreciated at work predicted LBP intensity and LBP-related disability independently of time pressure and job control. WFC was not a predictor of LBP-related disability in multiple regression analyses despite a zero-order correlation with it. Redesigning the working pattern of home care nurses to reduce the emotional demands and improve appreciation of their work might reduce the incidence of LBP in this group.
Fonseca-Machado, Mariana de Oliveira; Monteiro, Juliana Cristina dos Santos; Haas, Vanderlei José; Abrão, Ana Cristina Freitas de Vilhena; Gomes-Sponholz, Flávia
2015-01-01
Objective: to identify the relationship between posttraumatic stress disorder, trait and state anxiety, and intimate partner violence during pregnancy. Method: observational, cross-sectional study developed with 358 pregnant women. The Posttraumatic Stress Disorder Checklist - Civilian Version was used, as well as the State-Trait Anxiety Inventory and an adapted version of the instrument used in the World Health Organization Multi-country Study on Women's Health and Domestic Violence. Results: after adjustment in the multiple logistic regression model, intimate partner violence occurring during pregnancy was associated with the indication of posttraumatic stress disorder. The adjusted multiple linear regression models showed that victims of violence in the current pregnancy had higher symptom scores of trait and state anxiety than non-victims. Conclusion: recognizing intimate partner violence as a clinically relevant and identifiable risk factor for the occurrence of anxiety disorders during pregnancy can be a first step in the prevention thereof. PMID:26487135
Changes in the timing of snowmelt and streamflow in Colorado: A response to recent warming
Clow, David W.
2010-01-01
Trends in the timing of snowmelt and associated runoff in Colorado were evaluated for the 1978-2007 water years using the regional Kendall test (RKT) on daily snow-water equivalent (SWE) data from snowpack telemetry (SNOTEL) sites and daily streamflow data from headwater streams. The RKT is a robust, nonparametric test that provides an increased power of trend detection by grouping data from multiple sites within a given geographic region. The RKT analyses indicated strong, pervasive trends in snowmelt and streamflow timing, which have shifted toward earlier in the year by a median of 2-3 weeks over the 29-yr study period. In contrast, relatively few statistically significant trends were detected using simple linear regression. RKT analyses also indicated that November-May air temperatures increased by a median of 0.9 °C per decade, while 1 April SWE and maximum SWE declined by a median of 4.1 and 3.6 cm per decade, respectively. Multiple linear regression models were created, using monthly air temperatures, snowfall, latitude, and elevation as explanatory variables to identify major controlling factors on snowmelt timing. The models accounted for 45% of the variance in snowmelt onset, and 78% of the variance in the snowmelt center of mass (when half the snowpack had melted). Variations in springtime air temperature and SWE explained most of the interannual variability in snowmelt timing. Regression coefficients for air temperature were negative, indicating that warm temperatures promote early melt. Regression coefficients for SWE, latitude, and elevation were positive, indicating that abundant snowfall tends to delay snowmelt, and snowmelt tends to occur later at northern latitudes and high elevations. Results from this study indicate that even the mountains of Colorado, with their high elevations and cold snowpacks, are experiencing substantial shifts in the timing of snowmelt and snowmelt runoff toward earlier in the year.
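The grouping of sites that the abstract credits for the RKT's added power can be sketched as follows. This assumes the usual formulation, in which a Mann-Kendall S statistic is computed separately for each site and the per-site values are summed; the variance and significance calculations are omitted. The function names and the day-of-year values are hypothetical, not data from the study.

```python
def kendall_s(series):
    """Mann-Kendall S: number of increasing pairs minus decreasing pairs."""
    s = 0
    for i in range(len(series) - 1):
        for j in range(i + 1, len(series)):
            diff = series[j] - series[i]
            # +1 for a later value that is higher, -1 if lower, 0 for a tie.
            s += (diff > 0) - (diff < 0)
    return s


def regional_kendall_s(site_series):
    """Sum the per-site S statistics; values are never compared across sites."""
    return sum(kendall_s(series) for series in site_series.values())


# Hypothetical snowmelt onset (day of year) at two sites over five years.
snowmelt_onset_day = {
    "site_a": [150, 148, 149, 145, 143],
    "site_b": [160, 161, 157, 155, 152],
}
regional_s = regional_kendall_s(snowmelt_onset_day)
print(regional_s)  # a negative value indicates a shift toward earlier melt
```

Because comparisons are made only within each site, differences in elevation or latitude between sites do not contaminate the trend statistic.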
A secure distributed logistic regression protocol for the detection of rare adverse drug events
El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat
2013-01-01
Background: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective: To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods: We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results: The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion: The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs.
We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models. PMID:22871397
Influence of landscape-scale factors in limiting brook trout populations in Pennsylvania streams
Kocovsky, P.M.; Carline, R.F.
2006-01-01
Landscapes influence the capacity of streams to produce trout through their effect on water chemistry and other factors at the reach scale. Trout abundance also fluctuates over time; thus, to thoroughly understand how spatial factors at landscape scales affect trout populations, one must assess the changes in populations over time to provide a context for interpreting the importance of spatial factors. We used data from the Pennsylvania Fish and Boat Commission's fisheries management database to investigate spatial factors that affect the capacity of streams to support brook trout Salvelinus fontinalis and to provide models useful for their management. We assessed the relative importance of spatial and temporal variation by calculating variance components and comparing relative standard errors for spatial and temporal variation. We used binary logistic regression to predict the presence of harvestable-length brook trout and multiple linear regression to assess the mechanistic links between landscapes and trout populations and to predict population density. The variance in trout density among streams was equal to or greater than the temporal variation for several streams, indicating that differences among sites affect population density. Logistic regression models correctly predicted the absence of harvestable-length brook trout in 60% of validation samples. The r² value for the linear regression model predicting density was 0.3, indicating low predictive ability. Both logistic and linear regression models supported buffering capacity against acid episodes as an important mechanistic link between landscapes and trout populations.
Although our models fail to predict trout densities precisely, their success at elucidating the mechanistic links between landscapes and trout populations, in concert with the importance of spatial variation, increases our understanding of factors affecting brook trout abundance and will help managers and private groups to protect and enhance populations of wild brook trout. © Copyright by the American Fisheries Society 2006.
Oil and gas pipeline construction cost analysis and developing regression models for cost estimation
NASA Astrophysics Data System (ADS)
Thaduri, Ravi Kiran
In this study, cost data for 180 pipelines and 136 compressor stations have been analyzed. On the basis of the distribution analysis, regression models have been developed. Material, labor, right-of-way (ROW) and miscellaneous costs make up the total cost of pipeline construction. The pipelines are analyzed based on different pipeline lengths, diameter, location, pipeline volume and year of completion. In pipeline construction, labor costs dominate the total costs with a share of about 40%. Multiple non-linear regression models are developed to estimate the component costs of pipelines for various cross-sectional areas, lengths and locations. The compressor stations are analyzed based on the capacity, year of completion and location. Unlike the pipeline costs, material costs dominate the total costs in the construction of a compressor station, with an average share of about 50.6%. Land costs have very little influence on the total costs. Similar regression models are developed to estimate the component costs of compressor stations for various capacities and locations.
Demidenko, Eugene
2017-09-01
The exact density distribution of the nonlinear least squares estimator in the one-parameter regression model is derived in closed form and expressed through the cumulative distribution function of the standard normal variable. Several proposals to generalize this result are discussed. The exact density is extended to the estimating equation (EE) approach and the nonlinear regression with an arbitrary number of linear parameters and one intrinsically nonlinear parameter. For a very special nonlinear regression model, the derived density coincides with the distribution of the ratio of two normally distributed random variables previously obtained by Fieller (1932), unlike other approximations previously suggested by other authors. Approximations to the density of the EE estimators are discussed in the multivariate case. Numerical complications associated with the nonlinear least squares are illustrated, such as nonexistence and/or multiple solutions, as major factors contributing to poor density approximation. The nonlinear Markov-Gauss theorem is formulated based on the near exact EE density approximation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sacher, G.A.
1978-01-01
The maximum lifespans in captivity for terrestrial mammalian species can be estimated by means of a multiple linear regression of logarithm of lifespan (L) on the logarithm of adult brain weight (E) and body weight (S). This paper describes the application of regression formulas based on data from terrestrial mammals to the estimation of odontocete and mysticete lifespans. The regression formulas predict cetacean lifespans that are in accord with the data on maximum cetacean lifespans obtained in recent years by objective age determination procedures. More remarkable is the correct prediction by the regression formulas that the odontocete species have nearly constant lifespans, almost independent of body weight over a 300:1 body weight range. This prediction is a consequence of the fact, remarkable in itself, that over this body weight range the Odontoceti have a brain:body allometric slope of 1/3, as compared to a slope of 2/3 for the Mammalia as a whole.
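The regression form described above, log L = a + b log E + c log S, can be fitted by ordinary least squares. The sketch below does this with the normal equations in plain Python. The coefficients in `TRUE_COEFFS` and the brain and body weights are made up for illustration and are not the values reported in the paper; the data are generated exactly from those coefficients so that the fit can be checked against them.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest pivot into place for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical coefficients (a, b, c) and weights in grams; illustration only.
TRUE_COEFFS = (1.0, 0.6, -0.2)
brain_g = [1.0, 10.0, 100.0, 1000.0, 50.0]
body_g = [100.0, 5000.0, 20000.0, 1.0e6, 3000.0]

log_rows = [(math.log10(e), math.log10(s)) for e, s in zip(brain_g, body_g)]
log_life = [
    TRUE_COEFFS[0] + TRUE_COEFFS[1] * le + TRUE_COEFFS[2] * ls
    for le, ls in log_rows
]
coef = fit_ols(log_rows, log_life)
print([round(c, 6) for c in coef])
```

In a model of this form, if brain weight scales as body weight to the power k, predicted log lifespan changes with log body weight at rate b·k + c, so a lifespan that is nearly independent of body weight corresponds to b·k + c being close to zero.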
Spatial interpolation schemes of daily precipitation for hydrologic modeling
Hwang, Y.; Clark, M.R.; Rajagopalan, B.; Leavesley, G.
2012-01-01
Distributed hydrologic models typically require spatial estimates of precipitation interpolated from sparsely located observational points to the specific grid points. We compare and contrast the performance of regression-based statistical methods for the spatial estimation of precipitation in two hydrologically different basins and confirm that widely used regression-based estimation schemes fail to describe the realistic spatial variability of the daily precipitation field. The methods assessed are: (1) inverse distance weighted average; (2) multiple linear regression (MLR); (3) climatological MLR; and (4) locally weighted polynomial regression (LWP). In order to improve the performance of the interpolations, the authors propose a two-step regression technique for effective daily precipitation estimation. In this simple two-step estimation process, precipitation occurrence is first generated via a logistic regression model before estimating the amount of precipitation separately on wet days. This process generated the precipitation occurrence, amount, and spatial correlation effectively. A distributed hydrologic model (PRMS) was used for the impact analysis in daily time step simulation. Multiple simulations suggested noticeable differences between the input alternatives generated by three different interpolation schemes. Differences are shown in overall simulation error against the observations, degree of explained variability, and seasonal volumes. Simulated streamflows also showed different characteristics in mean, maximum, minimum, and peak flows. Given the same parameter optimization technique, LWP input showed the least streamflow error in the Alapaha basin and CMLR input showed the least error (still very close to LWP) in the Animas basin. All of the two-step interpolation inputs resulted in lower streamflow error compared to the directly interpolated inputs. © 2011 Springer-Verlag.
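The first method listed above, the inverse distance weighted average, can be sketched in a few lines. The abstract does not state the distance exponent or any search radius, so the sketch assumes a power of 2 and uses every station; `idw_estimate`, the coordinates and the precipitation values are hypothetical.

```python
import math


def idw_estimate(stations, target, power=2.0):
    """Inverse distance weighted estimate at a target point.

    stations: list of ((x, y), value) pairs; target: (x, y).
    The default power of 2 is an assumption, not taken from the paper.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a station: return its observation.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Made-up gauge locations (km) and daily precipitation (mm).
gauges = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
print(idw_estimate(gauges, (1.0, 0.0)))
```

Because the weights are positive and sum to one after normalization, the estimate always lies between the smallest and largest observed values, which is one way a direct interpolation of this kind can smooth out real spatial variability.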