ERIC Educational Resources Information Center
Rocconi, Louis M.
2013-01-01
This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the comparable applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data were first analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
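The two kinds of error named in the abstract above can be illustrated with a small Bland-Altman calculation. This is only a minimal sketch: the function names and the numbers are hypothetical, and it is not the authors' implementation. The mean difference estimates constant error, and the slope of the differences against the pair means hints at proportional error.

```python
import numpy as np

def bland_altman(reference, test):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    diff = test - reference
    bias = diff.mean()              # estimate of constant error
    sd = diff.std(ddof=1)           # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def proportional_error_slope(reference, test):
    """Slope of (test - reference) against the pair mean.

    A slope clearly different from zero suggests proportional error.
    """
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    pair_mean = (reference + test) / 2.0
    slope, _intercept = np.polyfit(pair_mean, test - reference, 1)
    return slope

# Hypothetical values: the new assay reads 10% high plus a constant offset of 2.
ref = np.array([5.0, 10.0, 20.0, 40.0, 50.0])
new = 1.1 * ref + 2.0

bias, lower, upper = bland_altman(ref, new)
slope = proportional_error_slope(ref, new)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f}), slope={slope:.3f}")
```

R² alone would be 1.0 for these made-up data, even though the two assays disagree at every point, which is the limitation the abstract describes.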
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748
Watanabe, Hiroyuki; Miyazaki, Hiroyasu
2006-01-01
Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correction models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
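For orientation, the two fixed-exponent formulas mentioned in the abstract above are usually written as QTc = QT / RR^(1/2) (Bazett) and QTc = QT / RR^(1/3) (Fridericia), with RR in seconds. The sketch below assumes those textbook forms; the abstract does not state the exact parameterization used for the dog data, and the function names are illustrative.

```python
import numpy as np

def qtc_bazett(qt_ms, rr_s):
    """Bazett correction: QT divided by the square root of RR (RR in seconds)."""
    return np.asarray(qt_ms, dtype=float) / np.sqrt(rr_s)

def qtc_fridericia(qt_ms, rr_s):
    """Fridericia correction: QT divided by the cube root of RR (RR in seconds)."""
    return np.asarray(qt_ms, dtype=float) / np.cbrt(rr_s)

# Hypothetical readings: the same QT of 400 ms at three RR intervals.
rr = np.array([0.5, 1.0, 1.5])
print(qtc_bazett(400.0, rr))
print(qtc_fridericia(400.0, rr))
```

Both formulas leave QT unchanged at RR = 1 s and diverge from each other as RR moves away from it, which is where a fixed exponent is most likely to over- or under-correct.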
Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma
2016-01-01
Aim: This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666
How is the weather? Forecasting inpatient glycemic control
Saulnier, George E; Castro, Janna C; Cook, Curtiss B; Thompson, Bithika M
2017-01-01
Aim: Apply methods of damped trend analysis to forecast inpatient glycemic control. Method: Observed and calculated point-of-care blood glucose data trends were determined over 62 weeks. Mean absolute percent error was used to calculate differences between observed and forecasted values. Comparisons were drawn between model results and linear regression forecasting. Results: The forecasted mean glucose trends observed during the first 24 and 48 weeks of projections compared favorably to the results provided by linear regression forecasting. However, in some scenarios, the damped trend method changed inferences compared with linear regression. In all scenarios, mean absolute percent error values remained below the 10% accepted by demand industries. Conclusion: Results indicate that forecasting methods historically applied within demand industries can project future inpatient glycemic control. Additional study is needed to determine if forecasting is useful in the analyses of other glucometric parameters and, if so, how to apply the techniques to quality improvement. PMID:29134125
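The error metric and the damping idea in the abstract above can be sketched briefly. This is an illustration under stated assumptions, not the authors' procedure: the level, trend and damping parameter phi are taken as given rather than estimated, and the function names and values are hypothetical.

```python
import numpy as np

def mape(observed, forecast):
    """Mean absolute percent error, in percent."""
    observed = np.asarray(observed, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs(observed - forecast) / np.abs(observed))

def damped_trend_forecast(level, trend, phi, horizon):
    """h-step-ahead forecast from a damped trend model.

    The trend contribution is phi + phi^2 + ... + phi^h times the trend,
    so for 0 < phi < 1 the forecast flattens out instead of growing linearly.
    """
    damping = sum(phi ** i for i in range(1, horizon + 1))
    return level + damping * trend

# Hypothetical weekly mean glucose values (mg/dL) and forecasts.
print(mape([160.0, 158.0, 155.0], [162.0, 157.0, 150.0]))
print(damped_trend_forecast(level=155.0, trend=-1.0, phi=0.9, horizon=4))
```

A straight-line forecast corresponds to phi = 1; smaller values of phi shrink the projected change at longer horizons.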
Biostatistics Series Module 6: Correlation and Linear Regression
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175
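The quantities described in the abstract above (r, the coefficient of determination, and the least-squares line y = a + bx) can be computed in a few lines. This is a minimal sketch with made-up data; the function name is hypothetical.

```python
import numpy as np

def simple_linear_fit(x, y):
    """Return intercept a, slope b, Pearson r and r squared for y = a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    b = np.sum(dx * dy) / np.sum(dx ** 2)   # least-squares slope
    a = y.mean() - b * x.mean()             # line passes through the means
    r = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))
    return a, b, r, r ** 2

# Hypothetical paired observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, r, r2 = simple_linear_fit(x, y)
print(f"y = {a:.2f} + {b:.2f}x, r = {r:.3f}, r^2 = {r2:.3f}")
```

The slope and r share the same numerator, so they always have the same sign; r only adds the scaling that makes it unit-free.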
Non-Linear Approach in Kinesiology Should Be Preferred to the Linear--A Case of Basketball.
Trninić, Marko; Jeličić, Mario; Papić, Vladan
2015-07-01
In kinesiology, medicine, biology and psychology, in which research focus is on dynamical self-organized systems, complex connections exist between variables. Non-linear nature of complex systems has been discussed and explained by the example of non-linear anthropometric predictors of performance in basketball. Previous studies interpreted relations between anthropometric features and measures of effectiveness in basketball by (a) using linear correlation models, and by (b) including all basketball athletes in the same sample of participants regardless of their playing position. In this paper the significance and character of linear and non-linear relations between simple anthropometric predictors (AP) and performance criteria consisting of situation-related measures of effectiveness (SE) in basketball were determined and evaluated. The sample of participants consisted of top-level junior basketball players divided in three groups according to their playing time (8 minutes and more per game) and playing position: guards (N = 42), forwards (N = 26) and centers (N = 40). Linear (general model) and non-linear (general model) regression models were calculated simultaneously and separately for each group. The conclusion is viable: non-linear regressions are frequently superior to linear correlations when interpreting actual association logic among research variables.
Røislien, Jo; Lossius, Hans Morten; Kristiansen, Thomas
2015-01-01
Background Trauma is a leading global cause of death. Trauma mortality rates are higher in rural areas, constituting a challenge for quality and equality in trauma care. The aim of the study was to explore population density and transport time to hospital care as possible predictors of geographical differences in mortality rates, and to what extent choice of statistical method might affect the analytical results and accompanying clinical conclusions. Methods Using data from the Norwegian Cause of Death registry, deaths from external causes 1998–2007 were analysed. Norway consists of 434 municipalities, and municipality population density and travel time to hospital care were entered as predictors of municipality mortality rates in univariate and multiple regression models of increasing model complexity. We fitted linear regression models with continuous and categorised predictors, as well as piecewise linear and generalised additive models (GAMs). Models were compared using Akaike's information criterion (AIC). Results Population density was an independent predictor of trauma mortality rates, while the contribution of transport time to hospital care was highly dependent on choice of statistical model. A multiple GAM or piecewise linear model was superior, and similar, in terms of AIC. However, while transport time was statistically significant in multiple models with piecewise linear or categorised predictors, it was not in GAM or standard linear regression. Conclusions Population density is an independent predictor of trauma mortality rates. The added explanatory value of transport time to hospital care is marginal and model-dependent, highlighting the importance of exploring several statistical models when studying complex associations in observational data. PMID:25972600
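Comparing a straight-line fit with a piecewise linear fit by AIC, as in the abstract above, can be sketched as follows. The data are simulated, the breakpoint at 5 is assumed known, and the AIC is the Gaussian least-squares form up to an additive constant, so only differences between models are meaningful. None of this reproduces the registry analysis.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients and fitted values."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta

def gaussian_aic(y, fitted, n_params):
    """AIC for a least-squares fit, up to a constant shared by all models."""
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(rss / n) + 2 * n_params

# Simulated predictor and outcome with a change of slope at x = 5.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = np.where(x < 5.0, 2.0 * x, 10.0 + 0.2 * (x - 5.0)) + rng.normal(0.0, 0.3, x.size)

X_linear = np.column_stack([np.ones_like(x), x])
X_piecewise = np.column_stack([np.ones_like(x), x, np.maximum(x - 5.0, 0.0)])

_, fitted_linear = fit_ols(X_linear, y)
_, fitted_piecewise = fit_ols(X_piecewise, y)

aic_linear = gaussian_aic(y, fitted_linear, n_params=2)
aic_piecewise = gaussian_aic(y, fitted_piecewise, n_params=3)
print(f"AIC linear = {aic_linear:.1f}, AIC piecewise = {aic_piecewise:.1f}")
```

The hinge term max(x - 5, 0) keeps the fitted line continuous at the breakpoint while letting the slope change, at the cost of one extra parameter that AIC penalizes.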
Partitioning sources of variation in vertebrate species richness
Boone, R.B.; Krohn, W.B.
2000-01-01
Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km²) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%).
Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.
Claessens, T E; Georgakopoulos, D; Afanasyeva, M; Vermeersch, S J; Millar, H D; Stergiopulos, N; Westerhof, N; Verdonck, P R; Segers, P
2006-04-01
The linear time-varying elastance theory is frequently used to describe the change in ventricular stiffness during the cardiac cycle. The concept assumes that all isochrones (i.e., curves that connect pressure-volume data occurring at the same time) are linear and have a common volume intercept. Of specific interest is the steepest isochrone, the end-systolic pressure-volume relationship (ESPVR), of which the slope serves as an index for cardiac contractile function. Pressure-volume measurements, achieved with a combined pressure-conductance catheter in the left ventricle of 13 open-chest anesthetized mice, showed a marked curvilinearity of the isochrones. We therefore analyzed the shape of the isochrones by using six regression algorithms (two linear, two quadratic, and two logarithmic, each with a fixed or time-varying intercept) and discussed the consequences for the elastance concept. Our main observations were 1) the volume intercept varies considerably with time; 2) isochrones are equally well described by using quadratic or logarithmic regression; 3) linear regression with a fixed intercept shows poor correlation (R² < 0.75) during isovolumic relaxation and early filling; and 4) logarithmic regression is superior in estimating the fixed volume intercept of the ESPVR. In conclusion, the linear time-varying elastance fails to provide a sufficiently robust model to account for changes in pressure and volume during the cardiac cycle in the mouse ventricle. A new framework accounting for the nonlinear shape of the isochrones needs to be developed.
MULTIPLE LINEAR REGRESSION FOR LAKE ICE AND LAKE TEMPERATURE CHARACTERISTICS. (R824801)
The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
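Linear regression with robust standard errors, the approach recommended in the abstract above, can be sketched with the basic heteroscedasticity-consistent (HC0, "sandwich") estimator. The abstract does not say which robust variant the authors used, so HC0 is an assumption here, and the data and function name are hypothetical.

```python
import numpy as np

def ols_with_robust_se(X, y):
    """OLS coefficients with HC0 sandwich standard errors.

    Covariance = (X'X)^-1 [X' diag(e^2) X] (X'X)^-1, where e are the residuals.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (X * resid[:, None] ** 2)   # X' diag(e^2) X without forming diag
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated outcome whose noise grows with the predictor (unequal variance).
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 10.0, 200)
y = 1.0 + 0.5 * x + rng.normal(0.0, 0.1 * (1.0 + x))
X = np.column_stack([np.ones_like(x), x])

beta, robust_se = ols_with_robust_se(X, y)
print("coefficients:", beta)
print("robust SEs:  ", robust_se)
```

The coefficients are the ordinary least-squares estimates; only the standard errors change, which is why this option keeps the full score as the outcome while relaxing the equal-variance assumption.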
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.
2013-01-01
Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839
RESOLUTION OF THE DESTRUCTIVE EFFECT OF NOISE ON LINEAR REGRESSION OF TWO TIME SERIES. (R825260)
Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method
Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza
2016-01-01
Introduction: Birthweight is one of the most important predicting indicators of the health status in adulthood. Having a balanced birthweight is one of the priorities of the health system in most of the industrial and developed countries. This indicator is used to assess the growth and health status of the infants. The aim of this study was to assess the birthweight of the neonates by using quantile regression in Zanjan province. Methods: This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and quantile regression methods in SAS 9.2 statistical software. Results: Of 8456 newborns, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of neonates (p<0.05), weight and educational level of the mothers (p<0.05) showed a linear significant relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). Conclusion: This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when the response variable is asymmetric or the data contain outliers. PMID:26925889
Do State Examinations Measure Teacher Quality?
ERIC Educational Resources Information Center
Harrell, Pamela Esprivalo
2009-01-01
This study investigates teacher content knowledge of candidates enrolled in an online graduate teacher certification programme. Descriptive data and linear regression were used to draw conclusions about the content area knowledge of individuals in the sample and the significance of the predictors examined. Descriptive data show 1/3 of the 8-12…
2012-01-01
Background The objective of this study was to scrutinize number line estimation behaviors displayed by children in mathematics classrooms during the first three years of schooling. We extend existing research by not only mapping potential logarithmic-linear shifts but also provide a new perspective by studying in detail the estimation strategies of individual target digits within a number range familiar to children. Methods Typically developing children (n = 67) from Years 1-3 completed a number-to-position numerical estimation task (0-20 number line). Estimation behaviors were first analyzed via logarithmic and linear regression modeling. Subsequently, using an analysis of variance we compared the estimation accuracy of each digit, thus identifying target digits that were estimated with the assistance of arithmetic strategy. Results Our results further confirm a developmental logarithmic-linear shift when utilizing regression modeling; however, uniquely we have identified that children employ variable strategies when completing numerical estimation, with levels of strategy advancing with development. Conclusion In terms of the existing cognitive research, this strategy factor highlights the limitations of any regression modeling approach, or alternatively, it could underpin the developmental time course of the logarithmic-linear shift. Future studies need to systematically investigate this relationship and also consider the implications for educational practice. PMID:22217191
Terza, Joseph V; Bradford, W David; Dismuke, Clara E
2008-01-01
Objective To investigate potential bias in the use of the conventional linear instrumental variables (IV) method for the estimation of causal effects in inherently nonlinear regression settings. Data Sources Smoking Supplement to the 1979 National Health Interview Survey, National Longitudinal Alcohol Epidemiologic Survey, and simulated data. Study Design Potential bias from the use of the linear IV method in nonlinear models is assessed via simulation studies and real world data analyses in two commonly encountered regression settings: (1) models with a nonnegative outcome (e.g., a count) and a continuous endogenous regressor; and (2) models with a binary outcome and a binary endogenous regressor. Principal Findings The simulation analyses show that substantial bias in the estimation of causal effects can result from applying the conventional IV method in inherently nonlinear regression settings. Moreover, the bias is not attenuated as the sample size increases. This point is further illustrated in the survey data analyses in which IV-based estimates of the relevant causal effects diverge substantially from those obtained with appropriate nonlinear estimation methods. Conclusions We offer this research as a cautionary note to those who would opt for the use of linear specifications in inherently nonlinear settings involving endogeneity. PMID:18546544
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
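Of the concepts listed in the abstract above, leverage is the easiest to show numerically. In simple linear regression the leverage of observation i is h_i = 1/n + (x_i - mean(x))² / Σ(x_j - mean(x))². The sketch below uses that standard formula with a hypothetical small dataset; it is not taken from the article.

```python
import numpy as np

def leverage(x):
    """Leverage (hat value) of each observation in simple linear regression."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    return 1.0 / x.size + dev ** 2 / np.sum(dev ** 2)

# Hypothetical predictor values; the last point lies far from the others.
x = np.array([1.0, 2.0, 3.0, 4.0, 20.0])
h = leverage(x)
print(np.round(h, 3))
```

Leverage depends only on the predictor values, not on the outcome, and the values sum to 2 here because the model has two parameters (intercept and slope). A point far from the mean of x, like the last one, therefore has high potential to pull the fitted line toward itself.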
Theobald, Roddy; Freeman, Scott
2014-01-01
Although researchers in undergraduate science, technology, engineering, and mathematics education are currently using several methods to analyze learning gains from pre- and posttest data, the most commonly used approaches have significant shortcomings. Chief among these is the inability to distinguish whether differences in learning gains are due to the effect of an instructional intervention or to differences in student characteristics when students cannot be assigned to control and treatment groups at random. Using pre- and posttest scores from an introductory biology course, we illustrate how the methods currently in wide use can lead to erroneous conclusions, and how multiple linear regression offers an effective framework for distinguishing the impact of an instructional intervention from the impact of student characteristics on test score gains. In general, we recommend that researchers always use student-level regression models that control for possible differences in student ability and preparation to estimate the effect of any nonrandomized instructional intervention on student performance. PMID:24591502
Learning accurate and interpretable models based on regularized random forests regression
2014-01-01
Background Many biology-related studies combine data from multiple sources in an effort to understand the underlying problems. It is important to find and interpret the most important information from these sources. Thus it will be beneficial to have an effective algorithm that can simultaneously extract decision rules and select critical features for good interpretation while preserving the prediction performance. Methods In this study, we focus on regression problems for biological data where target outcomes are continuous. In general, models constructed from linear regression approaches are relatively easy to interpret. However, many practical biological applications are nonlinear in essence, and a direct linear relationship between input and output can hardly be found. Nonlinear regression techniques can reveal nonlinear relationships in data, but are generally hard for humans to interpret. We propose a rule based regression algorithm that uses 1-norm regularized random forests. The proposed approach simultaneously extracts a small number of rules from generated random forests and eliminates unimportant features. Results We tested the approach on some biological data sets. The proposed approach is able to construct a significantly smaller set of regression rules using a subset of attributes while achieving prediction performance comparable to that of random forests regression. Conclusion The approach demonstrates high potential in aiding prediction and interpretation of nonlinear relationships of the subject being studied. PMID:25350120
2007-01-05
positive / false negatives. The quantitative on-site methods were evaluated using linear regression analysis and relative percent difference (RPD) comparison... Conclusion... 3.2 Quantitative Analysis Using CRREL... 3.3 Quantitative Analysis for NG by GC/TID... 3.3.1 Introduction
Prenatal Lead Exposure and Fetal Growth: Smaller Infants Have Heightened Susceptibility
Rodosthenous, Rodosthenis S.; Burris, Heather H.; Svensson, Katherine; Amarasiriwardena, Chitra J.; Cantoral, Alejandra; Schnaas, Lourdes; Mercado-García, Adriana; Coull, Brent A.; Wright, Robert O.; Téllez-Rojo, Martha M.; Baccarelli, Andrea A.
2016-01-01
Background As population lead levels decrease, the toxic effects of lead may be distributed to more sensitive populations, such as infants with poor fetal growth. Objectives To determine the association of prenatal lead exposure and fetal growth; and to evaluate whether infants with poor fetal growth are more susceptible to lead toxicity than those with normal fetal growth. Methods We examined the association of second trimester maternal blood lead levels (BLL) with birthweight-for-gestational age (BWGA) z-score in 944 mother-infant participants of the PROGRESS cohort. We determined the association between maternal BLL and BWGA z-score by using both linear and quantile regression. We estimated odds ratios for small-for-gestational age (SGA) infants between maternal BLL quartiles using logistic regression. Maternal age, body mass index, socioeconomic status, parity, household smoking exposure, hemoglobin levels, and infant sex were included as confounders. Results While linear regression showed a negative association between maternal BLL and BWGA z-score (β=−0.06 z-score units per log2 BLL increase; 95% CI: −0.13, 0.003; P=0.06), quantile regression revealed larger magnitudes of this association in the <30th percentiles of BWGA z-score (β range [−0.08, −0.13] z-score units per log2 BLL increase; all P values <0.05). Mothers in the highest BLL quartile had an odds ratio of 1.62 (95% CI: 0.99–2.65) for having an SGA infant compared to the lowest BLL quartile. Conclusions While both linear and quantile regression showed a negative association between prenatal lead exposure and birthweight, quantile regression revealed that smaller infants may represent a more susceptible subpopulation. PMID:27923585
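To make the contrast between linear and quantile regression in this abstract more concrete: quantile regression replaces the squared-error criterion with the check (pinball) loss, whose minimizer is a quantile rather than the mean. The sketch below is only an illustration of that loss on a constant-only model; the function names and the `z_scores` values are hypothetical and unrelated to the PROGRESS cohort data.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for one residual (observed minus predicted) at level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def best_constant(values, tau):
    """Among the observed values, pick the constant prediction with the lowest total loss."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Hypothetical outcome values with one large positive outlier.
z_scores = [-2.0, -1.0, 0.0, 1.0, 5.0]
median_fit = best_constant(z_scores, 0.5)  # minimizer at tau = 0.5 is the median
lower_fit = best_constant(z_scores, 0.3)   # minimizer at tau = 0.3 sits in the lower tail
```

A full quantile regression applies the same loss to residuals from a model with covariates, so the estimated coefficients can differ across levels of tau, which is the pattern the abstract reports for the lower percentiles.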
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
A Study of the Effect of the Front-End Styling of Sport Utility Vehicles on Pedestrian Head Injuries
Qin, Qin; Chen, Zheng; Bai, Zhonghao; Cao, Libo
2018-01-01
Background The number of sport utility vehicles (SUVs) on the Chinese market is continuously increasing. It is necessary to investigate the relationships between the front-end styling features of SUVs and head injuries at the styling design stage to improve pedestrian protection performance and product development efficiency. Methods Styling feature parameters were extracted from the SUV side contour line, and simplified finite element models were established based on 78 SUV side contour lines. Pedestrian headform impact simulations were performed and validated. The head injury criterion of 15 ms (HIC15) at four wrap-around distances was obtained. A multiple linear regression analysis method was employed to describe the relationships between the styling feature parameters and the HIC15 at each impact point. Results The relationship between the selected styling features and the HIC15 showed reasonable correlations, and the regression models and the selected independent variables showed statistical significance. Conclusions The regression equations obtained by multiple linear regression can be used to assess the performance of SUV styling in protecting pedestrians' heads and provide styling designers with technical guidance regarding their artistic creations.
Regression dilution bias: tools for correction methods and sample size calculation.
Berglund, Lars
2012-08-01
Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
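The slope correction described in this abstract can be illustrated with a short sketch. It assumes the classical measurement error model, in which the observed slope equals the true slope multiplied by a reliability ratio; the function names and numbers are hypothetical and this is not the software supplied with the article.

```python
def reliability_ratio(var_true, var_error):
    """Share of the observed risk-factor variance that is due to true between-person variation."""
    return var_true / (var_true + var_error)


def corrected_slope(observed_slope, reliability):
    """Undo regression dilution by dividing the observed slope by the reliability ratio."""
    if not 0 < reliability <= 1:
        raise ValueError("reliability ratio must lie in (0, 1]")
    return observed_slope / reliability


# Hypothetical values: measurement error variance is one third of the true variance.
ratio = reliability_ratio(var_true=3.0, var_error=1.0)  # 0.75
slope = corrected_slope(observed_slope=0.6, reliability=ratio)
```

In practice the reliability ratio is itself estimated from the repeated measurements in the reliability study, so the corrected slope carries additional uncertainty that this sketch does not propagate.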
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Pattern Recognition Analysis of Age-Related Retinal Ganglion Cell Signatures in the Human Eye
Yoshioka, Nayuta; Zangerl, Barbara; Nivison-Smith, Lisa; Khuu, Sieu K.; Jones, Bryan W.; Pfeiffer, Rebecca L.; Marc, Robert E.; Kalloniatis, Michael
2017-01-01
Purpose To characterize macular ganglion cell layer (GCL) changes with age and provide a framework to assess changes in ocular disease. This study used data clustering to analyze macular GCL patterns from optical coherence tomography (OCT) in a large cohort of subjects without ocular disease. Methods Single eyes of 201 patients evaluated at the Centre for Eye Health (Sydney, Australia) were retrospectively enrolled (age range, 20–85); 8 × 8 grid locations obtained from Spectralis OCT macular scans were analyzed with unsupervised classification into statistically separable classes sharing common GCL thickness and change with age. The resulting classes and gridwise data were fitted with linear and segmented linear regression curves. Additionally, normalized data were analyzed to determine regression as a percentage. Accuracy of each model was examined through comparison of predicted 50-year-old equivalent macular GCL thickness for the entire cohort to a true 50-year-old reference cohort. Results Pattern recognition clustered GCL thickness across the macula into five to eight spatially concentric classes. F-test demonstrated segmented linear regression to be the most appropriate model for macular GCL change. The pattern recognition–derived and normalized model revealed less difference between the predicted macular GCL thickness and the reference cohort (average ± SD 0.19 ± 0.92 and −0.30 ± 0.61 μm) than a gridwise model (average ± SD 0.62 ± 1.43 μm). Conclusions Pattern recognition successfully identified statistically separable macular areas that undergo a segmented linear reduction with age. This regression model better predicted macular GCL thickness. The various unique spatial patterns revealed by pattern recognition combined with core GCL thickness data provide a framework to analyze GCL loss in ocular disease. PMID:28632847
do Prado, Mara Rúbia Maciel Cardoso; Oliveira, Fabiana de Cássia Carvalho; Assis, Karine Franklin; Ribeiro, Sarah Aparecida Vieira; do Prado, Pedro Paulo; Sant'Ana, Luciana Ferreira da Rocha; Priore, Silvia Eloiza; Franceschini, Sylvia do Carmo Castro
2015-01-01
Abstract Objective: To assess the prevalence of vitamin D deficiency and its associated factors in women and their newborns in the postpartum period. Methods: This cross-sectional study evaluated vitamin D deficiency/insufficiency in 226 women and their newborns in Viçosa (Minas Gerais, BR) between December 2011 and November 2012. Cord blood and venous maternal blood were collected to evaluate the following biochemical parameters: vitamin D, alkaline phosphatase, calcium, phosphorus and parathyroid hormone. Poisson regression analysis, with a confidence interval of 95%, was applied to assess vitamin D deficiency and its associated factors. Multiple linear regression analysis was performed to identify factors associated with 25(OH)D deficiency in the newborns and women from the study. The criterion for variable inclusion in the multiple linear regression model was an association with the dependent variable in the simple linear regression analysis at p<0.20. The significance level was α = 5%. Results: Of the 226 women included, 200 (88.5%) were 20-44 years old; the median age was 28 years. Deficient/insufficient levels of vitamin D were found in 192 (85%) women and in 182 (80.5%) neonates. The maternal 25(OH)D and alkaline phosphatase levels were independently associated with vitamin D deficiency in infants. Conclusions: This study identified a high prevalence of vitamin D deficiency and insufficiency in women and newborns and the association between maternal nutritional status of vitamin D and their infants' vitamin D status. PMID:26100593
Liu, Yan; Salvendy, Gavriel
2009-05-01
This paper aims to demonstrate the effects of measurement errors on psychometric measurements in ergonomics studies. A variety of sources can cause random measurement errors in ergonomics studies and these errors can distort virtually every statistic computed and lead investigators to erroneous conclusions. The effects of measurement errors on the five most widely used statistical analysis tools are discussed and illustrated: correlation; ANOVA; linear regression; factor analysis; linear discriminant analysis. It is shown that measurement errors can greatly attenuate correlations between variables, reduce the statistical power of ANOVA, distort (overestimate, underestimate or even change the sign of) regression coefficients, underrate the explanatory contributions of the most important factors in factor analysis, and depreciate the significance of the discriminant function and the discrimination abilities of individual variables in discriminant analysis. The discussion is restricted to subjective scales and survey methods and their reliability estimates. Other methods applied in ergonomics research, such as physical and electrophysiological measurements and chemical and biomedical analysis methods, also have issues of measurement errors, but they are beyond the scope of this paper. As there has been increasing interest in the development and testing of theories in ergonomics research, it has become very important for ergonomics researchers to understand the effects of measurement errors on their experimental results, which the authors believe is critical to research progress in theory development and cumulative knowledge in the ergonomics field.
Explicit criteria for prioritization of cataract surgery
Ma Quintana, José; Escobar, Antonio; Bilbao, Amaia
2006-01-01
Background Consensus techniques have been used previously to create explicit criteria to prioritize cataract extraction; however, the appropriateness of the intervention was not included explicitly in previous studies. We developed a prioritization tool for cataract extraction according to the RAND method. Methods Criteria were developed using a modified Delphi panel judgment process. A panel of 11 ophthalmologists was assembled. Ratings were analyzed regarding the level of agreement among panelists. We studied the effect of all variables on the final panel score using general linear and logistic regression models. Priority scoring systems were developed by means of optimal scaling and general linear models. The explicit criteria developed were summarized by means of regression tree analysis. Results Eight variables were considered to create the indications. Of the 310 indications that the panel evaluated, 22.6% were considered high priority, 52.3% intermediate priority, and 25.2% low priority. Agreement was reached for 31.9% of the indications and disagreement for 0.3%. Logistic regression and general linear models showed that the preoperative visual acuity of the cataractous eye, visual function, and anticipated visual acuity postoperatively were the most influential variables. Alternative and simple scoring systems were obtained by optimal scaling and general linear models where the previous variables were also the most important. The decision tree also shows the importance of the previous variables and the appropriateness of the intervention. Conclusion Our results showed acceptable validity as an evaluation and management tool for prioritizing cataract extraction. It also provides easy algorithms for use in clinical practice. PMID:16512893
NASA Astrophysics Data System (ADS)
Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said
2014-09-01
In the presence of multicollinearity and multiple outliers, statistical inference for a linear regression model using ordinary least squares (OLS) estimators is severely affected and produces misleading results. To overcome this, many approaches have been investigated. These include robust methods, which have been reported to be less sensitive to the presence of outliers. In addition, the ridge regression technique has been employed to tackle the multicollinearity problem. In order to mitigate both problems, a combination of ridge regression and robust methods was discussed in this study. The superiority of this approach was examined when multicollinearity and multiple outliers occurred simultaneously in multiple linear regression. This study aimed to look at the performance of several well-known estimators in such a situation: M, MM, RIDGE, and the robust ridge regression estimators, namely Weighted Ridge M-estimator (WRM), Weighted Ridge MM (WRMM), and Ridge MM (RMM). Results of the study showed that in the presence of simultaneous multicollinearity and multiple outliers (in both x and y-direction), the RMM and RIDGE are more or less similar in terms of superiority over the other estimators, regardless of the number of observations, level of collinearity and percentage of outliers used. However, when outliers occurred in only a single direction (y-direction), the WRMM estimator is the most superior among the robust ridge regression estimators, producing the least variance. In conclusion, robust ridge regression is the best alternative to robust and conventional least squares estimators when dealing with the simultaneous presence of multicollinearity and outliers.
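The ridge component of the estimators compared in this abstract can be sketched as follows. This is a minimal illustration of plain ridge regression only (it does not implement the robust M, MM, WRM, WRMM or RMM estimators); the two-predictor, no-intercept setup, the function name and the data are hypothetical.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Solve (X'X + lam * I) b = X'y for two predictors and no intercept."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    r1 = sum(a * c for a, c in zip(x1, y))
    r2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("singular system; increase lam")
    # Cramer's rule for the 2 x 2 system.
    return (r1 * s22 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det


# Hypothetical, nearly collinear predictors with y = 2*x1 + 3*x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 2.1, 2.9, 4.2]
y = [2 * a + 3 * b for a, b in zip(x1, x2)]

ols = ridge_two_predictors(x1, x2, y, lam=0.0)     # ordinary least squares
shrunk = ridge_two_predictors(x1, x2, y, lam=1.0)  # coefficients pulled toward zero
```

Setting `lam` to zero recovers ordinary least squares; a positive `lam` stabilizes the near-singular system at the cost of some bias, which is the trade-off the ridge-based estimators in the study exploit.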
Kumar, K Vasanth
2007-04-02
Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to the pseudo-second order models of Ho, Sobkowsk and Czerwinski, Blanchard et al., and Ritchie by linear and non-linear regression methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and the Ritchie pseudo-second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo-second order model but with different assumptions. The best fit of the experimental data to Ho's pseudo-second order expression by linear and non-linear regression showed that Ho's pseudo-second order model was a better kinetic expression than the other pseudo-second order kinetic expressions.
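As an illustration of the linear route mentioned in this abstract, the sketch below fits a pseudo-second order curve through its commonly used linearization, t/q_t = 1/(k q_e^2) + t/q_e, which is assumed here to correspond to the Ho form. The parameter values and function names are hypothetical, and the data are synthetic and noise-free, so the fit recovers the inputs; with real, noisy data the linearized and non-linear fits generally differ, which is the point the abstract makes.

```python
def pseudo_second_order(t, qe, k):
    """Uptake q_t at time t for equilibrium uptake qe and rate constant k."""
    return k * qe ** 2 * t / (1 + k * qe * t)


def fit_linearized(times, uptakes):
    """Least-squares line through (t, t/q_t); returns the implied (qe, k)."""
    ys = [t / q for t, q in zip(times, uptakes)]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    slope = (sum((t - mean_t) * (v - mean_y) for t, v in zip(times, ys))
             / sum((t - mean_t) ** 2 for t in times))
    intercept = mean_y - slope * mean_t
    qe = 1 / slope                # slope equals 1 / qe
    k = slope ** 2 / intercept    # intercept equals 1 / (k * qe**2)
    return qe, k


# Synthetic, noise-free data from hypothetical parameters qe = 50, k = 0.002.
times = [5.0, 10.0, 20.0, 40.0, 80.0]
uptakes = [pseudo_second_order(t, 50.0, 0.002) for t in times]
qe_hat, k_hat = fit_linearized(times, uptakes)
```

Because the transformation t/q_t changes the error structure of the data, the least-squares estimates from the linearized form are not in general the same as those from a direct non-linear fit of q_t.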
2013-01-01
Background This study aims to improve the accuracy of Bioelectrical Impedance Analysis (BIA) prediction equations for estimating fat free mass (FFM) of the elderly by using a non-linear Back Propagation Artificial Neural Network (BP-ANN) model, and to compare its predictive accuracy with that of the linear regression model, using dual-energy X-ray absorptiometry (DXA) as the reference method. Methods A total of 88 Taiwanese elderly adults were recruited as subjects. Linear regression equations and a BP-ANN prediction equation were developed using impedances and other anthropometrics for predicting the reference FFM measured by DXA (FFMDXA) in 36 male and 26 female Taiwanese elderly adults. The FFM estimated by BIA prediction equations using the traditional linear regression model (FFMLR) and the BP-ANN model (FFMANN) were compared to the FFMDXA. The measurements of an additional 26 elderly adults were used to validate the accuracy of the predictive models. Results The significant predictors in the developed linear model (LR) for predicting FFM were impedance, gender, age, height and weight (coefficient of determination, r² = 0.940; standard error of estimate (SEE) = 2.729 kg; root mean square error (RMSE) = 2.571 kg, P < 0.001). The above predictors were set as the variables of the input layer, using five neurons, in the BP-ANN model (r² = 0.987 with SD = 1.192 kg and a relatively lower RMSE = 1.183 kg), which had greater (improved) accuracy for estimating FFM when compared with the linear model. The results showed better agreement between FFMANN and FFMDXA than between FFMLR and FFMDXA. Conclusion When the performance of the developed prediction equations for estimating reference FFMDXA was compared, the linear model had a lower r² and a larger SD in its predictions than the BP-ANN model, which indicates that the ANN model is more suitable for estimating FFM. PMID:23388042
Relationship between age and elite marathon race time in world single age records from 5 to 93 years
2014-01-01
Background The aims of the study were (i) to investigate the relationship between elite marathon race times and age in 1-year intervals by using the world single age records in marathon running from 5 to 93 years and (ii) to evaluate the sex difference in elite marathon running performance with advancing age. Methods World single age records in marathon running in 1-year intervals for women and men were analysed for changes across age using linear and non-linear regression analyses. Results The relationship between elite marathon race time and age was non-linear (i.e. polynomial regression 4th degree) for women and men. The curve was U-shaped where performance improved from 5 to ~20 years. From 5 years to ~15 years, boys and girls performed very similarly. Between ~20 and ~35 years, performance was quite linear, but started to decrease at the age of ~35 years in a curvilinear manner with increasing age in both women and men. The sex difference increased non-linearly (i.e. polynomial regression 7th degree) from 5 to ~20 years, remained unchanged at ~20 min from ~20 to ~50 years and increased thereafter. The sex difference was lowest (7.5%, 10.5 min) at the age of 49 years. Conclusion Elite marathon race times improved from 5 to ~20 years, remained linear between ~20 and ~35 years, and started to increase at the age of ~35 years in a curvilinear manner with increasing age in both women and men. The sex difference in elite marathon race time increased non-linearly and was lowest at the age of ~49 years. PMID:25120915
Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data
Ying, Gui-shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard
2017-01-01
Purpose To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. Methods We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field data in the elderly. Results When refractive error from both eyes was analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI −0.03 to 0.32D, P=0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with narrower 95% CI (0.01 to 0.28D, P=0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller P-values, while analysis of the worse eye provided larger P-values than mixed effects models and marginal models. Conclusion In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision. PMID:28102741
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
NASA Astrophysics Data System (ADS)
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. This research examined the relationship between paddy yield and 20 topsoil variates analyzed prior to planting at standard fertilizer rates. The data were from the multi-location trials for rice carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of the multiple linear regression model and the fuzzy c-means method. Analyses of normality and multicollinearity indicate that the data are normally distributed without multicollinearity among the independent variables. Fuzzy c-means analysis clustered the paddy yield into two clusters before the multiple linear regression model was applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model, with a lower mean square error.
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Geographical variation of cerebrovascular disease in New York State: the correlation with income
Han, Daikwon; Carrow, Shannon S; Rogerson, Peter A; Munschauer, Frederick E
2005-01-01
Background Income is known to be associated with cerebrovascular disease; however, little is known about the more detailed relationship between cerebrovascular disease and income. We examined the hypothesis that the geographical distribution of cerebrovascular disease in New York State may be predicted by a nonlinear model using income as a surrogate socioeconomic risk factor. Results We used spatial clustering methods to identify areas with high and low prevalence of cerebrovascular disease at the ZIP code level after smoothing rates and correcting for edge effects; geographic locations of high and low clusters of cerebrovascular disease in New York State were identified with and without income adjustment. To examine effects of income, we calculated the excess number of cases using a non-linear regression with cerebrovascular disease rates taken as the dependent variable and income and income squared taken as independent variables. The resulting regression equation was: excess rate = 32.075 − 1.22×10^−4 (income) + 8.068×10^−10 (income^2), and both income and income squared variables were significant at the 0.01 level. When income was included as a covariate in the non-linear regression, the number and size of clusters of high cerebrovascular disease prevalence decreased. Some 87 ZIP codes exceeded the critical value of the local statistic yielding a relative risk of 1.2. The majority of low cerebrovascular disease prevalence geographic clusters disappeared when the non-linear income effect was included. For linear regression, the excess rate of cerebrovascular disease falls with income; each $10,000 increase in median income of each ZIP code resulted in an average reduction of 3.83 observed cases. The significant nonlinear effect indicates a lessening of this income effect with increasing income.
Conclusion Income is a non-linear predictor of excess cerebrovascular disease rates, with both low and high observed cerebrovascular disease rate areas associated with higher income. Income alone explains a significant amount of the geographical variance in cerebrovascular disease across New York State since both high and low clusters of cerebrovascular disease dissipate or disappear with income adjustment. Geographical modeling, including non-linear effects of income, may allow for better identification of other non-traditional risk factors. PMID:16242043
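The quadratic fit reported in this abstract can be explored numerically. The following is a minimal sketch, assuming income is expressed in dollars (the abstract does not state the unit); the function names are hypothetical and only the three coefficients come from the abstract.

```python
# Coefficients as reported in the abstract; the income unit (dollars) is an assumption.
B0, B1, B2 = 32.075, -1.22e-4, 8.068e-10

def excess_rate(income):
    """Fitted excess rate for a given ZIP-code median income."""
    return B0 + B1 * income + B2 * income ** 2

def marginal_effect(income):
    """Derivative of the fitted curve with respect to income."""
    return B1 + 2 * B2 * income

# Income at which the fitted curve stops falling (vertex of the parabola).
turning_point = -B1 / (2 * B2)

for income in (20_000, 40_000, 60_000):
    print(income, round(excess_rate(income), 3), marginal_effect(income))
print("turning point:", round(turning_point))
```

The marginal effect is negative at lower incomes and shrinks in magnitude as income rises, which is the "lessening of this income effect" the abstract describes.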
Linear regression crash prediction models : issues and proposed solutions.
DOT National Transportation Integrated Search
2010-05-01
The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novice data aggregation to satisfy linear regression assumptions; namely error structure normality ...
Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment
ERIC Educational Resources Information Center
Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos
2013-01-01
In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…
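The contrast between a linearized fit and a direct nonlinear fit can be sketched as follows. This is an illustration under stated assumptions, not the authors' experiment: the exponential model y = a*exp(b*x), the synthetic data, and all function names are hypothetical. The linearized fit applies ordinary least squares to ln(y), which reweights the errors; the nonlinear fit refines those estimates with plain Gauss-Newton steps on the original scale.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_linearized(xs, ys):
    """Fit y = a*exp(b*x) by OLS on ln(y); requires every y > 0."""
    ln_a, b = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), b

def fit_nonlinear(xs, ys, a, b, iterations=50):
    """Gauss-Newton refinement minimizing sum (y - a*exp(b*x))^2."""
    for _ in range(iterations):
        j1 = [math.exp(b * x) for x in xs]            # d(model)/da
        j2 = [a * x * e for x, e in zip(xs, j1)]      # d(model)/db
        r = [y - a * e for y, e in zip(ys, j1)]       # residuals
        # Normal equations (J^T J) delta = J^T r, solved by Cramer's rule.
        s11 = sum(u * u for u in j1)
        s12 = sum(u * v for u, v in zip(j1, j2))
        s22 = sum(v * v for v in j2)
        g1 = sum(u * w for u, w in zip(j1, r))
        g2 = sum(v * w for v, w in zip(j2, r))
        det = s11 * s22 - s12 * s12
        a += (g1 * s22 - g2 * s12) / det
        b += (g2 * s11 - g1 * s12) / det
    return a, b

def sse(xs, ys, a, b):
    """Sum of squared errors on the original (untransformed) scale."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

# Synthetic data: true a = 100, b = -0.5, with additive errors of constant size.
xs = [0, 1, 2, 3, 4, 5]
noise = [2, -2, 2, -2, 2, -2]
ys = [100 * math.exp(-0.5 * x) + e for x, e in zip(xs, noise)]

a_lin, b_lin = fit_linearized(xs, ys)
a_nl, b_nl = fit_nonlinear(xs, ys, a_lin, b_lin)
print("linearized:", a_lin, b_lin, sse(xs, ys, a_lin, b_lin))
print("nonlinear: ", a_nl, b_nl, sse(xs, ys, a_nl, b_nl))
```

Because the errors here are additive on the original scale, the log transform inflates the influence of the small-y points, which is one way the linearized estimates can drift from the least-squares ones. A production fit would normally use a library routine with step control rather than bare Gauss-Newton.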
Female Literacy Rate is a Better Predictor of Birth Rate and Infant Mortality Rate in India
Saurabh, Suman; Sarkar, Sonali; Pandey, Dhruv K.
2013-01-01
Background: Educated women are known to take informed reproductive and healthcare decisions. These result in population stabilization and better infant care reflected by lower birth rates and infant mortality rates (IMRs), respectively. Materials and Methods: Our objective was to study the relationship of male and female literacy rates with crude birth rates (CBRs) and IMRs of the states and union territories (UTs) of India. The data were analyzed using linear regression. CBR and IMR were taken as the dependent variables; while the overall literacy rates, male, and female literacy rates were the independent variables. Results: CBRs were inversely related to literacy rates (slope parameter = −0.402, P < 0.001). On multiple linear regression with male and female literacy rates, a significant inverse relationship emerged between female literacy rate and CBR (slope = −0.363, P < 0.001), while male literacy rate was not significantly related to CBR (P = 0.674). IMR of the states were also inversely related to their literacy rates (slope = −1.254, P < 0.001). Multiple linear regression revealed a significant inverse relationship between IMR and female literacy (slope = −0.816, P = 0.031), whereas male literacy rate was not significantly related (P = 0.630). Conclusion: Female literacy is relatively highly important for both population stabilization and better infant health. PMID:26664840
Vitello, Dominic J; Ripper, Richard M; Fettiplace, Michael R; Weinberg, Guy L; Vitello, Joseph M
2015-01-01
Purpose. The gravimetric method of weighing surgical sponges is used to quantify intraoperative blood loss. The dry mass minus the wet mass of the gauze equals the volume of blood lost. This method assumes that the density of blood is equivalent to water (1 gm/mL). This study's purpose was to validate the assumption that the density of blood is equivalent to water and to correlate density with hematocrit. Methods. 50 µL of whole blood was weighed from eighteen rats. A distilled water control was weighed for each blood sample. The averages of the blood and water were compared utilizing a Student's unpaired, one-tailed t-test. The masses of the blood samples and the hematocrits were compared using a linear regression. Results. The average mass of the eighteen blood samples was 0.0489 g and that of the distilled water controls was 0.0492 g. The t-test showed P = 0.2269 and R² = 0.03154. The hematocrit values ranged from 24% to 48%. The linear regression R² value was 0.1767. Conclusions. The R² value comparing the blood and distilled water masses suggests high correlation between the two populations. Linear regression showed the hematocrit was not proportional to the mass of the blood. The study confirmed that the measured density of blood is similar to water.
Kim, Seong-Gil
2018-01-01
Background The purpose of this study was to investigate the effect of ankle ROM and lower-extremity muscle strength on static balance control ability in young adults. Material/Methods This study was conducted with 65 young adults, but 10 young adults dropped out during the measurement, so 55 young adults (male: 19, female: 36) completed the study. Postural sway (length and velocity) was measured with eyes open and closed, and ankle ROM (AROM and PROM of dorsiflexion and plantarflexion) and lower-extremity muscle strength (flexor and extensor of hip, knee, and ankle joint) were measured. Pearson correlation coefficient was used to examine the correlation between variables and static balance ability. Simple linear regression analysis and multiple linear regression analysis were used to examine the effect of variables on static balance ability. Results In correlation analysis, plantarflexion ROM (AROM and PROM) and lower-extremity muscle strength (except hip extensor) were significantly correlated with postural sway (p<0.05). In simple correlation analysis, all variables that passed the correlation analysis procedure had significant influence (p<0.05). In multiple linear regression analysis, plantar flexion PROM with eyes open significantly influenced sway length (B=0.681) and sway velocity (B=0.011). Conclusions Lower-extremity muscle strength and ankle plantarflexion ROM influenced static balance control ability, with ankle plantarflexion PROM showing the greatest influence. Therefore, both contractile structures and non-contractile structures should be of interest when considering static balance control ability improvement. PMID:29760375
Kim, Seong-Gil; Kim, Wan-Soo
2018-05-15
BACKGROUND The purpose of this study was to investigate the effect of ankle ROM and lower-extremity muscle strength on static balance control ability in young adults. MATERIAL AND METHODS This study was conducted with 65 young adults, but 10 young adults dropped out during the measurement, so 55 young adults (male: 19, female: 36) completed the study. Postural sway (length and velocity) was measured with eyes open and closed, and ankle ROM (AROM and PROM of dorsiflexion and plantarflexion) and lower-extremity muscle strength (flexor and extensor of hip, knee, and ankle joint) were measured. Pearson correlation coefficient was used to examine the correlation between variables and static balance ability. Simple linear regression analysis and multiple linear regression analysis were used to examine the effect of variables on static balance ability. RESULTS In correlation analysis, plantarflexion ROM (AROM and PROM) and lower-extremity muscle strength (except hip extensor) were significantly correlated with postural sway (p<0.05). In simple correlation analysis, all variables that passed the correlation analysis procedure had significant influence (p<0.05). In multiple linear regression analysis, plantar flexion PROM with eyes open significantly influenced sway length (B=0.681) and sway velocity (B=0.011). CONCLUSIONS Lower-extremity muscle strength and ankle plantarflexion ROM influenced static balance control ability, with ankle plantarflexion PROM showing the greatest influence. Therefore, both contractile structures and non-contractile structures should be of interest when considering static balance control ability improvement.
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
NASA Astrophysics Data System (ADS)
Nuccitelli, Dana; Cowtan, Kevin; Jacobs, Peter; Richardson, Mark; Way, Robert G.; Blackburn, Anne-Marie; Stolpe, Martin B.; Cook, John
2014-04-01
Lu (2013) (L13) argued that solar effects and anthropogenic halogenated gases can explain most of the observed warming of global mean surface air temperatures since 1850, with virtually no contribution from atmospheric carbon dioxide (CO2) concentrations. Here we show that this conclusion is based on assumptions about the saturation of the CO2-induced greenhouse effect that have been experimentally falsified. L13 also confuses equilibrium and transient response, and relies on data sources that have been superseded due to known inaccuracies. Furthermore, the statistical approach of sequential linear regression artificially shifts variance onto the first predictor. L13's artificial choice of regression order and neglect of other relevant data is the fundamental cause of the incorrect main conclusion. Consideration of more modern data and a more parsimonious multiple regression model leads to contradiction with L13's statistical results. Finally, the correlation arguments in L13 are falsified by considering either the more appropriate metric of global heat accumulation, or data on longer timescales.
NASA Astrophysics Data System (ADS)
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
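The idea of turning sliding-window regressions into a directed, weighted network can be sketched as below. This is a simplified illustration, not the authors' algorithm: patterns here are defined only by the interval containing the window's slope (the abstract also uses significance-test results, which are omitted), and the interval width, data, and function names are hypothetical.

```python
import math
from collections import Counter

def slope(xs, ys):
    """OLS slope of ys on xs (assumes xs is not constant within the window)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx

def pattern(b, width=1.0):
    """Label a slope by the interval [k*width, (k+1)*width) it falls in."""
    k = math.floor(b / width)
    return f"[{k * width:g}, {(k + 1) * width:g})"

def transmission_network(xs, ys, window):
    """Return the pattern sequence (nodes) and transition counts (weighted edges)."""
    patterns = [pattern(slope(xs[i:i + window], ys[i:i + window]))
                for i in range(len(xs) - window + 1)]
    edges = Counter(zip(patterns, patterns[1:]))
    return patterns, edges

# Two synthetic series whose relationship steepens halfway through.
xs = list(range(10))
ys = [0, 0.5, 1, 1.5, 2, 4.5, 7, 9.5, 12, 14.5]

patterns, edges = transmission_network(xs, ys, window=3)
for (src, dst), weight in edges.items():
    print(src, "->", dst, "weight", weight)
```

Self-loops record how long a pattern persists, while edges between different intervals record transitions; measures such as weighted out-degree can then be computed from `edges`.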
Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F
2018-06-01
This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
NASA Technical Reports Server (NTRS)
Jones, Harrison P.; Branston, Detrick D.; Jones, Patricia B.; Popescu, Miruna D.
2002-01-01
An earlier study compared NASA/NSO Spectromagnetograph (SPM) data with spacecraft measurements of total solar irradiance (TSI) variations over a 1.5 year period in the declining phase of solar cycle 22. This paper extends the analysis to an eight-year period which also spans the rising and early maximum phases of cycle 23. The conclusions of the earlier work appear to be robust: three factors (sunspots, strong unipolar regions, and strong mixed polarity regions) describe most of the variation in the SPM record, but only the first two are associated with TSI. Additionally, the residuals of a linear multiple regression of TSI against SPM observations over the entire eight-year period show an unexplained, increasing, linear time variation with a rate of about 0.05 W m^-2 per year. Separate regressions for the periods before and after 1996 January 01 show no unexplained trends but differ substantially in regression parameters. This behavior may reflect a solar source of TSI variations beyond sunspots and faculae but more plausibly results from uncompensated non-solar effects in one or both of the TSI and SPM data sets.
1974-01-01
Digital Image Restoration Under a Regression Model: The Unconstrained, Linear Equality and Inequality Constrained Approaches. January 1974. Nelson Delfino d’Avila Mascarenhas. Image... Report 520 ... a two-dimensional form adequately describes the linear model. A discretization is performed by using quadrature methods. By trans...
Element enrichment factor calculation using grain-size distribution and functional data regression.
Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R
2015-01-01
In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.
Jacobsen, Henrik Børsting; Reme, Silje Endresen; Sembajwe, Grace; Hopcia, Karen; Stiles, Tore C.; Sorensen, Glorian; Porter, James H.; Marino, Miguel; Buxton, Orfeu M.
2014-01-01
Objectives The aim of this study was to investigate the longitudinal effect of work-related stress, sleep deficiency and physical activity on 10-year cardiometabolic risk among an all-female worker population. Methods Data on patient care workers (n=99) was collected two years apart. Baseline measures included: job stress, physical activity, night work and sleep deficiency. Biomarkers and objective measurements were used to estimate 10-year cardiometabolic risk at follow-up. Significant associations (P<0.05) from baseline analyses were used to build a multivariable linear regression model. Results The participants were mostly white nurses with a mean age of 41 years. Adjusted linear regression showed that having sleep maintenance problems, a different occupation than nurse, and/or not exercising at recommended levels at baseline increased the 10-year cardiometabolic risk at follow-up. Conclusions In female workers prone to work-related stress and sleep deficiency, maintaining sleep and exercise patterns had a strong impact on modifiable 10-year cardiometabolic risk. PMID:24809311
Who Will Win?: Predicting the Presidential Election Using Linear Regression
ERIC Educational Resources Information Center
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
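The paper-and-pencil calculation from the normal equations can be mirrored in a few lines of code. This is a minimal sketch: the data points are made up for illustration and are not the election data used in the article, and the function name is hypothetical.

```python
def least_squares_line(xs, ys):
    """Solve the two normal equations for y = intercept + slope * x.

    Normal equations:
        n*intercept      + slope*sum(x)   = sum(y)
        intercept*sum(x) + slope*sum(x^2) = sum(x*y)
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The same sums (n, Σx, Σy, Σx², Σxy) are what a spreadsheet or graphing calculator accumulates internally, so the three approaches named in the article should agree up to rounding.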
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wei, J; Chao, M
2016-06-15
Purpose: To develop a novel strategy to extract the respiratory motion of the thoracic diaphragm from kilovoltage cone beam computed tomography (CBCT) projections by a constrained linear regression optimization technique. Methods: A parabolic function was identified as the geometric model and was employed to fit the shape of the diaphragm on the CBCT projections. The search was initialized by five manually placed seeds on a pre-selected projection image. Temporal redundancies, the enabling phenomenology in video compression and encoding techniques, inherent in the dynamic properties of the diaphragm motion together with the geometrical shape of the diaphragm boundary and the associated algebraic constraint that significantly reduced the searching space of viable parabolic parameters was integrated, which can be effectively optimized by a constrained linear regression approach on the subsequent projections. The innovative algebraic constraints stipulating the kinetic range of the motion and the spatial constraint preventing any unphysical deviations was able to obtain the optimal contour of the diaphragm with minimal initialization. The algorithm was assessed by a fluoroscopic movie acquired at anterior-posterior fixed direction and kilovoltage CBCT projection image sets from four lung and two liver patients. The automatic tracing by the proposed algorithm and manual tracking by a human operator were compared in both space and frequency domains. Results: The error between the estimated and manual detections for the fluoroscopic movie was 0.54mm with standard deviation (SD) of 0.45mm, while the average error for the CBCT projections was 0.79mm with SD of 0.64mm for all enrolled patients. The submillimeter accuracy outcome exhibits the promise of the proposed constrained linear regression approach to track the diaphragm motion on rotational projection images.
Conclusion: The new algorithm will provide a potential solution to rendering diaphragm motion and ultimately improving tumor motion management for radiation therapy of cancer patients.
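A parabola is linear in its coefficients, which is why fitting it to boundary points reduces to linear regression. The sketch below shows only that unconstrained step; the motion-range and spatial constraints described in the abstract are not implemented, and the five seed coordinates and function names are hypothetical.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_parabola(xs, ys):
    """Least-squares coefficients (c0, c1, c2) of y = c0 + c1*x + c2*x^2."""
    powers = [[x ** k for k in range(3)] for x in xs]
    # Normal equations: (A^T A) c = A^T y, where row i of A is (1, x_i, x_i^2).
    ata = [[sum(p[i] * p[j] for p in powers) for j in range(3)] for i in range(3)]
    aty = [sum(p[i] * y for p, y in zip(powers, ys)) for i in range(3)]
    return solve(ata, aty)

# Five hypothetical seed points lying on y = 1 + 0.5x - 0.25x^2.
seed_x = [-2, -1, 0, 1, 2]
seed_y = [-1.0, 0.25, 1.0, 1.25, 1.0]
coeffs = fit_parabola(seed_x, seed_y)
print(coeffs)
```

In a constrained variant, the coefficients found on one projection would bound the admissible coefficients on the next; that step would need a bounded or constrained least-squares solver rather than the plain normal equations used here.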
Statistical approach to the analysis of olive long-term pollen season trends in southern Spain.
García-Mozo, H; Yaezel, L; Oteros, J; Galán, C
2014-03-01
Analysis of long-term airborne pollen counts makes it possible not only to chart pollen-season trends but also to track changing patterns in flowering phenology. Changes in higher plant response over a long interval are considered among the most valuable bioindicators of climate change impact. Phenological-trend models can also provide information regarding crop production and pollen-allergen emission. The value of this information makes the choice of statistical analysis for time series study essential. We analysed trends and variations in the olive flowering season over a 30-year period (1982-2011) in southern Europe (Córdoba, Spain), focussing on: annual Pollen Index (PI); Pollen Season Start (PSS), Peak Date (PD), Pollen Season End (PSE) and Pollen Season Duration (PSD). Apart from the traditional Linear Regression analysis, a Seasonal-Trend Decomposition procedure based on Loess (STL) and an ARIMA model were performed. Linear regression results indicated a trend toward delayed PSE and earlier PSS and PD, probably influenced by the rise in temperature. These changes are provoking longer flowering periods in the study area. The use of the STL technique provided a clearer picture of phenological behaviour. Data decomposition on pollination dynamics enabled the trend toward an alternate bearing cycle to be distinguished from the influence of other stochastic fluctuations. Results pointed to a rising trend in pollen production. With a view toward forecasting future phenological trends, ARIMA models were constructed to predict PSD, PSS and PI until 2016. Projections displayed a better goodness of fit than those derived from linear regression. Findings suggest that the olive reproductive cycle has changed considerably over the last 30 years due to climate change.
Further conclusions are that STL improves the effectiveness of traditional linear regression in trend analysis, and ARIMA models can provide reliable trend projections for future years taking into account the internal fluctuations in time series. Copyright © 2013 Elsevier B.V. All rights reserved.
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Shen, Minxue; Tan, Hongzhuan; Zhou, Shujin; Retnakaran, Ravi; Smith, Graeme N.; Davidge, Sandra T.; Trasler, Jacquetta; Walker, Mark C.; Wen, Shi Wu
2016-01-01
Background It has been reported that higher folate intake from food and supplementation is associated with decreased blood pressure (BP). The association between serum folate concentration and BP has been examined in few studies. We aim to examine the association between serum folate and BP levels in a cohort of young Chinese women. Methods We used the baseline data from a pre-conception cohort of women of childbearing age in Liuyang, China, for this study. Demographic data were collected by structured interview. Serum folate concentration was measured by immunoassay, and homocysteine, blood glucose, triglyceride and total cholesterol were measured through standardized clinical procedures. Multiple linear regression and principal component regression model were applied in the analysis. Results A total of 1,532 healthy normotensive non-pregnant women were included in the final analysis. The mean concentration of serum folate was 7.5 ± 5.4 nmol/L and 55% of the women presented with folate deficiency (< 6.8 nmol/L). Multiple linear regression and principal component regression showed that serum folate levels were inversely associated with systolic and diastolic BP, after adjusting for demographic, anthropometric, and biochemical factors. Conclusions Serum folate is inversely associated with BP in non-pregnant women of childbearing age with high prevalence of folate deficiency. PMID:27182603
Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin
2016-01-01
Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R² = 0.86–0.93 for 72 data sets collected in the industrial river and R² = 0.60–0.85 for 60 data sets collected in the cleaner river. All the regressions presented a good linear relationship, leading to the conclusion that the occurrence of the five metals is directly related to suspended solids, and that these metal concentrations could be approximated using these regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fates of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process of metals’ concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density. PMID:27028017
Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin
2016-01-01
Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R² = 0.86-0.93 for 72 data sets collected in the industrial river and R² = 0.60-0.85 for 60 data sets collected in the cleaner river. All the regressions presented a good linear relationship, leading to the conclusion that the occurrence of the five metals is directly related to suspended solids, and that these metal concentrations could be approximated using these regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fates of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process of metals' concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Penna, M.L.; Duchiade, M.P.
The authors report the results of an investigation into the possible association between air pollution and infant mortality from pneumonia in the Rio de Janeiro Metropolitan Area. This investigation employed multiple linear regression analysis (stepwise method) for infant mortality from pneumonia in 1980, including the study population's areas of residence, incomes, and pollution exposure as independent variables. With the income variable included in the regression, a statistically significant association was observed between the average annual level of particulates and infant mortality from pneumonia. While this finding should be accepted with caution, it does suggest a biological association between these variables. The authors' conclusion is that air quality indicators should be included in studies of acute respiratory infections in developing countries.
Vitello, Dominic J.; Ripper, Richard M.; Fettiplace, Michael R.; Weinberg, Guy L.; Vitello, Joseph M.
2015-01-01
Purpose. The gravimetric method of weighing surgical sponges is used to quantify intraoperative blood loss. The dry mass minus the wet mass of the gauze equals the volume of blood lost. This method assumes that the density of blood is equivalent to water (1 gm/mL). This study's purpose was to validate the assumption that the density of blood is equivalent to water and to correlate density with hematocrit. Methods. 50 µL of whole blood was weighed from eighteen rats. A distilled water control was weighed for each blood sample. The averages of the blood and water were compared utilizing a Student's unpaired, one-tailed t-test. The masses of the blood samples and the hematocrits were compared using a linear regression. Results. The average mass of the eighteen blood samples was 0.0489 g and that of the distilled water controls was 0.0492 g. The t-test showed P = 0.2269 and R² = 0.03154. The hematocrit values ranged from 24% to 48%. The linear regression R² value was 0.1767. Conclusions. The R² value comparing the blood and distilled water masses suggests high correlation between the two populations. Linear regression showed the hematocrit was not proportional to the mass of the blood. The study confirmed that the measured density of blood is similar to water. PMID:26464949
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trend (Cochran-Armitage trend P value
Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei
2015-05-19
To establish the reference values of thalamus, caudate nucleus and lenticular nucleus diameters through fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014, and the transverse and length diameters of the thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curve of fetal diameter against gestational week. P < 0.05 was considered statistically significant. The linear regression equations of diameter (Y) on gestational week (X) were: thalamic length diameter, Y = 0.051X + 0.201, R = 0.876; thalamic transverse diameter, Y = 0.031X + 0.229, R = 0.817; head of caudate nucleus length diameter, Y = 0.033X + 0.101, R = 0.722; head of caudate nucleus transverse diameter, Y = 0.025X - 0.046, R = 0.711; lentiform nucleus length diameter, Y = 0.046X + 0.229, R = 0.765; lentiform nucleus diameter, Y = 0.025X - 0.05, R = 0.772. Ultrasonic measurement of the diameters of the fetal thalamus, caudate nucleus, and lenticular nucleus through the thalamic transverse section is simple and convenient. The measurements increase with gestational week, and the relationship between them is linear.
Local Linear Regression for Data with AR Errors.
Li, Runze; Li, Yan
2009-07-01
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into the local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using local linear regression method and the profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with working-independent error structure. We illustrate the proposed methodology by an analysis of real data set.
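For orientation, the naive local linear estimator with a working-independence error structure, which the abstract treats as its baseline, can be sketched in Python as follows. This is only a minimal illustration and not the authors' procedure: the Gaussian kernel, the bandwidth `h`, and the function name `local_linear` are assumptions made here, and neither the AR-error profile least squares step nor the SCAD penalty is implemented.

```python
import math

def local_linear(x, y, x0, h):
    """Naive local linear estimate of the regression function at x0.

    Fits a kernel-weighted straight line around x0 and returns its
    intercept. Errors are treated as independent (no AR correction).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x, y):
        d = xi - x0
        w = math.exp(-0.5 * (d / h) ** 2)  # Gaussian kernel weight (assumed)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    # Closed-form intercept of the weighted least squares line.
    return (s2 * t0 - s1 * t1) / det

# Sanity check on exactly linear data: the estimator reproduces the line.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [2 * v + 1 for v in xs]
estimate = local_linear(xs, ys, x0=1.2, h=0.5)
```

A local linear fit reproduces any exactly linear trend regardless of bandwidth, which makes the linear test case a convenient correctness check before applying it to noisy series.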
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
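The difference between the ordinary least squares slope, the major axis slope, and the reduced major axis slope can be made concrete with a short Python sketch. The formulas are the standard textbook ones for two variables; the function names and the small data set are illustrative assumptions and do not come from the article.

```python
import math

def _centered_sums(x, y):
    """Return the centered sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxx, syy, sxy

def ols_slope(x, y):
    """Slope minimizing squared vertical distances."""
    sxx, _, sxy = _centered_sums(x, y)
    return sxy / sxx

def major_axis_slope(x, y):
    """Slope minimizing squared orthogonal distances (requires sxy != 0)."""
    sxx, syy, sxy = _centered_sums(x, y)
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)

def reduced_major_axis_slope(x, y):
    """Ratio of standard deviations, signed by the covariance."""
    sxx, syy, sxy = _centered_sums(x, y)
    return math.copysign(math.sqrt(syy / sxx), sxy)

# Made-up data with equal spread in x and y.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
slopes = (ols_slope(x, y), major_axis_slope(x, y), reduced_major_axis_slope(x, y))
```

For this data set the OLS slope is 0.8 while both the major axis and reduced major axis slopes are 1.0, which shows how treating the two variables symmetrically removes the shrinkage of the OLS slope toward zero.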
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the R command lm and obtain coefficient estimates, the standard error of the residuals, R², residuals, and so on. In the second example, devoted to data on the vapor tension of mercury, we fit a simple linear regression, predict values, and look ahead to multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
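The quantities listed in this session (coefficient estimates, residual standard error, R2, residuals, predictions) can also be computed by hand. The following Python sketch is a rough equivalent of a one-predictor fit; the function name and the five data points are made up for illustration and are neither Galton's data nor the mercury data.

```python
import math

def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x with basic summaries."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(r * r for r in residuals)            # residual sum of squares
    tss = sum((yi - mean_y) ** 2 for yi in y)      # total sum of squares
    return {
        "intercept": intercept,
        "slope": slope,
        "residual_se": math.sqrt(rss / (n - 2)),   # n - 2 degrees of freedom
        "r_squared": 1 - rss / tss,
        "residuals": residuals,
    }

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = simple_linear_regression(x, y)
predicted_at_6 = fit["intercept"] + fit["slope"] * 6  # prediction for a new x
```

On these points the fitted line is y = 2.2 + 0.6x with R2 = 0.6, so the prediction step is a single evaluation of the fitted equation.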
Morse Code, Scrabble, and the Alphabet
ERIC Educational Resources Information Center
Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss
2004-01-01
In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
NASA Astrophysics Data System (ADS)
Kang, Pilsang; Koo, Changhoi; Roh, Hokyu
2017-11-01
Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
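The two existing approaches named above can be contrasted in a few lines of Python. The sketch assumes the conventional textbook definitions (classical: regress the observed reading on the reference value, then invert the fitted line; inverse: regress the reference value on the observed reading and predict directly), which may differ in detail from the paper's formulation. The proposed "reversed inverse regression" is not specified in the abstract and is not implemented here; all names and data are illustrative.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def classical_calibration(x_ref, y_obs, y_new):
    """Fit y on x, then solve the fitted line for x at the new reading."""
    a, b = ols(x_ref, y_obs)
    return (y_new - a) / b

def inverse_calibration(x_ref, y_obs, y_new):
    """Fit x on y and predict x directly from the new reading."""
    c, d = ols(y_obs, x_ref)
    return c + d * y_new

# Made-up calibration data: reference values and noisy readings.
x_ref = [1, 2, 3, 4, 5]
y_obs = [1, 3, 2, 5, 4]
x_classical = classical_calibration(x_ref, y_obs, 5)
x_inverse = inverse_calibration(x_ref, y_obs, 5)
```

With these data the classical estimate is 5.5 and the inverse estimate is 4.6: the inverse estimator pulls the result toward the mean of the reference values, which is one way to see that the two approaches are not interchangeable when the readings are noisy.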
Inferring gene regression networks with model trees
2010-01-01
Background Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes, building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity but do not detect local similarities. Results We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph from all the relationships among output and input genes is built, taking into account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions.
They are very often more precise than linear regression models because they can fit different linear regressions to separate areas of the search space, favoring the inference of localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET. PMID:20950452
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods are available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions, namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; in addition, we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life studies have an important role in health care, especially in chronic diseases, in clinical judgment and in the supply of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results with linear regression. A cross-sectional study was conducted on 119 breast cancer patients who were admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92 ± 11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis and dyspnea was only significant for the first quartile. Also, emotional functioning and duration of disease statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about predictors of breast cancer patients' quality of life.
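The reason quantile regression can reach different conclusions at different quartiles is that it minimizes an asymmetric "check" loss rather than squared error. The Python sketch below illustrates only that loss on an intercept-only model, where the minimizer is a sample quantile; it is not the SAS analysis used in the study, and the function names and the skewed sample are assumptions made for illustration.

```python
def check_loss(residual, tau):
    """Asymmetric absolute loss used by quantile regression at level tau."""
    return residual * tau if residual >= 0 else residual * (tau - 1)

def best_constant(values, tau):
    """Data point minimizing the total check loss (intercept-only model)."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))

# Made-up, strongly right-skewed sample.
scores = [1, 2, 3, 4, 100]
mean_score = sum(scores) / len(scores)   # least-squares fit: pulled up by 100
q25 = best_constant(scores, 0.25)        # first quartile fit
q50 = best_constant(scores, 0.50)        # median fit
q75 = best_constant(scores, 0.75)        # third quartile fit
```

Here the least-squares constant is 22.0 while the check-loss minimizers are 2, 3 and 4, so a single extreme observation dominates the mean but leaves the quartile fits untouched. A full quantile regression with covariates minimizes the same loss over regression coefficients, typically by linear programming.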
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Use of probabilistic weights to enhance linear regression myoelectric control
NASA Astrophysics Data System (ADS)
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
Simplified large African carnivore density estimators from track indices.
Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J
2016-01-01
The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with intercept may not intercept at zero, but may fit the data better than a linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept to model large African carnivore densities and track indices. We did simple linear regression with intercept analysis and simple linear regression through the origin and used the confidence interval for β in the linear model y = αx + β, Standard Error of Estimate, Mean Squares Residual and Akaike Information Criteria to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for β and the null hypothesis that β = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. Akaike Information Criteria showed that linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas.
The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km² or higher. To improve the current models, we need independent data to validate the models and data to test for a non-linear relationship between track indices and true density at low densities.
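The two model forms compared above, and the use of the reported general formula, can be sketched in Python as follows. Only the slope 3.26 comes from the abstract; the function names and any input values are illustrative assumptions, and the units of track density are whatever the original survey protocol used, which the abstract does not state.

```python
def fit_through_origin(x, y):
    """Least-squares slope for the no-intercept model y = slope * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def fit_with_intercept(x, y):
    """Least-squares (slope, intercept) for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return slope, my - slope * mx

REPORTED_SLOPE = 3.26  # track density = 3.26 * carnivore density (from the abstract)

def carnivore_density_from_tracks(track_density, slope=REPORTED_SLOPE):
    """Invert the through-origin model to estimate carnivore density.

    The abstract limits this to sandy substrates and to areas with at
    least 0.27 carnivores/100 km^2; that range is not checked here.
    """
    return track_density / slope

estimated_density = carnivore_density_from_tracks(6.52)
```

Because the through-origin model has a single parameter, inverting it is a plain division, which is what makes the general formula practical for field use.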
[From clinical judgment to linear regression model].
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, regression is used to predict a measure based on the knowledge of at least one other variable. The first objective of linear regression is to determine the slope or inclination of the regression line Y = a + bx, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "x" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
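For reference, the standard least-squares expressions for the quantities named above, written with the abstract's symbols for a sample of n pairs (x_i, Y_i), are given below. These are the usual textbook formulas and are added here only as a reminder; the abstract itself does not state them.

```latex
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{Y} - b\,\bar{x},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n} (Y_i - a - b x_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}
```

Read this way, R² is the share of the variation in Y around its mean that the fitted line accounts for.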
Bias due to two-stage residual-outcome regression analysis in genetic association studies.
Demissie, Serkalem; Cupples, L Adrienne
2011-11-01
Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual-outcome is calculated from a regression of the outcome variable on covariates and then the relationship between the adjusted-outcome and the SNP is evaluated by a simple linear regression of the adjusted-outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach results in biased genotypic effect and loss of power. Bias is always toward the null and increases with the squared correlation between the SNP and the covariate. For example, for squared correlations of 0, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR when the SNP and the covariate are uncorrelated, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided. © 2011 Wiley Periodicals, Inc.
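The attenuation described above can be reproduced on a small noise-free example. The Python sketch below is an illustration under stated assumptions: the data are made up, the function names are hypothetical, and the MLR coefficient is obtained through the Frisch-Waugh-Lovell identity (residualizing both the outcome and the SNP on the covariate) rather than by solving the full normal equations.

```python
def _center(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def slope(x, y):
    """OLS slope of y on x (intercept included)."""
    cx, cy = _center(x), _center(y)
    return sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)

def residuals(x, y):
    """Residuals from the OLS regression of y on x."""
    b = slope(x, y)
    cx, cy = _center(x), _center(y)
    return [yi - b * xi for xi, yi in zip(cx, cy)]

def r_squared(x, y):
    """Squared sample correlation between x and y."""
    return slope(x, y) * slope(y, x)

def two_stage_effect(snp, covariate, outcome):
    """Stage 1: adjust the outcome only. Stage 2: regress it on the raw SNP."""
    return slope(snp, residuals(covariate, outcome))

def mlr_effect(snp, covariate, outcome):
    """MLR coefficient of the SNP via the Frisch-Waugh-Lovell identity."""
    return slope(residuals(covariate, snp), residuals(covariate, outcome))

# Made-up data: genotype counts, a correlated covariate, true SNP effect 1.0.
snp = [0, 1, 2, 0, 1, 2, 0, 1]
cov = [0.1, 0.9, 2.2, 0.3, 1.4, 1.8, -0.2, 0.7]
outcome = [1.0 * g + 0.5 * c for g, c in zip(snp, cov)]
biased = two_stage_effect(snp, cov, outcome)
unbiased = mlr_effect(snp, cov, outcome)
```

In this noise-free setting the MLR estimate recovers the true effect of 1.0, while the two-stage estimate equals 1 minus the squared SNP-covariate correlation, matching the 0, 10 and 50% attenuation figures quoted for squared correlations of 0, 0.1 and 0.5.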
Hemmila, April; McGill, Jim; Ritter, David
2008-03-01
To determine if changes in fingerprint infrared spectra linear with age can be found, partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.
Gimelfarb, A.; Willis, J. H.
1994-01-01
An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, if selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818
Prediction of pulmonary hypertension in idiopathic pulmonary fibrosis
Zisman, David A.; Ross, David J.; Belperio, John A.; Saggar, Rajan; Lynch, Joseph P.; Ardehali, Abbas; Karlamangla, Arun S.
2007-01-01
Summary Background Reliable, noninvasive approaches to the diagnosis of pulmonary hypertension in idiopathic pulmonary fibrosis are needed. We tested the hypothesis that the forced vital capacity to diffusing capacity ratio and room air resting pulse oximetry may be combined to predict mean pulmonary artery pressure (MPAP) in idiopathic pulmonary fibrosis. Methods Sixty-one idiopathic pulmonary fibrosis patients with available right-heart catheterization were studied. We regressed measured MPAP as a continuous variable on pulse oximetry (SpO2) and percent predicted forced vital capacity (FVC) to percent-predicted diffusing capacity ratio (% FVC/% DLco) in a multivariable linear regression model. Results Linear regression generated the following equation: MPAP = −11.9 + 0.272 × SpO2 + 0.0659 × (100 − SpO2)² + 3.06 × (% FVC/% DLco); adjusted R² = 0.55, p < 0.0001. The sensitivity, specificity, positive predictive and negative predictive value of model-predicted pulmonary hypertension were 71% (95% confidence interval (CI): 50–89%), 81% (95% CI: 68–92%), 71% (95% CI: 51–87%) and 81% (95% CI: 68–94%). Conclusions A pulmonary hypertension predictor based on room air resting pulse oximetry and FVC to diffusing capacity ratio has a relatively high negative predictive value. However, this model will require external validation before it can be used in clinical practice. PMID:17604151
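The reported regression equation can be evaluated directly, reading the term after 0.0659 as the square of (100 − SpO2). The Python sketch below only transcribes the published coefficients; the function and argument names are hypothetical, the input values are invented, and no cutoff for "model-predicted pulmonary hypertension" is coded because the abstract does not give one. As the authors note, the model needs external validation, so this is an arithmetic illustration and not a clinical tool.

```python
def predict_mpap(spo2, fvc_pct_predicted, dlco_pct_predicted):
    """Evaluate the abstract's equation for mean pulmonary artery pressure.

    spo2: room air resting pulse oximetry, in percent.
    fvc_pct_predicted, dlco_pct_predicted: percent-predicted FVC and DLco.
    """
    ratio = fvc_pct_predicted / dlco_pct_predicted
    return (-11.9
            + 0.272 * spo2
            + 0.0659 * (100 - spo2) ** 2
            + 3.06 * ratio)

# Invented example values: SpO2 95%, FVC 75% predicted, DLco 50% predicted.
example_mpap = predict_mpap(95, 75, 50)
```

For the invented inputs the terms are −11.9 + 25.84 + 1.6475 + 4.59, giving 20.1775. The quadratic term makes the prediction rise sharply as SpO2 falls well below 100%.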
Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.
Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C
2014-03-01
In recent years the number of active controllable joints in electrically powered hand-prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for an independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), multilayer-perceptron, and kernel ridge regression (KRR). They are investigated offline with electro-myographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve similar performance as KRR at much lower computational costs. Especially ME, a physiologically inspired extension of linear regression represents a promising candidate for the next generation of prosthetic devices.
NASA Astrophysics Data System (ADS)
Bradshaw, Tyler; Fu, Rau; Bowen, Stephen; Zhu, Jun; Forrest, Lisa; Jeraj, Robert
2015-07-01
Dose painting relies on the ability of functional imaging to identify resistant tumor subvolumes to be targeted for additional boosting. This work assessed the ability of FDG, FLT, and Cu-ATSM PET imaging to predict the locations of residual FDG PET in canine tumors following radiotherapy. Nineteen canines with spontaneous sinonasal tumors underwent PET/CT imaging with radiotracers FDG, FLT, and Cu-ATSM prior to hypofractionated radiotherapy. Therapy consisted of 10 fractions of 4.2 Gy to the sinonasal cavity with or without an integrated boost of 0.8 Gy to the GTV. Patients had an additional FLT PET/CT scan after fraction 2, a Cu-ATSM PET/CT scan after fraction 3, and follow-up FDG PET/CT scans after radiotherapy. Following image registration, simple and multiple linear and logistic voxel regressions were performed to assess how well pre- and mid-treatment PET imaging predicted post-treatment FDG uptake. R2 and pseudo R2 were used to assess the goodness of fits. For simple linear regression models, regression coefficients for all pre- and mid-treatment PET images were significantly positive across the population (P < 0.05). However, there was large variability among patients in goodness of fits: R2 ranged from 0.00 to 0.85, with a median of 0.12. Results for logistic regression models were similar. Multiple linear regression models resulted in better fits (median R2 = 0.31), but there was still large variability between patients in R2. The R2 from regression models for different predictor variables were highly correlated across patients (R ≈ 0.8), indicating tumors that were poorly predicted with one tracer were also poorly predicted by other tracers. In conclusion, the high inter-patient variability in goodness of fits indicates that PET was able to predict locations of residual tumor in some patients, but not others. This suggests not all patients would be good candidates for dose painting based on a single biological target.
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
An Expert System for the Evaluation of Cost Models
1990-09-01
contrast to the condition of equal error variance, called homoscedasticity. (Reference: Applied Linear Regression Models by John Neter - page 423...normal. (Reference: Applied Linear Regression Models by John Neter - page 125) Click Here to continue -> Autocorrelation Click Here for the index - Index...over time. Error terms correlated over time are said to be autocorrelated or serially correlated. (REFERENCE: Applied Linear Regression Models by John
Compound Identification Using Penalized Linear Regression on Metabolomics
Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho
2014-01-01
Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values, the data become high-dimensional and suffer from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of the ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson’s correlation along with the penalized linear regression are proposed in this study. PMID:27212894
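The singularity problem described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the matrix sizes, the random "library", and the helper name `ridge_fit` are all hypothetical. It only shows that when there are more library columns than m/z rows, the matrix X^T X is rank-deficient, while the ridge system (X^T X + lam * I) remains solvable.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients: solve (X^T X + lam * I) b = X^T y."""
    n_features = X.shape[1]
    gram = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ y)

rng = np.random.default_rng(0)
# Hypothetical toy data: 20 m/z bins (rows) and 50 library compounds (columns).
X = rng.random((20, 50))
true_coef = np.zeros(50)
true_coef[3] = 1.0              # the query spectrum equals library compound 3
y = X @ true_coef

# rank(X^T X) equals rank(X), which is at most 20 here, so the 50 x 50
# matrix X^T X is singular and ordinary least squares has no unique solution.
rank = np.linalg.matrix_rank(X)

# The ridge penalty makes the system positive definite and hence solvable.
coef = ridge_fit(X, y, lam=1e-3)
```

The penalty weight `lam=1e-3` is an arbitrary choice for the sketch; in practice it would be tuned, for example by cross-validation.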
Which Frail Older People Are Dehydrated? The UK DRIE Study
Bunn, Diane K.; Downing, Alice; Jimoh, Florence O.; Groves, Joyce; Free, Carol; Cowap, Vicky; Potter, John F.; Hunter, Paul R.; Shepstone, Lee
2016-01-01
Background: Water-loss dehydration in older people is associated with increased mortality and disability. We aimed to assess the prevalence of dehydration in older people living in UK long-term care and associated cognitive, functional, and health characteristics. Methods: The Dehydration Recognition In our Elders (DRIE) cohort study included people aged 65 or older living in long-term care without heart or renal failure. In a cross-sectional baseline analysis, we assessed serum osmolality, previously suggested dehydration risk factors, general health, markers of continence, cognitive and functional health, nutrition status, and medications. Univariate linear regression was used to assess relationships between participant characteristics and serum osmolality, then associated characteristics entered into stepwise backwards multivariate linear regression. Results: DRIE included 188 residents (mean age 86 years, 66% women) of whom 20% were dehydrated (serum osmolality >300 mOsm/kg). Linear and logistic regression suggested that renal, cognitive, and diabetic status were consistently associated with serum osmolality and odds of dehydration, while potassium-sparing diuretics, sex, number of recent health contacts, and bladder incontinence were sometimes associated. Thirst was not associated with hydration status. Conclusions: DRIE found high prevalence of dehydration in older people living in UK long-term care, reinforcing the proposed association between cognitive and renal function and hydration. Dehydration is associated with increased mortality and disability in older people, but trials to assess effects of interventions to support healthy fluid intakes in older people living in residential care are needed to enable us to formally assess causal direction and any health benefits of increasing fluid intakes. PMID:26553658
Ramasubramanian, Viswanathan; Glasser, Adrian
2015-01-01
PURPOSE To determine whether relatively low-resolution ultrasound biomicroscopy (UBM) can predict the accommodative optical response in prepresbyopic eyes as well as in a previous study of young phakic subjects, despite lower accommodative amplitudes. SETTING College of Optometry, University of Houston, Houston, USA. DESIGN Observational cross-sectional study. METHODS Static accommodative optical response was measured with infrared photorefraction and an autorefractor (WR-5100K) in subjects aged 36 to 46 years. A 35 MHz UBM device (Vumax, Sonomed Escalon) was used to image the left eye, while the right eye viewed accommodative stimuli. Custom-developed Matlab image-analysis software was used to perform automated analysis of UBM images to measure the ocular biometry parameters. The accommodative optical response was predicted from biometry parameters using linear regression, 95% confidence intervals (CIs), and 95% prediction intervals. RESULTS The study evaluated 25 subjects. Per-diopter (D) accommodative changes in anterior chamber depth (ACD), lens thickness, anterior and posterior lens radii of curvature, and anterior segment length were similar to previous values from young subjects. The standard deviations (SDs) of accommodative optical response predicted from linear regressions for UBM-measured biometry parameters were ACD, 0.15 D; lens thickness, 0.25 D; anterior lens radii of curvature, 0.09 D; posterior lens radii of curvature, 0.37 D; and anterior segment length, 0.42 D. CONCLUSIONS Ultrasound biomicroscopy parameters can, on average, predict accommodative optical response with SDs of less than 0.55 D using linear regressions and 95% CIs. Ultrasound biomicroscopy can be used to visualize and quantify accommodative biometric changes and predict accommodative optical response in prepresbyopic eyes. PMID:26049831
Cooley, Richard L.
1983-01-01
This paper investigates factors influencing the degree of improvement in estimates of parameters of a nonlinear regression groundwater flow model by incorporating prior information of unknown reliability. Consideration of expected behavior of the regression solutions and results of a hypothetical modeling problem lead to several general conclusions. First, if the parameters are properly scaled, linearized expressions for the mean square error (MSE) in parameter estimates of a nonlinear model will often behave very nearly as if the model were linear. Second, by using prior information, the MSE in properly scaled parameters can be reduced greatly over the MSE of ordinary least squares estimates of parameters. Third, plots of estimated MSE and the estimated standard deviation of MSE versus an auxiliary parameter (the ridge parameter) specifying the degree of influence of the prior information on regression results can help determine the potential for improvement of parameter estimates. Fourth, proposed criteria can be used to make appropriate choices for the ridge parameter and another parameter expressing degree of overall bias in the prior information. Results of a case study of Truckee Meadows, Reno-Sparks area, Washoe County, Nevada, conform closely to the results of the hypothetical problem. In the Truckee Meadows case, incorporation of prior information did not greatly change the parameter estimates from those obtained by ordinary least squares. However, the analysis showed that both sets of estimates are more reliable than suggested by the standard errors from ordinary least squares.
Kumar, Rajesh; Dogra, Vishal; Rani, Khushbu; Sahu, Kanti
2017-01-01
Background: District level determinants of total fertility rate in Empowered Action Group states of India can help in ongoing population stabilization programs in India. Objective: The present study intends to assess the role of district level determinants in predicting total fertility rate among districts of the Empowered Action Group states of India. Material and Methods: Data from the Annual Health Survey (2011-12) were analysed using STATA and R software packages. Multiple linear regression models were built and evaluated using the Akaike Information Criterion. For further understanding, recursive partitioning was used to prepare a regression tree. Results: Female married illiteracy was positively associated with total fertility rate and explained more than half (53%) of the variance. Under the multiple linear regression model, married illiteracy, infant mortality rate, antenatal care registration, household size, median age of live birth and sex ratio explained 70% of the total variance in total fertility rate. In the regression tree, female married illiteracy was the root node, and a split at 42% determined TFR <= 2.7. The next left side branch was again married illiteracy, with a split at 23% determining TFR <= 2.1. Conclusion: We conclude that female married illiteracy is one of the most important determinants explaining total fertility rate among the districts of the Empowered Action Group states. Focus on female literacy is required to stabilize population growth in the long run. PMID:29416999
Control Variate Selection for Multiresponse Simulation.
1987-05-01
M. H. Kutner, Applied Linear Regression Models, Richard D. Irwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F., Probability, Allyn and Bacon...1982. Neter, J., W. Wasserman, and M. H. Kutner, Applied Linear Regression Models, Richard D. Irwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F...Aspects of Multivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982. Neter, J., W. Wasserman, and M. H. Kutner, Applied Linear Regression Models
ERIC Educational Resources Information Center
Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael
2011-01-01
This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…
High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.
Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D
2018-05-30
NeuroQuant® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods.
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
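The contrast between an average relation and relations at other points of the outcome distribution can be sketched with the check (pinball) loss that quantile regression minimizes. The example below is a hedged, intercept-only illustration on made-up numbers, not the analysis from the study: the function name `pinball_loss` and the data are assumptions. It shows that minimizing the check loss recovers a quantile, whereas the mean is pulled toward an extreme value.

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Average check (pinball) loss at quantile level tau in (0, 1)."""
    err = y - pred
    # Under-predictions are weighted by tau, over-predictions by (1 - tau).
    return float(np.mean(np.maximum(tau * err, (tau - 1) * err)))

# Made-up outcome with one extreme value.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# For a constant prediction, a minimizer of the check loss can always be
# found among the observed values, so a search over them is sufficient.
best_median = min(y, key=lambda c: pinball_loss(y, c, 0.5))
best_q90 = min(y, key=lambda c: pinball_loss(y, c, 0.9))

# Least squares with only an intercept returns the mean instead.
least_squares_fit = float(y.mean())
```

With predictors in the model, the same loss is minimized over regression coefficients rather than over a single constant.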
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jahandideh, Sepideh; Jahandideh, Samad; Asadabadi, Ebrahim Barzegari
2009-11-15
Prediction of the amount of hospital waste production will be helpful in the storage, transportation and disposal of hospital waste management. Based on this fact, two predictor models including artificial neural networks (ANNs) and multiple linear regression (MLR) were applied to predict the rate of medical waste generation totally and in different types of sharp, infectious and general. In this study, a 5-fold cross-validation procedure on a database containing total of 50 hospitals of Fars province (Iran) were used to verify the performance of the models. Three performance measures including MAR, RMSE and R² were used to evaluate performance of models. The MLR as a conventional model obtained poor prediction performance measure values. However, MLR distinguished hospital capacity and bed occupancy as more significant parameters. On the other hand, ANNs as a more powerful model, which has not been introduced in predicting rate of medical waste generation, showed high performance measure values, especially 0.99 value of R² confirming the good fit of the data. Such satisfactory results could be attributed to the non-linear nature of ANNs in problem solving which provides the opportunity for relating independent variables to dependent ones non-linearly. In conclusion, the obtained results showed that our ANN-based model approach is very promising and may play a useful role in developing a better cost-effective strategy for waste management in future.
Møller, Anne; Reventlow, Susanne; Hansen, Åse Marie; Andersen, Lars L; Siersma, Volkert; Lund, Rikke; Avlund, Kirsten; Andersen, Johan Hviid; Mortensen, Ole Steen
2015-01-01
Objectives Our aim was to study associations between physical exposures throughout working life and physical function measured as chair-rise performance in midlife. Methods The Copenhagen Aging and Midlife Biobank (CAMB) provided data about employment and measures of physical function. Individual job histories were assigned exposures from a job exposure matrix. Exposures were standardised to ton-years (lifting 1000 kg each day in 1 year), stand-years (standing/walking for 6 h each day in 1 year) and kneel-years (kneeling for 1 h each day in 1 year). The associations between exposure-years and chair-rise performance (number of chair-rises in 30 s) were analysed in multivariate linear and non-linear regression models adjusted for covariates. Results Mean age among the 5095 participants was 59 years in both genders, and, on average, men achieved 21.58 (SD=5.60) and women 20.38 (SD=5.33) chair-rises in 30 s. Physical exposures were associated with poorer chair-rise performance in both men and women, however, only associations between lifting and standing/walking and chair-rise remained statistically significant among men in the final model. Spline regression analyses showed non-linear associations and confirmed the findings. Conclusions Higher physical exposure throughout working life is associated with slightly poorer chair-rise performance. The associations between exposure and outcome were non-linear. PMID:26537502
Kumar, K Vasanth; Sivanesan, S
2006-08-25
Pseudo second order kinetic expressions of Ho, Sobkowski and Czerwinski, Blanchard et al. and Ritchie were fitted to the experimental kinetic data of malachite green onto activated carbon by non-linear and linear methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowski and Czerwinski and Ritchie's pseudo second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo second order model but with different assumptions. The best fit of experimental data in Ho's pseudo second order expression by linear and non-linear regression methods showed that the Ho pseudo second order model was a better kinetic expression when compared to other pseudo second order kinetic expressions. The amount of dye adsorbed at equilibrium, q(e), was predicted from the Ho pseudo second order expression and fitted to the Langmuir, Freundlich and Redlich-Peterson expressions by both linear and non-linear methods to obtain the pseudo isotherms. The best fitting pseudo isotherms were found to be the Langmuir and Redlich-Peterson isotherms. The Redlich-Peterson isotherm reduces to the Langmuir isotherm when the constant g equals unity.
2011-01-01
Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. 
Conclusions HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID:21627852
2015-07-15
Long-term effects on cancer survivors’ quality of life of physical training versus physical training combined with cognitive-behavioral therapy ...COMPARISON OF NEURAL NETWORK AND LINEAR REGRESSION MODELS IN STATISTICALLY PREDICTING MENTAL AND PHYSICAL HEALTH STATUS OF BREAST..."Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors
Cook, James P; Mahajan, Anubha; Morris, Andrew P
2017-02-01
Linear mixed models are increasingly used for the analysis of genome-wide association studies (GWAS) of binary phenotypes because they can efficiently and robustly account for population stratification and relatedness through inclusion of random effects for a genetic relationship matrix. However, the utility of linear (mixed) models in the context of meta-analysis of GWAS of binary phenotypes has not been previously explored. In this investigation, we present simulations to compare the performance of linear and logistic regression models under alternative weighting schemes in a fixed-effects meta-analysis framework, considering designs that incorporate variable case-control imbalance, confounding factors and population stratification. Our results demonstrate that linear models can be used for meta-analysis of GWAS of binary phenotypes, without loss of power, even in the presence of extreme case-control imbalance, provided that one of the following schemes is used: (i) effective sample size weighting of Z-scores or (ii) inverse-variance weighting of allelic effect sizes after conversion onto the log-odds scale. Our conclusions thus provide essential recommendations for the development of robust protocols for meta-analysis of binary phenotypes with linear models.
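The two weighting schemes named in this abstract can be sketched as follows. This is a hedged illustration using commonly cited textbook definitions, which the abstract itself does not spell out: the effective sample size formula 4 / (1/cases + 1/controls), the square-root-of-N weighting of Z-scores, and all function names and numbers are assumptions, and the conversion of linear-model effects onto the log-odds scale is not shown.

```python
import math

def effective_sample_size(n_cases, n_controls):
    """Commonly used effective sample size for a case-control study (assumed form)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)

def sample_size_weighted_z(z_scores, n_effs):
    """Combine per-study Z-scores with weights sqrt(N_eff)."""
    weights = [math.sqrt(n) for n in n_effs]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / math.sqrt(sum(w * w for w in weights))

def inverse_variance_meta(betas, standard_errors):
    """Fixed-effects pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical per-study effects, assumed to be already on the log-odds scale.
pooled_beta, pooled_se = inverse_variance_meta([0.10, 0.20], [0.05, 0.10])

# Extreme case-control imbalance shrinks the effective sample size sharply.
balanced = effective_sample_size(1000, 1000)
imbalanced = effective_sample_size(100, 10000)
combined_z = sample_size_weighted_z([2.0, 2.0], [100, 100])
```

In this toy input the more precise study (standard error 0.05) carries four times the weight of the other, so the pooled effect sits closer to 0.10 than to 0.20.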
Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage
NASA Astrophysics Data System (ADS)
Cepowski, Tomasz
2017-06-01
The paper presents mathematical relationships that allow us to forecast the estimated main engine power of new container ships, based on data concerning vessels built in 2005-2015. The presented approximations allow us to estimate the engine power based on the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimation of container ship engine power needed in preliminary parametric design of the ship. It follows from the above that the use of multiple linear regression to predict the main engine power of a container ship brings more accurate solutions than simple linear regression.
ERIC Educational Resources Information Center
Li, Deping; Oranje, Andreas
2007-01-01
Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…
Ernst, Anja F; Albers, Casper J
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking. PMID:28533971
Estimating linear temporal trends from aggregated environmental monitoring data
Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.
2017-01-01
Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs is often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variations to be separated. We used simulated time-series to compare linear trend estimations from three state-space models, a simple linear regression model, and an auto-regressive model. We also compared the performance of these five models to estimate trends from a long term monitoring program. We specifically estimated trends for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression had the best performance of all the given models because it was best able to recover parameters and had consistent numerical convergence. Conversely, the simple linear regression did the worst job estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when the models converged. We found that a simple linear regression performed better than more complex autoregression and state-space models when used to analyze aggregated environmental monitoring data.
Comparing The Effectiveness of a90/95 Calculations (Preprint)
2006-09-01
Nachtsheim, John Neter, William Li, Applied Linear Statistical Models, 5th ed., McGraw-Hill/Irwin, 2005 5. Mood, Graybill and Boes, Introduction...curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid Ordinary Least-Squares Regression Model There... linear. For example is a linear model; is not. 2. Uniform variance (homoscedasticity
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
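The distinction drawn in this tutorial abstract can be sketched on made-up data. The example is a minimal illustration, not taken from the article: the function names and the data are assumptions, and the rank helper assumes there are no tied values. It computes the Pearson coefficient, the Spearman rho as the Pearson coefficient of the ranks, and a simple least-squares line.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    yc = np.asarray(y, dtype=float) - np.mean(y)
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(values):
    """Ranks 1..n of the values (this sketch assumes no ties)."""
    order = np.argsort(values)
    result = np.empty(len(values))
    result[order] = np.arange(1, len(values) + 1)
    return result

def spearman_rho(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

def simple_linear_regression(x, y):
    """Least-squares slope and intercept for one predictor."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    slope = float((xc @ (y - y.mean())) / (xc @ xc))
    return slope, float(y.mean() - slope * x.mean())

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_curved = np.exp(x)                 # monotone but strongly nonlinear
r_curved = pearson_r(x, y_curved)    # below 1: the relation is not a line
rho_curved = spearman_rho(x, y_curved)   # 1: the ordering is preserved
slope, intercept = simple_linear_regression(x, 2.0 * x + 1.0)
```

The Spearman coefficient reaches 1 for any strictly increasing relation, which is why it is described as a measure of monotone rather than linear association.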
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory.
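A minimal sketch of the two techniques named in this abstract follows. It uses the standard textbook formulas rather than the authors' own workflow: the function names, the 1.96 multiplier for 95% limits of agreement, the error-variance ratio `delta=1.0`, and the toy measurements are all assumptions. In the Deming fit the intercept estimates constant error and the slope estimates proportional error.

```python
import numpy as np

def bland_altman(reference, new):
    """Mean difference (bias) and approximate 95% limits of agreement."""
    diff = np.asarray(new, dtype=float) - np.asarray(reference, dtype=float)
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def deming_regression(x, y, delta=1.0):
    """Deming slope and intercept.

    delta is the ratio of the error variance in y to the error variance in x
    (assumed known; delta = 1 gives orthogonal regression).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    sxx, syy, sxy = xc @ xc, yc @ yc, xc @ yc
    slope = (syy - delta * sxx
             + np.sqrt((syy - delta * sxx) ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return float(slope), float(y.mean() - slope * x.mean())

# Hypothetical paired measurements: the new assay reads 10% high plus 0.5 units.
reference = np.array([10.0, 20.0, 30.0, 40.0])
new = 1.1 * reference + 0.5

bias, lower, upper = bland_altman(reference, new)
slope, intercept = deming_regression(reference, new)
```

On these noise-free toy values the correlation between the two assays is perfect, yet the Bland-Altman bias and the Deming slope and intercept still expose the disagreement, which is the limitation of R² that the abstract points out.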
2017-10-01
ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID PROPELLANT GRAIN GEOMETRIES Brian...author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other documentation...U.S. ARMY ARMAMENT RESEARCH, DEVELOPMENT AND ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
A Constrained Linear Estimator for Multiple Regression
ERIC Educational Resources Information Center
Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.
2010-01-01
"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Buchvold, Hogne Vikanes; Pallesen, Ståle; Waage, Siri; Bjorvatn, Bjørn
2018-05-01
Objectives The aim of this study was to investigate changes in body mass index (BMI) between different work schedules and different average number of yearly night shifts over a four-year follow-up period. Methods A prospective study of Norwegian nurses (N=2965) with different work schedules was conducted: day only, two-shift rotation (day and evening shifts), three-shift rotation (day, evening and night shifts), night only, those who changed towards night shifts, and those who changed away from schedules containing night shifts. Paired Student's t-tests were used to evaluate within subgroup changes in BMI. Multiple linear regression analysis was used to evaluate between groups effects on BMI when adjusting for BMI at baseline, sex, age, marital status, children living at home, and years since graduation. The same regression model was used to evaluate the effect of average number of yearly night shifts on BMI change. Results We found that night workers [mean difference (MD) 1.30 (95% CI 0.70-1.90)], two shift workers [MD 0.48 (95% CI 0.20-0.75)], three shift workers [MD 0.46 (95% CI 0.30-0.62)], and those who changed work schedule away from [MD 0.57 (95% CI 0.17-0.84)] or towards night work [MD 0.63 (95% CI 0.20-1.05)] all had significant BMI gain (P<0.01) during the follow-up period. However, day workers had a non-significant BMI gain. Using adjusted multiple linear regressions, we found that night workers had significantly larger BMI gain compared to day workers [B=0.89 (95% CI 0.06-1.72), P<0.05]. We did not find any significant association between average number of yearly night shifts and BMI change using our multiple linear regression model. Conclusions After adjusting for possible confounders, we found that BMI increased significantly more among night workers compared to day workers.
Refractive Status at Birth: Its Relation to Newborn Physical Parameters at Birth and Gestational Age
Varghese, Raji Mathew; Sreenivas, Vishnubhatla; Puliyel, Jacob Mammen; Varughese, Sara
2009-01-01
Background Refractive status at birth is related to gestational age. Preterm babies have myopia which decreases as gestational age increases and term babies are known to be hypermetropic. This study looked at the correlation of refractive status with birth weight in term and preterm babies, and with physical indicators of intra-uterine growth such as the head circumference and length of the baby at birth. Methods All babies delivered at St. Stephens Hospital and admitted in the nursery were eligible for the study. Refraction was performed within the first week of life. 0.8% tropicamide with 0.5% phenylephrine was used to achieve cycloplegia and paralysis of accommodation. 599 newborn babies participated in the study. Data pertaining to the right eye is utilized for all the analyses except that for anisometropia where the two eyes were compared. Growth parameters were measured soon after birth. Simple linear regression analysis was performed to see the association of refractive status, (mean spherical equivalent (MSE), astigmatism and anisometropia) with each of the study variables, namely gestation, length, weight and head circumference. Subsequently, multiple linear regression was carried out to identify the independent predictors for each of the outcome parameters. Results Simple linear regression showed a significant relation between all 4 study variables and refractive error but in multiple regression only gestational age and weight were related to refractive error. The partial correlation of weight with MSE adjusted for gestation was 0.28 and that of gestation with MSE adjusted for weight was 0.10. Birth weight had a higher correlation to MSE than gestational age. Conclusion This is the first study to look at refractive error against all these growth parameters, in preterm and term babies at birth. 
It would appear from this study that birth weight rather than gestation should be used as criteria for screening for refractive error, especially in developing countries where the incidence of intrauterine malnutrition is higher. PMID:19214228
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
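The model comparison above rests on the area under the ROC curve. As a minimal sketch of how that indicator can be computed, assuming hypothetical labels and predicted scores (none of the numbers below come from the study, and `roc_auc` is an illustrative helper, not SPSS output), the AUC can be obtained as the share of case/non-case pairs that a model ranks correctly:

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # case scored above non-case: correct ranking
            elif p == q:
                wins += 0.5      # tie: counted as half a correct ranking
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = event) and scores from two hypothetical models.
labels = [1, 1, 0, 1, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
model_b = [0.9, 0.4, 0.7, 0.6, 0.5, 0.2]

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
print(f"AUC model A: {auc_a:.3f}, AUC model B: {auc_b:.3f}")
```

The pairwise form is quadratic in the sample size, which is acceptable for a sketch; a rank-based computation would be preferable for large samples.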
Szekér, Szabolcs; Vathy-Fogarassy, Ágnes
2018-01-01
Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist as well, which are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study that seeks to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon draws the attention of analysts to an important point, which must be taken into account when drawing conclusions.
Fananapazir, Ghaneh; Benzl, Robert; Corwin, Michael T; Chen, Ling-Xin; Sageshima, Junichiro; Stewart, Susan L; Troppmann, Christoph
2018-07-01
Purpose To determine whether the predonation computed tomography (CT)-based volume of the future remnant kidney is predictive of postdonation renal function in living kidney donors. Materials and Methods This institutional review board-approved, retrospective, HIPAA-compliant study included 126 live kidney donors who had undergone predonation renal CT between January 2007 and December 2014 as well as 2-year postdonation measurement of estimated glomerular filtration rate (eGFR). The whole kidney volume and cortical volume of the future remnant kidney were measured and standardized for body surface area (BSA). Bivariate linear associations between the ratios of whole kidney volume to BSA and cortical volume to BSA were obtained. A linear regression model for 2-year postdonation eGFR that incorporated donor age, sex, and either whole kidney volume-to-BSA ratio or cortical volume-to-BSA ratio was created, and the coefficient of determination (R²) for the model was calculated. Factors not statistically additive in assessing 2-year eGFR were removed by using backward elimination, and the coefficient of determination for this parsimonious model was calculated. Results Correlation was slightly better for cortical volume-to-BSA ratio than for whole kidney volume-to-BSA ratio (r = 0.48 vs r = 0.44, respectively). The linear regression model incorporating all donor factors had an R² of 0.66. The only factors that were significantly additive to the equation were cortical volume-to-BSA ratio and predonation eGFR (P = .01 and P < .01, respectively), and the final parsimonious linear regression model incorporating these two variables explained almost the same amount of variance (R² = 0.65) as did the full model. Conclusion The cortical volume of the future remnant kidney helped predict postdonation eGFR at 2 years. The cortical volume-to-BSA ratio should thus be considered for addition as an important variable to living kidney donor evaluation and selection guidelines.
© RSNA, 2018.
Cespedes, Elizabeth M.; Horan, Christine M.; Gillman, Matthew W.; Gortmaker, Steven L.; Price, Sarah; Rifas-Shiman, Sheryl L.; Mitchell, Kathleen; Taveras, Elsie M.
2014-01-01
Objective To evaluate the High Five for Kids intervention effect on television (TV) within subgroups, examine participant characteristics associated with process measures and assess perceived helpfulness of TV intervention components. Method High Five (RCT of 445 overweight/obese 2–7 year-olds in Massachusetts [2006–2008]) reduced TV by 0.36 hours/day. 1-year effects on TV, stratified by subgroup, were assessed using linear regression. Among intervention participants (n=253), associations of intervention component helpfulness with TV reduction were examined using linear regression and associations of participant characteristics with processes linked to TV reduction (choosing TV and completing intervention visits) were examined using logistic regression. Results High Five reduced TV across subgroups. Parents of Latino (v. white) children had lower odds of completing >=2 study visits (OR 0.39 [95%CI: 0.18, 0.84]). Parents of black (v. white) children had higher odds of choosing TV (OR: 2.23 [95% CI: 1.08, 4.59]), as did parents of obese (v. overweight) children and children watching >=2 hours/day (v. <2) at baseline. Greater perceived helpfulness was associated with greater TV reduction. Conclusion Clinic-based motivational interviewing reduces TV in children. Low cost education approaches (e.g., printed materials) may be well-received. Parents of children at higher obesity risk could be more motivated to reduce TV. PMID:24518002
Linear regression analysis of survival data with missing censoring indicators.
Wang, Qihua; Dinse, Gregg E
2011-04-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.
1983-09-01
books Applied Linear Regression [Ref. 39], and Statistical Methods in Research and Production [Ref. 40], or any other book on regression. In the event...Indexes, Master’s Thesis, Air Force Institute of Technology, Wright-Patterson AFB, 1976. 39. Weisberg, Stanford, Applied Linear Regression , Wiley, 1980. 40
Testing hypotheses for differences between linear regression lines
Stanley J. Zarnoch
2009-01-01
Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...
Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.
ERIC Educational Resources Information Center
Schafer, William D.; Wang, Yuh-Yin
A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…
Teaching the Concept of Breakdown Point in Simple Linear Regression.
ERIC Educational Resources Information Center
Chan, Wai-Sum
2001-01-01
Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…
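The sensitivity of ordinary least squares to a single extreme point can be shown numerically. The following is a minimal sketch, assuming hypothetical data that lie exactly on a line; `ols_fit` is an illustrative helper and not taken from the article. Corrupting one response moves the fitted slope without bound, which is the sense in which the least-squares estimator has a breakdown point of 1/n.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical clean data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
clean_intercept, clean_slope = ols_fit(xs, ys)

# Corrupt only the last response by ever larger amounts.
slopes = []
for shift in (0.0, 10.0, 100.0, 1000.0):
    corrupted = ys[:-1] + [ys[-1] + shift]
    slopes.append(ols_fit(xs, corrupted)[1])

# The slope grows in proportion to the corruption of that single point.
print(clean_slope, slopes)
```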
Estimating monotonic rates from biological data using local linear regression.
Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R
2017-03-01
Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
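The idea of replacing manual truncation with local fits can be sketched in a few lines. This is an illustration only, written in Python rather than R, and it does not reproduce LoLinR's interface or its ranking criteria: it assumes a fixed window width, hypothetical time series values, and uses the coefficient of determination of each window as a stand-in linearity score.

```python
def ols_slope_r2(ts, ys):
    """Return (slope, R^2) of the least-squares line through (ts, ys)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sty / stt
    intercept = mean_y - slope * mean_t
    sse = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(ts, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return slope, 1.0 - sse / sst

def best_local_rate(ts, ys, width):
    """Fit every contiguous window of `width` points; keep the most linear one."""
    best = None
    for start in range(len(ts) - width + 1):
        slope, r2 = ols_slope_r2(ts[start:start + width], ys[start:start + width])
        if best is None or r2 > best[1]:
            best = (slope, r2, start)
    return best

# Hypothetical series: a steady decline that flattens out after t = 5.
ts = [float(t) for t in range(10)]
ys = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.8, 4.7, 4.65, 4.6]

rate, r2, start = best_local_rate(ts, ys, width=5)
print(f"estimated rate {rate:.3f} from window starting at index {start}")
```

Selecting by R² alone is a simplification chosen for brevity; any other linearity criterion could be substituted in `best_local_rate`.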
Locally linear regression for pose-invariant face recognition.
Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen
2007-07-01
The variation of facial appearance due to the viewpoint (/pose) degrades face recognition systems considerably, which is one of the bottlenecks in face recognition. One of the possible solutions is generating virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple, but efficient, novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show distinct advantage of the proposed method over Eigen light-field method.
A reliable and cost effective approach for radiographic monitoring in nutritional rickets
Gupta, V; Sharma, V; Sinha, B; Samanta, S
2014-01-01
Objective: Radiological scoring is particularly useful in rickets, where pre-treatment radiographical findings can reflect the disease severity and can be used to monitor the improvement. However, there is only a single radiographic scoring system for rickets developed by Thacher and, to the best of our knowledge, no study has evaluated radiographic changes in rickets based on this scoring system apart from the one done by Thacher himself. The main objective of this study is to compare and analyse the pre-treatment and post-treatment radiographic parameters in nutritional rickets with the help of Thacher's scoring technique. Methods: 176 patients with nutritional rickets were given a single intramuscular injection of vitamin D (600 000 IU) along with oral calcium (50 mg kg−1) and vitamin D (400 IU per day) until radiological resolution and followed for 1 year. Pre- and post-treatment radiological parameters were compared and analysed statistically based on Thacher's scoring system. Results: Radiological resolution was complete by 6 months. Time for radiological resolution and initial radiological score were linearly associated on regression analysis. The distal ulna was the last to heal in most cases except when the initial score was 10, when distal femur was the last to heal. Conclusion: Thacher's scoring system can effectively monitor nutritional rickets. The formula derived through linear regression has prognostic significance. Advances in knowledge: The distal femur is a better indicator in radiologically severe rickets and when resolution is delayed. Thacher's scoring is very useful for monitoring of rickets. The formula derived through linear regression can predict the expected time for radiological resolution. PMID:24593231
van der Zijden, A M; Groen, B E; Tanck, E; Nienhuis, B; Verdonschot, N; Weerdesteyn, V
2017-03-21
Many research groups have studied fall impact mechanics to understand how fall severity can be reduced to prevent hip fractures. Yet, direct impact force measurements with force plates are restricted to a very limited repertoire of experimental falls. The purpose of this study was to develop a generic model for estimating hip impact forces (i.e. fall severity) in in vivo sideways falls without the use of force plates. Twelve experienced judokas performed sideways Martial Arts (MA) and Block ('natural') falls on a force plate, both with and without a mat on top. Data were analyzed to determine the hip impact force and to derive 11 selected (subject-specific and kinematic) variables. Falls from kneeling height were used to perform a stepwise regression procedure to assess the effects of these input variables and build the model. The final model includes four input variables, involving one subject-specific measure and three kinematic variables: maximum upper body deceleration, body mass, shoulder angle at the instant of 'maximum impact' and maximum hip deceleration. The results showed that estimated and measured hip impact forces were linearly related (explained variances ranging from 46 to 63%). Hip impact forces of MA falls onto the mat from a standing position (3650±916N) estimated by the final model were comparable with measured values (3698±689N), even though these data were not used for training the model. In conclusion, a generic linear regression model was developed that enables the assessment of fall severity through kinematic measures of sideways falls, without using force plates. Copyright © 2017 Elsevier Ltd. All rights reserved.
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M
2016-05-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. Review of a representative sample of articles indexed in MEDLINE (n = 428) with observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimate, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature.
Yokoo, Takeshi; Serai, Suraj D; Pirasteh, Ali; Bashir, Mustafa R; Hamilton, Gavin; Hernando, Diego; Hu, Houchun H; Hetterich, Holger; Kühn, Jens-Peter; Kukuk, Guido M; Loomba, Rohit; Middleton, Michael S; Obuchowski, Nancy A; Song, Ji Soo; Tang, An; Wu, Xinhuai; Reeder, Scott B; Sirlin, Claude B
2018-02-01
Purpose To determine the linearity, bias, and precision of hepatic proton density fat fraction (PDFF) measurements by using magnetic resonance (MR) imaging across different field strengths, imager manufacturers, and reconstruction methods. Materials and Methods This meta-analysis was performed in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. A systematic literature search identified studies that evaluated the linearity and/or bias of hepatic PDFF measurements by using MR imaging (hereafter, MR imaging-PDFF) against PDFF measurements by using colocalized MR spectroscopy (hereafter, MR spectroscopy-PDFF) or the precision of MR imaging-PDFF. The quality of each study was evaluated by using the Quality Assessment of Studies of Diagnostic Accuracy 2 tool. De-identified original data sets from the selected studies were pooled. Linearity was evaluated by using linear regression between MR imaging-PDFF and MR spectroscopy-PDFF measurements. Bias, defined as the mean difference between MR imaging-PDFF and MR spectroscopy-PDFF measurements, was evaluated by using Bland-Altman analysis. Precision, defined as the agreement between repeated MR imaging-PDFF measurements, was evaluated by using a linear mixed-effects model, with field strength, imager manufacturer, reconstruction method, and region of interest as random effects. Results Twenty-three studies (1679 participants) were selected for linearity and bias analyses and 11 studies (425 participants) were selected for precision analyses. MR imaging-PDFF was linear with MR spectroscopy-PDFF (R² = 0.96). Regression slope (0.97; P < .001) and mean Bland-Altman bias (-0.13%; 95% limits of agreement: -3.95%, 3.40%) indicated minimal underestimation by using MR imaging-PDFF. MR imaging-PDFF was precise at the region-of-interest level, with repeatability and reproducibility coefficients of 2.99% and 4.12%, respectively.
Field strength, imager manufacturer, and reconstruction method each had minimal effects on reproducibility. Conclusion MR imaging-PDFF has excellent linearity, bias, and precision across different field strengths, imager manufacturers, and reconstruction methods. © RSNA, 2017 Online supplemental material is available for this article. An earlier incorrect version of this article appeared online. This article was corrected on October 2, 2017.
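Bias in the sense used above, the mean difference between two measurement methods, together with 95% limits of agreement, can be sketched as follows. This is the textbook Bland-Altman calculation under the assumption of roughly normal differences, applied to hypothetical paired values; it is not the pooled analysis performed in the meta-analysis, and `bland_altman` is an illustrative name.

```python
import statistics

def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)          # mean difference between methods
    sd = statistics.stdev(diffs)           # sample standard deviation of differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired fat-fraction readings (percent) from two methods.
imaging = [5.0, 10.0, 15.0, 20.0]
spectroscopy = [5.5, 10.0, 15.5, 21.0]

bias, lower, upper = bland_altman(imaging, spectroscopy)
print(f"bias {bias:.2f}%, limits of agreement ({lower:.2f}%, {upper:.2f}%)")
```

A negative bias here means the first method reads lower on average than the second.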
Deriving Hounsfield units using grey levels in cone beam computed tomography
Mah, P; Reeves, T E; McDavid, W D
2010-01-01
Objectives An in vitro study was performed to investigate the relationship between grey levels in dental cone beam CT (CBCT) and Hounsfield units (HU) in CBCT scanners. Methods A phantom containing 8 different materials of known composition and density was imaged with 11 different dental CBCT scanners and 2 medical CT scanners. The phantom was scanned under three conditions: phantom alone and phantom in a small and large water container. The reconstructed data were exported as Digital Imaging and Communications in Medicine (DICOM) and analysed with On Demand 3D® by Cybermed, Seoul, Korea. The relationship between grey levels and linear attenuation coefficients was investigated. Results It was demonstrated that a linear relationship between the grey levels and the attenuation coefficients of each of the materials exists at some “effective” energy. From the linear regression equation of the reference materials, attenuation coefficients were obtained for each of the materials and CT numbers in HU were derived using the standard equation. Conclusions HU can be derived from the grey levels in dental CBCT scanners using linear attenuation coefficients as an intermediate step. PMID:20729181
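The two-step conversion described in the conclusions can be sketched under stated assumptions. The grey levels and attenuation coefficients below are hypothetical, the helper names are illustrative, and the "standard equation" is taken to be HU = 1000 × (μ − μ_water) / μ_water, that is, with the attenuation of air treated as zero.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def hounsfield(mu, mu_water):
    """CT number in HU from a linear attenuation coefficient (air taken as 0)."""
    return 1000.0 * (mu - mu_water) / mu_water

# Hypothetical reference materials: scanner grey level vs. attenuation (1/cm).
grey_levels = [0.0, 500.0, 1000.0, 1500.0]
mu_reference = [0.0, 0.1, 0.2, 0.3]
MU_WATER = 0.2  # hypothetical attenuation of water at the effective energy

intercept, slope = fit_line(grey_levels, mu_reference)

def grey_to_hu(grey):
    """Step 1: grey level -> attenuation via regression. Step 2: attenuation -> HU."""
    return hounsfield(intercept + slope * grey, MU_WATER)

print(grey_to_hu(1250.0))
```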
Effect of Malmquist bias on correlation studies with IRAS data base
NASA Technical Reports Server (NTRS)
Verter, Frances
1993-01-01
The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B) as well as ratios of these quantities. The linear correlations are corrected for Malmquist bias using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method introduced here, in which each galaxy observation is weighted by its sampling volume. The results of the correlations and regressions for the sample are significantly changed in the anticipated sense that the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.
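How per-observation weights enter a straight-line fit can be sketched generically. The example below is a plain weighted least-squares fit with hypothetical log-luminosity values and hypothetical sampling volumes used as weights; it illustrates the mechanics of weighting only and is not a reproduction of the correction method described in the abstract.

```python
def weighted_fit(xs, ys, ws):
    """Return (intercept, slope) of the weighted least-squares line."""
    total = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical log-luminosity pairs and hypothetical sampling volumes.
log_lx = [0.0, 1.0, 2.0, 3.0]
log_ly = [0.2, 1.1, 2.3, 2.9]
volumes = [1.0, 2.0, 4.0, 8.0]

unweighted = weighted_fit(log_lx, log_ly, [1.0] * len(log_lx))
weighted = weighted_fit(log_lx, log_ly, volumes)
print("unweighted:", unweighted)
print("volume-weighted:", weighted)
```

Only the relative sizes of the weights matter: multiplying every weight by the same constant leaves the fitted line unchanged.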
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws.
Xiao, Xiao; White, Ethan P; Hooten, Mevin B; Durham, Susan L
2011-10-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.
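The log-transformation approach (LR) discussed above can be sketched in a few lines. This is a minimal illustration with hypothetical, noise-free data and an illustrative function name; it is not the code that accompanies the article. A power law y = a·x^b becomes the straight line log y = log a + b·log x, so an ordinary least-squares fit on the logged data returns b as the slope and log a as the intercept.

```python
import math

def fit_power_law_loglog(xs, ys):
    """Fit y = a * x**b by least squares on log-transformed data; return (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)   # back-transform the intercept
    return a, b

# Hypothetical noise-free data generated from y = 3 * x**0.75.
xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0 * x ** 0.75 for x in xs]

a, b = fit_power_law_loglog(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}")
```

This fit implicitly assumes multiplicative error on the original scale, which is the case the abstract identifies as favourable to LR; data with additive error would call for nonlinear regression on the untransformed values, which requires an iterative optimizer and is not shown here.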
Discrimination of serum Raman spectroscopy between normal and colorectal cancer
NASA Astrophysics Data System (ADS)
Li, Xiaozhou; Yang, Tianyue; Yu, Ting; Li, Siqi
2011-07-01
Raman spectroscopy of tissues has been widely studied for the diagnosis of various cancers, but biofluids were seldom used as the analyte because of the low concentration. Herein, Raman spectra of serum from 30 normal people, 46 colon cancer patients, and 44 rectum cancer patients were measured and analyzed. The information of Raman peaks (intensity and width) and that of the fluorescence background (baseline function coefficients) were selected as parameters for statistical analysis. Principal component regression (PCR) and partial least square regression (PLSR) were used on the selected parameters separately to see the performance of the parameters. PCR performed better than PLSR in our spectral data. Then linear discriminant analysis (LDA) was used on the principal components (PCs) of the two regression methods on the selected parameters, and diagnostic accuracies of 88% and 83% were obtained. The conclusion is that the selected features can maintain the information of original spectra well and Raman spectroscopy of serum has the potential for the diagnosis of colorectal cancer.
[Research of prevalence of schistosomiasis in Hunan province, 1984-2015].
Li, F Y; Tan, H Z; Ren, G H; Jiang, Q; Wang, H L
2017-03-10
Objective: To analyze the prevalence of schistosomiasis in Hunan province, and provide scientific evidence for the control and elimination of schistosomiasis. Methods: The changes of infection rates of Schistosoma (S.) japonicum among residents and cattle in Hunan from 1984 to 2015 were analyzed by using dynamic trend diagram; and the time regression model was used to fit the infection rates of S. japonicum, and predict the recent infection rate. Results: The overall infection rates of S. japonicum in Hunan from 1984 to 2015 showed downward trend (95.29% in residents and 95.16% in cattle). By using the linear regression model, the actual values of infection rates in residents and cattle were all in the 95% confidence intervals of the value predicted; and the prediction showed that the infection rates in the residents and cattle would continue to decrease from 2016 to 2020. Conclusion: The prevalence of schistosomiasis was in decline in Hunan. The regression model has a good effect in the short-term prediction of schistosomiasis prevalence.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
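The point estimates for ordinary least products (OLP) regression are simple enough to compute by hand, which is consistent with point 4 above. The sketch below assumes hypothetical data and illustrative helper names, and takes the OLP slope to be the ratio of the standard deviations of y and x carrying the sign of their covariance, with the line passing through the two means. It gives no confidence intervals; as the abstract notes, those need a method such as bootstrapping.

```python
import math

def _sums(xs, ys):
    """Return means and centred sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_x, mean_y, sxx, syy, sxy

def olp_fit(xs, ys):
    """Ordinary least products (Model II) line: return (intercept, slope)."""
    mean_x, mean_y, sxx, syy, sxy = _sums(xs, ys)
    slope = math.copysign(math.sqrt(syy / sxx), sxy)  # sd(y)/sd(x), sign of covariance
    return mean_y - slope * mean_x, slope

def ols_slope(xs, ys):
    """Model I (ordinary least squares) slope, for comparison."""
    _, _, sxx, _, sxy = _sums(xs, ys)
    return sxy / sxx

# Hypothetical paired measurements, both subject to error.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

olp_intercept, olp_slope = olp_fit(xs, ys)
model_one_slope = ols_slope(xs, ys)
print(f"OLP slope {olp_slope:.4f} vs OLS slope {model_one_slope:.4f}")
```

The OLP slope is never smaller in magnitude than the Model I slope, because the latter equals the OLP slope multiplied by the absolute correlation coefficient.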
Least-Squares Data Adjustment with Rank-Deficient Data Covariance Matrices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, J.G.
2011-07-01
A derivation of the linear least-squares adjustment formulae is required that avoids the assumption that the covariance matrix of prior parameters can be inverted. Possible proofs are of several kinds, including: (i) extension of standard results for the linear regression formulae, and (ii) minimization by differentiation of a quadratic form of the deviations in parameters and responses. In this paper, the least-squares adjustment equations are derived in both these ways, while explicitly assuming that the covariance matrix of prior parameters is singular. It will be proved that the solutions are unique and that, contrary to statements that have appeared in the literature, the least-squares adjustment problem is not ill-posed. No modification is required to the adjustment formulae that have been used in the past in the case of a singular covariance matrix for the priors. In conclusion: The linear least-squares adjustment formula that has been used in the past is valid in the case of a singular covariance matrix of prior parameters. Furthermore, it provides a unique solution. Statements in the literature, to the effect that the problem is ill-posed are wrong. No regularization of the problem is required. This has been proved in the present paper by two methods, while explicitly assuming that the covariance matrix of prior parameters is singular: (i) extension of standard results for the linear regression formulae, and (ii) minimization by differentiation of a quadratic form of the deviations in parameters and responses. No modification is needed to the adjustment formulae that have been used in the past. (author)
ERIC Educational Resources Information Center
Rocconi, Louis M.
2011-01-01
Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Classical Testing in Functional Linear Models
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi
2013-09-01
Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). 
Furthermore, examples are given as to how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
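The arithmetic behind a two-concentration calibration can be sketched as follows. This is an illustration only: the function names and all numeric values are hypothetical, and with just two calibrators the fitted line passes through both points exactly, so any regression weighting scheme gives the same line.

```python
def two_point_calibration(conc_low, resp_low, conc_high, resp_high):
    """Line through the lowest and highest calibrators: resp = slope*conc + intercept."""
    if conc_high == conc_low:
        raise ValueError("the two calibrator concentrations must differ")
    slope = (resp_high - resp_low) / (conc_high - conc_low)
    intercept = resp_low - slope * conc_low
    return slope, intercept


def back_calculate(response, slope, intercept):
    """Concentration predicted from an instrument response."""
    if slope == 0:
        raise ValueError("a flat calibration line cannot be inverted")
    return (response - intercept) / slope


def percent_bias(measured, nominal):
    """Relative deviation of a back-calculated value from its nominal value, in %."""
    return 100.0 * (measured - nominal) / nominal


# Hypothetical calibrators (ng/mL, peak-area ratio) and one mid-range QC sample.
slope, intercept = two_point_calibration(1.0, 0.011, 1000.0, 10.01)
qc_conc = back_calculate(5.02, slope, intercept)
qc_bias = percent_bias(qc_conc, 500.0)
```

Back-calculating QC samples that sit between the two calibrators, as in the last two lines, is one simple way to check linearity over the range when no intermediate calibrators are run.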
A Linear Regression and Markov Chain Model for the Arabian Horse Registry
1993-04-01
...as a tax deduction? Yes No ... 26. Regardless of previous equine tax deductions, do you consider your current horse activities to be... (Mark one) ... the Arabian Horse Registry, which needed to forecast its future registration of purebred Arabian horses. A linear regression model was utilized to...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kwon, Deukwoo; Little, Mark P.; Miller, Donald L.
Purpose: To determine more accurate regression formulas for estimating peak skin dose (PSD) from reference air kerma (RAK) or kerma-area product (KAP). Methods: After grouping of the data from 21 procedures into 13 clinically similar groups, assessments were made of optimal clustering using the Bayesian information criterion to obtain the optimal linear regressions of (log-transformed) PSD vs RAK, PSD vs KAP, and PSD vs RAK and KAP. Results: Three clusters of clinical groups were optimal in regression of PSD vs RAK, seven clusters of clinical groups were optimal in regression of PSD vs KAP, and six clusters of clinical groups were optimal in regression of PSD vs RAK and KAP. Prediction of PSD using both RAK and KAP is significantly better than prediction of PSD with either RAK or KAP alone. The regression of PSD vs RAK provided better predictions of PSD than the regression of PSD vs KAP. The partial-pooling (clustered) method yields smaller mean squared errors compared with the complete-pooling method. Conclusion: PSD distributions for interventional radiology procedures are log-normal. Estimates of PSD derived from RAK and KAP jointly are most accurate, followed closely by estimates derived from RAK alone. Estimates of PSD derived from KAP alone are the least accurate. Using a stochastic search approach, it is possible to cluster together certain dissimilar types of procedures to minimize the total error sum of squares.
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
Education, Genetic Ancestry, and Blood Pressure in African Americans and Whites
Gravlee, Clarence C.; Mulligan, Connie J.
2012-01-01
Objectives. We assessed the relative roles of education and genetic ancestry in predicting blood pressure (BP) within African Americans and explored the association between education and BP across racial groups. Methods. We used t tests and linear regressions to examine the associations of genetic ancestry, estimated from a genomewide set of autosomal markers, and education with BP variation among African Americans in the Family Blood Pressure Program. We also performed linear regressions in self-identified African Americans and Whites to explore the association of education with BP across racial groups. Results. Education, but not genetic ancestry, significantly predicted BP variation in the African American subsample (b = −0.51 mm Hg per year additional education; P = .001). Although education was inversely associated with BP in the total population, within-group analyses showed that education remained a significant predictor of BP only among the African Americans. We found a significant interaction (b = 3.20; P = .006) between education and self-identified race in predicting BP. Conclusions. Racial disparities in BP may be better explained by differences in education than by genetic ancestry. Future studies of ancestry and disease should include measures of the social environment. PMID:22698014
Social support for women of reproductive age and its predictors: a population-based study
2012-01-01
Background Social support is an exchange of resources between at least two individuals perceived by the provider or recipient to be intended to promote the health of the recipient. Social support is a major determinant of health. The objective of this study was to determine the perceived social support and its associated sociodemographic factors among women of reproductive age. Methods This was a population-based cross-sectional study with multistage random cluster sampling of 1359 women of reproductive age. Data were collected using questionnaires on sociodemographic factors and perceived social support (PRQ85-Part 2). The relationship between the dependent variable (perceived social support) and the independent variables (sociodemographic characteristics) was analyzed using the multivariable linear regression model. Results The mean score of social support was 134.3 ± 17.9. Women scored highest in the “worth” dimension and lowest in the “social integration” dimension. Multivariable linear regression analysis indicated that the variables of education, spouse’s occupation, sufficiency of income for expenses and primary support source were significantly related to the perceived social support. Conclusion Sociodemographic factors affect social support and could be considered in planning interventions to improve social support for Iranian women. PMID:22988834
Factors associated with arterial stiffness in children aged 9-10 years
Batista, Milena Santos; Mill, José Geraldo; Pereira, Taisa Sabrina Silva; Fernandes, Carolina Dadalto Rocha; Molina, Maria del Carmen Bisi
2015-01-01
OBJECTIVE To analyze the factors associated with stiffness of the great arteries in prepubertal children. METHODS This study used a convenience sample of 231 schoolchildren aged 9-10 years enrolled in public and private schools in Vitória, ES, Southeastern Brazil, in 2010-2011. Anthropometric and hemodynamic data, blood pressure, and pulse wave velocity in the carotid-femoral segment were obtained. Data on current and previous health conditions were obtained by questionnaire and notes on the child’s health card. Multiple linear regression was applied to identify the partial and total contribution of the factors in determining the pulse wave velocity values. RESULTS Among the students, 50.2% were female and 55.4% were 10 years old. Among those classified in the last tertile of pulse wave velocity, 60.0% were overweight, with higher mean blood pressure, waist circumference, and waist-to-height ratio. Birth weight was not associated with pulse wave velocity. After multiple linear regression analysis, body mass index (BMI) and diastolic blood pressure remained in the model. CONCLUSIONS BMI was the most important factor in determining arterial stiffness in children aged 9-10 years. PMID:25902563
Measurement Consistency from Magnetic Resonance Images
Chung, Dongjun; Chung, Moo K.; Durtschi, Reid B.; Lindell, R. Gentry; Vorperian, Houri K.
2010-01-01
Rationale and Objectives In quantifying medical images, length-based measurements are still obtained manually. Due to possible human error, a measurement protocol is required to guarantee the consistency of measurements. In this paper, we review various statistical techniques that can be used in determining measurement consistency. The focus is on detecting a possible measurement bias and determining the robustness of the procedures to outliers. Materials and Methods We review correlation analysis, linear regression, Bland-Altman method, paired t-test, and analysis of variance (ANOVA). These techniques were applied to measurements, obtained by two raters, of head and neck structures from magnetic resonance images (MRI). Results The correlation analysis and the linear regression were shown to be insufficient for detecting measurement inconsistency. They are also very sensitive to outliers. The widely used Bland-Altman method is a visualization technique so it lacks the numerical quantification. The paired t-test tends to be sensitive to small measurement bias. On the other hand, ANOVA performs well even under small measurement bias. Conclusion In almost all cases, using only one method is insufficient and it is recommended to use several methods simultaneously. In general, ANOVA performs the best. PMID:18790405
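Two of the techniques reviewed above reduce to simple summaries of the paired differences between raters. The sketch below assumes the usual textbook definitions (Bland-Altman bias and 95% limits of agreement as mean difference ± 1.96 SD, and the paired t statistic as the mean difference over its standard error); the function names and the measurements are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev


def bland_altman(rater_a, rater_b, z=1.96):
    """Bias and limits of agreement for paired measurements from two raters."""
    if len(rater_a) != len(rater_b) or len(rater_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - z * sd, bias + z * sd


def paired_t_statistic(rater_a, rater_b):
    """t statistic for the null hypothesis of zero mean difference."""
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical length measurements (mm) of the same structures by two raters.
a = [10.1, 12.0, 11.2, 13.3, 9.8]
b = [9.9, 12.1, 10.8, 13.0, 9.9]
bias, lower, upper = bland_altman(a, b)
t_stat = paired_t_statistic(a, b)
```

Both summaries depend on the sample standard deviation of the differences, so a single outlying pair can move them noticeably, which fits the abstract's advice to use several methods together.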
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-07-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach was justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. The flux measurements were performed using transparent chambers on vegetated surfaces and opaque chambers on bare peat surfaces. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. 
CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes and even lower for longer closure times. The degree of underestimation increased with increasing CO2 flux strength and is dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
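The direction of the bias described above can be reproduced on synthetic data. The sketch below assumes a generic saturating exponential for the headspace concentration, c(t) = c_inf - (c_inf - c0)·exp(-k·t), which is an illustrative stand-in and not the model derived in the paper; every parameter value is hypothetical.

```python
import math


def ols_slope(t, c):
    """Ordinary least squares slope of c regressed on t."""
    n = len(t)
    mt, mc = sum(t) / n, sum(c) / n
    sxx = sum((ti - mt) ** 2 for ti in t)
    sxy = sum((ti - mt) * (ci - mc) for ti, ci in zip(t, c))
    return sxy / sxx


def saturating_concentration(t, c0, c_inf, k):
    """Headspace concentration relaxing from c0 towards c_inf at rate k."""
    return c_inf - (c_inf - c0) * math.exp(-k * t)


# Hypothetical release experiment: 380 ppm at closure, sampled every 10 s for 120 s.
c0, c_inf, k = 380.0, 780.0, 0.005
times = [10.0 * i for i in range(13)]
conc = [saturating_concentration(t, c0, c_inf, k) for t in times]

initial_slope = k * (c_inf - c0)      # true dc/dt at closure time (ppm/s)
linear_slope = ols_slope(times, conc)  # what a straight-line fit reports
ratio = linear_slope / initial_slope
```

Because the curve flattens as the gradient collapses, the straight-line slope averages over progressively smaller rates of change, so `ratio` falls below 1 and shrinks further as the closure time or `k` increases.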
Forsberg, Flemming; Ro, Raymond J.; Fox, Traci B; Liu, Ji-Bin; Chiou, See-Ying; Potoczek, Magdalena; Goldberg, Barry B
2010-01-01
The purpose of this study was to prospectively compare noninvasive, quantitative measures of vascularity obtained from 4 contrast enhanced ultrasound (US) techniques to 4 invasive immunohistochemical markers of tumor angiogenesis in a large group of murine xenografts. Glioma (C6) or breast cancer (NMU) cells were implanted in 144 rats. The contrast agent Optison (GE Healthcare, Princeton, NJ) was injected in a tail vein (dose: 0.4ml/kg). Power Doppler imaging (PDI), pulse-subtraction harmonic imaging (PSHI), flash-echo imaging (FEI), and Microflow imaging (MFI; a technique creating maximum intensity projection images over time) was performed with an Aplio scanner (Toshiba America Medical Systems, Tustin, CA) and a 7.5 MHz linear array. Fractional tumor neovascularity was calculated from digital clips of contrast US, while the relative area stained was calculated from specimens. Results were compared using a factorial, repeated measures ANOVA, linear regression and z-tests. The tortuous morphology of tumor neovessels was visualized better with MFI than with the other US modes. Cell line, implantation method and contrast US imaging technique were significant parameters in the ANOVA model (p<0.05). The strongest correlation determined by linear regression in the C6 model was between PSHI and percent area stained with CD31 (r=0.37, p<0.0001). In the NMU model the strongest correlation was between FEI and COX-2 (r=0.46, p<0.0001). There were no statistically significant differences between correlations obtained with the various US methods (p>0.05). In conclusion, the largest study of contrast US of murine xenografts to date has been conducted and quantitative contrast enhanced US measures of tumor neovascularity in glioma and breast cancer xenograft models appear to provide a noninvasive marker for angiogenesis; although the best method for monitoring angiogenesis was not conclusively established. PMID:21144542
Fischer, A; Friggens, N C; Berry, D P; Faverdin, P
2018-07-01
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include both model fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted. In one, the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation average net energy intake (NEI) on lactation average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the other, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept using fortnightly repeated measures for the variables. This method split the predicted NEI in two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows, all fed a constant energy-rich diet. Mixed models fitting cow-specific intercept and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared. 
The variance of REI estimated with the lactation average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation average model or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation average model may therefore reflect model fitting errors or measurement errors. In conclusion, the use of a mixed model framework with cow-specific random regressions seems to be a promising method to isolate the cow-specific component of REI in dairy cows.
ERIC Educational Resources Information Center
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
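One way to read "distributions acquired by beta sampling" is as a Monte Carlo test built on a standard result: with normal errors, an intercept in the model, and all k slope coefficients equal to zero, R² follows a Beta(k/2, (n-k-1)/2) distribution. The sketch below rests on that assumption and is not necessarily the article's procedure; the function name and defaults are hypothetical.

```python
import random


def r2_null_pvalue(r2_observed, n, k, draws=20000, seed=0):
    """Monte Carlo p-value for H0: all k slopes are zero, given an observed R^2.

    Draws from Beta(k/2, (n-k-1)/2), the null distribution of R^2 for a
    model with an intercept, k predictors, n observations and normal errors.
    """
    if k < 1 or n - k - 1 <= 0:
        raise ValueError("need k >= 1 and n > k + 1")
    rng = random.Random(seed)
    a, b = k / 2.0, (n - k - 1) / 2.0
    exceed = sum(1 for _ in range(draws) if rng.betavariate(a, b) >= r2_observed)
    return exceed / draws


# Hypothetical example: R^2 = 0.25 from a model with 2 predictors and 30 observations.
p_value = r2_null_pvalue(0.25, n=30, k=2)
```

The same p-value could be read from the exact beta (or equivalent F) distribution; sampling simply makes the null distribution of R² visible, which may be the teaching point.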
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.
2009-01-01
In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-11-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model, which is considered to be caused by violations of the underlying model assumptions. 
Especially the effects of turbulence and pressure disturbances by the chamber deployment are suspected to have caused unexplainable curvatures. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes. The degree of underestimation increased with increasing CO2 flux strength and was dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE) and a linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. The present paper, however, introduces a method to compute the NLSE using principles of multivariate calculus, and is concerned with new optimization techniques for computing the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to obtain a linear pseudo model for the nonlinear regression model. In this article a new technique is developed to obtain the linear pseudo model for the nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] is explained in a different way in this paper. In 2006, David Pollard et al. used empirical process techniques to study the asymptotics of the least-squares estimator (LSE) for the fitting of nonlinear regression functions. In Jae Myung [13] provided a good conceptual guide to maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation”.
A method for fitting regression splines with varying polynomial order in the linear mixed model.
Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W
2006-02-15
The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
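The idea of building continuity into the parameterization, rather than imposing it through explicit side conditions, can be illustrated with the standard truncated power basis. This is a generic fixed-knot, fixed-degree construction and not the authors' varying-order reparameterization; the function names and coefficients are hypothetical.

```python
def truncated_power_row(x, knots, degree=1):
    """Design-matrix row [1, x, ..., x^d, (x-k1)_+^d, (x-k2)_+^d, ...]."""
    if degree < 1:
        raise ValueError("degree must be at least 1 for a continuous spline")
    row = [x ** p for p in range(degree + 1)]
    # Each knot adds one column that is zero to the left of the knot,
    # so the fitted curve stays continuous there by construction.
    row += [max(0.0, x - k) ** degree for k in knots]
    return row


def spline_value(x, coeffs, knots, degree=1):
    """Evaluate the spline defined by coeffs at x."""
    row = truncated_power_row(x, knots, degree)
    if len(coeffs) != len(row):
        raise ValueError("coeffs length must equal degree + 1 + number of knots")
    return sum(c * b for c, b in zip(coeffs, row))


# Hypothetical linear spline: slope 2 before the knot at x = 5, slope -1 after it.
knots = [5.0]
coeffs = [1.0, 2.0, -3.0]
values = [spline_value(x, coeffs, knots) for x in (4.0, 5.0, 6.0)]
```

Rows built this way can serve as fixed-effect or random-effect columns in any linear mixed model routine, which is the sense in which an implicitly constrained form simplifies implementation.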
GIS Tools to Estimate Average Annual Daily Traffic
DOT National Transportation Integrated Search
2012-06-01
This project presents five tools that were created for a geographical information system to estimate Annual Average Daily Traffic using linear regression. Three of the tools can be used to prepare spatial data for linear regression. One tool can be...
Jose F. Negron; Willis C. Schaupp; Kenneth E. Gibson; John Anhold; Dawn Hansen; Ralph Thier; Phil Mocettini
1999-01-01
Data collected from Douglas-fir stands infected by the Douglas-fir beetle in Wyoming, Montana, Idaho, and Utah, were used to develop models to estimate amount of mortality in terms of basal area killed. Models were built using stepwise linear regression and regression tree approaches. Linear regression models using initial Douglas-fir basal area were built for all...
Ling, Ru; Liu, Jiawang
2011-12-01
To construct a prediction model for health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan by stratified random sampling using uniform questionnaires, and performed multiple linear regression analysis with 20 indicators selected by literature review. Independent variables in the multiple linear regression model for medical personnel in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model for county hospital beds included the population aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in the county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory power and fit, and may be used for short- and mid-term forecasting.
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and accounts for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compare cubic regression splines vis-à-vis linear piecewise splines, with varying numbers and positions of knots. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect model with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height.
We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biological meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.
A Comparison Between Two OLS-Based Approaches to Estimating Urban Multifractal Parameters
NASA Astrophysics Data System (ADS)
Huang, Lin-Shan; Chen, Yan-Guang
Multifractal theory provides a new spatial analytical tool for urban studies, but many basic problems remain to be solved. Among various pending issues, the most significant one is how to obtain proper multifractal dimension spectrums. If an algorithm is improperly used, the parameter spectrums will be abnormal. This paper is devoted to investigating two ordinary least squares (OLS)-based approaches for estimating urban multifractal parameters. Using empirical study and comparative analysis, we demonstrate how to utilize the adequate linear regression to calculate multifractal parameters. The OLS regression analysis has two different approaches. One is that the intercept is fixed to zero, and the other is that the intercept is not limited. The results of comparative study show that the zero-intercept regression yields proper multifractal parameter spectrums within certain scale range of moment order, while the common regression method often leads to abnormal multifractal parameter values. A conclusion can be reached that fixing the intercept to zero is a more advisable regression method for multifractal parameters estimation, and the shapes of spectral curves and value ranges of fractal parameters can be employed to diagnose urban problems. This research is helpful for scientists to understand multifractal models and apply a more reasonable technique to multifractal parameter calculations.
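The two OLS variants contrasted in this abstract can be sketched in a few lines. This is a minimal illustration rather than the authors' algorithm: the function names and the data are hypothetical, and in a multifractal setting x and y would typically be logarithms of scale and of a partition function, which is an assumption here.

```python
def ols_free_intercept(x, y):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def ols_zero_intercept(x, y):
    """Fit y = b*x with the intercept fixed at zero; return b."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Hypothetical data, for illustration only (not taken from the paper).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

a, b_free = ols_free_intercept(x, y)
b_zero = ols_zero_intercept(x, y)
print(f"free intercept: a={a:.3f}, slope={b_free:.3f}")
print(f"zero intercept: slope={b_zero:.3f}")
```

The two slopes differ whenever the fitted free intercept is not zero, which is why the choice between the two fits can change the estimated parameter spectrum.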
Secure Logistic Regression Based on Homomorphic Encryption: Design and Evaluation
Song, Yongsoo; Wang, Shuang; Xia, Yuhou; Jiang, Xiaoqian
2018-01-01
Background Learning a model without accessing raw data has been an intriguing idea to security and machine learning researchers for years. In an ideal setting, we want to encrypt sensitive data to store them on a commercial cloud and run certain analyses without ever decrypting the data to preserve privacy. Homomorphic encryption technique is a promising candidate for secure data outsourcing, but it is a very challenging task to support real-world machine learning tasks. Existing frameworks can only handle simplified cases with low-degree polynomials such as linear means classifier and linear discriminative analysis. Objective The goal of this study is to provide a practical support to the mainstream learning models (eg, logistic regression). Methods We adapted a novel homomorphic encryption scheme optimized for real numbers computation. We devised (1) the least squares approximation of the logistic function for accuracy and efficiency (ie, reduce computation cost) and (2) new packing and parallelization techniques. Results Using real-world datasets, we evaluated the performance of our model and demonstrated its feasibility in speed and memory consumption. For example, it took approximately 116 minutes to obtain the training model from the homomorphically encrypted Edinburgh dataset. In addition, it gives fairly accurate predictions on the testing dataset. Conclusions We present the first homomorphically encrypted logistic regression outsourcing model based on the critical observation that the precision loss of classification models is sufficiently small so that the decision plan stays still. PMID:29666041
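The idea of replacing the logistic function with a least squares polynomial, so that only additions and multiplications are needed, can be sketched as follows. This is an illustration under assumptions, not the paper's implementation: the degree (3), the interval [-8, 8], the sampling grid, and all function names are choices made here, and no encryption is involved.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def polyfit_ls(ts, ys, degree):
    """Least squares coefficients c[0] + c[1]*t + ... via the normal equations."""
    k = degree + 1
    ata = [[sum(t ** (i + j) for t in ts) for j in range(k)] for i in range(k)]
    aty = [sum((t ** i) * y for t, y in zip(ts, ys)) for i in range(k)]
    return solve(ata, aty)


# Assumed setup: 201 evenly spaced points on [-8, 8], cubic polynomial.
ts = [-8 + 16 * i / 200 for i in range(201)]
coeffs = polyfit_ls(ts, [sigmoid(t) for t in ts], 3)


def poly(t):
    return sum(c * t ** i for i, c in enumerate(coeffs))


max_err = max(abs(poly(t) - sigmoid(t)) for t in ts)
print("coefficients:", [round(c, 5) for c in coeffs])
print("max abs error on grid:", round(max_err, 4))
```

Because the grid is symmetric and the logistic function minus 0.5 is odd, the fitted constant term comes out at about 0.5 and the quadratic term at about zero.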
Heun, Manfred; Abbo, Shahal; Lev-Yadun, Simcha; Gopher, Avi
2012-07-01
The recent review by Fuller et al. (2012a) in this journal is part of a series of papers maintaining that plant domestication in the Near East was a slow process lasting circa 4000 years and occurring independently in different locations across the Fertile Crescent. Their protracted domestication scenario is based entirely on linear regression derived from the percentage of domesticated plant remains at specific archaeological sites and the age of these sites themselves. This paper discusses why estimates like haldanes and darwins cannot be applied to the seven founder crops in the Near East (einkorn and emmer wheat, barley, peas, chickpeas, lentils, and bitter vetch). All of these crops are self-fertilizing plants and for this reason they do not fulfil the requirements for performing calculations of this kind. In addition, the percentage of domesticates at any site may be the result of factors other than those that affect the selection for domesticates growing in the surrounding area. These factors are unlikely to have been similar across prehistoric sites of habitation, societies, and millennia. The conclusion here is that single crop analyses are necessary rather than general reviews drawing on regression analyses based on erroneous assumptions. The fact that all seven of these founder crops are self-fertilizers should be incorporated into a comprehensive domestication scenario for the Near East, as self-fertilization naturally isolates domesticates from their wild progenitors.
Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach
NASA Astrophysics Data System (ADS)
Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew
2017-05-01
This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations.
Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
Coping Styles in Heart Failure Patients with Depressive Symptoms
Trivedi, Ranak B.; Blumenthal, James A.; O'Connor, Christopher; Adams, Kirkwood; Hinderliter, Alan; Sueta-Dupree, Carla; Johnson, Kristy; Sherwood, Andrew
2009-01-01
Objective Elevated depressive symptoms have been linked to poorer prognosis in heart failure (HF) patients. Our objective was to identify coping styles associated with depressive symptoms in HF patients. Methods 222 stable HF patients (32.75% female, 45.4% non-Hispanic Black) completed multiple questionnaires. Beck Depression Inventory (BDI) assessed depressive symptoms, Life Orientation Test (LOT-R) assessed optimism, ENRICHD Social Support Inventory (ESSI) and Perceived Social Support Scale (PSSS) assessed social support, and COPE assessed coping styles. Linear regression analyses were employed to assess the association of coping styles with continuous BDI scores. Logistic regression analyses were performed using BDI scores dichotomized into BDI<10 versus BDI≥10, to identify coping styles accompanying clinically significant depressive symptoms. Results In linear regression models, higher BDI scores were associated with lower scores on the acceptance (β=-.14), humor (β=-.15), planning (β=-.15), and emotional support (β=-.14) subscales of the COPE, and higher scores on the behavioral disengagement (β=.41), denial (β=.33), venting (β=.25), and mental disengagement (β=.22) subscales. Higher PSSS and ESSI scores were associated with lower BDI scores (β=-.32 and -.25, respectively). Higher LOT-R scores were associated with higher BDI scores (β=.39, p<.001). In logistic regression models, BDI≥10 was associated with greater likelihood of behavioral disengagement (OR=1.3), denial (OR=1.2), mental disengagement (OR=1.3), venting (OR=1.2), and pessimism (OR=1.2), and lower perceived social support measured by PSSS (OR=.92) and ESSI (OR=.92). Conclusion Depressive symptoms in HF patients are associated with avoidant coping, lower perceived social support, and pessimism. Results raise the possibility that interventions designed to improve coping may reduce depressive symptoms. PMID:19773027
Donta, Balaiah; Dasgupta, Anindita; Ghule, Mohan; Battala, Madhusudana; Nair, Saritha; Silverman, Jay G.; Jadhav, Arun; Palaye, Prajakta; Saggurti, Niranjan; Raj, Anita
2015-01-01
Objective Evidence has linked economic hardship with increased intimate partner violence (IPV) perpetration among males. However, less is known about how economic debt or gender norms related to men's roles in relationships or the household, which often underlie IPV perpetration, intersect in or may explain these associations. We assessed the intersection of economic debt, attitudes toward gender norms, and IPV perpetration among married men in India. Methods Data were from the evaluation of a family planning intervention among young married couples (n=1,081) in rural Maharashtra, India. Crude and adjusted logistic regression models for dichotomous outcome variables and linear regression models for continuous outcomes were used to examine debt in relation to husbands' attitudes toward gender-based norms (i.e., beliefs supporting IPV and beliefs regarding male dominance in relationships and the household), as well as sexual and physical IPV perpetration. Results Twenty percent of husbands reported debt. In adjusted linear regression models, debt was associated with husbands' attitudes supportive of IPV (b=0.015, p=0.004) and norms supporting male dominance in relationships and the household (b=0.006, p=0.003). In logistic regression models adjusted for relevant demographics, debt was associated with perpetration of physical IPV (adjusted odds ratio [AOR] = 1.4, 95% confidence interval [CI] 1.1, 1.9) and sexual IPV (AOR=1.6, 95% CI 1.1, 2.1) from husbands. These findings related to debt and relation to IPV were slightly attenuated when further adjusted for men's attitudes toward gender norms. Conclusion Findings suggest the need for combined gender equity and economic promotion interventions to address high levels of debt and related IPV reported among married couples in rural India. PMID:26556938
Wong, William W.; Strizich, Garrett; Heo, Moonseong; Heymsfield, Steven B.; Himes, John H.; Rock, Cheryl L.; Gellman, Marc D.; Siega-Riz, Anna Maria; Sotres-Alvarez, Daniela; Davis, Sonia M.; Arredondo, Elva M.; Van Horn, Linda; Wylie-Rosett, Judith; Sanchez-Johnsen, Lisa; Kaplan, Robert; Mossavar-Rahmani, Yasmin
2016-01-01
Objective To evaluate the percentage of body fat (%BF)-BMI relationship, identify %BF levels corresponding to adult BMI cut-points, and examine %BF-BMI agreement in a diverse Hispanic/Latino population. Methods %BF by bioelectrical impedance analysis (BIA) was corrected against %BF by 18O dilution in 476 participants of the ancillary Hispanic Community Health/Latinos Studies. Corrected %BF were regressed against 1/BMI in the parent study (n=15,261), fitting models for each age group, by sex and Hispanic/Latino background; predicted %BF was then computed for each BMI cut-point. Results BIA underestimated %BF by 8.7 ± 0.3% in women and 4.6 ± 0.3% in men (P < 0.0001). The %BF-BMI relationship was non-linear in BMI but linear in 1/BMI. Sex- and age-specific regression parameters between %BF and 1/BMI were consistent across Hispanic/Latino backgrounds (P > 0.05). The precision of the %BF-1/BMI association weakened with increasing age in men but not women. The proportion of participants classified as non-obese by BMI but obese by %BF was generally higher among women and older adults (16.4% in women vs. 12.0% in men aged 50-74 y). Conclusions %BF was linearly related to 1/BMI, with a consistent relationship across Hispanic/Latino backgrounds. BMI cut-points consistently underestimated the proportion of Hispanics/Latinos with excess adiposity. PMID:27184359
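Regressing %BF on 1/BMI rather than on BMI is a linearizing transform: a curve of the form %BF = a + b/BMI becomes a straight line in the new predictor. The sketch below illustrates only that step. The coefficients (60 and -800) and the BMI values are hypothetical and were chosen to give plausible numbers; they are not estimates from the study.

```python
def ols(x, y):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical, noise-free data generated from %BF = 60 - 800 / BMI.
bmi = [18.5, 22.0, 25.0, 30.0, 35.0, 40.0]
pbf = [60.0 - 800.0 / v for v in bmi]

# Transform the predictor, then fit an ordinary straight line.
inv_bmi = [1.0 / v for v in bmi]
a, b = ols(inv_bmi, pbf)

# Predicted %BF at a BMI cut-point (30 is the usual adult obesity cut-point).
pbf_at_30 = a + b / 30.0
print(f"a={a:.2f}, b={b:.2f}, predicted %BF at BMI 30 = {pbf_at_30:.2f}")
```

With real data the fit would not be exact, and the abstract describes separate models by age group, sex, and background, which this sketch does not attempt.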
Association of Dentine Hypersensitivity with Different Risk Factors – A Cross Sectional Study
Vijaya, V; Sanjay, Venkataraam; Varghese, Rana K; Ravuri, Rajyalakshmi; Agarwal, Anil
2013-01-01
Background: This study was done to assess the prevalence of Dentine hypersensitivity (DH) and its associated risk factors. Materials & Methods: This epidemiological study of the prevalence of DH was done among patients attending a dental college. A self-structured questionnaire along with clinical examination was used for assessment. Descriptive statistics were obtained and frequency distribution was calculated using Chi square test at p value <0.05. Stepwise multiple linear regression was also done to assess the frequency of DH with different factors. Results: The study population comprised 655 participants of different age groups. Our study showed a prevalence of 55%, and DH was more common among males. Similarly, smokers and those who used a hard tooth brush had more cases of DH. Stepwise multiple linear regression showed that the best predictor for DH was age, followed by habit of smoking and type of tooth brush. The most aggravating factors were cold water (15.4%) and sweet foods (14.7%), whereas only 5% of the patients had it while brushing. Conclusion: A high level of dentine hypersensitivity was found in this study, and it was more common among males. A linear relationship was shown with age, smoking and type of tooth brush. How to cite this article: Vijaya V, Sanjay V, Varghese RK, Ravuri R, Agarwal A. Association of Dentine Hypersensitivity with Different Risk Factors – A Cross Sectional Study. J Int Oral Health 2013;5(6):88-92. PMID:24453451
Scoring and staging systems using cox linear regression modeling and recursive partitioning.
Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H
2006-01-01
Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.
SU-F-T-130: [18F]-FDG Uptake Dose Response in Lung Correlates Linearly with Proton Therapy Dose
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, D; Titt, U; Mirkovic, D
2016-06-15
Purpose: Analysis of clinical outcomes in lung cancer patients treated with protons using 18F-FDG uptake in lung as a measure of dose response. Methods: A test case lung cancer patient was selected in an unbiased way. The test patient’s treatment planning and post treatment positron emission tomography (PET) were collected from picture archiving and communication system at the UT M.D. Anderson Cancer Center. Average computerized tomography scan was registered with post PET/CT through both rigid and deformable registrations for selected region of interest (ROI) via VelocityAI imaging informatics software. For the voxels in the ROI, a system that extracts the Standard Uptake Value (SUV) from PET was developed, and the corresponding relative biological effectiveness (RBE) weighted (both variable and constant) dose was computed using the Monte Carlo (MC) methods. The treatment planning system (TPS) dose was also obtained. Using histogram analysis, the voxel average normalized SUV vs. 3 different doses was obtained and linear regression fit was performed. Results: From the registration process, there were some regions that showed significant artifacts near the diaphragm and heart region, which yielded poor r-squared values when the linear regression fit was performed on normalized SUV vs. dose. Excluding these values, TPS fit yielded mean r-squared value of 0.79 (range 0.61–0.95), constant RBE fit yielded 0.79 (range 0.52–0.94), and variable RBE fit yielded 0.80 (range 0.52–0.94). Conclusion: A system that extracts SUV from PET to correlate between normalized SUV and various dose calculations was developed. A linear relation between normalized SUV and all three different doses was found.
Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan
2017-01-01
This study aimed at assessing the incidence of pulmonary hypertension (PH) in newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension, the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by a linear or non-linear function. By applying the linear regression method described by a first-degree equation, the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second-degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We compared and validated these two models by calculating the determination coefficient (criterion 1), comparing the residuals (criterion 2), applying the AIC criterion (criterion 3) and using the F-test (criterion 4). In the H-group, 47% had pulmonary hypertension that was completely reversible on reaching euthyroidism. The factors causing pulmonary hypertension were identified: previously known factors (level of free thyroxin, pulmonary vascular resistance, cardiac output) and new factors identified in this study (pretreatment period, age, systolic blood pressure). According to the four criteria and to clinical judgment, we consider that the polynomial model (graphically parabola-type) is better than the linear one.
The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second degree where the parabola is its graphical representation.
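Three of the four comparison criteria named above (determination coefficient, AIC, and F-test) can be computed directly from the residual sums of squares of the two fits. The sketch below is an illustration on hypothetical data, not the study's analysis. The AIC is taken in its Gaussian least squares form n·ln(RSS/n) + 2k, with k the number of fitted coefficients, which is an assumption about the variant used.

```python
import math


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_poly(xs, ys, degree):
    """Least squares polynomial coefficients, lowest order first."""
    k = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    aty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(ata, aty)


def rss(xs, ys, coeffs):
    """Residual sum of squares of a fitted polynomial."""
    return sum((y - sum(c * x ** i for i, c in enumerate(coeffs))) ** 2
               for x, y in zip(xs, ys))


# Hypothetical data with visible curvature (not from the study).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 1.8, 3.1, 5.2, 7.9, 11.1, 15.2, 19.8]
n = len(xs)

rss_lin = rss(xs, ys, fit_poly(xs, ys, 1))
rss_quad = rss(xs, ys, fit_poly(xs, ys, 2))

mean_y = sum(ys) / n
tss = sum((y - mean_y) ** 2 for y in ys)
r2_lin, r2_quad = 1 - rss_lin / tss, 1 - rss_quad / tss

aic_lin = n * math.log(rss_lin / n) + 2 * 2    # 2 coefficients
aic_quad = n * math.log(rss_quad / n) + 2 * 3  # 3 coefficients

# F statistic for adding the single quadratic term (nested models).
f_stat = ((rss_lin - rss_quad) / 1) / (rss_quad / (n - 3))

print(f"R2: linear={r2_lin:.4f}, quadratic={r2_quad:.4f}")
print(f"AIC: linear={aic_lin:.2f}, quadratic={aic_quad:.2f}")
print(f"F statistic for the quadratic term: {f_stat:.1f}")
```

The F statistic would be compared against an F distribution with 1 and n − 3 degrees of freedom; the p-value is omitted here to keep the sketch free of external libraries.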
Motulsky, Harvey J; Brown, Ronald E
2006-01-01
Background Nonlinear regression, like linear regression, assumes that the scatter of data around the ideal curve follows a Gaussian or normal distribution. This assumption leads to the familiar goal of regression: to minimize the sum of the squares of the vertical or Y-value distances between the points and the curve. Outliers can dominate the sum-of-the-squares calculation, and lead to misleading results. However, we know of no practical method for routinely identifying outliers when fitting curves with nonlinear regression. Results We describe a new method for identifying outliers when fitting data with nonlinear regression. We first fit the data using a robust form of nonlinear regression, based on the assumption that scatter follows a Lorentzian distribution. We devised a new adaptive method that gradually becomes more robust as the method proceeds. To define outliers, we adapted the false discovery rate approach to handling multiple comparisons. We then remove the outliers, and analyze the data using ordinary least-squares regression. Because the method combines robust regression and outlier removal, we call it the ROUT method. When analyzing simulated data, where all scatter is Gaussian, our method detects (falsely) one or more outlier in only about 1–3% of experiments. When analyzing data contaminated with one or several outliers, the ROUT method performs well at outlier identification, with an average False Discovery Rate less than 1%. Conclusion Our method, which combines a new method of robust nonlinear regression with a new method of outlier identification, identifies outliers from nonlinear curve fits with reasonable power and few false positives. PMID:16526949
Assessing NARCCAP climate model effects using spatial confidence regions.
French, Joshua P; McGinnis, Seth; Schwartzman, Armin
2017-01-01
We assess similarities and differences between model effects for the North American Regional Climate Change Assessment Program (NARCCAP) climate models using varying classes of linear regression models. Specifically, we consider how the average temperature effect differs for the various global and regional climate model combinations, including assessment of possible interaction between the effects of global and regional climate models. We use both pointwise and simultaneous inference procedures to identify regions where global and regional climate model effects differ. We also show conclusively that results from pointwise inference are misleading, and that accounting for multiple comparisons is important for making proper inference.
As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...
A simplified competition data analysis for radioligand specific activity determination.
Venturino, A; Rivera, E S; Bergoc, R M; Caro, R A
1990-01-01
Non-linear regression and two-step linear fit methods were developed to determine the actual specific activity of 125I-ovine prolactin by radioreceptor self-displacement analysis. The experimental results obtained by the different methods are superposable. The non-linear regression method is considered to be the most adequate procedure to calculate the specific activity, but if its software is not available, the other described methods are also suitable.
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements used to track nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a combination of these factors may not allow accurate estimation or measurement; in those cases, height and weight can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations whose coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all cases, the predictions are insensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
NASA Astrophysics Data System (ADS)
Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.
2009-08-01
In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman
2011-01-01
This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
NASA Astrophysics Data System (ADS)
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article compares the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation we employ data obtained from the tagi gas filling company during the period 2008-2010. The main result is that the method based on the condition number performs better than the other methods, since it has a smaller mean square error (MSE).
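The ridge idea in the abstract above can be illustrated with a minimal sketch. It assumes two centered predictors and solves (X'X + kI)b = X'y directly; the function name `ridge_2d`, the toy data, and the value of k are hypothetical and are not taken from the article, which compares several rules for choosing k.

```python
def ridge_2d(x1, x2, y, k):
    """Ridge estimate for two predictors: solve (X'X + k*I) b = X'y."""
    # Entries of X'X (with k added on the diagonal) and of X'y.
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    # Closed-form inverse of the 2x2 system.
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Hypothetical, nearly collinear toy data (both predictors are centered).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.0, -2.1, 0.2, 1.9, 4.0]

ols = ridge_2d(x1, x2, y, k=0.0)    # k = 0 reduces to ordinary least squares
ridge = ridge_2d(x1, x2, y, k=1.0)  # k > 0 shrinks the coefficient vector
print(ols, ridge)
```

Setting k to zero gives the ordinary least squares solution, so the same function can be used to see how much a given k shrinks the coefficients.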
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1 µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Jaworski, N W; Liu, D W; Li, D F; Stein, H H
2016-07-01
An experiment was conducted to determine effects on DE, ME, and NE for growing pigs of adding 15 or 30% wheat bran to a corn-soybean meal diet and to compare values for DE, ME, and NE calculated using the difference procedure with values obtained using linear regression. Eighteen barrows (54.4 ± 4.3 kg initial BW) were individually housed in metabolism crates. The experiment had 3 diets and 6 replicate pigs per diet. The control diet contained corn, soybean meal, and no wheat bran. Two additional diets were formulated by mixing 15 or 30% wheat bran with 85 or 70% of the control diet, respectively. The experimental period lasted 15 d. During the initial 7 d, pigs were adapted to their experimental diets and housed in metabolism crates and fed 573 kcal ME/kg BW per day. On d 8, metabolism crates with the pigs were moved into open-circuit respiration chambers for measurement of O2 consumption and CO2 and CH4 production. The feeding level was the same as in the adaptation period, and feces and urine were collected during this period. On d 13 and 14, pigs were fed 225 kcal ME/kg BW per day, and pigs were then fasted for 24 h to obtain fasting heat production. Results of the experiment indicated that the apparent total tract digestibility of DM, GE, crude fiber, ADF, and NDF linearly decreased (P ≤ 0.05) as wheat bran inclusion increased in the diets. The daily O2 consumption and CO2 and CH4 production by pigs fed increasing concentrations of wheat bran linearly decreased (P ≤ 0.05), resulting in a linear decrease (P ≤ 0.05) in heat production. The DE (3,454, 3,257, and 3,161 kcal/kg for diets containing 0, 15, and 30% wheat bran, respectively), ME (3,400, 3,209, and 3,091 kcal/kg for diets containing 0, 15, and 30% wheat bran, respectively), and NE (1,808, 1,575, and 1,458 kcal/kg for diets containing 0, 15, and 30% wheat bran, respectively) of diets decreased (linear, P ≤ 0.05) as wheat bran inclusion increased.
The DE, ME, and NE of wheat bran determined using the difference procedure were 2,168, 2,117, and 896 kcal/kg, respectively, and these values were within the 95% confidence interval of the DE (2,285 kcal/kg), ME (2,217 kcal/kg), and NE (961 kcal/kg) estimated by linear regression. In conclusion, increasing the inclusion of wheat bran in a corn-soybean meal based diet reduced energy and nutrient digestibility and heat production as well as DE, ME, and NE of diets, but values for DE, ME, and NE for wheat bran determined using the difference procedure were not different from values determined using linear regression.
Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.
Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong
2017-01-01
This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier: linear regression classification. Our method selects one axial slice from a 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is used as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Developing a dengue forecast model using machine learning: A case study in China
Zhang, Qin; Wang, Li; Xiao, Jianpeng; Zhang, Qingying; Luo, Ganfeng; Li, Zhihao; He, Jianfeng; Zhang, Yonghui; Ma, Wenjun
2017-01-01
Background In China, dengue remains an important public health issue with expanded areas and increased incidence recently. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use the state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue. Methodology/Principal findings Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011–2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using the autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China. Conclusion and significance The proposed SVR model achieved a superior performance in comparison with other forecasting techniques assessed in this study. 
The findings can help the government and community respond early to dengue epidemics. PMID:29036169
Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C
2011-09-01
Selective genotyping can increase power in quantitative trait association. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased genetic effect estimate. Here, we present a simple correction for the bias.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Penna, M.L.; Duchiade, M.P.
This study examines the relationship between air pollution, measured as concentration of suspended particulates in the atmosphere, and infant mortality due to pneumonia in the metropolitan area of Rio de Janeiro. Multiple linear regression (progressive or stepwise method) was used to analyze infant mortality due to pneumonia, diarrhea, and all causes in 1980, by geographic area, income level, and degree of contamination. While the variable proportion of families with income equivalent to more than two minimum wages was included in the regressions corresponding to the three types of infant mortality, the average contamination index had a statistically significant coefficient (b = 0.2208; t = 2.670; P = 0.0137) only in the case of mortality due to pneumonia. This would suggest a biological association, but, as in any ecological study, such conclusions should be viewed with caution. The authors believe that air quality indicators are essential to consider in studies of acute respiratory infections in developing countries.
2013-01-01
application of the Hammett equation with the constants rph in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795–811. [13...of oximes and OP compounds and the ability of oximes to reactivate OP- inhibited AChE. Multiple linear regression equations were analyzed using...phosphonate pairs, 21 oxime/ phosphoramidate pairs and 12 oxime/phosphate pairs. The best linear regression equation resulting from multiple regression anal
A new approach to assess COPD by identifying lung function break-points
Eriksson, Göran; Jarenbäck, Linnea; Peterson, Stefan; Ankerst, Jaro; Bjermer, Leif; Tufvesson, Ellen
2015-01-01
Purpose COPD is a progressive disease, which can take different routes, leading to great heterogeneity. The aim of the post-hoc analysis reported here was to perform continuous analyses of advanced lung function measurements, using linear and nonlinear regressions. Patients and methods Fifty-one COPD patients with mild to very severe disease (Global Initiative for Chronic Obstructive Lung Disease [GOLD] Stages I–IV) and 41 healthy smokers were investigated post-bronchodilation by flow-volume spirometry, body plethysmography, diffusion capacity testing, and impulse oscillometry. The relationship between COPD severity, based on forced expiratory volume in 1 second (FEV1), and different lung function parameters was analyzed by flexible nonparametric method, linear regression, and segmented linear regression with break-points. Results Most lung function parameters were nonlinear in relation to spirometric severity. Parameters related to volume (residual volume, functional residual capacity, total lung capacity, diffusion capacity [diffusion capacity of the lung for carbon monoxide], diffusion capacity of the lung for carbon monoxide/alveolar volume) and reactance (reactance area and reactance at 5Hz) were segmented with break-points at 60%–70% of FEV1. FEV1/forced vital capacity (FVC) and resonance frequency had break-points around 80% of FEV1, while many resistance parameters had break-points below 40%. The slopes in percent predicted differed; resistance at 5 Hz minus resistance at 20 Hz had a linear slope change of −5.3 per unit FEV1, while residual volume had no slope change above and −3.3 change per unit FEV1 below its break-point of 61%. Conclusion Continuous analyses of different lung function parameters over the spirometric COPD severity range gave valuable information additional to categorical analyses. 
Parameters related to volume, diffusion capacity, and reactance showed break-points around 65% of FEV1, indicating that air trapping starts to dominate in moderate COPD (FEV1 =50%–80%). This may have an impact on the patient’s management plan and selection of patients and/or outcomes in clinical research. PMID:26508849
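To make the break-point idea concrete, here is a simplified sketch of a segmented fit. It assumes a single break-point and fits two independent straight lines on either side of each candidate split, keeping the split with the smallest total squared error; this is a cruder procedure than a continuous segmented regression, and the function names and toy values are hypothetical rather than taken from the study.

```python
def ols_line(xs, ys):
    """Simple least-squares line; returns (intercept, slope, sse)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse


def best_breakpoint(xs, ys, min_pts=3):
    """Try every split of the sorted data and keep the lowest total SSE."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_pts, len(pairs) - min_pts + 1):
        left, right = pairs[:i], pairs[i:]
        _, s_left, sse_left = ols_line(*zip(*left))
        _, s_right, sse_right = ols_line(*zip(*right))
        total = sse_left + sse_right
        if best is None or total < best[0]:
            # Report the break-point as the midpoint between the two segments.
            midpoint = (left[-1][0] + right[0][0]) / 2.0
            best = (total, midpoint, s_left, s_right)
    return best[1], best[2], best[3]


# Hypothetical toy data: flat above x = 60, slope of -3 below it.
fev1 = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
param = [100.0 + 3.0 * (60.0 - x) if x < 60.0 else 100.0 for x in fev1]

bp, slope_below, slope_above = best_breakpoint(fev1, param)
print(bp, slope_below, slope_above)
```

With real, noisy measurements a continuous hinge model and a confidence interval for the break-point would usually be preferable to this grid search.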
Jang, Seung-Ho; Ryu, Han-Seung; Choi, Suck-Chei; Lee, Sang-Yeol
2016-01-01
Objectives The purpose of this study was to examine psychosocial factors related to gastroesophageal reflux disease (GERD) and their effects on quality of life (QOL) in firefighters. Methods Data were collected from 1217 firefighters in a Korean province. We measured psychological symptoms using the scale. In order to observe the influence of the high-risk group on occupational stress, we conducted logistic multiple linear regression. The correlation between psychological factors and QOL was also analyzed, and a hierarchical regression analysis was performed. Results GERD was observed in 32.2% of subjects. Subjects with GERD showed higher depressive symptom, anxiety, and occupational stress scores, and lower self-esteem and QOL scores, relative to those observed in GERD-negative subjects. GERD risk was higher for the following occupational stress subcategories: job demand, lack of reward, interpersonal conflict, and occupational climate. The stepwise regression analysis showed that depressive symptoms, occupational stress, self-esteem, and anxiety were the best predictors of QOL. Conclusions The results suggest that psychological and medical approaches should be combined in GERD assessment. PMID:27691373
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws
Xiao, X.; White, E.P.; Hooten, M.B.; Durham, S.L.
2011-01-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain. © 2011 by the Ecological Society of America.
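The LR-on-log-transformed-data approach discussed in this abstract can be sketched as follows. This is an illustrative example only, not the authors' published code: the function name, the simulated parameters (a = 3, b = 0.75, lognormal sigma = 0.2), and the random seed are assumptions chosen to mimic the multiplicative lognormal error case in which the abstract reports LR performing well.

```python
import math
import random


def fit_power_law_loglog(x, y):
    """Fit y = a * x**b by ordinary least squares on (log x, log y)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    # Back-transforming the intercept gives a on the median (geometric) scale.
    a = math.exp(my - b * mx)
    return a, b


# Noise-free check: y = 2 * x**1.5 is recovered up to rounding error.
x_exact = [1.0, 2.0, 4.0, 8.0]
y_exact = [2.0 * v ** 1.5 for v in x_exact]
a_exact, b_exact = fit_power_law_loglog(x_exact, y_exact)

# Simulated data with multiplicative lognormal error (hypothetical parameters).
rng = random.Random(0)
x_sim = [float(i) for i in range(1, 201)]
y_sim = [3.0 * v ** 0.75 * rng.lognormvariate(0.0, 0.2) for v in x_sim]
a_hat, b_hat = fit_power_law_loglog(x_sim, y_sim)
print(round(a_hat, 3), round(b_hat, 3))
```

For data with additive, homoscedastic normal error, the abstract indicates that a nonlinear least-squares fit on the original scale would be the better choice; that case is not covered by this sketch.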
Specialization Agreements in the Council for Mutual Economic Assistance
1988-02-01
proportions to stabilize variance (S. Weisberg, Applied Linear Regression, 2nd ed., John Wiley & Sons, New York, 1985, p. 134). If the dependent...27, 1986, p. 3. Weisberg, S., Applied Linear Regression, 2nd ed., John Wiley & Sons, New York, 1985, p. 134. Wiles, P. J., Communist International
Radio Propagation Prediction Software for Complex Mixed Path Physical Channels
2006-08-14
63 4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz 69 4.4.7. Projected Scaling to...4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz In order to construct a comprehensive numerical algorithm capable of
Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...
Data Transformations for Inference with Linear Regression: Clarifications and Recommendations
ERIC Educational Resources Information Center
Pek, Jolynn; Wong, Octavia; Wong, C. M.
2017-01-01
Data transformations have been promoted as a popular and easy-to-implement remedy to address the assumption of normally distributed errors (in the population) in linear regression. However, the application of data transformations introduces non-ignorable complexities which should be fully appreciated before their implementation. This paper adds to…
USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES
The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...
Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis
ERIC Educational Resources Information Center
Camilleri, Liberato; Cefai, Carmel
2013-01-01
Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…
Simple and multiple linear regression: sample size considerations.
Hanley, James A
2016-11-01
The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong
2018-01-01
Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods. PMID:29623088
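For readers unfamiliar with LWLR, a minimal one-dimensional sketch is given here. It assumes a Gaussian kernel with bandwidth `tau` and a single input feature; the function name, kernel choice, and toy data are illustrative assumptions and do not reproduce the FCLWLR feature construction described in the abstract.

```python
import math


def lwlr_predict(x_query, xs, ys, tau=1.0):
    """Predict at x_query from a weighted straight-line fit centered on x_query."""
    # Gaussian kernel weights: training points near the query count more.
    w = [math.exp(-((x - x_query) ** 2) / (2.0 * tau ** 2)) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    # Weighted least-squares slope around the weighted means.
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    return my + slope * (x_query - mx)


# Hypothetical toy data: an exactly linear relation is reproduced at any query.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x + 1.0 for x in xs]
print(lwlr_predict(2.5, xs, ys, tau=1.0))
```

Because a separate weighted fit is computed for every query point, no single global parameter vector is learned, which is the sense in which the abstract calls LWLR nonparametric.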
High Maternal Blood Mercury Level Is Associated with Low Verbal IQ in Children.
Jeong, Kyoung Sook; Park, Hyewon; Ha, Eunhee; Shin, Jiyoung; Hong, Yun Chul; Ha, Mina; Park, Hyesook; Kim, Bung Nyun; Lee, Boeun; Lee, Soo Jeong; Lee, Kyung Yeon; Kim, Ja Hyeong; Kim, Yangho
2017-07-01
The objective of the present study was to investigate the relationship of IQ in children with maternal blood mercury concentration during late pregnancy. The present study is a component of the Mothers and Children's Environmental Health (MOCEH) study, a multi-center birth cohort project in Korea that began in 2006. The study cohort consisted of 553 children whose mothers underwent testing for blood mercury during late pregnancy. The children were given the Korean language version of the Wechsler Preschool and Primary Scale of Intelligence, revised edition (WPPSI-R) at 60 months of age. Multivariate linear regression analysis, with adjustment for covariates, was used to assess the relationship between verbal, performance, and total IQ in children and blood mercury concentration of mothers during late pregnancy. The results of multivariate linear regression analysis indicated that a doubling of blood mercury was associated with the decrease in verbal and total IQ by 2.482 (95% confidence interval [CI], 0.749-4.214) and 2.402 (95% CI, 0.526-4.279), respectively, after adjustment. This inverse association remained after further adjustment for blood lead concentration. Fish intake is an effect modifier of child IQ. In conclusion, high maternal blood mercury level is associated with low verbal IQ in children. © 2017 The Korean Academy of Medical Sciences.
Impact of Trichiasis Surgery on Physical Functioning in Ethiopian Patients: STAR Trial
Wolle, Meraf A.; Cassard, Sandra D.; Gower, Emily W.; Munoz, Beatriz E.; Wang, Jiangxia; Alemayehu, Wondu; West, Sheila K.
2010-01-01
Purpose To evaluate the physical functioning of Ethiopian trichiasis surgery patients before and six months after surgery. Design Nested Cohort Study Methods This study was nested within the Surgery for Trichiasis, Antibiotics to Prevent Recurrence (STAR) clinical trial conducted in Ethiopia. Demographic information, ocular examinations, and physical functioning assessments were collected before and 6 months after surgery. A single score for patients’ physical functioning was constructed using Rasch analysis. A multivariate linear regression model was used to determine if change in physical functioning was associated with change in visual acuity. Results Of the 438 participants, 411 (93.8%) had both baseline and follow-up questionnaires. Physical functioning scores at baseline ranged from −6.32 (great difficulty) to +6.01 (no difficulty). The percent of participants reporting no difficulty in physical functioning increased by 32.6%; the proportion of participants in the mild/no visual impairment category increased by 8.6%. A multivariate linear regression model showed that for every line of vision gained, physical functioning improves significantly (0.09 units; 95% CI: 0.02–0.16). Conclusions Surgery to correct trichiasis appears to improve patients’ physical functioning as measured at 6 months. More effort in promoting trichiasis surgery is essential, not only to prevent corneal blindness, but also to enable improved functioning in daily life. PMID:21333268
2014-01-01
Background It is not well established how psychosocial factors like social support and depression affect health-related quality of life in multimorbid and elderly patients. We investigated whether depressive mood mediates the influence of social support on health-related quality of life. Methods Cross-sectional data of 3,189 multimorbid patients from the baseline assessment of the German MultiCare cohort study were used. Mediation was tested using the approach described by Baron and Kenny based on multiple linear regression, and controlling for socioeconomic variables and burden of multimorbidity. Results Mediation analyses confirmed that depressive mood mediates the influence of social support on health-related quality of life (Sobel’s p < 0.001). Multiple linear regression showed that the influence of depressive mood (β = −0.341, p < 0.01) on health-related quality of life is greater than the influence of multimorbidity (β = −0.234, p < 0.01). Conclusion Social support influences health-related quality of life, but this association is strongly mediated by depressive mood. Depression should be taken into consideration in research on multimorbidity, and clinicians should be aware of its importance when caring for multimorbid patients. Trial registration ISRCTN89818205 PMID:24708815
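The Sobel test mentioned in this abstract can be sketched as below. In the Baron and Kenny framework, a is the regression coefficient of the mediator on the predictor and b is the coefficient of the outcome on the mediator (adjusting for the predictor); the numeric values used here are hypothetical and are not the study's estimates.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """Sobel z statistic for the indirect effect a*b."""
    return (a * b) / math.sqrt(b * b * se_a * se_a + a * a * se_b * se_b)


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical path coefficients and standard errors, for illustration only.
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(round(z, 3), two_sided_p(z))
```

The Sobel statistic relies on a normal approximation for the product a*b; bootstrap confidence intervals are a common alternative when that approximation is doubtful.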
Sharif, Nasim
2010-01-01
Objective This study was conducted to compare the personal well-being among the wives of Iranian veterans living in the city of Qom. Method A sample of 300 was randomly selected from a database containing the addresses of veteran's families at Iran's Veterans Foundation in Qom (Bonyad-e-Shahid va Omoore Isargaran). The veterans' wives were divided into three groups: wives of martyrs (killed veterans), wives of prisoners of war, and wives of disabled veterans. The Persian translation of Personal Well-being Index and Stress Symptoms Checklist (SSC) were administered for data collection. Four women chose not to respond to Personal Well-being Index. Data were then analyzed using linear multivariate regression (stepwise method), analysis of variance, and by computing the correlation between variables. Results Results showed a negative correlation between well-being and stress symptoms. However, each group demonstrated different levels of stress symptoms. Furthermore, multivariate linear regression in the 3 groups showed that overall satisfaction of life and personal well-being (total score and its domains) could be predicted by different symptoms. Conclusion Each group experienced different challenges and thus different stress symptoms. Therefore, although they all need help, each group needs to be helped in a different way. PMID:22952487
Abbaspour, Seddigheh; Farmanbar, Rabiollah; Njafi, Fateme; Ghiasvand, Arezoo Mohamadkhani; Dehghankar, Leila
2017-01-01
Background Regular physical activity is considered health-promoting, and identifying the psychosocial variables that affect physical activity has proven to be essential. Objective To identify the relationship between decisional balance and self-efficacy in physical activities using the transtheoretical model in the members of a retirement center in Rasht, Guilan. Methods A descriptive cross-sectional study was conducted in 2013 using convenience sampling on 262 elderly people who were members of retirement centers in Rasht. Data were collected using the Stages of Change, Decisional Balance, and Self-efficacy scales and the Physical Activity Scale for the Elderly (PASE). Data were analyzed using SPSS-16 software with descriptive and analytic statistics (Pearson correlation, Spearman, ANOVA, Tukey HSD, linear and ordinal regression). Results The majority of participants were in the maintenance stage. The mean and standard deviation of physical activity for the elderly was 119.35±51.50. Stages of change and physical activities were significantly associated with decisional balance and self-efficacy (p<0.0001); however, cons had a significant inverse association. According to linear and ordinal regression, the only predictor variable of physical activity behavior was self-efficacy. Conclusion Increasing the perceived pros of, and self-efficacy for, physical activity can be useful in designing appropriate intervention programs. PMID:28713520
Relationship between Gender Roles and Sexual Assertiveness in Married Women
Azmoude, Elham; Firoozi, Mahbobe; Sadeghi Sahebzad, Elahe; Asgharipour, Neghar
2016-01-01
ABSTRACT Background: Evidence indicates that sexual assertiveness is one of the important factors affecting sexual satisfaction. According to some studies, traditional gender norms conflict with women’s capability in expressing sexual desires. This study examined the relationship between gender roles and sexual assertiveness in married women in Mashhad, Iran. Methods: This cross-sectional study was conducted on 120 women attending Mashhad health centers, recruited by convenience sampling in 2014-15. Data were collected using the Bem Sex Role Inventory (BSRI) and the Hulbert index of sexual assertiveness. Data were analyzed using SPSS 16 by Pearson and Spearman’s correlation tests and linear regression analysis. Results: The mean score of sexual assertiveness was 54.93±13.20. According to the findings, there was a non-significant correlation of femininity and masculinity scores with sexual assertiveness (P=0.069 and P=0.080, respectively). Linear regression analysis indicated that, among the predictor variables, only sexual function satisfaction was identified as a predictor of sexual assertiveness (P=0.001). Conclusion: Based on the results, sexual assertiveness in married women is not related to gender role, but it is related to sexual function satisfaction. Counseling psychologists therefore need to consider this variable when designing intervention programs for modifying sexual assertiveness, and to look for other variables that affect sexual assertiveness. PMID:27713899
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
Genetic Programming Transforms in Linear Regression Situations
NASA Astrophysics Data System (ADS)
Castillo, Flor; Kordon, Arthur; Villa, Carlos
The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transforms selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.
Naval Research Logistics Quarterly. Volume 28. Number 3,
1981-09-01
denotes component-wise maximum. f has antitone (isotone) differences on C x D if for c1 < c2 and d1 < d2, NAVAL RESEARCH LOGISTICS QUARTERLY VOL. 28...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions
Automating approximate Bayesian computation by local linear regression.
Thornton, Kevin R
2009-07-07
In several biological contexts, parameter inference often relies on computationally-intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully-documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local-linear regression.
NASA Astrophysics Data System (ADS)
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by a Data Acquisition and Evaluation System. The authors coupled the collected TBM drive data with available information on rock mass properties; the data were cleansed, supplemented with secondary variables, and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were fitted for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable, and screening the data with different computational approaches allowed enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and response variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than the constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.
Haoliang Yuan; Yuan Yan Tang
2017-04-01
Classification of the pixels in a hyperspectral image (HSI) is an important task and has been widely applied in many practical applications. Its major challenge is the high-dimensional small-sized problem. To deal with this problem, many subspace learning (SL) methods have been developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by the ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Compared with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, which is formed by the original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas, demonstrate that our proposed methods outperform many SL methods.
Arnaoutakis, George J.; George, Timothy J.; Alejo, Diane E.; Merlo, Christian A.; Baumgartner, William A.; Cameron, Duke E.; Shah, Ashish S.
2011-01-01
Context The impact of Society of Thoracic Surgeons (STS) predicted mortality risk score on resource utilization after aortic valve replacement (AVR) has not been previously studied. Objective We hypothesize that increasing STS risk scores in patients having AVR are associated with greater hospital charges. Design, Setting, and Patients Clinical and financial data for patients undergoing AVR at a tertiary care, university hospital over a ten-year period (1/2000–12/2009) were retrospectively reviewed. The current STS formula (v2.61) for in-hospital mortality was used for all patients. After stratification into risk quartiles (Q), index admission hospital charges were compared across risk strata with Rank-Sum tests. Linear regression and Spearman’s coefficient assessed correlation and goodness of fit. Multivariable analysis assessed relative contributions of individual variables on overall charges. Main Outcome Measures Inflation-adjusted index hospitalization total charges. Results 553 patients had AVR during the study period. Average predicted mortality was 2.9% (±3.4) and actual mortality was 3.4% for AVR. Median charges were greater in the upper Q of AVR patients [Q1–3, $39,949 (IQR 32,708–51,323) vs Q4, $62,301 (IQR 45,952–97,103), p<0.01]. On univariate linear regression, there was a positive correlation between STS risk score and log-transformed charges (coefficient: 0.06, 95% CI 0.05–0.07, p<0.01). Spearman’s correlation R-value was 0.51. This positive correlation persisted in risk-adjusted multivariable linear regression. Each 1% increase in STS risk score was associated with an added $3,000 in hospital charges. Conclusions This study showed increasing STS risk score predicts greater charges after AVR. As competing therapies such as percutaneous valve replacement emerge to treat high risk patients, these results serve as a benchmark to compare resource utilization. PMID:21497834
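A coefficient on a log-transformed outcome, like the one reported in the record above, is usually read as a relative change. The sketch below shows that conversion for the reported coefficient and its confidence limits. It assumes the charges were transformed with the natural logarithm, which the abstract does not state; the function name is illustrative only.

```python
import math

def pct_change_per_unit(beta):
    """Percent change in the outcome per one-unit increase in the predictor,
    for a model of the form ln(y) = a + beta * x (natural log is an assumption)."""
    return (math.exp(beta) - 1.0) * 100.0

# Coefficient and 95% CI limits as reported in the abstract.
coef, ci_low, ci_high = 0.06, 0.05, 0.07

estimate = pct_change_per_unit(coef)
interval = (pct_change_per_unit(ci_low), pct_change_per_unit(ci_high))

print(f"{estimate:.1f}% per risk-score point "
      f"(95% CI {interval[0]:.1f}% to {interval[1]:.1f}%)")
```

Under that assumption, each additional point of STS risk score corresponds to a relative increase of roughly six percent in charges, so the absolute dollar increase depends on the baseline charge level.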
Can Functional Cardiac Age be Predicted from ECG in a Normal Healthy Population
NASA Technical Reports Server (NTRS)
Schlegel, Todd; Starc, Vito; Leban, Manja; Sinigoj, Petra; Vrhovec, Milos
2011-01-01
In a normal healthy population, we desired to determine the most age-dependent conventional and advanced ECG parameters. We hypothesized that changes in several ECG parameters might correlate with age and together reliably characterize the functional age of the heart. Methods: An initial study population of 313 apparently healthy subjects was ultimately reduced to 148 subjects (74 men, 84 women, in the range from 10 to 75 years of age) after exclusion criteria. In all subjects, ECG recordings (resting 5-minute 12-lead high frequency ECG) were evaluated via custom software programs to calculate up to 85 different conventional and advanced ECG parameters including beat-to-beat QT and RR variability, waveform complexity, and signal-averaged, high-frequency and spatial/spatiotemporal ECG parameters. The prediction of functional age was evaluated by multiple linear regression analysis using the best 5 univariate predictors. Results: Ignoring what were ultimately small differences between males and females, the functional age was found to be predicted (R2= 0.69, P < 0.001) from a linear combination of 5 independent variables: QRS elevation in the frontal plane (p<0.001), a new repolarization parameter QTcorr (p<0.001), mean high frequency QRS amplitude (p=0.009), the variability parameter % VLF of RRV (p=0.021) and the P-wave width (p=0.10). Here, QTcorr represents the correlation between the calculated QT and the measured QT signal. Conclusions: In apparently healthy subjects with normal conventional ECGs, functional cardiac age can be estimated by multiple linear regression analysis of mostly advanced ECG results. Because some parameters in the regression formula, such as QTcorr, high frequency QRS amplitude and P-wave width also change with disease in the same direction as with increased age, increased functional age of the heart may reflect subtle age-related pathologies in cardiac electrical function that are usually hidden on conventional ECG.
Socio-economic factors associated with infant mortality in Italy: an ecological study
2012-01-01
Introduction One issue that continues to attract the attention of public health researchers is the possible relationship in high-income countries between income, income inequality and infant mortality (IM). The aim of this study was to assess the associations between IM and major socio-economic determinants in Italy. Methods Associations between infant mortality rates in the 20 Italian regions (2006–2008) and the Gini index of income inequality, mean household income, percentage of women with at least 8 years of education, and percentage of unemployed aged 15–64 years were assessed using Pearson correlation coefficients. Univariate linear regression and multiple stepwise linear regression analyses were performed to determine the magnitude and direction of the effect of the four socio-economic variables on IM. Results The Gini index and the total unemployment rate showed a positive strong correlation with IM (r = 0.70; p < 0.001 and r = 0.84; p < 0.001 respectively), mean household income showed a strong negative correlation (r = −0.78; p < 0.001), while female educational attainment presented a weak negative correlation (r = −0.45; p < 0.05). Using a multiple stepwise linear regression model, only unemployment rate was independently associated with IM (b = 0.15, p < 0.001). Conclusions In Italy, a high-income country where health care is universally available, variations in IM were strongly associated with relative and absolute income and unemployment rate. These results suggest that in Italy IM is not only related to income distribution, as demonstrated for other developed countries, but also to economic factors such as absolute income and unemployment. In order to reduce IM and the existing inequalities, the challenge for Italian decision makers is to promote economic growth and enhance employment levels. PMID:22898293
Simple linear and multivariate regression models.
Rodríguez del Águila, M M; Benítez-Parejo, N
2011-01-01
In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.
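As an illustration of how a simple linear regression is calculated, here is a minimal closed-form least-squares sketch. The article itself works in R; Python is used here only for consistency with the other examples in this listing, and the data are made up for the demonstration.

```python
def fit_simple_linear(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns slope, intercept, R^2 and the residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2, residuals

# Hypothetical data, not taken from the article.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r2, residuals = fit_simple_linear(x, y)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.4f}")
```

The returned residuals are what one would plot against the fitted values when checking the applicability assumptions (linearity, constant variance) that the article discusses.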
Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M
2017-04-01
A carboxy methyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt % dimethyl dialkyl (C14-C18) amine (DMDA)) composite was prepared by the solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-ray diffraction spectroscopy (XRD) and scanning electron microscopy (SEM). The composite was evaluated for its sorption efficiency for the pesticides atrazine, imidacloprid and thiamethoxam. The sorption data were fitted to Langmuir and Freundlich isotherms using linear and non-linear methods. The linear regression method suggested that the sorption data were best fitted by the Type II Langmuir and Freundlich isotherms. In order to avoid the bias resulting from linearization, seven different error parameters were also analyzed by the non-linear regression method. The non-linear error analysis suggested that the sorption data fitted the Langmuir model better than the Freundlich model. The maximum sorption capacity, Q0 (μg/g), was highest for imidacloprid (2000), followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the coefficient of determination of linear regression alone cannot be used for comparing the fit of the Langmuir and Freundlich models, and that non-linear error analysis needs to be done to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.
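The contrast between linearized and non-linear isotherm fitting can be sketched as follows. This is not the authors' procedure or data: the concentrations, the "true" parameters (Q0 = 1500, b = 0.08) and the perturbation factors are hypothetical, and the non-linear fit uses a simple profile search written for this example rather than any particular statistics package. The Type II form is taken to be 1/q = 1/Q0 + (1/(b·Q0))·(1/C).

```python
import math

def langmuir(c, q0, b):
    """Langmuir isotherm: amount sorbed q at equilibrium concentration c."""
    return q0 * b * c / (1.0 + b * c)

def fit_linearized_type2(conc, sorbed):
    """Regress 1/q on 1/C; intercept = 1/Q0 and slope = 1/(b*Q0)."""
    u = [1.0 / c for c in conc]
    v = [1.0 / q for q in sorbed]
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    slope = (sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
             / sum((ui - mu) ** 2 for ui in u))
    intercept = mv - slope * mu
    return 1.0 / intercept, intercept / slope  # (Q0, b)

def profile_sse(b, conc, sorbed):
    """For a fixed b the model is linear in Q0, so Q0 has a closed form."""
    f = [b * c / (1.0 + b * c) for c in conc]
    q0 = sum(fi * qi for fi, qi in zip(f, sorbed)) / sum(fi * fi for fi in f)
    sse = sum((qi - q0 * fi) ** 2 for fi, qi in zip(f, sorbed))
    return sse, q0

def fit_nonlinear(conc, sorbed, b_min=1e-4, b_max=1e2, steps=400):
    """Minimise the untransformed squared error: log grid, then golden section."""
    grid = [b_min * (b_max / b_min) ** (i / steps) for i in range(steps + 1)]
    best = min(range(steps + 1),
               key=lambda i: profile_sse(grid[i], conc, sorbed)[0])
    lo = math.log(grid[max(best - 1, 0)])
    hi = math.log(grid[min(best + 1, steps)])
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        sse1 = profile_sse(math.exp(x1), conc, sorbed)[0]
        sse2 = profile_sse(math.exp(x2), conc, sorbed)[0]
        if sse1 < sse2:
            hi = x2
        else:
            lo = x1
    b = math.exp(0.5 * (lo + hi))
    return profile_sse(b, conc, sorbed)[1], b  # (Q0, b)

# Hypothetical equilibrium concentrations and noise-free sorption values.
conc = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
clean = [langmuir(c, 1500.0, 0.08) for c in conc]
q0_lin, b_lin = fit_linearized_type2(conc, clean)
q0_nl, b_nl = fit_nonlinear(conc, clean)

# Hypothetical measurement error: the same points perturbed by a few percent.
factors = [1.10, 0.93, 1.04, 0.97, 1.02, 0.99]
noisy = [q * f for q, f in zip(clean, factors)]
print("linearized:", fit_linearized_type2(conc, noisy))
print("non-linear:", fit_nonlinear(conc, noisy))
```

On error-free data both routes recover the same parameters. Once the points are perturbed, the two estimates diverge, because the reciprocal transform re-weights the errors (small q values dominate 1/q). That is the linearization bias the abstract refers to.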
Estimation of stature from the foot and its segments in a sub-adult female population of North India
2011-01-01
Background Establishing personal identity is one of the main concerns in forensic investigations. Estimation of stature forms a basic domain of the investigation process in unknown and co-mingled human remains in forensic anthropology case work. The objective of the present study was to set up standards for estimation of stature from the foot and its segments in a sub-adult female population. Methods The sample for the study constituted 149 young females from the Northern part of India. The participants were aged between 13 and 18 years. Besides stature, seven anthropometric measurements that included length of the foot from each toe (T1, T2, T3, T4, and T5 respectively), foot breadth at ball (BBAL) and foot breadth at heel (BHEL) were measured on both feet in each participant using standard methods and techniques. Results The results indicated that statistically significant differences (p < 0.05) between left and right feet occur in both the foot breadth measurements (BBAL and BHEL). Foot length measurements (T1 to T5 lengths) did not show any statistically significant bilateral asymmetry. The correlation between stature and all the foot measurements was found to be positive and statistically significant (p-value < 0.001). Linear regression models and multiple regression models were derived for estimation of stature from the measurements of the foot. The present study indicates that anthropometric measurements of foot and its segments are valuable in the estimation of stature. Foot length measurements estimate stature with greater accuracy when compared to foot breadth measurements. Conclusions The present study concluded that foot measurements have a strong relationship with stature in the sub-adult female population of North India. Hence, the stature of an individual can be successfully estimated from the foot and its segments using different regression models derived in the study. 
The regression models derived in the study may be applied successfully for the estimation of stature in sub-adult females, whenever foot remains are brought for forensic examination. Stepwise multiple regression models tend to estimate stature more accurately than linear regression models in female sub-adults. PMID:22104433
1994-09-01
Institute of Technology, Wright-Patterson AFB OH, January 1994. 4. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 5...Technology, Wright-Patterson AFB OH 5 April 1994. 29. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 30. Office of
An Evaluation of the Automated Cost Estimating Integrated Tools (ACEIT) System
1989-09-01
residual and it is described as the residual divided by its standard deviation (13:App A,17). Neter, Wasserman, and Kutner, in Applied Linear Regression Models...others. Applied Linear Regression Models. Homewood IL: Irwin, 1983. 19. Raduchel, William J. "A Professional’s Perspective on User-Friendliness," Byte
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
ERIC Educational Resources Information Center
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
Conjoint Analysis: A Study of the Effects of Using Person Variables.
ERIC Educational Resources Information Center
Fraas, John W.; Newman, Isadore
Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…
Fitting program for linear regressions according to Mahon (1996)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trappitsch, Reto G.
2018-01-09
This program takes the user's input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple interface.
How Robust Is Linear Regression with Dummy Variables?
ERIC Educational Resources Information Center
Blankmeyer, Eric
2006-01-01
Researchers in education and the social sciences make extensive use of linear regression models in which the dependent variable is continuous-valued while the explanatory variables are a combination of continuous-valued regressors and dummy variables. The dummies partition the sample into groups, some of which may contain only a few observations.…
Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method
ERIC Educational Resources Information Center
Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Sülzle, Detlev
2018-01-01
The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…
ERIC Educational Resources Information Center
Thompson, Russel L.
Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of homoscedasticity and types of homoscedasticity are discussed, and methods for correction are…
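One mathematical check of the constant-variance assumption can be sketched as a Goldfeld-Quandt-style variance ratio. This is only one common choice and is not necessarily a method demonstrated in the paper: the observations are ordered by the predictor, the middle third is dropped, separate lines are fitted to the low and high groups, and the residual mean squares are compared. The data are synthetic, and no p-value is computed because the standard library has no F distribution.

```python
def ols(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_mean_square(x, y):
    """Residual sum of squares divided by its degrees of freedom (n - 2)."""
    slope, intercept = ols(x, y)
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return rss / (len(x) - 2)

def variance_ratio(x, y):
    """Residual mean square of the high-x group over that of the low-x group."""
    pairs = sorted(zip(x, y))           # order observations by the predictor
    n = len(pairs)
    k = (n - n // 3) // 2               # group size after dropping the middle third
    low, high = pairs[:k], pairs[-k:]
    rms_low = residual_mean_square([p[0] for p in low], [p[1] for p in low])
    rms_high = residual_mean_square([p[0] for p in high], [p[1] for p in high])
    return rms_high / rms_low

# Synthetic data: deterministic alternating errors around y = 2 + 3x.
xs = [float(i) for i in range(1, 31)]
signs = [1.0 if i % 2 == 0 else -1.0 for i in range(1, 31)]
y_constant = [2 + 3 * x + s * 1.0 for x, s in zip(xs, signs)]        # constant spread
y_growing = [2 + 3 * x + s * 0.1 * x for x, s in zip(xs, signs)]     # spread grows with x

ratio_constant = variance_ratio(xs, y_constant)
ratio_growing = variance_ratio(xs, y_growing)
print(f"constant spread: {ratio_constant:.2f}, growing spread: {ratio_growing:.2f}")
```

A ratio near 1 is consistent with homoscedastic errors, while a ratio far above 1 suggests that the error variance increases with the predictor. In practice the ratio would be compared against an F distribution with (k − 2, k − 2) degrees of freedom.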
On the null distribution of Bayes factors in linear regression
USDA-ARS's Scientific Manuscript database
We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
Common pitfalls in statistical analysis: Linear regression analysis
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.
Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo
2015-08-01
Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
Chen, Tsung-Fu; Liang, Jyh-Chong; Lin, Tzu-Bin; Tsai, Chin-Chung
2016-01-01
Background Compared with the traditional ways of gaining health-related information from newspapers, magazines, radio, and television, the Internet is inexpensive, accessible, and conveys diverse opinions. Several studies on how increasing Internet use affected outpatient clinic visits were inconclusive. Objective The objective of this study was to examine the role of Internet use on ambulatory care-seeking behaviors as indicated by the number of outpatient clinic visits after adjusting for confounding variables. Methods We conducted this study using a sample randomly selected from the general population in Taiwan. To handle the missing data, we built a multivariate logistic regression model for propensity score matching using age and sex as the independent variables. The questionnaires with no missing data were then included in a multivariate linear regression model for examining the association between Internet use and outpatient clinic visits. Results We included a sample of 293 participants who answered the questionnaire with no missing data in the multivariate linear regression model. We found that Internet use was significantly associated with more outpatient clinic visits (P=.04). The participants with chronic diseases tended to make more outpatient clinic visits (P<.01). Conclusions The inconsistent quality of health-related information obtained from the Internet may be associated with patients’ increasing need for interpreting and discussing the information with health care professionals, thus resulting in an increasing number of outpatient clinic visits. In addition, the media literacy of Web-based health-related information seekers may also affect their ambulatory care-seeking behaviors, such as outpatient clinic visits. PMID:27927606
Physical Function in Older Men With Hyperkyphosis
Harrison, Stephanie L.; Fink, Howard A.; Marshall, Lynn M.; Orwoll, Eric; Barrett-Connor, Elizabeth; Cawthon, Peggy M.; Kado, Deborah M.
2015-01-01
Background. Age-related hyperkyphosis has been associated with poor physical function and is a well-established predictor of adverse health outcomes in older women, but its impact on health in older men is less well understood. Methods. We conducted a cross-sectional study to evaluate the association of hyperkyphosis and physical function in 2,363 men, aged 71–98 (M = 79) from the Osteoporotic Fractures in Men Study. Kyphosis was measured using the Rancho Bernardo Study block method. Measurements of grip strength and lower extremity function, including gait speed over 6 m, narrow walk (measure of dynamic balance), repeated chair stands ability and time, and lower extremity power (Nottingham Power Rig) were included separately as primary outcomes. We investigated associations of kyphosis and each outcome in age-adjusted and multivariable linear or logistic regression models, controlling for age, clinic, education, race, bone mineral density, height, weight, diabetes, and physical activity. Results. In multivariate linear regression, we observed a dose-related response of worse scores on each lower extremity physical function test as number of blocks increased, p for trend ≤.001. Using a cutoff of ≥4 blocks, 20% (N = 469) of men were characterized with hyperkyphosis. In multivariate logistic regression, men with hyperkyphosis had increased odds (range 1.5–1.8) of being in the worst quartile of performing lower extremity physical function tasks (p < .001 for each outcome). Kyphosis was not associated with grip strength in any multivariate analysis. Conclusions. Hyperkyphosis is associated with impaired lower extremity physical function in older men. Further studies are needed to determine the direction of causality. PMID:25431353
Ma, Jing; Yu, Jiong; Hao, Guangshu; Wang, Dan; Sun, Yanni; Lu, Jianxin; Cao, Hongcui; Lin, Feiyan
2017-02-20
The prevalence of hyperlipemia is increasing around the world. Our aims were to analyze the relationship of triglyceride (TG) and cholesterol (TC) with indexes of liver function and kidney function, and to develop a prediction model of TG and TC in overweight people. A total of 302 healthy adult subjects and 273 overweight subjects were enrolled in this study. The fasting levels of TG (fs-TG), TC (fs-TC), blood glucose, liver function, and kidney function were measured and analyzed by correlation analysis and multiple linear regression (MLR). A back propagation artificial neural network (BP-ANN) was applied to develop prediction models of fs-TG and fs-TC. The results showed significant differences in biochemical indexes between healthy people and overweight people. The correlation analysis showed that fs-TG was related to weight, height, blood glucose, and indexes of liver and kidney function, while fs-TC was correlated with age and indexes of liver function (P < 0.01). The MLR analysis indicated that the regression equations of fs-TG and fs-TC were both statistically significant (P < 0.01) when the independent indexes were included. The BP-ANN model of fs-TG reached its training goal at 59 epochs, while the fs-TC model achieved high prediction accuracy after training for 1000 epochs. In conclusion, fs-TG and fs-TC were strongly related to weight, height, age, blood glucose, and indexes of liver function and kidney function. Based on related variables, fs-TG and fs-TC can be predicted by BP-ANN models in overweight people.
Zhang, Zili; Wang, Jian; Zheng, Zeguang; Chen, Xindong; Zeng, Xiansheng; Zhang, Yi; Li, Defu; Shu, Jiaze; Yang, Kai; Lai, Ning; Dong, Lian
2017-01-01
Background Convincing evidence has demonstrated associations between HHIP and FAM13a polymorphisms and COPD in non-Asian populations. Here genetic variants in HHIP and FAM13a were investigated in Southern Han Chinese patients with COPD. Methods A case-control study was conducted, including 989 cases and 999 controls. Associations between SNP genotypes and COPD were assessed with a logistic regression model; for SNPs and COPD-related phenotypes such as lung function, COPD severity, pack-years of smoking, and smoking status, a linear regression model was employed. Effects of risk alleles, genotypes, and haplotypes of the 3 significant SNPs in the HHIP gene on FEV1/FVC were also assessed in a linear regression model in COPD. Results The mean FEV1/FVC% value was 46.8 in the combined COPD population. None of the 8 selected SNPs appeared to be related to COPD susceptibility. However, three SNPs (rs12509311, rs13118928, and rs182859) in HHIP were significantly associated with FEV1/FVC% (Pmax = 4.1 × 10^−4) in COPD after adjusting for gender, age, and smoking pack-years. Moreover, statistically significant associations of risk alleles (P = 2.3 × 10^−4) and risk genotypes (P = 3.5 × 10^−4) with FEV1/FVC% were also observed in COPD. Conclusions Genetic variants in HHIP were related to FEV1/FVC in COPD. Significant relationships between risk alleles and risk genotypes and FEV1/FVC in COPD were also identified. PMID:28929109
A decline in the prevalence of injecting drug users in Estonia, 2005–2009
Uusküla, A; Rajaleid, K; Talu, A; Abel-Ollo, K; Des Jarlais, DC
2013-01-01
Aims and setting Descriptions of behavioural epidemics have received little attention compared with infectious disease epidemics in Eastern Europe. Here we report a study aimed at estimating trends in the prevalence of injection drug use between 2005 and 2009 in Estonia. Design and methods The number of injection drug users (IDUs) aged 15–44 each year between 2005 and 2009 was estimated using capture-recapture methodology based on 4 data sources (2 treatment databases: drug abuse and non-fatal overdose treatment; criminal justice (drug-related offences) and mortality (injection drug use-related deaths) data). Poisson log-linear regression models were applied to the matched data, with interactions between data sources fitted to replicate the dependencies between the data sources. Linear regression was used to estimate average change over time. Findings There were 24305, 12292, 238, 545 records and 8100, 1655, 155, 545 individual IDUs identified in the four capture sources (police, drug treatment, overdose, and death registry, respectively) over the period 2005–2009. The estimated prevalence of IDUs among the population aged 15–44 declined from 2.7% (1.8–7.9%) in 2005 to 2.0% (1.4–5.0%) in 2008, and 0.9% (0.7–1.7%) in 2009. Regression analysis indicated an average reduction of over 1700 injectors per year. Conclusion While the capture-recapture method has known limitations, the results are consistent with other data from Estonia. Identifying the drivers of change in the prevalence of injection drug use warrants further research. PMID:23290632
Wang, T T; Jiang, L
2017-10-01
Objective: To investigate the prognostic value of highly sensitive cardiac Troponin T (hs-cTn T) for sepsis in critically ill patients. Methods: Patients expected to stay in the ICU of Fuxing Hospital for more than 24 h were enrolled from March 2014 to December 2014. Serum hs-cTn T was tested within two hours. Univariate and multivariate linear regression analyses were used to determine the association of variables with hs-cTn T. Multivariable logistic regression analysis was used to evaluate the risk factors for 28-day mortality. Results: A total of 125 patients were finally enrolled, including 68 patients with sepsis and 57 without. The levels of hs-cTn T in the sepsis and non-sepsis groups were significantly different [52.0 (32.5, 87.5) ng/L vs 14.0 (6.5, 29.0) ng/L, respectively, P<0.001]. In the sepsis group, hs-cTn T levels were similar among common sepsis, severe sepsis and septic shock. Hs-cTn T was significantly higher in non-survivors than survivors [27 (13, 52) ng/L vs 44.5 (28.8, 83.5) ng/L, P<0.001]. Age, sepsis, and serum creatinine were independent factors affecting hs-cTn T in multivariate linear regression analyses, but hs-cTn T was not a risk factor for death. Conclusion: Patients with sepsis had higher serum hs-cTn T than those without sepsis, but hs-cTn T was not found to be associated with the severity of sepsis.
Horton, Megan K.; Blount, Benjamin C.; Valentin-Blasini, Liza; Wapner, Ronald; Whyatt, Robin; Gennings, Chris; Factor-Litvak, Pam
2015-01-01
Background Adequate maternal thyroid function during pregnancy is necessary for normal fetal brain development, making pregnancy a critical window of vulnerability to thyroid disrupting insults. Sodium/iodide symporter (NIS) inhibitors, namely perchlorate, nitrate, and thiocyanate, have been shown individually to competitively inhibit uptake of iodine by the thyroid. Several epidemiologic studies examined the association between these individual exposures and thyroid function. Few studies have examined the effect of this chemical mixture on thyroid function during pregnancy. Objectives We examined the cross sectional association between urinary perchlorate, thiocyanate and nitrate concentrations and thyroid function among healthy pregnant women living in New York City using weighted quantile sum (WQS) regression. Methods We measured thyroid stimulating hormone (TSH) and free thyroxine (FreeT4) in blood samples; perchlorate, thiocyanate, nitrate and iodide in urine samples collected from 284 pregnant women at 12 (± 2.8) weeks gestation. We examined associations between urinary analyte concentrations and TSH or FreeT4 using linear regression or WQS adjusting for gestational age, urinary iodide and creatinine. Results Individual analyte concentrations in urine were significantly correlated (Spearman’s r 0.4–0.5, p < 0.001). Linear regression analyses did not suggest associations between individual concentrations and thyroid function. The WQS revealed a significant positive association between the weighted sum of urinary concentrations of the three analytes and increased TSH. Perchlorate had the largest weight in the index, indicating the largest contribution to the WQS. Conclusions Co-exposure to perchlorate, nitrate and thiocyanate may alter maternal thyroid function, specifically TSH, during pregnancy. PMID:26408806
NASA Astrophysics Data System (ADS)
Wu, Cheng; Zhen Yu, Jian
2018-03-01
Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low R² XY dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
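The contrast between OLS and Deming regression described in this abstract can be illustrated with a minimal sketch. It is not the authors' Igor Pro program: the function names `ols_fit` and `deming_fit`, the sample data, and the convention that λ (`lam`) is the ratio of the Y error variance to the X error variance are all assumptions made for illustration, and other conventions for λ exist.

```python
import math

def _moments(x, y):
    """Return means and (co)variances of two equal-length sequences."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - ybar) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
    return xbar, ybar, sxx, syy, sxy

def ols_fit(x, y):
    """Ordinary least squares: assumes all measurement error is in y."""
    xbar, ybar, sxx, _, sxy = _moments(x, y)
    slope = sxy / sxx
    return slope, ybar - slope * xbar

def deming_fit(x, y, lam=1.0):
    """Deming regression with lam = var(y error) / var(x error) (assumed convention).

    lam = 1 corresponds to orthogonal regression. Requires a nonzero covariance.
    """
    xbar, ybar, sxx, syy, sxy = _moments(x, y)
    d = syy - lam * sxx
    slope = (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)
    return slope, ybar - slope * xbar

# Hypothetical paired measurements with error in both variables.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_obs = [1.1, 1.9, 3.2, 3.8, 5.1]
print("OLS:   ", ols_fit(x_obs, y_obs))
print("Deming:", deming_fit(x_obs, y_obs, lam=1.0))
```

Changing `lam` in this sketch shifts the fitted slope, which is consistent with the abstract's point that an improperly chosen λ can bias both the slope and the intercept.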
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga
2006-08-01
A quantitative-structure activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R (2) (CV) = 0.8160; S (PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.
Wavelet regression model in forecasting crude oil price
NASA Astrophysics Data System (ADS)
Hamid, Mohd Helmie; Shabri, Ani
2017-05-01
This study presents the performance of the wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. The WMLR model was developed by integrating the discrete wavelet transform (DWT) and multiple linear regression (MLR) model. The original time series was decomposed into sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of the WMLR model was also compared with regular multiple linear regression (MLR), Autoregressive Integrated Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) using root mean square errors (RMSE) and mean absolute errors (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting techniques tested in this study.
Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H
2009-01-01
This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial basis function (RBF) kernels, the non-parametric models of relative blood volume (RBV) change with time as well as percentage change in HR with respect to RBV were obtained. The e-insensitivity based loss function was used for SVR modeling. Selection of the design parameters, which include capacity (C), insensitivity region (e) and the RBF kernel parameter (sigma), was made based on a grid search approach, and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model based on RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) as well as testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training as well as testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
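The k-fold cross-validated average mean square error (AMSE) mentioned in this abstract can be sketched for the linear regression baseline as follows. This is only an illustration: the helper names `fit_line` and `kfold_amse`, the interleaved fold assignment, and the synthetic time series are assumptions, and the SVR model itself is not reproduced here.

```python
def fit_line(x, y):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, ybar - slope * xbar

def kfold_amse(x, y, k=5):
    """Average of the per-fold test mean square errors over k folds."""
    n = len(x)
    fold_mses = []
    for fold in range(k):
        # Assumed fold scheme: every k-th point goes to the test fold.
        test_idx = set(range(fold, n, k))
        train_x = [x[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        errors = [(y[i] - (slope * x[i] + intercept)) ** 2 for i in test_idx]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / k

# Synthetic example: a curved response that a straight line cannot fully capture.
minutes = [float(t) for t in range(20)]
response = [0.05 * t * t - 0.5 * t for t in minutes]
print("linear-model AMSE:", kfold_amse(minutes, response, k=5))
```

Computing the same quantity for a competing model on identical folds gives the kind of like-for-like comparison the abstract reports between SVR and linear regression.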
Breakfast intake among adults with type 2 diabetes: is bigger better?
Jarvandi, Soghra; Schootman, Mario; Racette, Susan B.
2015-01-01
Objective To assess the association between breakfast energy and total daily energy intake among individuals with type 2 diabetes. Design Cross-sectional study. Daily energy intake was computed from a 24-h dietary recall. Multiple regression models were used to estimate the association between daily energy intake (dependent variable) and quartiles of energy intake at breakfast (independent variable) expressed as either absolute or relative (% of total daily energy intake) terms. Orthogonal polynomial contrasts were used to test for linear and quadratic trends. Models were controlled for sex, age, race/ethnicity, body mass index, physical activity and smoking. In addition, we used separate multiple regression models to test the effect of quartiles of absolute and relative breakfast energy on intake at lunch, dinner, and snacks. Setting The 1999–2004 National Health and Nutrition Examination Survey (NHANES). Subjects Participants aged ≥ 30 years with self-reported history of diabetes (N = 1,146). Results Daily energy intake increased as absolute breakfast energy intake increased (linear trend, P < 0.0001; quadratic trend, P = 0.02), but decreased as relative breakfast energy intake increased (linear trend, P < 0.0001). In addition, while higher quartiles of absolute breakfast intake had no associations with energy intake at subsequent meals, higher quartiles of relative breakfast intake were associated with lower energy intake during all subsequent meals and snacks (P < 0.05). Conclusions Consuming a breakfast that provided less energy or comprised a greater proportion of daily energy intake was associated with lower total daily energy intake in adults with type 2 diabetes. PMID:25529061
Ricci, Cristian; Gervasi, Federico; Gaeta, Maddalena; Smuts, Cornelius M; Schutte, Aletta E; Leitzmann, Michael F
2018-05-01
Background Light physical activity is known to reduce atrial fibrillation risk, whereas moderate to vigorous physical activity may result in an increased risk. However, the question of what volume of physical activity can be considered beneficial remains poorly understood. The scope of the present work was to examine the relation between physical activity volume and atrial fibrillation risk. Design A comprehensive systematic review was performed following the PRISMA guidelines. Methods A non-linear meta-regression considering the amount of energy spent in physical activity was carried out. The first derivative of the non-linear relation between physical activity and atrial fibrillation risk was evaluated to determine the volume of physical activity that carried the minimum atrial fibrillation risk. Results The dose-response analysis of the relation between physical activity and atrial fibrillation risk showed that physical activity at volumes of 5-20 metabolic equivalents per week (MET-h/week) was associated with significant reduction in atrial fibrillation risk (relative risk for 19 MET-h/week = 0.92 (0.87, 0.98)). By comparison, physical activity volumes exceeding 20 MET-h/week were unrelated to atrial fibrillation risk (relative risk for 21 MET-h/week = 0.95 (0.88, 1.02)). Conclusion These data show a J-shaped relation between physical activity volume and atrial fibrillation risk. Physical activity at volumes of up to 20 MET-h/week is associated with reduced atrial fibrillation risk, whereas volumes exceeding 20 MET-h/week show no relation with risk.
Faecal nitrogen excretion as an approach to estimate forage intake of wethers.
Kozloski, G V; Oliveira, L; Poli, C H E C; Azevedo, E B; David, D B; Ribeiro Filho, H M N; Collet, S G
2014-08-01
Data from twenty-two digestibility trials were compiled to examine the relationship between faecal N concentration and organic matter (OM) digestibility (OMD), and between faecal N excretion and OM intake (OMI) by wethers fed tropical or temperate forages alone or with supplements. Data set was grouped by diet type as follows: only tropical grass (n = 204), only temperate grass (n = 160), tropical grass plus supplement (n = 216), temperate grass plus supplement (n = 48), tropical grass plus tropical legume (n = 60) and temperate grass with ruminal infusion of tannins (n = 16). Positive correlation between OMD and either total faecal N concentration (Nfc, % of OM) or metabolic faecal N concentration (Nmetfc, % of OM) was significant for most diet types. Exceptions were the diet that included a tropical legume, where both relationships were negative, and the diet that included tannin extract, where the correlation between OMD and Nfc was not significant. Pearson correlation and linear regressions between OM intake (OMI, g/day) and faecal N excretion (Nf, g/day) were significant for all diet types. When OMI was estimated from the OM faecal excretion and Nfc-based OMD values, the linear comparison between observed and estimated OMI values showed intercept different from 0 and slope different from 1. When OMI was estimated using the Nf-based linear regressions, the linear comparison between observed and estimated OMI values showed neither intercept different from 0 nor slope different from 1. Both linear comparisons showed similar R(2) values (i.e. 0.78 vs. 0.79). In conclusion, linear equations are suitable for directly estimating OM intake by wethers, fed only forage or forage plus supplements, from the amount of N excreted in faeces. The use of this approach in experiments with grazing wethers has the advantage of accounting for individual variations in diet selection and digestion processes and precludes the use of techniques to estimate forage digestibility. 
Journal of Animal Physiology and Animal Nutrition © 2013 Blackwell Verlag GmbH.
Assessing NARCCAP climate model effects using spatial confidence regions
French, Joshua P.; McGinnis, Seth; Schwartzman, Armin
2017-01-01
We assess similarities and differences between model effects for the North American Regional Climate Change Assessment Program (NARCCAP) climate models using varying classes of linear regression models. Specifically, we consider how the average temperature effect differs for the various global and regional climate model combinations, including assessment of possible interaction between the effects of global and regional climate models. We use both pointwise and simultaneous inference procedures to identify regions where global and regional climate model effects differ. We also show conclusively that results from pointwise inference are misleading, and that accounting for multiple comparisons is important for making proper inference. PMID:28936474
Post-processing through linear regression
NASA Astrophysics Data System (ADS)
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz '63 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
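The metamodeling idea in this abstract, regressing simulated outcomes on standardized inputs so that the intercept estimates the base-case outcome, can be sketched as below. Everything here is a hypothetical stand-in: `toy_model`, its two inputs, the sampling ranges, and the 1,000 draws are assumptions chosen for illustration and are not the cancer model or the 10,000-cohort PSA from the study.

```python
import random
import statistics

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients for y ~ 1 + rows; the intercept comes first."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(xi[i] * xi[j] for xi in X) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def standardize(values):
    """Return z-scores together with the mean and standard deviation used."""
    mu, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mu) / sd for v in values], mu, sd

def toy_model(p_cure, cost):
    """Hypothetical decision model: net benefit as a function of two inputs."""
    return 10.0 + 3.0 * p_cure - 2.0 * cost

rng = random.Random(0)  # fixed seed so the sketch is reproducible
p_cure = [rng.uniform(0.2, 0.8) for _ in range(1000)]
cost = [rng.uniform(1.0, 5.0) for _ in range(1000)]
outcome = [toy_model(p, c) for p, c in zip(p_cure, cost)]

z1, mu1, sd1 = standardize(p_cure)
z2, mu2, sd2 = standardize(cost)
intercept, b1, b2 = ols(list(zip(z1, z2)), outcome)
print("intercept (outcome at mean inputs):", intercept)
print("change per 1 SD of p_cure:", b1, "and of cost:", b2)
```

Because the inputs are standardized, the coefficients in this sketch are directly comparable across parameters, which is what makes them usable as a summary of relative parameter sensitivity.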
Aptel, Florent; Sayous, Romain; Fortoul, Vincent; Beccat, Sylvain; Denis, Philippe
2010-12-01
To evaluate and compare the regional relationships between visual field sensitivity and retinal nerve fiber layer (RNFL) thickness as measured by spectral-domain optical coherence tomography (OCT) and scanning laser polarimetry. Prospective cross-sectional study. One hundred and twenty eyes of 120 patients (40 with healthy eyes, 40 with suspected glaucoma, and 40 with glaucoma) were tested on Cirrus-OCT, GDx VCC, and standard automated perimetry. Raw data on RNFL thickness were extracted for 256 peripapillary sectors of 1.40625 degrees each for the OCT measurement ellipse and 64 peripapillary sectors of 5.625 degrees each for the GDx VCC measurement ellipse. Correlations between peripapillary RNFL thickness in 6 sectors and visual field sensitivity in the 6 corresponding areas were evaluated using linear and logarithmic regression analysis. Receiver operating curve areas were calculated for each instrument. With spectral-domain OCT, the correlations (r(2)) between RNFL thickness and visual field sensitivity ranged from 0.082 (nasal RNFL and corresponding visual field area, linear regression) to 0.726 (supratemporal RNFL and corresponding visual field area, logarithmic regression). By comparison, with GDx-VCC, the correlations ranged from 0.062 (temporal RNFL and corresponding visual field area, linear regression) to 0.362 (supratemporal RNFL and corresponding visual field area, logarithmic regression). In pairwise comparisons, these structure-function correlations were generally stronger with spectral-domain OCT than with GDx VCC and with logarithmic regression than with linear regression. The largest areas under the receiver operating curve were seen for OCT superior thickness (0.963 ± 0.022; P < .001) in eyes with glaucoma and for OCT average thickness (0.888 ± 0.072; P < .001) in eyes with suspected glaucoma. 
The structure-function relationship was significantly stronger with spectral-domain OCT than with scanning laser polarimetry, and was better expressed logarithmically than linearly. Measurements with these 2 instruments should not be considered to be interchangeable. Copyright © 2010 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Rule, David L.
Several regression methods were examined within the framework of weighted structural regression (WSR), comparing their regression weight stability and score estimation accuracy in the presence of outlier contamination. The methods compared are: (1) ordinary least squares; (2) WSR ridge regression; (3) minimum risk regression; (4) minimum risk 2;…
Unit Cohesion and the Surface Navy: Does Cohesion Affect Performance
1989-12-01
v. 68, 1968. Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. Rand Corporation R-2607...Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. SAS User’s Guide: Basics, Version 5 ed
1990-03-01
and M.H. Kutner. Applied Linear Regression Models. Homewood IL: Richard D. Irwin Inc., 1983. Pritsker, A. Alan B. Introduction to Simulation and SLAM...Control Variates in Simulation," European Journal of Operational Research, 42: (1989). Neter, J., W. Wasserman, and M.H. Kutner. Applied Linear Regression Models
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Calibrated Peer Review for Interpreting Linear Regression Parameters: Results from a Graduate Course
ERIC Educational Resources Information Center
Enders, Felicity B.; Jenkins, Sarah; Hoverman, Verna
2010-01-01
Biostatistics is traditionally a difficult subject for students to learn. While the mathematical aspects are challenging, it can also be demanding for students to learn the exact language to use to correctly interpret statistical results. In particular, correctly interpreting the parameters from linear regression is both a vital tool and a…
ERIC Educational Resources Information Center
Richter, Tobias
2006-01-01
Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…
Some Applied Research Concerns Using Multiple Linear Regression Analysis.
ERIC Educational Resources Information Center
Newman, Isadore; Fraas, John W.
The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…
ERIC Educational Resources Information Center
Nelson, Dean
2009-01-01
Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…
Quantum State Tomography via Linear Regression Estimation
Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan
2013-01-01
A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d^4), where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.
Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S
2017-06-01
The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LET d ) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting of polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LET d based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression analysis favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models. 
As differences between the models were observed for the SOBP scenario, both non-linear LET spectrum- and linear LET d based models should be further evaluated in clinically realistic scenarios. © 2017 American Association of Physicists in Medicine.
Regression of non-linear coupling of noise in LIGO detectors
NASA Astrophysics Data System (ADS)
Da Silva Costa, C. F.; Billman, C.; Effler, A.; Klimenko, S.; Cheng, H.-P.
2018-03-01
In 2015, after their upgrade, the advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors started acquiring data. The effort to improve their sensitivity has never stopped since then. The goal to achieve design sensitivity is challenging. Environmental and instrumental noise couple to the detector output with different, linear and non-linear, coupling mechanisms. The noise regression method we use is based on the Wiener–Kolmogorov filter, which uses witness channels to make noise predictions. We present here how this method helped to determine complex non-linear noise couplings in the output mode cleaner and in the mirror suspension system of the LIGO detector.
NASA Astrophysics Data System (ADS)
Rodes, C. E.; Chillrud, S. N.; Haskell, W. L.; Intille, S. S.; Albinali, F.; Rosenberger, M. E.
2012-09-01
Background Metabolic functions typically increase with human activity, but optimal methods to characterize activity levels for real-time predictions of ventilation volume (l min-1) during exposure assessments have not been available. Could tiny, triaxial accelerometers be incorporated into personal level monitors to define periods of acceptable wearing compliance, and allow the exposures (μg m-3) to be extended to potential doses in μg min-1 kg-1 of body weight? Objectives In a pilot effort, we tested: 1) whether appropriately-processed accelerometer data could be utilized to predict compliance and in linear regressions to predict ventilation volumes in real-time as an on-board component of personal level exposure sensor systems, and 2) whether locating the exposure monitors on the chest in the breathing zone provided comparable accelerometric data to other locations more typically utilized (waist, thigh, wrist, etc.). Methods Prototype exposure monitors from RTI International and Columbia University were worn on the chest by a pilot cohort of adults while conducting an array of scripted activities (all <10 METS), spanning common recumbent, sedentary, and ambulatory activity categories. Referee Wocket accelerometers that were placed at various body locations allowed comparison with the chest-located exposure sensor accelerometers. An Oxycon Mobile mask was used to measure oral-nasal ventilation volumes in-situ. For the subset of participants with complete data (n = 22), linear regressions were constructed (processed accelerometric variable versus ventilation rate) for each participant and exposure monitor type, and Pearson correlations computed to compare across scenarios. Results Triaxial accelerometer data were demonstrated to be adequately sensitive indicators for predicting exposure monitor wearing compliance. 
Strong linear correlations (R values from 0.77 to 0.99) were observed for all participants for both exposure sensor accelerometer variables against ventilation volume for recumbent, sedentary, and ambulatory activities with MET values <~6. The RTI monitors' mean R value of 0.91 was slightly higher than the Columbia monitors' mean of 0.86 due to utilizing a 20 Hz data rate instead of a slower 1 Hz rate. A nominal mean regression slope was computed for the RTI system across participants and showed a modest RSD of +/-36.6%. Comparison of the correlation values of the exposure monitors with the Wocket accelerometers at various body locations showed statistically identical regressions for all sensors at alternate hip, ankle, upper arm, thigh, and pocket locations, but not for the Wocket accelerometer located at the dominant side wrist location (R = 0.57; p = 0.016). Conclusions Even with a modest number of adult volunteers, the consistency and linearity of regression slopes for all subjects were very good with excellent within-person Pearson correlations for the accelerometer versus ventilation volume data. Computing accelerometric standard deviations allowed good sensitivity for compliance assessments even for sedentary activities. These pilot findings supported the hypothesis that a common linear regression is likely to be usable for a wider range of adults to predict ventilation volumes from accelerometry data over a range of low to moderate energy level activities. The predicted volumes would then allow real-time estimates of potential dose, enabling more robust panel studies. The poorer correlation in predicting ventilation rate for an accelerometer located on the wrist suggested that this location should not be considered for predictions of ventilation volume.
Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan
2012-12-01
A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES
Zhu, Liping; Huang, Mian; Li, Runze
2012-01-01
This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536
Prediction of siRNA potency using sparse logistic regression.
Hu, Wei; Hu, John
2014-06-01
RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy of the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.
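The abstract above does not say which penalty or solver the authors used, so the following is only a minimal sketch of one common way to obtain sparse weights in a logistic model: an L1 penalty fitted by proximal gradient descent with soft-thresholding. The function names, the toy data, and the penalty and step values are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within +/-threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_sparse_logistic(X, y, l1_penalty=0.1, step=0.1, n_iter=2000):
    """L1-penalized logistic regression via proximal gradient descent (assumed method)."""
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    bias = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * d
        grad_b = 0.0
        for row, label in zip(X, y):
            score = bias + sum(w * x for w, x in zip(weights, row))
            err = sigmoid(score) - label
            for j in range(d):
                grad_w[j] += err * row[j] / n
            grad_b += err / n
        # Gradient step followed by soft-thresholding (the proximal step for L1).
        weights = [soft_threshold(w - step * g, step * l1_penalty)
                   for w, g in zip(weights, grad_w)]
        bias -= step * grad_b  # the intercept is left unpenalized
    return weights, bias


# Hypothetical toy data: feature 0 determines the label, feature 1 is pure noise.
X = [[2.0, 1.0], [2.0, -1.0], [1.0, 1.0], [1.0, -1.0],
     [-1.0, 1.0], [-1.0, -1.0], [-2.0, 1.0], [-2.0, -1.0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_sparse_logistic(X, y)
n_nonzero = sum(1 for w in weights if w != 0.0)
print(weights, n_nonzero)
```

In this toy setup the noise feature ends with a weight of exactly zero, which is the kind of behaviour that makes the count of nonzero weights a meaningful summary of a sparse model.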
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Adding a Parameter Increases the Variance of an Estimated Regression Function
ERIC Educational Resources Information Center
Withers, Christopher S.; Nadarajah, Saralees
2011-01-01
The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
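Because the abstract above is truncated, the authors' own argument is not available here. The following is the standard textbook derivation under ordinary least squares with homoscedastic, uncorrelated errors, given as a sketch of why the claim holds. The symbols (x_0 for the point of prediction, z for the added predictor column, z_0 for its value at the prediction point, M for the residual-maker matrix) are notation introduced for this illustration.

```latex
\begin{align*}
y &= X\beta + \varepsilon, \qquad \operatorname{Var}(\varepsilon) = \sigma^2 I, \\
\operatorname{Var}\bigl(\hat{y}(x_0)\bigr)
  &= \sigma^2\, x_0^{\top} (X^{\top}X)^{-1} x_0, \\
\operatorname{Var}\bigl(\hat{y}_{+}(x_0, z_0)\bigr)
  &= \sigma^2\, x_0^{\top} (X^{\top}X)^{-1} x_0
   + \sigma^2\, \frac{\bigl(z_0 - x_0^{\top}(X^{\top}X)^{-1}X^{\top}z\bigr)^2}{z^{\top} M z},
  \qquad M = I - X(X^{\top}X)^{-1}X^{\top}.
\end{align*}
```

The second term follows from the partitioned inverse of the augmented cross-product matrix and is nonnegative whenever z is not a linear combination of the columns of X, so the variance of the fitted value cannot decrease when the predictor is added.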
Using nonlinear quantile regression to estimate the self-thinning boundary curve
Quang V. Cao; Thomas J. Dean
2015-01-01
The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...
Simultaneous spectrophotometric determination of salbutamol and bromhexine in tablets.
Habib, I H I; Hassouna, M E M; Zaki, G A
2005-03-01
Typical anti-mucolytic drugs called salbutamol hydrochloride and bromhexine sulfate encountered in tablets were determined simultaneously either by using linear regression at zero-crossing wavelengths of the first derivation of UV-spectra or by application of multiple linear partial least squares regression method. The results obtained by the two proposed mathematical methods were compared with those obtained by the HPLC technique.
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei
2014-01-01
The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binomial distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.
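One quick way to see why count data such as conflict frequencies may call for a negative binomial rather than a linear or Poisson model is to compare the sample variance with the sample mean. The sketch below uses made-up counts (not the study's data) and a method-of-moments estimate of the dispersion parameter under the common parameterization Var = mean + alpha * mean^2; both the data and the helper names are assumptions for illustration.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; roughly 1 for Poisson-like counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    return variance(counts) / m


def moment_alpha(counts):
    """Method-of-moments dispersion, assuming Var = mean + alpha * mean**2."""
    m = mean(counts)
    return max(0.0, (variance(counts) - m) / (m * m))


# Hypothetical conflict counts per observation period (not from the study).
conflict_counts = [0, 1, 1, 2, 3, 5, 8, 12, 0, 2]

ratio = dispersion_ratio(conflict_counts)
alpha = moment_alpha(conflict_counts)
print(f"variance/mean = {ratio:.2f}, alpha = {alpha:.2f}")
```

A ratio well above 1 indicates overdispersion, which a negative binomial model can absorb through its extra dispersion parameter.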
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Image interpolation via regularized local linear regression.
Liu, Xianming; Zhao, Debin; Xiong, Ruiqin; Ma, Siwei; Gao, Wen; Sun, Huifang
2011-12-01
The linear regression model is a very attractive tool to design effective image interpolation schemes. Some regression-based image interpolation algorithms have been proposed in the literature, in which the objective functions are optimized by ordinary least squares (OLS). However, it is shown that interpolation with OLS may have some undesirable properties from a robustness point of view: even small amounts of outliers can dramatically affect the estimates. To address these issues, in this paper we propose a novel image interpolation algorithm based on regularized local linear regression (RLLR). Starting with the linear regression model where we replace the OLS error norm with the moving least squares (MLS) error norm leads to a robust estimator of local image structure. To keep the solution stable and avoid overfitting, we incorporate the l(2)-norm as the estimator complexity penalty. Moreover, motivated by recent progress on manifold-based semi-supervised learning, we explicitly consider the intrinsic manifold structure by making use of both measured and unmeasured data points. Specifically, our framework incorporates the geometric structure of the marginal probability distribution induced by unmeasured samples as an additional local smoothness preserving constraint. The optimal model parameters can be obtained with a closed-form solution by solving a convex optimization problem. Experimental results on benchmark test images demonstrate that the proposed method achieves very competitive performance with the state-of-the-art interpolation algorithms, especially in image edge structure preservation. © 2011 IEEE
Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue
2016-01-01
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075
Kumar, K Vasanth; Porkodi, K; Rocha, F
2008-01-15
A comparison of linear and non-linear regression method in selecting the optimum isotherm was made to the experimental equilibrium data of basic red 9 sorption by activated carbon. The r(2) was used to select the best fit linear theoretical isotherm. In the case of non-linear regression method, six error functions namely coefficient of determination (r(2)), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), the average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS) were used to predict the parameters involved in the two and three parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For two parameter isotherm, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of three parameter isotherm, r(2) was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor to choose the optimum isotherm. In addition to the size of error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K(2) was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.
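The abstract names its error functions without giving formulas. The sketch below implements five of them using definitions that are commonly cited in the sorption-isotherm literature; whether the paper uses exactly these forms is an assumption. The coefficient of determination is left out because its definition varies between papers. Here q_exp and q_calc are the experimental and model-predicted equilibrium uptakes, and n_params is the number of isotherm parameters (2 or 3 in the abstract's terms).

```python
import math


def isotherm_error_functions(q_exp, q_calc, n_params):
    """Return ERRSQ, EABS, ARE, HYBRID and MPSD for one fitted isotherm.

    Assumes every experimental value in q_exp is nonzero and that there
    are more data points than parameters.
    """
    n = len(q_exp)
    dof = n - n_params
    if dof <= 0:
        raise ValueError("need more data points than parameters")
    residuals = [calc - exp for exp, calc in zip(q_exp, q_calc)]
    relative = [r / exp for r, exp in zip(residuals, q_exp)]
    return {
        "ERRSQ": sum(r * r for r in residuals),
        "EABS": sum(abs(r) for r in residuals),
        "ARE": 100.0 / n * sum(abs(rel) for rel in relative),
        "HYBRID": 100.0 / dof * sum(r * r / exp for r, exp in zip(residuals, q_exp)),
        "MPSD": 100.0 * math.sqrt(sum(rel * rel for rel in relative) / dof),
    }


# Hypothetical two-point example, chosen so the values are easy to check by hand.
errors = isotherm_error_functions([10.0, 20.0], [11.0, 18.0], n_params=1)
print(errors)
```

In a non-linear fit, each of these would serve in turn as the objective to minimize over the isotherm parameters, which is why different error functions can select different parameter sets for the same data.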
Applied Multiple Linear Regression: A General Research Strategy
ERIC Educational Resources Information Center
Smith, Brandon B.
1969-01-01
Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)
A secure distributed logistic regression protocol for the detection of rare adverse drug events
El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat
2013-01-01
Background There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. 
We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models. PMID:22871397
Biomass Stoves and Lens Opacity and Cataract in Nepalese Women
Pokhrel, Amod K.; Bates, Michael N.; Shrestha, Sachet P.; Bailey, Ian L.; DiMartino, Robert B.; Smith, Kirk R.; Joshi, N. D.
2014-01-01
Purpose Cataract is the most prevalent cause of blindness in Nepal. Several epidemiologic studies have associated cataracts with use of biomass cookstoves. These studies, however, have had limitations, including potential control selection bias and limited adjustment for possible confounding. This study, in Pokhara city, in an area of Nepal where biomass cookstoves are widely used without direct venting of the smoke to the outdoors, focuses on pre-clinical measures of opacity, while avoiding selection bias and taking into account comprehensive data on potential confounding factors. Methods Using a cross-sectional study design, severity of lenticular damage, judged on the LOCS III scales, was investigated in females (n=143), aged 20-65 years, without previously diagnosed cataract. Linear and logistic regression analyses were used to examine the relationships with stove type and length of use. Clinically significant cataract, used in the logistic regression models, was defined as a LOCS III score > 2. Results Using gas cookstoves as the reference group, logistic regression analysis for nuclear cataract showed the evidence of relationships with stove type: for biomass stoves, the odds ratio (OR) was 2.58 (95% confidence interval [CI]: 1.22-5.46) and, for kerosene stoves, the OR was 5.18 (95% CI: 0.88-30.38). Similar results were found for nuclear color (LOCS III score > 2), but no association was found with cortical cataracts. Supporting a relationship between biomass stoves and nuclear cataract was a trend with years of exposure to biomass cookstoves (p=0.01). Linear regression analyses did not show clear evidence of an association between lenticular damage and stove types. Biomass fuel used for heating was not associated with any form of opacity. Conclusions This study provides support for associations of biomass and kerosene cookstoves with nuclear opacity and change in nuclear color. The novel associations with kerosene cookstove use deserve further investigation. PMID:23400024
DOE Office of Scientific and Technical Information (OSTI.GOV)
Horton, Megan K., E-mail: megan.horton@mssm.edu; Blount, Benjamin C.; Valentin-Blasini, Liza
Background: Adequate maternal thyroid function during pregnancy is necessary for normal fetal brain development, making pregnancy a critical window of vulnerability to thyroid disrupting insults. Sodium/iodide symporter (NIS) inhibitors, namely perchlorate, nitrate, and thiocyanate, have been shown individually to competitively inhibit uptake of iodine by the thyroid. Several epidemiologic studies examined the association between these individual exposures and thyroid function. Few studies have examined the effect of this chemical mixture on thyroid function during pregnancy. Objectives: We examined the cross sectional association between urinary perchlorate, thiocyanate and nitrate concentrations and thyroid function among healthy pregnant women living in New York City using weighted quantile sum (WQS) regression. Methods: We measured thyroid stimulating hormone (TSH) and free thyroxine (FreeT4) in blood samples; perchlorate, thiocyanate, nitrate and iodide in urine samples collected from 284 pregnant women at 12 (±2.8) weeks gestation. We examined associations between urinary analyte concentrations and TSH or FreeT4 using linear regression or WQS adjusting for gestational age, urinary iodide and creatinine. Results: Individual analyte concentrations in urine were significantly correlated (Spearman's r 0.4–0.5, p<0.001). Linear regression analyses did not suggest associations between individual concentrations and thyroid function. The WQS revealed a significant positive association between the weighted sum of urinary concentrations of the three analytes and increased TSH. Perchlorate had the largest weight in the index, indicating the largest contribution to the WQS. Conclusions: Co-exposure to perchlorate, nitrate and thiocyanate may alter maternal thyroid function, specifically TSH, during pregnancy. - Highlights: • Perchlorate, nitrate, thiocyanate and iodide measured in maternal urine. • Thyroid function (TSH and Free T4) measured in maternal blood.
• Weighted quantile sum (WQS) regression examined complex mixture effect. • WQS identified an inverse association between the exposure mixture and maternal TSH. • Perchlorate indicated as the ‘bad actor’ of the mixture.
Body mass index in relation to serum prostate-specific antigen levels and prostate cancer risk.
Bonn, Stephanie E; Sjölander, Arvid; Tillander, Annika; Wiklund, Fredrik; Grönberg, Henrik; Bälter, Katarina
2016-07-01
High Body mass index (BMI) has been directly associated with risk of aggressive or fatal prostate cancer. One possible explanation may be an effect of BMI on serum levels of prostate-specific antigen (PSA). To study the association between BMI and serum PSA as well as prostate cancer risk, a large cohort of men without prostate cancer at baseline was followed prospectively for prostate cancer diagnoses until 2015. Serum PSA and BMI were assessed among 15,827 men at baseline in 2010-2012. During follow-up, 735 men were diagnosed with prostate cancer with 282 (38.4%) classified as high-grade cancers. Multivariable linear regression models and natural cubic linear regression splines were fitted for analyses of BMI and log-PSA. For risk analysis, Cox proportional hazards regression models were used to estimate hazard ratios (HR) and 95% confidence intervals (CI) and natural cubic Cox regression splines producing standardized cancer-free probabilities were fitted. Results showed that baseline Serum PSA decreased by 1.6% (95% CI: -2.1 to -1.1) with every one unit increase in BMI. Statistically significant decreases of 3.7, 11.7 and 32.3% were seen for increasing BMI-categories of 25 < 30, 30 < 35 and ≥35 kg/m(2), respectively, compared to the reference (18.5 < 25 kg/m(2)). No statistically significant associations were seen between BMI and prostate cancer risk although results were indicative of a positive association to incidence rates of high-grade disease and an inverse association to incidence of low-grade disease. However, findings regarding risk are limited by the short follow-up time. In conclusion, BMI was inversely associated to PSA-levels. BMI should be taken into consideration when referring men to a prostate biopsy based on serum PSA-levels. © 2016 UICC.
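When the outcome of a linear regression is log-transformed, as with log-PSA above, a coefficient is usually reported as a percent change in the untransformed outcome. The sketch below shows that conversion, assuming a natural logarithm (the abstract does not state the base) and using the reported 1.6% decrease per BMI unit only as an input for illustration; the helper names are hypothetical.

```python
import math


def percent_change_per_unit(beta):
    """Percent change in the outcome per one-unit increase in the predictor,
    given a coefficient beta from a model of the natural-log outcome."""
    return (math.exp(beta) - 1.0) * 100.0


def beta_from_percent(percent):
    """Inverse conversion: the log-scale coefficient implied by a percent change."""
    return math.log(1.0 + percent / 100.0)


# Coefficient implied by a 1.6% decrease per BMI unit (natural log assumed).
beta = beta_from_percent(-1.6)

# Effects multiply on the original scale, so five BMI units compound
# rather than add up to 5 * 1.6%.
five_unit_change = percent_change_per_unit(5 * beta)
print(f"beta = {beta:.5f}, change over 5 units = {five_unit_change:.2f}%")
```

The compounding is why a five-unit difference corresponds to somewhat less than an 8% decrease under this model.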
Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.
Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko
2016-03-01
In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
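The split-half design described above (estimate on one random half, then check predictions on the held-out half) can be sketched as follows for the linear regression side only. The data are synthetic, the single-predictor model is a simplification, and all names are hypothetical; none of this reproduces the study's variables or its fuzzy logic model.

```python
import math
import random


def split_half(records, seed=0):
    """Randomly divide records into two halves (estimation and hold-out)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def fit_line(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(pairs, intercept, slope):
    """Root mean squared prediction error on the given pairs."""
    return math.sqrt(sum((y - (intercept + slope * x)) ** 2 for x, y in pairs) / len(pairs))


# Synthetic stand-in for the N=664 cases: one predictor with a noisy linear response.
rng = random.Random(42)
data = []
for _ in range(664):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 2.0 + 0.5 * x + rng.gauss(0.0, 1.0)))

estimation, holdout = split_half(data)
intercept, slope = fit_line(estimation)
fit_rmse = rmse(estimation, intercept, slope)      # fit to the estimation half
holdout_rmse = rmse(holdout, intercept, slope)     # accuracy on "new" cases
print(len(estimation), len(holdout), round(fit_rmse, 3), round(holdout_rmse, 3))
```

Any competing model would be built on the same estimation half and scored on the same hold-out half, so that the two comparisons in the abstract are like for like.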
Serrano-Gallardo, Pilar; Martínez-Marcos, Mercedes; Espejo-Matorrales, Flora; Arakawa, Tiemi; Magnabosco, Gabriela Tavares; Pinto, Ione Carvalho
2016-01-01
ABSTRACT Objective: to identify the students' perception about the quality of clinical placements and assess the influence of the different tutoring processes in clinical learning. Methods: analytical cross-sectional study on second and third year nursing students (n=122) about clinical learning in primary health care. The Clinical Placement Evaluation Tool and a synthetic index of attitudes and skills were computed to give scores to the clinical learning (scale 0-10). Univariate, bivariate and multivariate (multiple linear regression) analyses were performed. Results: the response rate was 91.8%. The most commonly identified tutoring process was "preceptor-professor" (45.2%). The clinical placement was assessed as "optimal" by 55.1%, relationship with team-preceptor was considered good by 80.4% of the cases and the average grade for clinical learning was 7.89. The multiple linear regression model with more explanatory capacity included the variables "Academic year" (beta coefficient = 1.042 for third-year students), "Primary Health Care Area (PHC)" (beta coefficient = 0.308 for Area B) and "Clinical placement perception" (beta coefficient = - 0.204 for a suboptimal perception). Conclusions: timeframe within the academic program, location and clinical placement perception were associated with students' clinical learning. Students' perceptions of setting quality were positive and a good team-preceptor relationship is a matter of relevance. PMID:27627124
Using the social cognitive theory to understand physical activity among dialysis patients.
Patterson, Megan S; Umstattd Meyer, M Renée; Beaujean, A Alexander; Bowden, Rodney G
2014-08-01
The purpose of this study was to use the social cognitive theory (SCT) constructs self-efficacy, outcome expectations, and self-regulation to better understand associations of physical activity (PA) behaviors among dialysis patients after controlling for demographic and health-related factors. This study was cross-sectional in design. Participants (N = 115; mean age = 61.51 years, SD = 14.01) completed self-report questionnaires during a regularly scheduled dialysis treatment session. Bivariate and hierarchical linear regression analyses were conducted to examine relationships among SCT constructs and PA. Significant relationships between PA and self-efficacy (r = .336), self-regulation (r = .280), and outcome expectations (r = .265) were detected among people on dialysis in bivariate analyses. Hierarchical linear regression revealed significant increases in variance explained for the addition of self-efficacy, self-regulation, and covariates (p < .01). Younger age, self-efficacy, and self-regulation were associated (p < .10) with greater participation in physical activity in the final model (R² = .272). Conclusion/Implication: This research supports the use of SCT in understanding PA among people undergoing dialysis treatment. The findings of this study can help health educators and health care practitioners better understand PA and how to promote it among this population. Future research should further investigate which activities dialysis patients participate in across the life span of their disease. Future PA programs should focus on increasing a patient's self-efficacy and self-regulation.
Prediction of Cancer Incidence and Mortality in Korea, 2018
Jung, Kyu-Won; Won, Young-Joo; Kong, Hyun-Joo; Lee, Eun Sook
2018-01-01
Purpose This study aimed to report on cancer incidence and mortality for the year 2018 to estimate Korea’s current cancer burden. Materials and Methods Cancer incidence data from 1999 to 2015 were obtained from the Korea National Cancer Incidence Database, and cancer mortality data from 1993 to 2016 were acquired from Statistics Korea. Cancer incidence and mortality were projected by fitting a linear regression model to observed age-specific cancer rates against observed years, then multiplying the projected age-specific rates by the age-specific population. The Joinpoint regression model was used to determine the year in which the linear trend changed significantly; only the data of the latest trend were used. Results A total of 204,909 new cancer cases and 82,155 cancer deaths are expected to occur in Korea in 2018. The most common cancer sites were lung, followed by stomach, colorectal, breast and liver. These five cancers represent half of the overall burden of cancer in Korea. For mortality, the most common sites were lung cancer, followed by liver, colorectal, stomach and pancreas. Conclusion The incidence rate of all cancers in Korea is estimated to decrease gradually, mainly due to a decrease in thyroid cancer. These up-to-date estimates of the cancer burden in Korea could be an important resource for planning and evaluation of cancer-control programs. PMID:29566480
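The projection method in the abstract above (fit a linear trend to each age-specific rate, extrapolate to the target year, then multiply by the age-specific population) can be sketched as below. All numbers, age groups, and function names are hypothetical, rates are assumed to be per 100,000, and the years passed in are assumed to be the latest trend segment already selected (the Joinpoint step itself is not shown).

```python
def fit_linear_trend(years, rates):
    """Ordinary least squares fit of rate = intercept + slope * year."""
    n = len(years)
    mean_year = sum(years) / n
    mean_rate = sum(rates) / n
    sxx = sum((t - mean_year) ** 2 for t in years)
    sxy = sum((t - mean_year) * (r - mean_rate) for t, r in zip(years, rates))
    slope = sxy / sxx
    return mean_rate - slope * mean_year, slope


def project_cases(years, rates_by_age, population_by_age, target_year):
    """Sum over age groups of projected rate (per 100,000) times population."""
    total = 0.0
    for age_group, rates in rates_by_age.items():
        intercept, slope = fit_linear_trend(years, rates)
        projected_rate = max(0.0, intercept + slope * target_year)  # a rate cannot be negative
        total += projected_rate / 100_000 * population_by_age[age_group]
    return total


# Hypothetical inputs: observed rates per 100,000 for two age groups.
years = [2011, 2012, 2013, 2014, 2015]
rates_by_age = {
    "40-59": [100.0, 102.0, 104.0, 106.0, 108.0],
    "60+": [500.0, 490.0, 480.0, 470.0, 460.0],
}
population_by_age = {"40-59": 1_000_000, "60+": 500_000}

expected_cases = project_cases(years, rates_by_age, population_by_age, 2018)
print(round(expected_cases))
```

Fitting each age group separately lets a rising trend in one group and a falling trend in another both carry through to the total, weighted by the size of each group in the target year.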
Matsuba, Ikuro; Saito, Kazumi; Takai, Masahiko; Hirao, Koichi; Sone, Hirohito
2012-01-01
OBJECTIVE To investigate the relationship between fasting insulin levels and metabolic risk factors (MRFs) in type 2 diabetic patients at the first clinic/hospital visit in Japan over the years 2000 to 2009. RESEARCH DESIGN AND METHODS In total, 4,798 drug-naive Japanese patients with type 2 diabetes were registered on their first clinic/hospital visits. Conventional clinical factors and fasting insulin levels were observed at baseline within the Japan Diabetes Clinical Data Management (JDDM) study between consecutive 2-year groups. Multiple linear regression analysis was performed using a model in which the dependent variable was fasting insulin values using various clinical explanatory variables. RESULTS Fasting insulin levels were found to be decreasing from 2000 to 2009. Multiple linear regression analysis with the fasting insulin levels as the dependent variable showed that waist circumference (WC), BMI, mean blood pressure, triglycerides, and HDL cholesterol were significant, with WC and BMI as the main factors. ANCOVA after adjustment for age and fasting plasma glucose clearly shows the decreasing trend in fasting insulin levels and the increasing trend in BMI. CONCLUSIONS During the 10-year observation period, the decreasing trend in fasting insulin was related to the slight increase in WC/BMI in type 2 diabetes. Low pancreatic β-cell reserve on top of a lifestyle background might be dependent on an increase in MRFs. PMID:22665215
The relationship between praying and life expectancy in cancerous patients.
Hekmati Pour, N; Hojjati, H
2015-01-01
Introduction. Learning that one has cancer is a shocking experience. Being aware of having cancer not only makes the person lose hope and ambition, but also affects body and mind. Meanwhile, religion can play the role of a complementary treatment, increasing life expectancy in these patients. Objective. The study was conducted with the aim of determining the relationship between praying and life expectancy in cancerous patients. Method. This descriptive correlation study was performed on 96 patients with malignancies who were under chemotherapy in Golestan province in 1392. Paloma and Pendleton's Measure of Prayer Type questionnaires and Schneider questionnaire of life expectancy were used to collect this information. Analyses were performed using SPSS 21.0. Data were analyzed using linear regression, and statistical significance was set at p < 0.05. Findings. The linear regression showed a significant relationship between life expectancy and praying (95% CI: 0.01-0.13, OR = 0.07, Beta = -0.24, P < 0.02), and in the light of previous experience it showed a significant relationship between praying and life expectancy. Conclusion. According to the results of this study, cancerous patients can overcome their illness through praying, and they can also triumph over cancer through self-confidence and control it by gaining more knowledge of their disease and becoming more hopeful about their future.
Cummings, Kristin J.; Cox-Ganser, Jean; Riggs, Margaret A.; Edwards, Nicole; Hobbs, Gerald R.; Kreiss, Kathleen
2008-01-01
Objectives. We investigated the relation between respiratory symptoms and exposure to water-damaged homes and the effect of respirator use in posthurricane New Orleans, Louisiana. Methods. We randomly selected 600 residential sites and then interviewed 1 adult per site. We created an exposure variable, calculated upper respiratory symptom (URS) and lower respiratory symptom (LRS) scores, and defined exacerbation categories by the effect on symptoms of being inside water-damaged homes. We used multiple linear regression to model symptom scores (for all participants) and polytomous logistic regression to model exacerbation of symptoms when inside (for those participating in clean-up). Results. Of 553 participants (response rate=92%), 372 (68%) had participated in clean-up; 233 (63%) of these used a respirator. Respiratory symptom scores increased linearly with exposure (P<.05 for trend). Disposable-respirator use was associated with lower odds of exacerbation of moderate or severe symptoms inside water-damaged homes for URS (odds ratio (OR)=.51; 95% confidence interval (CI)=0.24, 1.09) and LRS (OR=0.33; 95% CI=0.13, 0.83). Conclusions. Respiratory symptoms were positively associated with exposure to water-damaged homes, including exposure limited to being inside without participating in clean-up. Respirator use had a protective effect and should be considered when inside water-damaged homes regardless of activities undertaken. PMID:18381997
Pouchot, Jacques; Kherani, Raheem B.; Brant, Rollin; Lacaille, Diane; Lehman, Allen J.; Ensworth, Stephanie; Kopec, Jacek; Esdaile, John M.; Liang, Matthew H.
2008-01-01
Objective To estimate the minimal clinically important difference (MCID) of seven measures of fatigue in rheumatoid arthritis. Study Design and Setting A cross-sectional study design based on inter-individual comparisons was used. Six to eight subjects participated in a single meeting and completed seven fatigue questionnaires (nine sessions were organized and 61 subjects participated). After completion of the questionnaires, the subjects had five one-on-one 10-minute conversations with different people in the group to discuss their fatigue. After each conversation, each patient compared their fatigue to their conversational partner’s on a global rating. Ratings were compared to the scores of the fatigue measures to estimate the MCID. Both non-parametric and linear regression analyses were used. Results Non-parametric estimates for the MCID relative to “little more fatigue” tended to be smaller than those for “little less fatigue”. The global MCIDs estimated by linear regression were: FSS 20.2, VT 14.8, MAF 18.7, MFI 16.6, FACIT–F 15.9, CFS 9.9, RS 19.7, for normalized scores (0 to 100). The standardized MCIDs for the seven measures were roughly similar (0.67 to 0.76). Conclusion These estimates of MCID will help to interpret changes observed in a fatigue score and will be critical in estimating sample size requirements. PMID:18359189
Rostami, Reza; Sadeghi, Vahid; Zarei, Jamileh; Haddadi, Parvaneh; Mohazzab-Torabi, Saman; Salamati, Payman
2013-01-01
Objective The aim of this study was to compare the Persian version of the Wechsler Intelligence Scale for Children - Fourth Edition (WISC-IV) and the Cognitive Assessment System (CAS) tests, to determine the correlation between their scales, and to evaluate the probable concurrent validity of these tests in patients with learning disorders. Methods One hundred sixty-two children with learning disorders who presented at Atieh Comprehensive Psychiatry Center were selected in a consecutive non-randomized order. All of the patients were assessed based on WISC-IV and CAS scores. The Pearson correlation coefficient was used to analyze the correlation between the data and to assess the concurrent validity of the two tests. Linear regression was used for statistical modeling. The maximum type I error was set at 5%. Findings There was a strong correlation between the total score of the WISC-IV test and the total score of the CAS test in the patients (r=0.75, P<0.001). The correlations among the other scales were mostly high and all of them were statistically significant (P<0.001). A linear regression model was obtained (α = 0.51, β = 0.81 and P<0.001). Conclusion There is an acceptable correlation between the WISC-IV scales and the CAS test in children with learning disorders. Concurrent validity is established between the two tests and their scales. PMID:23724180
Khiavi, Farzad Faraji; Dashti, Rezvan; Mokhtari, Saeedeh
2016-01-01
Introduction Individual characteristics are important factors influencing organizational commitment. Also, committed human resources can lead organizations to performance improvement as well as personal and organizational achievements. This research aimed to determine the association between organizational commitment and personality traits among faculty members of Ahvaz Jundishapur University of Medical Sciences. Methods The research population of this cross-sectional study was the faculty members of Ahvaz Jundishapur University of Medical Sciences (Ahvaz, Iran). The sample size was determined to be 83. Data collection instruments were the Allen and Meyer questionnaire for organizational commitment and Neo for characteristics’ features. The data were analyzed through Pearson’s product-moment correlation and the independent samples t-test, ANOVA, and simple linear regression analysis (SLR) by SPSS. Results Continuance commitment showed a significant positive association with neuroticism, extroversion, agreeableness, and conscientiousness. Normative commitment showed a significant positive association with conscientiousness and a negative association with extroversion (p = 0.001). Openness had a positive association with affective commitment. Openness and agreeableness, among the five characteristics’ features, had the most effect on organizational commitment, as indicated by simple linear regression analysis. Conclusion Faculty members’ characteristics showed a significant association with their organizational commitment. Determining appropriate characteristic criteria for faculty members may lead to employing committed personnel to accomplish the University’s objectives and tasks. PMID:27123222
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R² from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
An Application to the Prediction of LOD Change Based on General Regression Neural Network
NASA Astrophysics Data System (ADS)
Zhang, X. H.; Wang, Q. J.; Zhu, J. J.; Zhang, H.
2011-07-01
Traditional prediction of the LOD (length of day) change was based on linear models, such as the least square model and the autoregressive technique, etc. Due to the complex non-linear features of the LOD variation, the performances of the linear model predictors are not fully satisfactory. This paper applies a non-linear neural network - general regression neural network (GRNN) model to forecast the LOD change, and the results are analyzed and compared with those obtained with the back propagation neural network and other models. The comparison shows that the performance of the GRNN model in the prediction of the LOD change is efficient and feasible.
DOT National Transportation Integrated Search
2016-09-01
We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...
Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation
NASA Astrophysics Data System (ADS)
Saylor, Rick D.; Edgerton, Eric S.; Hartsell, Benjamin E.
A variety of linear regression techniques and simple slope estimators are evaluated for use in the elemental carbon (EC) tracer method of secondary organic carbon (OC) estimation. Linear regression techniques based on ordinary least squares are not suitable for situations where measurement uncertainties exist in both regressed variables. In the past, regression based on the method of Deming [1943. Statistical Adjustment of Data. Wiley, London] has been the preferred choice for EC tracer method parameter estimation. In agreement with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], we find that in the limited case where primary non-combustion OC (OC non-comb) is assumed to be zero, the ratio of averages (ROA) approach provides a stable and reliable estimate of the primary OC-EC ratio, (OC/EC) pri. In contrast with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], however, we find that the optimal use of Deming regression (and the more general York et al. [2004. Unified equations for the slope, intercept, and standard errors of the best straight line. American Journal of Physics 72, 367-375] regression) provides excellent results as well. For the more typical case where OC non-comb is allowed to obtain a non-zero value, we find that regression based on the method of York is the preferred choice for EC tracer method parameter estimation. In the York regression technique, detailed information on uncertainties in the measurement of OC and EC is used to improve the linear best fit to the given data. If only limited information is available on the relative uncertainties of OC and EC, then Deming regression should be used. On the other hand, use of ROA in the estimation of secondary OC, and thus the assumption of a zero OC non-comb value, generally leads to an overestimation of the contribution of secondary OC to total measured OC.
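The two simplest estimators discussed in the abstract above can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: the function names `deming_fit` and `ratio_of_averages`, the default error-variance ratio `delta=1.0`, and the toy EC/OC numbers are all assumptions introduced here. York regression, which uses per-point uncertainties, is not shown.

```python
import math

def deming_fit(x, y, delta=1.0):
    """Deming regression of y on x, allowing measurement error in both.

    delta is the assumed ratio var(error in y) / var(error in x);
    delta=1.0 is a placeholder, not a value from the abstract.
    Requires a non-zero sample covariance between x and y.
    Returns (slope, intercept).
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx

def ratio_of_averages(x, y):
    """Ratio of averages (ROA): mean(y) / mean(x), i.e. a zero-intercept slope."""
    return (sum(y) / len(y)) / (sum(x) / len(x))

# Toy values chosen so that OC is exactly 2 * EC (not measured data).
ec = [0.5, 1.0, 1.5, 2.0, 2.5]
oc = [1.0, 2.0, 3.0, 4.0, 5.0]

slope, intercept = deming_fit(ec, oc)
roa = ratio_of_averages(ec, oc)
print(slope, intercept, roa)
```

On data with a true zero intercept the two estimators agree; they diverge once a non-zero intercept (the non-combustion OC term) is present, which is the situation the abstract warns about for ROA.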
Big Data Toolsets to Pharmacometrics: Application of Machine Learning for Time‐to‐Event Analysis
Gong, Xiajing; Hu, Meng
2018-01-01
Abstract Additional value can be potentially created by applying big data tools to address pharmacometric problems. The performances of machine learning (ML) methods and the Cox regression model were evaluated based on simulated time‐to‐event data synthesized under various preset scenarios, i.e., with linear vs. nonlinear and dependent vs. independent predictors in the proportional hazard function, or with high‐dimensional data featured by a large number of predictor variables. Our results showed that ML‐based methods outperformed the Cox model in prediction performance as assessed by concordance index and in identifying the preset influential variables for high‐dimensional data. The prediction performances of ML‐based methods are also less sensitive to data size and censoring rates than the Cox regression model. In conclusion, ML‐based methods provide a powerful tool for time‐to‐event analysis, with a built‐in capacity for high‐dimensional data and better performance when the predictor variables assume nonlinear relationships in the hazard function. PMID:29536640
Characterizing multivariate decoding models based on correlated EEG spectral features
McFarland, Dennis J.
2013-01-01
Objective Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. PMID:23466267
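One common way to quantify the multicollinearity described above is the variance inflation factor (VIF). The abstract does not say that VIF was used, so the following is only an illustrative sketch for the two-predictor case, where VIF reduces to 1 / (1 - r²). The helper names and the toy feature values are assumptions.

```python
import math

def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def vif_two_predictors(a, b):
    """VIF for either predictor in a two-predictor linear model."""
    r = pearson_r(a, b)
    return 1.0 / (1.0 - r * r)

# Toy features (not EEG data): f2 nearly duplicates f1, f3 is uncorrelated with f1.
f1 = [1.0, 2.0, 3.0, 4.0, 5.0]
f2 = [1.1, 1.9, 3.2, 3.9, 5.1]
f3 = [1.0, -2.0, 2.0, -2.0, 1.0]

vif_correlated = vif_two_predictors(f1, f2)
vif_uncorrelated = vif_two_predictors(f1, f3)
print(vif_correlated, vif_uncorrelated)
```

A large VIF means the individual weights are poorly determined even when the combined prediction is good, which matches the abstract's caution about interpreting weight patterns.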
Fonseca-Machado, Mariana de Oliveira; Monteiro, Juliana Cristina dos Santos; Haas, Vanderlei José; Abrão, Ana Cristina Freitas de Vilhena; Gomes-Sponholz, Flávia
2015-01-01
Objective: to identify the relationship between posttraumatic stress disorder, trait and state anxiety, and intimate partner violence during pregnancy. Method: observational, cross-sectional study developed with 358 pregnant women. The Posttraumatic Stress Disorder Checklist - Civilian Version was used, as well as the State-Trait Anxiety Inventory and an adapted version of the instrument used in the World Health Organization Multi-country Study on Women's Health and Domestic Violence. Results: after adjusting to the multiple logistic regression model, intimate partner violence, occurred during pregnancy, was associated with the indication of posttraumatic stress disorder. The adjusted multiple linear regression models showed that the victims of violence, in the current pregnancy, had higher symptom scores of trait and state anxiety than non-victims. Conclusion: recognizing the intimate partner violence as a clinically relevant and identifiable risk factor for the occurrence of anxiety disorders during pregnancy can be a first step in the prevention thereof. PMID:26487135
Price, James
2015-01-01
Propoxyphene was withdrawn from the US market in November 2010. This drug is still tested for in the workplace as part of expanded panel nonregulated testing. A convenience sample of urine specimens (n = 7838) were provided by workers from various industries. The percentage of positive specimens with 95% confidence intervals was calculated for each year of the study. Logistic regression was used to assess the impact of the year upon the propoxyphene result. The prevalence of positive propoxyphene tests was much higher before the product's withdrawal from the market. Logistic regression provided evidence of a decreasing linear trend (P < 0.000; β = -0.71). The odds ratio signifies that for every additional year the urine specimens were 0.49 times less likely to be positive for propoxyphene. This favors the determination that the change in propoxyphene positive drug test over the years is not by chance. The conclusion supports no longer performing nonregulated workplace propoxyphene urine drug testing for this population.
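The odds ratio quoted above follows directly from the reported logistic regression coefficient, since the odds ratio per one-unit increase in a predictor is exp(β). A short check, using only the β value given in the abstract:

```python
import math

beta_per_year = -0.71                 # coefficient reported in the abstract
odds_ratio = math.exp(beta_per_year)  # multiplicative change in odds per extra year

print(round(odds_ratio, 2))           # 0.49
print(round((1 - odds_ratio) * 100))  # 51 (percent reduction in odds per year)
```

Read this way, each additional year multiplies the odds of a positive propoxyphene result by about 0.49, which is a reduction of roughly 51% in the odds per year.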
Yang, Xiaowei; Nie, Kun
2008-03-15
Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.
NASA Astrophysics Data System (ADS)
Gonçalves, Karen dos Santos; Winkler, Mirko S.; Benchimol-Barbosa, Paulo Roberto; de Hoogh, Kees; Artaxo, Paulo Eduardo; de Souza Hacon, Sandra; Schindler, Christian; Künzli, Nino
2018-07-01
Epidemiological studies generally use measurements of particulate matter with a diameter of less than 2.5 μm (PM2.5) from monitoring networks. Satellite aerosol optical depth (AOD) data has considerable potential in predicting PM2.5 concentrations, and thus provides an alternative method for producing knowledge regarding the level of pollution and its health impact in areas where no ground PM2.5 measurements are available. This is the case in the Brazilian Amazon rainforest region, where forest fires are frequent sources of high pollution. In this study, we applied a non-linear model for predicting PM2.5 concentration from AOD retrievals using interaction terms between average temperature, relative humidity, the sine and cosine of the date with a period of 365.25 days, and the square of the lagged relative residual. Regression performance statistics were tested by comparing the goodness of fit and R2 based on results from linear regression and non-linear regression for six different models. The regression results for non-linear prediction showed the best performance, explaining on average 82% of the daily PM2.5 concentrations when considering the whole period studied. In the context of Amazonia, it was the first study predicting PM2.5 concentrations using the latest high-resolution AOD products in combination with the testing of non-linear model performance. Our results permitted a reliable prediction considering the AOD-PM2.5 relationship and set the basis for further investigations on air pollution impacts in the complex context of the Brazilian Amazon Region.
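The seasonal predictors mentioned above are usually built as a sine/cosine pair with an annual period. The sketch below shows that construction under stated assumptions: the function name `seasonal_terms` and the convention of counting days from an arbitrary reference date are illustrative choices, and the abstract does not specify how these terms enter the interaction model.

```python
import math

PERIOD_DAYS = 365.25  # annual period given in the abstract

def seasonal_terms(day_index):
    """Return (sin, cos) harmonics for a date expressed as days since a reference date."""
    angle = 2.0 * math.pi * day_index / PERIOD_DAYS
    return math.sin(angle), math.cos(angle)

# Example: harmonics for the first few days of a series (day 0 is the assumed reference).
features = [seasonal_terms(d) for d in range(3)]
print(features[0])  # (0.0, 1.0)
```

Using both harmonics lets a linear combination represent an annual cycle with any phase, so the model does not need to know in advance when the seasonal peak occurs.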
2013-01-01
Background Developing countries in South Asia, such as Bangladesh, bear a disproportionate burden of diarrhoeal diseases such as cholera, typhoid and paratyphoid. These seem to be aggravated by a number of social and environmental factors such as lack of access to safe drinking water, overcrowding and poor hygiene brought about by poverty. Some socioeconomic data can be obtained from census data whilst others are more difficult to elucidate. This study considers a range of both census data and spatial data from other sources, including remote sensing, as potential predictors of typhoid risk. Typhoid data are aggregated from hospital admission records for the period from 2005 to 2009. The spatial and statistical structures of the data are analysed and Principal Axis Factoring is used to reduce the degree of co-linearity in the data. The resulting factors are combined into a Quality of Life index, which in turn is used in a regression model of typhoid occurrence and risk. Results The three principal factors used together explain 87% of the variance in the initial candidate predictors, which eminently qualifies them for use as a set of uncorrelated explanatory variables in a linear regression model. Initial regression results using Ordinary Least Squares (OLS) were disappointing; this was explained by analysis of the spatial autocorrelation inherent in the principal factors. The use of Geographically Weighted Regression caused a considerable increase in the predictive power of regressions based on these factors. The best prediction, determined by analysis of the Akaike Information Criterion (AIC), was found when the three factors were combined into a quality of life index, using a method previously published by others, and had a coefficient of determination of 73%.
Conclusions The typhoid occurrence/risk prediction equation was used to develop the first risk map showing areas of Dhaka Metropolitan Area whose inhabitants are at greater or lesser risk of typhoid infection. This, coupled with seasonal information on typhoid incidence also reported in this paper, has the potential to advise public health professionals on developing prevention strategies such as targeted vaccination. PMID:23497202
James, Robert F; Khattar, Nicolas K; Aljuboori, Zaid S; Page, Paul S; Shao, Elaine Y; Carter, Lacey M; Meyer, Kimberly S; Daniels, Michael W; Craycroft, John; Gaughen, John R; Chaudry, M Imran; Rai, Shesh N; Everhart, D Erik; Simard, J Marc
2018-05-11
OBJECTIVE Cognitive dysfunction occurs in up to 70% of aneurysmal subarachnoid hemorrhage (aSAH) survivors. Low-dose intravenous heparin (LDIVH) infusion using the Maryland protocol was recently shown to reduce clinical vasospasm and vasospasm-related infarction. In this study, the Montreal Cognitive Assessment (MoCA) was used to evaluate cognitive changes in aSAH patients treated with the Maryland LDIVH protocol compared with controls. METHODS A retrospective analysis of all patients treated for aSAH between July 2009 and April 2014 was conducted. Beginning in 2012, aSAH patients were treated with LDIVH in the postprocedural period. The MoCA was administered to all aSAH survivors prospectively during routine follow-up visits, at least 3 months after aSAH, by trained staff blinded to treatment status. Mean MoCA scores were compared between groups, and regression analyses were performed for relevant factors. RESULTS No significant differences in baseline characteristics were observed between groups. The mean MoCA score for the LDIVH group (n = 25) was 26.4 compared with 22.7 in controls (n = 22) (p = 0.013). Serious cognitive impairment (MoCA ≤ 20) was observed in 32% of controls compared with 0% in the LDIVH group (p = 0.008). Linear regression analysis demonstrated that only LDIVH was associated with a positive influence on MoCA scores (β = 3.68, p =0.019), whereas anterior communicating artery aneurysms and fevers were negatively associated with MoCA scores. Multivariable linear regression analysis resulted in all 3 factors maintaining significance. There were no treatment complications. CONCLUSIONS This preliminary study suggests that the Maryland LDIVH protocol may improve cognitive outcomes in aSAH patients. A randomized controlled trial is needed to determine the safety and potential benefit of unfractionated heparin in aSAH patients.
Shaffer, Kelly M.; Jacobs, Jamie M.; Nipp, Ryan D.; Carr, Alaina; Jackson, Vicki A.; Park, Elyse R.; Pirl, William F.; El-Jawahri, Areej; Gallagher, Emily R.; Greer, Joseph A.; Temel, Jennifer S.
2016-01-01
Purpose Caregiver, relational, and patient factors have been associated with the health of family members and friends providing care to patients with early-stage cancer. Little research has examined whether findings extend to family caregivers of patients with incurable cancer, who experience unique and substantial caregiving burdens. We examined correlates of mental and physical health among caregivers of patients with newly-diagnosed incurable lung or non-colorectal gastrointestinal cancer. Methods At baseline for a trial of early palliative care, caregivers of participating patients (N=275) reported their mental and physical health (Medical Outcome Survey-Short Form-36); patients reported their quality of life (Functional Assessment of Cancer Therapy-General). Analyses used hierarchical linear regression with two-tailed significance tests. Results Caregivers’ mental health was worse than the U.S. national population (M=44.31, p<.001), yet their physical health was better (M=56.20, p<.001). Hierarchical regression analyses testing caregiver, relational, and patient factors simultaneously revealed that younger (B=0.31, p=.001), spousal caregivers (B=−8.70, p=.003), who cared for patients reporting low emotional well-being (B=0.51, p=.01) reported worse mental health; older (B=−0.17, p=.01) caregivers with low educational attainment (B=4.36, p<.001) who cared for patients reporting low social well-being (B=0.35, p=.05) reported worse physical health. Conclusions In this large sample of family caregivers of patients with incurable cancer, caregiver demographics, relational factors, and patient-specific factors were all related to caregiver mental health, while caregiver demographics were primarily associated with caregiver physical health. These findings help identify characteristics of family caregivers at highest risk of poor mental and physical health who may benefit from greater supportive care. PMID:27866337
Han, Kelong; Ren, Melanie; Wick, Wolfgang; Abrey, Lauren; Das, Asha; Jin, Jin; Reardon, David A.
2014-01-01
Background The aim of this study was to determine correlations between progression-free survival (PFS) and the objective response rate (ORR) with overall survival (OS) in glioblastoma and to evaluate their potential use as surrogates for OS. Method Published glioblastoma trials reporting OS and ORR and/or PFS with sufficient detail were included in correlative analyses using weighted linear regression. Results Of 274 published unique glioblastoma trials, 91 were included. PFS and OS hazard ratios were strongly correlated; R2 = 0.92 (95% confidence interval [CI], 0.71–0.99). Linear regression determined that a 10% PFS risk reduction would yield an 8.1% ± 0.8% OS risk reduction. R2 between median PFS and median OS was 0.70 (95% CI, 0.59–0.79), with a higher value in trials using Response Assessment in Neuro-Oncology (RANO; R2 = 0.96, n = 8) versus Macdonald criteria (R2 = 0.70; n = 83). No significant differences were demonstrated between temozolomide- and bevacizumab-containing regimens (P = .10) or between trials using RANO and Macdonald criteria (P = .49). The regression line slope between median PFS and OS was significantly higher in newly diagnosed versus recurrent disease (0.58 vs 0.35, P = .04). R2 for 6-month PFS with 1-year OS and median OS were 0.60 (95% CI, 0.37–0.77) and 0.64 (95% CI, 0.42–0.77), respectively. Objective response rate and OS were poorly correlated (R2 = 0.22). Conclusion In glioblastoma, PFS and OS are strongly correlated, indicating that PFS may be an appropriate surrogate for OS. Compared with OS, PFS offers earlier assessment and higher statistical power at the time of analysis. PMID:24335699
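Weighted linear regression, as used in the trial-level analysis above, fits a line while letting some points count more than others. The sketch below is a generic weighted least-squares fit for one predictor. The abstract does not state which weights were used, so weighting by something like trial size is an assumption, and the numbers are toy values, not data from any glioblastoma trial.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares line y = intercept + slope * x.

    w holds non-negative weights (for example, trial sample sizes;
    that choice is an assumption here). Returns (slope, intercept).
    """
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Toy points: the first three lie on y = x + 4, the fourth is an outlier.
x = [2.0, 4.0, 6.0, 8.0]
y = [6.0, 8.0, 10.0, 30.0]

equal = weighted_linear_fit(x, y, [1, 1, 1, 1])         # outlier counts fully
downweighted = weighted_linear_fit(x, y, [50, 50, 50, 0])  # outlier ignored
print(equal, downweighted)
```

Comparing the two fits shows why the weighting scheme matters: a single small or atypical study can dominate an unweighted trial-level regression.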
Ruhdorfer, Anja; Wirth, Wolfgang; Eckstein, Felix
2014-01-01
Objective To determine the relationship between thigh muscle strength and clinically relevant differences in self-assessed lower limb function. Methods Isometric knee extensor and flexor strength of 4553 Osteoarthritis Initiative participants (2651 women/1902 men) was related to Western Ontario and McMaster Universities (WOMAC) physical function scores by linear regression. Further, male and female participant strata with minimal clinically important differences (MCIDs) in WOMAC function scores (6/68) were compared across the full range of observed values, and to participants without functional deficits (WOMAC=0). The effect of WOMAC knee pain and body mass index on the above relationships was explored using stepwise regression. Results Per regression equations, a 3.7% reduction in extensor and a 4.0% reduction in flexor strength were associated with an MCID in WOMAC function in women, and a 3.6%/4.8% reduction in men. For strength divided by body weight, reductions were 5.2%/6.7% in women and 5.8%/6.7% in men. Comparing MCID strata across the full observed range of WOMAC function confirmed the above estimates and did not suggest non-linear relationships across the spectrum of observed values. WOMAC pain correlated strongly with WOMAC function, but extensor (and flexor) muscle strength contributed significant independent information. Conclusion Reductions of approximately 4% in isometric muscle strength and of 6% in strength/weight were related to a clinically relevant difference in WOMAC functional disability. Longitudinal studies will need to confirm these relationships within persons. Muscle extensor (and flexor) strength (per body weight) provided significant independent information in addition to pain in explaining variability in lower limb function. PMID:25303012
Cyst-based measurements for assessing lymphangioleiomyomatosis in computed tomography
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lo, P., E-mail: pechinlo@mednet.edu.ucla; Brown, M. S.; Kim, H.
Purpose: To investigate the efficacy of a new family of measurements made on individual pulmonary cysts extracted from computed tomography (CT) for assessing the severity of lymphangioleiomyomatosis (LAM). Methods: CT images were analyzed using thresholding to identify a cystic region of interest from chest CT of LAM patients. Individual cysts were then extracted from the cystic region by the watershed algorithm, which separates individual cysts based on subtle edges within the cystic regions. A family of measurements were then computed, which quantify the amount, distribution, and boundary appearance of the cysts. Sequential floating feature selection was used to select a small subset of features for quantification of the severity of LAM. Adjusted R² from multiple linear regression and R² from linear regression against measurements from spirometry were used to compare the performance of our proposed measurements with currently used density based CT measurements in the literature, namely, the relative area measure and the D measure. Results: Volumetric CT data, performed at total lung capacity and residual volume, from a total of 49 subjects enrolled in the MILES trial were used in our study. Our proposed measures had adjusted R² ranging from 0.42 to 0.59 when regressing against the spirometry measures, with p < 0.05. For previously used density based CT measurements in the literature, the best R² was 0.46 (for only one instance), with the majority being lower than 0.3 or p > 0.05. Conclusions: The proposed family of CT-based cyst measurements have better correlation with spirometric measures than previously used density based CT measurements. They show potential as a sensitive tool for quantitatively assessing the severity of LAM.
Evaluation of Relationship between Trunk Muscle Endurance and Static Balance in Male Students
Barati, Amirhossein; SafarCherati, Afsaneh; Aghayari, Azar; Azizi, Faeze; Abbasi, Hamed
2013-01-01
Purpose Fatigue of trunk muscle contributes to spinal instability over strenuous and prolonged physical tasks and therefore may lead to injury; however, from a performance perspective, the relation between endurance-efficient core muscles and optimal balance control is not well understood. The purpose of this study was to examine the relationship of trunk muscle endurance and static balance. Methods Fifty male students living in the Tehran university dormitory (age 23.9±2.4, height 173.0±4.5, weight 70.7±6.3) took part in the study. Trunk muscle endurance was assessed using the Sørensen test of trunk extensor endurance, the trunk flexor endurance test, and the side bridge endurance test, and static balance was measured using the single-limb stance test. A multiple linear regression analysis was applied to test if the trunk muscle endurance measures significantly predicted the static balance. Results There were positive correlations between static balance level and trunk flexor, extensor and lateral endurance measures (Pearson correlation test, r=0.80 and P<0.001; r=0.71 and P<0.001; r=0.84 and P<0.001, respectively). According to multiple regression analysis for variables predicting static balance, the linear combination of trunk muscle endurance measures was significantly related to the static balance (F (3,46) = 66.60, P<0.001). Endurance of trunk flexor, extensor and lateral muscles were significantly associated with the static balance level. The regression model which included these factors had a sample multiple correlation coefficient of 0.902, indicating that approximately 81% of the variance of the static balance is explained by the model. Conclusion There is a significant relationship between trunk muscle endurance and static balance. PMID:24800004
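The step from the multiple correlation coefficient to "approximately 81% of the variance" can be checked with a few lines of arithmetic. The sketch below is only an illustration using the figures quoted in the abstract (R = 0.902, 50 participants, 3 predictors); the variable names are hypothetical, and the F formula is the standard overall F test for a linear model, not something the abstract spells out.

```python
# Values reported in the abstract above.
R = 0.902      # sample multiple correlation coefficient
n, k = 50, 3   # participants and predictors (flexor, extensor, lateral endurance)

# Share of variance explained by the model.
r_squared = R ** 2

# Overall F statistic implied by R^2 for a model with k predictors
# and n - k - 1 residual degrees of freedom.
f_stat = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

print(f"R^2 = {r_squared:.3f}")
print(f"F({k},{n - k - 1}) = {f_stat:.1f}")
```

The implied F lands close to the reported F(3,46) = 66.60; a small gap is expected because R is given to only three decimals.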
Malignant testicular tumour incidence and mortality trends
Wojtyła-Buciora, Paulina; Więckowska, Barbara; Krzywinska-Wiewiorowska, Małgorzata; Gromadecka-Sutkiewicz, Małgorzata
2016-01-01
Aim of the study In Poland testicular tumours are the most frequent cancer among men aged 20–44 years. Testicular tumour incidence since the 1980s and 1990s has been diversified geographically, with an increased risk of mortality in Wielkopolska Province, which was highlighted at the turn of the 1980s and 1990s. The aim of the study was the comparative analysis of the tendencies in incidence and death rates due to malignant testicular tumours observed among men in Poland and in Wielkopolska Province. Material and methods Data from the National Cancer Registry were used for calculations. The incidence/mortality rates among men due to malignant testicular cancer as well as the tendencies in incidence/death ratio observed in Poland and Wielkopolska were established based on regression equation. The analysis was deepened by adopting the multiple linear regression model. A p-value < 0.05 was arbitrarily adopted as the criterion of statistical significance, and for multiple comparisons it was modified according to the Bonferroni adjustment to a value of p < 0.0028. Calculations were performed with the use of PQStat v1.4.8 package. Results The incidence of malignant testicular neoplasms observed among men in Poland and in Wielkopolska Province indicated a significant rising tendency. The multiple linear regression model confirmed that the year variable is a strong incidence forecast factor only within the territory of Poland. A corresponding analysis of mortality rates among men in Poland and in Wielkopolska Province did not show any statistically significant correlations. Conclusions Late diagnosis of Polish patients calls for undertaking appropriate educational activities that would facilitate earlier reporting of the patients, thus increasing their chances for recovery. Introducing preventive examinations in the regions of increased risk of testicular tumour may allow earlier diagnosis. PMID:27095941
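The Bonferroni-adjusted threshold quoted in the abstract follows from dividing the nominal significance level by the number of comparisons. The abstract does not state that number; the value of 18 below is an assumption, chosen only because it reproduces the reported p < 0.0028.

```python
alpha = 0.05   # nominal significance level stated in the abstract
m = 18         # ASSUMED number of comparisons (not given in the abstract)

# Bonferroni adjustment: each individual test is held to alpha / m.
threshold = alpha / m

print(f"Bonferroni threshold for m = {m}: p < {threshold:.4f}")
```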
Virji, M. Abbas; Trapnell, Bruce C.; Carey, Brenna; Healey, Terrance; Kreiss, Kathleen
2014-01-01
Rationale: Occupational exposure to indium compounds, including indium–tin oxide, can result in potentially fatal indium lung disease. However, the early effects of exposure on the lungs are not well understood. Objectives: To determine the relationship between short-term occupational exposures to indium compounds and the development of early lung abnormalities. Methods: Among indium–tin oxide production and reclamation facility workers, we measured plasma indium, respiratory symptoms, pulmonary function, chest computed tomography, and serum biomarkers of lung disease. Relationships between plasma indium concentration and health outcome variables were evaluated using restricted cubic spline and linear regression models. Measurements and Main Results: Eighty-seven (93%) of 94 indium–tin oxide facility workers (median tenure, 2 yr; median plasma indium, 1.0 μg/l) participated in the study. Spirometric abnormalities were not increased compared with the general population, and few subjects had radiographic evidence of alveolar proteinosis (n = 0), fibrosis (n = 2), or emphysema (n = 4). However, in internal comparisons, participants with plasma indium concentrations ≥ 1.0 μg/l had more dyspnea, lower mean FEV1 and FVC, and higher median serum Krebs von den Lungen-6 and surfactant protein-D levels. Spline regression demonstrated nonlinear exposure response, with significant differences occurring at plasma indium concentrations as low as 1.0 μg/l compared with the reference. Associations between health outcomes and the natural log of plasma indium concentration were evident in linear regression models. Associations were not explained by age, smoking status, facility tenure, or prior occupational exposures. Conclusions: In indium–tin oxide facility workers with short-term, low-level exposure, plasma indium concentrations lower than previously reported were associated with lung symptoms, decreased spirometric parameters, and increased serum biomarkers of lung disease. 
PMID:25295756
Davies, Simon J.C.; Mulsant, Benoit H.; Flint, Alastair J.; Rothschild, Anthony J.; Whyte, Ellen M.; Meyers, Barnett S.
2014-01-01
Background There are conflicting results on the impact of anxiety on depression outcomes. The impact of anxiety has not been studied in major depression with psychotic features (“psychotic depression”). Aims We assessed the impact of specific anxiety symptoms and disorders on the outcomes of psychotic depression. Methods We analyzed data from the Study of Pharmacotherapy for Psychotic Depression that randomized 259 younger and older participants to either olanzapine plus placebo or olanzapine plus sertraline. We assessed the impact of specific anxiety symptoms from the Brief Psychiatric Rating Scale (“tension”, “anxiety” and “somatic concerns” and a composite anxiety score) and diagnoses (panic disorder and GAD) on psychotic depression outcomes using linear or logistic regression. Age, gender, education and benzodiazepine use (at baseline and end) were included as covariates. Results Anxiety symptoms at baseline and anxiety disorder diagnoses differentially impacted outcomes. On adjusted linear regression there was an association between improvement in depressive symptoms and both baseline “tension” (coefficient = 0.784; 95% CI: 0.169–1.400; p = 0.013) and the composite anxiety score (regression coefficient = 0.348; 95% CI: 0.064–0.632; p = 0.017). There was an interaction between “tension” and treatment group, with better responses in those randomized to combination treatment if they had high baseline anxiety scores (coefficient = 1.309; 95% CI: 0.105–2.514; p = 0.033). In contrast, panic disorder was associated with worse clinical outcomes (coefficient = −3.858; 95% CI: –7.281 to −0.434; p = 0.027) regardless of treatment. Conclusions Our results suggest that analysis of the impact of anxiety on depression outcome needs to differentiate psychic and somatic symptoms. PMID:24656524
Serum Vitamin D Levels and Markers of Severity of Childhood Asthma in Costa Rica
Brehm, John M.; Celedón, Juan C.; Soto-Quiros, Manuel E.; Avila, Lydiana; Hunninghake, Gary M.; Forno, Erick; Laskey, Daniel; Sylvia, Jody S.; Hollis, Bruce W.; Weiss, Scott T.; Litonjua, Augusto A.
2009-01-01
Rationale: Maternal vitamin D intake during pregnancy has been inversely associated with asthma symptoms in early childhood. However, no study has examined the relationship between measured vitamin D levels and markers of asthma severity in childhood. Objectives: To determine the relationship between measured vitamin D levels and both markers of asthma severity and allergy in childhood. Methods: We examined the relation between 25-hydroxyvitamin D levels (the major circulating form of vitamin D) and markers of allergy and asthma severity in a cross-sectional study of 616 Costa Rican children between the ages of 6 and 14 years. Linear, logistic, and negative binomial regressions were used for the univariate and multivariate analyses. Measurements and Main Results: Of the 616 children with asthma, 175 (28%) had insufficient levels of vitamin D (<30 ng/ml). In multivariate linear regression models, vitamin D levels were significantly and inversely associated with total IgE and eosinophil count. In multivariate logistic regression models, a log10 unit increase in vitamin D levels was associated with reduced odds of any hospitalization in the previous year (odds ratio [OR], 0.05; 95% confidence interval [CI], 0.004–0.71; P = 0.03), any use of antiinflammatory medications in the previous year (OR, 0.18; 95% CI, 0.05–0.67; P = 0.01), and increased airway responsiveness (a ≤8.58-μmol provocative dose of methacholine producing a 20% fall in baseline FEV1 [OR, 0.15; 95% CI, 0.024–0.97; P = 0.05]). Conclusions: Our results suggest that vitamin D insufficiency is relatively frequent in an equatorial population of children with asthma. In these children, lower vitamin D levels are associated with increased markers of allergy and asthma severity. PMID:19179486
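An odds ratio expressed per log10 unit describes a tenfold change in vitamin D level, which can be hard to picture. As a purely illustrative rescaling (not a result reported by the authors), the same fitted coefficient can be re-expressed per doubling of the level, since a doubling equals log10(2) ≈ 0.301 log10 units. Only the point estimate for hospitalization is used here; variable names are hypothetical.

```python
import math

# Reported odds ratio for any hospitalization per log10-unit increase
# in vitamin D level (point estimate from the abstract).
or_per_log10 = 0.05

# A doubling of vitamin D corresponds to log10(2) log10 units, so the
# odds ratio per doubling is the per-log10 odds ratio raised to that power.
or_per_doubling = or_per_log10 ** math.log10(2)

print(f"Implied odds ratio per doubling: {or_per_doubling:.2f}")
```

The wide confidence interval reported for this estimate (0.004–0.71) carries over to any rescaled value, so the figure should be read as a rough guide only.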
Correlation of Vitamin D status and orthodontic-induced external apical root resorption
Tehranchi, Azita; Sadighnia, Azin; Younessian, Farnaz; Abdi, Amir H.; Shirvani, Armin
2017-01-01
Background: Adequate Vitamin D is essential for dental and skeletal health in children and adults. The purpose of this study was to assess the correlation of serum Vitamin D level with orthodontic-induced external apical root resorption (EARR) following fixed orthodontic treatment. Materials and Methods: In this cross-sectional study, the prevalence of Vitamin D deficiency (defined by 25-hydroxyvitamin-D) was determined in 34 patients (23.5% male; age range 12–23 years; mean age 16.63 ± 2.84) treated with fixed orthodontic treatment. Root resorption of four maxillary incisors was measured using before and after periapical radiographs (136 measured teeth) by means of a design-to-purpose software to optimize data collection. Teeth with a maximum percentage of root resorption (%EARR) were indicated as representative root resorption for each patient. A multiple linear regression model and Pearson correlation coefficient were used to assess the association of Vitamin D status and observed EARR. P < 0.05 was considered statistically significant. Results: The Pearson coefficient between these two variables was approximately 0.15 (P = 0.38). Regression analysis revealed that Vitamin D status of the patients demonstrated no significant statistical correlation with EARR, after adjustment of confounding variables using linear regression model (P > 0.05). Conclusion: This study suggests that Vitamin D level is not among the clinical variables that are potential contributors for EARR. The prevalence of Vitamin D deficiency does not differ in patients with higher EARR. These data suggest the possibility that Vitamin D insufficiency may not contribute to the development of more apical root resorption although this remains to be confirmed by further longitudinal cohort studies. PMID:29238379
Bone Mineral Density across a Range of Physical Activity Volumes: NHANES 2007–2010
Whitfield, Geoffrey P.; Kohrt, Wendy M.; Pettee Gabriel, Kelley K.; Rahbar, Mohammad H.; Kohl, Harold W.
2014-01-01
Introduction The association between aerobic physical activity volume and bone mineral density (BMD) is not completely understood. The purpose of this study was to clarify the association between BMD and aerobic activity across a broad range of activity volumes, in particular volumes between those recommended in the 2008 Physical Activity Guidelines for Americans and those of trained endurance athletes. Methods Data from the 2007–2010 National Health and Nutrition Examination Survey were used to quantify the association between reported physical activity and BMD at the lumbar spine and proximal femur across the entire range of activity volumes reported by US adults. Participants were categorized into multiples of the minimum guideline-recommended volume based on reported moderate and vigorous intensity leisure activity. Lumbar and proximal femur BMD was assessed with dual-energy x-ray absorptiometry. Results Among women, multivariable-adjusted linear regression analyses revealed no significant differences in lumbar BMD across activity categories, while proximal femur BMD was significantly higher among those who exceeded guidelines by 2–4 times than those who reported no activity. Among men, multivariable-adjusted BMD at both sites neared its highest values among those who exceeded guidelines by at least 4 times and was not progressively higher with additional activity. Logistic regression estimating the odds of low BMD generally echoed the linear regression results. Conclusion The association between physical activity volume and BMD is complex. Among women, exceeding guidelines by 2–4 times may be important for maximizing BMD at the proximal femur, while among men, exceeding guidelines by 4+ times may be beneficial for lumbar and proximal femur BMD. PMID:24870584
Komaroff, Marina
2016-01-01
Objective. The aim of this study is to investigate if weight fluctuation is an independent risk factor for postmenopausal breast cancer (PBC) among women who gained weight in adult years. Methods. NHANES I Epidemiologic Follow-Up Study (NHEFS) database was used in the study. Women who were cancer-free at enrollment and diagnosed for the first time with breast cancer at age 50 or greater were considered cases. Controls were chosen from the subset of cancer-free women and matched to cases by years of follow-up and status of body mass index (BMI) at 25 years of age. Weight fluctuation was measured by the root-mean-square-error (RMSE) from a simple linear regression model for each woman with their body mass index (BMI) regressed on age (started at 25 years) while women with the positive slope from this regression were defined as weight gainers. Data were analyzed using conditional logistic regression models. Results. A total of 158 women were included in the study. The conditional logistic regression adjusted for weight gain demonstrated positive association between weight fluctuation in adult years and postmenopausal breast cancers (odds ratio/OR = 1.67; 95% confidence interval/CI: 1.06–2.66). Conclusions. The data suggested that long-term weight fluctuation was a significant risk factor for PBC among women who gained weight in adult years. This finding underscores the importance of maintaining lost weight and avoiding weight fluctuation. PMID:26953120
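The weight-fluctuation measure described above (RMSE from a per-woman regression of BMI on age, with a positive slope marking a weight gainer) can be sketched as follows. This is a minimal illustration, not the study's code: the data points are hypothetical, the function names are invented here, and dividing by n − 2 is an assumption, since the abstract does not say which RMSE divisor was used.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def weight_fluctuation(ages, bmis):
    """Return (slope, rmse) from regressing BMI on age for one woman."""
    slope, intercept = fit_line(ages, bmis)
    residuals = [b - (intercept + slope * a) for a, b in zip(ages, bmis)]
    # ASSUMPTION: residual degrees of freedom n - 2 (two fitted parameters).
    rmse = math.sqrt(sum(r * r for r in residuals) / (len(ages) - 2))
    return slope, rmse

# Hypothetical BMI history for a single participant, starting at age 25.
ages = [25, 35, 45, 55, 65]
bmis = [22.0, 24.5, 23.5, 27.0, 26.0]

slope, rmse = weight_fluctuation(ages, bmis)
is_weight_gainer = slope > 0   # positive slope defines a weight gainer

print(f"slope = {slope:.3f} BMI units/year, RMSE = {rmse:.3f}")
print("weight gainer" if is_weight_gainer else "not a weight gainer")
```

The slope captures the long-term trend, while the RMSE captures how far the individual measurements scatter around that trend, which is what lets the two be entered separately in the conditional logistic model.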
Senn, Stephen; Graf, Erika; Caputo, Angelika
2007-12-30
Stratifying and matching by the propensity score are increasingly popular approaches to deal with confounding in medical studies investigating effects of a treatment or exposure. A more traditional alternative technique is the direct adjustment for confounding in regression models. This paper discusses fundamental differences between the two approaches, with a focus on linear regression and propensity score stratification, and identifies points to be considered for an adequate comparison. The treatment estimators are examined for unbiasedness and efficiency. This is illustrated in an application to real data and supplemented by an investigation on properties of the estimators for a range of underlying linear models. We demonstrate that in specific circumstances the propensity score estimator is identical to the effect estimated from a full linear model, even if it is built on coarser covariate strata than the linear model. As a consequence the coarsening property of the propensity score (adjustment for a one-dimensional confounder instead of a high-dimensional covariate) may be viewed as a way to implement a pre-specified, richly parametrized linear model. We conclude that the propensity score estimator inherits the potential for overfitting and that care should be taken to restrict covariates to those relevant for outcome. Copyright (c) 2007 John Wiley & Sons, Ltd.
Trends in Timing of Pregnancy Awareness Among US Women.
Branum, Amy M; Ahrens, Katherine A
2017-04-01
Objectives Early pregnancy detection is important for improving pregnancy outcomes as the first trimester is a critical window of fetal development; however, there has been no description of trends in timing of pregnancy awareness among US women. Methods We examined data from the 1995, 2002, 2006-2010 and 2011-2013 National Survey of Family Growth on self-reported timing of pregnancy awareness among women aged 15-44 years who reported at least one pregnancy in the 4 or 5 years prior to interview that did not result in induced abortion or adoption (n = 17,406). We examined the associations between maternal characteristics and late pregnancy awareness (≥7 weeks' gestation) using adjusted prevalence ratios from logistic regression models. Gestational age at time of pregnancy awareness (continuous) was regressed over year of pregnancy conception (1990-2012) in a linear model. Results Among all pregnancies reported, gestational age at time of pregnancy awareness was 5.5 weeks (standard error = 0.04) and the prevalence of late pregnancy awareness was 23% (standard error = 1%). Late pregnancy awareness decreased with maternal age, was more prevalent among non-Hispanic black and Hispanic women compared to non-Hispanic white women, and for unintended pregnancies versus those that were intended (p < 0.01). Mean time of pregnancy awareness did not change linearly over a 23-year time period after adjustment for maternal age at the time of conception (p < 0.16). Conclusions for Practice On average, timing of pregnancy awareness did not change linearly during 1990-2012 among US women and occurs later among certain groups of women who are at higher risk of adverse birth outcomes.
Borchert, Sabrina; Wessolly, Michael; Mairinger, Elena; Kollmeier, Jens; Hager, Thomas; Mairinger, Thomas; Christoph, Daniel C.; Walter, Robert F.H.; Eberhardt, Wilfried E.E.; Plönes, Till; Wohlschlaeger, Jeremias; Jasani, Bharat; Schmid, Kurt Werner; Bankfalvi, Agnes
2018-01-01
Background Malignant pleural mesothelioma (MPM) is a biologically highly aggressive tumor arising from the pleura with a dismal prognosis. Cisplatin is the drug of choice for the treatment of MPM, and carboplatin seems to have comparable efficacy. Nevertheless, cisplatin treatment results in a response rate of merely 14% and a median survival of less than seven months. Due to their role in many cellular processes, metallothioneins (MTs) have been widely studied in various cancers. The known heavy metal detoxifying effect of MT-I and MT-II may be the reason for heavy metal drug resistance of various cancers including MPM. Methods 105 patients were retrospectively analyzed immunohistochemically for their MT expression levels. Survival analysis was done by Cox-regression, and statistical significance determined using likelihood ratio, Wald test and Score (logrank) tests. Results Cox-regression analyses were done in a linear and logarithmic scale revealing a significant association between expression of MT and shortened overall survival (OS) in a linear (p=0.0009) and logarithmic scale (p=0.0003). Reduced progression free survival (PFS) was also observed for MT expressing tumors (linear: p=0.0134, log: p=0.0152). Conclusion Since both overall survival and progression-free survival are negatively correlated with detectable MT expression in MPM, our results indicate a possible resistance to platin-based chemotherapy associated with MT expression upregulation, found exclusively in progressive MPM samples. Initial cell culture studies suggest promoter DNA hypomethylation and expression of miRNA-566, a direct regulator of copper transporter SLC31A1 and a putative regulator of MT1A and MT2A gene expression, to be responsible for the drug resistance. PMID:29854276
Trends in Timing of Pregnancy Awareness Among US Women
2017-01-01
Objectives Early pregnancy detection is important for improving pregnancy outcomes as the first trimester is a critical window of fetal development; however, there has been no description of trends in timing of pregnancy awareness among US women. Methods We examined data from the 1995, 2002, 2006–2010 and 2011–2013 National Survey of Family Growth on self-reported timing of pregnancy awareness among women aged 15–44 years who reported at least one pregnancy in the 4 or 5 years prior to interview that did not result in induced abortion or adoption (n = 17,406). We examined the associations between maternal characteristics and late pregnancy awareness (≥7 weeks’ gestation) using adjusted prevalence ratios from logistic regression models. Gestational age at time of pregnancy awareness (continuous) was regressed over year of pregnancy conception (1990–2012) in a linear model. Results Among all pregnancies reported, gestational age at time of pregnancy awareness was 5.5 weeks (standard error = 0.04) and the prevalence of late pregnancy awareness was 23% (standard error = 1%). Late pregnancy awareness decreased with maternal age, was more prevalent among non-Hispanic black and Hispanic women compared to non-Hispanic white women, and for unintended pregnancies versus those that were intended (p < 0.01). Mean time of pregnancy awareness did not change linearly over a 23-year time period after adjustment for maternal age at the time of conception (p < 0.16). Conclusions for Practice On average, timing of pregnancy awareness did not change linearly during 1990–2012 among US women and occurs later among certain groups of women who are at higher risk of adverse birth outcomes. PMID:27449777
Effect of water-based recovery on blood lactate removal after high-intensity exercise.
Lucertini, Francesco; Gervasi, Marco; D'Amen, Giancarlo; Sisti, Davide; Rocchi, Marco Bruno Luigi; Stocchi, Vilberto; Benelli, Piero
2017-01-01
This study assessed the effectiveness of water immersion to the shoulders in enhancing blood lactate removal during active and passive recovery after short-duration high-intensity exercise. Seventeen cyclists underwent active water- and land-based recoveries and passive water and land-based recoveries. The recovery conditions lasted 31 minutes each and started after the identification of each cyclist's blood lactate accumulation peak, induced by a 30-second all-out sprint on a cycle ergometer. Active recoveries were performed on a cycle ergometer at 70% of the oxygen consumption corresponding to the lactate threshold (the control for the intensity was oxygen consumption), while passive recoveries were performed with subjects at rest and seated on the cycle ergometer. Blood lactate concentration was measured 8 times during each recovery condition and lactate clearance was modeled over a negative exponential function using non-linear regression. Actual active recovery intensity was compared to the target intensity (one sample t-test) and passive recovery intensities were compared between environments (paired sample t-tests). Non-linear regression parameters (coefficients of the exponential decay of lactate; predicted resting lactates; predicted delta decreases in lactate) were compared between environments (linear mixed model analyses for repeated measures) separately for the active and passive recovery modes. Active recovery intensities did not differ significantly from the target oxygen consumption, whereas passive recovery resulted in a slightly lower oxygen consumption when performed while immersed in water rather than on land. The exponential decay of blood lactate was not significantly different in water- or land-based recoveries in either active or passive recovery conditions. In conclusion, water immersion at 29°C would not appear to be an effective practice for improving post-exercise lactate removal in either the active or passive recovery modes.
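The negative exponential model used for lactate clearance can be sketched in a dependency-free way. The example below assumes the form L(t) = L_rest + delta · exp(−k·t), which matches the three parameter types named in the abstract (decay coefficient, predicted resting lactate, predicted delta decrease) but is not confirmed as the authors' exact parameterization. The sampling times and "true" values are hypothetical, and the fitting strategy (grid search over k with a closed-form linear fit for the other two parameters) is a simple stand-in for whatever non-linear regression routine the study used.

```python
import math

def fit_linear(z, y):
    """Least squares fit of y = intercept + slope * z; also returns the SSE."""
    n = len(z)
    mean_z = sum(z) / n
    mean_y = sum(y) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    szy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    slope = szy / szz
    intercept = mean_y - slope * mean_z
    sse = sum((yi - (intercept + slope * zi)) ** 2 for zi, yi in zip(z, y))
    return intercept, slope, sse

def fit_exponential_decay(times, lactate, k_grid):
    """Fit L(t) = rest + delta * exp(-k * t).

    For each candidate k the model is linear in (rest, delta), so those two
    are solved in closed form and the k with the smallest SSE is kept.
    """
    best = None
    for k in k_grid:
        z = [math.exp(-k * t) for t in times]
        rest, delta, sse = fit_linear(z, lactate)
        if best is None or sse < best[3]:
            best = (k, rest, delta, sse)
    return best

# Hypothetical sampling schedule: 8 measurements over a 31-minute recovery.
times = [1, 3, 5, 8, 12, 17, 24, 31]

# Synthetic, noise-free lactate values (mmol/l) from assumed parameters.
true_k, true_rest, true_delta = 0.08, 1.5, 12.0
lactate = [true_rest + true_delta * math.exp(-true_k * t) for t in times]

k_grid = [i / 1000 for i in range(1, 501)]   # candidate decay rates per minute
k, rest, delta, sse = fit_exponential_decay(times, lactate, k_grid)

print(f"k = {k:.3f} /min, resting lactate = {rest:.2f}, delta = {delta:.2f}")
```

Fitting each recovery condition separately in this way yields one set of parameters per cyclist and condition, which is the kind of input a repeated-measures comparison between water and land would need.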
Hoover, Stephen; Jackson, Eric V.; Paul, David; Locke, Robert
2016-01-01
Summary Background Accurate prediction of future patient census in hospital units is essential for patient safety, health outcomes, and resource planning. Forecasting census in the Neonatal Intensive Care Unit (NICU) is particularly challenging due to limited ability to control the census and clinical trajectories. The fixed average census approach, using average census from previous year, is a forecasting alternative used in clinical practice, but has limitations due to census variations. Objective Our objectives are to: (i) analyze the daily NICU census at a single health care facility and develop census forecasting models, (ii) explore models with and without patient data characteristics obtained at the time of admission, and (iii) evaluate accuracy of the models compared with the fixed average census approach. Methods We used five years of retrospective daily NICU census data for model development (January 2008 – December 2012, N=1827 observations) and one year of data for validation (January – December 2013, N=365 observations). Best-fitting models of ARIMA and linear regression were applied to various 7-day prediction periods and compared using error statistics. Results The census showed a slightly increasing linear trend. Best fitting models included a non-seasonal model, ARIMA(1,0,0), seasonal ARIMA models, ARIMA(1,0,0)x(1,1,2)7 and ARIMA(2,1,4)x(1,1,2)14, as well as a seasonal linear regression model. Proposed forecasting models resulted on average in 36.49% improvement in forecasting accuracy compared with the fixed average census approach. Conclusions Time series models provide higher prediction accuracy under different census conditions compared with the fixed average census approach. Presented methodology is easily applicable in clinical practice, can be generalized to other care settings, support short- and long-term census forecasting, and inform staff resource planning. PMID:27437040
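The comparison between a time series model and the fixed average census approach can be illustrated with the simplest model named above, ARIMA(1,0,0), i.e. a first-order autoregression. The sketch below is an assumption-laden toy: the census counts are hypothetical, the autoregressive coefficient is estimated from the lag-1 autocorrelation rather than by the likelihood-based fitting a statistics package would use, and forecasts are one day ahead rather than over the 7-day periods used in the study.

```python
def estimate_ar1(series):
    """Return (mean, phi) for an AR(1) model via the lag-1 autocorrelation."""
    n = len(series)
    mu = sum(series) / n
    dev = [x - mu for x in series]
    num = sum(dev[t] * dev[t - 1] for t in range(1, n))
    den = sum(d * d for d in dev)
    return mu, num / den

def forecast_ar1(mu, phi, last):
    """One-step-ahead AR(1) forecast: pull the last value toward the mean."""
    return mu + phi * (last - mu)

def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical daily census counts (not data from the study).
history = [40, 42, 45, 47, 46, 44, 41, 39, 38, 40, 43, 46, 48, 47, 44, 42]
holdout = [41, 40, 42, 45, 47, 46, 43]

mu, phi = estimate_ar1(history)

# Fixed average approach: forecast the historical mean every day.
fixed_forecasts = [mu] * len(holdout)

# AR(1): each day's forecast uses the previous day's observed census.
ar1_forecasts = []
last = history[-1]
for actual in holdout:
    ar1_forecasts.append(forecast_ar1(mu, phi, last))
    last = actual

mae_fixed = mean_abs_error(holdout, fixed_forecasts)
mae_ar1 = mean_abs_error(holdout, ar1_forecasts)
improvement = 100 * (mae_fixed - mae_ar1) / mae_fixed

print(f"phi = {phi:.2f}, MAE fixed = {mae_fixed:.2f}, MAE AR(1) = {mae_ar1:.2f}")
print(f"Relative change in error vs fixed average: {improvement:.1f}%")
```

The relative-improvement line mirrors how a figure such as the reported 36.49% could be expressed, although the abstract does not say which error statistic that percentage is based on.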
Denoeud, Lise; Fievet, Nadine; Aubouy, Agnès; Ayemonna, Paul; Kiniffo, Richard; Massougbodji, Achille; Cot, Michel
2007-01-01
Background In areas of stable transmission, malaria during pregnancy is associated with severe maternal and foetal outcomes, especially low birth weight (LBW). To prevent these complications, weekly chloroquine (CQ) chemoprophylaxis is now being replaced by intermittent preventive treatment with sulfadoxine-pyrimethamine in West Africa. The prevalence of placental malaria and its burden on LBW were assessed in Benin to evaluate the efficacy of weekly CQ chemoprophylaxis, prior to its replacement by intermittent preventive treatment. Methods In two maternity clinics in Ouidah, an observational study was conducted between April 2004 and April 2005. At each delivery, placental blood smears were examined for malaria infection and women were interviewed on their pregnancy history including CQ intake and dosage. CQ was measured in the urine of a sub-sample (n = 166). Multiple logistic and linear regression were used to assess factors associated with LBW and placental malaria. Results Among 1090 singleton live births, prevalence of placental malaria and LBW were 16% and 17% respectively. After adjustment, there was a non-significant association between placental malaria and LBW (adjusted OR = 1.43; P = 0.10). Multiple linear regression showed a positive association between placental malaria and decreased birth weight in primigravidae. More than 98% of the women reported regular chemoprophylaxis and CQ was detectable in 99% of urine samples. Protection from LBW was high in women reporting regular CQ prophylaxis, with a strong duration-effect relationship (test for linear trend: P < 0.001). Conclusion Despite high parasite resistance and limited effect on placental malaria, CQ chemoprophylaxis taken at adequate doses was still effective in reducing LBW in Benin. PMID:17341298
Du, Z; Zhang, J; Lu, J X; Lu, L P
2018-05-10
Objective: To analyze the distribution characteristics of bacillary dysentery in Beijing during 2004-2015 and evaluate the influence of meteorological factors on the temporal and spatial distribution of bacillary dysentery. Methods: The incidence data of bacterial dysentery and meteorological data in Beijing from 2004 to 2015 were collected. Descriptive epidemiological analysis was conducted to study the distribution characteristics of bacterial dysentery. Linear correlation analysis and multiple linear regression analysis were carried out to investigate the relationship between the incidence of bacillary dysentery and average precipitation, average air temperature, sunshine hours, average wind speed, average air pressure, gale and rain days. Results: A total of 280 704 cases of bacterial dysentery, including 36 deaths, were reported from 2004 to 2015 in Beijing; the average annual incidence was 130.15/100 000. The annual incidence peak was mainly between May and October, the cases that occurred during this period accounted for 80.75% of the total, and the incidence was highest in age group 0 year. The population distribution showed that most cases were children outside child care settings and students, and the sex ratio of the cases was 1.22:1. The reported incidence of bacillary dysentery was positively associated with average precipitation, average air temperature and rain days, with correlation coefficients of 0.931, 0.878 and 0.888, but it was negatively associated with the average pressure, with a correlation coefficient of -0.820. The multiple linear regression equation for fitting analysis of bacillary dysentery and meteorological factors was Y = 3.792 + 0.162 X1. Conclusion: The reported incidence of bacillary dysentery in Beijing was much higher than national level. The annual incidence peak was during July to August, and the average precipitation was an important meteorological factor influencing the incidence of bacillary dysentery.
Obesity and the labor market: A fresh look at the weight penalty.
Caliendo, Marco; Gehrsitz, Markus
2016-12-01
This paper applies semiparametric regression models to shed light on the relationship between body weight and labor market outcomes in Germany. We find conclusive evidence that these relationships are poorly described by linear or quadratic OLS specifications. Women's wages and employment probabilities do not follow a linear relationship and are highest at a body weight far below the clinical threshold of obesity. This indicates that looks, rather than health, is the driving force behind the adverse labor market outcomes to which overweight women are subject. Further support is lent to this notion by the fact that wage penalties for overweight and obese women are only observable in white-collar occupations. On the other hand, bigger appears to be better in the case of men, for whom employment prospects increase with weight, albeit with diminishing returns. However, underweight men in blue-collar jobs earn lower wages because they lack the muscular strength required in such occupations. Copyright © 2016 Elsevier B.V. All rights reserved.
Alam, Prawez; Foudah, Ahmed I.; Zaatout, Hala H.; T, Kamal Y; Abdel-Kader, Maged S.
2017-01-01
Background: A simple and sensitive thin-layer chromatographic method has been established for quantification of glycyrrhizin in Glycyrrhiza glabra rhizome and baby herbal formulations by validated Reverse Phase HPTLC method. Materials and Methods: RP-HPTLC Method was carried out using glass coated with RP-18 silica gel 60 F254S HPTLC plates using methanol-water (7: 3 v/v) as mobile phase. Results: The developed plate was scanned and quantified densitometrically at 256 nm. Glycyrrhizin peaks from Glycyrrhiza glabra rhizome and baby herbal formulations were identified by comparing their single spot at Rf = 0.63 ± 0.01. Linear regression analysis revealed a good linear relationship between peak area and amount of glycyrrhizin in the range of 2000-7000 ng/band. Conclusion: The method was validated, in accordance with ICH guidelines for precision, accuracy, and robustness. The proposed method will be useful to enumerate the therapeutic dose of glycyrrhizin in herbal formulations as well as in bulk drug. PMID:28573236
Epiaortic fat pad area: A novel index for the dimensions of the ascending aorta.
Toufan, Mehrnoush; Pourafkari, Leili; Boudagh, Shabnam; Nader, Nader D
2016-06-01
We sought to investigate the possible association between the area of the epiaortic fat pad (EAFP) and dimensions of the ascending aorta. A total of 193 individuals underwent transthoracic echocardiography (TTE) prospectively. The area of the EAFP was traced anterior to the aortic root and correlated with the diameter of the aorta. The mean area of the EAFP was 5.16 ± 2.28 cm². Absolute and indexed dimensions of the ascending aorta had a significant correlation with the area of the EAFP (p < 0.001 for all). In a multivariate linear regression model, age >65 (p < 0.001), body mass index >30 kg/m² (p = 0.02) and a history of hyperlipidemia (p = 0.003) were identified as independent predictors of the area of the EAFP. In conclusion, both the absolute and indexed diameters of the ascending aorta at the different segments that directly come into contact with the EAFP linearly correlate with the area of the EAFP measured by TTE. © The Author(s) 2016.
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A
2013-01-01
Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.
Aqil, Muhammad; Kita, Ichiro; Yano, Akira; Nishiyama, Soichi
2007-10-01
Traditionally, the multiple linear regression technique has been one of the most widely used models in simulating hydrological time series. However, when the nonlinear phenomenon is significant, multiple linear regression will fail to develop an appropriate predictive model. Recently, neuro-fuzzy systems have gained much popularity for calibrating nonlinear relationships. This study evaluated the potential of a neuro-fuzzy system as an alternative to the traditional statistical regression technique for the purpose of predicting flow from a local source in a river basin. The effectiveness of the proposed identification technique was demonstrated through a simulation study of the river flow time series of the Citarum River in Indonesia. Furthermore, in order to provide the uncertainty associated with the estimation of river flow, a Monte Carlo simulation was performed. As a comparison, a multiple linear regression analysis that was being used by the Citarum River Authority was also examined using various statistical indices. The simulation results using 95% confidence intervals indicated that the neuro-fuzzy model consistently underestimated the magnitude of high flow while the low and medium flow magnitudes were estimated closer to the observed data. The comparison of the prediction accuracy of the neuro-fuzzy and linear regression methods indicated that the neuro-fuzzy approach was more accurate in predicting river flow dynamics. The neuro-fuzzy model was able to improve the root mean square error (RMSE) and mean absolute percentage error (MAPE) values of the multiple linear regression forecasts by about 13.52% and 10.73%, respectively. Considering its simplicity and efficiency, the neuro-fuzzy model is recommended as an alternative tool for modeling of flow dynamics in the study area.
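The two indices quoted above, and the percentage improvement of one forecast over another, can be sketched as follows. This is only an illustration under stated assumptions: the flow values and the function names are hypothetical and are not taken from the Citarum River study, and the improvement is assumed to be the relative reduction in each error index.

```python
import math


def rmse(observed, predicted):
    """Root mean square error of a forecast."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, predicted):
    """Mean absolute percentage error (observed values must be non-zero)."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def relative_improvement(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error (assumed definition)."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical flows, for illustration only.
observed = [10.0, 20.0, 40.0, 80.0]
linear_forecast = [14.0, 16.0, 44.0, 72.0]
neuro_fuzzy_forecast = [12.0, 18.0, 42.0, 76.0]

rmse_gain = relative_improvement(rmse(observed, linear_forecast),
                                 rmse(observed, neuro_fuzzy_forecast))
mape_gain = relative_improvement(mape(observed, linear_forecast),
                                 mape(observed, neuro_fuzzy_forecast))
print(f"RMSE improved by {rmse_gain:.1f}%, MAPE improved by {mape_gain:.1f}%")
```

Because RMSE squares the errors, it weights large misses (such as underestimated high flows) more heavily than MAPE does, which is one reason the two indices can report different improvements.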
González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O
2013-07-01
There is extensive evidence of the negative impacts on health linked to the rise of the regional background of particulate matter (PM) 10 levels. These levels are often increased over urban areas, becoming one of the main air pollution concerns. This is the case in the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7-year historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative (log [NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity) and qualitative (working days/weekends, season (winter/summer), the hour (from 00 to 23 UTC) and precipitation/no precipitation). Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms following Sawa's Bayesian Information Criterion (INT-BIC). Each type of model is calculated using two different periods: the training dataset (6 years) and the testing dataset (1 year). The results of each type of model show that the INT-BIC-based model (R² = 0.42) is the best. Results were R of 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to the state-of-the-art methodology. The related error calculated for longer time intervals (monthly or seasonal means) diminished significantly (R of 0.75-0.80 for monthly means and R of 0.80 to 0.98 for seasonal means) with respect to shorter periods.
O'Leary, Neil; Chauhan, Balwantray C; Artes, Paul H
2012-10-01
To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation -2.9 dB, interquartile range: -6.3, -1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.
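The combination step described above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the pointwise p-values use a normal approximation to the slope test, the statistic is assumed to be the negative sum of the logs of the p-values at or below a truncation point `tau`, and the data, function names and parameter values are hypothetical. Whole examinations are reordered together, so each permutation keeps the spatial pattern within a visual field intact.

```python
import math
import random
from statistics import NormalDist


def slope_p_value(times, values):
    """One-sided p-value for a negative slope (normal approximation, an assumption)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    rss = sum((v - intercept - slope * t) ** 2 for t, v in zip(times, values))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return NormalDist().cdf(slope / max(std_err, 1e-12))


def truncated_product_stat(p_values, tau=0.05):
    """Combine only the p-values at or below tau (assumed form of the statistic S)."""
    return -sum(math.log(max(p, 1e-300)) for p in p_values if p <= tau)


def poplr_p_value(times, series, n_perm=500, tau=0.05, seed=0):
    """Overall p-value: compare S with its distribution over reordered examinations.

    series holds one list of sensitivities per test location, ordered by examination.
    """
    rng = random.Random(seed)

    def stat(order):
        p_values = [slope_p_value(times, [loc[i] for i in order]) for loc in series]
        return truncated_product_stat(p_values, tau)

    observed = stat(range(len(times)))
    at_least_as_extreme = 0
    for _ in range(n_perm):
        order = list(range(len(times)))
        rng.shuffle(order)
        if stat(order) >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Hypothetical series: 5 locations, 10 examinations, each losing about 1 dB per visit.
times = list(range(10))
series = [[30.0 - t + ((7 * t + 3 * k) % 5 - 2) * 0.3 for t in times] for k in range(5)]
p_overall = poplr_p_value(times, series)
print(f"overall p-value: {p_overall:.4f}")
```

Since the same pointwise test is applied to the observed and the reordered series, the permutation step calibrates the overall p-value even when the pointwise p-values themselves are only approximate.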
ERIC Educational Resources Information Center
Liou, Pey-Yan
2009-01-01
The current study examines three regression models: OLS (ordinary least square) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…
Use of AMMI and linear regression models to analyze genotype-environment interaction in durum wheat.
Nachit, M M; Nachit, G; Ketata, H; Gauch, H G; Zobel, R W
1992-03-01
The joint durum wheat (Triticum turgidum L var 'durum') breeding program of the International Maize and Wheat Improvement Center (CIMMYT) and the International Center for Agricultural Research in the Dry Areas (ICARDA) for the Mediterranean region employs extensive multilocation testing. Multilocation testing produces significant genotype-environment (GE) interaction that reduces the accuracy for estimating yield and selecting appropriate germ plasm. The sum of squares (SS) of GE interaction was partitioned by linear regression techniques into joint, genotypic, and environmental regressions, and by Additive Main effects and the Multiplicative Interactions (AMMI) model into five significant Interaction Principal Component Axes (IPCA). The AMMI model was more effective in partitioning the interaction SS than the linear regression technique. The SS contained in the AMMI model was 6 times higher than the SS for all three regressions. Postdictive assessment recommended the use of the first five IPCA axes, while predictive assessment AMMI1 (main effects plus IPCA1). After elimination of random variation, AMMI1 estimates for genotypic yields within sites were more precise than unadjusted means. This increased precision was equivalent to increasing the number of replications by a factor of 3.7.
Lorenzo-Seva, Urbano; Ferrando, Pere J
2011-03-01
We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.
NASA Astrophysics Data System (ADS)
Gusriani, N.; Firdaniza
2018-03-01
The existence of outliers on multiple linear regression analysis causes the Gaussian assumption to be unfulfilled. If the Least Square method is forcedly used on these data, it will produce a model that cannot represent most data. For that, we need a robust regression method against outliers. This paper will compare the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contains outliers. Based on the robust determinant coefficient value, MCD method produces a better model compared to TELBS method.
Orthogonal Projection in Teaching Regression and Financial Mathematics
ERIC Educational Resources Information Center
Kachapova, Farida; Kachapov, Ilias
2010-01-01
Two improvements in teaching linear regression are suggested. The first is to include the population regression model at the beginning of the topic. The second is to use a geometric approach: to interpret the regression estimate as an orthogonal projection and the estimation error as the distance (which is minimized by the projection). Linear…
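The geometric reading mentioned above can be checked numerically. The sketch below uses hypothetical data and helper names; it fits a simple regression by least squares and confirms that the residual vector is orthogonal to both the constant vector and the predictor, which is what it means for the fitted values to be an orthogonal projection of the response onto the space spanned by those two vectors.

```python
def ols_fit(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = ols_fit(x, y)
fitted = [intercept + slope * a for a in x]
residual = [b - f for b, f in zip(y, fitted)]
ones = [1.0] * len(x)

# Orthogonality of the residual to the column space spanned by (ones, x).
print(dot(residual, ones), dot(residual, x))
# Pythagoras: |y|^2 = |fitted|^2 + |residual|^2.
print(dot(y, y), dot(fitted, fitted) + dot(residual, residual))
```

The squared length of the residual is the quantity least squares minimizes, so the estimation error is literally the distance from the response vector to that plane.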
Logistic models--an odd(s) kind of regression.
Jupiter, Daniel C
2013-01-01
The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Sunkara, Vasu; Hébert, James R.
2015-01-01
BACKGROUND Disparities in cancer screening, incidence, treatment, and survival are worsening globally. The mortality-to-incidence ratio (MIR) has been used previously to evaluate such disparities. METHODS The MIR for colorectal cancer is calculated for all Organisation for Economic Cooperation and Development (OECD) countries using the 2012 GLOBOCAN incidence and mortality statistics. Health system rankings were obtained from the World Health Organization. Two linear regression models were fit with the MIR as the dependent variable and health system ranking as the independent variable; one included all countries and one model had the “divergents” removed. RESULTS The regression model for all countries explained 24% of the total variance in the MIR. Nine countries were found to have regression-calculated MIRs that differed from the actual MIR by >20%. Countries with lower-than-expected MIRs were found to have strong national health systems characterized by formal colorectal cancer screening programs. Conversely, countries with higher-than-expected MIRs lack screening programs. When these divergent points were removed from the data set, the recalculated regression model explained 60% of the total variance in the MIR. CONCLUSIONS The MIR proved useful for identifying disparities in cancer screening and treatment internationally. It has potential as an indicator of the long-term success of cancer surveillance programs and may be extended to other cancer types for these purposes. PMID:25572676
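The analysis steps in this abstract (compute the MIR, regress it on health system ranking, flag countries whose predicted MIR is far from the actual one, refit without them) can be sketched as below. All numbers are hypothetical and do not come from GLOBOCAN or the World Health Organization; the helper names are invented for illustration, and the 20% rule is assumed to mean relative difference from the actual MIR.

```python
def mortality_to_incidence_ratio(mortality_rate, incidence_rate):
    """MIR: mortality rate divided by incidence rate."""
    return mortality_rate / incidence_rate


def fit_line(x, y):
    """Least-squares intercept and slope."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """Share of variance in y explained by a simple linear regression on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def divergent_indices(x, y, threshold=0.20):
    """Indices whose predicted value differs from the actual one by more than threshold."""
    intercept, slope = fit_line(x, y)
    return [i for i, (a, b) in enumerate(zip(x, y))
            if abs(intercept + slope * a - b) / b > threshold]


# Hypothetical health system ranks and MIRs, for illustration only.
ranks = [1, 2, 3, 4, 5, 6]
mirs = [0.30, 0.32, 0.34, 0.60, 0.38, 0.40]
assert abs(mortality_to_incidence_ratio(12.0, 40.0) - mirs[0]) < 1e-12

divergents = divergent_indices(ranks, mirs)
kept = [i for i in range(len(ranks)) if i not in divergents]
r2_all = r_squared(ranks, mirs)
r2_kept = r_squared([ranks[i] for i in kept], [mirs[i] for i in kept])
print(divergents, round(r2_all, 2), round(r2_kept, 2))
```

Removing points because they fit poorly will raise the explained variance almost by construction, so a rise like the one reported is better read as a description of the remaining countries than as independent evidence for the model.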
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-25
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by the Grubb's test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and the plain partial least squares regression. Both R² and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916 with bitterness values ranging from 0.63 to 4.78 were obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV, which was calculated using the models constructed by other methods, was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS model constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data.
Analysis of Learning Curve Fitting Techniques.
1987-09-01
1986. 15. Neter, John and others. Applied Linear Regression Models. Homewood IL: Irwin, 19-33. 16. SAS User’s Guide: Basics, Version 5 Edition. SAS... Linear Regression Techniques (15:23-52). Random errors are assumed to be normally distributed when using ordinary least-squares, according to Johnston...lot estimated by the improvement curve formula. For a more detailed explanation of the ordinary least-squares technique, see Neter, et al., Applied
On vertical profile of ozone at Syowa
NASA Technical Reports Server (NTRS)
Chubachi, Shigeru
1994-01-01
The difference in the vertical ozone profile at Syowa between 1966-1981 and 1982-1988 is shown. The month-height cross section of the slope of the linear regressions between ozone partial pressure and 100-mb temperature is also shown. The vertically integrated values of the slopes are in close agreement with the slopes calculated by linear regression of Dobson total ozone on 100-mb temperature in the period of 1982-1988.
Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija
2018-01-01
The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrPC) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.
Classification of sodium MRI data of cartilage using machine learning.
Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R
2015-11-01
To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.
Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M
2017-06-01
The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performances compared to the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version could not be found. The use of nonlinear methods, however, shall be judged regarding the additional computational cost required to tune their additional parameters, especially when the less computationally demanding linear methods herein studied are able to successfully monitor the variables under study.
Application of General Regression Neural Network to the Prediction of LOD Change
NASA Astrophysics Data System (ADS)
Zhang, Xiao-Hong; Wang, Qi-Jie; Zhu, Jian-Jun; Zhang, Hao
2012-01-01
Traditional methods for predicting the change in length of day (LOD change) are mainly based on linear models, such as the least squares model and the autoregression model. However, the LOD change comprises complicated non-linear factors, and the predictions of linear models are often unsatisfactory. Thus, a non-linear neural network, the general regression neural network (GRNN) model, is applied to predict the LOD change, and the result is compared with the predictions obtained from the BP (back propagation) neural network model and other models. The comparison shows that applying the GRNN to the prediction of the LOD change is highly effective and feasible.
Berngard, Samuel Clark; Berngard, Jennifer Bishop; Krebs, Nancy F; Garcés, Ana; Miller, Leland V; Westcott, Jamie; Wright, Linda L; Kindem, Mark; Hambidge, K Michael
2013-12-01
Stunting is prevalent by the age of 6 months in the indigenous population of the Western Highlands of Guatemala. The objective of this study was to determine the time course and predictors of linear growth failure and weight-for-age in early infancy. One hundred and forty-eight term newborns had measurements of length and weight in their homes, repeated at 3 and 6 months. Maternal measurements were also obtained. Mean ± SD length-for-age Z-score (LAZ) declined from newborn -1.0 ± 1.01 to -2.20 ± 1.05 and -2.26 ± 1.01 at 3 and 6 months respectively. Stunting rates for newborn, 3 and 6 months were 47%, 53% and 56% respectively. A multiple regression model (R² = 0.64) demonstrated that the major predictor of LAZ at 3 months was newborn LAZ, with the other predictors being newborn weight-for-age Z-score (WAZ), gender and the maternal education × maternal age interaction. Because WAZ remained essentially constant and LAZ declined during the same period, weight-for-length Z-score (WLZ) increased from -0.44 to +1.28 from birth to 3 months. The more severe the linear growth failure, the greater WAZ was in proportion to the LAZ. The primary conclusion is that impaired fetal linear growth is the major predictor of early infant linear growth failure, indicating that prevention needs to start with maternal interventions. © 2013.
von Ruesten, Anne; Steffen, Annika; Floegel, Anna; van der A, Daphne L.; Masala, Giovanna; Tjønneland, Anne; Halkjaer, Jytte; Palli, Domenico; Wareham, Nicholas J.; Loos, Ruth J. F.; Sørensen, Thorkild I. A.; Boeing, Heiner
2011-01-01
Objective To investigate trends in obesity prevalence in recent years and to predict the obesity prevalence in 2015 in European populations. Methods Data of 97 942 participants from seven cohorts involved in the European Prospective Investigation into Cancer and Nutrition (EPIC) study participating in the Diogenes project (named as “Diogenes cohort” in the following) with weight measurements at baseline and follow-up were used to predict future obesity prevalence with logistic linear and non-linear (leveling off) regression models. In addition, linear and leveling off models were fitted to the EPIC-Potsdam dataset with five weight measures during the observation period to find out which of these two models might provide the more realistic prediction. Results During a mean follow-up period of 6 years, the obesity prevalence in the Diogenes cohort increased from 13% to 17%. The linear prediction model predicted an overall obesity prevalence of about 30% in 2015, whereas the leveling off model predicted a prevalence of about 20%. In the EPIC-Potsdam cohort, the shape of the obesity trend favors a leveling off model among men (R² = 0.98), and a linear model among women (R² = 0.99). Conclusion Our data show an increase in obesity prevalence since the 1990s, and predictions for 2015 suggest a sizeable further increase in European populations. However, the estimates from the leveling off model were considerably lower. PMID:22102897
Estimating effects of limiting factors with regression quantiles
Cade, B.S.; Terrell, J.W.; Schroeder, R.L.
1999-01-01
In a recent Concepts paper in Ecology, Thomson et al. emphasized that assumptions of conventional correlation and regression analyses fundamentally conflict with the ecological concept of limiting factors, and they called for new statistical procedures to address this problem. The analytical issue is that unmeasured factors may be the active limiting constraint and may induce a pattern of unequal variation in the biological response variable through an interaction with the measured factors. Consequently, changes near the maxima, rather than at the center of response distributions, are better estimates of the effects expected when the observed factor is the active limiting constraint. Regression quantiles provide estimates for linear models fit to any part of a response distribution, including near the upper bounds, and require minimal assumptions about the form of the error distribution. Regression quantiles extend the concept of one-sample quantiles to the linear model by solving an optimization problem of minimizing an asymmetric function of absolute errors. Rank-score tests for regression quantiles provide tests of hypotheses and confidence intervals for parameters in linear models with heteroscedastic errors, conditions likely to occur in models of limiting ecological relations. We used selected regression quantiles (e.g., 5th, 10th, ..., 95th) and confidence intervals to test hypotheses that parameters equal zero for estimated changes in average annual acorn biomass due to forest canopy cover of oak (Quercus spp.) and oak species diversity. Regression quantiles also were used to estimate changes in glacier lily (Erythronium grandiflorum) seedling numbers as a function of lily flower numbers, rockiness, and pocket gopher (Thomomys talpoides fossor) activity, data that motivated the query by Thomson et al. for new statistical procedures. 
Both example applications showed that effects of limiting factors estimated by changes in some upper regression quantile (e.g., 90-95th) were greater than if effects were estimated by changes in the means from standard linear model procedures. Estimating a range of regression quantiles (e.g., 5-95th) provides a comprehensive description of biological response patterns for exploratory and inferential analyses in observational studies of limiting factors, especially when sampling large spatial and temporal scales.
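The optimization problem mentioned above, minimizing an asymmetric function of absolute errors, can be illustrated on a small wedge-shaped data set. The sketch below is a brute-force illustration and not the authors' software: it relies on the linear-programming property that, for a straight-line model, at least one minimizer passes through two data points, so it simply tries every pair. The data and function names are hypothetical.

```python
from itertools import combinations


def check_loss(residual, tau):
    """Asymmetric absolute error: weight tau above the line, 1 - tau below it."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(x, y, intercept, slope, tau):
    """Sum of check losses for a candidate line."""
    return sum(check_loss(b - intercept - slope * a, tau) for a, b in zip(x, y))


def fit_quantile_line(x, y, tau):
    """Exhaustive search over lines through pairs of points (small data only)."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a valid candidate
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = total_check_loss(x, y, intercept, slope, tau)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


# Hypothetical wedge: the upper bound rises as y = x, the lower points as y = 0.25 x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.0, 0.5, 3.0, 1.0, 5.0, 1.5, 7.0, 2.0]

upper_intercept, upper_slope = fit_quantile_line(x, y, tau=0.9)
lower_intercept, lower_slope = fit_quantile_line(x, y, tau=0.1)
print(f"90th quantile slope: {upper_slope}, 10th quantile slope: {lower_slope}")
```

In this toy example the slope near the upper bound is four times the slope near the lower bound, which mirrors the abstract's point that changes near the maxima can be much larger than changes estimated for the centre of the distribution.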
Pfeiffer, R M; Riedl, R
2015-08-15
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yield unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.
Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc
2015-08-01
Third molar development (TMD) has been widely utilized as one of the radiographic methods for dental age estimation. By using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated into the TMD regression model. This study aims to evaluate the performance of dental age estimation in individual method models and the combined model (TMD and TME) based on the classic regressions of multiple linear and principal component analysis. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and Olze were employed to stage the TMD and TME, respectively. The data were divided to develop three respective models based on the two regressions of multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between each model. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both genders, adjusted R² increased in the linear regressions of the combined model as compared to the individual models. An overall decrease in RMSE was detected in the combined model as compared to TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, the combined model exhibited a low adjusted R² and a high RMSE, except in males. Dental age is better predicted using the combined model in multiple linear regression models. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
40 CFR 1066.220 - Linearity verification for chassis dynamometer systems.
Code of Federal Regulations, 2014 CFR
2014-07-01
... dynamometer speed and torque at least as frequently as indicated in Table 1 of § 1066.215. The intent of... linear regression and the linearity criteria specified in Table 1 of this section. (b) Performance requirements. If a measurement system does not meet the applicable linearity criteria in Table 1 of this...
ERIC Educational Resources Information Center
Hovardas, Tasos
2016-01-01
Although ecological systems at varying scales involve non-linear interactions, learners insist on thinking in a linear fashion when they deal with ecological phenomena. The overall objective of the present contribution was to propose a hypothetical learning progression for developing non-linear reasoning in prey-predator systems and to provide…
ERIC Educational Resources Information Center
Ker, H. W.
2014-01-01
Multilevel data are very common in educational research. Hierarchical linear models/linear mixed-effects models (HLMs/LMEs) are often utilized to analyze multilevel data nowadays. This paper discusses the problems of utilizing ordinary regressions for modeling multilevel educational data, compares the data analytic results from three regression…
Artes, Paul H; Crabb, David P
2010-01-01
To investigate why the specificity of the Moorfields Regression Analysis (MRA) of the Heidelberg Retina Tomograph (HRT) varies with disc size, and to derive accurate normative limits for neuroretinal rim area to address this problem. Two datasets from healthy subjects (Manchester, UK, n = 88; Halifax, Nova Scotia, Canada, n = 75) were used to investigate the physiological relationship between the optic disc and neuroretinal rim area. Normative limits for rim area were derived by quantile regression (QR) and compared with those of the MRA (derived by linear regression). Logistic regression analyses were performed to quantify the association between disc size and positive classifications with the MRA, as well as with the QR-derived normative limits. In both datasets, the specificity of the MRA depended on optic disc size. The odds of observing a borderline or outside-normal-limits classification increased by approximately 10% for each 0.1 mm² increase in disc area (P < 0.1). The lower specificity of the MRA with large optic discs could be explained by the failure of linear regression to model the extremes of the rim area distribution (observations far from the mean). In comparison, the normative limits predicted by QR were larger for smaller discs (less specific, more sensitive), and smaller for larger discs, such that false-positive rates became independent of optic disc size. Normative limits derived by quantile regression appear to remove the size-dependence of specificity with the MRA. Because quantile regression does not rely on the restrictive assumptions of standard linear regression, it may be a more appropriate method for establishing normative limits in other clinical applications where the underlying distributions are nonnormal or have nonconstant variance.
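Quantile regression targets a chosen quantile directly by minimising the check (pinball) loss rather than squared error. The sketch below illustrates this for the simplest case, a constant-only model. The rim-area values and function names are hypothetical, and the brute-force search over data values merely stands in for a real quantile-regression solver.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r for r >= 0, (tau - 1) * r otherwise."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def best_constant(values, tau):
    """Return the data value that minimises total pinball loss (an empirical tau-quantile)."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))

# Hypothetical rim areas (mm^2) with one large outlier.
rim_areas = [1.0, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 3.5]
lower_limit = best_constant(rim_areas, tau=0.05)  # lower normative limit
median = best_constant(rim_areas, tau=0.5)
mean = sum(rim_areas) / len(rim_areas)            # pulled upward by the outlier
```

In a full quantile regression the constant is replaced by a linear function of disc area, so the limit can vary with disc size without assuming normal residuals or constant variance.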
Geller, Marilyn G.; Zylberberg, Haley M.; Green, Peter H. R.; Lebwohl, Benjamin
2018-01-01
Background: The prevalence of depression in celiac disease (CD) is high, and patients are often burdened socially and financially by a gluten-free diet. However, the relationship between depression, somatic symptoms and dietary adherence in CD is complex and poorly understood. We used a patient powered research network (iCureCeliac®) to explore the effect that depression has on patients’ symptomatic response to a gluten-free diet (GFD). Methods: We identified patients with biopsy-diagnosed celiac disease who answered questions pertaining to symptoms (Celiac Symptom Index (CSI)), GFD adherence (Celiac Dietary Adherence Test (CDAT)), and a 5-point, scaled question regarding depressive symptoms relating to patients’ celiac disease. We then measured the correlation between symptoms and adherence (CSI vs. CDAT) in patients with depression versus those without depression. We also tested for interaction of depression with regard to the association with symptoms using a multiple linear regression model. Results: Among 519 patients, 86% were female and the mean age was 40.9 years. 46% of patients indicated that they felt “somewhat,” “quite a bit,” or “very much” depressed because of their disorder. There was a moderate correlation between worsened celiac symptoms and poorer GFD adherence (r = 0.6, p < 0.0001). In those with a positive depression screen, there was a moderate correlation between worsening symptoms and worsening dietary adherence (r = 0.5, p < 0.0001) whereas in those without depression, the correlation was stronger (r = 0.64, p < 0.0001). We performed a linear regression analysis, which suggests that the relationship between CSI and CDAT is modified by depression. Conclusions: In patients with depressive symptoms related to their disorder, correlation between adherence and symptoms was weaker than those without depressive symptoms. 
This finding was confirmed with a linear regression analysis, showing that depressive symptoms may modify the effect of a GFD on celiac symptoms. Depressive symptoms may therefore mask the relationship between inadvertent gluten exposure and symptoms. Additional longitudinal and prospective studies are needed to further explore this potentially important finding. PMID:29701659
Bae, Kyongtae T; Tao, Cheng; Wang, Jinhong; Kaya, Diana; Wu, Zhiyuan; Bae, Junu T; Chapman, Arlene B; Torres, Vicente E; Grantham, Jared J; Mrug, Michal; Bennett, William M; Flessner, Michael F; Landsittel, Doug P
2013-01-01
Objective: To evaluate whether kidney and cyst volumes can be accurately estimated based on limited area measurements from MR images of patients with autosomal dominant polycystic kidney disease (ADPKD). Materials and Methods: MR coronal images of 178 ADPKD participants from the Consortium for Radiologic Imaging Studies of ADPKD (CRISP) were analyzed. For each MR image slice, we measured kidney and renal cyst areas using stereology and region-based thresholding methods, respectively. The kidney and cyst ‘observed’ volumes were calculated by summing up the area measurements of all the slices covering the kidney. To estimate the volume, we selected a coronal mid-slice in each kidney and multiplied its area by the total number of slices (‘PANK2’ for kidney and ‘PANC2’ for cyst). We then compared the kidney and cyst volumes predicted from PANK2 and PANC2, respectively, to the corresponding observed volumes, using a linear regression analysis. Results: The kidney volume predicted from PANK2 correlated extremely well with the observed kidney volume: R²=0.994 for right and 0.991 for left kidney. The linear regression coefficient multiplier to PANK2 that best fit the kidney volume was 0.637 (95%CI: 0.629–0.644) for right and 0.624 (95%CI: 0.616–0.633) for left kidney. The correlation between the cyst volume predicted from PANC2 and the observed cyst volume was also very high: R²=0.984 for right and 0.967 for left kidney. The least squares linear regression coefficient for PANC2 was 0.637 (95%CI: 0.624–0.649) for right and 0.608 (95%CI: 0.591–0.625) for left kidney. Conclusion: Kidney and cyst volumes can be closely approximated by multiplying the product of the mid-slice area measurement and the total number of slices in the coronal MR images of ADPKD kidneys by 0.61–0.64. This information will help save processing time needed to estimate total kidney and cyst volumes of ADPKD kidneys. PMID:24107679
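The shortcut described in this abstract amounts to a single multiplier applied to the product of mid-slice area and slice count. The sketch below estimates such a multiplier by least squares, assuming a regression through the origin (the abstract does not say whether an intercept was fitted). All numbers and function names are hypothetical.

```python
def slope_through_origin(x, y):
    """Least-squares multiplier b minimising sum((y - b * x)^2), with no intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def estimate_volume(mid_slice_area, n_slices, coefficient):
    """Approximate volume as coefficient * (mid-slice area * number of slices)."""
    return coefficient * mid_slice_area * n_slices

# Hypothetical products (mid-slice area * slice count) and observed volumes.
products = [100.0, 200.0, 300.0]
observed = [63.0, 126.0, 189.0]
multiplier = slope_through_origin(products, observed)
volume = estimate_volume(mid_slice_area=10.0, n_slices=20, coefficient=multiplier)
```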
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Yangho; Lee, Byung-Kook, E-mail: bklee@sch.ac.kr
Introduction: The objective of this study was to evaluate associations between blood lead, cadmium, and mercury levels with estimated glomerular filtration rate in a general population of South Korean adults. Methods: This was a cross-sectional study based on data obtained in the Korean National Health and Nutrition Examination Survey (KNHANES) (2008-2010). The final analytical sample consisted of 5924 participants. Estimated glomerular filtration rate (eGFR) was calculated using the MDRD Study equation as an indicator of glomerular function. Results: In multiple linear regression analysis of log2-transformed blood lead as a continuous variable on eGFR, after adjusting for covariates including cadmium and mercury, the difference in eGFR levels associated with a doubling of blood lead was -2.624 mL/min per 1.73 m² (95% CI: -3.803 to -1.445). In multiple linear regression analysis using quartiles of blood lead as the independent variable, the difference in eGFR levels comparing participants in the highest versus the lowest quartiles of blood lead was -3.835 mL/min per 1.73 m² (95% CI: -5.730 to -1.939). In a multiple linear regression analysis using blood cadmium and mercury, as continuous or categorical variables, as independent variables, neither metal was a significant predictor of eGFR. Odds ratios (ORs) and 95% CI values for reduced eGFR calculated for log2-transformed blood metals and quartiles of the three metals showed similar trends after adjustment for covariates. Discussion: In this large, representative sample of South Korean adults, elevated blood lead level was consistently associated with lower eGFR levels and with the prevalence of reduced eGFR even in blood lead levels below 10 μg/dL. In conclusion, elevated blood lead level was associated with lower eGFR in a Korean general population, supporting the role of lead as a risk factor for chronic kidney disease.
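With a log2-transformed predictor, the regression coefficient is the expected change in the outcome per doubling of the exposure, holding the other covariates fixed. A minimal sketch of that arithmetic follows, using the coefficient reported in the abstract; the function name is hypothetical.

```python
import math

def predicted_difference(beta_log2, exposure_ratio):
    """Expected outcome difference when exposure is multiplied by exposure_ratio."""
    return beta_log2 * math.log2(exposure_ratio)

beta_lead = -2.624  # mL/min per 1.73 m^2 per doubling of blood lead (from the abstract)
per_doubling = predicted_difference(beta_lead, 2.0)
per_quadrupling = predicted_difference(beta_lead, 4.0)
```

Under the fitted model, a fourfold difference in blood lead corresponds to twice the per-doubling difference, because log2(4) = 2.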
Huang, Wan-Yu; Chang, Chia-Chu; Chen, Dar-Ren; Kor, Chew-Teng; Chen, Ting-Yu; Wu, Hung-Ming
2017-01-01
Introduction Hot flashes have been postulated to be linked to the development of metabolic disorders. This study aimed to evaluate the relationship between hot flashes, adipocyte-derived hormones, and insulin resistance in healthy, non-obese postmenopausal women. Participants and design In this cross-sectional study, a total of 151 women aged 45–60 years were stratified into one of three groups according to hot-flash status over the past three months: never experienced hot flashes (Group N), mild-to-moderate hot flashes (Group M), and severe hot flashes (Group S). Variables measured in this study included clinical parameters, hot flash experience, fasting levels of circulating glucose, lipid profiles, plasma insulin, and adipocyte-derived hormones. Multiple linear regression analysis was used to evaluate the associations of hot flashes with adipocyte-derived hormones, and with insulin resistance. Settings The study was performed in a hospital medical center. Results The mean (standard deviation) of body-mass index was 22.8(2.7) for Group N, 22.6(2.6) for Group M, and 23.5(2.4) for Group S, respectively. Women in Group S displayed statistically significantly higher levels of leptin, fasting glucose, and insulin, and lower levels of adiponectin than those in Groups M and N. Multivariate linear regression analysis revealed that hot-flash severity was significantly associated with higher leptin levels, lower adiponectin levels, and higher leptin-to-adiponectin ratio. Univariate linear regression analysis revealed that hot-flash severity was strongly associated with a higher HOMA-IR index (% difference, 58.03%; 95% confidence interval, 31.00–90.64; p < 0.001). The association between hot flashes and HOMA-IR index was attenuated after adjusting for leptin or adiponectin and was no longer significant after simultaneously adjusting for leptin and adiponectin. 
Conclusion The present study provides evidence that hot flashes are associated with insulin resistance in postmenopausal women. It further suggests that hot flash association with insulin resistance is dependent on the combination of leptin and adiponectin variables. PMID:28448547
The Evaluation on the Cadmium Net Concentration for Soil Ecosystems.
Yao, Yu; Wang, Pei-Fang; Wang, Chao; Hou, Jun; Miao, Ling-Zhan
2017-03-12
Yixing, known as the "City of Ceramics", is facing a new dilemma: a raw material crisis. Cadmium (Cd) exists in extremely high concentrations in soil due to the considerable input of industrial wastewater into the soil ecosystem. The in situ technique of diffusive gradients in thin film (DGT), the ex situ static equilibrium approach (HAc, EDTA and CaCl2), and the dissolved concentration in soil solution, as well as microwave digestion, were applied to predict the Cd bioavailability of soil, aiming to provide a robust and accurate method for Cd bioavailability evaluation in Yixing. Moreover, the typical local cash crops (paddy and zizania aquatica) were selected for Cd accumulation, aiming to select the ideal plants with tolerance to the soil Cd contamination. The results indicated that the biomasses of the two applied plants were sufficiently sensitive to reflect the stark regional differences of different sampling sites. The zizania aquatica could effectively reduce the total Cd concentration, as indicated by the high accumulation coefficients. However, the fact that the zizania aquatica has extremely high transfer coefficients, and its stem, as the edible part, might accumulate large amounts of Cd, led to the conclusion that zizania aquatica was not an ideal cash crop in Yixing. Furthermore, the labile Cd concentrations which were obtained by the DGT technique and dissolved in the soil solution showed a significant correlation with the Cd concentrations of the biota accumulation. However, the ex situ methods and the microwave digestion-obtained Cd concentrations showed a poor correlation with the accumulated Cd concentration in plant tissue. Correspondingly, the multiple linear regression models were built for fundamental analysis of the performance of different methods available for Cd bioavailability evaluation. 
The correlation coefficients of DGT obtained by the improved multiple linear regression model have not significantly improved compared to the coefficients obtained by the simple linear regression model. The results revealed that DGT was a robust measurement, which could obtain the labile Cd concentrations independent of the physicochemical features' variation in the soil ecosystem. Consequently, these findings provide stronger evidence that DGT is an effective and ideal tool for labile Cd evaluation in Yixing.
The Evaluation on the Cadmium Net Concentration for Soil Ecosystems
Yao, Yu; Wang, Pei-Fang; Wang, Chao; Hou, Jun; Miao, Ling-Zhan
2017-01-01
Yixing, known as the “City of Ceramics”, is facing a new dilemma: a raw material crisis. Cadmium (Cd) exists in extremely high concentrations in soil due to the considerable input of industrial wastewater into the soil ecosystem. The in situ technique of diffusive gradients in thin film (DGT), the ex situ static equilibrium approach (HAc, EDTA and CaCl2), and the dissolved concentration in soil solution, as well as microwave digestion, were applied to predict the Cd bioavailability of soil, aiming to provide a robust and accurate method for Cd bioavailability evaluation in Yixing. Moreover, the typical local cash crops—paddy and zizania aquatica—were selected for Cd accumulation, aiming to select the ideal plants with tolerance to the soil Cd contamination. The results indicated that the biomasses of the two applied plants were sufficiently sensitive to reflect the stark regional differences of different sampling sites. The zizania aquatica could effectively reduce the total Cd concentration, as indicated by the high accumulation coefficients. However, the fact that the zizania aquatica has extremely high transfer coefficients, and its stem, as the edible part, might accumulate large amounts of Cd, led to the conclusion that zizania aquatica was not an ideal cash crop in Yixing. Furthermore, the labile Cd concentrations which were obtained by the DGT technique and dissolved in the soil solution showed a significant correlation with the Cd concentrations of the biota accumulation. However, the ex situ methods and the microwave digestion-obtained Cd concentrations showed a poor correlation with the accumulated Cd concentration in plant tissue. Correspondingly, the multiple linear regression models were built for fundamental analysis of the performance of different methods available for Cd bioavailability evaluation. 
The correlation coefficients of DGT obtained by the improved multiple linear regression model have not significantly improved compared to the coefficients obtained by the simple linear regression model. The results revealed that DGT was a robust measurement, which could obtain the labile Cd concentrations independent of the physicochemical features’ variation in the soil ecosystem. Consequently, these findings provide stronger evidence that DGT is an effective and ideal tool for labile Cd evaluation in Yixing. PMID:28287500
El Dib, Regina; Gomaa, Huda; Ortiz, Alberto; Politei, Juan; Kapoor, Anil; Barreto, Fellype
2017-01-01
Background Anderson-Fabry disease (AFD) is an X-linked recessive inborn error of glycosphingolipid metabolism caused by a deficiency of alpha-galactosidase A. Renal failure, heart and cerebrovascular involvement reduce survival. A Cochrane review provided little evidence on the use of enzyme replacement therapy (ERT). We now complement this review through a linear regression and a pooled analysis of proportions from cohort studies. Objectives To evaluate the efficacy and safety of ERT for AFD. Materials and methods For the systematic review, a literature search was performed, from inception to March 2016, using Medline, EMBASE and LILACS. Inclusion criteria were cohort studies, patients with AFD on ERT or natural history, and at least one patient-important outcome (all-cause mortality, renal, cardiovascular or cerebrovascular events, and adverse events) reported. The pooled proportion and the confidence interval (CI) are shown for each outcome. Simple linear regressions for composite endpoints were performed. Results 77 cohort studies involving 15,305 participants proved eligible. The pooled proportions were as follows: a) for renal complications, agalsidase alfa 15.3% [95% CI 0.048, 0.303; I2 = 77.2%, p = 0.0005]; agalsidase beta 6% [95% CI 0.04, 0.07; I2 = not applicable]; and untreated patients 21.4% [95% CI 0.1522, 0.2835; I2 = 89.6%, p<0.0001]. Effect differences favored agalsidase beta compared to untreated patients; b) for cardiovascular complications, agalsidase alfa 28% [95% CI 0.07, 0.55; I2 = 96.7%, p<0.0001]; agalsidase beta 7% [95% CI 0.05, 0.08; I2 = not applicable]; and untreated patients 26.2% [95% CI 0.149, 0.394; I2 = 98.8%, p<0.0001]. 
Effect differences favored agalsidase beta compared to untreated patients; and c) for cerebrovascular complications, agalsidase alfa 11.1% [95% CI 0.058, 0.179; I2 = 70.5%, p = 0.0024]; agalsidase beta 3.5% [95% CI 0.024, 0.046; I2 = 0%, p = 0.4209]; and untreated patients 18.3% [95% CI 0.129, 0.245; I2 = 95% p < 0.0001]. Effect differences favored agalsidase beta over agalsidase alfa or untreated patients. A linear regression showed that Fabry patients receiving agalsidase alfa are more likely to have higher rates of composite endpoints compared to those receiving agalsidase beta. Conclusions Agalsidase beta is associated to a significantly lower incidence of renal, cardiovascular and cerebrovascular events than no ERT, and to a significantly lower incidence of cerebrovascular events than agalsidase alfa. In view of these results, the use of agalsidase beta for preventing major organ complications related to AFD can be recommended. PMID:28296917
NASA Technical Reports Server (NTRS)
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
NASA Astrophysics Data System (ADS)
Chu, Hone-Jay; Kong, Shish-Jeng; Chang, Chih-Hua
2018-03-01
The turbidity (TB) of a water body varies with time and space. Water quality is traditionally estimated via linear regression based on satellite images. However, estimating and mapping water quality require a spatio-temporal nonstationary model, while TB mapping necessitates the use of geographically and temporally weighted regression (GTWR) and geographically weighted regression (GWR) models, both of which are more precise than linear regression. Given the temporal nonstationary models for mapping water quality, GTWR offers the best option for estimating regional water quality. Compared with GWR, GTWR provides highly reliable information for water quality mapping, boasts a relatively high goodness of fit, improves the explanation of variance from 44% to 87%, and shows a sufficient space-time explanatory power. The seasonal patterns of TB and the main spatial patterns of TB variability can be identified using the estimated TB maps from GTWR and by conducting an empirical orthogonal function (EOF) analysis.
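The core idea behind geographically weighted regression is to refit the regression at each target location with observations down-weighted by distance. The sketch below assumes a Gaussian kernel, a one-dimensional location, and a single predictor. All three are simplifications for illustration, and the data are hypothetical. GTWR extends the same weighting to distance in time as well as space.

```python
import math

def local_weighted_fit(target, locations, x, y, bandwidth):
    """Weighted simple regression centred on `target`; returns (slope, intercept)."""
    # Gaussian kernel: observations far from the target get weights near zero.
    w = [math.exp(-0.5 * ((s - target) / bandwidth) ** 2) for s in locations]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data: two distant regions with different reflectance-turbidity slopes.
locations = [0.0, 0.5, 1.0, 100.0, 100.5, 101.0]
reflectance = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
turbidity = [2.0, 4.0, 6.0, 5.0, 10.0, 15.0]

slope_a, _ = local_weighted_fit(0.5, locations, reflectance, turbidity, bandwidth=1.0)
slope_b, _ = local_weighted_fit(100.5, locations, reflectance, turbidity, bandwidth=1.0)
# A very wide bandwidth approaches one global ordinary least-squares fit.
slope_global, _ = local_weighted_fit(50.0, locations, reflectance, turbidity, bandwidth=1e6)
```

The local fits recover each region's own slope, while the near-global fit averages them. This is the nonstationarity that a single linear regression cannot represent.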
Mental chronometry with simple linear regression.
Chen, J Y
1997-10-01
Typically, mental chronometry is performed by means of introducing an independent variable postulated to affect selectively some stage of a presumed multistage process. However, the effect could be a global one that spreads proportionally over all stages of the process. Currently, there is no method to test this possibility although simple linear regression might serve the purpose. In the present study, the regression approach was tested with tasks (memory scanning and mental rotation) that involved a selective effect and with a task (word superiority effect) that involved a global effect, by the dominant theories. The results indicate (1) the manipulation of the size of a memory set or of angular disparity affects the intercept of the regression function that relates the times for memory scanning with different set sizes or for mental rotation with different angular disparities and (2) the manipulation of context affects the slope of the regression function that relates the times for detecting a target character under word and nonword conditions. These ratify the regression approach as a useful method for doing mental chronometry.
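One reading of the regression approach is to regress response times in one condition on response times in a baseline condition: an additive (stage-selective) effect then shifts the intercept, while a proportional (global) effect changes the slope. The sketch below illustrates that reading with hypothetical response times. It is an interpretation of the abstract, not the author's exact procedure.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical baseline response times (ms) across four stimulus levels.
baseline = [400.0, 500.0, 600.0, 700.0]
additive = [t + 50.0 for t in baseline]     # constant delay added to one stage
proportional = [1.2 * t for t in baseline]  # every stage slowed by 20%

slope_add, intercept_add = simple_linear_regression(baseline, additive)
slope_prop, intercept_prop = simple_linear_regression(baseline, proportional)
```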
Guan, Yongtao; Li, Yehua; Sinha, Rajita
2011-01-01
In a cocaine dependence treatment study, we use linear and nonlinear regression models to model posttreatment cocaine craving scores and first cocaine relapse time. A subset of the covariates are summary statistics derived from baseline daily cocaine use trajectories, such as baseline cocaine use frequency and average daily use amount. These summary statistics are subject to estimation error and can therefore cause biased estimators for the regression coefficients. Unlike classical measurement error problems, the error we encounter here is heteroscedastic with an unknown distribution, and there are no replicates for the error-prone variables or instrumental variables. We propose two robust methods to correct for the bias: a computationally efficient method-of-moments-based method for linear regression models and a subsampling extrapolation method that is generally applicable to both linear and nonlinear regression models. Simulations and an application to the cocaine dependence treatment data are used to illustrate the efficacy of the proposed methods. Asymptotic theory and variance estimation for the proposed subsampling extrapolation method and some additional simulation results are described in the online supplementary material. PMID:21984854
Kim, Dae-Hee; Choi, Jae-Hun; Lim, Myung-Eun; Park, Soo-Jun
2008-01-01
This paper suggests a method of correcting the distance between an ambient intelligence display and a user based on linear regression and smoothing, by which distance information for a user who approaches the display can be accurately output even in an unanticipated condition using a passive infrared (PIR) sensor and an ultrasonic device. The developed system consists of an ambient intelligence display, an ultrasonic transmitter, and a sensor gateway. The modules communicate with each other through RF (radio frequency) communication. The ambient intelligence display includes an ultrasonic receiver and a PIR sensor for motion detection. In particular, this system dynamically selects and applies algorithms such as smoothing or linear regression to the current input data through a judgment process that is determined using the previous reliable data stored in a queue. In addition, we implemented GUI software in Java for real-time location tracking and an ambient intelligence display.
Lee, Eunjee; Zhu, Hongtu; Kong, Dehan; Wang, Yalin; Giovanello, Kelly Sullivan; Ibrahim, Joseph G
2015-01-01
The aim of this paper is to develop a Bayesian functional linear Cox regression model (BFLCRM) with both functional and scalar covariates. This new development is motivated by establishing the likelihood of conversion to Alzheimer’s disease (AD) in 346 patients with mild cognitive impairment (MCI) enrolled in the Alzheimer’s Disease Neuroimaging Initiative 1 (ADNI-1) and the early markers of conversion. These 346 MCI patients were followed over 48 months, with 161 MCI participants progressing to AD at 48 months. The functional linear Cox regression model was used to establish that functional covariates including hippocampus surface morphology and scalar covariates including brain MRI volumes, cognitive performance (ADAS-Cog), and APOE status can accurately predict time to onset of AD. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of BFLCRM. PMID:26900412
Liquid electrolyte informatics using an exhaustive search with linear regression.
Sodeyama, Keitaro; Igarashi, Yasuhiko; Nakayama, Tomofumi; Tateyama, Yoshitaka; Okada, Masato
2018-06-14
Exploring new liquid electrolyte materials is a fundamental target for developing new high-performance lithium-ion batteries. In contrast to solid materials, disordered liquid solution properties have been less studied by data-driven information techniques. Here, we examined the estimation accuracy and efficiency of three information techniques, multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), and exhaustive search with linear regression (ES-LiR), by using coordination energy and melting point as test liquid properties. We then confirmed that ES-LiR gives the most accurate estimation among the techniques. We also found that ES-LiR can provide the relationship between the "prediction accuracy" and "calculation cost" of the properties via a weight diagram of descriptors. This technique makes it possible to choose the balance of the "accuracy" and "cost" when a search over a huge number of new materials is carried out.
Huang, Jian; Zhang, Cun-Hui
2013-01-01
The ℓ1-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1-penalized estimators for convex loss functions of a general form, including the generalized linear models. We study the estimation, prediction, selection and sparsity properties of the weighted ℓ1-penalized estimator in sparse, high-dimensional settings where the number of predictors p can be much larger than the sample size n. Adaptive Lasso is considered as a special case. A multistage method is developed to approximate concave regularized estimation by applying an adaptive Lasso recursively. We provide prediction and estimation oracle inequalities for single- and multi-stage estimators, a general selection consistency theorem, and an upper bound for the dimension of the Lasso estimator. Important models including the linear regression, logistic regression and log-linear models are used throughout to illustrate the applications of the general results. PMID:24348100
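For the linear-regression case, a weighted ℓ1-penalized estimator can be computed by coordinate descent with soft-thresholding. The sketch below minimises (1/2n)·||y − Xb||² + λ·Σ w_j·|b_j|. The design matrix, response, and function names are hypothetical, and coordinate descent is one common solver rather than the method analysed in the article.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|."""
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # Residual with feature j's own contribution left out.
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
            rho /= n
            col_scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam * weights[j]) / col_scale
    return beta

# Hypothetical orthogonal design; y = 2 * x1 + 0.1 * x2.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.1, 1.9, -1.9, -2.1]
beta_lasso = weighted_lasso(X, y, lam=0.5, weights=[1.0, 1.0])
# Adaptive-Lasso-style weights: light penalty on the strong signal, heavy on the weak one.
beta_adaptive = weighted_lasso(X, y, lam=0.5, weights=[0.5, 10.0])
```

Equal weights give the ordinary Lasso. Unequal weights reduce the shrinkage bias on large coefficients while still zeroing small ones, which is the motivation for the adaptive and multistage schemes mentioned in the abstract.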
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2014-06-01
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression.
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2014-01-01
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression. PMID:25598560
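A local linear approximation step replaces the folded concave penalty with a weighted ℓ1 penalty whose weights are the penalty's derivative evaluated at the current estimate. The sketch below uses SCAD, a commonly used folded concave penalty, as the example. The abstract does not name a specific penalty, so SCAD, the default a = 3.7, and the initial estimate are assumptions made for illustration.

```python
def scad_derivative(t, lam, a=3.7):
    """Derivative of the SCAD penalty at |t| (requires a > 2)."""
    t = abs(t)
    if t <= lam:
        return lam                          # behaves like the Lasso near zero
    if t < a * lam:
        return (a * lam - t) / (a - 1.0)    # penalty rate tapers off linearly
    return 0.0                              # large coefficients are not penalised

def lla_weights(initial_beta, lam, a=3.7):
    """Per-coefficient l1 weights for one local linear approximation step."""
    return [scad_derivative(b, lam, a) for b in initial_beta]

# Hypothetical initial estimate (for example from a Lasso fit).
initial_beta = [3.0, 0.05, 0.6]
weights = lla_weights(initial_beta, lam=0.5)
```

The resulting weights feed a weighted Lasso fit. Coefficients that start out large receive zero penalty, so they are not shrunk in the next step.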
NASA Astrophysics Data System (ADS)
Haris, A.; Nafian, M.; Riyanto, A.
2017-07-01
The Danish North Sea fields consist of several formations (Ekofisk, Tor, and Cromer Knoll) that range in age from the Paleocene to the Miocene. In this study, seismic and well log data sets are integrated to determine the chalk sand distribution in the Danish North Sea field. The integration is performed using seismic inversion analysis and seismic multi-attribute analysis. The seismic inversion algorithm used to derive acoustic impedance (AI) is a model-based technique. The derived AI is then used as an external attribute for the input of the multi-attribute analysis. Moreover, the multi-attribute analysis is used to generate linear and non-linear transformations among well log properties. For the linear model, the selected transformation is conducted by weighting step-wise linear regression (SWR), while the non-linear model is performed using probabilistic neural networks (PNN). The porosity estimated by PNN fits the well log data better than the results of SWR. This result can be understood since PNN performs non-linear regression, so that the relationship between the attribute data and the predicted log data can be optimized. The distribution of chalk sand has been successfully identified and characterized by porosity values ranging from 23% up to 30%.
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.
Kong, Shengchun; Nan, Bin
2014-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses.
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso
Kong, Shengchun; Nan, Bin
2013-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses. PMID:24516328
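To make the objective in this abstract concrete, the following sketch evaluates a lasso-penalized negative log partial likelihood. It is an illustration only: it assumes no tied event times, a 1/n scaling of the likelihood term, and hypothetical function names; the abstract itself does not specify these details.

```python
import math


def neg_log_partial_likelihood(beta, X, times, events):
    """Negative log partial likelihood for the Cox model, assuming no tied event times."""
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects enter only through the risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        total -= eta[i] - math.log(risk_sum)
    return total


def lasso_cox_objective(beta, X, times, events, lam):
    """Scaled negative log partial likelihood plus an L1 penalty (scaling is an assumption)."""
    n = len(times)
    penalty = lam * sum(abs(b) for b in beta)
    return neg_log_partial_likelihood(beta, X, times, events) / n + penalty
```

Each summand depends on the whole risk set through the inner sum, which is why the terms are not independent across subjects.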
Functional Relationships and Regression Analysis.
ERIC Educational Resources Information Center
Preece, Peter F. W.
1978-01-01
Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…
Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression
ERIC Educational Resources Information Center
Beckstead, Jason W.
2012-01-01
The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…
Suppression Situations in Multiple Linear Regression
ERIC Educational Resources Information Center
Shieh, Gwowen
2006-01-01
This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…
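A small numeric illustration of suppression in the two-predictor case may help. Note the assumption: the sketch uses the textbook standardized-coefficient formulas, not the article's alternative expressions, which the truncated abstract does not give. Function names are hypothetical.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients for a two-predictor regression from correlations."""
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


def r_squared(r_y1, r_y2, r_12):
    """Squared multiple correlation for the two-predictor model."""
    beta_1, beta_2 = standardized_betas(r_y1, r_y2, r_12)
    return beta_1 * r_y1 + beta_2 * r_y2


# Predictor 2 is uncorrelated with the criterion but correlated with predictor 1.
b1, b2 = standardized_betas(r_y1=0.5, r_y2=0.0, r_12=0.5)
```

Here the coefficient of predictor 1 exceeds its zero-order correlation with the criterion, and adding predictor 2 raises the squared multiple correlation above 0.25 even though predictor 2 alone explains nothing, which is the classical suppression pattern.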
Marrero-Ponce, Yovani; Medina-Marrero, Ricardo; Castillo-Garit, Juan A; Romero-Zaldivar, Vicente; Torrens, Francisco; Castro, Eduardo A
2005-04-15
A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein's total (whole protein) and local (one or more amino acid) linear indices are a new set of bio-macromolecular descriptors of relevance to protein QSAR/QSPR studies. These amino-acid level biochemical descriptors are based on the calculation of linear maps on R^n [f_k(x_mi): R^n --> R^n] in canonical basis. These bio-macromolecular indices are calculated from the kth power of the macromolecular pseudograph alpha-carbon atom adjacency matrix. Total linear indices are linear functionals on R^n. That is, the kth total linear indices are linear maps from R^n to the scalar R [f_k(x_m): R^n --> R]. Thus, the kth total linear indices are calculated by summing the amino-acid linear indices of all amino acids in the protein molecule. A study of the protein stability effects for a complete set of alanine substitutions in the Arc repressor illustrates this approach. A quantitative model that discriminates near wild-type stability alanine mutants from the reduced-stability ones in a training series was obtained. This model permitted the correct classification of 97.56% (40/41) and 91.67% (11/12) of proteins in the training and test set, respectively. It shows a high Matthews correlation coefficient (MCC=0.952) for the training set and an MCC=0.837 for the external prediction set. Additionally, canonical regression analysis corroborated the statistical quality of the classification model (Rcanc=0.824). This analysis was also used to compute biological stability canonical scores for each Arc alanine mutant. On the other hand, the linear piecewise regression model compared favorably with respect to the linear regression one on predicting the melting temperature (tm) of the Arc alanine mutants. The linear model explains almost 81% of the variance of the experimental tm (R=0.90 and s=4.29) and the LOO press statistics evidenced its predictive ability (q^2=0.72 and scv=4.79). 
Moreover, the TOMOCOMD-CAMPS method produced a linear piecewise regression (R=0.97) between protein backbone descriptors and tm values for alanine mutants of the Arc repressor. A break-point value of 51.87 degrees C characterized two mutant clusters and coincided perfectly with the experimental scale. For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permitted the interpretation of the driving forces of such folding process, indicating that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its alanine mutants.
Predicting U.S. Army Reserve Unit Manning Using Market Demographics
2015-06-01
develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit...manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.
Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M
In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines in Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models such as logistic, linear, Cox and Poisson regression are increasingly used in published observational studies, both at the international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Published by Elsevier España, S.L.U. All rights reserved.
Wu, Lingtao; Lord, Dominique
2017-05-01
This study further examined the use of regression models for developing crash modification factors (CMFs), specifically focusing on the misspecification in the link function. The primary objectives were to validate the accuracy of CMFs derived from the commonly used regression models (i.e., generalized linear models or GLMs with additive linear link functions) when some of the variables have nonlinear relationships and quantify the amount of bias as a function of the nonlinearity. Using the concept of artificial realistic data, various linear and nonlinear crash modification functions (CM-Functions) were assumed for three variables. Crash counts were randomly generated based on these CM-Functions. CMFs were then derived from regression models for three different scenarios. The results were compared with the assumed true values. The main findings are summarized as follows: (1) when some variables have nonlinear relationships with crash risk, the CMFs for these variables derived from the commonly used GLMs are all biased, especially around areas away from the baseline conditions (e.g., boundary areas); (2) with the increase in nonlinearity (i.e., nonlinear relationship becomes stronger), the bias becomes more significant; (3) the quality of CMFs for other variables having linear relationships can be influenced when mixed with those having nonlinear relationships, but the accuracy may still be acceptable; and (4) the misuse of the link function for one or more variables can also lead to biased estimates for other parameters. This study raised the importance of the link function when using regression models for developing CMFs. Copyright © 2017 Elsevier Ltd. All rights reserved.
Linear regression models for solvent accessibility prediction in proteins.
Wagner, Michael; Adamczak, Rafał; Porollo, Aleksey; Meller, Jarosław
2005-04-01
The relative solvent accessibility (RSA) of an amino acid residue in a protein structure is a real number that represents the solvent exposed surface area of this residue in relative terms. The problem of predicting the RSA from the primary amino acid sequence can therefore be cast as a regression problem. Nevertheless, RSA prediction has so far typically been cast as a classification problem. Consequently, various machine learning techniques have been used within the classification framework to predict whether a given amino acid exceeds some (arbitrary) RSA threshold and would thus be predicted to be "exposed," as opposed to "buried." We have recently developed novel methods for RSA prediction using nonlinear regression techniques which provide accurate estimates of the real-valued RSA and outperform classification-based approaches with respect to commonly used two-class projections. However, while their performance seems to provide a significant improvement over previously published approaches, these Neural Network (NN) based methods are computationally expensive to train and involve several thousand parameters. In this work, we develop alternative regression models for RSA prediction which are computationally much less expensive, involve orders-of-magnitude fewer parameters, and are still competitive in terms of prediction quality. In particular, we investigate several regression models for RSA prediction using linear L1-support vector regression (SVR) approaches as well as standard linear least squares (LS) regression. Using rigorously derived validation sets of protein structures and extensive cross-validation analysis, we compare the performance of the SVR with that of LS regression and NN-based methods. In particular, we show that the flexibility of the SVR (as encoded by metaparameters such as the error insensitivity and the error penalization terms) can be very beneficial to optimize the prediction accuracy for buried residues. 
We conclude that the simple and computationally much more efficient linear SVR performs comparably to nonlinear models and thus can be used in order to facilitate further attempts to design more accurate RSA prediction methods, with applications to fold recognition and de novo protein structure prediction methods.
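The contrast between least squares and support vector regression drawn in this abstract comes down to the loss applied to each residual. The sketch below shows only that contrast; the function names and the example epsilon value are assumptions, and it does not reproduce the authors' L1-SVR formulation or their metaparameter choices.

```python
def squared_loss(residual):
    """Loss used by least squares regression: every deviation is penalized."""
    return residual ** 2


def epsilon_insensitive_loss(residual, epsilon):
    """Loss used by SVR: deviations inside the epsilon tube cost nothing."""
    return max(abs(residual) - epsilon, 0.0)


def total_loss(y_true, y_pred, loss, **kwargs):
    """Sum a per-residual loss over paired true and predicted values."""
    return sum(loss(t - p, **kwargs) for t, p in zip(y_true, y_pred))


# Example with hypothetical RSA values in [0, 1].
observed = [0.10, 0.45, 0.80]
predicted = [0.15, 0.40, 0.50]
ls_total = total_loss(observed, predicted, squared_loss)
svr_total = total_loss(observed, predicted, epsilon_insensitive_loss, epsilon=0.1)
```

The error-insensitivity parameter mentioned in the abstract corresponds to `epsilon` here: small errors are ignored entirely, while larger errors grow linearly rather than quadratically.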
Regression Commonality Analysis: A Technique for Quantitative Theory Building
ERIC Educational Resources Information Center
Nimon, Kim; Reio, Thomas G., Jr.
2011-01-01
When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…
ERIC Educational Resources Information Center
Jurs, Stephen; And Others
The scree test and its linear regression technique are reviewed, and results of its use in factor analysis and Delphi data sets are described. The scree test was originally a visual approach for making judgments about eigenvalues, which considered the relationships of the eigenvalues to one another as well as their actual values. The graph that is…
Periodontal inflamed surface area as a novel numerical variable describing periodontal conditions
2017-01-01
Purpose A novel index, the periodontal inflamed surface area (PISA), represents the sum of the periodontal pocket depth of bleeding on probing (BOP)-positive sites. In the present study, we evaluated correlations between PISA and periodontal classifications, and examined PISA as an index integrating the discrete conventional periodontal indexes. Methods This study was a cross-sectional subgroup analysis of data from a prospective cohort study investigating the association between chronic periodontitis and the clinical features of ankylosing spondylitis. Data from 84 patients without systemic diseases (the control group in the previous study) were analyzed in the present study. Results PISA values were positively correlated with conventional periodontal classifications (Spearman correlation coefficient=0.52; P<0.01) and with periodontal indexes, such as BOP and the plaque index (PI) (r=0.94; P<0.01 and r=0.60; P<0.01, respectively; Pearson correlation test). Porphyromonas gingivalis (P. gingivalis) expression and the presence of serum P. gingivalis antibodies were significant factors affecting PISA values in a simple linear regression analysis, together with periodontal classification, PI, bleeding index, and smoking, but not in the multivariate analysis. In the multivariate linear regression analysis, PISA values were positively correlated with the quantity of current smoking, PI, and severity of periodontal disease. Conclusions PISA integrates multiple periodontal indexes, such as probing pocket depth, BOP, and PI into a numerical variable. PISA is advantageous for quantifying periodontal inflammation and plaque accumulation. PMID:29093989
An Investigation of Age-Related Iron Deposition Using Susceptibility Weighted Imaging
Wang, Dan; Li, Wen-Bin; Wei, Xiao-Er; Li, Yue-Hua; Dai, Yong-Ming
2012-01-01
Aim To quantify age-dependent iron deposition changes in healthy subjects using Susceptibility Weighted Imaging (SWI). Materials and Methods In total, 143 healthy volunteers were enrolled. All underwent conventional MR and SWI sequences. Subjects were divided into eight groups according to age. Using phase images to quantify iron deposition in the head of the caudate nucleus and the lenticular nucleus, the angle radian value was calculated and compared between groups. ANOVA, Pearson correlation coefficients, linear regression analysis and polynomial fitting were used to analyze the relationship between age and iron deposition in the head of the caudate nucleus and the lenticular nucleus. Results Iron deposition in the lenticular nucleus increased in individuals aged up to 40 years, but did not change in those aged over 40 years once a peak had been reached. In the head of the caudate nucleus, iron deposition peaked at 60 years (p<0.05). The correlation coefficients for iron deposition in the L-head of the caudate nucleus, R-head of the caudate nucleus, L-lenticular nucleus and R-lenticular nucleus with age were 0.67691, 0.48585, 0.5228 and 0.5228 (p<0.001, respectively). Linear regression analyses showed a significant correlation between iron deposition levels and age groups. Conclusions Iron deposition in the lenticular nucleus was found to increase with age, reaching a plateau at 40 years. Iron deposition in the head of the caudate nucleus also increased with age, reaching a plateau at 60 years. PMID:23226360
Impact of sociodemographic variables on executive functions
Campanholo, Kenia Repiso; Boa, Izadora Nogueira Fonte; Hodroj, Flávia Cristina da Silva Araujo; Guerra, Glaucia Rosana Benute; Miotto, Eliane Correa; de Lucia, Mara Cristina Souza
2017-01-01
Executive functions (EFs) regulate human behavior and allow individuals to interact and act in the world. EFs are sensitive to sociodemographic variables such as age, which promotes their decline, and to others that can exert a neuroprotective effect. Objective To assess the predictive role of education, occupation and family income on decline in executive functions among a sample with a wide age range. Methods A total of 925 participants aged 18-89 years with 1-28 years' education were submitted to assessment of executive functions using the Card Sorting Test (CST), Phonemic Verbal Fluency (FAS) Task and Semantic Verbal Fluency (SVF) Task. Data on income, occupation and educational level were collected for the sample. The data were analyzed using Linear Regression, as well as Pearson's and Spearman's Correlation. Results Age showed a significant negative correlation (p<0.001) with performance on the CST, FAS and SVF, whereas education, income and occupation were positively associated (p<0.001) with the tasks applied. After application of the multivariate linear regression model, a significant positive relationship with the FAS was maintained only for education (p<0.001) and income (p<0.001). The negative relationship of age (p<0.001) and positive relationship of both education (p<0.001) and income (p<0.001 and p=0.003) were evident on the CST and SVF. Conclusion Educational level and income positively influenced participants' results on executive function tests, attenuating expected decline for age. However, no relationship was found between occupation and the cognitive variables investigated. PMID:29213495
Huang, Hairong; Xu, Zanzan; Shao, Xianhong; Wismeijer, Daniel; Sun, Ping; Wang, Jingxiao
2017-01-01
Objectives This study identified potential general influencing factors for a mathematical prediction of implant stability quotient (ISQ) values in clinical practice. Methods We collected the ISQ values of 557 implants from 2 different brands (SICace and Osstem) placed by 2 surgeons in 336 patients. Surgeon 1 placed 329 SICace implants, and surgeon 2 placed 113 SICace implants and 115 Osstem implants. ISQ measurements were taken at T1 (immediately after implant placement) and T2 (before dental restoration). A multivariate linear regression model was used to analyze the influence of the following 11 candidate factors for stability prediction: sex, age, maxillary/mandibular location, bone type, immediate/delayed implantation, bone grafting, insertion torque, I-stage or II-stage healing pattern, implant diameter, implant length and T1-T2 time interval. Results The need for bone grafting as a predictor significantly influenced ISQ values in all three groups at T1 (weight coefficients ranging from -4 to -5). In contrast, implant diameter consistently influenced the ISQ values in all three groups at T2 (weight coefficients ranging from 3.4 to 4.2). Other factors, such as sex, age, I/II-stage implantation and bone type, did not significantly influence ISQ values at T2, and implant length did not significantly influence ISQ values at T1 or T2. Conclusions These findings provide a rational basis for mathematical models to quantitatively predict the ISQ values of implants in clinical practice. PMID:29084260
Sonographic Measurement of Fetal Ear Length in Turkish Women with a Normal Pregnancy
Özdemir, Mucize Eriç; Uzun, Işıl; Karahasanoğlu, Ayşe; Aygün, Mehmet; Akın, Hale; Yazıcıoğlu, Fehmi
2014-01-01
Background: Abnormal fetal ear length is a feature of chromosomal disorders. Fetal ear length measurement is a simple measurement that can be obtained during ultrasonographic examinations. Aims: To develop a nomogram for fetal ear length measurements in our population and investigate the correlation between fetal ear length, gestational age, and other standard fetal biometric measurements. Study Design: Cohort study. Methods: Ear lengths of the fetuses were measured in normal singleton pregnancies. The relationship between gestational age and fetal ear length in millimetres was analysed by simple linear regression. In addition, the correlation of fetal ear length measurements with biparietal diameter, head circumference, abdominal circumference, and femur length was evaluated. Ear length measurements were obtained from fetuses in 389 normal singleton pregnancies ranging between 16 and 28 weeks of gestation. Results: A nomogram was developed by linear regression analysis of the parameters ear length and gestational age. Fetal ear length (mm) = y = (1.348 × gestational age) − 12.265, where gestational age is in weeks. A high correlation was found between fetal ear length and gestational age, and a significant correlation was also found between fetal ear length and the biparietal diameter (r=0.962; p<0.001). Similar correlations were found between fetal ear length and head circumference, and fetal ear length and femur length. Conclusion: The results of this study provide a nomogram for fetal ear length. The study also demonstrates the relationship between ear length and other biometric measurements. PMID:25667783
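The regression equation reported in this abstract can be evaluated directly. The sketch below uses the published slope and intercept; the function name and the decision to reject ages outside the studied 16 to 28 week range are assumptions added for illustration.

```python
SLOPE_MM_PER_WEEK = 1.348   # reported regression slope
INTERCEPT_MM = -12.265      # reported regression intercept


def predicted_ear_length_mm(gestational_age_weeks):
    """Expected fetal ear length (mm) from the reported linear nomogram."""
    if not 16 <= gestational_age_weeks <= 28:
        # The equation was fitted on 16-28 weeks; extrapolation is not supported here.
        raise ValueError("gestational age outside the studied 16-28 week range")
    return SLOPE_MM_PER_WEEK * gestational_age_weeks + INTERCEPT_MM


# Tabulate the nomogram at two-week steps across the studied range.
nomogram = {week: round(predicted_ear_length_mm(week), 2) for week in range(16, 29, 2)}
```

For example, at 20 weeks the equation gives 1.348 × 20 − 12.265 = 14.695 mm.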
SU-F-R-20: Image Texture Features Correlate with Time to Local Failure in Lung SBRT Patients
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrews, M; Abazeed, M; Woody, N
Purpose: To explore possible correlation between CT image-based texture and histogram features and time-to-local-failure in early stage non-small cell lung cancer (NSCLC) patients treated with stereotactic body radiotherapy (SBRT). Methods and Materials: From an IRB-approved lung SBRT registry for patients treated between 2009–2013 we selected 48 (20 male, 28 female) patients with local failure. Median patient age was 72.3±10.3 years. Mean time to local failure was 15 ± 7.1 months. Physician-contoured gross tumor volumes (GTV) on the planning CT images were processed and 3D gray-level co-occurrence matrix (GLCM) based texture and histogram features were calculated in Matlab. Data were exported to R and a multiple linear regression model was used to examine the relationship between texture features and time-to-local-failure. Results: Multiple linear regression revealed that entropy (p=0.0233, multiple R2=0.60) from GLCM-based texture analysis and the standard deviation (p=0.0194, multiple R2=0.60) from the histogram-based features were statistically significantly correlated with the time-to-local-failure. Conclusion: Image-based texture analysis can be used to predict certain aspects of treatment outcomes of NSCLC patients treated with SBRT. We found entropy and standard deviation calculated for the GTV on the CT images displayed a statistically significant correlation with time-to-local-failure in lung SBRT patients.
Health Care Utilization and Expenditures in Persons Receiving Social Assistance in 2012
Reich, Oliver; Wolffers, Felix; Signorell, Andri; Blozik, Eva
2015-01-01
Introduction: Lower socioeconomic position and measures of social and material deprivation are associated with morbidity and mortality. These inequalities in health among groups of various statuses remain one of the main challenges for public health. The aim of the study was to investigate differences in health care use and costs between recipients of social assistance and non-recipients aged 65 years and younger within the Swiss healthcare system. Methods: We analyzed claims data of 13 492 individuals living in Bern, Switzerland of which 391 received social assistance. For the year 2012, we compared the number of physician visits, hospitalizations, prescribed drugs, and total health care costs as covered by mandatory health insurance. Linear and logistic adjusted regression analyses were made to estimate the effect of receipt of social assistance on health service use and costs. Results: Multivariate linear regression analysis revealed that health care costs increased on average by 1 666 CHF if individuals received social assistance. Recipients of social assistance had on average 1.2 more ambulatory consultations than non-recipients and got 1.65 more different medications prescribed as compared to non-recipients. The chance for recipients of social assistance to be hospitalized was almost twice that of non-recipients (Odds Ratio 1.96, 95% confidence interval 1.49-2.59). Conclusions: Recipients of social assistance demonstrate an exceedingly high use of health services. The need for interventions to alleviate the identified inequalities in health and health care needs is obvious. PMID:25946912
Erythropoietin Levels in Elderly Patients with Anemia of Unknown Etiology
Sriram, Swetha; Martin, Alison; Xenocostas, Anargyros; Lazo-Langner, Alejandro
2016-01-01
Background In many elderly patients with anemia, a specific cause cannot be identified. This study investigates whether erythropoietin levels are inappropriately low in these cases of “anemia of unknown etiology” and whether this trend persists after accounting for confounders. Methods This study includes all anemic patients over 60 years old who had erythropoietin measured between 2005 and 2013 at a single center. Three independent reviewers used defined criteria to assign each patient’s anemia to one of ten etiologies: chronic kidney disease, iron deficiency, chronic disease, confirmed myelodysplastic syndrome (MDS), suspected MDS, vitamin B12 deficiency, folate deficiency, anemia of unknown etiology, other etiology, or multifactorial etiology. Iron deficiency anemia served as the comparison group in all analyses. We used linear regression to model the relationship between erythropoietin and the presence of each etiology, sequentially adding terms to the model to account for the hemoglobin concentration, estimated glomerular filtration rate (eGFR) and Charlson Comorbidity Index. Results A total of 570 patients met the inclusion criteria. Linear regression analysis showed that erythropoietin levels in chronic kidney disease, anemia of chronic disease and anemia of unknown etiology were lower by 48%, 46% and 27%, respectively, compared to iron deficiency anemia even after adjusting for hemoglobin, eGFR and comorbidities. Conclusions We have shown that erythropoietin levels are inappropriately low in anemia of unknown etiology, even after adjusting for confounders. This suggests that decreased erythropoietin production may play a key role in the pathogenesis of anemia of unknown etiology. PMID:27310832
Yamanari, Masahiro; Nagase, Satoko; Fukuda, Shinichi; Ishii, Kotaro; Tanaka, Ryosuke; Yasui, Takeshi; Oshika, Tetsuro; Miura, Masahiro; Yasuno, Yoshiaki
2014-05-01
The relationship between scleral birefringence and biometric parameters of human eyes in vivo is investigated. Scleral birefringence near the limbus of 21 healthy human eyes was measured using polarization-sensitive optical coherence tomography. Spherical equivalent refractive error, axial eye length, and intraocular pressure (IOP) were measured in all subjects. IOP and scleral birefringence of human eyes in vivo were found to have a statistically significant correlation (r = -0.63, P = 0.002). The slope of linear regression was -2.4 × 10^-2 deg/μm/mmHg. Neither spherical equivalent refractive error nor axial eye length had significant correlations with scleral birefringence. To evaluate the direct influence of IOP on scleral birefringence, scleral birefringence of 16 ex vivo porcine eyes was measured under controlled IOP of 5-60 mmHg. In these ex vivo porcine eyes, the mean linear regression slope between controlled IOP and scleral birefringence was -9.9 × 10^-4 deg/μm/mmHg. In addition, porcine scleral collagen fibers were observed with second-harmonic-generation (SHG) microscopy. SHG images of porcine sclera, measured on the external surface at the superior side to the cornea, showed highly aligned collagen fibers parallel to the limbus. In conclusion, scleral birefringence of healthy human eyes was correlated with IOP, indicating that the ultrastructure of scleral collagen was correlated with IOP. It remains to show whether scleral collagen ultrastructure of human eyes is affected by IOP as a long-term effect.
Factors Predicting a Good Symptomatic Outcome After Prostate Artery Embolisation (PAE).
Maclean, D; Harris, M; Drake, T; Maher, B; Modi, S; Dyer, J; Somani, B; Hacking, N; Bryant, T
2018-02-26
As prostate artery embolisation (PAE) becomes an established treatment for benign prostatic obstruction, factors predicting good symptomatic outcome remain unclear. Pre-embolisation prostate size as a predictor is controversial with a handful of papers coming to conflicting conclusions. We aimed to investigate if an association existed in our patient cohort between prostate size and clinical benefit, in addition to evaluating percentage volume reduction as a predictor of symptomatic outcome following PAE. Prospective follow-up of 86 PAE patients at a single institution between June 2012 and January 2016 was conducted (mean age 64.9 years, range 54-80 years). Multiple linear regression analysis was performed to assess strength of association between clinical improvement (change in IPSS) and other variables, of any statistical correlation, through Pearson's bivariate analysis. No major procedural complications were identified and clinical success was achieved in 72.1% (n = 62) at 12 months. Initial prostate size and percentage reduction were found to have a significant association with clinical improvement. Multiple linear regression analysis (r² = 0.48) demonstrated that percentage volume reduction at 3 months (r = 0.68, p < 0.001) had the strongest correlation with good symptomatic improvement at 12 months after adjusting for confounding factors. Both the initial prostate size and percentage volume reduction at 3 months predict good symptomatic outcome at 12 months. These findings therefore aid patient selection and counselling to achieve optimal outcomes for men undergoing prostate artery embolisation.
Shelton, Rachel C.; Puleo, Elaine; Bennett, Gary G.; McNeill, Lorna H.; Sorensen, Glorian; Emmons, Karen M.
2010-01-01
Background Research on the association between self-reported racial or gender discrimination and body mass index (BMI) has been limited and inconclusive to date, particularly among lower-income populations. Objectives The aim of the current study was to examine the association between self-reported racial and gender discrimination and BMI among a sample of adult residents living in 12 urban lower-income housing sites in Boston, Massachusetts (USA). Methods Baseline survey data were collected among 1,307 (weighted N=1907) study participants. For analyses, linear regression models with a cluster design were conducted using SUDAAN and SAS statistical software. Results Our sample was predominately Black (weighted n=956) and Hispanic (weighted n=857), and female (weighted n=1420), with a mean age of 49.3 (SE: .40) and mean BMI of 30.2 kg m−2 (SE: .19). Nearly 47% of participants reported ever experiencing racial discrimination, and 24.8% reported ever experiencing gender discrimination. In bivariate and multivariable linear regression models, no main effect association was found between either racial or gender discrimination and BMI. Conclusions While our findings suggest that self-reported discrimination is not a key determinant of BMI among lower-income housing residents, these results should be considered in light of study limitations. Future researchers may want to investigate this association among other relevant samples, and other social contextual and cultural factors should be explored to understand how they contribute to disparities. PMID:19769005
Modeling energy expenditure in children and adolescents using quantile regression
Yang, Yunwen; Adolph, Anne L.; Puyau, Maurice R.; Vohra, Firoz A.; Zakeri, Issa F.
2013-01-01
Advanced mathematical models have the potential to capture the complex metabolic and physiological processes that result in energy expenditure (EE). Study objective is to apply quantile regression (QR) to predict EE and determine quantile-dependent variation in covariate effects in nonobese and obese children. First, QR models will be developed to predict minute-by-minute awake EE at different quantile levels based on heart rate (HR) and physical activity (PA) accelerometry counts, and child characteristics of age, sex, weight, and height. Second, the QR models will be used to evaluate the covariate effects of weight, PA, and HR across the conditional EE distribution. QR and ordinary least squares (OLS) regressions are estimated in 109 children, aged 5–18 yr. QR modeling of EE outperformed OLS regression for both nonobese and obese populations. Average prediction errors for QR compared with OLS were not only smaller at the median τ = 0.5 (18.6 vs. 21.4%), but also substantially smaller at the tails of the distribution (10.2 vs. 39.2% at τ = 0.1 and 8.7 vs. 19.8% at τ = 0.9). Covariate effects of weight, PA, and HR on EE for the nonobese and obese children differed across quantiles (P < 0.05). The associations (linear and quadratic) between PA and HR with EE were stronger for the obese than nonobese population (P < 0.05). In conclusion, QR provided more accurate predictions of EE compared with conventional OLS regression, especially at the tails of the distribution, and revealed substantially different covariate effects of weight, PA, and HR on EE in nonobese and obese children. PMID:23640591
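The contrast between QR and OLS described above comes from the loss each method minimizes. A minimal sketch, assuming the standard check (pinball) loss for quantile level τ and an intercept-only model; the data values and function names are hypothetical and are not taken from the study:

```python
def pinball_loss(y_true, y_pred, tau):
    """Average check (pinball) loss for a quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def best_constant(y, tau):
    """Observed value that minimizes the pinball loss as a constant prediction."""
    return min(y, key=lambda c: pinball_loss(y, [c] * len(y), tau))


# Hypothetical EE-like values with one large outlier (illustration only).
ee = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(ee, 0.5)  # minimizer of the tau = 0.5 loss
upper_fit = best_constant(ee, 0.9)   # minimizer of the tau = 0.9 loss
mean_fit = sum(ee) / len(ee)         # constant that least squares would choose
```

In this toy case the τ = 0.5 fit stays at the sample median while the least-squares constant is pulled toward the outlier, which is the kind of tail behaviour a quantile-specific model can describe separately.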
Madarang, Krish J; Kang, Joo-Hyon
2014-06-01
Stormwater runoff has been identified as a source of pollution for the environment, especially for receiving waters. In order to quantify and manage the impacts of stormwater runoff on the environment, predictive models and mathematical models have been developed. Predictive tools such as regression models have been widely used to predict stormwater discharge characteristics. Storm event characteristics, such as antecedent dry days (ADD), have been related to response variables, such as pollutant loads and concentrations. However, whether ADD should be considered an important variable in predicting stormwater discharge characteristics has been controversial among many studies. In this study, we examined the accuracy of general linear regression models in predicting discharge characteristics of roadway runoff. A total of 17 storm events were monitored in two highway segments, located in Gwangju, Korea. Data from the monitoring were used to calibrate United States Environmental Protection Agency's Storm Water Management Model (SWMM). The calibrated SWMM was simulated for 55 storm events, and the results of total suspended solid (TSS) discharge loads and event mean concentrations (EMC) were extracted. From these data, linear regression models were developed. R² and p-values of the regression of ADD for both TSS loads and EMCs were investigated. Results showed that pollutant loads were better predicted than pollutant EMC in the multiple regression models. Regression may not provide the true effect of site-specific characteristics, due to uncertainty in the data. Copyright © 2014 The Research Centre for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. All rights reserved.
Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method.
Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza
2015-11-18
Birthweight is one of the most important predicting indicators of the health status in adulthood. Having a balanced birthweight is one of the priorities of the health system in most of the industrial and developed countries. This indicator is used to assess the growth and health status of the infants. The aim of this study was to assess the birthweight of the neonates by using quantile regression in Zanjan province. This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and quantile regression methods in SAS 9.2 statistical software. Of 8456 newborn babies, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of neonates (p<0.05), weight and educational level of the mothers (p<0.05) showed a linear significant relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when an asymmetric response variable or data with outliers is available.
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have been published to date. Two previously published examples using CART analysis in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
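The "repeated partitioning of a sample into subgroups based on a certain criterion" can be illustrated with a single regression-tree split. This is a minimal sketch of one partitioning step only, assuming the usual within-group sum of squared errors as the criterion; it omits recursion, stopping rules, and pruning, and the data and function names are hypothetical:

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Find the threshold on x that minimizes the total within-group SSE of y."""
    best_threshold, best_cost = None, sse(y)
    xs = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive distinct x values.
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2.0
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_threshold, best_cost = threshold, cost
    return best_threshold, best_cost


# Hypothetical covariate and continuous outcome with a jump between x = 3 and x = 4.
x = [1, 2, 3, 4, 5, 6]
y = [10.0, 11.0, 10.0, 20.0, 21.0, 20.0]
threshold, cost = best_split(x, y)
```

A full regression tree would apply the same search again within each resulting subgroup; a classification tree would replace the SSE criterion with an impurity measure for categorical outcomes.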
Zhao, Zeng-hui; Wang, Wei-ming; Gao, Xin; Yan, Ji-xing
2013-01-01
According to the geological characteristics of the Xinjiang Ili mine in the western area of China, a physical model of interstratified strata composed of soft rock and hard coal seam was established. Selecting the tunnel position, deformation modulus, and strength parameters of each layer as influencing factors, the sensitivity coefficient of roadway deformation to each parameter was firstly analyzed based on a Mohr-Coulomb strain softening model and nonlinear elastic-plastic finite element analysis. Then the effect laws of influencing factors which showed high sensitivity were further discussed. Finally, a regression model for the relationship between roadway displacements and multifactors was obtained by equivalent linear regression under multiple factors. The results show that the roadway deformation is highly sensitive to the depth of coal seam under the floor which should be considered in the layout of coal roadway; deformation modulus and strength of coal seam and floor have a great influence on the global stability of tunnel; on the contrary, roadway deformation is not sensitive to the mechanical parameters of soft roof; roadway deformation under random combinations of multi-factors can be deduced by the regression model. These conclusions provide theoretical significance to the arrangement and stability maintenance of coal roadway. PMID:24459447
Prunier, J G; Colyn, M; Legendre, X; Nimon, K F; Flamand, M C
2015-01-01
Direct gradient analyses in spatial genetics provide unique opportunities to describe the inherent complexity of genetic variation in wildlife species and are the object of many methodological developments. However, multicollinearity among explanatory variables is a systemic issue in multivariate regression analyses and is likely to cause serious difficulties in properly interpreting results of direct gradient analyses, with the risk of erroneous conclusions, misdirected research and inefficient or counterproductive conservation measures. Using simulated data sets along with linear and logistic regressions on distance matrices, we illustrate how commonality analysis (CA), a detailed variance-partitioning procedure that was recently introduced in the field of ecology, can be used to deal with nonindependence among spatial predictors. By decomposing model fit indices into unique and common (or shared) variance components, CA allows identifying the location and magnitude of multicollinearity, revealing spurious correlations and thus thoroughly improving the interpretation of multivariate regressions. Despite a few inherent limitations, especially in the case of resistance model optimization, this review highlights the great potential of CA to account for complex multicollinearity patterns in spatial genetics and identifies future applications and lines of research. We strongly urge spatial geneticists to systematically investigate commonalities when performing direct gradient analyses. © 2014 John Wiley & Sons Ltd.
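The decomposition of a model fit index into unique and common components can be sketched for the simplest case of two predictors. This is a minimal illustration, assuming ordinary least squares with R² as the fit index and the standard two-predictor identities (unique contribution of a predictor equals the full-model R² minus the R² of the model without it); the variable names and data are hypothetical, and the distance-matrix regressions used in spatial genetics are not reproduced here:

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / sqrt(va * vb)


def commonality_two_predictors(y, x1, x2):
    """Split the two-predictor R^2 into unique and common (shared) parts."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    # R^2 of the OLS model y ~ x1 + x2, written in terms of pairwise correlations.
    r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return {
        "r2_full": r2_full,
        "unique_x1": r2_full - r_y2 ** 2,  # gain from adding x1 to a model with x2
        "unique_x2": r2_full - r_y1 ** 2,  # gain from adding x2 to a model with x1
        "common": r_y1 ** 2 + r_y2 ** 2 - r2_full,  # variance shared by x1 and x2
    }


# Hypothetical, strongly correlated predictors; y is an exact sum of both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [a + b for a, b in zip(x1, x2)]
parts = commonality_two_predictors(y, x1, x2)
```

Here most of the explained variance lands in the common component, which is how multicollinearity between the two predictors shows up in the partition.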
Generalized Onsager's reciprocal relations for the master and Fokker-Planck equations
NASA Astrophysics Data System (ADS)
Peng, Liangrong; Zhu, Yi; Hong, Liu
2018-06-01
Onsager's reciprocal relation plays a fundamental role in nonequilibrium thermodynamics. Unfortunately, its classical version is valid only within a narrow region near equilibrium due to the linear regression hypothesis, which largely restricts its usage. In this paper, based on the conservation-dissipation formalism, a generalized version of Onsager's relations for the master equations and Fokker-Planck equations was derived. Nonlinear constitutive relations with nonsymmetric and positively stable operators, which become symmetric under the detailed balance condition, constitute key features of this new generalization. Similar conclusions also hold for many other classical models in physics and chemistry, which in turn makes the current study a benchmark for the application of generalized Onsager's relations in nonequilibrium thermodynamics.
Early Childhood Adversity and Pregnancy Outcomes
Smith, Megan V.; Gotman, Nathan; Yonkers, Kimberly A.
2016-01-01
Objectives To examine the association between adverse childhood experiences (ACEs) and pregnancy outcomes; to explore mediators of this association including psychiatric illness and health habits. Methods Exposure to ACEs was determined by the Early Trauma Inventory Self Report Short Form; psychiatric diagnoses were generated by the Composite International Diagnostic Interview administered in a cohort of 2303 pregnant women. Linear regression and structural equation modeling bootstrapping approaches tested for multiple mediators. Results Each additional ACE decreased birth weight by 16.33 g and decreased gestational age by 0.063. Smoking was the strongest mediator of the effect on gestational age. Conclusions ACEs have an enduring effect on maternal reproductive health, as manifested by mothers’ delivery of offspring that were of reduced birth weight and shorter gestational age. PMID:26762511
Some comparisons of complexity in dictionary-based and linear computational models.
Gnecco, Giorgio; Kůrková, Věra; Sanguineti, Marcello
2011-03-01
Neural networks provide a more flexible approximation of functions than traditional linear regression. In the latter, one can only adjust the coefficients in linear combinations of fixed sets of functions, such as orthogonal polynomials or Hermite functions, while for neural networks, one may also adjust the parameters of the functions which are being combined. However, some useful properties of linear approximators (such as uniqueness, homogeneity, and continuity of best approximation operators) are not satisfied by neural networks. Moreover, optimization of parameters in neural networks becomes more difficult than in linear regression. Experimental results suggest that these drawbacks of neural networks are offset by substantially lower model complexity, allowing accuracy of approximation even in high-dimensional cases. We give some theoretical results comparing requirements on model complexity for two types of approximators, the traditional linear ones and so called variable-basis types, which include neural networks, radial, and kernel models. We compare upper bounds on worst-case errors in variable-basis approximation with lower bounds on such errors for any linear approximator. Using methods from nonlinear approximation and integral representations tailored to computational units, we describe some cases where neural networks outperform any linear approximator. Copyright © 2010 Elsevier Ltd. All rights reserved.
Montoye, Alexander H K; Begum, Munni; Henning, Zachary; Pfeiffer, Karin A
2017-02-01
This study had three purposes, all related to evaluating energy expenditure (EE) prediction accuracy from body-worn accelerometers: (1) compare linear regression to linear mixed models, (2) compare linear models to artificial neural network models, and (3) compare accuracy of accelerometers placed on the hip, thigh, and wrists. Forty individuals performed 13 activities in a 90 min semi-structured, laboratory-based protocol. Participants wore accelerometers on the right hip, right thigh, and both wrists and a portable metabolic analyzer (EE criterion). Four EE prediction models were developed for each accelerometer: linear regression, linear mixed, and two ANN models. EE prediction accuracy was assessed using correlations, root mean square error (RMSE), and bias and was compared across models and accelerometers using repeated-measures analysis of variance. For all accelerometer placements, there were no significant differences for correlations or RMSE between linear regression and linear mixed models (correlations: r = 0.71-0.88, RMSE: 1.11-1.61 METs; p > 0.05). For the thigh-worn accelerometer, there were no differences in correlations or RMSE between linear and ANN models (ANN-correlations: r = 0.89, RMSE: 1.07-1.08 METs. Linear models-correlations: r = 0.88, RMSE: 1.10-1.11 METs; p > 0.05). Conversely, one ANN had higher correlations and lower RMSE than both linear models for the hip (ANN-correlation: r = 0.88, RMSE: 1.12 METs. Linear models-correlations: r = 0.86, RMSE: 1.18-1.19 METs; p < 0.05), and both ANNs had higher correlations and lower RMSE than both linear models for the wrist-worn accelerometers (ANN-correlations: r = 0.82-0.84, RMSE: 1.26-1.32 METs. Linear models-correlations: r = 0.71-0.73, RMSE: 1.55-1.61 METs; p < 0.01). For studies using wrist-worn accelerometers, machine learning models offer a significant improvement in EE prediction accuracy over linear models. Conversely, linear models showed similar EE prediction accuracy to machine learning models for hip- and thigh-worn accelerometers and may be viable alternative modeling techniques for EE prediction for hip- or thigh-worn accelerometers.
Diagnosis of Enzyme Inhibition Using Excel Solver: A Combined Dry and Wet Laboratory Exercise
ERIC Educational Resources Information Center
Dias, Albino A.; Pinto, Paula A.; Fraga, Irene; Bezerra, Rui M. F.
2014-01-01
In enzyme kinetic studies, linear transformations of the Michaelis-Menten equation, such as the Lineweaver-Burk double-reciprocal transformation, present some constraints. The linear transformation distorts the experimental error and the relationship between "x" and "y" axes; consequently, linear regression of transformed data…
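The Lineweaver-Burk transformation mentioned above rewrites v = Vmax·S/(Km + S) as 1/v = (Km/Vmax)(1/S) + 1/Vmax, so a straight-line fit of 1/v against 1/S yields both parameters. A minimal sketch in Python rather than Excel, assuming noise-free rates generated from hypothetical parameter values (Vmax = 10, Km = 2); all names are illustrative:

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return slope, my - slope * mx


def michaelis_menten(s, vmax, km):
    """Reaction rate at substrate concentration s."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(s, v):
    """Estimate (Vmax, Km) from a straight-line fit of 1/v against 1/s."""
    slope, intercept = ols_line([1.0 / si for si in s], [1.0 / vi for vi in v])
    vmax = 1.0 / intercept  # intercept of the double-reciprocal line is 1/Vmax
    km = slope * vmax       # slope of the double-reciprocal line is Km/Vmax
    return vmax, km


# Hypothetical substrate concentrations and exact (noise-free) rates.
substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
rates = [michaelis_menten(s, 10.0, 2.0) for s in substrate]
vmax_hat, km_hat = lineweaver_burk_fit(substrate, rates)
```

With exact data the transformation recovers the parameters. With measured data, a small error δ in v becomes roughly δ/v² in 1/v, so the slowest rates dominate the fit, which is the error distortion the abstract refers to.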
Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan
2012-01-01
Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
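The two-stage idea (estimate the variance function non-parametrically, then refit with generalized least squares) can be sketched for a single predictor. This is a simplified illustration and not the authors' estimator: it assumes a local-constant (degree-0) kernel smoother in place of a general local polynomial, a Gaussian kernel with an arbitrary bandwidth, and a diagonal error covariance so that generalized least squares reduces to weighted least squares; all names and data are hypothetical:

```python
from math import exp


def weighted_line(x, y, w):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return my - b * mx, b


def kernel_smooth(x, values, bandwidth):
    """Local-constant smoother with a Gaussian kernel, evaluated at each x."""
    smoothed = []
    for x0 in x:
        k = [exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
        smoothed.append(sum(ki * vi for ki, vi in zip(k, values)) / sum(k))
    return smoothed


def two_stage_fit(x, y, bandwidth=2.0, floor=1e-12):
    """Stage 1: OLS residuals. Stage 2: smoothed variance, then weighted refit."""
    a0, b0 = weighted_line(x, y, [1.0] * len(x))
    sq_resid = [(yi - a0 - b0 * xi) ** 2 for xi, yi in zip(x, y)]
    # The floor guards against division by zero when residuals vanish.
    variance = [max(v, floor) for v in kernel_smooth(x, sq_resid, bandwidth)]
    return weighted_line(x, y, [1.0 / v for v in variance])


# Hypothetical data: true line 2 + 3x with deterministic, alternating-sign
# disturbances whose magnitude grows with x (a stand-in for heteroscedastic noise).
x = [float(i) for i in range(1, 21)]
y = [2.0 + 3.0 * xi + (0.1 * xi if i % 2 == 0 else -0.1 * xi)
     for i, xi in enumerate(x)]
a_hat, b_hat = two_stage_fit(x, y)
```

No explicit test for heteroscedasticity is run at any point, which mirrors the feature highlighted in the abstract; the bandwidth choice here is arbitrary and would normally be tuned.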
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Schwartz, C. S.
2017-12-01
Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performances of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
Francisco, Fabiane Lacerda; Saviano, Alessandro Morais; Almeida, Túlia de Souza Botelho; Lourenço, Felipe Rebello
2016-05-01
Microbiological assays are widely used to estimate the relative potencies of antibiotics in order to guarantee the efficacy, safety, and quality of drug products. Despite the advantages of turbidimetric bioassays when compared to other methods, they have limitations concerning the linearity and range of the dose-response curve determination. Here, we proposed to use partial least squares (PLS) regression to solve these limitations and to improve the prediction of relative potencies of antibiotics. Kinetic-reading microplate turbidimetric bioassays for apramycin and vancomycin were performed using Escherichia coli (ATCC 8739) and Bacillus subtilis (ATCC 6633), respectively. Microbial growths were measured as absorbance up to 180 and 300 min for apramycin and vancomycin turbidimetric bioassays, respectively. Conventional dose-response curves (absorbances or area under the microbial growth curve vs. log of antibiotic concentration) showed significant regression; however, there was significant deviation from linearity. Thus, they could not be used for relative potency estimations. PLS regression allowed us to construct a predictive model for estimating the relative potencies of apramycin and vancomycin without over-fitting and it improved the linear range of turbidimetric bioassay. In addition, PLS regression provided predictions of relative potencies equivalent to those obtained from agar diffusion official methods. Therefore, we conclude that PLS regression may be used to estimate the relative potencies of antibiotics with significant advantages when compared to conventional dose-response curve determination. Copyright © 2016 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.
2004-01-01
We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.
1979-01-01
The objective of this paper is to define optical physics and/or environmental conditions under which the linear multiple-regression should be applicable. An investigation of the signal-response equations is conducted and the concept is tested by application to actual remote sensing data from a laboratory experiment performed under controlled conditions. Investigation of the signal-response equations shows that the exact solution for a number of optical physics conditions is of the same form as a linearized multiple-regression equation, even if nonlinear contributions from surface reflections, atmospheric constituents, or other water pollutants are included. Limitations on achieving this type of solution are defined.
Language and country preponderance trends in MEDLINE and its causes.
Loria, Alvar; Arroyo, Pedro
2005-07-01
The authors characterized the output of MEDLINE papers by language and country of publication during a thirty-four-year time period. We classified MEDLINE's journal articles by country of publication (Anglos/Non-Anglos) and language (English/Non-English) for the years 1966 and from 1970 to 2000 at five-year intervals. Eight English-speaking countries were considered Anglos. Linear regression analysis of number of papers versus time was performed. The global number of papers increased linearly at a rate of 8,142 papers per year. Anglo and English papers also increased linearly (6,740 and 9,199, respectively). Journals of Non-Anglo countries accounted for 25% of the English language increase (2,438 per year). Only Non-English papers decreased at a rate of 1,056 fewer papers per year. These trends have led to overwhelming shares of English and Anglo papers in MEDLINE. In 2000, 68% of all papers were published in the 8 Anglo countries and 90% were written in English. The Anglo and English preponderances appear to be a consequence of at least two phenomena: (1) editorial policy changes in MEDLINE and in some journals from Non-Anglo countries and (2) factors affecting Non-Anglo researchers in the third world (publication constraints, migration, and undersupport). These are tentative conclusions that need confirmation.
Wit, Jan M.; Himes, John H.; van Buuren, Stef; Denno, Donna M.; Suchdev, Parminder S.
2017-01-01
Background/Aims Childhood stunting is a prevalent problem in low- and middle-income countries and is associated with long-term adverse neurodevelopment and health outcomes. In this review, we define indicators of growth, discuss key challenges in their analysis and application, and offer suggestions for indicator selection in clinical research contexts. Methods Critical review of the literature. Results Linear growth is commonly expressed as length-for-age or height-for-age z-score (HAZ) in comparison to normative growth standards. Conditional HAZ corrects for regression to the mean where growth changes relate to previous status. In longitudinal studies, growth can be expressed as ΔHAZ at 2 time points. Multilevel modeling is preferable when more measurements per individual child are available over time. Height velocity z-score reference standards are available for children under the age of 2 years. Adjusting for covariates or confounders (e.g., birth weight, gestational age, sex, parental height, maternal education, socioeconomic status) is recommended in growth analyses. Conclusion The most suitable indicator(s) for linear growth can be selected based on the number of available measurements per child and the child's age. By following a step-by-step algorithm, growth analyses can be precisely and accurately performed to allow for improved comparability within and between studies. PMID:28196362
Glantz, S; Wilson-Loots, R
2003-01-01
Background: Because bingo is widely played, the claim that smoking restrictions will adversely affect bingo games is used as an argument against these policies. We used publicly available data from Massachusetts to assess the impact of 100% smoke-free ordinances on profits from bingo and other gambling sponsored by charitable organisations between 1985 and 2001. Methods: We conducted two analyses: (1) a general linear model implementation of a time series analysis with net profits (adjusted to 2001 dollars) as the dependent variable, and community (as a fixed effect), year, lagged net profits, and the length of time the ordinance had been in force as the independent variables; (2) multiple linear regression of total state profits against time, lagged profits, and the percentage of the entire state population in communities that allow charitable gaming but prohibit smoking. Results: The general linear model analysis of data from individual communities showed that, while adjusted profits fell over time, this effect was not related to the presence of an ordinance. The analysis in terms of the fraction of the population living in communities with ordinances yielded the same result. Conclusion: Policymakers can implement smoke-free policies without concern that these policies will affect charitable gaming. PMID:14660778
1981-09-01
corresponds to the same square footage that consumed the electrical energy. 3. The basic assumptions of multiple linear regression, as enumerated in...7. Data related to the sample of bases is assumed to be representative of bases in the population. Limitations Basic limitations on this research were... Ratemaking--Overview. Rand Report R-5894, Santa Monica CA, May 1977. Chatterjee, Samprit, and Bertram Price. Regression Analysis by Example. New York: John
Study on power grid characteristics in summer based on Linear regression analysis
NASA Astrophysics Data System (ADS)
Tang, Jin-hui; Liu, You-fei; Liu, Juan; Liu, Qiang; Liu, Zhuan; Xu, Xi
2018-05-01
The correlation analysis of power load and temperature is the precondition and foundation for accurate load prediction, and a great deal of research has been done on it. This paper constructs a linear correlation model between temperature and power load, and then examines the correlation between fault maintenance work orders and the power load. Data from Jiangxi province for the summer of 2017, such as temperature, power load, and fault maintenance work orders, were used for data analysis and mining. The linear regression models established in this paper will support electricity load growth forecasting, fault repair work order review, analysis of weaknesses in distribution network operation, and other work, and help to refine them further.
Interpreting Regression Results: beta Weights and Structure Coefficients are Both Important.
ERIC Educational Resources Information Center
Thompson, Bruce
Various realizations have led to less frequent use of the "OVA" methods (analysis of variance--ANOVA--among others) and to more frequent use of general linear model approaches such as regression. However, too few researchers understand all the various coefficients produced in regression. This paper explains these coefficients and their…
Spatial Assessment of Model Errors from Four Regression Techniques
Lianjun Zhang; Jeffrey H. Gove; Jeffrey H. Gove
2005-01-01
Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...
Quantile Regression in the Study of Developmental Sciences
ERIC Educational Resources Information Center
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of…
Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)
1987-10-01
Repair FADAC Printed Circuit Board ............. 6 3. Data Analysis Techniques ............................. 6 a. Multiple Linear Regression... ANALYSIS/DISCUSSION ............................... 12 1. Example of Regression Analysis ..................... 12 2. Regression results for all tasks...6 TABLE 9. Task Grouping for Analysis ........................ 7 TABLE 10. Remove/Replace H60A3 Power Pack................. 8 TABLE
Experimental and computational prediction of glass transition temperature of drugs.
Alzghoul, Ahmad; Alhalaweh, Amjad; Mahlin, Denny; Bergström, Christel A S
2014-12-22
Glass transition temperature (Tg) is an important inherent property of an amorphous solid material which is usually determined experimentally. In this study, the relation between Tg and melting temperature (Tm) was evaluated using a data set of 71 structurally diverse druglike compounds. Further, in silico models for prediction of Tg were developed based on calculated molecular descriptors and linear (multilinear regression, partial least-squares, principal component regression) and nonlinear (neural network, support vector regression) modeling techniques. The models based on Tm predicted Tg with an RMSE of 19.5 K for the test set. Among the five computational models developed herein the support vector regression gave the best result with RMSE of 18.7 K for the test set using only four chemical descriptors. Hence, two different models that predict Tg of drug-like molecules with high accuracy were developed. If Tm is available, a simple linear regression can be used to predict Tg. However, the results also suggest that support vector regression and calculated molecular descriptors can predict Tg with equal accuracy, already before compound synthesis.
NASA Astrophysics Data System (ADS)
Wibowo, Wahyu; Wene, Chatrien; Budiantara, I. Nyoman; Permatasari, Erma Oktania
2017-03-01
Multiresponse semiparametric regression is a simultaneous-equation regression model that fuses parametric and nonparametric models. It comprises several models, each with two components, one parametric and one nonparametric. The model used here has a linear function as the parametric component and a truncated polynomial spline as the nonparametric component. It can handle both linear and nonlinear relationships between the responses and the sets of predictor variables. The aim of this paper is to demonstrate the application of this regression model to the effect of regional socio-economic conditions on the use of information technology. More specifically, the response variables are the percentage of households with internet access and the percentage of households with a personal computer. The predictor variables are the percentage of literate people, the percentage of electrification, and the percentage of economic growth. Based on the identified relationships between the response and predictor variables, economic growth is treated as the nonparametric predictor and the others as parametric predictors. The results show that multiresponse semiparametric regression can be applied well, as indicated by the high coefficient of determination, 90 percent.
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-01
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using the robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by Grubbs' test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and plain partial least squares regression. Both R² and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916, with bitterness values ranging from 0.63 to 4.78, was obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV calculated using the models constructed by the other methods was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS models constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data. PMID:26821026
The National Tumor Association Foundation (ANT): A 30 year old model of home palliative care
2010-01-01
Background Models of palliative care delivery develop within a social, cultural, and political context. This paper describes the 30-year history of the National Tumor Association (ANT), a palliative care organization founded in the Italian province of Bologna, focusing on this model of home care for palliative cancer patients and on its evaluation. Methods Data were collected from the 1986-2008 ANT archives and documents from the Emilia-Romagna Region Health Department, Italy. Outcomes of interest were changes in: number of patients served, performance status at admission (Karnofsky Performance Status score [KPS]), length of participation in the program (days of care provided), place of death (home vs. hospital/hospice), and satisfaction with care. Statistical methods included linear and quadratic regressions. A linear and a quadratic regression were generated; the independent variable was the year, while the dependent one was the number of patients from 1986 to 2008. Two linear regressions were generated for patients who died at home and in the hospital, respectively. For each regression, the R square, the unstandardized and standardized coefficients and related P-values were estimated. Results The number of patients served by ANT has increased continuously from 131 (1986) to a cumulative total of 69,336 patients (2008), at a steady rate of approximately 121 additional patients per year and with no significant gender difference. The annual number of home visits increased from 6,357 (1985) to 904,782 (2008). More ANT patients died at home than in hospice or hospital; this proportion increased from 60% (1987) to 80% (2007). The rate of growth in the number of patients dying in hospital/hospice was approximately 40 patients/year (p < 0.01), vs. approximately 177 patients/year for patients who died at home. The percentage of patients with KPS < 40 at admission decreased from 70% (2003) to 30% (2008); the percentage of patients with KPS > 40 increased.
Mean days of care for patients with KPS > 40 exceeded mean days for patients with KPS < 40 (p < 0.001). Patients and caregivers reported high satisfaction with care in each year of assessment; in 2008, among 187 interviewed caregivers, 95% judged the quality of doctors' assistance, and 91% judged the quality of nurses' assistance, to be "optimal." Conclusions The ANT home care model of palliative care delivery has been well-received, with progressively growing numbers of patients served. It has resulted in a greater proportion of home deaths and in patients' accessing palliative care at an earlier point in the disease trajectory. Changes in ANT chronicle palliative care trends in general. PMID:20529310
Guertler, Diana; Vandelanotte, Corneel; Short, Camille; Alley, Stephanie; Schoeppe, Stephanie; Duncan, Mitch J.
2015-01-01
Objective: This study aims to examine the relationship of lifestyle behaviors (physical activity, work and non-work sitting time, sleep quality, and sleep duration) with presenteeism while controlling for sociodemographics, work- and health-related variables. Methods: Data were collected from 710 workers (aged 20 to 76 years; 47.9% women) from randomly selected Australian adults who completed an online survey. Linear regression was used to examine the relationship between lifestyle behaviors and presenteeism. Results: Poorer sleep quality (standardized regression coefficients [B] = 0.112; P < 0.05), suboptimal duration (B = 0.081; P < 0.05), and lower work sitting time (B = −0.086; P < 0.05) were significantly associated with higher presenteeism when controlling for all lifestyle behaviors. Engaging in three risky lifestyle behaviors was associated with higher presenteeism (B = 0.150; P < 0.01) compared with engaging in none or one. Conclusions: The results of this study highlight the importance of sleep behaviors for presenteeism and call for behavioral interventions that simultaneously address sleep in conjunction with other activity-related behaviors. PMID:25742538
Big Data Toolsets to Pharmacometrics: Application of Machine Learning for Time-to-Event Analysis.
Gong, Xiajing; Hu, Meng; Zhao, Liang
2018-05-01
Additional value can be potentially created by applying big data tools to address pharmacometric problems. The performances of machine learning (ML) methods and the Cox regression model were evaluated based on simulated time-to-event data synthesized under various preset scenarios, i.e., with linear vs. nonlinear and dependent vs. independent predictors in the proportional hazard function, or with high-dimensional data featured by a large number of predictor variables. Our results showed that ML-based methods outperformed the Cox model in prediction performance as assessed by concordance index and in identifying the preset influential variables for high-dimensional data. The prediction performances of ML-based methods are also less sensitive to data size and censoring rates than the Cox regression model. In conclusion, ML-based methods provide a powerful tool for time-to-event analysis, with a built-in capacity for high-dimensional data and better performance when the predictor variables assume nonlinear relationships in the hazard function. © 2018 The Authors. Clinical and Translational Science published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.
Louys, Julien; Meloro, Carlo; Elton, Sarah; Ditchfield, Peter; Bishop, Laura C
2015-01-01
We test the performance of two models that use mammalian communities to reconstruct multivariate palaeoenvironments. While both models exploit the correlation between mammal communities (defined in terms of functional groups) and arboreal heterogeneity, the first uses a multiple multivariate regression of community structure and arboreal heterogeneity, while the second uses a linear regression of the principal components of each ecospace. The success of these methods means the palaeoenvironment of a particular locality can be reconstructed in terms of the proportions of heavy, moderate, light, and absent tree canopy cover. The linear regression is less biased, and more precisely and accurately reconstructs heavy tree canopy cover than the multiple multivariate model. However, the multiple multivariate model performs better than the linear regression for all other canopy cover categories. Both models consistently perform better than randomly generated reconstructions. We apply both models to the palaeocommunity of the Upper Laetolil Beds, Tanzania. Our reconstructions indicate that there was very little heavy tree cover at this site (likely less than 10%), with the palaeo-landscape instead comprising a mixture of light and absent tree cover. These reconstructions help resolve the previous conflicting palaeoecological reconstructions made for this site. Copyright © 2014 Elsevier Ltd. All rights reserved.
Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa
2008-01-01
This research had two main goals: to build a predictor for estimating the values of a turnaround time (TAT) indicator, and to use a numerical clustering technique to find possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction, and insight characterisation. A multiple linear regression predictor of the TAT indicator and clustering techniques were used to improve corrective maintenance task efficiency in a clinical engineering department (CED). Multiple linear regression was used to build a predictive model of TAT values. The variables contributing to this model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.
A New SEYHAN's Approach in Case of Heterogeneity of Regression Slopes in ANCOVA.
Ankarali, Handan; Cangur, Sengul; Ankarali, Seyit
2018-06-01
In this study, a new approach named SEYHAN is suggested that allows conventional ANCOVA to be used instead of robust or nonlinear ANCOVA when the assumptions of linearity and homogeneity of regression slopes are not met. The proposed SEYHAN approach transforms a continuous covariate into a categorical structure when the relationship between the covariate and the dependent variable is nonlinear and the regression slopes are not homogeneous. A simulated data set was used to explain the approach. After the MARS method was used to categorize the covariate, we performed conventional ANCOVA in each subgroup formed according to the knot values, and analysis of variance with a two-factor model. The first model is simpler than the second model, which includes an interaction term. Since the model with the interaction effect has more subjects, the power of the test also increases and the existing significant difference is revealed better. With the help of this approach, nonlinearity and heterogeneity of regression slopes are no longer a problem for data analysis with the conventional linear ANCOVA model. The approach can be applied quickly and efficiently in the presence of one or more covariates.
The Influential Effect of Blending, Bump, Changing Period, and Eclipsing Cepheids on the Leavitt Law
NASA Astrophysics Data System (ADS)
García-Varela, A.; Muñoz, J. R.; Sabogal, B. E.; Vargas Domínguez, S.; Martínez, J.
2016-06-01
The investigation of the nonlinearity of the Leavitt law (LL) is a topic that began more than seven decades ago, when some of the studies in this field found that the LL has a break at about 10 days. The goal of this work is to investigate a possible statistical cause of this nonlinearity. By applying linear regressions to OGLE-II and OGLE-IV data, we find that to obtain the LL by using linear regression, robust techniques to deal with influential points and/or outliers are needed instead of the ordinary least-squares regression traditionally used. In particular, by using M- and MM-regressions we establish firmly and without doubt the linearity of the LL in the Large Magellanic Cloud, without rejecting or excluding Cepheid data from the analysis. This implies that light curves of Cepheids suggesting blending, bumps, eclipses, or period changes do not affect the LL for this galaxy. For the Small Magellanic Cloud, when including Cepheids of this kind, it is not possible to find an adequate model, probably because of the geometry of the galaxy. In that case, a possible influence of these stars could exist.
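To illustrate why a robust fit can behave differently from ordinary least squares when a few influential points are present, here is a minimal sketch of an M-regression for a straight line. It is not the authors' procedure: the Huber weight function, the tuning constant 1.345, the MAD-based scale, and the synthetic data (a line with slope 3 plus one gross outlier) are all assumptions chosen for illustration, and the MM-regression mentioned in the abstract is not shown.

```python
import statistics

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def huber_line(x, y, c=1.345, iters=50):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    a, b = weighted_line(x, y, [1.0] * len(x))  # start from the OLS fit
    for _ in range(iters):
        res = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        med = statistics.median(res)
        # Robust scale from the median absolute deviation (MAD).
        scale = statistics.median([abs(r - med) for r in res]) / 0.6745
        scale = max(scale, 1e-12)
        # Huber weights: 1 inside the threshold, shrinking outside it.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in res]
        a, b = weighted_line(x, y, w)
    return a, b

# Synthetic data (hypothetical): y = 2 + 3x with small noise and one outlier.
x = [float(i) for i in range(10)]
y = [2.0 + 3.0 * xi + (0.1 if i % 2 == 0 else -0.1) for i, xi in enumerate(x)]
y[9] += 30.0

a_ols, b_ols = weighted_line(x, y, [1.0] * len(x))
a_hub, b_hub = huber_line(x, y)
print("OLS slope:  ", round(b_ols, 3))
print("Huber slope:", round(b_hub, 3))
```

In this toy setting the single outlier pulls the OLS slope well away from 3, while the reweighted fit stays much closer to it, which is the kind of effect the abstract attributes to influential points.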
Multiple regression technique for Pth degree polynominals with and without linear cross products
NASA Technical Reports Server (NTRS)
Davis, J. W.
1973-01-01
A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products which evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent of error evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique which show the output formats and typical plots comparing computer results to each set of input data.
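The two cases described above can be sketched as two design matrices for the same least-squares solver. This is only an illustration, not the NASA programs: it assumes that "linear cross products" means the pairwise terms x_i*x_j, and the function names, the test surface, and the 4 x 4 grid of points are hypothetical.

```python
from itertools import combinations

def design_row(xs, degree, cross_products=False):
    """Row [1, x1, ..., x1^P, ..., xN, ..., xN^P] plus optional x_i*x_j terms."""
    row = [1.0]
    for x in xs:
        row.extend(x ** p for p in range(1, degree + 1))
    if cross_products:
        row.extend(xi * xj for xi, xj in combinations(xs, 2))
    return row

def solve_normal_equations(rows, y):
    """Least-squares coefficients from X^T X b = X^T y (Gauss-Jordan)."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Hypothetical surface with N = 2 variables, degree P = 2, one cross product:
# y = 1 + 2*x1 - x1^2 + 0.5*x2 + 3*x2^2 + 4*x1*x2
true_coef = [1.0, 2.0, -1.0, 0.5, 3.0, 4.0]
points = [(float(i), float(j)) for i in range(4) for j in range(4)]
rows = [design_row(p, 2, cross_products=True) for p in points]
y = [sum(c * t for c, t in zip(true_coef, row)) for row in rows]

coef = solve_normal_equations(rows, y)
print([round(c, 6) for c in coef])
```

Dropping `cross_products=True` gives the smaller model with only the pure power terms, so the same solver covers both cases and only the design matrix changes.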
Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara
2017-01-01
In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most common models for analyzing such complex longitudinal data are based on mean regression, which fails to provide efficient estimates in the presence of outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying the benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish Bayesian joint models that account for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
Kumar, K Vasanth
2006-10-11
Batch kinetic experiments were carried out for the sorption of methylene blue onto activated carbon. The experimental kinetics were fitted to the pseudo first-order and pseudo second-order kinetic models by linear and non-linear methods. Five different types of the Ho pseudo second-order expression are discussed. A linear least-squares method and a trial-and-error non-linear method of estimating the pseudo second-order rate kinetic parameters were compared. The sorption process was found to follow both a pseudo first-order and a pseudo second-order kinetic model. The present investigation showed that it is inappropriate to use the type 1 and type pseudo second-order expressions, as proposed by Ho and Blanchard et al. respectively, for predicting the kinetic rate constants and the initial sorption rate for the studied system. Three possible alternative linear expressions (type 2 to type 4) that better predict the initial sorption rate and kinetic rate constants for the studied system (methylene blue/activated carbon) were proposed. The linear method was found to check only the hypothesis instead of verifying the kinetic model. The non-linear regression method was found to be the more appropriate method for determining the rate kinetic parameters.
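As a concrete picture of what a linearized pseudo second-order fit involves, the sketch below uses the commonly cited form q_t = k*qe^2*t / (1 + k*qe*t) and its usual "type 1" linearization t/q_t = 1/(k*qe^2) + t/qe, then recovers qe and k from the slope and intercept. The parameter values, sampling times, and function names are hypothetical and not taken from the study; the data are noise-free, so the linear fit recovers the inputs, which would not necessarily hold for real measurements.

```python
def pso_uptake(t, qe, k):
    """Pseudo second-order uptake q_t for equilibrium capacity qe and rate k."""
    return k * qe ** 2 * t / (1.0 + k * qe * t)

def fit_type1(times, q):
    """Fit t/q_t = 1/(k*qe^2) + t/qe by least squares; returns (qe, k)."""
    xs = list(times)
    ys = [t / qt for t, qt in zip(times, q)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    qe = 1.0 / slope                 # slope = 1/qe
    k = slope ** 2 / intercept       # intercept = 1/(k*qe^2)
    return qe, k

# Hypothetical parameters: qe in mg/g, k in g/(mg*min), times in minutes.
times = [5.0, 10.0, 20.0, 40.0, 60.0, 90.0, 120.0]
q_obs = [pso_uptake(t, qe=50.0, k=0.002) for t in times]

qe_fit, k_fit = fit_type1(times, q_obs)
print("qe =", round(qe_fit, 4), " k =", round(k_fit, 6))
```

With noisy data the transformation t/q_t distorts the error structure, which is one reason a direct non-linear fit of q_t against t can give different parameter estimates from the linearized forms.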
Adjusted variable plots for Cox's proportional hazards regression model.
Hall, C B; Zeger, S L; Bandeen-Roche, K J
1996-01-01
Adjusted variable plots are useful in linear regression for outlier detection and for qualitative evaluation of the fit of a model. In this paper, we extend adjusted variable plots to Cox's proportional hazards model for possibly censored survival data. We propose three different plots: a risk level adjusted variable (RLAV) plot in which each observation in each risk set appears, a subject level adjusted variable (SLAV) plot in which each subject is represented by one point, and an event level adjusted variable (ELAV) plot in which the entire risk set at each failure event is represented by a single point. The latter two plots are derived from the RLAV by combining multiple points. In each plot, the regression coefficient and standard error from a Cox proportional hazards regression are obtained by a simple linear regression through the origin fit to the coordinates of the pictured points. The plots are illustrated with a reanalysis of a dataset of 65 patients with multiple myeloma.
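The "simple linear regression through the origin" step mentioned above has a closed form, sketched below with the textbook formulas slope = Σxy / Σx² and SE = sqrt(RSS / ((n - 1) Σx²)). The coordinates here are made-up numbers, not adjusted-variable coordinates from a Cox model, and the function name is hypothetical.

```python
import math

def through_origin(x, y):
    """Least-squares slope and standard error for y = b*x (no intercept)."""
    sxx = sum(xi * xi for xi in x)
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    rss = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    # One parameter is estimated, so the residual degrees of freedom are n - 1.
    se = math.sqrt(rss / (len(x) - 1) / sxx)
    return slope, se

# Made-up plot coordinates for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
slope, se = through_origin(x, y)
print("slope =", round(slope, 4), " SE =", round(se, 4))
```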
NASA Astrophysics Data System (ADS)
Sahabiev, I. A.; Ryazanov, S. S.; Kolcova, T. G.; Grigoryan, B. R.
2018-03-01
The three most common techniques to interpolate soil properties at a field scale—ordinary kriging (OK), regression kriging with multiple linear regression drift model (RK + MLR), and regression kriging with principal component regression drift model (RK + PCR)—were examined. The results of the performed study were compiled into an algorithm of choosing the most appropriate soil mapping technique. Relief attributes were used as the auxiliary variables. When spatial dependence of a target variable was strong, the OK method showed more accurate interpolation results, and the inclusion of the auxiliary data resulted in an insignificant improvement in prediction accuracy. According to the algorithm, the RK + PCR method effectively eliminates multicollinearity of explanatory variables. However, if the number of predictors is less than ten, the probability of multicollinearity is reduced, and application of the PCR becomes irrational. In that case, the multiple linear regression should be used instead.
Jupiter, Daniel C
2012-01-01
In this first of a series of statistical methodology commentaries for the clinician, we discuss the use of multivariate linear regression. Copyright © 2012 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
An evaluation of bias in propensity score-adjusted non-linear regression models.
Wan, Fei; Mitra, Nandita
2018-03-01
Propensity score methods are commonly used to adjust for observed confounding when estimating the conditional treatment effect in observational studies. One popular method, covariate adjustment of the propensity score in a regression model, has been empirically shown to be biased in non-linear models. However, no compelling underlying theoretical reason has been presented. We propose a new framework to investigate bias and consistency of propensity score-adjusted treatment effects in non-linear models that uses a simple geometric approach to forge a link between the consistency of the propensity score estimator and the collapsibility of non-linear models. Under this framework, we demonstrate that adjustment of the propensity score in an outcome model results in the decomposition of observed covariates into the propensity score and a remainder term. Omission of this remainder term from a non-collapsible regression model leads to biased estimates of the conditional odds ratio and conditional hazard ratio, but not for the conditional rate ratio. We further show, via simulation studies, that the bias in these propensity score-adjusted estimators increases with larger treatment effect size, larger covariate effects, and increasing dissimilarity between the coefficients of the covariates in the treatment model versus the outcome model.
Modification of the USLE K factor for soil erodibility assessment on calcareous soils in Iran
NASA Astrophysics Data System (ADS)
Ostovari, Yaser; Ghorbani-Dashtaki, Shoja; Bahrami, Hossein-Ali; Naderi, Mehdi; Dematte, Jose Alexandre M.; Kerry, Ruth
2016-11-01
The measurement of soil erodibility (K) in the field is tedious, time-consuming and expensive; therefore, its prediction through pedotransfer functions (PTFs) could be far less costly and time-consuming. The aim of this study was to develop new PTFs to estimate the K factor using multiple linear regression, Mamdani fuzzy inference systems, and artificial neural networks. For this purpose, K was measured in 40 erosion plots with natural rainfall. Various soil properties including the soil particle size distribution, calcium carbonate equivalent, organic matter, permeability, and wet-aggregate stability were measured. The results showed that the mean measured K was 0.014 t h MJ⁻¹ mm⁻¹ and 2.08 times less than the estimated mean K (0.030 t h MJ⁻¹ mm⁻¹) using the USLE model. Permeability, wet-aggregate stability, very fine sand, and calcium carbonate were selected as independent variables by forward stepwise regression in order to assess the ability of multiple linear regression, Mamdani fuzzy inference systems and artificial neural networks to predict K. The calcium carbonate equivalent, which is not accounted for in the USLE model, had a significant impact on K in multiple linear regression due to its strong influence on the stability of aggregates and soil permeability. Statistical indices in validation and calibration datasets determined that the artificial neural networks method with the highest R², lowest RMSE, and lowest ME was the best model for estimating the K factor. A strong correlation (R² = 0.81, n = 40, p < 0.05) between the estimated K from multiple linear regression and measured K indicates that the use of calcium carbonate equivalent as a predictor variable gives a better estimation of K in areas with calcareous soils.
Postmolar gestational trophoblastic neoplasia: beyond the traditional risk factors.
Bakhtiyari, Mahmood; Mirzamoradi, Masoumeh; Kimyaiee, Parichehr; Aghaie, Abbas; Mansournia, Mohammd Ali; Ashrafi-Vand, Sepideh; Sarfjoo, Fatemeh Sadat
2015-09-01
To investigate the slope of linear regression of postevacuation serum hCG as an independent risk factor for postmolar gestational trophoblastic neoplasia (GTN). Multicenter retrospective cohort study. Academic referral health care centers. All subjects with confirmed hydatidiform mole and at least four measurements of β-hCG titer. None. Type and magnitude of the relationship between the slope of linear regression of β-hCG as a new risk factor and GTN using Bayesian logistic regression with penalized log-likelihood estimation. Among the high-risk and low-risk molar pregnancy cases, 11 (18.6%) and 19 cases (13.3%) had GTN, respectively. No significant relationship was found between the components of a high-risk pregnancy and GTN. The β-hCG return slope was higher in the spontaneous cure group. However, the initial level of this hormone in the first measurement was higher in the GTN group compared with in the spontaneous recovery group. The average time for diagnosing GTN in the high-risk molar pregnancy group was 2 weeks less than that of the low-risk molar pregnancy group. In addition to slope of linear regression of β-hCG (odds ratio [OR], 12.74, confidence interval [CI], 5.42-29.2), abortion history (OR, 2.53; 95% CI, 1.27-5.04) and large uterine height for gestational age (OR, 1.26; CI, 1.04-1.54) had the maximum effects on GTN outcome, respectively. The slope of linear regression of β-hCG was introduced as an independent risk factor, which could be used for clinical decision making based on records of β-hCG titer and subsequent prevention program. Copyright © 2015 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
Proton radius from electron scattering data
NASA Astrophysics Data System (ADS)
Higinbotham, Douglas W.; Kabir, Al Amin; Lin, Vincent; Meekins, David; Norum, Blaine; Sawatzky, Brad
2016-05-01
Background: The proton charge radius extracted from recent muonic hydrogen Lamb shift measurements is significantly smaller than that extracted from atomic hydrogen and electron scattering measurements. The discrepancy has become known as the proton radius puzzle. Purpose: In an attempt to understand the discrepancy, we review high-precision electron scattering results from Mainz, Jefferson Lab, Saskatoon, and Stanford. Methods: We make use of stepwise regression techniques using the F test as well as the Akaike information criterion to systematically determine the predictive variables to use for a given set and range of electron scattering data as well as to provide multivariate error estimates. Results: Starting with the precision, low four-momentum transfer (Q²) data from Mainz (1980) and Saskatoon (1974), we find that a stepwise regression of the Maclaurin series using the F test as well as the Akaike information criterion justify using a linear extrapolation which yields a value for the proton radius that is consistent with the result obtained from muonic hydrogen measurements. Applying the same Maclaurin series and statistical criteria to the 2014 Rosenbluth results on G_E from Mainz, we again find that the stepwise regression tends to favor a radius consistent with the muonic hydrogen radius but produces results that are extremely sensitive to the range of data included in the fit. Making use of the high-Q² data on G_E to select functions which extrapolate to high Q², we find that a Padé (N = M = 1) statistical model works remarkably well, as does a dipole function with a 0.84 fm radius, G_E(Q²) = (1 + Q²/0.66 GeV²)⁻².
Conclusions: Rigorous applications of stepwise regression techniques and multivariate error estimates result in the extraction of a proton charge radius that is consistent with the muonic hydrogen result of 0.84 fm; either from linear extrapolation of the extremely-low-Q² data or by use of the Padé approximant for extrapolation using a larger range of data. Thus, based on a purely statistical analysis of electron scattering data, we conclude that the electron scattering results and the muonic hydrogen results are consistent. It is the atomic hydrogen results that are the outliers.
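The link between the quoted dipole function and a 0.84 fm radius can be checked numerically. The sketch below assumes the standard definition of the charge radius from the slope of the form factor at zero momentum transfer, r² = -6 (ħc)² dG_E/dQ² at Q² = 0, and the conversion constant ħc ≈ 0.19733 GeV·fm; neither is stated in the abstract, and the function names are hypothetical.

```python
import math

HBARC = 0.1973269804  # GeV*fm, conversion constant (assumed, not from the abstract)

def dipole(q2, lam2=0.66):
    """Dipole form factor G_E(Q^2) = (1 + Q^2/lam2)^-2, with Q^2 and lam2 in GeV^2."""
    return (1.0 + q2 / lam2) ** -2

def radius_from_slope(g, h=1e-6):
    """Charge radius in fm from r^2 = -6 * dG/dQ^2 at Q^2 = 0."""
    slope = (g(h) - g(-h)) / (2.0 * h)  # central finite difference, in GeV^-2
    return math.sqrt(-6.0 * slope) * HBARC

r_numeric = radius_from_slope(dipole)
r_closed = math.sqrt(12.0 / 0.66) * HBARC  # analytic result for the dipole form
print("radius (numeric):", round(r_numeric, 4), "fm")
print("radius (closed): ", round(r_closed, 4), "fm")
```

Both routes give roughly 0.84 fm, in line with the radius quoted for the dipole with a 0.66 GeV² scale parameter.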
Kanamori, Shogo; Castro, Marcia C.; Sow, Seydou; Matsuno, Rui; Cissokho, Alioune; Jimba, Masamine
2016-01-01
Background The 5S method is a lean management tool for workplace organization, with 5S being an abbreviation for five Japanese words that translate to English as Sort, Set in Order, Shine, Standardize, and Sustain. In Senegal, the 5S intervention program was implemented in 10 health centers in two regions between 2011 and 2014. Objective To identify the impact of the 5S intervention program on the satisfaction of clients (patients and caretakers) who visited the health centers. Design A standardized 5S intervention protocol was implemented in the health centers using a quasi-experimental separate pre-post samples design (four intervention and three control health facilities). A questionnaire with 10 five-point Likert items was used to measure client satisfaction. Linear regression analysis was conducted to identify the intervention's effect on the client satisfaction scores, represented by an equally weighted average of the 10 Likert items (Cronbach's alpha=0.83). Additional regression analyses were conducted to identify the intervention's effect on the scores of each Likert item. Results Backward stepwise linear regression (n=1,928) indicated a statistically significant effect of the 5S intervention, represented by an increase of 0.19 points in the client satisfaction scores in the intervention group, 6 to 8 months after the intervention (p=0.014). Additional regression analyses showed significant score increases of 0.44 (p=0.002), 0.14 (p=0.002), 0.06 (p=0.019), and 0.17 (p=0.044) points on four items, which, respectively were healthcare staff members’ communication, explanations about illnesses or cases, and consultation duration, and clients’ overall satisfaction. Conclusions The 5S has the potential to improve client satisfaction at resource-poor health facilities and could therefore be recommended as a strategic option for improving the quality of healthcare service in low- and middle-income countries. 
To explore more effective intervention modalities, further studies need to address the mechanisms by which 5S leads to attitude changes in healthcare staff. PMID:27900932
Tolerance of ciliated protozoan Paramecium bursaria (Protozoa, Ciliophora) to ammonia and nitrites
NASA Astrophysics Data System (ADS)
Xu, Henglong; Song, Weibo; Lu, Lu; Alan, Warren
2005-09-01
The tolerance of the freshwater ciliate Paramecium bursaria to ammonia and nitrites was measured in a conventional open system. The ciliate was exposed to different concentrations of ammonia and nitrites for 2 h and 12 h in order to determine the lethal concentrations. Linear regression analysis revealed that the 2 h-LC50 value for ammonia was 95.94 mg/L and for nitrite 27.35 mg/L using the probit scale method (with 95% confidence intervals). There was a linear correlation between the mortality probit scale and the logarithmic concentration of ammonia, fitted by the regression equation y = 7.32x - 9.51 (R² = 0.98; y, mortality probit scale; x, logarithmic concentration of ammonia), by which the 2 h-LC50 value for ammonia was found to be 95.50 mg/L. The linear correlation between the mortality probit scale and the logarithmic concentration of nitrite followed the regression equation y = 2.86x + 0.89 (R² = 0.95; y, mortality probit scale; x, logarithmic concentration of nitrite). The regression analysis of toxicity curves showed that the linear correlation between the exposure time for the ammonia-N LC50 value and the ammonia-N LC50 value followed the regression equation y = 2862.85 e^(-0.08x) (R² = 0.95; y, duration of exposure to LC50 value; x, LC50 value), and that between the exposure time for the nitrite-N LC50 value and the nitrite-N LC50 value followed the regression equation y = 127.15 e^(-0.13x) (R² = 0.91; y, exposure time of LC50 value; x, LC50 value). The results demonstrate that the tolerance to ammonia in P. bursaria is considerably higher than that of the larvae or juveniles of some metazoa, e.g. cultured prawns and oysters. In addition, ciliates, as bacterial predators, are likely to play a positive role in maintaining and improving water quality in aquatic environments with high-level ammonium, such as sewage treatment systems.
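The way an LC50 is read off a fitted probit line can be sketched as follows. This is a minimal illustration rather than the authors' procedure: it assumes the logarithm is base 10 and that 50% mortality corresponds to probit 5, and it takes the ammonia intercept as -9.51, the sign under which the line reproduces an LC50 close to the reported 95.50 mg/L. The function name is hypothetical.

```python
def lc50_from_probit_line(slope, intercept, probit=5.0):
    """Concentration at which the fitted line y = slope * log10(c) + intercept
    reaches the given probit (probit 5 is assumed to mean 50% mortality)."""
    log10_conc = (probit - intercept) / slope
    return 10.0 ** log10_conc


# Nitrite line as printed in the abstract: y = 2.86x + 0.89
nitrite_lc50 = lc50_from_probit_line(2.86, 0.89)

# Ammonia line with an assumed negative intercept: y = 7.32x - 9.51
ammonia_lc50 = lc50_from_probit_line(7.32, -9.51)

print(f"nitrite 2 h-LC50 ~ {nitrite_lc50:.2f} mg/L")
print(f"ammonia 2 h-LC50 ~ {ammonia_lc50:.2f} mg/L")
```

Both values land within about 0.5 mg/L of the figures quoted in the abstract, which is what rounding of the printed coefficients would be expected to allow.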
Linear growth trajectories in Zimbabwean infants
Gough, Ethan K; Moodie, Erica EM; Prendergast, Andrew J; Ntozini, Robert; Moulton, Lawrence H; Humphrey, Jean H; Manges, Amee R
2016-01-01
Background: Undernutrition in early life underlies 45% of child deaths globally. Stunting malnutrition (suboptimal linear growth) also has long-term negative effects on childhood development. Linear growth deficits accrue in the first 1000 d of life. Understanding the patterns and timing of linear growth faltering or recovery during this period is critical to inform interventions to improve infant nutritional status. Objective: We aimed to identify the pattern and determinants of linear growth trajectories from birth through 24 mo of age in a cohort of Zimbabwean infants. Design: We performed a secondary analysis of longitudinal data from a subset of 3338 HIV-unexposed infants in the Zimbabwe Vitamin A for Mothers and Babies trial. We used k-means clustering for longitudinal data to identify linear growth trajectories and multinomial logistic regression to identify covariates that were associated with each trajectory group. Results: For the entire population, the mean length-for-age z score declined from −0.6 to −1.4 between birth and 24 mo of age. Within the population, 4 growth patterns were identified that were each characterized by worsening linear growth restriction but varied in the timing and severity of growth declines. In our multivariable model, 1-U increments in maternal height and education and infant birth weight and length were associated with greater relative odds of membership in the least–growth restricted groups (A and B) and reduced odds of membership in the more–growth restricted groups (C and D). Male infant sex was associated with reduced odds of membership in groups A and B but with increased odds of membership in groups C and D. Conclusion: In this population, all children were experiencing growth restriction but differences in magnitude were influenced by maternal height and education and infant sex, birth weight, and birth length, which suggest that key determinants of linear growth may already be established by the time of birth. 
This trial was registered at clinicaltrials.gov as NCT00198718. PMID:27806980
GWAS with longitudinal phenotypes: performance of approximate procedures
Sikorska, Karolina; Montazeri, Nahid Mostafavi; Uitterlinden, André; Rivadeneira, Fernando; Eilers, Paul HC; Lesaffre, Emmanuel
2015-01-01
Analysis of genome-wide association studies with longitudinal data using standard procedures, such as linear mixed model (LMM) fitting, leads to discouragingly long computation times. There is a need to speed up the computations significantly. In our previous work (Sikorska et al: Fast linear mixed model computations for genome-wide association studies with longitudinal data. Stat Med 2012; 32.1: 165–180), we proposed the conditional two-step (CTS) approach as a fast method providing an approximation to the P-value for the longitudinal single-nucleotide polymorphism (SNP) effect. In the first step a reduced conditional LMM is fit, omitting all the SNP terms. In the second step, the estimated random slopes are regressed on SNPs. The CTS has been applied to the bone mineral density data from the Rotterdam Study and proved to work very well even in unbalanced situations. In another article (Sikorska et al: GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies. BMC Bioinformatics 2013; 14: 166), we suggested semi-parallel computations, greatly speeding up fitting many linear regressions. Combining CTS with fast linear regression reduces the computation time from several weeks to a few minutes on a single computer. Here, we explore further the properties of the CTS both analytically and by simulations. We investigate the performance of our proposal in comparison with a related but different approach, the two-step procedure. It is analytically shown that for the balanced case, under mild assumptions, the P-value provided by the CTS is the same as from the LMM. For unbalanced data and in realistic situations, simulations show that the CTS method does not inflate the type I error rate and implies only a minimal loss of power. PMID:25712081
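The idea of fitting many linear regressions at once with matrix operations can be sketched as below. This is a simplified illustration, not the published semi-parallel algorithm: it covers only the case of one response regressed on each SNP separately with an intercept and no other covariates, and the function name and simulated data are hypothetical.

```python
import numpy as np


def many_simple_regressions(y, snps):
    """Slope and t-statistic of y ~ 1 + snp_j for every column j of `snps`,
    computed with array operations instead of a loop over SNPs."""
    n = y.shape[0]
    yc = y - y.mean()                      # centered response
    sc = snps - snps.mean(axis=0)          # centered SNP columns
    sxx = (sc ** 2).sum(axis=0)            # per-SNP sum of squares
    sxy = sc.T @ yc                        # per-SNP cross products
    slopes = sxy / sxx
    rss = (yc ** 2).sum() - slopes * sxy   # residual sum of squares per SNP
    se = np.sqrt(rss / (n - 2) / sxx)      # standard error of each slope
    return slopes, slopes / se


# Simulated genotypes (0/1/2) and a response driven by SNP number 3 only.
rng = np.random.default_rng(0)
snps = rng.integers(0, 3, size=(200, 50)).astype(float)
y = 0.5 * snps[:, 3] + rng.normal(size=200)
slopes, tstats = many_simple_regressions(y, snps)
```

In the two-step setting described above, `y` would play the role of the estimated random slopes from the first step, regressed on each SNP in turn.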
Local linear regression for function learning: an analysis based on sample discrepancy.
Cervellera, Cristiano; Macciò, Danilo
2014-11-01
Local linear regression models, a kind of nonparametric structure that locally performs a linear estimation of the target function, are analyzed in the context of empirical risk minimization (ERM) for function learning. The analysis is carried out with emphasis on geometric properties of the available data. In particular, the discrepancy of the observation points used both to build the local regression models and to compute the empirical risk is considered. This makes it possible to treat in the same way the case in which the samples come from a random external source and the one in which the input space can be freely explored. Both consistency of the ERM procedure and approximating capabilities of the estimator are analyzed, proving conditions that ensure convergence. Since the theoretical analysis shows that the estimation improves as the discrepancy of the observation points becomes smaller, low-discrepancy sequences, a family of sampling methods commonly employed for efficient numerical integration, are also analyzed. Simulation results involving two different examples of function learning are provided.
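A local linear estimate at a single query point can be sketched as a kernel-weighted least-squares line. The Gaussian kernel, the bandwidth parameter, and the function name below are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def local_linear_predict(x_train, y_train, x0, bandwidth):
    """Fit a weighted least-squares line around x0 and evaluate it at x0."""
    # Gaussian kernel weights: nearby observations count more (assumed kernel).
    w = np.exp(-0.5 * ((x_train - x0) / bandwidth) ** 2)
    # Design matrix centered at x0, so the intercept is the fitted value at x0.
    X = np.column_stack([np.ones_like(x_train), x_train - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y_train * sw, rcond=None)
    return beta[0]


# Evenly spaced observation points and a smooth target function.
x = np.linspace(0.0, 1.0, 101)
y_sine = np.sin(2.0 * np.pi * x)
estimate = local_linear_predict(x, y_sine, x0=0.25, bandwidth=0.05)
```

The evenly spaced grid stands in for a low-discrepancy design in one dimension; with randomly scattered points the same estimator applies but the local fits rest on less uniform coverage.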
Adaptive local linear regression with application to printer color management.
Gupta, Maya R; Garcia, Eric K; Chin, Erika
2008-06-01
Local learning methods, such as local linear regression and nearest neighbor classifiers, base estimates on nearby training samples, neighbors. Usually, the number of neighbors used in estimation is fixed to be a global "optimal" value, chosen by cross validation. This paper proposes adapting the number of neighbors used for estimation to the local geometry of the data, without need for cross validation. The term enclosing neighborhood is introduced to describe a set of neighbors whose convex hull contains the test point when possible. It is proven that enclosing neighborhoods yield bounded estimation variance under some assumptions. Three such enclosing neighborhood definitions are presented: natural neighbors, natural neighbors inclusive, and enclosing k-NN. The effectiveness of these neighborhood definitions with local linear regression is tested for estimating lookup tables for color management. Significant improvements in error metrics are shown, indicating that enclosing neighborhoods may be a promising adaptive neighborhood definition for other local learning tasks as well, depending on the density of training samples.
Energy expenditure estimation during daily military routine with body-fixed sensors.
Wyss, Thomas; Mäder, Urs
2011-05-01
The purpose of this study was to develop and validate an algorithm for estimating energy expenditure during the daily military routine on the basis of data collected using body-fixed sensors. First, 8 volunteers completed isolated physical activities according to an established protocol, and the resulting data were used to develop activity-class-specific multiple linear regressions for physical activity energy expenditure on the basis of hip acceleration, heart rate, and body mass as independent variables. Second, the validity of these linear regressions was tested during the daily military routine using indirect calorimetry (n = 12). Volunteers' mean estimated energy expenditure did not significantly differ from the energy expenditure measured with indirect calorimetry (p = 0.898, 95% confidence interval = -1.97 to 1.75 kJ/min). We conclude that the developed activity-class-specific multiple linear regressions applied to the acceleration and heart rate data allow estimation of energy expenditure in 1-minute intervals during daily military routine, with accuracy equal to indirect calorimetry.
Agha, Salah R; Alnahhal, Mohammed J
2012-11-01
The current study investigates the possibility of obtaining the anthropometric dimensions, critical to school furniture design, without measuring all of them. The study first selects some anthropometric dimensions that are easy to measure. Two methods are then used to check if these easy-to-measure dimensions can predict the dimensions critical to the furniture design. These methods are multiple linear regression and neural networks. Each dimension that is deemed necessary to ergonomically design school furniture is expressed as a function of some other measured anthropometric dimensions. Results show that out of the five dimensions needed for chair design, four can be related to other dimensions that can be measured while children are standing. Therefore, the method suggested here would definitely save time and effort and avoid the difficulty of dealing with students while measuring these dimensions. In general, it was found that neural networks perform better than multiple linear regression in the current study. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data
Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian; Punjabi, Naresh M.
2013-01-01
Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation allows synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis. PMID:22241689
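The classical likelihood equivalence mentioned above can be written out in a few lines. This is a sketch in notation chosen here, not taken from the paper: subject i has follow-up time t_i, event indicator d_i in {0, 1}, covariates x_i, and hazard lambda_i.

```latex
\begin{align*}
\ell_{\mathrm{exp}}(\beta)
  &= \sum_i \bigl[ d_i \log \lambda_i - \lambda_i t_i \bigr],
  \qquad \lambda_i = \exp(x_i^\top \beta), \\
\ell_{\mathrm{Pois}}(\beta)
  &= \sum_i \bigl[ d_i \log(\lambda_i t_i) - \lambda_i t_i - \log d_i! \bigr],
  \qquad \log \mu_i = x_i^\top \beta + \log t_i, \\
  &= \ell_{\mathrm{exp}}(\beta) + \sum_i \bigl[ d_i \log t_i - \log d_i! \bigr].
\end{align*}
```

The last sum does not involve beta, so treating d_i as a Poisson count with mean mu_i = lambda_i t_i (a log(time) offset) gives the same estimates and the same curvature in beta as the exponential survival likelihood.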
Lunt, Mark
2015-07-01
In the first article in this series we explored the use of linear regression to predict an outcome variable from a number of predictive factors. It assumed that the predictive factors were measured on an interval scale. However, this article shows how categorical variables can also be included in a linear regression model, enabling predictions to be made separately for different groups and allowing for testing the hypothesis that the outcome differs between groups. The use of interaction terms to measure whether the effect of a particular predictor variable differs between groups is also explained. An alternative approach to testing the difference between groups of the effect of a given predictor, which consists of measuring the effect in each group separately and seeing whether the statistical significance differs between the groups, is shown to be misleading. © The Author 2013. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
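A two-group example shows how a categorical predictor and an interaction term enter the design matrix. The data are simulated and the variable names are hypothetical; this is an illustration of the general technique, not the article's worked example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0.0, 10.0, size=n)      # continuous predictor
group = rng.integers(0, 2, size=n)      # dummy variable: 0 = reference group, 1 = other group

# Simulated outcome: the groups differ in both level (+2.0) and slope (+0.3).
y = 1.0 + 0.5 * x + 2.0 * group + 0.3 * group * x + rng.normal(scale=0.1, size=n)

# Columns: intercept, x, group dummy, group-by-x interaction.
X = np.column_stack([np.ones(n), x, group, group * x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope_ref, group_shift, slope_diff = coef

print(f"slope in reference group: {slope_ref:.3f}")
print(f"difference in slope between groups: {slope_diff:.3f}")
```

The interaction coefficient estimates the between-group difference in slope directly, so a single test on it answers the question; comparing whether each group's slope is separately significant does not.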
TG study of the Li0.4Fe2.4Zn0.2O4 ferrite synthesis
NASA Astrophysics Data System (ADS)
Lysenko, E. N.; Nikolaev, E. V.; Surzhikov, A. P.
2016-02-01
In this paper, the kinetic analysis of Li-Zn ferrite synthesis was studied using thermogravimetry (TG) method through the simultaneous application of non-linear regression to several measurements run at different heating rates (multivariate non-linear regression). Using TG-curves obtained for the four heating rates and Netzsch Thermokinetics software package, the kinetic models with minimal adjustable parameters were selected to quantitatively describe the reaction of Li-Zn ferrite synthesis. It was shown that the experimental TG-curves clearly suggest a two-step process for the ferrite synthesis and therefore a model-fitting kinetic analysis based on multivariate non-linear regressions was conducted. The complex reaction was described by a two-step reaction scheme consisting of sequential reaction steps. It is established that the best results were obtained using the Yander three-dimensional diffusion model at the first stage and Ginstling-Bronstein model at the second step. The kinetic parameters for lithium-zinc ferrite synthesis reaction were found and discussed.
A Feature-Free 30-Disease Pathological Brain Detection System by Linear Regression Classifier.
Chen, Yi; Shao, Ying; Yan, Jie; Yuan, Ti-Fei; Qu, Yanwen; Lee, Elizabeth; Wang, Shuihua
2017-01-01
The number of Alzheimer's disease patients is increasing rapidly every year, and scholars tend to use computer vision methods to develop automatic diagnosis systems. In 2015, Gorji et al. proposed a novel method using the pseudo Zernike moment. They tested four classifiers: a learning vector quantization neural network and pattern recognition neural networks trained by Levenberg-Marquardt, by resilient backpropagation, and by scaled conjugate gradient. This study presents an improved method by introducing a relatively new classifier: linear regression classification. Our method selects one axial slice from the 3D brain image and employs the pseudo Zernike moment with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is used as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Therefore, it can be used to detect Alzheimer's disease. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Sebert Kuhlmann, Anne K.; Thomas, Deborah; R. Sain, Stephan
2009-01-01
Objectives. We examined patterns of pedestrian–motor vehicle collisions and associated environmental characteristics in Denver, Colorado. Methods. We integrated publicly available data on motor vehicle collisions, liquor licenses, land use, and sociodemographic characteristics to analyze spatial patterns and other characteristics of collisions involving pedestrians. We developed both linear and spatially weighted regression models of these collisions. Results. Spatial analysis revealed global clustering of pedestrian–motor vehicle collisions with concentrations in downtown, in a contiguous neighborhood, and along major arterial streets. Walking to work, population density, and liquor license outlet density all contributed significantly to both linear and spatial models of collisions involving pedestrians and were each significantly associated with these collisions. Conclusions. These models, constructed with data from Denver, identified conditions that likely contribute to patterns of pedestrian–motor vehicle collisions. Should these models be verified elsewhere, they will have implications for future research directions, public policy to enhance pedestrian safety, and public health programs aimed at decreasing unintentional injury from pedestrian–motor vehicle collisions and promoting walking as a routine physical activity. PMID:19608966
Missing Data in Clinical Studies: Issues and Methods
Ibrahim, Joseph G.; Chu, Haitao; Chen, Ming-Hui
2012-01-01
Missing data are a prevailing problem in any type of data analyses. A participant variable is considered missing if the value of the variable (outcome or covariate) for the participant is not observed. In this article, various issues in analyzing studies with missing data are discussed. Particularly, we focus on missing response and/or covariate data for studies with discrete, continuous, or time-to-event end points in which generalized linear models, models for longitudinal data such as generalized linear mixed effects models, or Cox regression models are used. We discuss various classifications of missing data that may arise in a study and demonstrate in several situations that the commonly used method of throwing out all participants with any missing data may lead to incorrect results and conclusions. The methods described are applied to data from an Eastern Cooperative Oncology Group phase II clinical trial of liver cancer and a phase III clinical trial of advanced non–small-cell lung cancer. Although the main area of application discussed here is cancer, the issues and methods we discuss apply to any type of study. PMID:22649133
[Influence of humidex on incidence of bacillary dysentery in Hefei: a time-series study].
Zhang, H; Zhao, K F; He, R X; Zhao, D S; Xie, M Y; Wang, S S; Bai, L J; Cheng, Q; Zhang, Y W; Su, H
2017-11-10
Objective: To investigate the effect of humidex, which combines mean temperature and relative humidity, on the incidence of bacillary dysentery in Hefei. Methods: Daily counts of bacillary dysentery cases and weather data in Hefei were collected from January 1, 2006 to December 31, 2013. The humidex was then calculated from temperature and relative humidity. A Poisson generalized linear regression combined with a distributed lag non-linear model was applied to analyze the relationship between humidex and the incidence of bacillary dysentery, after adjusting for long-term and seasonal trends, day of week and other weather confounders. Stratified analyses by gender, age and address were also conducted. Results: The risk of bacillary dysentery increased with the rise of humidex. The adverse effect of high humidex (90th percentile of humidex) appeared at a 2-day lag and was largest at a 4-day lag (RR=1.063, 95% CI: 1.037-1.090). Subgroup analyses indicated that all groups were affected by high humidex at lags of 2-5 days. Conclusion: High humidex could significantly increase the risk of bacillary dysentery, and lagged effects were observed.
Prospective Associations Among Assets and Successful Transition to Early Adulthood
Vesely, Sara K.; Aspy, Cheryl B.; Tolma, Eleni L.
2015-01-01
Objectives. We investigated prospective associations among assets (e.g., family communication), which research has shown to protect youths from risk behavior, and successful transition to early adulthood (STEA). Methods. We included participants (n = 651) aged 18 years and older at study wave 5 (2007–2008) of the Youth Asset Study, in the Oklahoma City, Oklahoma, metro area, in the analyses. We categorized 14 assets into individual-, family-, or community-level groups. We included asset groups assessed at wave 1 (2003–2004) in linear regression analyses to predict STEA 4 years later at wave 5. Results. Individual- and community-level assets significantly (P < .05) predicted STEA 4 years later and the associations were generally linear, indicating that the more assets participants possessed the better the STEA outcome. There was a gender interaction for family-level assets suggesting that family-level assets were significant predictors of STEA for males but not for females. Conclusions. Public health programming should focus on community- and family-level youth assets as well as individual-level youth assets to promote positive health outcomes in early adulthood. PMID:25393188
Reasons for Hierarchical Linear Modeling: A Reminder.
ERIC Educational Resources Information Center
Wang, Jianjun
1999-01-01
Uses examples of hierarchical linear modeling (HLM) at local and national levels to illustrate proper applications of HLM and dummy variable regression. Raises cautions about the circumstances under which hierarchical data do not need HLM. (SLD)
BIODEGRADATION PROBABILITY PROGRAM (BIODEG)
The Biodegradation Probability Program (BIODEG) calculates the probability that a chemical under aerobic conditions with mixed cultures of microorganisms will biodegrade rapidly or slowly. It uses fragment constants developed using multiple linear and non-linear regressions and d...
The Use of Linear Programming for Prediction.
ERIC Educational Resources Information Center
Schnittjer, Carl J.
The purpose of the study was to develop a linear programming model to be used for prediction, test the accuracy of the predictions, and compare the accuracy with that produced by curvilinear multiple regression analysis. (Author)
Determining Predictor Importance in Hierarchical Linear Models Using Dominance Analysis
ERIC Educational Resources Information Center
Luo, Wen; Azen, Razia
2013-01-01
Dominance analysis (DA) is a method used to evaluate the relative importance of predictors that was originally proposed for linear regression models. This article proposes an extension of DA that allows researchers to determine the relative importance of predictors in hierarchical linear models (HLM). Commonly used measures of model adequacy in…
Independent contrasts and PGLS regression estimators are equivalent.
Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary
2012-05-01
We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.
Pestaña-Melero, Francisco Luis; Haff, G Gregory; Rojas, Francisco Javier; Pérez-Castilla, Alejandro; García-Ramos, Amador
2017-12-18
This study aimed to compare the between-session reliability of the load-velocity relationship between (1) linear vs. polynomial regression models, (2) concentric-only vs. eccentric-concentric bench press variants, as well as (3) the within-participants vs. the between-participants variability of the velocity attained at each percentage of the one-repetition maximum (%1RM). The load-velocity relationship of 30 men (age: 21.2±3.8 y; height: 1.78±0.07 m, body mass: 72.3±7.3 kg; bench press 1RM: 78.8±13.2 kg) were evaluated by means of linear and polynomial regression models in the concentric-only and eccentric-concentric bench press variants in a Smith Machine. Two sessions were performed with each bench press variant. The main findings were: (1) first-order-polynomials (CV: 4.39%-4.70%) provided the load-velocity relationship with higher reliability than second-order-polynomials (CV: 4.68%-5.04%); (2) the reliability of the load-velocity relationship did not differ between the concentric-only and eccentric-concentric bench press variants; (3) the within-participants variability of the velocity attained at each %1RM was markedly lower than the between-participants variability. Taken together, these results highlight that, regardless of the bench press variant considered, the individual determination of the load-velocity relationship by a linear regression model could be recommended to monitor and prescribe the relative load in the Smith machine bench press exercise.
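Fitting first- and second-order polynomials to one lifter's load-velocity data can be sketched with `numpy.polyfit`. The data points below are hypothetical values invented for illustration, not measurements from the study, and the helper name is an assumption.

```python
import numpy as np

# Hypothetical load-velocity data for one participant (illustrative values only).
pct_1rm = np.array([30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0])     # relative load, %1RM
velocity = np.array([1.30, 1.13, 0.97, 0.80, 0.63, 0.47, 0.31])    # mean velocity, m/s

# Model the relative load as a function of velocity.
linear = np.polyfit(velocity, pct_1rm, deg=1)       # first-order polynomial
quadratic = np.polyfit(velocity, pct_1rm, deg=2)    # second-order polynomial


def predict_pct_1rm(coeffs, v):
    """Relative load (%1RM) predicted from a measured velocity."""
    return np.polyval(coeffs, v)


est_linear = predict_pct_1rm(linear, 0.70)
est_quadratic = predict_pct_1rm(quadratic, 0.70)
```

Reliability in the sense used above would then be assessed by repeating the fit in a second session and comparing the velocities each model assigns to the same %1RM, which this sketch does not do.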
NASA Astrophysics Data System (ADS)
Jiang, Weiping; Ma, Jun; Li, Zhao; Zhou, Xiaohui; Zhou, Boye
2018-05-01
Analyzing the correlations between the noise in different components of GPS stations is valuable for obtaining a more accurate uncertainty of the station velocity. Previous research into noise in GPS position time series focused mainly on single component evaluation, which affects the acquisition of precise station positions, the velocity field, and its uncertainty. In this study, before and after removing the common-mode error (CME), we performed one-dimensional linear regression analysis of the noise amplitude vectors in different components of 126 GPS stations in Southern California, using a combination of white noise, flicker noise, and random walk noise. The results show that, on the one hand, there are above-moderate degrees of correlation between the white noise amplitude vectors in all components of the stations before and after removal of the CME, while the correlations between flicker noise amplitude vectors in horizontal and vertical components are enhanced from uncorrelated to moderately correlated by removing the CME. On the other hand, the significance tests show that all of the obtained linear regression equations, which represent a unique function of the noise amplitude in any two components, are of practical value after removing the CME. According to the noise amplitude estimates in two components and the linear regression equations, more accurate noise amplitudes can be acquired in the two components.
NASA Astrophysics Data System (ADS)
Kuchar, A.; Sacha, P.; Miksovsky, J.; Pisoft, P.
2015-06-01
This study focusses on the variability of temperature, ozone and circulation characteristics in the stratosphere and lower mesosphere with regard to the influence of the 11-year solar cycle. It is based on attribution analysis using multiple nonlinear techniques (support vector regression, neural networks) besides the multiple linear regression approach. The analysis was applied to several current reanalysis data sets for the 1979-2013 period, including MERRA, ERA-Interim and JRA-55, with the aim to compare how these types of data resolve especially the double-peaked solar response in temperature and ozone variables and the consequent changes induced by these anomalies. Equatorial temperature signals in the tropical stratosphere were found to be in qualitative agreement with previous attribution studies, although the agreement with observational results was incomplete, especially for JRA-55. The analysis also pointed to the solar signal in the ozone data sets (i.e. MERRA and ERA-Interim) not being consistent with the observed double-peaked ozone anomaly extracted from satellite measurements. The results obtained by linear regression were confirmed by the nonlinear approach through all data sets, suggesting that linear regression is a relevant tool to sufficiently resolve the solar signal in the middle atmosphere. The seasonal evolution of the solar response was also discussed in terms of dynamical causalities in the winter hemispheres. The hypothetical mechanism of a weaker Brewer-Dobson circulation at solar maxima was reviewed together with a discussion of polar vortex behaviour.
Predicting birth weight with conditionally linear transformation models.
Möst, Lisa; Schmid, Matthias; Faschingbauer, Florian; Hothorn, Torsten
2016-12-01
Low and high birth weight (BW) are important risk factors for neonatal morbidity and mortality. Gynecologists must therefore accurately predict BW before delivery. Most prediction formulas for BW are based on prenatal ultrasound measurements carried out within one week prior to birth. Although successfully used in clinical practice, these formulas focus on point predictions of BW but do not systematically quantify uncertainty of the predictions, i.e. they result in estimates of the conditional mean of BW but do not deliver prediction intervals. To overcome this problem, we introduce conditionally linear transformation models (CLTMs) to predict BW. Instead of focusing only on the conditional mean, CLTMs model the whole conditional distribution function of BW given prenatal ultrasound parameters. Consequently, the CLTM approach delivers both point predictions of BW and fetus-specific prediction intervals. Prediction intervals constitute an easy-to-interpret measure of prediction accuracy and allow identification of fetuses subject to high prediction uncertainty. Using a data set of 8712 deliveries at the Perinatal Centre at the University Clinic Erlangen (Germany), we analyzed variants of CLTMs and compared them to standard linear regression estimation techniques used in the past and to quantile regression approaches. The best-performing CLTM variant was competitive with quantile regression and linear regression approaches in terms of conditional coverage and average length of the prediction intervals. We propose that CLTMs be used because they are able to account for possible heteroscedasticity, kurtosis, and skewness of the distribution of BWs. © The Author(s) 2014.
Krasikova, Dina V; Le, Huy; Bachura, Eric
2018-06-01
To address a long-standing concern regarding a gap between organizational science and practice, scholars called for more intuitive and meaningful ways of communicating research results to users of academic research. In this article, we develop a common language effect size index (CLβ) that can help translate research results to practice. We demonstrate how CLβ can be computed and used to interpret the effects of continuous and categorical predictors in multiple linear regression models. We also elaborate on how the proposed CLβ index is computed and used to interpret interactions and nonlinear effects in regression models. In addition, we test the robustness of the proposed index to violations of normality and provide means for computing standard errors and constructing confidence intervals around its estimates. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional and unconditional logistic regression are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to the main-effect model. However, the generalized linear model differs from the general linear model, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction, and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta, and profile likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions.
Multiple regression for physiological data analysis: the problem of multicollinearity.
Slinker, B K; Glantz, S A
1985-07-01
Multiple linear regression, in which several predictor variables are related to a response variable, is a powerful statistical tool for gaining quantitative insight into complex in vivo physiological systems. For these insights to be correct, all predictor variables must be uncorrelated. However, in many physiological experiments the predictor variables cannot be precisely controlled and thus change in parallel (i.e., they are highly correlated). There is a redundancy of information about the response, a situation called multicollinearity, that leads to numerical problems in estimating the parameters in regression equations; the parameters are often of incorrect magnitude or sign or have large standard errors. Although multicollinearity can be avoided with good experimental design, not all interesting physiological questions can be studied without encountering multicollinearity. In these cases various ad hoc procedures have been proposed to mitigate multicollinearity. Although many of these procedures are controversial, they can be helpful in applying multiple linear regression to some physiological problems.
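As a hedged illustration of the redundancy described above, the following Python sketch computes a variance inflation factor (VIF) for the two-predictor case, where VIF = 1 / (1 - r^2) and r is the correlation between the two predictors. The VIF is a common multicollinearity diagnostic but is not named in the abstract; the data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values (not taken from the abstract).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_parallel = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]    # changes in parallel with x1
x2_unrelated = [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]  # constructed to be uncorrelated with x1

vif_parallel = vif_two_predictors(x1, x2_parallel)
vif_unrelated = vif_two_predictors(x1, x2_unrelated)
print(vif_parallel, vif_unrelated)
```

A VIF near 1 means the predictor carries independent information; a large VIF means the coefficient's variance is inflated, which matches the large standard errors the abstract describes.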
Schistosomiasis Breeding Environment Situation Analysis in Dongting Lake Area
NASA Astrophysics Data System (ADS)
Li, Chuanrong; Jia, Yuanyuan; Ma, Lingling; Liu, Zhaoyan; Qian, Yonggang
2013-01-01
Monitoring environmental characteristics, such as vegetation and soil moisture, relevant to the spatial and temporal distribution of Oncomelania hupensis (O. hupensis) is of vital importance to schistosomiasis prevention and control. In this study, the relationship between environmental factors derived from remotely sensed data and the density of O. hupensis was first analyzed by a multiple linear regression model. Secondly, the spatial structure of the regression residual was investigated by the semi-variogram method. Thirdly, spatial analysis of the regression residual and the multiple linear regression model were both employed to estimate the spatial variation of O. hupensis density. Finally, the approach was used to monitor and predict the spatial and temporal variations of Oncomelania in the Dongting Lake region, China. The areas of potential O. hupensis habitats were predicted, and the influence of the Three Gorges Dam (TGB) project on the density of O. hupensis was analyzed.
NASA Astrophysics Data System (ADS)
Yadav, Manish; Singh, Nitin Kumar
2017-12-01
Linear and non-linear regression methods were compared for selecting the optimum isotherm among the three most commonly used adsorption isotherms (Langmuir, Freundlich, and Redlich-Peterson), using experimental data for fluoride (F) sorption onto Bio-F at a solution temperature of 30 ± 1 °C. The coefficient of determination (r²) was used to select the best theoretical isotherm among those investigated. A total of four Langmuir linear equations were discussed, of which the most popular linear forms, Langmuir-1 and Langmuir-2, showed higher coefficients of determination (0.976 and 0.989) than the other Langmuir linear equations. Freundlich and Redlich-Peterson isotherms showed a better fit to the experimental data with the linear least-squares method, while with the non-linear method the Redlich-Peterson isotherm equation showed the best fit to the tested data set. The present study showed that the non-linear method could be a better way to obtain the isotherm parameters and to identify the most suitable isotherm. The Redlich-Peterson isotherm was found to be the best representative (r² = 0.999) for this sorption system. It was also observed that the values of β are not close to unity, which means the isotherms approach the Freundlich rather than the Langmuir isotherm.
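To make the idea of a linearized isotherm concrete, here is a minimal Python sketch of two Langmuir linearizations fitted by ordinary least squares. It assumes the forms commonly labelled type 1 (Ce/qe against Ce) and type 2 (1/qe against 1/Ce); the abstract does not spell out its four equations, and the concentrations and parameter values below are hypothetical, noise-free numbers rather than the Bio-F data.

```python
def ols(x, y):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def langmuir(ce, qm, kl):
    """Langmuir isotherm: qe = qm * KL * Ce / (1 + KL * Ce)."""
    return qm * kl * ce / (1.0 + kl * ce)


def fit_type1(ce, qe):
    """Ce/qe = Ce/qm + 1/(KL*qm): slope = 1/qm, intercept = 1/(KL*qm)."""
    slope, intercept = ols(ce, [c / q for c, q in zip(ce, qe)])
    return 1.0 / slope, slope / intercept  # (qm, KL)


def fit_type2(ce, qe):
    """1/qe = 1/qm + (1/(KL*qm)) * (1/Ce): slope = 1/(KL*qm), intercept = 1/qm."""
    slope, intercept = ols([1.0 / c for c in ce], [1.0 / q for q in qe])
    return 1.0 / intercept, intercept / slope  # (qm, KL)


# Hypothetical equilibrium concentrations and parameters (illustration only).
ce = [0.5, 1.0, 2.0, 4.0, 8.0]
qe = [langmuir(c, qm=10.0, kl=0.5) for c in ce]

qm1, kl1 = fit_type1(ce, qe)
qm2, kl2 = fit_type2(ce, qe)
print(qm1, kl1, qm2, kl2)
```

On noise-free data both forms recover the same parameters. With measurement error they generally do not, because each transformation weights the errors differently, which is one reason a non-linear fit that minimizes the error in qe directly can be preferable.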
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used to diagnose multicollinearity, the basic principle of principal component regression, and a method for determining the 'best' equation. It uses an example to describe how to perform principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable, and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. With SPSS, the analysis becomes simpler, faster, and accurate.
Kim, Jiwon; Srinivasan, Aparna; Garcia, Tania S.; Franco, Antonino Di; Peskin, Charles S.; McQueen, David M.; Paul, Tracy K.; Feher, Attila; Geevarghese, Alexi; Rozenstrauch, Meenakshi; Devereux, Richard B.; Weinsaft, Jonathan W.
2016-01-01
Background Echo-derived linear dimensions offer straightforward indices of right ventricular (RV) structure but have not been systematically compared to RV volumes on cardiac magnetic resonance (CMR). Methods Echo and CMR were interpreted among CAD patients imaged via prospective (90%) or retrospective (10%) registries. For echo, American Society of Echocardiography (ASE) recommended RV dimensions were measured in apical 4-chamber (basal RV width, mid RV width, RV length), parasternal long (proximal RV outflow tract [pRVOT]) and short axis (distal RVOT) views. For CMR, RV end-diastolic (RV-EDV) and end-systolic (RV-ESV) volumes were quantified via border planimetry. Results 272 patients underwent echo and CMR within a narrow interval (0.4±1.0 days); complete acquisition of all ASE dimensions was feasible in 98%. All echo dimensions differed between patients with and without RV dilation on CMR (p<0.05). Basal RV width (r=0.70), pRVOT width (r=0.68), and RV length (r=0.61) yielded highest correlations with RV-EDV on CMR; end-systolic dimensions yielded similar correlations (r=0.68, 0.66, 0.65 respectively). In multivariable regression, basal RV width (regression coefficient 1.96 per mm [CI 1.22–2.70], p<0.001), RV length (0.97 [0.56–1.37], p<0.001) and pRVOT width (2.62 [1.79–3.44], p<0.001) were independently associated with CMR RV-EDV [r=0.80]. RV-ESV was similarly associated with echo dimensions (basal RV width: 1.59 per mm [CI 1.06–2.13], p<0.001; RV length: 1.00 [0.66–1.34], p<0.001; pRVOT width: 1.80 [1.22–2.39], p<0.001) [r=0.79]. Conclusions RV linear dimensions provide readily obtainable markers of RV chamber size. Proximal RVOT and basal width are independently associated with CMR volumes, supporting use of multiple linear dimensions when assessing RV size on echo. PMID:27297619
Christensen, Jeppe Schultz; Raaschou-Nielsen, Ole; Tjønneland, Anne; Overvad, Kim; Nordsborg, Rikke B.; Ketzel, Matthias; Sørensen, Thorkild IA; Sørensen, Mette
2015-01-01
Background Traffic noise has been associated with cardiovascular and metabolic disorders. Potential modes of action are through stress and sleep disturbance, which may lead to endocrine dysregulation and overweight. Objectives We aimed to investigate the relationship between residential road traffic and railway noise and adiposity. Methods In this cross-sectional study of 57,053 middle-aged people, height, weight, waist circumference, and bioelectrical impedance were measured at enrollment (1993–1997). Body mass index (BMI), body fat mass index (BFMI), and lean body mass index (LBMI) were calculated. Residential exposure to road and railway traffic noise was calculated using the Nordic prediction method. Associations between traffic noise and anthropometric measures at enrollment were analyzed using general linear models and logistic regression adjusted for demographic and lifestyle factors. Results Linear regression models adjusted for age, sex, and socioeconomic factors showed that 5-year mean road traffic noise exposure preceding enrollment was associated with a 0.35-cm wider waist circumference (95% CI: 0.21, 0.50) and a 0.18-point higher BMI (95% CI: 0.12, 0.23) per 10 dB. Small, significant increases were also found for BFMI and LBMI. All associations followed linear exposure–response relationships. Exposure to railway noise was not linearly associated with adiposity measures. However, exposure > 60 dB was associated with a 0.71-cm wider waist circumference (95% CI: 0.23, 1.19) and a 0.19-point higher BMI (95% CI: 0.0072, 0.37) compared with unexposed participants (0–20 dB). Conclusions The present study finds positive associations between residential exposure to road traffic and railway noise and adiposity. Citation Christensen JS, Raaschou-Nielsen O, Tjønneland A, Overvad K, Nordsborg RB, Ketzel M, Sørensen TI, Sørensen M. 2016.
Road traffic and railway noise exposures and adiposity in adults: a cross-sectional analysis of the Danish Diet, Cancer, and Health cohort. Environ Health Perspect 124:329–335; http://dx.doi.org/10.1289/ehp.1409052 PMID:26241990
A New Test of Linear Hypotheses in OLS Regression under Heteroscedasticity of Unknown Form
ERIC Educational Resources Information Center
Cai, Li; Hayes, Andrew F.
2008-01-01
When the errors in an ordinary least squares (OLS) regression model are heteroscedastic, hypothesis tests involving the regression coefficients can have Type I error rates that are far from the nominal significance level. Asymptotically, this problem can be rectified with the use of a heteroscedasticity-consistent covariance matrix (HCCM)…
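For orientation, the sketch below computes the slope of a simple OLS regression together with its classical standard error and the HC0 (White) heteroscedasticity-consistent standard error. It is a generic illustration of one HCCM variant, not the new test the authors propose, and the data are hypothetical values whose error spread grows with x.

```python
from math import sqrt


def slope_with_standard_errors(x, y):
    """Return (slope, classical SE, HC0 SE) for y = b0 + b1 * x fitted by OLS."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    sxx = sum(d * d for d in dx)
    b1 = sum(d * (b - my) for d, b in zip(dx, y)) / sxx
    b0 = my - b1 * mx
    resid = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    # Classical SE assumes one common error variance, estimated with n - 2 df.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = sqrt(s2 / sxx)
    # HC0 replaces the common variance with each observation's squared residual.
    se_hc0 = sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return b1, se_classical, se_hc0


# Hypothetical data: y = x plus errors that widen as x increases.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
errors = [0.1, -0.1, 0.3, -0.3, 0.8, -0.8, 1.5, -1.5]
y = [a + e for a, e in zip(x, errors)]

b1, se_classical, se_hc0 = slope_with_standard_errors(x, y)
print(b1, se_classical, se_hc0)
```

HC0 is known to be biased downward in small samples, which is the usual motivation for small-sample variants such as HC3.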
Deriving the Regression Equation without Using Calculus
ERIC Educational Resources Information Center
Gordon, Sheldon P.; Gordon, Florence S.
2004-01-01
Probably the one "new" mathematical topic that is most responsible for modernizing courses in college algebra and precalculus over the last few years is the idea of fitting a function to a set of data in the sense of a least squares fit. Whether it be simple linear regression or nonlinear regression, this topic opens the door to applying the…
Relationships of Measurement Error and Prediction Error in Observed-Score Regression
ERIC Educational Resources Information Center
Moses, Tim
2012-01-01
The focus of this paper is assessing the impact of measurement errors on the prediction error of an observed-score regression. Measures are presented and described for decomposing the linear regression's prediction error variance into parts attributable to the true score variance and the error variances of the dependent variable and the predictor…
Optimized multiple linear mappings for single image super-resolution
NASA Astrophysics Data System (ADS)
Zhang, Kaibing; Li, Jie; Xiong, Zenggang; Liu, Xiuping; Gao, Xinbo
2017-12-01
Learning piecewise linear regression has been recognized in the literature as an effective approach to example learning-based single image super-resolution (SR). In this paper, we employ an expectation-maximization (EM) algorithm to further improve the SR performance of our previous multiple linear mappings (MLM) based SR method. In the training stage, the proposed method starts with a set of linear regressors obtained by the MLM-based method, and then jointly optimizes the clustering results and the low- and high-resolution subdictionary pairs for regression functions by using the metric of the reconstruction errors. In the test stage, we select the optimal regressor for SR reconstruction by accumulating the reconstruction errors of m-nearest neighbors in the training set. Thorough experiments carried out on six publicly available datasets demonstrate that the proposed SR method can yield high-quality images with finer details and sharper edges in terms of both quantitative and perceptual image quality assessments.
Equilibrium, kinetics and process design of acid yellow 132 adsorption onto red pine sawdust.
Can, Mustafa
2015-01-01
Linear and non-linear regression procedures have been applied to the Langmuir, Freundlich, Tempkin, Dubinin-Radushkevich, and Redlich-Peterson isotherms for adsorption of acid yellow 132 (AY132) dye onto red pine (Pinus resinosa) sawdust. The effects of parameters such as particle size, stirring rate, contact time, dye concentration, adsorption dose, pH, and temperature were investigated, and interaction was characterized by Fourier transform infrared spectroscopy and field emission scanning electron microscope. The non-linear method of the Langmuir isotherm equation was found to be the best fitting model to the equilibrium data. The maximum monolayer adsorption capacity was found as 79.5 mg/g. The calculated thermodynamic results suggested that AY132 adsorption onto red pine sawdust was an exothermic, physisorption, and spontaneous process. Kinetics was analyzed by four different kinetic equations using non-linear regression analysis. The pseudo-second-order equation provides the best fit with experimental data.
Carbon dioxide stripping in aquaculture -- part III: model verification
Colt, John; Watten, Barnaby; Pfeiffer, Tim
2012-01-01
Based on conventional mass transfer models developed for oxygen, the non-linear ASCE method, the 2-point method, and the one-parameter linear-regression method were evaluated for carbon dioxide stripping data. For values of KLaCO2 < approximately 1.5/h, the 2-point or ASCE method gives a good fit to experimental data, but the fit breaks down at higher values of KLaCO2. How to correct KLaCO2 for gas phase enrichment remains to be determined. The one-parameter linear-regression model was used to vary the C*CO2 over the test, but it did not result in a better fit to the experimental data when compared to the ASCE or fixed C*CO2 assumptions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Viani, Gustavo Arruda; Stefano, Eduardo Jose; Afonso, Sergio Luis
2009-08-01
Purpose: To determine in a meta-analysis whether the outcomes in men with localized prostate cancer treated with high-dose radiotherapy (HDRT) are better than those in men treated with conventional-dose radiotherapy (CDRT), by quantifying the effect of the total dose of radiotherapy on biochemical control (BC). Methods and Materials: The MEDLINE, EMBASE, CANCERLIT, and Cochrane Library databases, as well as the proceedings of annual meetings, were systematically searched to identify randomized, controlled studies comparing HDRT with CDRT for localized prostate cancer. To evaluate the dose-response relationship, we conducted a meta-regression analysis of BC ratios by means of weighted linear regression. Results: Seven RCTs with a total patient population of 2812 were identified that met the study criteria. Pooled results from these RCTs showed a significant reduction in the incidence of biochemical failure in those patients with prostate cancer treated with HDRT (p < 0.0001). However, there was no difference in the mortality rate (p = 0.38) and specific prostate cancer mortality rates (p = 0.45) between the groups receiving HDRT and CDRT, and there were more cases of late Grade >2 gastrointestinal toxicity after HDRT than after CDRT. In the subgroup analysis, patients classified as being at low (p = 0.007), intermediate (p < 0.0001), and high risk (p < 0.0001) of biochemical failure all showed a benefit from HDRT. The meta-regression analysis also detected a linear correlation between the total dose of radiotherapy and biochemical failure (BC = -67.3 + [1.8 x radiotherapy total dose in Gy]; p = 0.04). Conclusions: Our meta-analysis showed that HDRT is superior to CDRT in preventing biochemical failure in low-, intermediate-, and high-risk prostate cancer patients, suggesting that this should be offered as a treatment for all patients, regardless of their risk status.
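The fitted meta-regression line quoted in the abstract can be evaluated directly. The short Python sketch below does only that arithmetic; the function name and the two example doses (70 and 80 Gy) are illustrative assumptions, and BC is returned in whatever units the authors used.

```python
def predicted_bc(total_dose_gy):
    """Evaluate the reported line BC = -67.3 + 1.8 * (total dose in Gy)."""
    return -67.3 + 1.8 * total_dose_gy


# Illustrative doses only; they are not taken from the abstract.
bc_70 = predicted_bc(70.0)
bc_80 = predicted_bc(80.0)
gain_per_10_gy = bc_80 - bc_70
print(bc_70, bc_80, gain_per_10_gy)
```

Because the line is linear, the predicted change in BC per additional 10 Gy is the same at every dose, namely ten times the slope of 1.8.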
Live Donor Renal Anatomic Asymmetry and Post-Transplant Renal Function
Tanriover, Bekir; Fernandez, Sonalis; Campenot, Eric S.; Newhouse, Jeffrey H.; Oyfe, Irina; Mohan, Prince; Sandikci, Burhaneddin; Radhakrishnan, Jai; Wexler, Jennifer J.; Carroll, Maureen A.; Sharif, Sairah; Cohen, David J.; Ratner, Lloyd E.; Hardy, Mark A.
2014-01-01
Background Relationship between live donor renal anatomic asymmetry and post-transplant recipient function has not been studied extensively. Methods We analyzed 96 live-kidney donors, who had anatomical asymmetry (>10% renal length and/or volume difference calculated from CT angiograms) and their matching recipients. Split function differences (SFD) were quantified with 99mTc-DMSA renography. Implantation biopsies at time-zero were semi-quantitatively scored. A comprehensive model utilizing donor renal volume adjusted to recipient weight (Vol/Wgt), SFD, and biopsy score was used to predict recipient estimated glomerular filtration rate (eGFR) at one-year. Primary analysis consisted of a logistic regression model of outcome (odds of developing eGFR>60ml/min/1.73 m2 at one-year), a linear regression model of outcome (predicting recipient eGFR at one-year, using the CKD-EPI formula), and a Monte Carlo simulation based on the linear regression model (N=10,000 iterations). Results In the study cohort, the mean Vol/Wgt and eGFR at one-year were 2.04 ml/kg and 60.4 ml/min/1.73m2, respectively. Volume and split ratios between two donor kidneys were strongly correlated (r=0.79, p-value<0.001). The biopsy scores among SFD categories (<5%, 5–10%, >10%) were not different (p=0.190). On multivariate models, only Vol/Wgt was significantly associated with higher odds of having eGFR>60ml/min/1.73 m2 (OR=8.94, 95% CI 2.47–32.25, p=0.001) and had a strong discriminatory power in predicting the risk of eGFR<60ml/min/1.73m2 at one-year (ROC curve=0.78, 95% CI 0.68–0.89). Conclusion In the presence of donor renal anatomic asymmetry, Vol/Wgt appears to be a major determinant of recipient renal function at one-year post-transplantation. Renography can be replaced with CT volume calculation in estimating split renal function. PMID:25719258
NASA Astrophysics Data System (ADS)
Baker, R. G. V.
2005-12-01
The Internet has been publicly portrayed as a new technological horizon yielding instantaneous interaction to a point where geography no longer matters. This research aims to dispel this impression by applying a dynamic form of trip modelling to investigate pings in a global computer network compiled by the Stanford Linear Accelerator Centre (SLAC) from 1998 to 2004. Internet flows have been predicted to have the same mathematical operators as trips to a supermarket, since they are both periodic and constrained by a distance metric. Both actual and virtual trips are part of a spectrum of origin-destination pairs in the time-space convergence of trip time-lines. Internet interaction is very near to the convergence of these time-lines (at a very small time scale in milliseconds, but with interactions over thousands of kilometres). There is a lag effect and this is formalised by the derivation of Gaussian and gravity inequalities between the time taken (Δt) and the partitioning of distance (Δx). This inequality seems to be robust for a regression of Δt to Δx in the SLAC data set for each year (1998 to 2004). There is a constant ‘forbidden zone’ in the interaction, underpinned by the fact that pings do not travel faster than the speed of light. Superimposed upon this zone is the network capacity where a linear regression of Δt to Δx is a proxy summarising global Internet connectivity for that year. The results suggest that there has been a substantial improvement in connectivity over the period with R² increasing steadily from 0.39 to 0.65 from less Gaussian spreading of the ping latencies. Further, the regression line shifts towards the inequality boundary from 1998 to 2004, where the increased slope shows a greater proportional rise in local connectivity over global connectivity. A conclusion is that national geography still does matter in spatial interaction modelling of the Internet.
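The 'forbidden zone' can be pictured as the region of the (Δx, Δt) plane lying below the speed-of-light line. The Python sketch below is a simplified illustration of that lower bound using the vacuum speed of light; it is not the Gaussian or gravity inequality derived in the paper, and the function names, the `round_trip` option, and the example distance and latencies are hypothetical.

```python
SPEED_OF_LIGHT_KM_PER_S = 299792.458  # vacuum speed of light


def min_latency_ms(distance_km, round_trip=False):
    """Smallest physically possible latency (ms) over a great-circle distance."""
    legs = 2 if round_trip else 1
    return legs * distance_km / SPEED_OF_LIGHT_KM_PER_S * 1000.0


def in_forbidden_zone(distance_km, latency_ms, round_trip=False):
    """True if a (distance, latency) pair would require faster-than-light travel."""
    return latency_ms < min_latency_ms(distance_km, round_trip)


# Hypothetical observations over a 10,000 km path.
bound_ms = min_latency_ms(10000.0)
too_fast = in_forbidden_zone(10000.0, 20.0)
plausible = in_forbidden_zone(10000.0, 150.0)
print(bound_ms, too_fast, plausible)
```

Real latencies sit well above this line because signals in fibre travel slower than light in vacuum and routes are not great circles, so the gap between a fitted regression line and the boundary is one way to read network capacity.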
Scorletti, Eleonora; Bhatia, Lokpal; McCormick, Keith G; Clough, Geraldine F; Nash, Kathryn; Hodson, Leanne; Moyses, Helen E; Calder, Philip C; Byrne, Christopher D
2014-10-01
There is no licensed treatment for non-alcoholic fatty liver disease (NAFLD), a condition that increases risk of chronic liver disease, type 2 diabetes and cardiovascular disease. We tested whether 15-18 months treatment with docosahexaenoic acid (DHA) plus eicosapentaenoic acid (EPA) (Omacor/Lovaza) (4 g/day) decreased liver fat and improved two histologically-validated liver fibrosis biomarker scores (primary outcomes). Patients with NAFLD were randomised in a double blind placebo-controlled trial [DHA+EPA(n=51), placebo(n=52)]. We quantified liver fat percentage (%) by magnetic resonance spectroscopy in three liver zones. We measured liver fibrosis using two validated scores. We tested adherence to the intervention (Omacor group) and contamination (with DHA and EPA) (placebo group) by measuring erythrocyte percentage DHA and EPA enrichment (gas chromatography). We undertook multivariable linear regression to test effects of: a) DHA+EPA treatment (ITT analyses) and b) erythrocyte DHA and EPA enrichment (secondary analysis). Median (IQR) baseline and end of study liver fat% were 21.7 (19.3) and 19.7 (18.0) (placebo), and 23.0 (36.2) and 16.3 (22.0), (DHA+EPA). In the fully adjusted regression model there was a trend towards improvement in liver fat% with DHA+EPA treatment (β=-3.64 (95%CI -8.0,0.8); p=0.1) but there was evidence of contamination in the placebo group and variable adherence to the intervention in the Omacor group. Further regression analysis showed that DHA enrichment was independently associated with a decrease in liver fat% (for each 1% enrichment, β=-1.70 (95%CI -2.9,-0.5); p=0.007). No improvement in the fibrosis scores occurred. Conclusion. Erythrocyte DHA enrichment with DHA+EPA treatment is linearly associated with decreased liver fat%. Substantial decreases in liver fat% can be achieved with high percentage erythrocyte DHA enrichment in NAFLD. (Hepatology 2014;).
Enlarged perivascular spaces and cognitive impairment after stroke and transient ischemic attack.
Arba, Francesco; Quinn, Terence J; Hankey, Graeme J; Lees, Kennedy R; Wardlaw, Joanna M; Ali, Myzoon; Inzitari, Domenico
2018-01-01
Background Previous studies suggested that enlarged perivascular spaces are neuroimaging markers of cerebral small vessel disease. However, it is not clear whether enlarged perivascular spaces are associated with cognitive impairment. We aimed to determine the cross-sectional relationship between enlarged perivascular spaces and small vessel disease, and to investigate the relationship between enlarged perivascular spaces and subsequent cognitive impairment in patients with recent cerebral ischemic event. Methods Anonymized data were accessed from the virtual international stroke trial archive. We rated number of lacunes, white matter hyperintensities, brain atrophy, and enlarged perivascular spaces with validated scales on magnetic resonance brain images after the index stroke. We defined cognitive impairment as a mini mental state examination score of ≤26, recorded at one year post stroke. We examined the associations between enlarged perivascular spaces and clinical and imaging markers of small vessel disease at presentation and clinical evidence of cognitive impairment at one year using linear and logistic regression models. Results We analyzed data on 430 patients with mean (±SD) age 64.7 (±12.7) years, 276 (64%) males. In linear regression analysis, age (β = 0.24; p < 0.001), hypertension (β = 0.09; p = 0.025), and deep white matter hyperintensities (β = 0.31; p < 0.001) were associated with enlarged perivascular spaces. In logistic regression analysis, basal ganglia enlarged perivascular spaces were independently associated with cognitive impairment at one year after adjusting for clinical confounders (OR = 1.72, 95% CI = 1.22-2.42) and for clinical and imaging confounders (OR = 1.54; 95% CI = 1.03-2.31). 
Conclusions Our data show that in patients with ischemic cerebral events, enlarged perivascular spaces are cross-sectionally associated with age, hypertension, and white matter hyperintensities and suggest that enlarged perivascular spaces in the basal ganglia are associated with cognitive impairment after one year.
Volume and functional outcome of intracerebral hemorrhage according to oral anticoagulant type
Wilson, Duncan; Charidimou, Andreas; Shakeshaft, Clare; Ambler, Gareth; White, Mark; Cohen, Hannah; Yousry, Tarek; Al-Shahi Salman, Rustam; Lip, Gregory Y.H.; Brown, Martin M.; Jäger, Hans Rolf
2016-01-01
Objective: To compare intracerebral hemorrhage (ICH) volume and clinical outcome of non–vitamin K oral anticoagulants (NOAC)–associated ICH to warfarin-associated ICH. Methods: In this multicenter cross-sectional observational study of patients with anticoagulant-associated ICH, consecutive patients with NOAC-ICH were compared to those with warfarin-ICH selected from a population of 344 patients with anticoagulant-associated ICH. ICH volume was measured by an observer blinded to clinical details. Outcome measures were ICH volume and clinical outcome adjusted for confounding factors. Results: We compared 11 patients with NOAC-ICH to 52 patients with warfarin-ICH. The median ICH volume was 2.4 mL (interquartile range [IQR] 0.3–5.4 mL) for NOAC-ICH vs 8.9 mL (IQR 4.0–21.3 mL) for warfarin-ICH (p = 0.0028). In univariate linear regression, use of warfarin (difference in cube root volume 1.61; 95% confidence interval [CI] 0.69 to 2.53) and lobar ICH location (compared with nonlobar ICH; difference in cube root volume 1.52; 95% CI 0.85 to 2.20) were associated with larger ICH volumes. In multivariable linear regression adjusting for confounding factors (sex, hypertension, previous ischemic stroke, white matter disease burden, and premorbid modified Rankin Scale score [mRS]), warfarin use remained independently associated with larger ICH (cube root) volumes (coefficient 0.64; 95% CI 0.03 to 1.25; p = 0.042). Ordered logistic regression showed an increased odds of a worse clinical outcome (as measured by discharge mRS) in warfarin-ICH compared with NOAC-ICH: odds ratio 4.46 (95% CI 1.10 to 18.14; p = 0.037). Conclusions: In this small prospective observational study, patients with NOAC-associated ICH had smaller ICH volumes and better clinical outcomes compared with warfarin-associated ICH. PMID:26718576
Lu, Liming; Shi, Leiyu; Zeng, Jingchun; Wen, Zehuai
2017-01-01
Background Previous meta-analyses on the relationship between aspirin use and breast cancer risk have drawn inconsistent results. In addition, the threshold effect of different doses, frequencies and durations of aspirin use in preventing breast cancer have yet to be established. Results The search yielded 13 prospective cohort studies (N=857,831 participants) that reported an average of 7.6 cases/1,000 person-years of breast cancer during a follow-up period of from 4.4 to 14 years. With a random effects model, a borderline significant inverse association was observed between overall aspirin use and breast cancer risk, with a summarized RR = 0.94 (P = 0.051, 95% CI 0.87-1.01). The linear regression model was a better fit for the dose-response relationship, which displayed a potential relationship between the frequency of aspirin use and breast cancer risk (RR = 0.97, 0.95 and 0.90 for 5, 10 and 20 times/week aspirin use, respectively). It was also a better fit for the duration of aspirin use and breast cancer risk (RR = 0.86, 0.73 and 0.54 for 5, 10 and 20 years of aspirin use). Methods We searched MEDLINE, EMBASE and CENTRAL databases through early October 2016 for relevant prospective cohort studies of aspirin use and breast cancer risk. Meta-analysis of relative risks (RR) estimates associated with aspirin intake were presented by fixed or random effects models. The dose-response meta-analysis was performed by linear trend regression and restricted cubic spline regression. Conclusion Our study confirmed a dose-response relationship between aspirin use and breast cancer risk. For clinical prevention, long term (>5 years) consistent use (2-7 times/week) of aspirin appears to be more effective in achieving a protective effect against breast cancer. PMID:28418881
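As a rough consistency check on the duration figures quoted above, the Python sketch below tests whether the three reported relative risks lie close to a single straight line through the origin on the log scale. Treating the authors' linear trend as linear in log(RR) is an assumption here (it is the usual convention in dose-response meta-analysis), and the helper names are hypothetical.

```python
from math import exp, log

# Relative risks reported in the abstract for years of aspirin use.
reported_rr = {5: 0.86, 10: 0.73, 20: 0.54}

# Implied log(RR) per year of use at each reported duration.
per_year_slopes = {years: log(rr) / years for years, rr in reported_rr.items()}
mean_slope = sum(per_year_slopes.values()) / len(per_year_slopes)


def rr_at(years):
    """RR implied by a log-linear trend through the origin with the mean slope."""
    return exp(mean_slope * years)


for years, rr in reported_rr.items():
    print(years, rr, round(rr_at(years), 3))
```

The three per-year slopes are nearly equal, so the reported values are consistent with one log-linear trend of roughly 3% lower risk per year of use under this assumption.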
Neuhouser, Marian L.; Howard, Barbara; Lu, Jingmin; Tinker, Lesley F.; Van Horn, Linda; Caan, Bette; Rohan, Thomas; Stefanick, Marcia L.; Thomson, Cynthia A.
2012-01-01
Objective Nutrition plays an important role in metabolic syndrome etiology. We examined whether the Women’s Health Initiative (WHI) Dietary Modification Trial influenced metabolic syndrome risk. Materials/Methods 48,835 postmenopausal women aged 50–79 years were randomized to a low-fat (20% energy from fat) diet (intervention) or usual diet (comparison) for a mean of 8.1 years. Blood pressure, waist circumference and fasting blood measures of glucose, HDL-cholesterol and triglycerides were measured on a subsample (n= 2816) at baseline and years 1, 3 and 6 post-randomization. Logistic regression estimated associations of the intervention with metabolic syndrome risk and use of cholesterol-lowering and hypertension medications. Multivariate linear regression tested associations between the intervention and metabolic syndrome components. Results At year 3, but not years 1 or 6, women in the intervention group (vs. comparison) had a non-statistically significant lower risk of metabolic syndrome (OR=0.83, 95% CI 0.59–1.18). Linear regression models simultaneously modeling the five metabolic syndrome components revealed significant associations of the intervention with metabolic syndrome at year 1 (p<0.0001), but not years 3 (p=0.19) and 6 (p=0.17). Analyses restricted to intervention-adherent participants strengthened associations at years 3 (p=0.05) and 6 (p=0.06). Cholesterol-lowering and hypertension medication use was 19% lower at year 1 for intervention vs. comparison group women (OR=0.81, 95% CI 0.60–1.09). Over the entire trial, fewer intervention vs. comparison participants used these medications (26.0% vs. 29.9%), although results were not statistically significant (p=0.89). Conclusions The WHI low-fat diet may influence metabolic syndrome risk and decrease use of hypertension and cholesterol-lowering medications. Findings have potential for meaningful clinical translation. PMID:22633601
Flexible Meta-Regression to Assess the Shape of the Benzene–Leukemia Exposure–Response Curve
Vlaanderen, Jelle; Portengen, Lützen; Rothman, Nathaniel; Lan, Qing; Kromhout, Hans; Vermeulen, Roel
2010-01-01
Background Previous evaluations of the shape of the benzene–leukemia exposure–response curve (ERC) were based on a single set or on small sets of human occupational studies. Integrating evidence from all available studies that are of sufficient quality combined with flexible meta-regression models is likely to provide better insight into the functional relation between benzene exposure and risk of leukemia. Objectives We used natural splines in a flexible meta-regression method to assess the shape of the benzene–leukemia ERC. Methods We fitted meta-regression models to 30 aggregated risk estimates extracted from nine human observational studies and performed sensitivity analyses to assess the impact of a priori assessed study characteristics on the predicted ERC. Results The natural spline showed a supralinear shape at cumulative exposures less than 100 ppm-years, although this model fitted the data only marginally better than a linear model (p = 0.06). Stratification based on study design and jackknifing indicated that the cohort studies had a considerable impact on the shape of the ERC at high exposure levels (> 100 ppm-years) but that predicted risks for the low exposure range (< 50 ppm-years) were robust. Conclusions Although limited by the small number of studies and the large heterogeneity between studies, the inclusion of all studies of sufficient quality combined with a flexible meta-regression method provides the most comprehensive evaluation of the benzene–leukemia ERC to date. The natural spline based on all data indicates a significantly increased risk of leukemia [relative risk (RR) = 1.14; 95% confidence interval (CI), 1.04–1.26] at an exposure level as low as 10 ppm-years. PMID:20064779
1987-09-01
Edition, Fall 1986. 33. Neter, John et al. Applied Linear Regression Models. Homewood IL: Richard D. Irwin, Incorporated, 1983. 34. Novick, David... Linear Regression Models (33) then, for each sample observation (Xfh, the method of least squares considers the deviation of Yubms from its expected value...for finding good estimators of b - b5. In order to explain the procedure, the model Yubms = b0 + b1xfh will be discussed. According to Applied
Application of linear regression analysis in accuracy assessment of rolling force calculations
NASA Astrophysics Data System (ADS)
Poliak, E. I.; Shim, M. K.; Kim, G. S.; Choo, W. Y.
1998-10-01
Efficient operation of the computational models employed in process control systems requires periodic assessment of the accuracy of their predictions. Linear regression is proposed as a tool that allows systematic and random prediction errors to be separated from those related to measurements. A quantitative characteristic of the model's predictive ability is introduced in addition to standard statistical tests for model adequacy. Rolling force calculations are considered as an example application. However, the outlined approach can be used to assess the performance of any computational model.
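One generic way to read such an assessment is to regress measured values on model predictions: a slope away from 1 or an intercept away from 0 points to systematic error, while the residual scatter reflects random error. The Python sketch below illustrates this under stated assumptions; it is not the authors' procedure or their quantitative characteristic, and the force values (imagined in kN) are hypothetical.

```python
from math import sqrt


def regress_measured_on_predicted(predicted, measured):
    """Return (slope, intercept, residual SD) of measured = a + b * predicted."""
    n = len(predicted)
    mp, mm = sum(predicted) / n, sum(measured) / n
    sxx = sum((p - mp) ** 2 for p in predicted)
    slope = sum((p - mp) * (m - mm) for p, m in zip(predicted, measured)) / sxx
    intercept = mm - slope * mp
    resid = [m - (intercept + slope * p) for p, m in zip(predicted, measured)]
    # Residual SD with n - 2 degrees of freedom summarizes the random scatter.
    resid_sd = sqrt(sum(e * e for e in resid) / (n - 2))
    return slope, intercept, resid_sd


# Hypothetical predicted and measured rolling forces.
predicted = [100.0, 200.0, 300.0, 400.0]
measured = [116.0, 224.0, 336.0, 444.0]

slope, intercept, resid_sd = regress_measured_on_predicted(predicted, measured)
print(slope, intercept, resid_sd)
```

In this made-up case the slope above 1 would suggest the model under-predicts increasingly at higher forces, while the small residual scatter would be attributed to random error, including measurement noise.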