Science.gov

Sample records for addition multiple regression

  1. A multiple additive regression tree analysis of three exposure measures during Hurricane Katrina.

    PubMed

    Curtis, Andrew; Li, Bin; Marx, Brian D; Mills, Jacqueline W; Pine, John

    2011-01-01

    This paper analyses structural and personal exposure to Hurricane Katrina. Structural exposure is measured by flood height and building damage; personal exposure is measured by the locations of 911 calls made during the response. Using these variables, this paper characterises the geography of exposure and also demonstrates the utility of a robust analytical approach in understanding health-related challenges to disadvantaged populations during recovery. Analysis is conducted using a contemporary statistical approach, a multiple additive regression tree (MART), which displays considerable improvement over traditional regression analysis. By using MART, the percentage of improvement in R-squares over standard multiple linear regression ranges from about 62 to more than 100 per cent. The most revealing finding is the modelled verification that African Americans experienced disproportionate exposure in both structural and personal contexts. Given the impact of exposure to health outcomes, this finding has implications for understanding the long-term health challenges facing this population.

  2. Prediction in Multiple Regression.

    ERIC Educational Resources Information Center

    Osborne, Jason W.

    2000-01-01

    Presents the concept of prediction via multiple regression (MR) and discusses the assumptions underlying multiple regression analyses. Also discusses shrinkage, cross-validation, and double cross-validation of prediction equations and describes how to calculate confidence intervals around individual predictions. (SLD)
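    The shrinkage and cross-validation ideas named in this record can be sketched roughly as follows. This is a minimal illustration on simulated data, not the author's procedure: the helper names `fit_ols` and `r_squared`, the sample sizes, and the coefficients are all assumptions made for the example, and the adjusted value uses the common (n - 1)/(n - p - 1) correction.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients (intercept first) by least squares."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def r_squared(X, y, coef):
    """1 - SSE/SST for the given coefficients applied to (X, y)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    resid = y - X1 @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Simulated data (hypothetical): 5 predictors, only 2 truly related to y.
rng = np.random.default_rng(0)
n, p = 60, 5
X = rng.normal(size=(n, p))
y = X @ np.array([0.5, 0.3, 0.0, 0.0, 0.0]) + rng.normal(size=n)

# Derive the prediction equation on one half, then apply it to the other half.
half = n // 2
coef = fit_ols(X[:half], y[:half])
r2_train = r_squared(X[:half], y[:half], coef)
r2_holdout = r_squared(X[half:], y[half:], coef)

# Formula-based shrinkage estimate (adjusted R-squared) for comparison.
r2_adjusted = 1 - (1 - r2_train) * (half - 1) / (half - p - 1)

print(f"derivation R2 = {r2_train:.3f}")
print(f"adjusted R2   = {r2_adjusted:.3f}")
print(f"holdout R2    = {r2_holdout:.3f}")
```

    Swapping the roles of the two halves and repeating the comparison gives a simple form of double cross-validation.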

  3. Multiple linear regression analysis

    NASA Technical Reports Server (NTRS)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  4. Multiple Regression and Its Discontents

    ERIC Educational Resources Information Center

    Snell, Joel C.; Marsh, Mitchell

    2012-01-01

    Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.

  5. Multiple Regression: A Leisurely Primer.

    ERIC Educational Resources Information Center

    Daniel, Larry G.; Onwuegbuzie, Anthony J.

    Multiple regression is a useful statistical technique when the researcher is considering situations in which variables of interest are theorized to be multiply caused. It may also be useful in those situations in which the researcher is interested in studies of predictability of phenomena of interest. This paper provides an introduction to…

  6. Cross-Validation, Shrinkage, and Multiple Regression.

    ERIC Educational Resources Information Center

    Hynes, Kevin

    One aspect of multiple regression--the shrinkage of the multiple correlation coefficient on cross-validation--is reviewed. The paper consists of four sections. In section one, the distinction between a fixed and a random multiple regression model is made explicit. In section two, the cross-validation paradigm and an explanation for the occurrence…

  7. Correlation Weights in Multiple Regression

    ERIC Educational Resources Information Center

    Waller, Niels G.; Jones, Jeff A.

    2010-01-01

    A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…

  8. Some Simple Computational Formulas for Multiple Regression

    ERIC Educational Resources Information Center

    Aiken, Lewis R., Jr.

    1974-01-01

    Short-cut formulas are presented for direct computation of the beta weights, the standard errors of the beta weights, and the multiple correlation coefficient for multiple regression problems involving three independent variables and one dependent variable. (Author)
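    For orientation, the quantities this record refers to can be computed from correlations alone using the general matrix form. The sketch below is that general form, not the author's short-cut formulas (which are not reproduced in the abstract), and the simulated data are an assumption made for illustration.

```python
import numpy as np

# Simulated data (hypothetical): three correlated predictors, one criterion.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))
X[:, 1] += 0.5 * X[:, 0]
y = 1.0 + X @ np.array([0.4, -0.2, 0.3]) + rng.normal(size=n)

# Correlation matrix of predictors and criterion, split into blocks.
R = np.corrcoef(np.column_stack([X, y]), rowvar=False)
R_xx, r_xy = R[:3, :3], R[:3, 3]

# Beta weights solve R_xx @ beta = r_xy; R^2 is the inner product beta'r.
beta = np.linalg.solve(R_xx, r_xy)
R2 = float(beta @ r_xy)

# Cross-check: regress the z-scored criterion on z-scored predictors.
Zx = (X - X.mean(axis=0)) / X.std(axis=0)
zy = (y - y.mean()) / y.std()
beta_check, *_ = np.linalg.lstsq(Zx, zy, rcond=None)

print("beta weights:", np.round(beta, 3))
print("multiple R:", round(R2 ** 0.5, 3))
```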

  9. Practical Session: Multiple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Three exercises are proposed to illustrate simple linear regression. The first investigates the influence of several factors on atmospheric pollution. It was proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data from 20 U.S. cities. Exercise 2 is an introduction to model selection, whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 were proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).

  10. Application and Interpretation of Hierarchical Multiple Regression.

    PubMed

    Jeong, Younhee; Jung, Mi Jung

    2016-01-01

    The authors reported the association between motivation and self-management behavior of individuals with chronic low back pain after adjusting for control variables using hierarchical multiple regression. This article describes details of the hierarchical regression applying the actual data used in that article, including how to test assumptions, run the statistical tests, and report the results. PMID:27648796

  11. Incremental Net Effects in Multiple Regression

    ERIC Educational Resources Information Center

    Lipovetsky, Stan; Conklin, Michael

    2005-01-01

    A regular problem in regression analysis is estimating the comparative importance of the predictors in the model. This work considers the 'net effects', or shares of the predictors in the coefficient of the multiple determination, which is a widely used characteristic of the quality of a regression model. Estimation of the net effects can be a…

  12. The Geometry of Enhancement in Multiple Regression

    ERIC Educational Resources Information Center

    Waller, Niels G.

    2011-01-01

    In linear multiple regression, "enhancement" is said to occur when $R^2 = b'r > r'r$, where $b$ is a $p \times 1$ vector of standardized regression coefficients and $r$ is a $p \times 1$ vector of correlations between a criterion $y$ and a set of standardized regressors, $x$. When $p = 1$ then $b \equiv r$ and enhancement cannot…
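    A small numerical illustration of the enhancement condition may help. The correlation values below are hypothetical, chosen so that the second regressor is uncorrelated with the criterion but correlated with the first regressor (a classic suppressor pattern); they are not taken from the paper.

```python
import numpy as np

# Hypothetical correlations: regressors correlate 0.5 with each other;
# the criterion correlates 0.5 with x1 and 0.0 with x2.
R_xx = np.array([[1.0, 0.5],
                 [0.5, 1.0]])
r = np.array([0.5, 0.0])

# Standardized coefficients b solve R_xx @ b = r.
b = np.linalg.solve(R_xx, r)

R2 = float(b @ r)        # b'r, the squared multiple correlation
sum_sq_r = float(r @ r)  # r'r, the sum of squared validities
enhancement = R2 > sum_sq_r

print("b =", b)
print("R^2 =", R2, " r'r =", sum_sq_r, " enhancement:", enhancement)
```

    Here b'r works out to 1/3 while r'r is 0.25, so the condition holds even though the second regressor has zero correlation with the criterion.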

  13. Assumptions of Multiple Regression: Correcting Two Misconceptions

    ERIC Educational Resources Information Center

    Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason

    2013-01-01

    In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…

  14. Multiple Regression Analysis and Automatic Interaction Detection.

    ERIC Educational Resources Information Center

    Koplyay, Janos B.

    The Automatic Interaction Detector (AID) is discussed as to its usefulness in multiple regression analysis. The algorithm of AID-4 is a reversal of the model building process; it starts with the ultimate restricted model, namely, the whole group as a unit. By a unique splitting process maximizing the between sum of squares for the categories of…

  15. The M Word: Multicollinearity in Multiple Regression.

    ERIC Educational Resources Information Center

    Morrow-Howell, Nancy

    1994-01-01

    Notes that existence of substantial correlation between two or more independent variables creates problems of multicollinearity in multiple regression. Discusses multicollinearity problem in social work research in which independent variables are usually intercorrelated. Clarifies problems created by multicollinearity, explains detection of…

  16. Design Coding and Interpretation in Multiple Regression.

    ERIC Educational Resources Information Center

    Lunneborg, Clifford E.

    The multiple regression or general linear model (GLM) is a parameter estimation and hypothesis testing model which encompasses and approaches the more familiar fixed effects analysis of variance (ANOVA). The transition from ANOVA to GLM is accomplished, roughly, by coding treatment level or group membership to produce a set of predictor or…
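    The coding step described in this record can be sketched as below, assuming dummy (treatment) coding with the first group as the reference category. The group names and scores are invented for illustration; other coding schemes (effect, contrast) would give differently interpreted coefficients.

```python
import numpy as np

# Hypothetical scores for three treatment groups (illustrative data only).
groups = {
    "control": np.array([4.0, 5.0, 6.0, 5.0]),
    "drug_a": np.array([7.0, 8.0, 6.0, 7.0]),
    "drug_b": np.array([9.0, 10.0, 11.0, 10.0]),
}
y = np.concatenate(list(groups.values()))
labels = np.repeat(np.arange(3), [len(v) for v in groups.values()])

# Dummy coding: intercept column plus one 0/1 indicator per non-reference group.
X = np.column_stack([np.ones(len(y)), labels == 1, labels == 2]).astype(float)

# Least-squares fit of the coded design matrix.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# coef[0] is the reference-group mean; coef[1:] are differences from it.
print("intercept (control mean):", coef[0])
print("drug_a - control:", coef[1])
print("drug_b - control:", coef[2])
```

    With this coding, the regression coefficients reproduce the group means that a fixed-effects ANOVA would compare.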

  17. MLREG, stepwise multiple linear regression program

    SciTech Connect

    Carder, J.H.

    1981-09-01

    This program is written in FORTRAN for an IBM computer and performs multiple linear regressions according to a stepwise procedure. The program transforms and combines old variables into new variables, prints input and transformed data, sums, raw sums of squares, residual sum of squares, means and standard deviations, correlation coefficients, regression results at each step, ANOVA at each step, and predicted response results at each step. This package contains an EXEC used to execute the program, sample input data and output listing, source listing, documentation, and card decks containing the EXEC, sample input, and FORTRAN source.

  18. Salience Assignment for Multiple-Instance Regression

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri L.; Lane, Terran

    2007-01-01

    We present a Multiple-Instance Learning (MIL) algorithm for determining the salience of each item in each bag with respect to the bag's real-valued label. We use an alternating-projections constrained optimization approach to simultaneously learn a regression model and estimate all salience values. We evaluate this algorithm on a significant real-world problem, crop yield modeling, and demonstrate that it provides more extensive, intuitive, and stable salience models than Primary-Instance Regression, which selects a single relevant item from each bag.

  19. Hierarchical regression for analyses of multiple outcomes.

    PubMed

    Richardson, David B; Hamra, Ghassan B; MacLehose, Richard F; Cole, Stephen R; Chu, Haitao

    2015-09-01

    In cohort mortality studies, there often is interest in associations between an exposure of primary interest and mortality due to a range of different causes. A standard approach to such analyses involves fitting a separate regression model for each type of outcome. However, the statistical precision of some estimated associations may be poor because of sparse data. In this paper, we describe a hierarchical regression model for estimation of parameters describing outcome-specific relative rate functions and associated credible intervals. The proposed model uses background stratification to provide flexible control for the outcome-specific associations of potential confounders, and it employs a hierarchical "shrinkage" approach to stabilize estimates of an exposure's associations with mortality due to different causes of death. The approach is illustrated in analyses of cancer mortality in 2 cohorts: a cohort of dioxin-exposed US chemical workers and a cohort of radiation-exposed Japanese atomic bomb survivors. Compared with standard regression estimates of associations, hierarchical regression yielded estimates with improved precision that tended to have less extreme values. The hierarchical regression approach also allowed the fitting of models with effect-measure modification. The proposed hierarchical approach can yield estimates of association that are more precise than conventional estimates when one wishes to estimate associations with multiple outcomes. PMID:26232395

  20. Multiple linear regression for isotopic measurements

    NASA Astrophysics Data System (ADS)

    Garcia Alonso, J. I.

    2012-04-01

    There are two typical applications of isotopic measurements: the detection of natural variations in isotopic systems and the detection of man-made variations using enriched isotopes as indicators. For both types of measurements accurate and precise isotope ratio measurements are required. For the so-called non-traditional stable isotopes, multicollector ICP-MS instruments are usually applied. In many cases, chemical separation procedures are required before accurate isotope measurements can be performed. The off-line separation of Rb and Sr or Nd and Sm is the classical procedure employed to eliminate isobaric interferences before multicollector ICP-MS measurement of Sr and Nd isotope ratios. Also, this procedure allows matrix separation for precise and accurate Sr and Nd isotope ratios to be obtained. In our laboratory we have evaluated the separation of Rb-Sr and Nd-Sm isobars by liquid chromatography and on-line multicollector ICP-MS detection. The combination of this chromatographic procedure with multiple linear regression of the raw chromatographic data resulted in Sr and Nd isotope ratios with precisions and accuracies typical of off-line sample preparation procedures. On the other hand, methods for the labelling of individual organisms (such as a given plant, fish or animal) are required for population studies. We have developed a dual isotope labelling procedure which can be unique for a given individual, can be inherited in living organisms and is stable. The detection of the isotopic signature is also based on multiple linear regression. The labelling of fish and its detection in otoliths by Laser Ablation ICP-MS will be discussed using trout and salmon as examples. In conclusion, isotope measurement procedures based on multiple linear regression can be a viable alternative in multicollector ICP-MS measurements.

  1. Interpretation of Standardized Regression Coefficients in Multiple Regression.

    ERIC Educational Resources Information Center

    Thayer, Jerome D.

    The extent to which standardized regression coefficients (beta values) can be used to determine the importance of a variable in an equation was explored. The beta value and the part correlation coefficient--also called the semi-partial correlation coefficient and reported in squared form as the incremental "r squared"--were compared for variables…

  2. Nonparametric survival analysis using Bayesian Additive Regression Trees (BART).

    PubMed

    Sparapani, Rodney A; Logan, Brent R; McCulloch, Robert E; Laud, Purushottam W

    2016-07-20

    Bayesian additive regression trees (BART) provide a framework for flexible nonparametric modeling of relationships of covariates to outcomes. Recently, BART models have been shown to provide excellent predictive performance, for both continuous and binary outcomes, and exceeding that of its competitors. Software is also readily available for such outcomes. In this article, we introduce modeling that extends the usefulness of BART in medical applications by addressing needs arising in survival analysis. Simulation studies of one-sample and two-sample scenarios, in comparison with long-standing traditional methods, establish face validity of the new approach. We then demonstrate the model's ability to accommodate data from complex regression models with a simulation study of a nonproportional hazards scenario with crossing survival functions and survival function estimation in a scenario where hazards are multiplicatively modified by a highly nonlinear function of the covariates. Using data from a recently published study of patients undergoing hematopoietic stem cell transplantation, we illustrate the use and some advantages of the proposed method in medical investigations. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26854022

  3. Relationship between Multiple Regression and Selected Multivariable Methods.

    ERIC Educational Resources Information Center

    Schumacker, Randall E.

    The relationship of multiple linear regression to various multivariate statistical techniques is discussed. The importance of the standardized partial regression coefficient (beta weight) in multiple linear regression as it is applied in path, factor, LISREL, and discriminant analyses is emphasized. The multivariate methods discussed in this paper…

  4. Suppression Situations in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2006-01-01

    This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…

  5. Testing Different Model Building Procedures Using Multiple Regression.

    ERIC Educational Resources Information Center

    Thayer, Jerome D.

    The stepwise regression method of selecting predictors for computer assisted multiple regression analysis was compared with forward, backward, and best subsets regression, using 16 data sets. The results indicated the stepwise method was preferred because of its practical nature, when the models chosen by different selection methods were similar…

  6. Interpret with caution: multicollinearity in multiple regression of cognitive data.

    PubMed

    Morrison, Catriona M

    2003-08-01

    Shibihara and Kondo in 2002 reported a reanalysis of the 1997 Kanji picture-naming data of Yamazaki, Ellis, Morrison, and Lambon-Ralph in which independent variables were highly correlated. Their addition of the variable visual familiarity altered the previously reported pattern of results, indicating that visual familiarity, but not age of acquisition, was important in predicting Kanji naming speed. The present paper argues that caution should be taken when drawing conclusions from multiple regression analyses in which the independent variables are so highly correlated, as such multicollinearity can lead to unreliable output.

  7. Estimation of adjusted rate differences using additive negative binomial regression.

    PubMed

    Donoghoe, Mark W; Marschner, Ian C

    2016-08-15

    Rate differences are an important effect measure in biostatistics and provide an alternative perspective to rate ratios. When the data are event counts observed during an exposure period, adjusted rate differences may be estimated using an identity-link Poisson generalised linear model, also known as additive Poisson regression. A problem with this approach is that the assumption of equality of mean and variance rarely holds in real data, which often show overdispersion. An additive negative binomial model is the natural alternative to account for this; however, standard model-fitting methods are often unable to cope with the constrained parameter space arising from the non-negativity restrictions of the additive model. In this paper, we propose a novel solution to this problem using a variant of the expectation-conditional maximisation-either algorithm. Our method provides a reliable way to fit an additive negative binomial regression model and also permits flexible generalisations using semi-parametric regression functions. We illustrate the method using a placebo-controlled clinical trial of fenofibrate treatment in patients with type II diabetes, where the outcome is the number of laser therapy courses administered to treat diabetic retinopathy. An R package is available that implements the proposed method. Copyright © 2016 John Wiley & Sons, Ltd. PMID:27073156

  8. General Nature of Multicollinearity in Multiple Regression Analysis.

    ERIC Educational Resources Information Center

    Liu, Richard

    1981-01-01

    Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)

  9. Floating Data and the Problem with Illustrating Multiple Regression.

    ERIC Educational Resources Information Center

    Sachau, Daniel A.

    2000-01-01

    Discusses how to introduce basic concepts of multiple regression by creating a large-scale, three-dimensional regression model using the classroom walls and floor. Addresses teaching points that should be covered and reveals student reaction to the model. Finds that the greatest benefit of the model is the low fear, walk-through, nonmathematical…

  10. Enhance-Synergism and Suppression Effects in Multiple Regression

    ERIC Educational Resources Information Center

    Lipovetsky, Stan; Conklin, W. Michael

    2004-01-01

    Relations between pairwise correlations and the coefficient of multiple determination in regression analysis are considered. The conditions for the occurrence of enhance-synergism and suppression effects when multiple determination becomes bigger than the total of squared correlations of the dependent variable with the regressors are discussed. It…

  11. Hierarchical regression for epidemiologic analyses of multiple exposures.

    PubMed Central

    Greenland, S

    1994-01-01

    Many epidemiologic investigations are designed to study the effects of multiple exposures. Most of these studies are analyzed either by fitting a risk-regression model with all exposures forced in the model, or by using a preliminary-testing algorithm, such as stepwise regression, to produce a smaller model. Research indicates that hierarchical modeling methods can outperform these conventional approaches. These methods are reviewed, and two hierarchical methods, empirical-Bayes regression and a variant here called "semi-Bayes" regression, are compared with full-model maximum likelihood and with model reduction by preliminary testing. The performance of the methods is compared in a problem of predicting neonatal-mortality rates. Based on the literature to date, it is suggested that hierarchical methods should become part of the standard approaches to multiple-exposure studies. PMID:7851328

  12. Multiple regression for physiological data analysis: the problem of multicollinearity.

    PubMed

    Slinker, B K; Glantz, S A

    1985-07-01

    Multiple linear regression, in which several predictor variables are related to a response variable, is a powerful statistical tool for gaining quantitative insight into complex in vivo physiological systems. For these insights to be correct, all predictor variables must be uncorrelated. However, in many physiological experiments the predictor variables cannot be precisely controlled and thus change in parallel (i.e., they are highly correlated). There is a redundancy of information about the response, a situation called multicollinearity, that leads to numerical problems in estimating the parameters in regression equations; the parameters are often of incorrect magnitude or sign or have large standard errors. Although multicollinearity can be avoided with good experimental design, not all interesting physiological questions can be studied without encountering multicollinearity. In these cases various ad hoc procedures have been proposed to mitigate multicollinearity. Although many of these procedures are controversial, they can be helpful in applying multiple linear regression to some physiological problems.
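    One widely used diagnostic for the redundancy described here is the variance inflation factor (VIF). The sketch below is a generic illustration on simulated predictors, not a procedure taken from this paper; the function name `variance_inflation_factors` and the data are assumptions. It relies on the identity that the VIFs are the diagonal of the inverse predictor correlation matrix.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), read off the diagonal of inv(corr(X))."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Simulated predictors (hypothetical): x2 changes almost in parallel with x1,
# while x3 is unrelated to both.
rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)

vif = variance_inflation_factors(np.column_stack([x1, x2, x3]))
for name, value in zip(["x1", "x2", "x3"], vif):
    print(f"VIF({name}) = {value:.1f}")
```

    Large values for x1 and x2 signal that their coefficients would have inflated standard errors, whereas x3 stays close to 1.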

  13. Interpreting Multiple Linear Regression: A Guidebook of Variable Importance

    ERIC Educational Resources Information Center

    Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim

    2012-01-01

    Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…

  14. Some Applied Research Concerns Using Multiple Linear Regression Analysis.

    ERIC Educational Resources Information Center

    Newman, Isadore; Fraas, John W.

    The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…

  15. Multiple Regression Analyses in Clinical Child and Adolescent Psychology

    ERIC Educational Resources Information Center

    Jaccard, James; Guilamo-Ramos, Vincent; Johansson, Margaret; Bouris, Alida

    2006-01-01

    A major form of data analysis in clinical child and adolescent psychology is multiple regression. This article reviews issues in the application of such methods in light of the research designs typical of this field. Issues addressed include controlling covariates, evaluation of predictor relevance, comparing predictors, analysis of moderation,…

  17. Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.

    PubMed

    Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao

    2016-04-01

    To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single-variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes. PMID:27104857

  18. Developing Multiplicative Thinking from Additive Reasoning

    ERIC Educational Resources Information Center

    Tobias, Jennifer M.; Andreasen, Janet B.

    2013-01-01

    As students progress through elementary school, they encounter mathematics concepts that shift from additive to multiplicative situations (NCTM 2000). When they encounter fraction problems that require multiplicative thinking, they tend to incorrectly extend additive properties from whole numbers (Post et al. 1985). As a result, topics such as …

  19. The comparison between several robust ridge regression estimators in the presence of multicollinearity and multiple outliers

    NASA Astrophysics Data System (ADS)

    Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said

    2014-09-01

    In the presence of multicollinearity and multiple outliers, statistical inference for a linear regression model using ordinary least squares (OLS) estimators is severely affected and produces misleading results. To overcome this, many approaches have been investigated. These include robust methods, which were reported to be less sensitive to the presence of outliers. In addition, the ridge regression technique was employed to tackle the multicollinearity problem. In order to mitigate both problems, a combination of ridge regression and robust methods was discussed in this study. The superiority of this approach was examined when multicollinearity and multiple outliers occurred simultaneously in multiple linear regression. This study aimed to look at the performance of several well-known robust estimators: M, MM, RIDGE and robust ridge regression estimators, namely Weighted Ridge M-estimator (WRM), Weighted Ridge MM (WRMM), and Ridge MM (RMM), in such a situation. Results of the study showed that in the presence of simultaneous multicollinearity and multiple outliers (in both x and y-direction), the RMM and RIDGE are more or less similar in terms of superiority over the other estimators, regardless of the number of observations, level of collinearity and percentage of outliers used. However, when outliers occurred in only a single direction (y-direction), the WRMM estimator is the most superior among the robust ridge regression estimators, producing the least variance. In conclusion, robust ridge regression is the best alternative compared with robust and conventional least squares estimators when dealing with the simultaneous presence of multicollinearity and outliers.
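    As background for the RIDGE estimator mentioned in this record, the ordinary (non-robust) ridge estimate has a simple closed form. The sketch below shows only that plain ridge step on simulated collinear data; it does not implement the robust variants (WRM, WRMM, RMM) studied in the paper, and the function name, ridge constant k, and data are assumptions for illustration.

```python
import numpy as np

def ridge_coefficients(X, y, k):
    """Closed-form ridge estimate (Z'Z + kI)^-1 Z'y on standardized
    predictors Z and a centered response; k = 0 gives OLS."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + k * np.eye(p), Z.T @ yc)

# Simulated data (hypothetical): two nearly collinear predictors.
rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = 2.0 * x1 + rng.normal(size=n)

beta_ols = ridge_coefficients(X, y, k=0.0)
beta_ridge = ridge_coefficients(X, y, k=5.0)

print("OLS coefficients:  ", np.round(beta_ols, 3))
print("ridge coefficients:", np.round(beta_ridge, 3))
```

    The ridge constant shrinks the coefficient vector toward zero, trading a little bias for lower variance when the predictors are highly correlated.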

  20. A Solution to Separation and Multicollinearity in Multiple Logistic Regression.

    PubMed

    Shen, Jianzhao; Gao, Sujuan

    2008-10-01

    In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem addressed by the other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.

  1. Forecasting relativistic electron flux using dynamic multiple regression models

    NASA Astrophysics Data System (ADS)

    Wei, H.-L.; Billings, S. A.; Surjalal Sharma, A.; Wing, S.; Boynton, R. J.; Walker, S. N.

    2011-02-01

    The forecast of high energy electron fluxes in the radiation belts is important because the exposure of modern spacecraft to high energy particles can result in significant damage to onboard systems. A comprehensive physical model of processes related to electron energisation that can be used for such a forecast has not yet been developed. In the present paper a systems identification approach is exploited to deduce a dynamic multiple regression model that can be used to predict the daily maximum of high energy electron fluxes at geosynchronous orbit from data. It is shown that the model developed provides reliable predictions.

  2. Teasing out the effect of tutorials via multiple regression

    NASA Astrophysics Data System (ADS)

    Chasteen, Stephanie V.

    2012-02-01

    We transformed an upper-division physics course using a variety of elements, including homework help sessions, tutorials, clicker questions with peer instruction, and explicit learning goals. Overall, the course transformations improved student learning, as measured by our conceptual assessment. Since these transformations were multi-faceted, we would like to understand the impact of individual course elements. Attendance at tutorials and homework help sessions was optional, and occurred outside the class environment. In order to identify the impact of these optional out-of-class sessions, given self-selection effects in student attendance, we performed a multiple regression analysis. Even when background variables are taken into account, tutorial attendance is positively correlated with student conceptual understanding of the material - though not with performance on course exams. Other elements that increase student time-on-task, such as homework help sessions and lectures, do not achieve the same impacts.

  3. Modeling pan evaporation for Kuwait by multiple linear regression.

    PubMed

    Almedeij, Jaber

    2012-01-01

    Evaporation is an important parameter for many projects related to hydrology and water resources systems. This paper constitutes the first study conducted in Kuwait to obtain empirical relations for the estimation of daily and monthly pan evaporation as functions of available meteorological data of temperature, relative humidity, and wind speed. The data used here for the modeling are daily measurements of substantial continuity coverage, within a period of 17 years between January 1993 and December 2009, which can be considered representative of the desert climate of the urban zone of the country. Multiple linear regression technique is used with a procedure of variable selection for fitting the best model forms. The correlations of evaporation with temperature and relative humidity are also transformed in order to linearize the existing curvilinear patterns of the data by using power and exponential functions, respectively. The evaporation models suggested with the best variable combinations were shown to produce results that are in a reasonable agreement with observation values.

  4. Robust visual tracking via speedup multiple kernel ridge regression

    NASA Astrophysics Data System (ADS)

    Qian, Cheng; Breckon, Toby P.; Li, Hui

    2015-09-01

    Most of the tracking methods attempt to build up feature spaces to represent the appearance of a target. However, limited by the complex structure of the distribution of features, the feature spaces constructed in a linear manner cannot characterize the nonlinear structure well. We propose an appearance model based on kernel ridge regression for visual tracking. Dense sampling is fulfilled around the target image patches to collect the training samples. In order to obtain a kernel space in favor of describing the target appearance, multiple kernel learning is introduced into the selection of kernels. Under the framework, instead of a single kernel, a linear combination of kernels is learned from the training samples to create a kernel space. Resorting to the circulant property of a kernel matrix, a fast interpolate iterative algorithm is developed to seek coefficients that are assigned to these kernels so as to give an optimal combination. After the regression function is learned, all candidate image patches gathered are taken as the input of the function, and the candidate with the maximal response is regarded as the object image patch. Extensive experimental results demonstrate that the proposed method outperforms other state-of-the-art tracking methods.

  5. Multiple regression analyses in the prediction of aerospace instrument costs

    NASA Astrophysics Data System (ADS)

    Tran, Linh

    The aerospace industry has been investing for decades in ways to improve its efficiency in estimating the project life cycle cost (LCC). One of the major focuses in the LCC is the cost/prediction of aerospace instruments done during the early conceptual design phase of the project. The accuracy of early cost predictions affects the project scheduling and funding, and it is often the major cause for project cost overruns. The prediction of instruments' cost is based on the statistical analysis of these independent variables: Mass (kg), Power (watts), Instrument Type, Technology Readiness Level (TRL), Destination: earth orbiting or planetary, Data rates (kbps), Number of bands, Number of channels, Design life (months), and Development duration (months). This author is proposing a cost prediction approach of aerospace instruments based on these statistical analyses: Clustering Analysis, Principal Components Analysis (PCA), Bootstrap, and multiple regressions (both linear and non-linear). In the proposed approach, the Cost Estimating Relationship (CER) will be developed for the dependent variable Instrument Cost by using a combination of multiple independent variables. "The Full Model" will be developed and executed to estimate the full set of nine variables. The SAS program, Excel, Automatic Cost Estimating Integrate Tool (ACEIT) and Minitab are the tools to aid the analysis. Through the analysis, the cost drivers will be identified which will help develop an ultimate cost estimating software tool for the Instrument Cost prediction and optimization of future missions.

  6. Novel applications of multitask learning and multiple output regression to multiple genetic trait prediction

    PubMed Central

    Kuhn, David; Parida, Laxmi

    2016-01-01

    Given a set of biallelic molecular markers, such as SNPs, with genotype values encoded numerically on a collection of plant, animal or human samples, the goal of genetic trait prediction is to predict the quantitative trait values by simultaneously modeling all marker effects. Genetic trait prediction is usually represented as linear regression models. In many cases, for the same set of samples and markers, multiple traits are observed. Some of these traits might be correlated with each other. Therefore, modeling all the multiple traits together may improve the prediction accuracy. In this work, we view the multitrait prediction problem from a machine learning angle: as either a multitask learning problem or a multiple output regression problem, depending on whether different traits share the same genotype matrix or not. We then adapted multitask learning algorithms and multiple output regression algorithms to solve the multitrait prediction problem. We proposed a few strategies to improve the least square error of the prediction from these algorithms. Our experiments show that modeling multiple traits together could improve the prediction accuracy for correlated traits. Availability and implementation: The programs we used are either public or directly from the referred authors, such as MALSAR (http://www.public.asu.edu/~jye02/Software/MALSAR/) package. The Avocado data set has not been published yet and is available upon request. Contact: dhe@us.ibm.com PMID:27307640

  7. Overcoming multicollinearity in multiple regression using correlation coefficient

    NASA Astrophysics Data System (ADS)

    Zainodin, H. J.; Yap, S. J.

    2013-09-01

    Multicollinearity occurs when independent variables are highly correlated with one another. In this case, it is difficult to separate the contributions of these independent variables to the dependent variable, because they compete to explain much of the same variance. Multicollinearity also violates an assumption of multiple regression: that there is no collinearity among the candidate independent variables. An alternative approach is therefore introduced to overcome the multicollinearity problem and arrive at a well-represented model. The approach removes the variables that are the source of multicollinearity on the basis of the correlation coefficients in the full correlation matrix. Using the full correlation matrix makes it straightforward to implement the removal with Excel functions. The procedure is found to be easier and less time-consuming, especially when a model has a large number of independent variables and there are many possible models. This paper presents the procedure in detail, with comparison and implementation.
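
    A correlation-based removal step of this general kind can be sketched as follows. The greedy rule, the 0.9 threshold, and the function name `drop_collinear` are assumptions for illustration, not the authors' exact procedure (which is implemented in Excel).

```python
import numpy as np

def drop_collinear(X, names, threshold=0.9):
    """Keep a variable only if its |correlation| with every kept variable is below threshold."""
    corr = np.corrcoef(X, rowvar=False)  # full correlation matrix of the predictors
    kept = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

# Toy predictors: x2 is almost an exact multiple of x1; x3 is only weakly related.
x1 = np.array([1.0, 2, 3, 4, 5, 6])
x2 = 2 * x1 + np.array([0.1, -0.1, 0.1, -0.1, 0.1, -0.1])
x3 = np.array([2.0, -1, 0, 1, -2, 0])
X = np.column_stack([x1, x2, x3])

kept = drop_collinear(X, ["x1", "x2", "x3"])
print(kept)
```

    The result depends on column order, because the first variable of a highly correlated pair is the one retained. A fuller procedure would choose which member of a pair to drop using its correlation with the dependent variable.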

  8. Factor analysis and multiple regression between topography and precipitation on Jeju Island, Korea

    NASA Astrophysics Data System (ADS)

    Um, Myoung-Jin; Yun, Hyeseon; Jeong, Chang-Sam; Heo, Jun-Haeng

    2011-11-01

    Summary: In this study, new factors that influence precipitation were extracted from geographic variables using factor analysis, which allow for an accurate estimation of orographic precipitation. Correlation analysis was also used to examine the relationship between nine topographic variables from digital elevation models (DEMs) and the precipitation in Jeju Island. In addition, a spatial analysis was performed in order to verify the validity of the regression model. From the results of the correlation analysis, it was found that all of the topographic variables had a positive correlation with the precipitation. The relations between the variables also changed in accordance with a change in the precipitation duration. However, upon examining the correlation matrix, no significant relationship between the latitude and the aspect was found. According to the factor analysis, eight topographic variables (latitude being the exception) were found to have a direct influence on the precipitation. Three factors were then extracted from the eight topographic variables. By directly comparing the multiple regression model with the factors (model 1) to the multiple regression model with the topographic variables (model 3), it was found that model 1 did not violate the limits of statistical significance and multicollinearity. As such, model 1 was considered to be appropriate for estimating the precipitation when taking into account the topography. In the study of model 1, the multiple regression model using factor analysis was found to be the best method for estimating the orographic precipitation on Jeju Island.

  9. Regression Discontinuity Designs with Multiple Rating-Score Variables

    ERIC Educational Resources Information Center

    Reardon, Sean F.; Robinson, Joseph P.

    2012-01-01

    In the absence of a randomized control trial, regression discontinuity (RD) designs can produce plausible estimates of the treatment effect on an outcome for individuals near a cutoff score. In the standard RD design, individuals with rating scores higher than some exogenously determined cutoff score are assigned to one treatment condition; those…

  10. Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis

    ERIC Educational Resources Information Center

    Williams, Ryan T.

    2012-01-01

    Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis, however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…

  11. False Positives in Multiple Regression: Unanticipated Consequences of Measurement Error in the Predictor Variables

    ERIC Educational Resources Information Center

    Shear, Benjamin R.; Zumbo, Bruno D.

    2013-01-01

    Type I error rates in multiple regression, and hence the chance for false positive research findings, can be drastically inflated when multiple regression models are used to analyze data that contain random measurement error. This article shows the potential for inflated Type I error rates in commonly encountered scenarios and provides new…

  12. The Detection and Interpretation of Interaction Effects between Continuous Variables in Multiple Regression.

    ERIC Educational Resources Information Center

    Jaccard, James; And Others

    1990-01-01

    Issues in the detection and interpretation of interaction effects between quantitative variables in multiple regression analysis are discussed. Recent discussions associated with problems of multicollinearity are reviewed in the context of the conditional nature of multiple regression with product terms. (TJH)

  13. Beyond Multiple Regression: Using Commonality Analysis to Better Understand R² Results

    ERIC Educational Resources Information Center

    Warne, Russell T.

    2011-01-01

    Multiple regression is one of the most common statistical methods used in quantitative educational research. Despite the versatility and easy interpretability of multiple regression, it has some shortcomings in the detection of suppressor variables and for somewhat arbitrarily assigning values to the structure coefficients of correlated…

  14. Validity and Cross-Validity of Metric and Nonmetric Multiple Regression.

    ERIC Educational Resources Information Center

    MacCallum, Robert C.; And Others

    1979-01-01

    Questions are raised concerning differences between traditional metric multiple regression, which assumes all variables to be measured on interval scales, and nonmetric multiple regression. The ordinal model is generally superior in fitting derivation samples but the metric technique fits better than the nonmetric in cross-validation samples.…

  15. A Bayesian approach for the multiplicative binomial regression model

    NASA Astrophysics Data System (ADS)

    Paraíba, Carolina C. M.; Diniz, Carlos A. R.; Pires, Rubiane M.

    2012-10-01

    In the present paper, we focus our attention on Altham's multiplicative binomial model under the Bayesian perspective, modeling both the probability of success and the dispersion parameters. We present results based on a simulated data set to assess the quality of Bayesian estimates and Bayesian diagnostics for model assessment.

  16. Multiple linear regression with correlations among the predictor variables. Theory and computer algorithm ridge (FORTRAN 77)

    NASA Astrophysics Data System (ADS)

    van Gaans, P. F. M.; Vriend, S. P.

    Application of ridge regression in geoscience usually is a more appropriate technique than ordinary least-squares regression, especially in the situation of highly intercorrelated predictor variables. A FORTRAN 77 program RIDGE for ridged multiple linear regression is presented. The theory of linear regression and ridge regression is treated, to allow for a careful interpretation of the results and to understand the structure of the program. The program gives various parameters to evaluate the extent of multicollinearity within a given regression problem, such as the correlation matrix, multiple correlations among the predictors, variance inflation factors, eigenvalues, condition number, and the determinant of the predictors correlation matrix. The best method for the optimum choice of the ridge parameter with ridge regression has not been established yet. Estimates of the ridge bias, ridged variance inflation factors, estimates, and norms for the ridge parameter therefore are given as output by RIDGE and should complement inspection of the ridge traces. Application within the earth sciences is discussed.
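
    Several of the multicollinearity diagnostics listed above can be computed directly from the predictor correlation matrix. The Python sketch below is not the FORTRAN 77 program RIDGE. The function name is hypothetical, and the condition number is taken here as the square root of the ratio of the largest to the smallest eigenvalue, which is one common convention among several.

```python
import numpy as np

def collinearity_diagnostics(X):
    """Diagnostics computed from the correlation matrix of the predictors."""
    R = np.corrcoef(X, rowvar=False)
    eig = np.linalg.eigvalsh(R)  # eigenvalues of the symmetric matrix R
    return {
        "correlation_matrix": R,
        "vif": np.diag(np.linalg.inv(R)),  # variance inflation factors
        "eigenvalues": eig,
        "condition_number": float(np.sqrt(eig.max() / eig.min())),
        "determinant": float(np.linalg.det(R)),
    }

# Uncorrelated predictors: VIFs of 1 and a determinant of 1.
X_orth = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
d_orth = collinearity_diagnostics(X_orth)

# Nearly collinear predictors: large VIFs and a determinant close to 0.
X_coll = np.column_stack([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 1.9, 3.2, 3.9, 5.1],
])
d_coll = collinearity_diagnostics(X_coll)
print(d_orth["vif"], d_coll["vif"])
```

    VIFs far above 1, a determinant near 0, and a large condition number all point to the same problem from different angles, which is why inspecting several of them together can be useful.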

  17. Multiple regression technique for Pth degree polynomials with and without linear cross products

    NASA Technical Reports Server (NTRS)

    Davis, J. W.

    1973-01-01

    A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products which evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent of error evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique which show the output formats and typical plots comparing computer results to each set of input data.
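
    The two cases, with and without linear cross products, differ only in the columns of the design matrix. The sketch below is a modern Python illustration, not the original computer programs. The function name `poly_design`, the column ordering, and the toy data are assumptions.

```python
import numpy as np
from itertools import combinations

def poly_design(X, degree, cross=False):
    """Columns: intercept, then x_j**p for p = 1..degree, then optional x_i*x_j (i < j)."""
    n, k = X.shape
    cols = [np.ones(n)]
    for p in range(1, degree + 1):
        cols.extend(X[:, j] ** p for j in range(k))
    if cross:
        cols.extend(X[:, i] * X[:, j] for i, j in combinations(range(k), 2))
    return np.column_stack(cols)

# Toy data on a 3x3 grid, generated from a model that includes a cross product.
g = np.array([0.0, 1.0, 2.0])
x1, x2 = [a.ravel() for a in np.meshgrid(g, g)]
X = np.column_stack([x1, x2])
y = 1 + 2 * x1 + 3 * x2**2 + 4 * x1 * x2

A_plain = poly_design(X, 2)              # N = 2 variables, P = 2: 5 columns
A_cross = poly_design(X, 2, cross=True)  # adds the x1*x2 column: 6 columns

coef_plain, *_ = np.linalg.lstsq(A_plain, y, rcond=None)
coef_cross, *_ = np.linalg.lstsq(A_cross, y, rcond=None)
sse_plain = float(np.sum((y - A_plain @ coef_plain) ** 2))
sse_cross = float(np.sum((y - A_cross @ coef_cross) ** 2))
print(coef_cross, sse_plain, sse_cross)
```

    With the cross-product column the fit recovers the generating coefficients in the order intercept, x1, x2, x1², x2², x1·x2. Without it, a nonzero residual sum of squares remains.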

  18. MULTIPLE REGRESSION MODELS FOR HINDCASTING AND FORECASTING MIDSUMMER HYPOXIA IN THE GULF OF MEXICO

    EPA Science Inventory

    A new suite of multiple regression models were developed that describe the relationship between the area of bottom water hypoxia along the northern Gulf of Mexico and Mississippi-Atchafalaya River nitrate concentration, total phosphorus (TP) concentration, and discharge. Variabil...

  19. SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS

    EPA Science Inventory

    As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

  20. Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Beckstead, Jason W.

    2012-01-01

    The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…

  1. Confidence Intervals for an Effect Size Measure in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Algina, James; Keselman, H. J.; Penfield, Randall D.

    2007-01-01

    The increase in the squared multiple correlation coefficient (ΔR²) associated with a variable in a regression equation is a commonly used measure of importance in regression analysis. The coverage probability that an asymptotic and percentile bootstrap confidence interval includes Δρ² was investigated. As expected,…
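
    The quantity itself is the difference between the R² of the full model and the R² of the model with one variable removed. A minimal sketch follows, with hypothetical function names and toy data. It computes the point estimate only, not the confidence intervals studied in the article.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an OLS fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def delta_r_squared(X, y, j):
    """Increase in R-squared from adding column j to the other predictors."""
    return r_squared(X, y) - r_squared(np.delete(X, j, axis=1), y)

# Toy data: y is an exact linear function of x1 and x2, so the full R-squared is 1.
x1 = np.array([1.0, 2, 3, 4, 5])
x2 = np.array([2.0, 1, 4, 3, 6])
X = np.column_stack([x1, x2])
y = x1 + x2

d = delta_r_squared(X, y, 1)  # contribution of x2 over and above x1
print(d)
```

    A percentile bootstrap interval could be sketched on top of this by resampling rows of (X, y) and recomputing `delta_r_squared` for each resample.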

  2. The Importance of Structure Coefficients in Multiple Regression: A Review with Examples from Published Literature.

    ERIC Educational Resources Information Center

    Burdenski, Thomas K., Jr.

    This paper discusses the importance of interpreting both regression coefficients and structure coefficients when analyzing the results of multiple regression analysis, particularly with correlated predictor variables. The concepts of multicollinearity and suppressor effects are introduced, along with examples from the previously published articles…

  3. An improved multiple linear regression and data analysis computer program package

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  4. Noninvasive spectral imaging of skin chromophores based on multiple regression analysis aided by Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

    2011-08-01

    In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with human skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.

  5. Variables Associated with Communicative Participation in People with Multiple Sclerosis: A Regression Analysis

    ERIC Educational Resources Information Center

    Baylor, Carolyn; Yorkston, Kathryn; Bamer, Alyssa; Britton, Deanna; Amtmann, Dagmar

    2010-01-01

    Purpose: To explore variables associated with self-reported communicative participation in a sample (n = 498) of community-dwelling adults with multiple sclerosis (MS). Method: A battery of questionnaires was administered online or on paper per participant preference. Data were analyzed using multiple linear backward stepwise regression. The…

  6. Use of Empirical Estimates of Shrinkage in Multiple Regression: A Caution.

    ERIC Educational Resources Information Center

    Kromrey, Jeffrey D.; Hines, Constance V.

    1995-01-01

    The accuracy of four empirical techniques to estimate shrinkage in multiple regression was studied through Monte Carlo simulation. None of the techniques provided unbiased estimates of the population squared multiple correlation coefficient, but the normalized jackknife and bootstrap techniques demonstrated marginally acceptable performance with…

  7. Estimating R-squared Shrinkage in Multiple Regression: A Comparison of Different Analytical Methods.

    ERIC Educational Resources Information Center

    Yin, Ping; Fan, Xitao

    2001-01-01

    Studied the effectiveness of various analytical formulas for estimating "R" squared shrinkage in multiple regression analysis, focusing on estimators of the squared population multiple correlation coefficient and the squared population cross validity coefficient. Simulation results suggest that the most widely used Wherry (R. Wherry, 1931) formula…
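
    For orientation, the adjustment most often labelled the Wherry formula shrinks the sample R² according to the sample size n and the number of predictors p. The sketch below uses that common textbook form. The literature attaches Wherry's name to more than one variant, so treating this one as the formula evaluated in the study is an assumption, and the function name is hypothetical.

```python
def wherry_adjusted_r2(r2, n, p):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Example: sample R^2 = 0.50 from n = 50 observations and p = 5 predictors.
adj = wherry_adjusted_r2(0.50, 50, 5)
print(adj)
```

    The adjusted value is always at most the sample R², and the gap widens as p grows relative to n.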

  8. Multiple regression models for hindcasting and forecasting midsummer hypoxia in the Gulf of Mexico.

    PubMed

    Greene, Richard M; Lehrter, John C; Hagy, James D

    2009-07-01

    A new suite of multiple regression models was developed that describes relationships between the area of bottom water hypoxia along the northern Gulf of Mexico and Mississippi-Atchafalaya River nitrate concentration, total phosphorus (TP) concentration, and discharge. Model input variables were derived from two load estimation methods, the adjusted maximum likelihood estimation (AMLE) and the composite (COMP) method, developed by the U.S. Geological Survey. Variability in midsummer hypoxic area was described by models that incorporated May discharge, May nitrate, and February TP concentrations or their spring (discharge and nitrate) and winter (TP) averages. The regression models predicted the observed hypoxic area within ±30%, yet model residuals showed an increasing trend with time. An additional model variable, Epoch, which allowed post-1993 observations to have a different intercept than earlier observations, suggested that hypoxic area has been 6450 km² greater per unit discharge and nutrients since 1993. Model forecasts predicted that a dual 45% reduction in nitrate and TP concentration would likely reduce hypoxic area to approximately 5000 km², the coastal goal established by the Mississippi River/Gulf of Mexico Watershed Nutrient Task Force. However, the COMP load estimation method, which is more accurate than the AMLE method, resulted in a smaller predicted hypoxia response to any given nutrient reduction than models based on the AMLE method. Monte Carlo simulations predicted that five years after an instantaneous 50% nitrate reduction or dual 45% nitrate and TP reduction it would be possible to resolve a significant reduction in hypoxic area. However, if nutrient reduction targets were achieved gradually (e.g., over 10 years), much more than a decade would be required before a significant downward trend in both nutrient concentrations and hypoxic area could be resolved against the large background of interannual variability. The multiple regression

  9. Modeling HTL of industrial workers using multiple regression and path analytic techniques.

    PubMed

    Smith, C R; Seitz, M R; Borton, T E; Kleinstein, R N; Wilmoth, J N

    1984-04-01

    This study compared path analytic with multiple regression analyses of hearing threshold levels (HTLs) on 258 adult textile workers evenly divided into low- and high-noise exposure groups. Demographic variables common in HTL studies were examined, with the addition of iris color, as well as selected two-way interactions. Variables of interest were similarly distributed in both groups. The results indicated that (1) different statistical procedures can lead to different conclusions even with the same HTL data for the same Ss; (2) conflicting conclusions may be artifacts of the analytic methodologies employed for data analysis; (3) a well-formulated theory under which path analytic techniques are employed may clarify somewhat the way a variable affects HTL values through its correlational connections with other antecedent variables included in the theoretical model; (4) multicollinearity among independent variables on which HTL is regressed usually presents a problem in unraveling exactly how each variable influences noise-induced hearing loss; and (5) because of the contradictory nature of its direct and indirect effects on HTL, iris color provides little, if any, explanatory assistance for modeling HTL.

  10. Tools to support interpreting multiple regression in the face of multicollinearity.

    PubMed

    Kraha, Amanda; Turner, Heather; Nimon, Kim; Zientek, Linda Reichwein; Henson, Robin K

    2012-01-01

    While multicollinearity may increase the difficulty of interpreting multiple regression (MR) results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret MR effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret MR effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses.
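
    Two of the indices named above, beta weights and structure coefficients, can be computed side by side. This is a minimal sketch with a hypothetical function name and toy data. It does not cover commonality, dominance, or relative importance weights, and it is not the software the article identifies.

```python
import numpy as np

def beta_and_structure(X, y):
    """Standardized regression weights and structure coefficients r(x_j, y_hat)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardized predictors
    zy = (y - y.mean()) / y.std(ddof=1)               # standardized criterion
    beta, *_ = np.linalg.lstsq(Z, zy, rcond=None)     # beta weights
    y_hat = Z @ beta
    structure = np.array(
        [np.corrcoef(Z[:, j], y_hat)[0, 1] for j in range(Z.shape[1])]
    )
    return beta, structure

# Toy data with two correlated predictors.
X = np.column_stack([
    [1.0, 2, 3, 4, 5, 6],
    [2.0, 1, 4, 3, 6, 5],
])
y = np.array([3.0, 2, 8, 6, 12, 9])

beta, structure = beta_and_structure(X, y)
print("beta weights:", beta)
print("structure coefficients:", structure)
```

    A structure coefficient equals the predictor's zero-order correlation with y divided by the multiple correlation R. It is therefore unaffected by how credit is split among correlated predictors, which is why it can be read alongside the beta weights.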

  11. Modeling of retardance in ferrofluid with Taguchi-based multiple regression analysis

    NASA Astrophysics Data System (ADS)

    Lin, Jing-Fung; Wu, Jyh-Shyang; Sheu, Jer-Jia

    2015-03-01

    Citric acid (CA) coated Fe3O4 ferrofluids are prepared by a co-precipitation method, and their magneto-optical retardance is measured with a Stokes polarimeter. Optimization and multiple regression of the retardance are carried out by combining the Taguchi method and Excel. From nine tests on four parameters (pH of the suspension, molar ratio of CA to Fe3O4, volume of CA, and coating temperature), the order of influence of the parameters and the optimal parameter combination are found. Multiple regression analysis and an F-test of the significance of the regression equation are performed. The model F value is much larger than Fcritical, with significance level P < 0.0001, so it can be concluded that the regression model has statistically significant predictive ability. Substituting the optimal parameter combination into the equation gives a retardance of 32.703°, which is 11.4% higher than the highest value in the tests.

  13. Estimate the contribution of incubation parameters influence egg hatchability using multiple linear regression analysis

    PubMed Central

    Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma

    2016-01-01

    Aim: This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666

  14. An automatic method for producing robust regression models from hyperspectral data using multiple simple genetic algorithms

    NASA Astrophysics Data System (ADS)

    Sykas, Dimitris; Karathanassi, Vassilia

    2015-06-01

    This paper presents a new method for automatically determining the optimum regression model, which enables the estimation of a parameter. The concept rests on the combination of k spectral pre-processing algorithms (SPPAs) that enhance spectral features correlated to the desired parameter. Initially a pre-processing algorithm uses as input a single spectral signature and transforms it according to the SPPA function. A k-step combination of SPPAs uses k pre-processing algorithms serially. The result of each SPPA is used as input to the next SPPA, and so on until the k desired pre-processed signatures are reached. These signatures are then used as input to three different regression methods: the Normalized band Difference Regression (NDR), the Multiple Linear Regression (MLR) and the Partial Least Squares Regression (PLSR). Three Simple Genetic Algorithms (SGAs) are used, one for each regression method, for the selection of the optimum combination of k SPPAs. The performance of the SGAs is evaluated based on the RMS error of the regression models. The evaluation indicates not only the optimum SPPA combination but also the regression method that produces the optimum prediction model. The proposed method was applied to soil spectral measurements in order to predict Soil Organic Matter (SOM). In this study, the maximum value assigned to k was 3. PLSR yielded the highest accuracy, while NDR's accuracy was satisfactory relative to its complexity. The MLR method showed severe drawbacks due to noise in the form of collinearity among the spectral bands. Most of the regression methods required a 3-step combination of SPPAs to achieve the highest performance. The selected pre-processing algorithms differed for each regression method, since each method handles the explanatory variables in a different way.

  15. Curvilinear Relationships in Special Education Research: How Multiple Regression Analysis Can Be Used To Investigate Nonlinear Effects.

    ERIC Educational Resources Information Center

    Barringer, Mary S.

    Researchers are becoming increasingly aware of the advantages of using multiple regression as opposed to analysis of variance (ANOVA) or analysis of covariance (ANCOVA). Multiple regression is more versatile and does not force the researcher to throw away variance by categorizing intervally scaled data. Polynomial regression analysis offers the…

  16. Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

    PubMed Central

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

    2013-01-01

    Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839

  17. INTRODUCTION TO A COMBINED MULTIPLE LINEAR REGRESSION AND ARMA MODELING APPROACH FOR BEACH BACTERIA PREDICTION

    EPA Science Inventory

    Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...

  18. A Modified Gauss-Jordan Procedure as an Alternative to Iterative Procedures in Multiple Regression.

    ERIC Educational Resources Information Center

    Roscoe, John T.; Kittleson, Howard M.

    Correlation matrices involving linear dependencies are common in educational research. In such matrices, there is no unique solution for the multiple regression coefficients. Although computer programs using iterative techniques are used to overcome this problem, these techniques possess certain disadvantages. Accordingly, a modified Gauss-Jordan…

  19. Using Robust Variance Estimation to Combine Multiple Regression Estimates with Meta-Analysis

    ERIC Educational Resources Information Center

    Williams, Ryan

    2013-01-01

    The purpose of this study was to explore the use of robust variance estimation for combining commonly specified multiple regression models and for combining sample-dependent focal slope estimates from diversely specified models. The proposed estimator obviates traditionally required information about the covariance structure of the dependent…

  20. Multiple Regression Analysis of Factors that May Influence Middle School Science Scores

    ERIC Educational Resources Information Center

    Glover, Judith

    2012-01-01

    The purpose of this quantitative multiple regression study was to determine whether a relationship existed between Maryland State Assessment (MSA) reading scores, MSA math scores, gender, ethnicity, age, and MSA science scores. Also examined was if MSA reading scores, MSA math scores, gender, ethnicity, and age can be used in combination or alone…

  1. What Is Wrong with ANOVA and Multiple Regression? Analyzing Sentence Reading Times with Hierarchical Linear Models

    ERIC Educational Resources Information Center

    Richter, Tobias

    2006-01-01

    Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…

  2. A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants

    ERIC Educational Resources Information Center

    Cooper, Paul D.

    2010-01-01

    A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…

  3. Double Cross-Validation in Multiple Regression: A Method of Estimating the Stability of Results.

    ERIC Educational Resources Information Center

    Rowell, R. Kevin

    In multiple regression analysis, where resulting predictive equation effectiveness is subject to shrinkage, it is especially important to evaluate result replicability. Double cross-validation is an empirical method by which an estimate of invariance or stability can be obtained from research data. A procedure for double cross-validation is…
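    The double cross-validation idea summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the record itself: the sample is split at random into two halves, a regression equation is fitted in each half and applied to the other, and shrinkage is taken as the drop in squared correlation between the fitting half and the held-out half. The helper names (`solve`, `ols`, `double_cross_validate`) and the synthetic data are hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    Z = [[1.0] + list(row) for row in X]
    k = len(Z[0])
    xtx = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    xty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, X):
    """Apply a fitted equation (intercept first) to each row of X."""
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in X]

def r_squared(y, yhat):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(y)
    my, mh = sum(y) / n, sum(yhat) / n
    sxy = sum((a - my) * (b - mh) for a, b in zip(y, yhat))
    sxx = sum((a - my) ** 2 for a in y)
    syy = sum((b - mh) ** 2 for b in yhat)
    return sxy * sxy / (sxx * syy)

def double_cross_validate(X, y, seed=0):
    """Fit in each random half, validate in the other, and report shrinkage."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    halves = [idx[:half], idx[half:]]
    out = []
    for fit_ids, test_ids in (halves, halves[::-1]):
        Xf, yf = [X[i] for i in fit_ids], [y[i] for i in fit_ids]
        Xt, yt = [X[i] for i in test_ids], [y[i] for i in test_ids]
        beta = ols(Xf, yf)
        r2_fit = r_squared(yf, predict(beta, Xf))
        r2_cross = r_squared(yt, predict(beta, Xt))
        out.append({"r2_fit": r2_fit, "r2_cross": r2_cross,
                    "shrinkage": r2_fit - r2_cross})
    return out

# Synthetic data (hypothetical): two predictors plus Gaussian noise.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(80)]
y = [2.0 + 1.5 * a - 0.7 * b + rng.gauss(0, 1) for a, b in X]
for result in double_cross_validate(X, y):
    print(result)
```

    Similar cross-validated values in both directions, with small shrinkage, would suggest a stable equation; a large gap in either direction suggests the equation does not replicate.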

  4. Assessing the Impact of Influential Observations on Multiple Regression Analysis on Human Resource Research.

    ERIC Educational Resources Information Center

    Bates, Reid A.; Holton, Elwood F., III; Burnett, Michael F.

    1999-01-01

    A case study of learning transfer demonstrates the possible effect of influential observation on linear regression analysis. A diagnostic method that tests for violation of assumptions, multicollinearity, and individual and multiple influential observations helps determine which observation to delete to eliminate bias. (SK)

  5. A Spreadsheet Tool for Learning the Multiple Regression F-Test, T-Tests, and Multicollinearity

    ERIC Educational Resources Information Center

    Martin, David

    2008-01-01

    This note presents a spreadsheet tool that allows teachers the opportunity to guide students towards answering on their own questions related to the multiple regression F-test, the t-tests, and multicollinearity. The note demonstrates approaches for using the spreadsheet that might be appropriate for three different levels of statistics classes,…

  6. Predicting Final GPA of Graduate School Students: Comparing Artificial Neural Networking and Simultaneous Multiple Regression

    ERIC Educational Resources Information Center

    Anderson, Joan L.

    2006-01-01

    Data from graduate student applications at a large Western university were used to determine which factors were the best predictors of success in graduate school, as defined by cumulative graduate grade point average. Two statistical models were employed and compared: artificial neural networking and simultaneous multiple regression. Both models…

  7. High-Dose Vitamin C Promotes Regression of Multiple Pulmonary Metastases Originating from Hepatocellular Carcinoma

    PubMed Central

    Seo, Min-Seok; Kim, Ja-Kyung

    2015-01-01

    We report a case of regression of multiple pulmonary metastases, which originated from hepatocellular carcinoma after treatment with intravenous administration of high-dose vitamin C. A 74-year-old woman presented to the clinic for her cancer-related symptoms such as general weakness and anorexia. After undergoing initial transarterial chemoembolization (TACE), local recurrence with multiple pulmonary metastases was found. She refused further conventional therapy, including sorafenib tosylate (Nexavar). She did receive high doses of vitamin C (70 g), which were administered into a peripheral vein twice a week for 10 months, and multiple pulmonary metastases were observed to have completely regressed. She then underwent subsequent TACE, resulting in remission of her primary hepatocellular carcinoma. PMID:26256994

  8. Estimation of radiation risk in presence of classical additive and Berkson multiplicative errors in exposure doses.

    PubMed

    Masiuk, S V; Shklyar, S V; Kukush, A G; Carroll, R J; Kovgan, L N; Likhtarov, I A

    2016-07-01

    In this paper, the influence of measurement errors in exposure doses in a regression model with binary response is studied. Recently, it has been recognized that uncertainty in exposure dose is characterized by errors of two types: classical additive errors and Berkson multiplicative errors. The combination of classical additive and Berkson multiplicative errors has not been considered in the literature previously. In a simulation study based on data from radio-epidemiological research of thyroid cancer in Ukraine caused by the Chornobyl accident, it is shown that ignoring measurement errors in doses leads to overestimation of background prevalence and underestimation of excess relative risk. In the work, several methods to reduce these biases are proposed. They are new regression calibration, an additive version of efficient SIMEX, and novel corrected score methods.

  9. Locating multiple interacting quantitative trait Loci with the zero-inflated generalized poisson regression.

    PubMed

    Erhardt, Vinzenz; Bogdan, Malgorzata; Czado, Claudia

    2010-01-01

    We consider the problem of locating multiple interacting quantitative trait loci (QTL) influencing traits measured in counts. In many applications the distribution of the count variable has a spike at zero. Zero-inflated generalized Poisson regression (ZIGPR) allows for an additional probability mass at zero and hence an improvement in the detection of significant loci. Classical model selection criteria often overestimate the QTL number. Therefore, modified versions of the Bayesian Information Criterion (mBIC and EBIC) were successfully used for QTL mapping. We apply these criteria based on ZIGPR as well as simpler models. An extensive simulation study shows their good power detecting QTL while controlling the false discovery rate. We illustrate how the inability of the Poisson distribution to account for over-dispersion leads to an overestimation of the QTL number and hence strongly discourages its application for identifying factors influencing count data. The proposed method is used to analyze the mice gallstone data of Lyons et al. (2003). Our results suggest the existence of a novel QTL on chromosome 4 interacting with another QTL previously identified on chromosome 5. We provide the corresponding code in R.

  10. Clinical evaluation of the temporomandibular joint following orthognathic surgery--multiple logistic regression analysis.

    PubMed

    Aoyama, Shigeru; Kino, Koji; Kobayashi, Jyunji; Yoshimasu, Hidemi; Amagasa, Teruo

    2005-06-01

    This study compares temporomandibular joint dysfunction (TMD) symptoms before and after bilateral sagittal split ramus osteotomy, and identifies predictive factors for the postoperative TMD symptoms by assessing the adjusted odds ratio using multiple logistic regression analysis. A consecutive series of 37 cases treated only with bilateral sagittal split ramus osteotomy were evaluated. New postoperative TMD symptoms appeared in 9 cases, preoperative TMD symptoms disappeared in 6 cases, and TMD symptoms were unchanged in 5 cases. The median period until the interincisal opening range attained 40 mm was 5 months (range, from 2 to 15 months). Age was a positive factor in patients with postoperative TMD symptoms, with an odds ratio of 1.43 (95 percent confidence interval, from 1.05 to 1.93). In addition, the maximum value of the bilateral setback distance of more than 9 mm was a positive factor of 6.95 (95 percent confidence interval, from 1.06 to 45.42). We concluded that surgical correction in skeletal malocclusion may affect temporomandibular joint dysfunction symptoms. PMID:16187616
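    The odds ratios and confidence intervals quoted above are related to the underlying logistic regression coefficients by a simple transformation. The sketch below assumes a Wald-type interval (coefficient plus or minus 1.96 standard errors on the log-odds scale), which the record does not state explicitly; the function names are hypothetical. It backs out the coefficient and standard error implied by the reported age effect and rebuilds the interval from them.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a logit coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def implied_beta_se(odds_ratio, lower, upper, z=1.96):
    """Coefficient and standard error implied by a reported odds ratio and its interval.

    Assumes the interval is symmetric on the log-odds scale (Wald interval).
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Reported age effect: odds ratio 1.43, 95 percent interval 1.05 to 1.93.
beta, se = implied_beta_se(1.43, 1.05, 1.93)
print(beta, se)
print(odds_ratio_ci(beta, se))
```

    Because published limits are rounded, the rebuilt interval agrees with the reported one only approximately.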

  11. [Multiple dependent variables LS-SVM regression algorithm and its application in NIR spectral quantitative analysis].

    PubMed

    An, Xin; Xu, Shuo; Zhang, Lu-Da; Su, Shi-Guang

    2009-01-01

    In the present paper, on the basis of LS-SVM algorithm, we built a multiple dependent variables LS-SVM (MLS-SVM) regression model whose weights can be optimized, and gave the corresponding algorithm. Furthermore, we theoretically explained the relationship between MLS-SVM and LS-SVM. Sixty four broomcorn samples were taken as experimental material, and the sample ratio of modeling set to predicting set was 51 : 13. We first selected randomly and uniformly five weight groups in the interval [0, 1], and then in the way of leave-one-out (LOO) rule determined one appropriate weight group and parameters including penalizing parameters and kernel parameters in the model according to the criterion of the minimum of average relative error. Then a multiple dependent variables quantitative analysis model was built with NIR spectrum and simultaneously analyzed three chemical constituents containing protein, lysine and starch. Finally, the average relative errors between actual values and predicted ones by the model of three components for the predicting set were 1.65%, 6.47% and 1.37%, respectively, and the correlation coefficients were 0.9940, 0.8392 and 0.8825, respectively. For comparison, LS-SVM was also utilized, for which the average relative errors were 1.68%, 6.25% and 1.47%, respectively, and the correlation coefficients were 0.9941, 0.8310 and 0.8800, respectively. It is obvious that MLS-SVM algorithm is comparable to LS-SVM algorithm in modeling analysis performance, and both of them can give satisfying results. The result shows that the model with MLS-SVM algorithm is capable of doing multi-components NIR quantitative analysis synchronously. Thus MLS-SVM algorithm offers a new multiple dependent variables quantitative analysis approach for chemometrics. In addition, the weights have certain effect on the prediction performance of the model with MLS-SVM, which is consistent with our intuition and is validated in this study. Therefore, it is necessary to optimize

  12. Using Regression Equations Built from Summary Data in the Psychological Assessment of the Individual Case: Extension to Multiple Regression

    ERIC Educational Resources Information Center

    Crawford, John R.; Garthwaite, Paul H.; Denham, Annie K.; Chelune, Gordon J.

    2012-01-01

    Regression equations have many useful roles in psychological assessment. Moreover, there is a large reservoir of published data that could be used to build regression equations; these equations could then be employed to test a wide variety of hypotheses concerning the functioning of individual cases. This resource is currently underused because…

  13. A Josephson systolic array processor for multiplication/addition operations

    SciTech Connect

    Morisue, M.; Li, F.Q.; Tobita, M.; Kaneko, S. )

    1991-03-01

    A novel Josephson systolic array processor to perform multiplication/addition operations is proposed. The systolic array processor proposed here consists of a set of three kinds of interconnected cells whose main circuits are built from SQUID gates. A multiplication of 2 bits by 2 bits is performed in a single cell at a time, and an addition of three 2-bit data items is simultaneously performed in another type of cell. Furthermore, information in this system flows between cells in a pipeline fashion so that high performance can be achieved. In this paper the principle of the Josephson systolic array processor is described in detail, and simulation results are illustrated for the multiplication/addition of (4 bits × 4 bits + 8 bits). The results show that these operations can be executed in 330 ps.
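    The arithmetic behind a (4 bits × 4 bits + 8 bits) operation built from 2-bit by 2-bit multiplications can be illustrated in software. The sketch below shows only the decomposition into 2-bit partial products and their weighted sum; it does not model the cell layout, pipelining, or timing of the processor in the record, and the function name is hypothetical.

```python
def mac_4x4_plus_8(a, b, c):
    """Compute a*b + c for 4-bit a, b and 8-bit c from 2-bit x 2-bit partial products."""
    assert 0 <= a < 16 and 0 <= b < 16 and 0 <= c < 256
    a_digits = [a & 0b11, a >> 2]  # low and high 2-bit digits of a
    b_digits = [b & 0b11, b >> 2]  # low and high 2-bit digits of b
    total = c
    for i, ad in enumerate(a_digits):
        for j, bd in enumerate(b_digits):
            # Each 2x2-bit product carries weight 4**(i + j), i.e. a shift by 2*(i + j) bits.
            total += (ad * bd) << (2 * (i + j))
    return total

print(mac_4x4_plus_8(13, 11, 200))
```

    Four 2-bit products are enough because each 4-bit operand splits into exactly two base-4 digits.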

  14. Multiple Regression Model Based Sequential Probability Ratio Test for Structural Change Detection of Time Series

    NASA Astrophysics Data System (ADS)

    Takeda, Katsunori; Hattori, Tetsuo; Kawano, Hiromichi

    In real-time analysis and forecasting of time series data, it is important to detect structural change as quickly, correctly, and simply as possible, so that the next prediction model can be rebuilt as soon as possible after the change point. For this kind of time series analysis, multiple linear regression models are generally used. In this paper, we present two methods, the Sequential Probability Ratio Test (SPRT) and the Chow test, which is well known in economics, and describe experimental evaluations of their effectiveness in change detection using multiple regression models. Moreover, we extend the definition of the detected change point in the SPRT method, and show the improvement in change detection accuracy.
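    The Chow test mentioned above compares the residual sum of squares of one regression fitted to the whole series with the combined residual sum of squares of separate regressions fitted before and after a candidate change point. The sketch below is a minimal illustration using a single-predictor linear trend (k = 2 parameters) for brevity rather than a full multiple regression; the function names and the synthetic series are hypothetical.

```python
def rss_line(t, y):
    """Residual sum of squares of the least-squares line y = a + b*t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    stt = sum((ti - mt) ** 2 for ti in t)
    b = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y)) / stt
    a = my - b * mt
    return sum((yi - a - b * ti) ** 2 for ti, yi in zip(t, y))

def chow_f(t, y, split, k=2):
    """Chow F statistic for a structural break at index `split`.

    F = ((RSS_pooled - RSS_split) / k) / (RSS_split / (n - 2k)),
    where RSS_split is the sum of the two sub-sample residual sums of squares.
    """
    rss_pooled = rss_line(t, y)
    rss_split = rss_line(t[:split], y[:split]) + rss_line(t[split:], y[split:])
    n = len(t)
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))

# Synthetic series (hypothetical): the slope changes from 0.5 to 2.0 at t = 20.
t = list(range(40))
noise = [0.1 * ((i * 7) % 5 - 2) for i in t]
y_break = [(1 + 0.5 * ti if ti < 20 else 11 + 2.0 * (ti - 20)) + e
           for ti, e in zip(t, noise)]
y_stable = [1 + 0.5 * ti + e for ti, e in zip(t, noise)]
print(chow_f(t, y_break, 20), chow_f(t, y_stable, 20))
```

    Under the usual assumptions the statistic is compared with an F distribution with k and n - 2k degrees of freedom; a large value points to a change in the regression coefficients at the split.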

  15. Optimization of fixture layouts of glass laser optics using multiple kernel regression.

    PubMed

    Su, Jianhua; Cao, Enhua; Qiao, Hong

    2014-05-10

    We aim to build an integrated fixturing model to describe the structural properties and thermal properties of the support frame of glass laser optics. Therefore, (a) a near global optimal set of clamps can be computed to minimize the surface shape error of the glass laser optic based on the proposed model, and (b) a desired surface shape error can be obtained by adjusting the clamping forces under various environmental temperatures based on the model. To construct the model, we develop a new multiple kernel learning method and call it multiple kernel support vector functional regression. The proposed method uses two layer regressions to group and order the data sources by the weights of the kernels and the factors of the layers. Because of that, the influences of the clamps and the temperature can be evaluated by grouping them into different layers. PMID:24922017

  16. User's Guide to the Weighted-Multiple-Linear Regression Program (WREG version 1.0)

    USGS Publications Warehouse

    Eng, Ken; Chen, Yin-Yu; Kiang, Julie E.

    2009-01-01

    Streamflow is not measured at every location in a stream network. Yet hydrologists, State and local agencies, and the general public still seek to know streamflow characteristics, such as mean annual flow or flood flows with different exceedance probabilities, at ungaged basins. The goals of this guide are to introduce and familiarize the user with the weighted multiple-linear regression (WREG) program, and to also provide the theoretical background for program features. The program is intended to be used to develop a regional estimation equation for streamflow characteristics that can be applied at an ungaged basin, or to improve the corresponding estimate at continuous-record streamflow gages with short records. The regional estimation equation results from a multiple-linear regression that relates the observable basin characteristics, such as drainage area, to streamflow characteristics.

  17. Multiple-regression equations for estimating low flows at ungaged stream sites in Ohio

    USGS Publications Warehouse

    Koltun, G.F.; Schwartz, R.R.

    1987-01-01

    This report presents multiple-regression equations for estimating selected low-flow characteristics for most unregulated Ohio streams at sites where little or no discharge data are available. The equations relate combinations of drainage area, main-channel length, main-channel slope, average basin elevation, forested area, average annual precipitation, and an index of infiltration to low flows with durations of 7 and 30 days and average recurrence intervals of 2 and 10 years. Data from 132 long-term continuous-record gaging stations and partial-record sites in Ohio were used in the analyses. Multiple-regression analyses were first performed by using data from all 132 sites in an attempt to develop equations that would be applicable statewide. Standard errors for the statewide equations were too high (111 to 189 percent) for them to be of practical use in estimating low streamflows. Data for the state were then subdivided into five regions, and multiple-regression equations were developed for each region. Standard errors for four of the five regions improved, and ranged from 43 to 106 percent. Standard errors for region 5 remained high (74 to 129 percent). The multiple-regression equations presented in this report are not applicable to streams with significant low-flow regulation. The equations also are not applicable if (1) the site has been gaged and low-flow estimates have been developed from gaging-station records, (2) low flow can be estimated by the drainage-area transference method from data for a nearby gaged site, or (3) a sufficient number of partial-record measurements made at the site can be adequately correlated with concurrent base flows at a suitable index station.

  18. Cost-Sensitive Boosting: Fitting an Additive Asymmetric Logistic Regression Model

    NASA Astrophysics Data System (ADS)

    Li, Qiu-Jie; Mao, Yao-Bin; Wang, Zhi-Quan; Xiang, Wen-Bo

    Conventional machine learning algorithms like boosting tend to treat misclassification errors equally, which is not adequate for certain cost-sensitive classification problems such as object detection. Although many cost-sensitive extensions of boosting that directly modify the weighting strategy of the corresponding original algorithms have been proposed and reported, they are heuristic in nature, shown to be effective only by empirical results, and lack sound theoretical analysis. This paper develops a framework from a statistical insight that can embody almost all existing cost-sensitive boosting algorithms: fitting an additive asymmetric logistic regression model by stage-wise optimization of certain criteria. Four cost-sensitive versions of boosting algorithms are derived, namely CSDA, CSRA, CSGA and CSLB, which respectively correspond to Discrete AdaBoost, Real AdaBoost, Gentle AdaBoost and LogitBoost. Experimental results on the application of face detection have shown the effectiveness of the proposed learning framework in the reduction of the cumulative misclassification cost.

  19. Testing Nested Additive, Multiplicative, and General Multitrait-Multimethod Models.

    ERIC Educational Resources Information Center

    Coenders, Germa; Saris, Willem E.

    2000-01-01

    Provides alternatives to the definitions of additive and multiplicative method effects in multitrait-multimethod data given by D. Campbell and E. O'Connell (1967). The alternative definitions can be formulated by means of constraints in the parameters of the correlated uniqueness model (H. Marsh, 1989). (SLD)

  20. Multiplicative random regression model for heterogeneous variance adjustment in genetic evaluation for milk yield in Simmental.

    PubMed

    Lidauer, M H; Emmerling, R; Mäntysaari, E A

    2008-06-01

    A multiplicative random regression (M-RRM) test-day (TD) model was used to analyse daily milk yields from all available parities of German and Austrian Simmental dairy cattle. The method to account for heterogeneous variance (HV) was based on the multiplicative mixed model approach of Meuwissen. The variance model for the heterogeneity parameters included a fixed region x year x month x parity effect and a random herd x test-month effect with a within-herd first-order autocorrelation between test-months. Acceleration of variance model solutions after each multiplicative model cycle enabled fast convergence of adjustment factors and reduced total computing time significantly. Maximum Likelihood estimation of within-strata residual variances was enhanced by inclusion of approximated information on loss in degrees of freedom due to estimation of location parameters. This improved heterogeneity estimates for very small herds. The multiplicative model was compared with a model that assumed homogeneous variance. Re-estimated genetic variances, based on Mendelian sampling deviations, were homogeneous for the M-RRM TD model but heterogeneous for the homogeneous random regression TD model. Accounting for HV had a large effect on cow ranking but a moderate effect on bull ranking.

  1. Watershed Regressions for Pesticides (WARP) models for predicting stream concentrations of multiple pesticides

    USGS Publications Warehouse

    Stone, Wesley W.; Crawford, Charles G.; Gilliom, Robert J.

    2013-01-01

    Watershed Regressions for Pesticides for multiple pesticides (WARP-MP) are statistical models developed to predict concentration statistics for a wide range of pesticides in unmonitored streams. The WARP-MP models use the national atrazine WARP models in conjunction with an adjustment factor for each additional pesticide. The WARP-MP models perform best for pesticides with application timing and methods similar to those used with atrazine. For other pesticides, WARP-MP models tend to overpredict concentration statistics for the model development sites. For WARP and WARP-MP, the less-than-ideal sampling frequency for the model development sites leads to underestimation of the shorter-duration concentration; hence, the WARP models tend to underpredict 4- and 21-d maximum moving-average concentrations, with median errors ranging from 9 to 38%. As a result of this sampling bias, pesticides that performed well with the model development sites are expected to have predictions that are biased low for these shorter-duration concentration statistics. The overprediction by WARP-MP apparent for some of the pesticides is variably offset by underestimation of the model development concentration statistics. Of the 112 pesticides used in the WARP-MP application to stream segments nationwide, 25 were predicted to have concentration statistics with a 50% or greater probability of exceeding one or more aquatic life benchmarks in one or more stream segments. Geographically, many of the modeled streams in the Corn Belt Region were predicted to have one or more pesticides that exceeded an aquatic life benchmark during 2009, indicating the potential vulnerability of streams in this region.

  2. Multiple trait model combining random regressions for daily feed intake with single measured performance traits of growing pigs

    PubMed Central

    Schnyder, Urs; Hofer, Andreas; Labroue, Florence; Künzi, Niklaus

    2002-01-01

    A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1 449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial in days on test was used as a regression function for weekly means of daily feed intake and to describe its residual variance. The same fixed (batch) and random (additive genetic, pen and individual permanent environmental) effects were used for regression coefficients of feed intake and single measured traits. Variance components were estimated by means of a Bayesian analysis using Gibbs sampling. Four Gibbs chains were run for 550 000 rounds each, from which 50 000 rounds were discarded from the burn-in period. Estimates of posterior means of covariance matrices were calculated from the remaining two million samples. Low heritabilities of linear and quadratic regression coefficients and their unfavourable genetic correlations with other performance traits reveal that altering the shape of the feed intake curve by direct or indirect selection is difficult. PMID:11929625

  3. Multiple trait model combining random regressions for daily feed intake with single measured performance traits of growing pigs.

    PubMed

    Schnyder, Urs; Hofer, Andreas; Labroue, Florence; Künzi, Niklaus

    2002-01-01

    A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial in days on test was used as a regression function for weekly means of daily feed intake and to describe its residual variance. The same fixed (batch) and random (additive genetic, pen and individual permanent environmental) effects were used for regression coefficients of feed intake and single measured traits. Variance components were estimated by means of a Bayesian analysis using Gibbs sampling. Four Gibbs chains were run for 550000 rounds each, from which 50000 rounds were discarded from the burn-in period. Estimates of posterior means of covariance matrices were calculated from the remaining two million samples. Low heritabilities of linear and quadratic regression coefficients and their unfavourable genetic correlations with other performance traits reveal that altering the shape of the feed intake curve by direct or indirect selection is difficult.

  4. Multiple regression analyses in artificial-grammar learning: the importance of control groups.

    PubMed

    Lotz, Anja; Kinder, Annette; Lachnit, Harald

    2009-03-01

    In artificial-grammar learning, it is crucial to ensure that above-chance performance in the test stage is due to learning in the training stage and not to judgemental biases. Here we argue that multiple regression analysis can be successfully combined with the use of control groups to assess whether participants were able to transfer knowledge acquired during training when making judgements about test stimuli. We compared the regression weights of judgements in a transfer condition (training and test strings were constructed by the same grammar but with different letters) with those in a control condition. Predictors were identical in both conditions; judgements of control participants were treated as if they were based on knowledge gained in a standard training stage. The results of this experiment as well as reanalyses of a former study support the usefulness of our approach.

  5. Kinetics of tumor growth and regression in IgG multiple myeloma

    PubMed Central

    Sullivan, Peter W.; Salmon, Sydney E.

    1972-01-01

    Studies of immunoglobulin synthesis, total body tumor cell number, and tumor kinetics were carried out in a series of patients with IgG multiple myeloma. The changes in tumor size associated with tumor growth or with regression were underestimated when the concentration of serum M-component was used as the sole index of tumor mass. Calculation of the total body M-component synthetic rate (corrected for concentration-dependent changes in IgG metabolism) and tumor cell number gave a more accurate and predictable estimate of changes in tumor size. Tumor growth and drug-induced tumor regression were found to follow Gompertzian kinetics, with progressive retardation of the rate of change of tumor size in both of these circumstances. This retardation effect, describable with a constant α, may be caused by a shift in the proportion of tumor cells in the proliferative cycle. Drug sensitivity of the tumor could be described quantitatively with a calculation of BO, the tumor's initial sensitivity to a given drug regimen. Of particular clinical significance, the magnitude of a given patient's tumor regression could be predicted from the ratio of BO to α. Mathematical proof was obtained that the retardation constant determined during tumor regression also applied to the earlier period of tumor growth, and this constant was used to reconstruct the preclinical history of disease. In the average patient, fewer than 5 yr elapse from the initial tumor cell doubling to its clinical presentation with from 10^11 to more than 10^12 myeloma cells in the body. The reduction in total body tumor mass in most patients responding to therapy ranges from less than one to almost two orders of magnitude. Application of predictive kinetic analysis to the design of sequential drug regimens may lead to further improvement in the treatment of multiple myeloma and other tumors with similar growth characteristics. PMID:5040867
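
    The Gompertzian kinetics mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' equation, so the standard Gompertz form N(t) = N0 * exp((A0/alpha) * (1 - exp(-alpha*t))) is assumed here, with alpha playing the role of the retardation constant. The names `n0` and `a0` and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def gompertz_cells(t, n0, a0, alpha):
    """Tumor cell number at time t under the standard Gompertz form.

    n0    -- cell number at t = 0 (hypothetical in the demo below)
    a0    -- initial specific growth rate (assumed parameter name)
    alpha -- retardation constant: how quickly the growth rate decays
    """
    return n0 * math.exp((a0 / alpha) * (1.0 - math.exp(-alpha * t)))


def gompertz_plateau(n0, a0, alpha):
    """Limiting cell number as t grows without bound."""
    return n0 * math.exp(a0 / alpha)


if __name__ == "__main__":
    # Illustrative numbers only; they are not estimates from the study.
    for t in (0, 1, 2, 5, 10):
        print(t, f"{gompertz_cells(t, 1.0, 10.0, 0.4):.3e}")
```

    Growth is fastest at the start and slows progressively as the curve approaches the plateau, which is the retardation effect the abstract describes.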

  6. Hierarchical Multiple Regression Modelling on Predictors of Behavior and Sexual Practices at Takoradi Polytechnic, Ghana

    PubMed Central

    Turkson, Anthony Joe; Otchey, James Eric

    2015-01-01

    Introduction: Various psychosocial studies on health-related lifestyles emphasize that perceiving oneself as being at risk of HIV/AIDS infection is a necessary condition for adopting preventive behaviors. Hierarchical multiple regression models were used to examine the relationship between eight independent variables and one dependent variable to isolate predictors which have significant influence on behavior and sexual practices. Methods: A cross-sectional design was used for the study. A structured, closed-ended, interviewer-administered questionnaire was used to collect primary data. A multistage stratified technique was used to sample views from 380 students from Takoradi Polytechnic, Ghana. A hierarchical multiple regression model was used to ascertain the significance of certain predictors of sexual behavior and practices. Results: The variables extracted from the multiple regression were: the constant (β=14.202, t=2.279, p=0.023; significant); marital status (β=0.092, t=1.996, p<0.05; significant); knowledge on AIDS (β=0.090, t=1.996, p<0.05; significant); and attitude towards HIV/AIDS (β=0.486, t=10.575, p<0.001; highly significant). Thus, the best fitting model for predicting behavior and sexual practices was a linear combination of the constant, one’s marital status, knowledge on HIV/AIDS and attitude towards HIV/AIDS: Y (Behavior and sexual practices) = β0 + β1 (Marital status) + β2 (Knowledge on HIV/AIDS issues) + β3 (Attitude towards HIV/AIDS issues), where β0, β1, β2 and β3 are respectively 14.201, 2.038, 0.148 and 0.486; the higher the better. Conclusions: Attitude and behavior change education on HIV/AIDS should be intensified in the institution so that students could adopt better lifestyles. PMID:25946917

  7. Multiple regression approach to optimize drilling operations in the Arabian Gulf area

    SciTech Connect

    Al-Betairi, E.A.; Moussa, M.M.; Al-Otaibi, S.

    1988-03-01

    This paper reports a successful application of multiple regression analysis, supported by a detailed statistical study to verify the Bourgoyne and Young model. The model estimates the optimum penetration rate (ROP), weight on bit (WOB), and rotary speed under the effect of controllable and uncontrollable factors. Field data from three wells in the Arabian Gulf were used and emphasized the validity of this model. The model coefficients are sensitive to the number of points included. The correlation coefficients and multicollinearity sensitivity of each drilling parameter on the ROP are studied.

  8. Accounting for Misclassified Outcomes in Binary Regression Models Using Multiple Imputation With Internal Validation Data

    PubMed Central

    Edwards, Jessie K.; Cole, Stephen R.; Troester, Melissa A.; Richardson, David B.

    2013-01-01

    Outcome misclassification is widespread in epidemiology, but methods to account for it are rarely used. We describe the use of multiple imputation to reduce bias when validation data are available for a subgroup of study participants. This approach is illustrated using data from 308 participants in the multicenter Herpetic Eye Disease Study between 1992 and 1998 (48% female; 85% white; median age, 49 years). The odds ratio comparing the acyclovir group with the placebo group on the gold-standard outcome (physician-diagnosed herpes simplex virus recurrence) was 0.62 (95% confidence interval (CI): 0.35, 1.09). We masked ourselves to physician diagnosis except for a 30% validation subgroup used to compare methods. Multiple imputation (odds ratio (OR) = 0.60; 95% CI: 0.24, 1.51) was compared with naive analysis using self-reported outcomes (OR = 0.90; 95% CI: 0.47, 1.73), analysis restricted to the validation subgroup (OR = 0.57; 95% CI: 0.20, 1.59), and direct maximum likelihood (OR = 0.62; 95% CI: 0.26, 1.53). In simulations, multiple imputation and direct maximum likelihood had greater statistical power than did analysis restricted to the validation subgroup, yet all 3 provided unbiased estimates of the odds ratio. The multiple-imputation approach was extended to estimate risk ratios using log-binomial regression. Multiple imputation has advantages regarding flexibility and ease of implementation for epidemiologists familiar with missing data methods. PMID:24627573

  9. Further Insight and Additional Inference Methods for Polynomial Regression Applied to the Analysis of Congruence

    ERIC Educational Resources Information Center

    Cohen, Ayala; Nahum-Shani, Inbal; Doveh, Etti

    2010-01-01

    In their seminal paper, Edwards and Parry (1993) presented the polynomial regression as a better alternative to applying difference score in the study of congruence. Although this method is increasingly applied in congruence research, its complexity relative to other methods for assessing congruence (e.g., difference score methods) was one of the…

  10. An Additional Measure of Overall Effect Size for Logistic Regression Models

    ERIC Educational Resources Information Center

    Allen, Jeff; Le, Huy

    2008-01-01

    Users of logistic regression models often need to describe the overall predictive strength, or effect size, of the model's predictors. Analogs of R[superscript 2] have been developed, but none of these measures are interpretable on the same scale as effects of individual predictors. Furthermore, R[superscript 2] analogs are not invariant to the…

  11. Modeling Errors in Daily Precipitation Measurements: Additive or Multiplicative?

    NASA Technical Reports Server (NTRS)

    Tian, Yudong; Huffman, George J.; Adler, Robert F.; Tang, Ling; Sapiano, Matthew; Maggioni, Viviana; Wu, Huan

    2013-01-01

    The definition and quantification of uncertainty depend on the error model used. For uncertainties in precipitation measurements, two types of error models have been widely adopted: the additive error model and the multiplicative error model. This leads to incompatible specifications of uncertainties and impedes intercomparison and application. In this letter, we assess the suitability of both models for satellite-based daily precipitation measurements in an effort to clarify the uncertainty representation. Three criteria were employed to evaluate the applicability of either model: (1) better separation of the systematic and random errors; (2) applicability to the large range of variability in daily precipitation; and (3) better predictive skills. It is found that the multiplicative error model is a much better choice under all three criteria. It extracted the systematic errors more cleanly, was more consistent with the large variability of precipitation measurements, and produced superior predictions of the error characteristics. The additive error model had several weaknesses, such as nonconstant variance resulting from systematic errors leaking into random errors, and the lack of prediction capability. Therefore, the multiplicative error model is a better choice.
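
    The contrast between the two error models can be sketched in a few lines. The abstract does not state the equations, so the commonly used forms are assumed here: additive, measured = a + b * truth + error; multiplicative, measured = a * truth**b * exp(error), which becomes a straight line after taking logarithms. The function names and the demo data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_additive(truth, measured):
    """Additive model: measured = a + b * truth + error. Returns (a, b)."""
    return fit_line(truth, measured)


def fit_multiplicative(truth, measured):
    """Multiplicative model: measured = a * truth**b * exp(error).

    Taking logs gives ln(measured) = ln(a) + b * ln(truth) + error, so the
    same straight-line fit applies. All values must be positive, so only
    pairs where both series report rain can be used. Returns (a, b).
    """
    ln_a, b = fit_line([math.log(t) for t in truth],
                       [math.log(m) for m in measured])
    return math.exp(ln_a), b


if __name__ == "__main__":
    # Synthetic, noise-free demo data (hypothetical values in mm/day).
    truth = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
    measured = [1.2 * t ** 0.8 for t in truth]
    print("additive (a, b):", fit_additive(truth, measured))
    print("multiplicative (a, b):", fit_multiplicative(truth, measured))
```

    In the log-transformed fit the systematic part is carried by a and b, and what remains is the random error, which is one way to read the first criterion listed in the abstract.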

  12. Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression.

    PubMed

    Beckstead, Jason W

    2012-03-30

    The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic strategy to isolate, examine, and remove suppression effects has been offered. In this article such an approach, rooted in confirmatory factor analysis theory and employing matrix algebra, is developed. Suppression is viewed as the result of criterion-irrelevant variance operating among predictors. Decomposition of predictor variables into criterion-relevant and criterion-irrelevant components using structural equation modeling permits derivation of regression weights with the effects of criterion-irrelevant variance omitted. Three examples with data from applied research are used to illustrate the approach: the first assesses child and parent characteristics to explain why some parents of children with obsessive-compulsive disorder accommodate their child's compulsions more so than do others, the second examines various dimensions of personal health to explain individual differences in global quality of life among patients following heart surgery, and the third deals with quantifying the relative importance of various aptitudes for explaining academic performance in a sample of nursing students. The approach is offered as an analytic tool for investigators interested in understanding predictor-criterion relationships when complex patterns of intercorrelation among predictors are present and is shown to augment dominance analysis.

  13. Performance Evaluation of Button Bits in Coal Measure Rocks by Using Multiple Regression Analyses

    NASA Astrophysics Data System (ADS)

    Su, Okan

    2016-02-01

    Electro-hydraulic and jumbo drills are commonly used for underground coal mines and tunnel drives for the purpose of blasthole drilling and rock bolt installations. Not only machine parameters but also environmental conditions have significant effects on drilling. This study characterizes the performance of button bits during blasthole drilling in coal measure rocks by using multiple regression analyses. The penetration rate of jumbo and electro-hydraulic drills was measured in the field by employing bits in different diameters and the specific energy of the drilling was calculated at various locations, including highway tunnels and underground roadways of coal mines. Large block samples were collected from each location at which in situ drilling measurements were performed. Then, the effects of rock properties and machine parameters on the drilling performance were examined. Multiple regression models were developed for the prediction of the specific energy of the drilling and the penetration rate. The results revealed that hole area, impact (blow) energy, blows per minute of the piston within the drill, and some rock properties, such as the uniaxial compressive strength (UCS) and the drilling rate index (DRI), influence the drill performance.

  14. Removal of River-Stage Fluctuations from Well Response Using Multiple-Regression

    SciTech Connect

    Spane, Frank A.; Mackley, Rob D.

    2011-11-01

    Many contaminated unconfined aquifers are located in proximity to river systems. In groundwater studies, the physical presence of a river is commonly represented as a transient-head boundary that imposes hydrologic responses within the intersected unconfined aquifer. The periodic fluctuation of river-stage height at the boundary produces associated responses within the adjacent aquifer system, the magnitude of which is a function of the existing well, aquifer, boundary conditions, and river-stage fluctuation characteristics. The presence of well responses induced by the river stage can significantly limit characterization and monitoring of remedial activities within the stress-impacted area. This paper demonstrates the use of a time-domain, multiple-regression, convolution (superposition) method to develop well/aquifer river response function (RRF) relationships. Following RRF development, a multiple-regression deconvolution correction approach can be applied to remove river-stage effects from well water-level responses. Corrected well responses can then be analyzed to improve local aquifer characterization activities in support of optimizing remedial actions, assessing the area-of-influence of remediation activities, and determining mean groundwater flow and contaminant flux to the river system.
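
    The idea of regressing well response on river stage and then removing the fitted river contribution can be sketched as follows. This is a simplified stand-in under stated assumptions, not the authors' procedure: the well level is modelled as an intercept plus a weighted sum of the current and a few lagged river-stage values, and the weights are estimated by ordinary least squares. All function names and data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, y):
    """Least-squares coefficients via the normal equations X'X b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def fit_response(river, well, n_lags):
    """Regress well level on the current and lagged river stages.

    Returns [intercept, beta_0, ..., beta_n_lags], where beta_k weights
    the river stage k time steps earlier.
    """
    rows = [[1.0] + [river[t - k] for k in range(n_lags + 1)]
            for t in range(n_lags, len(river))]
    return least_squares(rows, well[n_lags:])


def remove_river_effect(river, well, coef):
    """Subtract the fitted river contribution from the well record."""
    n_lags = len(coef) - 2
    return [well[t] - sum(coef[1 + k] * river[t - k]
                          for k in range(n_lags + 1))
            for t in range(n_lags, len(river))]
```

    The first `n_lags` samples are dropped because their lagged river stages are not available. Whatever remains after the subtraction (here the intercept plus any non-river signal) is the corrected well response.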

  15. Problems of correlations between explanatory variables in multiple regression analyses in the dental literature.

    PubMed

    Tu, Y-K; Kellett, M; Clerehugh, V; Gilthorpe, M S

    2005-10-01

    Multivariable analysis is a widely used statistical methodology for investigating associations amongst clinical variables. However, the problems of collinearity and multicollinearity, which can give rise to spurious results, have in the past frequently been disregarded in dental research. This article illustrates and explains the problems which may be encountered, in the hope of increasing awareness and understanding of these issues, thereby improving the quality of the statistical analyses undertaken in dental research. Three examples from different clinical dental specialties are used to demonstrate how to diagnose the problem of collinearity/multicollinearity in multiple regression analyses and to illustrate how collinearity/multicollinearity can seriously distort the model development process. Lack of awareness of these problems can give rise to misleading results and erroneous interpretations. Multivariable analysis is a useful tool for dental research, though only if its users thoroughly understand the assumptions and limitations of these methods. It would benefit evidence-based dentistry enormously if researchers were more aware of both the complexities involved in multiple regression when using these methods and of the need for expert statistical consultation in developing study design and selecting appropriate statistical methodologies.
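
    One common way to diagnose collinearity is the variance inflation factor (VIF). The abstract does not say which diagnostics the authors used, so the following is only an illustrative sketch for the simplest case of exactly two predictors, where the VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between them. The function names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor for a model with exactly two predictors.

    With two predictors, the R^2 from regressing one on the other equals
    r^2, so both predictors share the same VIF = 1 / (1 - r^2).
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


if __name__ == "__main__":
    # Hypothetical measurements that track each other closely.
    x1 = [1.0, 2.0, 3.0, 4.0]
    x2 = [1.1, 1.9, 3.2, 3.9]
    print("VIF:", vif_two_predictors(x1, x2))
```

    A VIF of 1 means the predictors are uncorrelated; large values signal that the coefficient estimates are unstable. Thresholds such as 10 are rules of thumb rather than formal tests.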

  16. Waste generated in high-rise buildings construction: a quantification model based on statistical multiple regression.

    PubMed

    Parisi Kern, Andrea; Ferreira Dias, Michele; Piva Kulakowski, Marlova; Paulo Gomes, Luciana

    2015-05-01

    Reducing construction waste is becoming a key environmental issue in the construction industry. The quantification of waste generation rates in the construction sector is an invaluable management tool in supporting mitigation actions. However, the quantification of waste can be a difficult process because of the specific characteristics and the wide range of materials used in different construction projects. Large variations are observed in the methods used to predict the amount of waste generated because of the range of variables involved in construction processes and the different contexts in which these methods are employed. This paper proposes a statistical model to determine the amount of waste generated in the construction of high-rise buildings by assessing the influence of design process and production system, often mentioned as the major culprits behind the generation of waste in construction. Multiple regression was used to conduct a case study based on multiple sources of data of eighteen residential buildings. The resulting statistical model produced dependent (i.e. amount of waste generated) and independent variables associated with the design and the production system used. The best regression model obtained from the sample data resulted in an adjusted R² value of 0.694, which means that it predicts approximately 69% of the factors involved in the generation of waste in similar constructions. Most independent variables showed a low determination coefficient when assessed in isolation, which emphasizes the importance of assessing their joint influence on the response (dependent) variable.

  17. Removal of river-stage fluctuations from well response using multiple regression.

    PubMed

    Spane, Frank A; Mackley, Rob D

    2011-01-01

    Many contaminated unconfined aquifers are located in proximity to river systems. In groundwater studies, the physical presence of a river is commonly represented as a transient-head boundary that imposes hydrologic responses within the intersected unconfined aquifer. The periodic fluctuation of river-stage height at the boundary produces associated responses within the adjacent aquifer system, the magnitude of which is a function of the existing well, aquifer, boundary conditions, and characteristics of river-stage fluctuations. The presence of well responses induced by the river stage can significantly limit characterization and monitoring of remedial activities within the stress-impacted area. This article demonstrates the use of a time-domain, multiple-regression, convolution (superposition) method to develop well/aquifer river response function (RRF) relationships. Following RRF development, a multiple-regression deconvolution correction approach can be applied to remove river-stage effects from well water-level responses. Corrected well responses can then be analyzed to improve local aquifer characterization activities in support of optimizing remedial actions, assessing the area-of-influence of remediation activities, and determining mean groundwater flow and contaminant flux to the river system.

  18. Semiparametric Allelic Tests for Mapping Multiple Phenotypes: Binomial Regression and Mahalanobis Distance.

    PubMed

    Majumdar, Arunabha; Witte, John S; Ghosh, Saurabh

    2015-12-01

    Binary phenotypes commonly arise due to multiple underlying quantitative precursors and genetic variants may impact multiple traits in a pleiotropic manner. Hence, simultaneously analyzing such correlated traits may be more powerful than analyzing individual traits. Various genotype-level methods, e.g., MultiPhen (O'Reilly et al.), have been developed to identify genetic factors underlying a multivariate phenotype. For univariate phenotypes, the usefulness and applicability of allele-level tests have been investigated. The test of allele frequency difference among cases and controls is commonly used for mapping case-control association. However, allelic methods for multivariate association mapping have not been studied much. In this article, we explore two allelic tests of multivariate association: one using a Binomial regression model based on inverted regression of genotype on phenotype (Binomial regression-based Association of Multivariate Phenotypes [BAMP]), and the other employing the Mahalanobis distance between two sample means of the multivariate phenotype vector for two alleles at a single-nucleotide polymorphism (Distance-based Association of Multivariate Phenotypes [DAMP]). These methods can incorporate both discrete and continuous phenotypes. Some theoretical properties for BAMP are studied. Using simulations, the power of the methods for detecting multivariate association is compared with that of the genotype-level test MultiPhen. The allelic tests yield marginally higher power than MultiPhen for multivariate phenotypes. For one/two binary traits under recessive mode of inheritance, allelic tests are found to be substantially more powerful. All three tests are applied to two different real data sets and the results offer some support for the simulation study. We propose a hybrid approach for testing multivariate association that implements MultiPhen when Hardy-Weinberg Equilibrium (HWE) is violated and BAMP otherwise, because the allelic approaches assume HWE.
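
    The Mahalanobis distance between two sample means, which the abstract names as the basis of DAMP, can be sketched for a two-dimensional phenotype. This is not the DAMP test statistic itself: the pooled-covariance choice, the restriction to two traits, and all names and data below are assumptions made for illustration.

```python
def mean_vec(rows):
    """Column-wise mean of a list of equal-length observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def pooled_cov_2d(g1, g2):
    """Pooled sample covariance matrix of two groups of 2-D observations."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (g1, g2):
        m = mean_vec(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(g1) + len(g2) - 2
    return [[v / dof for v in row] for row in s]


def mahalanobis_sq(g1, g2):
    """Squared Mahalanobis distance between the two group means."""
    m1, m2 = mean_vec(g1), mean_vec(g2)
    (a, b), (c, d) = pooled_cov_2d(g1, g2)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    return sum(diff[i] * inv[i][j] * diff[j]
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    # Hypothetical phenotype pairs carried by two alleles.
    allele_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    allele_b = [(3.0, 0.0), (5.0, 0.0), (3.0, 2.0), (5.0, 2.0)]
    print("D^2:", mahalanobis_sq(allele_a, allele_b))
```

    Unlike a plain Euclidean distance, this measure scales the difference in means by the within-group variability and correlation of the traits.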

  20. Estimating leaf photosynthetic pigments information by stepwise multiple linear regression analysis and a leaf optical model

    NASA Astrophysics Data System (ADS)

    Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei

    2014-10-01

    Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, and has difficulty capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, inverse models were proposed with the aid of stepwise multiple linear regression (SMLR) analysis. Furthermore, a leaf radiative transfer model (i.e. PROSPECT model) was employed to simulate the leaf reflectance at wavelengths from 400 to 780 nm at 1 nm intervals, and these values were then treated as the data from remote sensing observations. Meanwhile, simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were taken as targets to build the respective regression models. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures; 70% of these samples were used for training and the remaining 30% for model validation. Reflectance (r) and its mathematical transformations (1/r and log (1/r)) were each employed to build regression models. Results showed fair agreement between pigments and simulated reflectance, with all adjusted coefficients of determination (R²) larger than 0.8 when 6 wavebands were selected to build the SMLR model. The largest values of R² for Cab, Car and Cab/Car are 0.8845, 0.876 and 0.8765, respectively. Meanwhile, mathematical transformations of reflectance showed little influence on regression accuracy. We concluded that it was feasible to estimate chlorophyll, carotenoids and their ratio based on a statistical model with leaf reflectance data.

  1. Correlation between mRNA and protein abundance in Desulfovibrio vulgaris: A multiple regression to identify sources of variations

    SciTech Connect

    Nie, Lei; Wu, G; Zhang, Weiwen

    2006-01-13

    Using whole-genome microarray and LC-MS/MS proteomic data collected from Desulfovibrio vulgaris grown under three different conditions, we systematically investigate the relationship between mRNA and protein abundance by a multiple regression approach.

  2. Determining the Spatial and Seasonal Variability in OM/OC Ratios across the U.S. Using Multiple Regression

    EPA Science Inventory

    Data from the Interagency Monitoring of Protected Visual Environments (IMPROVE) network are used to estimate organic mass to organic carbon (OM/OC) ratios across the United States by extending previously published multiple regression techniques. Our new methodology addresses com...

  3. Using the Coefficient of Determination "R"[superscript 2] to Test the Significance of Multiple Linear Regression

    ERIC Educational Resources Information Center

    Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.

    2013-01-01

    This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
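
    The link between R² and an overall significance test can be made concrete with the classical identity between R² and the F statistic. The abstract describes a beta-distribution approach; the sketch below shows only the equivalent textbook F form, F = (R²/k) / ((1 - R²)/(n - k - 1)) for n observations and k predictors, and is not the article's own procedure. The function name is hypothetical.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic of a linear regression computed from R^2.

    r2 -- coefficient of determination, 0 <= r2 < 1
    n  -- number of observations
    k  -- number of predictors (intercept not counted)
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if n - k - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


if __name__ == "__main__":
    # Hypothetical example: R^2 = 0.5 from 13 observations, 2 predictors.
    print("F =", f_from_r2(0.5, n=13, k=2))
```

    Under the null hypothesis with normal errors, this F statistic has k and n - k - 1 degrees of freedom, and R² itself follows a Beta(k/2, (n - k - 1)/2) distribution, which is why R² can serve directly as a test statistic.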

  4. Combining multiple regression and principal component analysis for accurate predictions for column ozone in Peninsular Malaysia

    NASA Astrophysics Data System (ADS)

    Rajab, Jasim M.; MatJafri, M. Z.; Lim, H. S.

    2013-06-01

    This study encompasses columnar ozone modelling in Peninsular Malaysia. A data set of eight atmospheric parameters [air surface temperature (AST), carbon monoxide (CO), methane (CH4), water vapour (H2O vapour), skin surface temperature (SSKT), atmosphere temperature (AT), relative humidity (RH), and mean surface pressure (MSP)], retrieved from NASA's Atmospheric Infrared Sounder (AIRS) for the entire period (2003-2008), was employed to develop models to predict the value of columnar ozone (O3) in the study area. A combined method, based on multiple regression together with principal component analysis (PCA) modelling, was used to predict columnar ozone. This combined approach was utilized to improve the prediction accuracy of columnar ozone. Separate analyses were carried out for the north east monsoon (NEM) and south west monsoon (SWM) seasons. The O3 was negatively correlated with CH4, H2O vapour, RH, and MSP, whereas it was positively correlated with CO, AST, SSKT, and AT during both the NEM and SWM season periods. Multiple regression analysis was used to fit the columnar ozone data using the atmospheric parameter variables as predictors. A variable selection method based on high loading of varimax rotated principal components was used to acquire subsets of the predictor variables to be included in the linear regression model of the atmospheric parameter variables. It was found that the increase in columnar O3 value is associated with an increase in the values of AST, SSKT, AT, and CO and with a drop in the levels of CH4, H2O vapour, RH, and MSP. The result of fitting the best models for the columnar O3 value using eight of the independent variables gave about the same values of the R (≈0.93) and R2 (≈0.86) for both the NEM and SWM seasons. The common variables that appeared in both regression equations were SSKT, CH4 and RH, and the principal precursor of the columnar O3 value in both the NEM and SWM seasons was SSKT.

  5. Melanin and blood concentration in human skin studied by multiple regression analysis: experiments

    NASA Astrophysics Data System (ADS)

    Shimada, M.; Yamada, Y.; Itoh, M.; Yatagai, T.

    2001-09-01

    Knowledge of the mechanism of human skin colour and measurement of melanin and blood concentration in human skin are needed in the medical and cosmetic fields. The absorbance spectrum from reflectance at the visible wavelength of human skin increases under several conditions such as a sunburn or scalding. The change of the absorbance spectrum from reflectance including the scattering effect does not correspond to the molar absorption spectrum of melanin and blood. The modified Beer-Lambert law is applied to the change in the absorbance spectrum from reflectance of human skin as the change in melanin and blood is assumed to be small. The concentration of melanin and blood was estimated from the absorbance spectrum from reflectance of human skin using multiple regression analysis. Estimated concentrations were compared with the measured ones in a phantom experiment, and this method was applied to in vivo skin.
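
    The regression step described here can be sketched under a simplifying assumption: that the change in absorbance at each wavelength is a weighted sum of a melanin spectrum and a blood spectrum, with the weights standing in for the concentration changes. The abstract does not give the actual spectra or the form of the modified Beer-Lambert law used, so the spectra, names and numbers below are hypothetical.

```python
def fit_two_components(eps_a, eps_b, delta_abs):
    """Least-squares weights (ca, cb) in delta_abs ~ ca*eps_a + cb*eps_b.

    eps_a, eps_b -- component spectra sampled at the same wavelengths
    delta_abs    -- measured change in absorbance at those wavelengths
    Solves the 2x2 normal equations directly (no intercept term).
    """
    saa = sum(a * a for a in eps_a)
    sbb = sum(b * b for b in eps_b)
    sab = sum(a * b for a, b in zip(eps_a, eps_b))
    say = sum(a * y for a, y in zip(eps_a, delta_abs))
    sby = sum(b * y for b, y in zip(eps_b, delta_abs))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("component spectra are proportional")
    ca = (say * sbb - sby * sab) / det
    cb = (sby * saa - say * sab) / det
    return ca, cb


if __name__ == "__main__":
    # Hypothetical spectra at five wavelengths (arbitrary units).
    eps_melanin = [3.0, 2.5, 2.0, 1.5, 1.0]
    eps_blood = [0.5, 2.0, 1.0, 2.2, 0.3]
    delta = [0.2 * m + 0.1 * b for m, b in zip(eps_melanin, eps_blood)]
    print(fit_two_components(eps_melanin, eps_blood, delta))
```

    The fit is only well determined when the two spectra differ in shape across the sampled wavelengths; nearly proportional spectra make the determinant close to zero and the estimates unstable.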

  6. Estimating changes in river faecal coliform loading using nonparametric multiplicative regression.

    PubMed

    Schulz, Christopher J; Childers, Gary W

    2011-03-01

    Faecal coliform (FC) concentration was monitored weekly in the Tangipahoa River over an eight year period. Available USGS discharge and precipitation data were used to construct a nonparametric multiplicative regression (NPMR) model for both forecasting and backcasting of FC density. NPMR backcasting and forecasting of FC allowed for estimation of concentration for any flow regime. During this study a remediation effort was undertaken to improve disinfection systems of contributing municipal waste water treatment plants in the watershed. Time-series analysis of FC concentrations demonstrated a drop in FC levels coinciding with remediation efforts. The NPMR model suggested the reduction in FC levels was not due to climate variance (i.e. discharge and precipitation changes) alone. Use of the NPMR method circumvented the need for construction of a more complex physical watershed model to estimate FC loading in the river. This method can be used to detect and estimate new discharge impacts, or forecast daily FC estimates.

  7. Spontaneous Regression of Multiple Pulmonary Metastases After Radiofrequency Ablation of a Single Metastasis

    SciTech Connect

    Rao, Pramod; Escudier, Bernard; Baere, Thierry de

    2011-04-15

    We report two cases of spontaneous regression of multiple pulmonary metastases occurring after radiofrequency ablation (RFA) of a single lung metastasis. To the best of our knowledge, these are the first such cases reported. These two patients presented with lung metastases progressive despite treatment with interleukin-2, interferon, or sorafenib but were safely ablated with percutaneous RFA under computed tomography guidance. Percutaneous RFA allowed control of the targeted tumors for >1 year. Distant lung metastases presented an objective response despite the fact that they received no targeted local treatment. Local ablative techniques, such as RFA, induce the release of tumor-degradation product, which is probably responsible for an immunologic reaction that is able to produce a response in distant tumors.

  8. Multiple linear regression models of urban runoff pollutant load and event mean concentration considering rainfall variables.

    PubMed

    Maniquiz, Marla C; Lee, Soyoung; Kim, Lee-Hyung

    2010-01-01

    Rainfall is an important factor in estimating the event mean concentration (EMC) which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads could also be calculated using rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered from a 28-month monitoring conducted on the road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models are indicators of the high uncertainties of NPSs; perhaps estimation of EMCs and loads could be accurately obtained by means of water quality sampling or a long-term monitoring is needed to gather more data that can be used for the development of estimation models.

  9. Stepwise multiple regression method of greenhouse gas emission modeling in the energy sector in Poland.

    PubMed

    Kolasa-Wiecek, Alicja

    2015-04-01

    The energy sector in Poland is the source of 81% of greenhouse gas (GHG) emissions. Poland, among other European Union countries, occupies a leading position with regard to coal consumption. The Polish energy sector actively participates in efforts to reduce GHG emissions to the atmosphere, through a gradual decrease of the share of coal in the fuel mix and development of renewable energy sources. All evidence which completes the knowledge about issues related to GHG emissions is a valuable source of information. The article presents the results of modeling of GHG emissions which are generated by the energy sector in Poland. For a better understanding of the quantitative relationship between total consumption of primary energy and greenhouse gas emission, a multiple stepwise regression model was applied. The modeling results of CO2 emissions demonstrate a high relationship (0.97) with the hard coal consumption variable. Adjustment coefficient of the model to actual data is high and equal to 95%. The backward step regression model, in the case of CH4 emission, indicated the presence of hard coal (0.66), peat and fuel wood (0.34), solid waste fuels, as well as other sources (-0.64) as the most important variables. The adjusted coefficient is suitable and equals R2 = 0.90. For N2O emission modeling the obtained coefficient of determination is low and equal to 43%. A significant variable influencing the amount of N2O emission is the peat and wood fuel consumption.

  10. Multiple Regression (MR) and Artificial Neural Network (ANN) models for prediction of soil suction

    NASA Astrophysics Data System (ADS)

    Erzin, Yusuf; Yilmaz, Isik

    2010-05-01

    This article presents a comparison of multiple regression (MR) and artificial neural network (ANN) models for prediction of soil suction of clayey soils. The results of the soil suction tests utilizing thermocouple psychrometers on statically compacted specimens of Bentonite-Kaolinite clay mixtures with varying soil properties were used to develop the models. The results obtained from both models were then compared with the experimental results. The performance indices such as coefficient of determination (R2), root mean square error (RMSE), mean absolute error (MAE), and variance accounted for (VAF) were used to control the performance of the prediction capacity of the models developed in this study. The ANN model showed higher prediction performance than the regression model according to the performance indices. It is shown that ANN models provide significant improvements in prediction accuracy over statistical models. The potential benefits of soft computing models extend beyond the high computation rates. Higher performances of the soft computing models were sourced from greater degree of robustness and fault tolerance than traditional statistical models because there are many more processing neurons, each with primarily local connections. It appears that there is a possibility of estimating soil suction by using the proposed empirical relationships and soft computing models. The population of the analyzed data is relatively limited in this study. Therefore, the practical outcome of the proposed equations and models could be used, with acceptable accuracy.
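
    The four performance indices named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the sample values are hypothetical, R2 is taken as 1 - SS_res/SS_tot, and VAF is taken as (1 - var(residuals)/var(observed)) x 100, which are common definitions that the abstract itself does not spell out.

```python
import math


def performance_indices(observed, predicted):
    """Return R2, RMSE, MAE and VAF (%) for paired observed/predicted values.

    Assumed definitions (not stated in the abstract):
      R2  = 1 - SS_res / SS_tot
      VAF = (1 - var(residuals) / var(observed)) * 100
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    mean_res = sum(residuals) / n

    ss_res = sum(r * r for r in residuals)                 # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    var_res = sum((r - mean_res) ** 2 for r in residuals) / n
    var_obs = ss_tot / n

    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(r) for r in residuals) / n,
        "VAF": (1.0 - var_res / var_obs) * 100.0,
    }


# Hypothetical values: every prediction is too high by exactly one unit.
biased = performance_indices([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

    Under these definitions a constant bias lowers R2 and raises RMSE and MAE but leaves VAF at 100%, which is one reason to report several indices together.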

  11. Multiple regression equations modelling of groundwater of Ajmer-Pushkar railway line region, Rajasthan (India).

    PubMed

    Mathur, Praveen; Sharma, Sarita; Soni, Bhupendra

    2010-01-01

    In the present work, an attempt is made to formulate multiple regression equations using all possible regressions method for groundwater quality assessment of Ajmer-Pushkar railway line region in pre- and post-monsoon seasons. Correlation studies revealed the existence of linear relationships (r 0.7) for electrical conductivity (EC), total hardness (TH) and total dissolved solids (TDS) with other water quality parameters. The highest correlation was found between EC and TDS (r = 0.973). EC showed highly significant positive correlation with Na, K, Cl, TDS and total solids (TS). TH showed highest correlation with Ca and Mg. TDS showed significant correlation with Na, K, SO4, PO4 and Cl. The study indicated that most of the contamination present was water soluble or ionic in nature. Mg was present as MgCl2; K mainly as KCl and K2SO4, and Na was present as the salts of Cl, SO4 and PO4. On the other hand, F and NO3 showed no significant correlations. The r2 values and F values (at 95% confidence limit, alpha = 0.05) for the modelled equations indicated high degree of linearity among independent and dependent variables. Also the error % between calculated and experimental values was contained within +/- 15% limit.

  12. Artificial neural network and multiple regression model for nickel(II) adsorption on powdered activated carbons.

    PubMed

    Hema, M; Srinivasan, K

    2011-07-01

    Nickel removal efficiency of powdered activated carbons of coconut oilcake, neem oilcake and commercial carbon was investigated by using artificial neural network. The effective parameters for the removal of nickel (%R) by adsorption process, which included the pH, contact time (T), distinctiveness of activated carbon (Cn), amount of activated carbon (Cw) and initial concentration of nickel (Co) were investigated. The Levenberg-Marquardt (LM) back-propagation algorithm was used to train the network. The network topology was optimized by varying the number of hidden layers and the number of neurons in each hidden layer. The model was developed by training, validation and testing on experimental data, with the respective subsets containing 60%, 20% and 20% of the total experimental data. A multiple regression equation was developed for the nickel adsorption system and the output was compared with both simulated and experimental outputs. Standard deviation (SD) with respect to experimental output was considerably higher for the regression model than for the ANN model. The obtained experimental data were best fitted by the artificial neural network. PMID:23029923

  13. Genetic-algorithm-based multiple regression with fuzzy inference system for detection of nocturnal hypoglycemic episodes.

    PubMed

    Ling, Steve S H; Nguyen, Hung T

    2011-03-01

    Hypoglycemia or low blood glucose is dangerous and can result in unconsciousness, seizures, and even death. It is a common and serious side effect of insulin therapy in patients with diabetes. A hypoglycemic monitor is a noninvasive monitor that measures some physiological parameters continuously to provide detection of hypoglycemic episodes in type 1 diabetes mellitus patients (T1DM). Based on heart rate (HR), corrected QT interval of the ECG signal, change of HR, and the change of corrected QT interval, we develop a genetic algorithm (GA)-based multiple regression with fuzzy inference system (FIS) to classify the presence of hypoglycemic episodes. GA is used to find the optimal fuzzy rules and membership functions of FIS and the model parameters of regression method. From a clinical study of 16 children with T1DM, natural occurrence of nocturnal hypoglycemic episodes is associated with HRs and corrected QT intervals. The overall data were organized into a training set (eight patients) and a testing set (another eight patients) randomly selected. The results show that the proposed algorithm achieves good sensitivity with acceptable specificity. PMID:21349796

  14. Empirical predictive models of daily relativistic electron flux at geostationary orbit: Multiple regression analysis

    NASA Astrophysics Data System (ADS)

    Simms, Laura E.; Engebretson, Mark J.; Pilipenko, Viacheslav; Reeves, Geoffrey D.; Clilverd, Mark

    2016-04-01

    The daily maximum relativistic electron flux at geostationary orbit can be predicted well with a set of daily averaged predictor variables including previous day's flux, seed electron flux, solar wind velocity and number density, AE index, IMF Bz, Dst, and ULF and VLF wave power. As predictor variables are intercorrelated, we used multiple regression analyses to determine which are the most predictive of flux when other variables are controlled. Empirical models produced from regressions of flux on measured predictors from 1 day previous were reasonably effective at predicting novel observations. Adding previous flux to the parameter set improves the prediction of the peak of the increases but delays its anticipation of an event. Previous day's solar wind number density and velocity, AE index, and ULF wave activity are the most significant explanatory variables; however, the AE index, measuring substorm processes, shows a negative correlation with flux when other parameters are controlled. This may be due to the triggering of electromagnetic ion cyclotron waves by substorms that cause electron precipitation. VLF waves show lower, but significant, influence. The combined effect of ULF and VLF waves shows a synergistic interaction, where each increases the influence of the other on flux enhancement. Correlations between observations and predictions for this 1 day lag model ranged from 0.71 to 0.89 (average: 0.78). A path analysis of correlations between predictors suggests that solar wind and IMF parameters affect flux through intermediate processes such as ring current (Dst), AE, and wave activity.

  15. Modeling the Philippines' real gross domestic product: A normal estimation equation for multiple linear regression

    NASA Astrophysics Data System (ADS)

    Urrutia, Jackie D.; Tampis, Razzcelle L.; Mercado, Joseph; Baygan, Aaron Vito M.; Baccay, Edcon B.

    2016-02-01

    The objective of this research is to formulate a mathematical model for the Philippines' Real Gross Domestic Product (Real GDP). The following factors are considered: Consumers' Spending (x1), Government's Spending (x2), Capital Formation (x3) and Imports (x4) as the Independent Variables that can actually influence the Real GDP in the Philippines (y). The researchers used a Normal Estimation Equation using Matrices to create the model for Real GDP and used α = 0.01. The researchers analyzed quarterly data from 1990 to 2013. The data were acquired from the National Statistical Coordination Board (NSCB), resulting in a total of 96 observations for each variable. The data have undergone a logarithmic transformation, particularly the Dependent Variable (y), to satisfy all the assumptions of the Multiple Linear Regression Analysis. The mathematical model for Real GDP was formulated using Matrices through MATLAB. Based on the results, only three of the Independent Variables are significant to the Dependent Variable, namely Consumers' Spending (x1), Capital Formation (x3) and Imports (x4), and hence can actually predict Real GDP (y). The regression analysis displays that 98.7% (coefficient of determination) of the Independent Variables can actually predict the Dependent Variable. With 97.6% of the result in Paired T-Test, the Predicted Values obtained from the model showed no significant difference from the Actual Values of Real GDP. This research will be essential in appraising the forthcoming changes to aid the Government in implementing policies for the development of the economy.
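
    The matrix form of the normal estimation equation referred to here is usually written b = (X'X)^-1 X'y, where X carries a leading column of ones for the intercept. The sketch below shows that calculation in plain Python rather than the MATLAB used in the study; the function name and the toy data are hypothetical and are not the GDP series analysed by the authors.

```python
def fit_normal_equations(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    `rows` holds one tuple of predictor values per observation; a column of
    ones is prepended so that the first returned coefficient is the intercept.
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])

    # Augmented matrix [X'X | X'y], built column by column.
    A = [
        [sum(xi[j] * xi[k] for xi in X) for k in range(p)]
        + [sum(xi[j] * yi for xi, yi in zip(X, y))]
        for j in range(p)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot_row = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot_row][col]) < 1e-12:
            raise ValueError("X'X is singular; predictors are collinear")
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    return [A[j][p] for j in range(p)]


# Hypothetical toy data generated exactly from y = 2 + 3*x1 - 1*x2.
toy_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
toy_y = [2.0, 5.0, 1.0, 4.0, 7.0]
coefficients = fit_normal_equations(toy_rows, toy_y)  # intercept, b1, b2
```

    A log-transformed response, as described in the abstract, would simply be passed in as the y values before fitting.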

  16. Additional results on 'Reducing geometric dilution of precision using ridge regression'

    NASA Astrophysics Data System (ADS)

    Kelly, Robert J.

    1990-07-01

    Kelly (1990) presented preliminary results on the feasibility of using ridge regression (RR) to reduce the effects of geometric dilution of precision (GDOP) error inflation in position-fix navigation systems. Recent results indicate that RR will not reduce GDOP bias inflation when biaslike measurement errors last much longer than the aircraft guidance-loop response time. This conclusion precludes the use of RR on navigation systems whose dominant error sources are biaslike; e.g., the GPS selective-availability error source. The simulation results given by Kelly are, however, valid for the conditions defined. Although RR has not yielded a satisfactory solution to the general GDOP problem, it has illuminated the role that multicollinearity plays in navigation signal processors such as the Kalman filter. Bias inflation, initial position guess errors, ridge-parameter selection methodology, and the recursive ridge filter are discussed.

  17. Screening for ketosis using multiple logistic regression based on milk yield and composition.

    PubMed

    Kayano, Mitsunori; Kataoka, Tomoko

    2015-11-01

    Multiple logistic regression was applied to milk yield and composition data for 632 records of healthy cows and 61 records of ketotic cows in Hokkaido, Japan. The purpose was to diagnose ketosis based on milk yield and composition, simultaneously. The cows were divided into two groups: (1) multiparous, including 314 healthy cows and 45 ketotic cows and (2) primiparous, including 318 healthy cows and 16 ketotic cows, since nutritional status, milk yield and composition are affected by parity. Multiple logistic regression was applied to these groups separately. For multiparous cows, milk yield (kg/day/cow) and protein-to-fat (P/F) ratio in milk were significant factors (P<0.05) for the diagnosis of ketosis. For primiparous cows, lactose content (%), solid not fat (SNF) content (%) and milk urea nitrogen (MUN) content (mg/dl) were significantly associated with ketosis (P<0.01). A diagnostic rule was constructed for each group of cows: (1) 9.978 × P/F ratio + 0.085 × milk yield <10 and (2) 2.327 × SNF - 2.703 × lactose + 0.225 × MUN <10. The sensitivity, specificity and the area under the curve (AUC) of the diagnostic rules were (1) 0.800, 0.729 and 0.811; (2) 0.813, 0.730 and 0.787, respectively. The P/F ratio, which is a widely used measure of ketosis, provided the sensitivity, specificity and AUC values of (1) 0.711, 0.726 and 0.781; and (2) 0.678, 0.767 and 0.738, respectively.
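
    The two diagnostic rules quoted in this abstract are linear scores compared with a threshold of 10, and can be sketched as below. The coefficients and the threshold come from the abstract; the function names and the input values are hypothetical. The abstract does not state explicitly which side of the threshold is read as ketosis, so the sketch only reports whether the inequality holds.

```python
def multiparous_score(pf_ratio, milk_yield_kg_per_day):
    """Rule (1): 9.978 x P/F ratio + 0.085 x milk yield (kg/day/cow)."""
    return 9.978 * pf_ratio + 0.085 * milk_yield_kg_per_day


def primiparous_score(snf_pct, lactose_pct, mun_mg_per_dl):
    """Rule (2): 2.327 x SNF (%) - 2.703 x lactose (%) + 0.225 x MUN (mg/dl)."""
    return 2.327 * snf_pct - 2.703 * lactose_pct + 0.225 * mun_mg_per_dl


def rule_satisfied(score, threshold=10.0):
    """True when the score is below the threshold, as the rules are written."""
    return score < threshold


# Hypothetical records, chosen only to show one score on each side of 10.
score_a = multiparous_score(pf_ratio=0.70, milk_yield_kg_per_day=30.0)
score_b = multiparous_score(pf_ratio=0.80, milk_yield_kg_per_day=30.0)
score_c = primiparous_score(snf_pct=8.8, lactose_pct=4.5, mun_mg_per_dl=10.0)
```

    Keeping the two parity groups in separate functions mirrors the abstract, where the regression was fitted to multiparous and primiparous cows separately.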

  18. Confidence intervals after multiple imputation: combining profile likelihood information from logistic regressions.

    PubMed

    Heinze, Georg; Ploner, Meinhard; Beyea, Jan

    2013-12-20

    In the logistic regression analysis of a small-sized, case-control study on Alzheimer's disease, some of the risk factors exhibited missing values, motivating the use of multiple imputation. Usually, Rubin's rules (RR) for combining point estimates and variances would then be used to estimate (symmetric) confidence intervals (CIs), on the assumption that the regression coefficients were distributed normally. Yet, rarely is this assumption tested, with or without transformation. In analyses of small, sparse, or nearly separated data sets, such symmetric CI may not be reliable. Thus, RR alternatives have been considered, for example, Bayesian sampling methods, but not yet those that combine profile likelihoods, particularly penalized profile likelihoods, which can remove first order biases and guarantee convergence of parameter estimation. To fill the gap, we consider the combination of penalized likelihood profiles (CLIP) by expressing them as posterior cumulative distribution functions (CDFs) obtained via a chi-squared approximation to the penalized likelihood ratio statistic. CDFs from multiple imputations can then easily be averaged into a combined CDF_c, allowing confidence limits for a parameter β at level 1 - α to be identified as those β* and β** that satisfy CDF_c(β*) = α/2 and CDF_c(β**) = 1 - α/2. We demonstrate that the CLIP method outperforms RR in analyzing both simulated data and data from our motivating example. CLIP can also be useful as a confirmatory tool, should it show that the simpler RR are adequate for extended analysis. We also compare the performance of CLIP to Bayesian sampling methods using Markov chain Monte Carlo. CLIP is available in the R package logistf. PMID:23873477
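
    The final step described in this abstract, averaging the per-imputation CDFs and reading confidence limits off the combined CDF at α/2 and 1 - α/2, can be sketched as follows. This is only an illustration of the averaging and inversion step, not the logistf implementation: the per-imputation CDFs here are hypothetical normal CDFs standing in for the ones CLIP derives from penalized profile likelihoods, and the function names, search bounds and numbers are assumptions.

```python
from statistics import NormalDist


def combined_cdf(cdfs):
    """Pointwise average of the per-imputation CDFs."""
    return lambda beta: sum(cdf(beta) for cdf in cdfs) / len(cdfs)


def invert_cdf(cdf, target, lo, hi, tol=1e-10):
    """Bisection for the beta with cdf(beta) == target.

    Assumes cdf is non-decreasing on [lo, hi] and brackets the target.
    """
    if not (cdf(lo) <= target <= cdf(hi)):
        raise ValueError("target is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def combined_interval(cdfs, alpha=0.05, lo=-50.0, hi=50.0):
    """Limits beta*, beta** with CDF_c(beta*) = alpha/2, CDF_c(beta**) = 1 - alpha/2."""
    pooled = combined_cdf(cdfs)
    return (
        invert_cdf(pooled, alpha / 2.0, lo, hi),
        invert_cdf(pooled, 1.0 - alpha / 2.0, lo, hi),
    )


# Hypothetical stand-ins for three imputed data sets (mean, sd of each CDF).
imputation_cdfs = [
    NormalDist(mu, sigma).cdf
    for mu, sigma in [(0.9, 0.40), (1.1, 0.45), (1.0, 0.50)]
]
lower, upper = combined_interval(imputation_cdfs)
```

    Because the limits are found by inverting the averaged CDF rather than by adding and subtracting a standard error, the resulting interval need not be symmetric about the point estimate.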

  19. Auditing quality control procedures in a chemical pathology laboratory--a multiple regression analysis.

    PubMed

    Tillyer, C R; Gobin, P T; Ray, A K; Rimanova, H

    1992-07-01

    We undertook a retrospective analysis of the monthly test rejection rates and the monthly external quality assessment scheme performance indices for our laboratory's two automated analysers, and examined the association of these variables with measures of laboratory workload, manpower, staff training, instrument servicing, seasonal and temporal factors and changes of calibration, method and assigned internal quality control values. Using multiple linear regression and stepwise multiple linear regression, we found that test rejection rates differed significantly between instruments, and were highest on the instrument performing the widest variety and lowest volume of tests. On that instrument, rejection rates were significantly associated with the introduction of new staff and laboratory manpower levels, and also showed a highly significant trend upwards over the study period, independent of the effects of the other variables examined. External quality assessment scheme performance indices showed small trends over the study period. They were not related to the test rejection rates on either analyser but also showed a significant association with the introduction of new staff and a small but significant association with laboratory workload. We conclude that the training and introduction of new staff and decreased laboratory manpower levels may significantly increase the level of test rejection, and adherence to appropriate quality control protocols effectively maintains the quality of the laboratory's results, but may not be completely successful in filtering out the effects of some assignable causes of variation in test results. It is suggested that clinical laboratories use the statistical approach adopted here to identify factors which may be adversely affecting quality performance and running costs and to provide evidence that quality control procedures are both cost- and quality-effective.

  20. A multiple linear regression analysis of hot corrosion attack on a series of nickel base turbine alloys

    NASA Technical Reports Server (NTRS)

    Barrett, C. A.

    1985-01-01

    Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni base cast turbine alloys. The U transform (i.e., 1/sin (% A/100) to the 1/2) was shown to give the best estimate of the dependent variable, y. A complete second degree equation is described for the "centered" weight chemistries for the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added for a basic 47 term equation. The best reduced equation was determined by the stepwise selection method with essentially 13 terms. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.

  1. A Meta-Regression Method for Studying Etiological Heterogeneity Across Disease Subtypes Classified by Multiple Biomarkers.

    PubMed

    Wang, Molin; Kuchiba, Aya; Ogino, Shuji

    2015-08-01

    In interdisciplinary biomedical, epidemiologic, and population research, it is increasingly necessary to consider pathogenesis and inherent heterogeneity of any given health condition and outcome. As the unique disease principle implies, no single biomarker can perfectly define disease subtypes. The complex nature of molecular pathology and biology necessitates biostatistical methodologies to simultaneously analyze multiple biomarkers and subtypes. To analyze and test for heterogeneity hypotheses across subtypes defined by multiple categorical and/or ordinal markers, we developed a meta-regression method that can utilize existing statistical software for mixed-model analysis. This method can be used to assess whether the exposure-subtype associations are different across subtypes defined by 1 marker while controlling for other markers and to evaluate whether the difference in exposure-subtype association across subtypes defined by 1 marker depends on any other markers. To illustrate this method in molecular pathological epidemiology research, we examined the associations between smoking status and colorectal cancer subtypes defined by 3 correlated tumor molecular characteristics (CpG island methylator phenotype, microsatellite instability, and the B-Raf protooncogene, serine/threonine kinase (BRAF), mutation) in the Nurses' Health Study (1980-2010) and the Health Professionals Follow-up Study (1986-2010). This method can be widely useful as molecular diagnostics and genomic technologies become routine in clinical medicine and public health.

  2. A New Measurement Equivalence Technique Based on Latent Class Regression as Compared with Multiple Indicators Multiple Causes

    PubMed Central

    Jamali, Jamshid; Ayatollahi, Seyyed Mohammad Taghi; Jafari, Peyman

    2016-01-01

    Background: Measurement equivalence is an essential prerequisite for making valid comparisons in mental health questionnaires across groups. In most methods used for assessing measurement equivalence, which is known as Differential Item Functioning (DIF), latent variables are assumed to be continuous. Objective: To compare a new method called Latent Class Regression (LCR), designed for discrete latent variables, with the multiple indicators multiple cause (MIMIC) model, a continuous latent variable technique, in assessing the measurement equivalence of the 12-item General Health Questionnaire (GHQ-12) across different subgroups of Iranian nurses. Methods: A cross-sectional survey was conducted in 2014 among 771 nurses working in the hospitals of Fars and Bushehr provinces of southern Iran. To identify the Minor Psychiatric Disorders (MPD), the nurses completed self-report GHQ-12 questionnaires and sociodemographic questions. Two uniform-DIF detection methods, LCR and MIMIC, were applied for comparability when the GHQ-12 score was assumed to be discrete and continuous, respectively. Results: The result of fitting LCR with 2 classes indicated that 27.4% of the nurses had MPD. Gender was identified as an influential factor of the level of MPD. LCR and MIMIC agree with detection of DIF and DIF-free items by gender, age, education and marital status in 83.3, 100.0, 91.7 and 83.3% cases, respectively. Conclusions: The results indicated that the GHQ-12 is, to a great degree, an invariant measure for the assessment of MPD among nurses. High convergence between the two methods suggests using the LCR approach in cases of discrete latent variable, e.g. GHQ-12 and adequate sample size. PMID:27482129

  3. Spatial disaggregation of carbon dioxide emissions from road traffic based on multiple linear regression model

    NASA Astrophysics Data System (ADS)

    Shu, Yuqin; Lam, Nina S. N.

    2011-01-01

    Detailed estimates of carbon dioxide emissions at fine spatial scales are critical to both modelers and decision makers dealing with global warming and climate change. Globally, traffic-related emissions of carbon dioxide are growing rapidly. This paper presents a new method based on a multiple linear regression model to disaggregate traffic-related CO2 emission estimates from the parish-level scale to a 1 × 1 km grid scale. Considering the allocation factors (population density, urban area, income, road density) together, we used a correlation and regression analysis to determine the relationship between these factors and traffic-related CO2 emissions, and developed the best-fit model. The method was applied to downscale the traffic-related CO2 emission values by parish (i.e. county) for the State of Louisiana into 1-km2 grid cells. In the four highest parishes in traffic-related CO2 emissions, the biggest area that has above average CO2 emissions is found in East Baton Rouge, and the smallest area with no CO2 emissions is also in East Baton Rouge, but Orleans has the most CO2 emissions per unit area. The result reveals that high CO2 emissions are concentrated in dense road network of urban areas with high population density and low CO2 emissions are distributed in rural areas with low population density, sparse road network. The proposed method can be used to identify the emission "hot spots" at fine scale and is considered more accurate and less time-consuming than the previous methods.

  4. Optimization of end-members used in multiple linear regression geochemical mixing models

    NASA Astrophysics Data System (ADS)

    Dunlea, Ann G.; Murray, Richard W.

    2015-11-01

    Tracking marine sediment provenance (e.g., of dust, ash, hydrothermal material, etc.) provides insight into contemporary ocean processes and helps construct paleoceanographic records. In a simple system with only a few end-members that can be easily quantified by a unique chemical or isotopic signal, chemical ratios and normative calculations can help quantify the flux of sediment from the few sources. In a more complex system (e.g., each element comes from multiple sources), more sophisticated mixing models are required. MATLAB codes published in Pisias et al. solidified the foundation for application of a Constrained Least Squares (CLS) multiple linear regression technique that can use many elements and several end-members in a mixing model. However, rigorous sensitivity testing to check the robustness of the CLS model is time and labor intensive. MATLAB codes provided in this paper reduce the time and labor involved and facilitate finding a robust and stable CLS model. By quickly comparing the goodness of fit between thousands of different end-member combinations, users are able to identify trends in the results that reveal the CLS solution uniqueness and the end-member composition precision required for a good fit. Users can also rapidly check that they have the appropriate number and type of end-members in their model. In the end, these codes improve the user's confidence that the final CLS model(s) they select are the most reliable solutions. These advantages are demonstrated by application of the codes in two case studies of well-studied datasets (Nazca Plate and South Pacific Gyre).

  5. PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data

    PubMed Central

    Hoffman, Gabriel E.; Logsdon, Benjamin A.; Mezey, Jason G.

    2013-01-01

    Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely applied. Here, we present a combined algorithmic and heuristic framework for PUMA (Penalized Unified Multiple-locus Association) analysis that solves the problems of previously proposed methods including computational speed, poor performance on genome-scale simulated data, and identification of too many associations for real data to be biologically plausible. The framework includes a new minorize-maximization (MM) algorithm for generalized linear models (GLM) combined with heuristic model selection and testing methods for identification of robust associations. The PUMA framework implements the penalized maximum likelihood penalties previously proposed for GWAS analysis (i.e. Lasso, Adaptive Lasso, NEG, MCP), as well as a penalty that has not been previously applied to GWAS (i.e. LOG). Using simulations that closely mirror real GWAS data, we show that our framework has high performance and reliably increases power to detect weak associations, while existing PMR methods can perform worse than single marker testing in overall performance. To demonstrate the empirical value of PUMA, we analyzed GWAS data for type 1 diabetes, Crohn's disease, and rheumatoid arthritis, three autoimmune diseases from the original Wellcome Trust Case Control Consortium. Our analysis replicates known associations for these diseases and we discover novel etiologically relevant susceptibility loci that are invisible to standard single marker tests, including six novel associations implicating genes involved in pancreatic function, insulin pathways and immune-cell function in type 1 diabetes; three novel associations implicating genes in pro- and anti-inflammatory pathways in Crohn's disease; and one

  6. Alignment estimation performances of merit function regression with differential wavefront sampling in multiple design configuration optimization

    NASA Astrophysics Data System (ADS)

    Oh, Eunsong; Kim, Sug-Whan; Cho, Seongick; Ryu, Joo-Hyung

    2011-10-01

    In our earlier study[12], we suggested a new alignment algorithm called the Multiple Design Configuration Optimization (MDCO hereafter) method, combining the merit function regression (MFR) computation with the differential wavefront sampling method (DWS). In this study, we report alignment state estimation performances of the method for three target optical systems (i.e. i) a two-mirror Cassegrain telescope of 58 mm in diameter for deep space earth observation, ii) a three-mirror anastigmat of 210 mm in aperture for ocean monitoring from the geostationary orbit, and iii) on-axis/off-axis pairs of an extremely large telescope of 27.4 m in aperture). First we introduced known amounts of alignment state disturbances to the target optical system elements. Example alignment parameter ranges include, but are not limited to, from 800 microns to 10 mm in decenter, and from 0.1 to 1.0 degree in tilt. We then ran alignment state estimation simulation using MDCO, MFR and DWS. The simulation results show that MDCO yields much better estimation performance than MFR and DWS over the alignment disturbance level of up to 150 times larger than the required tolerances. In particular, with its simple single field measurement, MDCO exhibits greater practicality and application potential for shop floor optical testing environments than MFR and DWS.

  7. Current misuses of multiple regression for investigating bivariate hypotheses: an example from the organizational domain.

    PubMed

    O'Neill, Thomas A; McLarnon, Matthew J W; Schneider, Travis J; Gardner, Robert C

    2014-09-01

    By definition, multiple regression (MR) considers more than one predictor variable, and each variable's beta will depend on both its correlation with the criterion and its correlation with the other predictor(s). Despite ad nauseam coverage of this characteristic in organizational psychology and statistical texts, researchers' applications of MR in bivariate hypothesis testing have been the subject of recent and renewed interest. Accordingly, we conducted a targeted survey of the literature by coding articles, covering a five-year span from two top-tier organizational journals, that employed MR for testing bivariate relations. The results suggest that MR coefficients, rather than correlation coefficients, were most common for testing hypotheses of bivariate relations, yet supporting theoretical rationales were rarely offered. Regarding the potential impact on scientific advancement, in almost half of the articles reviewed (44%), at least one conclusion of each study (i.e., that the hypothesis was or was not supported) would have been different, depending on the author's use of correlation or beta to test the bivariate hypothesis. It follows that inappropriate decisions to interpret the correlation versus the beta will affect the accumulation of consistent and replicable scientific evidence. We conclude with recommendations for improving bivariate hypothesis testing. PMID:24142838
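
    The dependence of a beta on the correlations among predictors can be illustrated with the standard two-predictor formula for standardized coefficients. The sketch below is not taken from the article; the function name and the three correlation values are hypothetical, chosen only to show a case where the bivariate correlation is positive while the beta is negative.

```python
# Illustrative sketch: standardized betas for a two-predictor regression,
# computed directly from the three pairwise correlations.
# The correlation values below are made up for illustration only.

def standardized_betas(r_y1, r_y2, r_12):
    """Return (beta1, beta2) for y regressed on standardized x1 and x2."""
    denom = 1.0 - r_12 ** 2
    if denom <= 0.0:
        raise ValueError("predictors are perfectly collinear")
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2

# Hypothetical correlations: criterion with x1, criterion with x2, x1 with x2.
r_y1, r_y2, r_12 = 0.30, 0.60, 0.70
beta1, beta2 = standardized_betas(r_y1, r_y2, r_12)

print(f"r(y, x1) = {r_y1:+.2f}, beta1 = {beta1:+.3f}")
print(f"r(y, x2) = {r_y2:+.2f}, beta2 = {beta2:+.3f}")
```

    With these assumed values, x1 correlates positively with the criterion but receives a negative beta once x2 is in the model, which is the kind of discrepancy that can change whether a bivariate hypothesis appears supported.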

  8. Prediction of peptide retention at different HPLC conditions from multiple linear regression models.

    PubMed

    Baczek, Tomasz; Wiczling, Paweł; Marszałł, Michał; Heyden, Yvan Vander; Kaliszan, Roman

    2005-01-01

    To quantitatively characterize the structure of a peptide and to predict its gradient retention time at given HPLC conditions, three structural descriptors are used: (i) the logarithm of the sum of retention times of the amino acids composing the peptide, log SumAA, (ii) the logarithm of the van der Waals volume of the peptide, log VDW(Vol), and (iii) the logarithm of the peptide's calculated n-octanol-water partition coefficient, clog P. The log SumAA descriptor is obtained from empirical data for 20 natural amino acids, determined in a given HPLC system. The two other descriptors are calculated from the peptides' structural formulas using molecular modeling methods. The quantitative structure-retention relationships (QSRR), built by multiple linear regression, describe HPLC retention of a peptide on a given chromatographic system on which the retention of the 20 amino acids was predetermined. A structurally diversified series of 98 peptides was employed. The predicted gradient retention times on several chromatographic systems were in good agreement with the experimental data. The QSRR equations, derived for a given system operated at variable gradient times and temperatures, allowed for the prediction of peptide retention in that system. Matching the experimental HPLC retention to that theoretically predicted for a presumed peptide could facilitate original protein identification in proteomics. In conjunction with MS data, prediction of the retention time for a given peptide might be used to improve the confidence of peptide identifications and to increase the number of correctly identified peptides.

  9. Multiple regression and principal components analysis of puberty and growth in cattle.

    PubMed

    Baker, J F; Stewart, T S; Long, C R; Cartwright, T C

    1988-09-01

    Multiple regression and principal components analyses were employed to examine relationships among pubertal and growth characters. Records used were from 424 bulls and 475 heifers produced by a diallel mating of Angus, Brahman, Hereford, Holstein and Jersey breeds. Characters studied were age, weight and height at puberty and measurements of weight and hip height from 9 to 21 mo of age; pelvic measurements of heifers also were included. Measurements of weight and height near 1 yr of age were related most highly to pubertal age, weight and height. Larger size near 1 yr of age was associated with younger, larger animals at puberty. Growth rate was associated with pubertal characters before, but not after, adjustment for effects of breed-type. Principal components of the variation of pubertal and growth characters among animals were strongly related to both weight and height. The majority of the variation among breed-types was due to height. Characteristic vectors of principal components describing the variation of bulls and heifers were strikingly similar. The variance-covariance structure of pubertal characters was essentially the same for both sexes even though the mean values of the characters differed. PMID:3170369

  10. Application of multiple regression analysis in optimization of anastrozole-loaded PLGA nanoparticles.

    PubMed

    Kumar, Abhinesh; Sawant, Krutika K

    2014-01-01

    The present investigation deals with development of anastrozole-loaded PLGA nanoparticles (NPs) as an alternate to conventional cancer therapy. The NPs were prepared by nanoprecipitation method and optimized using multiple regression analysis. Independent variables included drug:polymer ratio (X1), polymer concentration in organic phase (X2) and surfactant concentration in aqueous phase (X3) while dependent variables were percentage drug entrapment (PDE) and particle size (PS). Results of desirability criteria, check point analysis and normalized error were considered for selecting the formulation with highest PDE and lowest PS. Prepared NPs were characterized for zeta potential, transmission electron microscopy (TEM), differential scanning calorimetry (DSC) and in vitro drug release studies. DSC and TEM studies indicated absence of any drug-polymer interaction and spherical nature of NPs, respectively. In vitro drug release showed biphasic pattern exhibiting Fickian diffusion-based release mechanism. This delivery system of anastrozole is expected to reduce the side effects associated with the conventional cancer therapy by reducing dosing frequency.

  11. [Multiple stepwise regression analysis of etiological factors of esophageal cancer in Cixian county].

    PubMed

    Hou, J

    1989-01-01

    Cixian county, one of the high-risk counties for esophageal cancer in the world, had a standardized mortality of 142.19 per 10^5 population for 1969-1971. The incidence of esophageal cancer dropped year by year from 1974 to 1982. The significance of this incidence trend was studied, and the results are highly significant (P < 0.001). The suspected causative factors of esophageal cancer, comprising five independent variables, X1 (number of people taking sanitized water), X2 (number of people on pickled Chinese cabbage), X3 (annual output of fruit), X4 (annual output of fresh vegetable) and X5 (annual output of sweet potato), and one dependent variable Y (morbidity of esophageal cancer), were studied by correlation analysis and multiple stepwise regression. Three correlated factors (X1, X2, and X5) with a significant effect on esophageal cancer were selected from the five suspected factors. The results indicated that taking sanitized water, reducing the number of people on pickled Chinese cabbage, changing the structure of food and keeping the nutrient balance might decrease the incidence of esophageal cancer. PMID:2789130

  12. A Nonlinear Causality Estimator Based on Non-Parametric Multiplicative Regression

    PubMed Central

    Nicolaou, Nicoletta; Constandinou, Timothy G.

    2016-01-01

    Causal prediction has become a popular tool for neuroscience applications, as it allows the study of relationships between different brain areas during rest, cognitive tasks or brain disorders. We propose a nonparametric approach for the estimation of nonlinear causal prediction for multivariate time series. In the proposed estimator, CNPMR, Autoregressive modeling is replaced by Nonparametric Multiplicative Regression (NPMR). NPMR quantifies interactions between a response variable (effect) and a set of predictor variables (cause); here, we modified NPMR for model prediction. We also demonstrate how a particular measure, the sensitivity Q, could be used to reveal the structure of the underlying causal relationships. We apply CNPMR on artificial data with known ground truth (5 datasets), as well as physiological data (2 datasets). CNPMR correctly identifies both linear and nonlinear causal connections that are present in the artificial data, as well as physiologically relevant connectivity in the real data, and does not seem to be affected by filtering. The Sensitivity measure also provides useful information about the latent connectivity. The proposed estimator addresses many of the limitations of linear Granger causality and other nonlinear causality estimators. CNPMR is compared with pairwise and conditional Granger causality (linear) and Kernel-Granger causality (nonlinear). The proposed estimator can be applied to pairwise or multivariate estimations without any modifications to the main method. Its nonparametric nature, its ability to capture nonlinear relationships and its robustness to filtering make it appealing for a number of applications. PMID:27378901

  13. A Nonlinear Causality Estimator Based on Non-Parametric Multiplicative Regression.

    PubMed

    Nicolaou, Nicoletta; Constandinou, Timothy G

    2016-01-01

    Causal prediction has become a popular tool for neuroscience applications, as it allows the study of relationships between different brain areas during rest, cognitive tasks or brain disorders. We propose a nonparametric approach for the estimation of nonlinear causal prediction for multivariate time series. In the proposed estimator, CNPMR, Autoregressive modeling is replaced by Nonparametric Multiplicative Regression (NPMR). NPMR quantifies interactions between a response variable (effect) and a set of predictor variables (cause); here, we modified NPMR for model prediction. We also demonstrate how a particular measure, the sensitivity Q, could be used to reveal the structure of the underlying causal relationships. We apply CNPMR on artificial data with known ground truth (5 datasets), as well as physiological data (2 datasets). CNPMR correctly identifies both linear and nonlinear causal connections that are present in the artificial data, as well as physiologically relevant connectivity in the real data, and does not seem to be affected by filtering. The Sensitivity measure also provides useful information about the latent connectivity. The proposed estimator addresses many of the limitations of linear Granger causality and other nonlinear causality estimators. CNPMR is compared with pairwise and conditional Granger causality (linear) and Kernel-Granger causality (nonlinear). The proposed estimator can be applied to pairwise or multivariate estimations without any modifications to the main method. Its nonparametric nature, its ability to capture nonlinear relationships and its robustness to filtering make it appealing for a number of applications.

  14. Anomalous particle pinch and scaling of vin/D based on transport analysis and multiple regression

    NASA Astrophysics Data System (ADS)

    Becker, G.; Kardaun, O.

    2007-01-01

    Predictions of density profiles in current tokamaks and ITER require a validated scaling relation for vin/D where vin is the anomalous inward drift velocity and D is the anomalous diffusion coefficient. Transport analysis is necessary for determining the anomalous particle pinch from measured density profiles and for separating the impact of particle sources. A set of discharges in ASDEX Upgrade, DIII-D, JET and ASDEX is analysed using a special version of the 1.5-D BALDUR transport code. Profiles of ρsvin/D with ρs the effective separatrix radius, five other dimensionless parameters and many further quantities in the confinement zone are compiled, resulting in the dataset VIND1.dat, which covers a wide parameter range. Weighted multiple regression is applied to the ASDEX Upgrade subset which leads to a two-term scaling $\rho_s v_{in}(x')/D(x') = 0.0432\,[(L_{T_{\mathrm{e}}}(\bar{x}')/\rho_s)^{-2.58} + 7.13\,U_L^{1.55}$ ...

  15. A factor analysis-multiple regression model for source apportionment of suspended particulate matter

    NASA Astrophysics Data System (ADS)

    Okamoto, Shin'ichi; Hayashi, Masayuki; Nakajima, Masaomi; Kainuma, Yasutaka; Shiozawa, Kiyoshige

    A factor analysis-multiple regression (FA-MR) model has been used for a source apportionment study in the Tokyo metropolitan area. By a varimax rotated factor analysis, five source types could be identified: refuse incineration, soil and automobile, secondary particles, sea salt and steel mill. Quantitative estimations using the FA-MR model corresponded to the calculated contributing concentrations determined by using a weighted least-squares CMB model. However, the source type of refuse incineration identified by the FA-MR model was similar to that of biomass burning, rather than that produced by an incineration plant. The estimated contributions of sea salt and steel mill by the FA-MR model contained those of other sources, which have the same temporal variation of contributing concentrations. This symptom was caused by a multicollinearity problem. Although this result shows the limitation of the multivariate receptor model, it gives useful information concerning source types and their distribution by comparing with the results of the CMB model. In the Tokyo metropolitan area, the contributions from soil (including road dust), automobile, secondary particles and refuse incineration (biomass burning) were larger than industrial contributions: fuel oil combustion and steel mill. However, since vanadium is highly correlated with SO₄²⁻ and other secondary particle related elements, a major portion of secondary particles is considered to be related to fuel oil combustion.
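
    The multicollinearity problem described here, where two source contributions share the same temporal variation, can be made concrete with a variance inflation factor (VIF). The sketch below is only an illustration: the two time series are invented, and the two-predictor shortcut VIF = 1/(1 - r^2) is a textbook identity, not a calculation reported in the record.

```python
# Illustrative sketch: how two regressors with nearly identical temporal
# variation inflate coefficient variance. All numbers are invented.
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical factor scores for two sources over five sampling periods.
sea_salt = [1.0, 2.0, 3.0, 4.0, 5.0]
steel_mill = [1.1, 2.0, 2.9, 4.2, 5.0]

r = pearson_r(sea_salt, steel_mill)
vif = vif_two_predictors(sea_salt, steel_mill)
print(f"r = {r:.4f}, VIF = {vif:.1f}")
```

    A VIF far above the commonly quoted rule-of-thumb threshold of about 10 indicates that the regression cannot cleanly separate the two contributions, which is consistent with the mixing of source estimates noted in the abstract.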

  16. [Clinical research XX. From clinical judgment to multiple logistic regression model].

    PubMed

    Berea-Baltierra, Ricardo; Rivas-Ruiz, Rodolfo; Pérez-Rodríguez, Marcela; Palacios-Cruz, Lino; Moreno, Jorge; Talavera, Juan O

    2014-01-01

    The complexity of the causality phenomenon in clinical practice implies that the result of a maneuver is not solely caused by the maneuver, but by the interaction among the maneuver and other baseline factors or variables occurring during the maneuver. This requires methodological designs that allow the evaluation of these variables. When the outcome is a binary variable, we use the multiple logistic regression model (MLRM). This multivariate model is useful when we want to predict or explain the effect of a maneuver or exposure on the outcome while adjusting for the effect of several risk factors. In order to perform an MLRM, the outcome or dependent variable must be a binary variable and both categories must be mutually exclusive (e.g. alive/dead, healthy/ill); on the other hand, independent variables or risk factors may be either qualitative or quantitative. The effect measure obtained from this model is the odds ratio (OR) with 95% confidence intervals (CI), from which we can estimate the proportion of the outcome's variability explained through the risk factors. For these reasons, the MLRM is used in clinical research, since one of the main objectives in clinical practice comprises the ability to predict or explain an event where different risk or prognostic factors are taken into account.
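
    The odds ratio and its 95% confidence interval follow from a fitted logistic coefficient by exponentiation. The sketch below assumes a coefficient and standard error have already been estimated by some logistic regression routine; the function name and both input values are hypothetical, and the interval is the usual Wald interval rather than anything specific to this article.

```python
# Illustrative sketch: converting a logistic regression coefficient and its
# standard error into an odds ratio with a Wald 95% confidence interval.
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper), where OR = exp(beta)."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )

# Hypothetical fitted values for one risk factor (not from the article).
beta_hat = math.log(2.0)   # coefficient on the log-odds scale
se_hat = 0.20              # standard error of the coefficient

odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta_hat, se_hat)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

    An interval that excludes 1 is conventionally read as a statistically significant association at the 5% level, after adjustment for the other variables in the model.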

  17. Fast Quantitative Analysis Of Museum Objects Using Laser-Induced Breakdown Spectroscopy And Multiple Regression Algorithms

    NASA Astrophysics Data System (ADS)

    Lorenzetti, G.; Foresta, A.; Palleschi, V.; Legnaioli, S.

    2009-09-01

    The recent development of mobile instrumentation, specifically devoted to in situ analysis and study of museum objects, allows the acquisition of many LIBS spectra in a very short time. However, such a large amount of data calls for new analytical approaches which would guarantee a prompt analysis of the results obtained. In this communication, we will present and discuss the advantages of statistical analytical methods, such as Partial Least Squares (PLS) Multiple Regression algorithms, vs. the classical calibration curve approach. PLS algorithms make it possible to obtain information on the composition of the objects under study in real time; this feature of the method, compared to the traditional off-line analysis of the data, is extremely useful for the optimization of the measurement times and number of points associated with the analysis. In fact, the real-time availability of the compositional information gives the possibility of concentrating the attention on the most 'interesting' parts of the object, without over-sampling the zones which would not provide useful information for the scholars or the conservators. Some examples of the applications of this method will be presented, including the studies recently performed by the researchers of the Applied Laser Spectroscopy Laboratory on museum bronze objects.

  18. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  19. Source apportionment based on an atmospheric dispersion model and multiple linear regression analysis

    NASA Astrophysics Data System (ADS)

    Fushimi, Akihiro; Kawashima, Hiroto; Kajihara, Hideo

    Understanding the contribution of each emission source of air pollutants to ambient concentrations is important to establish effective measures for risk reduction. We have developed a source apportionment method based on an atmospheric dispersion model and multiple linear regression analysis (MLR) in conjunction with ambient concentrations simultaneously measured at points in a grid network. We used a Gaussian plume dispersion model developed by the US Environmental Protection Agency called the Industrial Source Complex model (ISC) in the method. Our method does not require emission amounts or source profiles. The method was applied to the case of benzene in the vicinity of the Keiyo Central Coastal Industrial Complex (KCCIC), one of the biggest industrial complexes in Japan. Benzene concentrations were simultaneously measured from December 2001 to July 2002 at sites in a grid network established in the KCCIC and the surrounding residential area. The method was used to estimate benzene emissions from the factories in the KCCIC and from automobiles along a section of a road, and then the annual average contribution of the KCCIC to the ambient concentrations was estimated based on the estimated emissions. The estimated contributions of the KCCIC were 65% inside the complex, 49% at 0.5-km sites, 35% at 1.5-km sites, 20% at 3.3-km sites, and 9% at a 5.6-km site. The estimated concentrations agreed well with the measured values. The estimated emissions from the factories and the road were slightly larger than those reported in the first Pollutant Release and Transfer Register (PRTR). These results support the reliability of our method. This method can be applied to other chemicals or regions to achieve reasonable source apportionments.
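
    One plausible reading of the regression step is that a dispersion model supplies, for each receptor site, the concentration that each source would produce per unit emission, and the unknown emission rates are then estimated by least squares against the measured concentrations. The sketch below follows that reading under stated assumptions: the matrix of unit-emission concentrations, the emission rates, and all function names are invented, and no background term or non-negativity constraint is included.

```python
# Illustrative sketch: estimating source emission rates by least squares from
# dispersion-model output per unit emission. All numbers are invented.

def solve_least_squares(A, c):
    """Minimize ||A q - c||^2 via the normal equations (A^T A) q = A^T c."""
    n_src = len(A[0])
    ata = [[sum(row[i] * row[j] for row in A) for j in range(n_src)]
           for i in range(n_src)]
    atc = [sum(row[i] * ci for row, ci in zip(A, c)) for i in range(n_src)]
    # Augmented matrix, then Gaussian elimination with partial pivoting.
    M = [ata[i] + [atc[i]] for i in range(n_src)]
    for col in range(n_src):
        pivot = max(range(col, n_src), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n_src):
            f = M[r][col] / M[col][col]
            for k in range(col, n_src + 1):
                M[r][k] -= f * M[col][k]
    # Back substitution.
    q = [0.0] * n_src
    for i in reversed(range(n_src)):
        s = sum(M[i][k] * q[k] for k in range(i + 1, n_src))
        q[i] = (M[i][n_src] - s) / M[i][i]
    return q

# Rows: receptor sites. Columns: modelled concentration per unit emission
# from source 1 (e.g. a factory complex) and source 2 (e.g. a road).
A = [[0.80, 0.10], [0.50, 0.20], [0.30, 0.40], [0.10, 0.60], [0.05, 0.30]]
q_true = [2.0, 0.5]  # hypothetical emission rates used to synthesize data
measured = [sum(a * q for a, q in zip(row, q_true)) for row in A]

q_est = solve_least_squares(A, measured)
# Fractional contribution of source 1 at each site, from estimated emissions.
share_1 = [row[0] * q_est[0] / sum(a * q for a, q in zip(row, q_est))
           for row in A]
print("estimated emissions:", [round(q, 3) for q in q_est])
print("source 1 share by site:", [round(s, 2) for s in share_1])
```

    Because the synthetic measurements contain no noise, the estimated emissions match the values used to generate them; with real monitoring data the fit would be approximate and the residuals would indicate how well the modelled plume shapes explain the observations.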

  20. Oral health-related risk behaviours and attitudes among Croatian adolescents--multiple logistic regression analysis.

    PubMed

    Spalj, Stjepan; Spalj, Vedrana Tudor; Ivanković, Luida; Plancak, Darije

    2014-03-01

    The aim of this study was to explore the patterns of oral health-related risk behaviours in relation to dental status, attitudes, motivation and knowledge among Croatian adolescents. The assessment was conducted in a sample of 750 male subjects - military recruits aged 18-28 in Croatia - using a questionnaire and clinical examination. Mean number of decayed, missing and filled teeth (DMFT) and Significant Caries Index (SiC) were calculated. Multiple logistic regression models were created for analysis. Although models of risk behaviours were statistically significant, their explanatory values were quite low. Five of them--rarely toothbrushing, not using hygiene auxiliaries, rarely visiting dentist, toothache as a primary reason to visit dentist, and demand for tooth extraction due to toothache--had the highest explanatory values ranging from 21-29% and correctly classified 73-89% of subjects. Toothache as a primary reason to visit dentist, extraction as preferable therapy when toothache occurs, not having brushing education in school and frequent gingival bleeding were significantly related to population with high caries experience (DMFT ≥ 14 according to SiC), producing odds ratios of 1.6 (95% CI 1.07-2.46), 2.1 (95% CI 1.29-3.25), 1.8 (95% CI 1.21-2.74) and 2.4 (95% CI 1.21-2.74) respectively. The DMFT ≥ 14 model had a low explanatory value of 6.5% and correctly classified 83% of subjects. It can be concluded that oral health-related risk behaviours are interrelated. Poor association was seen between attitudes concerning oral health and oral health-related risk behaviours, indicating insufficient motivation to change lifestyle and habits. Self-reported oral hygiene habits were not strongly related to dental status.

  1. Analyzing Regression-Discontinuity Designs with Multiple Assignment Variables: A Comparative Study of Four Estimation Methods

    ERIC Educational Resources Information Center

    Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.

    2013-01-01

    In a traditional regression-discontinuity design (RDD), units are assigned to treatment on the basis of a cutoff score and a continuous assignment variable. The treatment effect is measured at a single cutoff location along the assignment variable. This article introduces the multivariate regression-discontinuity design (MRDD), where multiple…

  2. Multiple Regression Methodology in the "Journal of Education for Students Placed at Risk": Effect Sizes and Structure Coefficients.

    ERIC Educational Resources Information Center

    Herring, Jennifer C.

    This study reviewed the statistical practices in published research articles in the Journal of Education for Students Placed at Risk to determine the reporting of effect sizes and structure coefficients. Of the 12 quantitative studies found in the last 3 volumes of the journal, only 3 were identified as using multiple regression analysis. Two of…

  3. The Use of Multiple Regression Models to Determine if Conjoint Analysis Should Be Conducted on Aggregate Data.

    ERIC Educational Resources Information Center

    Fraas, John W.; Newman, Isadore

    1996-01-01

    In a conjoint-analysis consumer-preference study, researchers must determine whether the product factor estimates, which measure consumer preferences, should be calculated and interpreted for each respondent or collectively. Multiple regression models can determine whether to aggregate data by examining factor-respondent interaction effects. This…

  4. Physical and Cognitive-Affective Factors Associated with Fatigue in Individuals with Fibromyalgia: A Multiple Regression Analysis

    ERIC Educational Resources Information Center

    Muller, Veronica; Brooks, Jessica; Tu, Wei-Mo; Moser, Erin; Lo, Chu-Ling; Chan, Fong

    2015-01-01

    Purpose: The main objective of this study was to determine the extent to which physical and cognitive-affective factors are associated with fibromyalgia (FM) fatigue. Method: A quantitative descriptive design using correlation techniques and multiple regression analysis. The participants consisted of 302 members of the National Fibromyalgia &…

  5. Latent Variable Regression 4-Level Hierarchical Model Using Multisite Multiple-Cohorts Longitudinal Data. CRESST Report 801

    ERIC Educational Resources Information Center

    Choi, Kilchan

    2011-01-01

    This report explores a new latent variable regression 4-level hierarchical model for monitoring school performance over time using multisite multiple-cohorts longitudinal data. This kind of data set has a 4-level hierarchical structure: time-series observation nested within students who are nested within different cohorts of students. These…

  6. R Squared Shrinkage in Multiple Regression Research: An Empirical Evaluation of Use and Impact of Adjusted Effect Formulae.

    ERIC Educational Resources Information Center

    Thatcher, Greg W.; Henson, Robin K.

    This study examined research in training and development to determine effect size reporting practices. It focused on the reporting of corrected effect sizes in research articles using multiple regression analyses. When possible, researchers calculated corrected effect sizes and determined if the associated shrinkage could have impacted researcher…
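
    Shrinkage of this kind is commonly quantified with the familiar adjusted R-squared correction. The sketch below uses that standard formula as an assumption, since the record does not say which adjustment formulae the study evaluated; the function name and the sample values are hypothetical.

```python
# Illustrative sketch: adjusted R-squared and the resulting shrinkage.
# Uses the common correction 1 - (1 - R^2) * (n - 1) / (n - p - 1).

def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical study: R^2 = 0.50 from 30 cases and 5 predictors.
r2_observed, n_cases, n_predictors = 0.50, 30, 5
r2_adjusted = adjusted_r_squared(r2_observed, n_cases, n_predictors)
shrinkage = r2_observed - r2_adjusted

print(f"adjusted R^2 = {r2_adjusted:.3f}, shrinkage = {shrinkage:.3f}")
```

    Shrinkage grows as the number of predictors approaches the sample size, which is why a small-sample study reporting only the uncorrected R-squared can overstate its effect.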

  7. Estimating the Coefficient of Cross-validity in Multiple Regression: A Comparison of Analytical and Empirical Methods.

    ERIC Educational Resources Information Center

    Kromrey, Jeffrey D.; Hines, Constance V.

    1996-01-01

    The accuracy of three analytical formulas for shrinkage estimation and four empirical techniques were investigated in a Monte Carlo study of the coefficient of cross-validity in multiple regression. Substantial statistical bias was evident for all techniques except the formula of M. W. Brown (1975) and multicross-validation. (SLD)

  8. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    ERIC Educational Resources Information Center

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  9. Insights into antioxidant activity of 1-adamantylthiopyridine analogs using multiple linear regression.

    PubMed

    Worachartcheewan, Apilak; Nantasenamat, Chanin; Owasirikul, Wiwat; Monnor, Teerawat; Naruepantawart, Orapan; Janyapaisarn, Sayamon; Prachayasittikul, Supaluk; Prachayasittikul, Virapong

    2014-02-12

    A data set of 1-adamantylthiopyridine analogs (1-19) with antioxidant activity, comprising 2,2-diphenyl-1-picrylhydrazyl (DPPH) and superoxide dismutase (SOD) activities, was used for constructing quantitative structure-activity relationship (QSAR) models. Molecular structures were geometrically optimized at the B3LYP/6-31g(d) level and subjected to further molecular descriptor calculation using Dragon software. Multiple linear regression (MLR) was employed for the development of QSAR models using 3 significant descriptors (i.e. Mor29e, F04[N-N] and GATS5v) for predicting the DPPH activity and 2 essential descriptors (i.e. EEig06r and Mor06v) for predicting the SOD activity. Such molecular descriptors accounted for the effects and positions of substituent groups (R) on the 1-adamantylthiopyridine ring. The results showed that high atomic electronegativity of a polar substituent group (R = CO2H) afforded high DPPH activity, while substituents with high atomic van der Waals volumes such as R = Br gave high SOD activity. Leave-one-out cross-validation (LOO-CV) and an external test set were used for model validation. Correlation coefficient (QCV) and root mean squared error (RMSECV) of the LOO-CV set for predicting DPPH activity were 0.5784 and 8.3440, respectively, while QExt and RMSEExt of the external test set corresponded to 0.7353 and 4.2721, respectively. Furthermore, QCV and RMSECV values of the LOO-CV set for predicting SOD activity were 0.7549 and 5.6380, respectively. The QSAR model's equation was then used in predicting the SOD activity of tested compounds and these were subsequently verified experimentally. It was observed that the experimental activity was more potent than the predicted activity. Structure-activity relationships of significant descriptors governing antioxidant activity are also discussed. The QSAR models investigated herein are anticipated to be useful in the rational design and development of novel compounds with antioxidant activity. PMID

  10. Multiple linear regression to estimate time-frequency electrophysiological responses in single trials.

    PubMed

    Hu, L; Zhang, Z G; Mouraux, A; Iannetti, G D

    2015-05-01

    Transient sensory, motor or cognitive events elicit not only phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist in stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, their trial-to-trial variability, which can reflect physiologically important information, is lost by across-trial averaging. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at single-trial level (e.g., perceived stimulus intensity and ERP magnitude). The methods described in this article allow exploring fully non-phase-locked stimulus-induced cortical

  11. Multiple linear regression to estimate time-frequency electrophysiological responses in single trials

    PubMed Central

    Hu, L.; Zhang, Z.G.; Mouraux, A.; Iannetti, G.D.

    2015-01-01

    Transient sensory, motor or cognitive events elicit not only phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist in stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, their trial-to-trial variability, which can reflect physiologically important information, is lost by across-trial averaging. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at single-trial level (e.g., perceived stimulus intensity and ERP magnitude). The methods described in this article allow exploring fully non-phase-locked stimulus-induced cortical

  12. Multiple linear regression to estimate time-frequency electrophysiological responses in single trials.

    PubMed

    Hu, L; Zhang, Z G; Mouraux, A; Iannetti, G D

    2015-05-01

    Transient sensory, motor or cognitive events elicit not only phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist in stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, their trial-to-trial variability, which can reflect physiologically important information, is lost by across-trial averaging. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at single-trial level (e.g., perceived stimulus intensity and ERP magnitude). The methods described in this article allow exploring fully non-phase-locked stimulus-induced cortical

  13. Integrative analysis of multiple diverse omics datasets by sparse group multitask regression.

    PubMed

    Lin, Dongdong; Zhang, Jigang; Li, Jingyao; He, Hao; Deng, Hong-Wen; Wang, Yu-Ping

    2014-01-01

    A variety of high-throughput genome-wide assays enable the exploration of genetic risk factors underlying complex traits. Although these studies have had a remarkable impact on identifying susceptibility biomarkers, they suffer from issues such as limited sample size and low reproducibility. Combining individual studies of different genetic levels/platforms has the promise to improve the power and consistency of biomarker identification. In this paper, we propose a novel integrative method, namely sparse group multitask regression, for integrating diverse omics datasets, platforms, and populations to identify risk genes/factors of complex diseases. This method combines multitask learning with sparse group regularization, which will: (1) treat the biomarker identification in each single study as a task and then combine them by multitask learning; (2) group variables from all studies for identifying significant genes; (3) enforce a sparse constraint on groups of variables to overcome the "small sample, but large variables" problem. We introduce two sparse group penalties, sparse group lasso and sparse group ridge, in our multitask model, and provide an effective algorithm for each model. In addition, we propose a significance test for the identification of potential risk genes. Two simulation studies are performed to evaluate the performance of our integrative method by comparing it with a conventional meta-analysis method. The results show that our sparse group multitask method significantly outperforms the meta-analysis method. In an application to our osteoporosis studies, 7 genes are identified as significant genes by our method and are found to have significant effects in three other independent studies used for validation. The most significant gene, SOD2, has been identified in our previous osteoporosis study involving the same expression dataset. Several other genes such as TREML2, HTR1E, and GLO1 are shown to be novel susceptibility genes for osteoporosis, as confirmed from other

  14. Multiple linear regression models to fit magnitude using rupture length, rupture width, rupture area, and surface displacement

    NASA Astrophysics Data System (ADS)

    Chu, A.; Zhuang, J.

    2015-12-01

    Wells and Coppersmith (1994) have used fault data to fit simple linear regression (SLR) models to explain linear relations between moment magnitude and logarithms of fault measurements such as rupture length, rupture width, rupture area and surface displacement. Our work extends their analyses to multiple linear regression (MLR) models by considering two or more predictors with updated data. Treating the quantitative variables (rupture length, rupture width, rupture area and surface displacement) as predictors to fit linear regression models on magnitude, we have discovered that the two-predictor model using rupture area and maximum displacement fits the best. The next best alternative predictors are surface length and rupture area. Neither slip type nor slip direction is a significant predictor by fitting of analysis of variance (ANOVA) and analysis of covariance (ANCOVA) models. Corrected Akaike information criterion (Burnham and Anderson, 2002) is used as a model assessment criterion. Comparisons between simple linear regression models of Wells and Coppersmith (1994) and our multiple linear regression models are presented. Our work is done using fault data from Wells and Coppersmith (1994) and new data from Ellswort (2000), Hanks and Bakun (2002, 2008), Shaw (2013), and Finite-Source Rupture Model Database (http://equake-rc.info/SRCMOD/, 2015).
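
    The model comparison described above relies on the corrected Akaike information criterion. A minimal sketch of that criterion for a least-squares fit follows. It assumes the common form AICc = n ln(RSS/n) + 2k + 2k(k+1)/(n-k-1), with k counting every estimated parameter including the intercept and the error variance; the function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def aicc_least_squares(rss, n, k):
    """Corrected AIC for a least-squares fit.

    rss: residual sum of squares
    n:   number of observations
    k:   number of estimated parameters (assumed here to include the
         intercept and the error variance)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = n * math.log(rss / n) + 2 * k
    # Small-sample correction term; vanishes as n grows.
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Hypothetical fits, for illustration only (not values from the study).
candidates = {
    "one predictor": aicc_least_squares(rss=12.0, n=40, k=3),
    "two predictors": aicc_least_squares(rss=9.0, n=40, k=4),
}
# The model with the smallest AICc is preferred.
best = min(candidates, key=candidates.get)
```

    Only differences in AICc between models fitted to the same data are meaningful, so the absolute values carry no interpretation on their own.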

  15. Adults' strategies for simple addition and multiplication: verbal self-reports and the operand recognition paradigm.

    PubMed

    Metcalfe, Arron W S; Campbell, Jamie I D

    2011-05-01

    Accurate measurement of cognitive strategies is important in diverse areas of psychological research. Strategy self-reports are a common measure, but C. Thevenot, M. Fanget, and M. Fayol (2007) proposed a more objective method to distinguish different strategies in the context of mental arithmetic. In their operand recognition paradigm, speed of recognition memory for problem operands after solving a problem indexes strategy (e.g., direct memory retrieval vs. a procedural strategy). Here, in 2 experiments, operand recognition time was the same following simple addition or multiplication, but, consistent with a wide variety of previous research, strategy reports indicated much greater use of procedures (e.g., counting) for addition than multiplication. Operation, problem size (e.g., 2 + 3 vs. 8 + 9), and operand format (digits vs. words) had interactive effects on reported procedure use that were not reflected in recognition performance. Regression analyses suggested that recognition time was influenced at least as much by the relative difficulty of the preceding problem as by the strategy used. The findings indicate that the operand recognition paradigm is not a reliable substitute for strategy reports and highlight the potential impact of difficulty-related carryover effects in sequential cognitive tasks. PMID:21261421

  16. Hierarchical Regression for Multiple Comparisons in a Case-Control Study of Occupational Risks for Lung Cancer

    PubMed Central

    Corbin, Marine; Richiardi, Lorenzo; Vermeulen, Roel; Kromhout, Hans; Merletti, Franco; Peters, Susan; Simonato, Lorenzo; Steenland, Kyle; Pearce, Neil; Maule, Milena

    2012-01-01

    Background Occupational studies often involve multiple comparisons and therefore suffer from false positive findings. Semi-Bayes adjustment methods have sometimes been used to address this issue. Hierarchical regression is a more general approach, including Semi-Bayes adjustment as a special case, that aims at improving the validity of standard maximum-likelihood estimates in the presence of multiple comparisons by incorporating similarities between the exposures of interest in a second-stage model. Methodology/Principal Findings We re-analysed data from an occupational case-control study of lung cancer, applying hierarchical regression. In the second-stage model, we included the exposure to three known lung carcinogens (asbestos, chromium and silica) for each occupation, under the assumption that occupations entailing similar carcinogenic exposures are associated with similar risks of lung cancer. Hierarchical regression estimates had smaller confidence intervals than maximum-likelihood estimates. The shrinkage toward the null was stronger for extreme, less stable estimates (e.g., “specialised farmers”: maximum-likelihood OR: 3.44, 95%CI 0.90–13.17; hierarchical regression OR: 1.53, 95%CI 0.63–3.68). Unlike Semi-Bayes adjustment toward the global mean, hierarchical regression did not shrink all the ORs towards the null (e.g., “Metal smelting, converting and refining furnacemen”: maximum-likelihood OR: 1.07, Semi-Bayes OR: 1.06, hierarchical regression OR: 1.26). Conclusions/Significance Hierarchical regression could be a valuable tool in occupational studies in which disease risk is estimated for a large amount of occupations when we have information available on the key carcinogenic exposures involved in each occupation. With the constant progress in exposure assessment methods in occupational settings and the availability of Job Exposure Matrices, it should become easier to apply this approach. PMID:22701732
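
    The shrinkage described above can be illustrated with the Semi-Bayes special case, in which a maximum-likelihood log odds ratio is averaged with a prior mean using inverse-variance weights. The sketch below is not the authors' second-stage model: the prior mean of 0 (the null) and the prior variance of 0.25 are assumptions chosen for illustration, and the function names are hypothetical. Only the odds ratio and confidence interval for "specialised farmers" come from the abstract.

```python
import math


def log_or_and_se(or_hat, ci_low, ci_high, z=1.96):
    """Recover the log odds ratio and its standard error from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return math.log(or_hat), se


def semi_bayes_shrink(beta, se, prior_mean, prior_var):
    """Inverse-variance weighted average of an estimate and a prior mean."""
    weight = prior_var / (prior_var + se ** 2)  # weight on the ML estimate
    post_mean = weight * beta + (1 - weight) * prior_mean
    post_var = 1.0 / (1.0 / se ** 2 + 1.0 / prior_var)
    return post_mean, math.sqrt(post_var)


# Maximum-likelihood OR and 95% CI quoted in the abstract.
beta, se = log_or_and_se(3.44, 0.90, 13.17)
# Assumed prior: mean 0 on the log scale, variance 0.25 (illustrative only).
post_mean, post_se = semi_bayes_shrink(beta, se, prior_mean=0.0, prior_var=0.25)
shrunk_or = math.exp(post_mean)
```

    Because the weight on the estimate falls as its standard error grows, imprecise estimates are pulled furthest toward the prior mean. In full hierarchical regression the prior mean would come from the second-stage model rather than being fixed at the null.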

  17. Land use regression modeling of intra-urban residential variability in multiple traffic-related air pollutants

    PubMed Central

    Clougherty, Jane E; Wright, Rosalind J; Baxter, Lisa K; Levy, Jonathan I

    2008-01-01

    Background There is a growing body of literature linking GIS-based measures of traffic density to asthma and other respiratory outcomes. However, no consensus exists on which traffic indicators best capture variability in different pollutants or within different settings. As part of a study on childhood asthma etiology, we examined variability in outdoor concentrations of multiple traffic-related air pollutants within urban communities, using a range of GIS-based predictors and land use regression techniques. Methods We measured fine particulate matter (PM2.5), nitrogen dioxide (NO2), and elemental carbon (EC) outside 44 homes representing a range of traffic densities and neighborhoods across Boston, Massachusetts and nearby communities. Multiple three to four-day average samples were collected at each home during winters and summers from 2003 to 2005. Traffic indicators were derived using Massachusetts Highway Department data and direct traffic counts. Multivariate regression analyses were performed separately for each pollutant, using traffic indicators, land use, meteorology, site characteristics, and central site concentrations. Results PM2.5 was strongly associated with the central site monitor (R2 = 0.68). Additional variability was explained by total roadway length within 100 m of the home, smoking or grilling near the monitor, and block-group population density (R2 = 0.76). EC showed greater spatial variability, especially during winter months, and was predicted by roadway length within 200 m of the home. The influence of traffic was greater under low wind speed conditions, and concentrations were lower during summer (R2 = 0.52). NO2 showed significant spatial variability, predicted by population density and roadway length within 50 m of the home, modified by site characteristics (obstruction), and with higher concentrations during summer (R2 = 0.56). Conclusion Each pollutant examined displayed somewhat different spatial patterns within urban neighborhoods

  18. Regression models for patient-reported measures having ordered categories recorded on multiple occasions

    PubMed Central

    Preisser, J. S.; Phillips, C.; Perin, J.; Schwartz, T. A.

    2011-01-01

    Objectives The article reviews proportional and partial proportional odds regression for ordered categorical outcomes, such as patient-reported measures, that are frequently used in clinical research in dentistry. Methods The proportional odds regression model for ordinal data is a generalization of ordinary logistic regression for dichotomous responses. When the proportional odds assumption holds for some but not all of the covariates, the lesser known partial proportional odds model is shown to provide a useful extension. Results The ordinal data models are illustrated for the analysis of repeated ordinal outcomes to determine whether the burden associated with sensory alteration following a bilateral sagittal split osteotomy procedure differed for those patients who were given opening exercises only following surgery and those who received sensory retraining exercises in conjunction with standard opening exercises. Conclusions Proportional and partial proportional odds models are broadly applicable to the analysis of cross-sectional and longitudinal ordinal data in dental research. PMID:21070317
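
    The proportional odds model reviewed above can be sketched as follows. The sketch assumes the parameterization logit P(Y <= j) = theta_j - eta, where eta is the linear predictor and the same eta is used at every cutpoint (this sharing is the proportional odds assumption). The function names and the cutpoint values in the usage lines are hypothetical.

```python
import math


def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(cutpoints, eta):
    """Return P(Y = j) for each ordered category under proportional odds.

    cutpoints: increasing thresholds theta_1 < ... < theta_{J-1}
    eta:       linear predictor x'beta, shared by all cutpoints
    """
    # Cumulative probabilities P(Y <= j); the last category closes at 1.
    cumulative = [logistic(theta - eta) for theta in cutpoints] + [1.0]
    probabilities, previous = [], 0.0
    for c in cumulative:
        probabilities.append(c - previous)
        previous = c
    return probabilities


# Hypothetical cutpoints for a three-category outcome.
baseline = category_probabilities([-1.0, 1.0], eta=0.0)
shifted = category_probabilities([-1.0, 1.0], eta=1.0)
```

    With this sign convention a larger eta moves probability toward the higher categories. A partial proportional odds model would relax the sharing by letting some coefficients differ across cutpoints.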

  19. Development of multiple linear regression models for predicting the stormwater quality of urban sub-watersheds.

    PubMed

    Arora, Amarpreet S; Reddy, Akepati S

    2014-01-01

    Stormwater management at urban sub-watershed level has been envisioned to include stormwater collection, treatment, and disposal of treated stormwater through groundwater recharging. Sizing, operation and control of the stormwater management systems require information on the quantities and characteristics of the stormwater generated. Stormwater characteristics depend upon the dry spell between two successive rainfall events, the intensity of rainfall and watershed characteristics. However, sampling and analysis of stormwater spanning only a few rainfall events provides insufficient information on these characteristics. An attempt has been made in the present study to assess the stormwater characteristics through regression modeling. Stormwater from five sub-watersheds of Patiala city was sampled and analyzed. The results were related to the antecedent dry period and the intensity of the rainfall event through regression modeling. The resulting regression models were used to assess the stormwater quality for various antecedent dry periods and rainfall event intensities.

  20. An additional monogenic disorder that masquerades as multiple sclerosis

    SciTech Connect

    Vahedi, K.; Tournier-Lasserve, E.; Vahedi, K.

    1996-11-11

    In their comprehensive differential diagnosis of monogenic diseases that can mimic multiple sclerosis, Natowicz and Bejjani did not include a newly recognized monogenic disorder known under the acronym of CADASIL (Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy); this disorder can mimic MS clinically and radiologically to a remarkable extent. The underlying histopathological lesion of CADASIL is a non-atherosclerotic, non-amyloid arteriopathy affecting mainly the penetrating medullary arteries to the subcortical white matter and basal ganglia. Electron microscopy shows an abnormal deposit of granular osmiophilic material in the arterial wall. These arterial changes are observed in various tissues even though clinical manifestations seem to be restricted to the central nervous system. The CADASIL gene was mapped recently to chromosome 19 and gene identification is ongoing. 6 refs., 1 fig.

  1. The Development and Demonstration of Multiple Regression Models for Operant Conditioning Questions.

    ERIC Educational Resources Information Center

    Fanning, Fred; Newman, Isadore

    Based on the assumption that inferential statistics can make the operant conditioner more sensitive to possible significant relationships, regression models were developed to test the statistical significance between slopes and Y intercepts of the experimental and control group subjects. These results were then compared to the traditional operant…

  2. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  3. Hierarchical Multiple Regression in Counseling Research: Common Problems and Possible Remedies.

    ERIC Educational Resources Information Center

    Petrocelli, John V.

    2003-01-01

    A brief content analysis was conducted on the use of hierarchical regression in counseling research published in the "Journal of Counseling Psychology" and the "Journal of Counseling & Development" during the years 1997-2001. Common problems are cited and possible remedies are described. (Contains 43 references and 3 tables.) (Author)

  4. Analyzing Regression-Discontinuity Designs with Multiple Assignment Variables: A Comparative Study of Four Estimation Methods

    ERIC Educational Resources Information Center

    Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.

    2012-01-01

    In a traditional regression-discontinuity design (RDD), units are assigned to treatment and comparison conditions solely on the basis of a single cutoff score on a continuous assignment variable. The discontinuity in the functional form of the outcome at the cutoff represents the treatment effect, or the average treatment effect at the cutoff.…
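
    The single-cutoff design described above can be sketched by fitting a separate straight line on each side of the cutoff and taking the difference of the two fitted values at the cutoff. This is a simplified illustration, not one of the four estimation methods compared in the paper. It assumes a sharp design in which units at or above the cutoff are treated, a linear functional form on both sides, and synthetic noise-free data; all names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rdd_effect(scores, outcomes, cutoff):
    """Difference between the two fitted lines evaluated at the cutoff."""
    # Centre the assignment variable so each intercept is the value at the cutoff.
    left = [(s - cutoff, y) for s, y in zip(scores, outcomes) if s < cutoff]
    right = [(s - cutoff, y) for s, y in zip(scores, outcomes) if s >= cutoff]
    intercept_left, _ = fit_line(*zip(*left))
    intercept_right, _ = fit_line(*zip(*right))
    return intercept_right - intercept_left


# Synthetic data with a built-in jump of 2.0 at a score of 50.
scores = list(range(40, 61))
outcomes = [0.1 * s + (2.0 if s >= 50 else 0.0) for s in scores]
effect = sharp_rdd_effect(scores, outcomes, cutoff=50)
```

    With real data the estimate depends heavily on the assumed functional form and on how far from the cutoff observations are included.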

  5. The Generalized Regression Discontinuity Design: Using Multiple Assignment Variables and Cutoffs to Estimate Treatment Effects

    ERIC Educational Resources Information Center

    Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.

    2009-01-01

    This paper introduces a generalization of the regression-discontinuity design (RDD). Traditionally, RDD is considered in a two-dimensional framework, with a single assignment variable and cutoff. Treatment effects are measured at a single location along the assignment variable. However, this represents a specialized (and straight-forward)…

  6. Point Estimates and Confidence Intervals for Variable Importance in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Thomas, D. Roland; Zhu, PengCheng; Decady, Yves J.

    2007-01-01

    The topic of variable importance in linear regression is reviewed, and a measure first justified theoretically by Pratt (1987) is examined in detail. Asymptotic variance estimates are used to construct individual and simultaneous confidence intervals for these importance measures. A simulation study of their coverage properties is reported, and an…

  7. Multiple regression analysis in modelling of carbon dioxide emissions by energy consumption use in Malaysia

    NASA Astrophysics Data System (ADS)

    Keat, Sim Chong; Chun, Beh Boon; San, Lim Hwee; Jafri, Mohd Zubir Mat

    2015-04-01

    Climate change due to carbon dioxide (CO2) emissions is one of the most complex challenges threatening our planet. The issue is a major international concern and is primarily attributed to the use of different fossil fuels. In this paper, a regression model is used to analyze the causal relationship between CO2 emissions and energy consumption in Malaysia, using time series data for the period 1980-2010. The equations were developed from the eight major sources that contribute to CO2 emissions: non-energy, Liquefied Petroleum Gas (LPG), diesel, kerosene, refinery gas, Aviation Turbine Fuel (ATF) and Aviation Gasoline (AV Gas), fuel oil, and motor petrol. Part of the data (1980-2000) was used to fit the regression model and the remainder (2001-2010) was used to validate it. The model predictions showed a high correlation with the measured data (R2=0.9544), indicating the model's accuracy and efficiency. These results are accurate and can be used for early warning of the population and to comply with air quality standards.

  8. Use of Structure Coefficients in Published Multiple Regression Articles: Beta Is Not Enough.

    ERIC Educational Resources Information Center

    Courville, Troy; Thompson, Bruce

    2001-01-01

    Reviewed articles published in the "Journal of Applied Psychology" (JAP) to determine how interpretations might have differed if standardized regression coefficients and structure coefficients (or bivariate "r"s of predictors with the criterion) had been interpreted. Summarizes some dramatic misinterpretations or incomplete interpretations.…

  9. A Generalized Logistic Regression Procedure to Detect Differential Item Functioning among Multiple Groups

    ERIC Educational Resources Information Center

    Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

    2011-01-01

    We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

  10. Prediction of the Rock Mass Diggability Index by Using Fuzzy Clustering-Based, ANN and Multiple Regression Methods

    NASA Astrophysics Data System (ADS)

    Saeidi, Omid; Torabi, Seyed Rahman; Ataei, Mohammad

    2014-03-01

    Rock mass classification systems are one of the most common ways of determining rock mass excavatability and related equipment assessment. However, the strength and weak points of such rating-based classifications have always been questionable. Such classification systems assign quantifiable values to predefined classified geotechnical parameters of rock mass. This causes particular ambiguities, leading to the misuse of such classifications in practical applications. Recently, intelligence system approaches such as artificial neural networks (ANNs) and neuro-fuzzy methods, along with multiple regression models, have been used successfully to overcome such uncertainties. The purpose of the present study is the construction of several models by using an adaptive neuro-fuzzy inference system (ANFIS) method with two data clustering approaches, including fuzzy c-means (FCM) clustering and subtractive clustering, an ANN and non-linear multiple regression to estimate the basic rock mass diggability index. A set of data from several case studies was used to obtain the real rock mass diggability index and compared to the predicted values by the constructed models. In conclusion, it was observed that ANFIS based on the FCM model shows higher accuracy and correlation with actual data compared to that of the ANN and multiple regression. As a result, one can use the assimilation of ANNs with fuzzy clustering-based models to construct such rigorous predictor tools.

  11. Quantifying components of the hydrologic cycle in Virginia using chemical hydrograph separation and multiple regression analysis

    USGS Publications Warehouse

    Sanford, Ward E.; Nelms, David L.; Pope, Jason P.; Selnick, David L.

    2012-01-01

    This study by the U.S. Geological Survey, prepared in cooperation with the Virginia Department of Environmental Quality, quantifies the components of the hydrologic cycle across the Commonwealth of Virginia. Long-term, mean fluxes were calculated for precipitation, surface runoff, infiltration, total evapotranspiration (ET), riparian ET, recharge, base flow (or groundwater discharge) and net total outflow. Fluxes of these components were first estimated on a number of real-time-gaged watersheds across Virginia. Specific conductance was used to distinguish and separate surface runoff from base flow. Specific-conductance data were collected every 15 minutes at 75 real-time gages for approximately 18 months between March 2007 and August 2008. Precipitation was estimated for 1971–2000 using PRISM climate data. Precipitation and temperature from the PRISM data were used to develop a regression-based relation to estimate total ET. The proportion of watershed precipitation that becomes surface runoff was related to physiographic province and rock type in a runoff regression equation. Component flux estimates from the watersheds were transferred to flux estimates for counties and independent cities using the ET and runoff regression equations. Only 48 of the 75 watersheds yielded sufficient data, and data from these 48 were used in the final runoff regression equation. The base-flow proportion for the 48 watersheds averaged 72 percent using specific conductance, a value that was substantially higher than the 61 percent average calculated using a graphical-separation technique (the USGS program PART). Final results for the study are presented as component flux estimates for all counties and independent cities in Virginia.
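
    The abstract does not give the separation equation. One common way to separate base flow from surface runoff with specific conductance is a two-component mixing balance, sketched below under that assumption. The end-member conductances for runoff and base flow are hypothetical inputs that would have to be chosen for each watershed, and the function name is illustrative.

```python
def base_flow_fraction(sc_stream, sc_runoff, sc_baseflow):
    """Fraction of streamflow that is base flow, from a two-component mix.

    sc_stream:   specific conductance measured in the stream
    sc_runoff:   assumed conductance of the surface-runoff end member
    sc_baseflow: assumed conductance of the base-flow end member
    """
    if sc_baseflow == sc_runoff:
        raise ValueError("end-member conductances must differ")
    fraction = (sc_stream - sc_runoff) / (sc_baseflow - sc_runoff)
    # Clamp to the physically meaningful range.
    return min(1.0, max(0.0, fraction))


# Hypothetical values in microsiemens per centimetre.
fraction = base_flow_fraction(sc_stream=300.0, sc_runoff=50.0, sc_baseflow=400.0)
```

    Applying the function to each 15-minute reading and weighting by discharge would give a long-term base-flow proportion.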

  12. Use of generalized additive models and cokriging of spatial residuals to improve land-use regression estimates of nitrogen oxides in Southern California

    PubMed Central

    Li, Lianfa; Wu, Jun; Wilhelm, Michelle; Ritz, Beate

    2012-01-01

    Land-use regression (LUR) models have been developed to estimate spatial distributions of traffic-related pollutants. Several studies have examined spatial autocorrelation among residuals in LUR models, but few utilized spatial residual information in model prediction, or examined the impact of modeling methods, monitoring site selection, or traffic data quality on LUR performance. This study aims to improve spatial models for traffic-related pollutants using generalized additive models (GAM) combined with cokriging of spatial residuals. Specifically, we developed spatial models for nitrogen dioxide (NO2) and nitrogen oxides (NOx) concentrations in Southern California separately for two seasons (summer and winter) based on over 240 sampling locations. Pollutant concentrations were disaggregated into three components: local means, spatial residuals, and normal random residuals. Local means were modeled by GAM. Spatial residuals were cokriged with global residuals at nearby sampling locations that were spatially auto-correlated. We compared this two-stage approach with four commonly-used spatial models: universal kriging, multiple linear LUR and GAM with and without a spatial smoothing term. Leave-one-out cross validation was conducted for model validation and comparison purposes. The results show that our GAM plus cokriging models predicted summer and winter NO2 and NOx concentration surfaces well, with cross validation R2 values ranging from 0.88 to 0.92. While local covariates accounted for partial variance of the measured NO2 and NOx concentrations, spatial autocorrelation accounted for about 20% of the variance. Our spatial GAM model improved R2 considerably compared to the other four approaches. Conclusively, our two-stage model captured summer and winter differences in NO2 and NOx spatial distributions in Southern California well. When sampling location selection cannot be optimized for the intended model and fewer covariates are available as predictors for
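
    Leave-one-out cross validation, used above for model comparison, can be sketched as follows. For brevity the sketch swaps in a one-predictor least-squares line in place of the GAM and cokriging models of the study; the data and names are hypothetical. Each observation is held out in turn, the model is refitted on the rest, and the held-out prediction errors are summarized as a cross-validation R2.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loocv_r2(xs, ys):
    """Cross-validation R2 = 1 - PRESS / SST using leave-one-out refits."""
    n = len(ys)
    mean_y = sum(ys) / n
    press = 0.0
    for i in range(n):
        # Refit without observation i, then predict it.
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / sst


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
exact = loocv_r2(xs, [2.0 * x + 1.0 for x in xs])
noisy = loocv_r2(xs, [1.0, 3.2, 4.9, 7.1, 8.8])
```

    Unlike the in-sample R2, this quantity can be negative when the model predicts held-out points worse than the overall mean does.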

  13. Multiple regression models of δ13C and δ15N for fish populations in the eastern Gulf of Mexico

    NASA Astrophysics Data System (ADS)

    Radabaugh, Kara R.; Peebles, Ernst B.

    2014-08-01

    Multiple regression models were created to explain spatial and temporal variation in the δ13C and δ15N values of fish populations on the West Florida Shelf (eastern Gulf of Mexico, USA). Extensive trawl surveys from three time periods were used to acquire muscle samples from seven groundfish species. Isotopic variation (δ13Cvar and δ15Nvar) was calculated as the deviation from the isotopic mean of each fish species. Static spatial data and dynamic water quality parameters were used to create models predicting δ13Cvar and δ15Nvar in three fish species that were caught in the summers of 2009 and 2010. Additional data sets were then used to determine the accuracy of the models for predicting isotopic variation (1) in a different time period (fall 2010) and (2) among four entirely different fish species that were collected during summer 2009. The δ15Nvar model was relatively stable and could be applied to different time periods and species with similar accuracy (mean absolute errors 0.31-0.33‰). The δ13Cvar model had a lower predictive capability and mean absolute errors ranged from 0.42 to 0.48‰. δ15N trends are likely linked to gradients in nitrogen fixation and Mississippi River influence on the West Florida Shelf, while δ13C trends may be linked to changes in algal species, photosynthetic fractionation, and abundance of benthic vs. planktonic basal resources. These models of isotopic variability may be useful for future stable isotope investigations of trophic level, basal resource use, and animal migration on the West Florida Shelf.

  14. Estimation of streamflow, base flow, and nitrate-nitrogen loads in Iowa using multiple linear regression models

    USGS Publications Warehouse

    Schilling, K.E.; Wolter, C.F.

    2005-01-01

    Nineteen variables, including precipitation, soils and geology, land use, and basin morphologic characteristics, were evaluated to develop Iowa regression models to predict total streamflow (Q), base flow (Qb), storm flow (Qs) and base flow percentage (%Qb) in gauged and ungauged watersheds in the state. Discharge records from a set of 33 watersheds across the state for the 1980 to 2000 period were separated into Qb and Qs. Multiple linear regression found that 75.5 percent of long term average Q was explained by rainfall, sand content, and row crop percentage variables, whereas 88.5 percent of Qb was explained by these three variables plus permeability and floodplain area variables. Qs was explained by average rainfall and %Qb was a function of row crop percentage, permeability, and basin slope variables. Regional regression models developed for long term average Q and Qb were adapted to annual rainfall and showed good correlation between measured and predicted values. Combining the regression model for Q with an estimate of mean annual nitrate concentration, a map of potential nitrate loads in the state was produced. Results from this study have important implications for understanding geomorphic and land use controls on streamflow and base flow in Iowa watersheds and similar agriculture dominated watersheds in the glaciated Midwest. (JAWRA) (Copyright © 2005).

  15. Normalization Ridge Regression in Practice II: The Estimation of Multiple Feedback Linkages.

    ERIC Educational Resources Information Center

    Bulcock, J. W.

    The use of the two-stage least squares (2 SLS) procedure for estimating nonrecursive social science models is often impractical when multiple feedback linkages are required. This is because 2 SLS is extremely sensitive to multicollinearity. The standard statistical solution to the multicollinearity problem is a biased, variance reduced procedure…

  16. Early Parallel Activation of Semantics and Phonology in Picture Naming: Evidence from a Multiple Linear Regression MEG Study.

    PubMed

    Miozzo, Michele; Pulvermüller, Friedemann; Hauk, Olaf

    2015-10-01

    The time course of brain activation during word production has become an area of increasingly intense investigation in cognitive neuroscience. The predominant view has been that semantic and phonological processes are activated sequentially, at about 150 and 200-400 ms after picture onset. Although evidence from prior studies has been interpreted as supporting this view, these studies were arguably not ideally suited to detect early brain activation of semantic and phonological processes. We here used a multiple linear regression approach to magnetoencephalography (MEG) analysis of picture naming in order to investigate early effects of variables specifically related to visual, semantic, and phonological processing. This was combined with distributed minimum-norm source estimation and region-of-interest analysis. Brain activation associated with visual image complexity appeared in occipital cortex at about 100 ms after picture presentation onset. At about 150 ms, semantic variables became physiologically manifest in left frontotemporal regions. In the same latency range, we found an effect of phonological variables in the left middle temporal gyrus. Our results demonstrate that multiple linear regression analysis is sensitive to early effects of multiple psycholinguistic variables in picture naming. Crucially, our results suggest that access to phonological information might begin in parallel with semantic processing around 150 ms after picture onset.

  17. Early Parallel Activation of Semantics and Phonology in Picture Naming: Evidence from a Multiple Linear Regression MEG Study

    PubMed Central

    Miozzo, Michele; Pulvermüller, Friedemann; Hauk, Olaf

    2015-01-01

    The time course of brain activation during word production has become an area of increasingly intense investigation in cognitive neuroscience. The predominant view has been that semantic and phonological processes are activated sequentially, at about 150 and 200–400 ms after picture onset. Although evidence from prior studies has been interpreted as supporting this view, these studies were arguably not ideally suited to detect early brain activation of semantic and phonological processes. We here used a multiple linear regression approach to magnetoencephalography (MEG) analysis of picture naming in order to investigate early effects of variables specifically related to visual, semantic, and phonological processing. This was combined with distributed minimum-norm source estimation and region-of-interest analysis. Brain activation associated with visual image complexity appeared in occipital cortex at about 100 ms after picture presentation onset. At about 150 ms, semantic variables became physiologically manifest in left frontotemporal regions. In the same latency range, we found an effect of phonological variables in the left middle temporal gyrus. Our results demonstrate that multiple linear regression analysis is sensitive to early effects of multiple psycholinguistic variables in picture naming. Crucially, our results suggest that access to phonological information might begin in parallel with semantic processing around 150 ms after picture onset. PMID:25005037

  18. Regression of multiple intracranial meningiomas after cessation of long-term progesterone agonist therapy.

    PubMed

    Vadivelu, Sudhakar; Sharer, Leroy; Schulder, Michael

    2010-05-01

    The authors present the case of a patient that demonstrates the long-standing use of megestrol acetate, a progesterone agonist, and its association with multiple intracranial meningioma presentation. Discontinuation of megestrol acetate led to shrinkage of multiple tumors and to the complete resolution of one tumor. Histological examination demonstrated that the largest tumor had high (by > 25% of tumor cell nuclei) progesterone-positive expression, including progesterone receptor (PR) isoform B, compared with low expression of PR isoform A; there was no evidence of estrogen receptor expression and only unaccentuated collagen expression. This is the first clinical report illustrating a causal relationship between exogenous hormones and modulation of meningioma biology in situ. PMID:19731987

  19. Multiple regression and Artificial Neural Network for long-term rainfall forecasting using large scale climate modes

    NASA Astrophysics Data System (ADS)

    Mekanik, F.; Imteaz, M. A.; Gato-Trinidad, S.; Elmahdi, A.

    2013-10-01

    In this study, the application of Artificial Neural Networks (ANN) and multiple regression analysis (MR) to forecast long-term seasonal spring rainfall in Victoria, Australia was investigated using lagged El Nino Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD) as potential predictors. The use of dual (combined lagged ENSO-IOD) input sets for calibrating and validating ANN and MR models is proposed to investigate the simultaneous effect of past values of these two major climate modes on long-term spring rainfall prediction. The MR models that did not violate the limits of statistical significance and multicollinearity were selected for future spring rainfall forecast. The ANN was developed in the form of a multilayer perceptron using the Levenberg-Marquardt algorithm. Both MR and ANN modelling were assessed statistically using mean square error (MSE), mean absolute error (MAE), Pearson correlation (r) and Willmott index of agreement (d). The developed MR and ANN models were tested on out-of-sample test sets; the MR models showed very poor generalisation ability for east Victoria with correlation coefficients of -0.99 to -0.90 compared to ANN with correlation coefficients of 0.42-0.93; ANN models also showed better generalisation ability for central and west Victoria with correlation coefficients of 0.68-0.85 and 0.58-0.97 respectively. The ability of multiple regression models to forecast out-of-sample sets is comparable with that of ANN for Daylesford in central Victoria and Kaniva in west Victoria (r = 0.92 and 0.67 respectively). The errors of the testing sets for ANN models are generally lower compared to multiple regression models. The statistical analysis suggests the potential of ANN over MR models for rainfall forecasting using large scale climate modes.
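
    The four assessment statistics named in this abstract (MSE, MAE, Pearson r, and the Willmott index of agreement d) can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' code; the function names and the rainfall values are hypothetical.

```python
import math

def mse(obs, pred):
    """Mean square error between observed and predicted values."""
    return sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs)

def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def pearson_r(obs, pred):
    """Pearson correlation coefficient r."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

def willmott_d(obs, pred):
    """Willmott index of agreement:
    d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)."""
    mo = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den

# Hypothetical seasonal rainfall totals (mm), for illustration only.
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

print(mse(observed, predicted), mae(observed, predicted))
print(pearson_r(observed, predicted), willmott_d(observed, predicted))
```

    A value of d equal to 1 indicates perfect agreement; unlike r, d is penalised by systematic offsets between predictions and observations.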

  20. Ca analysis: an Excel based program for the analysis of intracellular calcium transients including multiple, simultaneous regression analysis.

    PubMed

    Greensmith, David J

    2014-01-01

    Here I present an Excel based program for the analysis of intracellular Ca transients recorded using fluorescent indicators. The program can perform all the necessary steps which convert recorded raw voltage changes into meaningful physiological information. The program performs two fundamental processes. (1) It can prepare the raw signal by several methods. (2) It can then be used to analyze the prepared data to provide information such as absolute intracellular Ca levels. Also, the rates of change of Ca can be measured using multiple, simultaneous regression analysis. I demonstrate that this program performs as well as commercially available software, but has numerous advantages, namely creating a simplified, self-contained analysis workflow.

  1. Use of multiple regression models in the study of sandhopper orientation under natural conditions

    NASA Astrophysics Data System (ADS)

    Marchetti, Giovanni M.; Scapini, Felicita

    2003-10-01

    In sandhoppers (Amphipoda; Talitridae), typical dwellers of the supralittoral zone of sandy beaches, orientation with respect to the sun and landscape vision is adapted to the local direction of the shoreline. Variation of this behavioural adaptation can be related to the characteristics of the beach. Measures of orientation with respect to the shoreline direction can thus be made as a tool to assess beach stability versus changeability, once the sources of variation are correctly interpreted. Orientation of animals can be studied by statistical analysis of directions taken after release in nature. In this paper some new tools for exploring directional data are reviewed, with special emphasis on non-parametric smoothers and regression models. Results from a large study concerning one species of sandhoppers, Talitrus saltator (Montagu), from an exposed sandy beach in northeastern Tunisia are presented. Seasonal differences in orientation behaviour were shown with a higher scatter in autumn with respect to spring. The higher scatter shown in autumn depended both on intrinsic (sex) and external (climatic conditions and landscape visibility) factors and was related to the tendency of this species to migrate towards the dune anticipating winter conditions.

  2. Screening houses for vapor intrusion risks: a multiple regression analysis approach.

    PubMed

    Johnston, Jill E; Gibson, Jacqueline MacDonald

    2013-06-01

    The migration of chlorinated volatile organic compounds from groundwater to indoor air, known as vapor intrusion, can be an important exposure pathway at hazardous waste sites. Because sampling indoor air at every potentially affected home is often logistically infeasible, screening tools are needed to help identify at-risk homes. Currently, the U.S. Environmental Protection Agency (EPA) uses a simple screening approach that employs a generic vapor "attenuation factor," the ratio of the indoor air pollutant concentration to the pollutant concentration in the soil gas directly above the groundwater table. At every potentially affected home above contaminated groundwater, the EPA assumes the vapor attenuation factor is less than 1/1000; that is, the indoor air concentration will not exceed 1/1000 times the soil-gas concentration immediately above groundwater. This paper reports on a screening-level model that improves on the EPA approach by considering environmental, contaminant, and household characteristics. The model is based on an analysis of the EPA's vapor intrusion database, which contains almost 2,400 indoor air and corresponding subsurface concentration samples collected in 15 states. We use the site data to develop a multilevel regression model for predicting the vapor attenuation factor. We find that the attenuation factor varies significantly with soil type, depth to groundwater, season, household foundation type, and contaminant molecular weight. The resulting model decreases the rate of false negatives compared to EPA's screening approach.
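
    The generic screening rule described in this abstract reduces to simple arithmetic on the attenuation factor, sketched below. The function names and concentrations are hypothetical, the units are assumed to be the same for both concentrations, and the paper's multilevel regression model is not reproduced here.

```python
# Generic attenuation factor assumed by the screening approach in the abstract.
GENERIC_ATTENUATION = 1.0 / 1000.0

def attenuation_factor(indoor_air, soil_gas):
    """Ratio of indoor air concentration to soil-gas concentration."""
    if soil_gas <= 0:
        raise ValueError("soil-gas concentration must be positive")
    return indoor_air / soil_gas

def screening_bound(soil_gas, alpha=GENERIC_ATTENUATION):
    """Upper bound on indoor air concentration implied by a given factor."""
    return alpha * soil_gas

def exceeds_generic_assumption(indoor_air, soil_gas):
    """True when a measured home violates the 1/1000 assumption,
    i.e. the generic screen would have been a false negative."""
    return attenuation_factor(indoor_air, soil_gas) > GENERIC_ATTENUATION

# Hypothetical measurements, same concentration units for both values.
print(screening_bound(5000.0))                    # bound implied by the rule
print(exceeds_generic_assumption(2.0, 1000.0))    # factor 0.002 > 0.001
```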

  3. Establishment of In Silico Prediction Models for CYP3A4 and CYP2B6 Induction in Human Hepatocytes by Multiple Regression Analysis Using Azole Compounds.

    PubMed

    Nagai, Mika; Konno, Yoshihiro; Satsukawa, Masahiro; Yamashita, Shinji; Yoshinari, Kouichi

    2016-08-01

    Drug-drug interactions (DDIs) via cytochrome P450 (P450) induction are one clinical problem leading to increased risk of adverse effects and the need for dosage adjustments and additional therapeutic monitoring. In silico models for predicting P450 induction are useful for avoiding DDI risk. In this study, we have established regression models for CYP3A4 and CYP2B6 induction in human hepatocytes using several physicochemical parameters for a set of azole compounds with different P450 induction characteristics as model compounds. To obtain a well-correlated regression model, the compounds for CYP3A4 or CYP2B6 induction were independently selected from the tested azole compounds using principal component analysis with fold-induction data. The multiple linear regression models obtained for CYP3A4 and CYP2B6 induction are represented by different sets of physicochemical parameters. The adjusted coefficients of determination for these models were 0.8 and 0.9, respectively. The fold-induction values of the validation compounds, another set of 12 azole-containing compounds, were predicted within twofold limits for both CYP3A4 and CYP2B6. The concordance for the prediction of CYP3A4 induction was 87% with another validation set, 23 marketed drugs. However, the prediction of CYP2B6 induction tended to be overestimated for these marketed drugs. The regression models show that lipophilicity mostly contributes to CYP3A4 induction, whereas not only the lipophilicity but also the molecular polarity is important for CYP2B6 induction. Our regression models, especially that for CYP3A4 induction, might provide useful methods to avoid potent CYP3A4 or CYP2B6 inducers during the lead optimization stage without performing induction assays in human hepatocytes.

  4. Peer Rated Therapeutic Talent and Affective Sensitivity: A Multiple Regression Approach.

    ERIC Educational Resources Information Center

    Jackson, Eugene

    1985-01-01

    Used peer rated measures of Warmth, Understanding and Openness to predict scores on the Kagan Affective Sensitivity Scale-E80 among 66 undergraduates who had participated in interpersonal skills training groups. Results indicated that, as an additively composite index of Therapeutic Talent, they were positively correlated with affective…

  5. Data of multiple regressions analysis between selected biomarkers related to glutamate excitotoxicity and oxidative stress in Saudi autistic patients

    PubMed Central

    El-Ansary, Afaf

    2016-01-01

    This work demonstrates data of multiple regression analysis between nine biomarkers related to glutamate excitotoxicity and impaired detoxification as two mechanisms recently recorded as autism phenotypes. The presented data was obtained by measuring a panel of markers in 20 autistic patients aged 3–15 years and 20 age- and gender-matched healthy controls. Levels of GSH, glutathione status (GSH/GSSG), glutathione reductase (GR), glutathione-s-transferase (GST), thioredoxin (Trx), thioredoxin reductase (TrxR) and peroxidoxins (Prxs I and III), glutamate, glutamine, glutamate/glutamine ratio, and glutamate dehydrogenase (GDH) in plasma and mercury (Hg) in red blood cells were determined in both groups. In multiple regression analysis, R² values, which describe the proportion or percentage of variance in the dependent variable attributed to the variance in the independent variables together, were calculated. Moreover, β coefficient values, which show the direction (either positive or negative) and the contribution of the independent variable relative to the other independent variables in explaining the variation of the dependent variable, were determined. A panel of inter-related markers was recorded. This paper contains data related to and supporting research articles currently published entitled “Mechanism of nitrogen metabolism-related parameters and enzyme activities in the pathophysiology of autism” [1], “Novel metabolic biomarkers related to sulfur-dependent detoxification pathways in autistic patients of Saudi Arabia” [2], and “A key role for an impaired detoxification mechanism in the etiology and severity of autism spectrum disorders” [3]. PMID:26933667

  6. Estimation of nutrients and organic matter in Korean swine slurry using multiple regression analysis of physical and chemical properties.

    PubMed

    Suresh, Arumuganainar; Choi, Hong Lim

    2011-10-01

    Swine waste land application has increased due to organic fertilization, but excess application in an arable system can cause environmental risk. Therefore, in situ characterizations of such resources are important prior to application. To explore this, 41 swine slurry samples were collected from Korea, and wide differences were observed in the physico-biochemical properties. However, significant (P<0.001) multiple property correlations (R²) were obtained between nutrients with specific gravity (SG), electrical conductivity (EC), total solids (TS) and pH. The different combinations of hydrometer, EC meter, drying oven and pH meter were found useful to estimate Mn, Fe, Ca, K, Al, Na, N and 5-day biochemical oxygen demands (BOD₅) at improved R² values of 0.83, 0.82, 0.77, 0.75, 0.67, 0.47, 0.88 and 0.70, respectively. The results from this study suggest that multiple property regressions can facilitate the prediction of micronutrients and organic matter much better than a single property regression for livestock waste.

  7. Data of multiple regressions analysis between selected biomarkers related to glutamate excitotoxicity and oxidative stress in Saudi autistic patients.

    PubMed

    El-Ansary, Afaf

    2016-06-01

    This work demonstrates data of multiple regression analysis between nine biomarkers related to glutamate excitotoxicity and impaired detoxification as two mechanisms recently recorded as autism phenotypes. The presented data was obtained by measuring a panel of markers in 20 autistic patients aged 3-15 years and 20 age- and gender-matched healthy controls. Levels of GSH, glutathione status (GSH/GSSG), glutathione reductase (GR), glutathione-s-transferase (GST), thioredoxin (Trx), thioredoxin reductase (TrxR) and peroxidoxins (Prxs I and III), glutamate, glutamine, glutamate/glutamine ratio, and glutamate dehydrogenase (GDH) in plasma and mercury (Hg) in red blood cells were determined in both groups. In multiple regression analysis, R² values, which describe the proportion or percentage of variance in the dependent variable attributed to the variance in the independent variables together, were calculated. Moreover, β coefficient values, which show the direction (either positive or negative) and the contribution of the independent variable relative to the other independent variables in explaining the variation of the dependent variable, were determined. A panel of inter-related markers was recorded. This paper contains data related to and supporting research articles currently published entitled "Mechanism of nitrogen metabolism-related parameters and enzyme activities in the pathophysiology of autism" [1], "Novel metabolic biomarkers related to sulfur-dependent detoxification pathways in autistic patients of Saudi Arabia" [2], and "A key role for an impaired detoxification mechanism in the etiology and severity of autism spectrum disorders" [3]. PMID:26933667

  8. Sequential Processing and the Matching-Stimulus Interval Effect in ERP Components: An Exploration of the Mechanism Using Multiple Regression

    PubMed Central

    Steiner, Genevieve Z.; Barry, Robert J.; Gonsalvez, Craig J.

    2016-01-01

    In oddball tasks, increasing the time between stimuli within a particular condition (target-to-target interval, TTI; nontarget-to-nontarget interval, NNI) systematically enhances N1, P2, and P300 event-related potential (ERP) component amplitudes. This study examined the mechanism underpinning these effects in ERP components recorded from 28 adults who completed a conventional three-tone oddball task. Bivariate correlations, partial correlations and multiple regression explored component changes due to preceding ERP component amplitudes and intervals found within the stimulus series, rather than constraining the task with experimentally constructed intervals, which has been adequately explored in prior studies. Multiple regression showed that for targets, N1 and TTI predicted N2, TTI predicted P3a and P3b, and Processing Negativity (PN), P3b, and TTI predicted reaction time. For rare nontargets, P1 predicted N1, NNI predicted N2, and N1 predicted Slow Wave (SW). Findings show that the mechanism is operating on separate stages of stimulus-processing, suggestive of either increased activation within a number of stimulus-specific pathways, or very long component generator recovery cycles. These results demonstrate the extent to which matching-stimulus intervals influence ERP component amplitudes and behavior in a three-tone oddball task, and should be taken into account when designing similar studies. PMID:27445774

  9. Databased comparison of Sparse Bayesian Learning and Multiple Linear Regression for statistical downscaling of low flow indices

    NASA Astrophysics Data System (ADS)

    Joshi, Deepti; St-Hilaire, André; Daigle, Anik; Ouarda, Taha B. M. J.

    2013-04-01

    This study attempts to compare the performance of two statistical downscaling frameworks in downscaling hydrological indices (descriptive statistics) characterizing the low flow regimes of three rivers in Eastern Canada: Moisie, Romaine and Ouelle. The statistical models selected are Relevance Vector Machine (RVM), an implementation of Sparse Bayesian Learning, and the Automated Statistical Downscaling tool (ASD), an implementation of Multiple Linear Regression. Inputs to both frameworks involve climate variables significantly (α = 0.05) correlated with the indices. These variables were processed using Canonical Correlation Analysis and the resulting canonical variates scores were used as input to RVM to estimate the selected low flow indices. In ASD, the significantly correlated climate variables were subjected to backward stepwise predictor selection and the selected predictors were subsequently used to estimate the selected low flow indices using Multiple Linear Regression. With respect to the correlation between climate variables and the selected low flow indices, it was observed that all indices are influenced, primarily, by wind components (Vertical, Zonal and Meridional) and humidity variables (Specific and Relative Humidity). The downscaling performance of the framework involving RVM was found to be better than ASD in terms of Relative Root Mean Square Error, Relative Mean Absolute Bias and Coefficient of Determination. In all cases, the former resulted in less variability of the performance indices between calibration and validation sets, implying better generalization ability than for the latter.

  10. Estimation of nutrients and organic matter in Korean swine slurry using multiple regression analysis of physical and chemical properties.

    PubMed

    Suresh, Arumuganainar; Choi, Hong Lim

    2011-10-01

    Swine waste land application has increased due to organic fertilization, but excess application in an arable system can cause environmental risk. Therefore, in situ characterizations of such resources are important prior to application. To explore this, 41 swine slurry samples were collected from Korea, and wide differences were observed in the physico-biochemical properties. However, significant (P<0.001) multiple property correlations (R²) were obtained between nutrients with specific gravity (SG), electrical conductivity (EC), total solids (TS) and pH. The different combinations of hydrometer, EC meter, drying oven and pH meter were found useful to estimate Mn, Fe, Ca, K, Al, Na, N and 5-day biochemical oxygen demands (BOD₅) at improved R² values of 0.83, 0.82, 0.77, 0.75, 0.67, 0.47, 0.88 and 0.70, respectively. The results from this study suggest that multiple property regressions can facilitate the prediction of micronutrients and organic matter much better than a single property regression for livestock waste. PMID:21767950

  11. A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.

    PubMed

    Bersabé, Rosa; Rivas, Teresa

    2010-05-01

    The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
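
    The idea of placing a cut-off where two adjacent categories are equally likely can be sketched from the standard baseline-category multinomial logistic model, ln(P_j / P_0) = a_j + b_j x. Setting P_j = P_(j+1) gives x = (a_j - a_(j+1)) / (b_(j+1) - b_j). The abstract does not state the authors' equation, so this is an illustration under that assumed parameterisation; the coefficients and function names are hypothetical.

```python
import math

def mlr_probabilities(x, intercepts, slopes):
    """Category probabilities of a baseline-category multinomial logistic
    model; category 0 is the reference (intercept 0, slope 0)."""
    logits = [a + b * x for a, b in zip(intercepts, slopes)]
    top = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cutoff_between(j, intercepts, slopes):
    """Score x at which categories j and j+1 are equally likely."""
    slope_diff = slopes[j + 1] - slopes[j]
    if slope_diff == 0:
        raise ValueError("equal slopes: the two categories never cross")
    return (intercepts[j] - intercepts[j + 1]) / slope_diff

# Hypothetical coefficients for three ordinal categories (0, 1, 2).
intercepts = [0.0, -4.0, -10.0]
slopes = [0.0, 0.5, 0.75]

cutoffs = [cutoff_between(j, intercepts, slopes) for j in range(2)]
print(cutoffs)
print(mlr_probabilities(cutoffs[0], intercepts, slopes))
```

    With these made-up coefficients, scores below the first cut-off are most likely in category 0, scores between the two cut-offs in category 1, and scores above the second in category 2.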

  12. Computing mammographic density from a multiple regression model constructed with image-acquisition parameters from a full-field digital mammographic unit

    NASA Astrophysics Data System (ADS)

    Lu, Lee-Jane W.; Nishino, Thomas K.; Khamapirad, Tuenchit; Grady, James J.; Leonard, Morton H., Jr.; Brunder, Donald G.

    2007-08-01

    Breast density (the percentage of fibroglandular tissue in the breast) has been suggested to be a useful surrogate marker for breast cancer risk. It is conventionally measured using screen-film mammographic images by a labor-intensive histogram segmentation method (HSM). We have adapted and modified the HSM for measuring breast density from raw digital mammograms acquired by full-field digital mammography. Multiple regression model analyses showed that many of the instrument parameters for acquiring the screening mammograms (e.g. breast compression thickness, radiological thickness, radiation dose, compression force, etc.) and image pixel intensity statistics of the imaged breasts were strong predictors of the observed threshold values (model R² = 0.93) and %-density (R² = 0.84). The intra-class correlation coefficient of the %-density for duplicate images was estimated to be 0.80, using the regression model-derived threshold values, and 0.94 if estimated directly from the parameter estimates of the %-density prediction regression model. Therefore, with additional research, these mathematical models could be used to compute breast density objectively, automatically bypassing the HSM step, and could greatly facilitate breast cancer research studies.

  13. Sequential Monte Carlo tracking of the marginal artery by multiple cue fusion and random forest regression.

    PubMed

    Cherry, Kevin M; Peplinski, Brandon; Kim, Lauren; Wang, Shijun; Lu, Le; Zhang, Weidong; Liu, Jianfei; Wei, Zhuoshi; Summers, Ronald M

    2015-01-01

    Given the potential importance of marginal artery localization in automated registration in computed tomography colonography (CTC), we have devised a semi-automated method of marginal vessel detection employing sequential Monte Carlo tracking (also known as particle filtering tracking) by multiple cue fusion based on intensity, vesselness, organ detection, and minimum spanning tree information for poorly enhanced vessel segments. We then employed a random forest algorithm for intelligent cue fusion and decision making which achieved high sensitivity and robustness. After applying a vessel pruning procedure to the tracking results, we achieved statistically significantly improved precision compared to a baseline Hessian detection method (2.7% versus 75.2%, p<0.001). This method also showed statistically significantly improved recall rate compared to a 2-cue baseline method using fewer vessel cues (30.7% versus 67.7%, p<0.001). These results demonstrate that marginal artery localization on CTC is feasible by combining a discriminative classifier (i.e., random forest) with a sequential Monte Carlo tracking mechanism. In so doing, we present the effective application of an anatomical probability map to vessel pruning as well as a supplementary spatial coordinate system for colonic segmentation and registration when this task has been confounded by colon lumen collapse.

  14. Integration of geographic information systems and logistic multiple regression for aquatic macrophyte modeling

    SciTech Connect

    Narumalani, S.; Jensen, J.R.; Althausen, J.D.; Burkhalter, S.; Mackey, H.E. Jr.

    1994-06-01

    Since aquatic macrophytes have an important influence on the physical and chemical processes of an ecosystem while simultaneously affecting human activity, it is imperative that they be inventoried and managed wisely. However, mapping wetlands can be a major challenge because they are found in diverse geographic areas ranging from small tributary streams, to shrub or scrub and marsh communities, to open water lacustrine environments. In addition, the type and spatial distribution of wetlands can change dramatically from season to season, especially when nonpersistent species are present. This research focuses on developing a model for predicting the future growth and distribution of aquatic macrophytes. This model will use a geographic information system (GIS) to analyze some of the biophysical variables that affect aquatic macrophyte growth and distribution. The data will provide scientists information on the future spatial growth and distribution of aquatic macrophytes. This study focuses on the Savannah River Site Par Pond (1,000 ha) and L Lake (400 ha), two cooling ponds that have received thermal effluent from nuclear reactor operations. Par Pond was constructed in 1958, and natural invasion of wetland has occurred over its 35-year history, with much of the shoreline having developed extensive beds of persistent and non-persistent aquatic macrophytes.

  15. Influence of Additive and Multiplicative Structure and Direction of Comparison on the Reversal Error

    ERIC Educational Resources Information Center

    González-Calero, José Antonio; Arnau, David; Laserna-Belenguer, Belén

    2015-01-01

    An empirical study has been carried out to evaluate the potential of word order matching and static comparison as explanatory models of reversal error. Data was collected from 214 undergraduate students who translated a set of additive and multiplicative comparisons expressed in Spanish into algebraic language. In these multiplicative comparisons…

  16. Artificial neural networks and multiple linear regression model using principal components to estimate rainfall over South America

    NASA Astrophysics Data System (ADS)

    dos Santos, T. S.; Mendes, D.; Torres, R. R.

    2015-08-01

    Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANN) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon, Northeastern Brazil and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used GCM experiments for the 20th century (RCP Historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANN significantly outperforms the MLR downscaling of monthly precipitation variability.

  17. A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM

    NASA Astrophysics Data System (ADS)

    Nose, Takashi; Kobayashi, Takao

    In this paper, we propose a technique for estimating the degree or intensity of emotional expressions and speaking styles appearing in speech. The key idea is based on a style control technique for speech synthesis using a multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse of the style control. In the proposed technique, the acoustic features of spectrum, power, fundamental frequency, and duration are simultaneously modeled using the MRHSMM. We derive an algorithm for estimating explanatory variables of the MRHSMM, each of which represents the degree or intensity of emotional expressions and speaking styles appearing in acoustic features of speech, based on a maximum likelihood criterion. We show experimental results to demonstrate the ability of the proposed technique using two types of speech data, simulated emotional speech and spontaneous speech with different speaking styles. It is found that the estimated values have correlation with human perception.

  18. Artificial neural networks and multiple linear regression model using principal components to estimate rainfall over South America

    NASA Astrophysics Data System (ADS)

    Soares dos Santos, T.; Mendes, D.; Rodrigues Torres, R.

    2016-01-01

    Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANNs) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon; northeastern Brazil; and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used general circulation model (GCM) experiments for the 20th century (RCP historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANNs significantly outperform the MLR downscaling of monthly precipitation variability.

  19. Solving Capelin Time Series Ecosystem Problem Using Hybrid ANN-GAs Model and Multiple Linear Regression Model

    NASA Astrophysics Data System (ADS)

    Eghnam, Karam M.; Sheta, Alaa F.

    2008-06-01

    Development of accurate models is necessary in critical applications such as prediction. In this paper, a solution to the stock prediction problem of the Barents Sea capelin is introduced using Artificial Neural Network (ANN) and Multiple Linear Regression (MLR) models. The capelin stock in the Barents Sea is one of the largest in the world. It normally maintained a fishery with annual catches of up to 3 million tons. The capelin stock problem has an impact on fish stock development. The proposed prediction model was developed using an ANN whose weights were adapted using a Genetic Algorithm (GA). The proposed model was compared to a traditional linear model, the MLR. The results showed that the ANN-GA model produced an overall accuracy 21% better than the MLR model.

  20. Effect size and power in assessing moderating effects of categorical variables using multiple regression: a 30-year review.

    PubMed

    Aguinis, Herman; Beaty, James C; Boik, Robert J; Pierce, Charles A

    2005-01-01

    The authors conducted a 30-year review (1969-1998) of the size of moderating effects of categorical variables as assessed using multiple regression. The median observed effect size (f(2)) is only .002, but 72% of the moderator tests reviewed had power of .80 or greater to detect a targeted effect conventionally defined as small. Results suggest the need to minimize the influence of artifacts that produce a downward bias in the observed effect size and put into question the use of conventional definitions of moderating effect sizes. As long as an effect has a meaningful impact, the authors advise researchers to conduct a power analysis and plan future research designs on the basis of smaller and more realistic targeted effect sizes.
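
    For orientation, the conventional Cohen's f² for an added term (here, a moderator or product term) can be computed from the R² of the models with and without that term. The sketch below uses that textbook definition; the review's own computation of f² may differ in detail, and the R² values are hypothetical.

```python
def cohens_f2(r2_full, r2_reduced):
    """Cohen's f^2 for the terms present in the full model but not the reduced one."""
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("need 0 <= r2_reduced <= r2_full < 1")
    # Variance explained by the added terms, relative to unexplained variance.
    return (r2_full - r2_reduced) / (1.0 - r2_full)

# Hypothetical R^2 values without and with the moderator (product) term.
f2 = cohens_f2(r2_full=0.302, r2_reduced=0.300)
print(f"f^2 = {f2:.4f}")
```

    By Cohen's usual convention an f² of about .02 counts as small, which is an order of magnitude above the median of .002 reported in the review.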

  1. Application of cluster analysis and multiple regression to calculate the effect of vegetation and topography on snow accumulation and snowmelt

    NASA Astrophysics Data System (ADS)

    Pevná, Hana; Jeníček, Michal

    2014-05-01

    Snow is an important component of the hydrological cycle in central Europe. A large quantity of water accumulates as snow during the winter period, and this water runs off into rivers in a relatively short time during the spring period. An increased risk of floods in central Europe exists especially in alpine and pre-alpine catchments which have a pluvio-nival flow regime. Research on snow accumulation and snowmelt processes is important for runoff forecasting and reservoir management. The research is carried out in small mountain catchments in the Czech Republic. The experimental catchments differ in elevation range, aspect, slope and type of vegetation cover. Automatic and field measurements of snow depth and snow water equivalent (SWE) have been carried out at specific localities since 2008. Each locality is characterized by elevation, aspect, slope and vegetation type (open area, clearing, young forest, sparse mature forest and dense mature forest). Measurements of snow depth and SWE are carried out at 19 localities during both the snow accumulation and snowmelt periods. Snow depth and SWE data were assessed using simple statistical analysis, multiple regression and cluster analysis in order to describe the spatial distribution of snow accumulation and snowmelt. The correlation of SWE with vegetation type, elevation, aspect and slope was tested. The main findings show that vegetation type has the most significant influence on the snowpack distribution and on snow accumulation and snowmelt dynamics. Significant correlations were also found for aspect (especially for southern slopes). The study complements similar results obtained in different study areas and climatic conditions, and moreover it shows changes in the importance of governing factors between the snow accumulation and snowmelt periods. The results demonstrate the good applicability of cluster analysis and multiple regression for describing snowpack distribution.

  2. Comparison of Multiple Linear Regressions and Neural Networks based QSAR models for the design of new antitubercular compounds.

    PubMed

    Ventura, Cristina; Latino, Diogo A R S; Martins, Filomena

    2013-01-01

    The performance of two QSAR methodologies, namely Multiple Linear Regressions (MLR) and Neural Networks (NN), towards the modeling and prediction of antitubercular activity was evaluated and compared. A data set of 173 potentially active compounds belonging to the hydrazide family and represented by 96 descriptors was analyzed. Models were built with Multiple Linear Regressions (MLR), single Feed-Forward Neural Networks (FFNNs), ensembles of FFNNs and Associative Neural Networks (AsNNs) using four different data sets and different types of descriptors. The predictive ability of the different techniques used was assessed and discussed on the basis of different validation criteria, and the results show in general a better performance of AsNNs in terms of learning ability and prediction of antitubercular behaviors when compared with all other methods. MLRs have, however, the advantage of pinpointing the most relevant molecular characteristics responsible for the behavior of these compounds against Mycobacterium tuberculosis. The best results for the larger data set (94 compounds in training set and 18 in test set) were obtained with AsNNs using seven descriptors (R(2) of 0.874 and RMSE of 0.437 against R(2) of 0.845 and RMSE of 0.472 in MLRs, for test set). Counter-Propagation Neural Networks (CPNNs) were trained with the same data sets and descriptors. From the scrutiny of the weight levels in each CPNN and the information retrieved from MLRs, a rational design of potentially active compounds was attempted. Two new compounds were synthesized and tested against M. tuberculosis showing an activity close to that predicted by the majority of the models.

  3. Statistical analysis of water-quality data containing multiple detection limits: S-language software for regression on order statistics

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2005-01-01

    Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.

  4. [A case of multiple hepatic metastases of gastric cancer that showed complete regression by systemic chemotherapy using paclitaxel and UFT-E].

    PubMed

    Okamura, Hiroko; Fujiwara, Hitoshi; Ichikawa, Daisuke; Okamoto, Kazuma; Kikuchi, Shojiro; Kubota, Takeshi; Ikoma, Hisashi; Nakanishi, Masayoshi; Ochiai, Toshiya; Sakakura, Chouhei; Kokuba, Yukihito; Taniguchi, Hiroki; Sonoyama, Teruhisa; Otsuji, Eigo

    2009-06-01

    We report a case of gastric cancer with simultaneous multiple liver metastases that was successfully treated by paclitaxel and UFT-E. A 54-year-old man with gastric cancer was admitted to our hospital for further examination and treatment. A type III gastric cancer was located in the lower to middle part of the gastric body. Abdominal CT revealed multiple liver metastases and lymph node metastases. Then, we performed distal gastrectomy and cholecystectomy. Postoperative pathological diagnosis was stage IV (a type 3 tumor (78x65 mm), pT3, por 2, INF g, ly3, v0, pN2(+) (26/28), H1 (bilobular multiple metastases), CY0, P0). Postoperatively, he was treated with S-1 po at 100 mg/body/day as first-line chemotherapy. Thirteen days after S-1 initiation, he was readmitted due to grade 3 diarrhea, and S-1 was immediately stopped. After his general condition had improved, paclitaxel was administered biweekly at a dose of 80 mg/m2. He was discharged after two administrations, and the regimen was continued at an outpatient clinic. Four months after the operation, abdominal computed tomography (CT) showed a remarkable reduction of the multiple liver metastases, and the serum levels of tumor markers (CEA, CA19-9) were reduced. Five months after the operation, the serum levels of tumor markers were elevated again. Then, additional administration of UFT-E po (300 mg/body daily) was started. Seven months after the operation, abdominal CT showed a complete regression of the multiple liver metastases, and the serum levels of tumor markers were also reduced to within the normal range. During chemotherapy at an outpatient clinic, critical adverse effects did not appear. Paclitaxel or paclitaxel combined with UFT-E might be an effective regimen as second- or third-line chemotherapy for the liver metastases of gastric cancer.

  5. Predicting Patient Advocacy Engagement: A Multiple Regression Analysis Using Data From Health Professionals in Acute-Care Hospitals.

    PubMed

    Jansson, Bruce S; Nyamathi, Adeline; Heidemann, Gretchen; Duan, Lei; Kaplan, Charles

    2015-01-01

    Although literature documents the need for hospital social workers, nurses, and medical residents to engage in patient advocacy, little information exists about what predicts the extent they do so. This study aims to identify predictors of health professionals' patient advocacy engagement with respect to a broad range of patients' problems. A cross-sectional research design was employed with a sample of 94 social workers, 97 nurses, and 104 medical residents recruited from eight hospitals in Los Angeles. Bivariate correlations explored whether seven scales (Patient Advocacy Eagerness, Ethical Commitment, Skills, Tangible Support, Organizational Receptivity, Belief Other Professionals Engage, and Belief the Hospital Empowers Patients) were associated with patient advocacy engagement, measured by the validated Patient Advocacy Engagement Scale. Regression analysis examined whether these scales, when controlling for sociodemographic and setting variables, predicted patient advocacy engagement. While all seven predictor scales were significantly associated with patient advocacy engagement in correlational analyses, only Eagerness, Skills, and Belief the Hospital Empowers Patients predicted patient advocacy engagement in regression analyses. Additionally, younger professionals engaged in higher levels of patient advocacy than older professionals, and social workers engaged in greater patient advocacy than nurses. Limitations and the utility of these findings for acute-care hospitals are discussed. PMID:26317762

  6. Predicting Patient Advocacy Engagement: A Multiple Regression Analysis Using Data From Health Professionals in Acute-Care Hospitals.

    PubMed

    Jansson, Bruce S; Nyamathi, Adeline; Heidemann, Gretchen; Duan, Lei; Kaplan, Charles

    2015-01-01

    Although literature documents the need for hospital social workers, nurses, and medical residents to engage in patient advocacy, little information exists about what predicts the extent they do so. This study aims to identify predictors of health professionals' patient advocacy engagement with respect to a broad range of patients' problems. A cross-sectional research design was employed with a sample of 94 social workers, 97 nurses, and 104 medical residents recruited from eight hospitals in Los Angeles. Bivariate correlations explored whether seven scales (Patient Advocacy Eagerness, Ethical Commitment, Skills, Tangible Support, Organizational Receptivity, Belief Other Professionals Engage, and Belief the Hospital Empowers Patients) were associated with patient advocacy engagement, measured by the validated Patient Advocacy Engagement Scale. Regression analysis examined whether these scales, when controlling for sociodemographic and setting variables, predicted patient advocacy engagement. While all seven predictor scales were significantly associated with patient advocacy engagement in correlational analyses, only Eagerness, Skills, and Belief the Hospital Empowers Patients predicted patient advocacy engagement in regression analyses. Additionally, younger professionals engaged in higher levels of patient advocacy than older professionals, and social workers engaged in greater patient advocacy than nurses. Limitations and the utility of these findings for acute-care hospitals are discussed.

  7. Combining different functions to describe milk, fat, and protein yield in goats using Bayesian multiple-trait random regression models.

    PubMed

    Oliveira, H R; Silva, F F; Siqueira, O H G B D; Souza, N O; Junqueira, V S; Resende, M D V; Borquis, R R A; Rodrigues, M T

    2016-05-01

    We proposed multiple-trait random regression models (MTRRM) combining different functions to describe milk yield (MY) and fat (FP) and protein (PP) percentage in dairy goat genetic evaluation by using Bayesian inference. A total of 3,856 MY, FP, and PP test-day records, measured between 2000 and 2014, from 535 first lactations of Saanen and Alpine goats, including their cross, were used in this study. The initial analyses were performed using the following single-trait random regression models (STRRM): third- and fifth-order Legendre polynomials (Leg3 and Leg5), linear B-splines with 3 and 5 knots, the Ali and Schaeffer function (Ali), and Wilmink function. Heterogeneity of residual variances was modeled considering 3 classes. After the selection of the best STRRM to describe each trait on the basis of the deviance information criterion (DIC) and posterior model probabilities (PMP), the functions were combined to compose the MTRRM. All combined MTRRM presented lower DIC values and higher PMP, showing the superiority of these models when compared to other MTRRM based only on the same function assumed for all traits. Among the combined MTRRM, those considering Ali to describe MY and PP and Leg5 to describe FP (Ali_Leg5_Ali model) presented the best fit. From the Ali_Leg5_Ali model, heritability estimates over time for MY, FP, and PP ranged from 0.25 to 0.54, 0.27 to 0.48, and 0.35 to 0.51, respectively. Genetic correlation between MY and FP, MY and PP, and FP and PP ranged from -0.58 to 0.03, -0.46 to 0.12, and 0.37 to 0.64, respectively. We concluded that combining different functions under an MTRRM approach can be a plausible alternative for joint genetic evaluation of milk yield and milk constituents in goats. PMID:27285684

  8. Combining different functions to describe milk, fat, and protein yield in goats using Bayesian multiple-trait random regression models.

    PubMed

    Oliveira, H R; Silva, F F; Siqueira, O H G B D; Souza, N O; Junqueira, V S; Resende, M D V; Borquis, R R A; Rodrigues, M T

    2016-05-01

    We proposed multiple-trait random regression models (MTRRM) combining different functions to describe milk yield (MY) and fat (FP) and protein (PP) percentage in dairy goat genetic evaluation by using Bayesian inference. A total of 3,856 MY, FP, and PP test-day records, measured between 2000 and 2014, from 535 first lactations of Saanen and Alpine goats, including their cross, were used in this study. The initial analyses were performed using the following single-trait random regression models (STRRM): third- and fifth-order Legendre polynomials (Leg3 and Leg5), linear B-splines with 3 and 5 knots, the Ali and Schaeffer function (Ali), and Wilmink function. Heterogeneity of residual variances was modeled considering 3 classes. After the selection of the best STRRM to describe each trait on the basis of the deviance information criterion (DIC) and posterior model probabilities (PMP), the functions were combined to compose the MTRRM. All combined MTRRM presented lower DIC values and higher PMP, showing the superiority of these models when compared to other MTRRM based only on the same function assumed for all traits. Among the combined MTRRM, those considering Ali to describe MY and PP and Leg5 to describe FP (Ali_Leg5_Ali model) presented the best fit. From the Ali_Leg5_Ali model, heritability estimates over time for MY, FP, and PP ranged from 0.25 to 0.54, 0.27 to 0.48, and 0.35 to 0.51, respectively. Genetic correlation between MY and FP, MY and PP, and FP and PP ranged from -0.58 to 0.03, -0.46 to 0.12, and 0.37 to 0.64, respectively. We concluded that combining different functions under an MTRRM approach can be a plausible alternative for joint genetic evaluation of milk yield and milk constituents in goats.

  9. An improved approach for measuring the impact of multiple CO2 conductances on the apparent photorespiratory CO2 compensation point through slope-intercept regression.

    PubMed

    Walker, Berkley J; Skabelund, Dane C; Busch, Florian A; Ort, Donald R

    2016-06-01

    Biochemical models of leaf photosynthesis, which are essential for understanding the impact of photosynthesis to changing environments, depend on accurate parameterizations. One such parameter, the photorespiratory CO2 compensation point can be measured from the intersection of several CO2 response curves measured under sub-saturating illumination. However, determining the actual intersection while accounting for experimental noise can be challenging. Additionally, leaf photosynthesis model outcomes are sensitive to the diffusion paths of CO2 released from the mitochondria. This diffusion path of CO2 includes both chloroplastic as well as cell wall resistances to CO2 , which are not readily measurable. Both the difficulties of determining the photorespiratory CO2 compensation point and the impact of multiple intercellular resistances to CO2 can be addressed through application of slope-intercept regression. This technical report summarizes an improved framework for implementing slope-intercept regression to evaluate measurements of the photorespiratory CO2 compensation point. This approach extends past work to include the cases of both Rubisco and Ribulose-1,5-bisphosphate (RuBP)-limited photosynthesis. This report further presents two interactive graphical applications and a spreadsheet-based tool to allow users to apply slope-intercept theory to their data. PMID:27103099

  10. An improved approach for measuring the impact of multiple CO2 conductances on the apparent photorespiratory CO2 compensation point through slope-intercept regression.

    PubMed

    Walker, Berkley J; Skabelund, Dane C; Busch, Florian A; Ort, Donald R

    2016-06-01

    Biochemical models of leaf photosynthesis, which are essential for understanding the impact of photosynthesis to changing environments, depend on accurate parameterizations. One such parameter, the photorespiratory CO2 compensation point can be measured from the intersection of several CO2 response curves measured under sub-saturating illumination. However, determining the actual intersection while accounting for experimental noise can be challenging. Additionally, leaf photosynthesis model outcomes are sensitive to the diffusion paths of CO2 released from the mitochondria. This diffusion path of CO2 includes both chloroplastic as well as cell wall resistances to CO2 , which are not readily measurable. Both the difficulties of determining the photorespiratory CO2 compensation point and the impact of multiple intercellular resistances to CO2 can be addressed through application of slope-intercept regression. This technical report summarizes an improved framework for implementing slope-intercept regression to evaluate measurements of the photorespiratory CO2 compensation point. This approach extends past work to include the cases of both Rubisco and Ribulose-1,5-bisphosphate (RuBP)-limited photosynthesis. This report further presents two interactive graphical applications and a spreadsheet-based tool to allow users to apply slope-intercept theory to their data.

  11. Internal correction of spectral interferences and mass bias for selenium metabolism studies using enriched stable isotopes in combination with multiple linear regression.

    PubMed

    Lunøe, Kristoffer; Martínez-Sierra, Justo Giner; Gammelgaard, Bente; Alonso, J Ignacio García

    2012-03-01

    The analytical methodology for the in vivo study of selenium metabolism using two enriched selenium isotopes has been modified, allowing for the internal correction of spectral interferences and mass bias both for total selenium and speciation analysis. The method is based on the combination of an already described dual-isotope procedure with a new data treatment strategy based on multiple linear regression. A metabolic enriched isotope ((77)Se) is given orally to the test subject and a second isotope ((74)Se) is employed for quantification. In our approach, all possible polyatomic interferences occurring in the measurement of the isotope composition of selenium by collision cell quadrupole ICP-MS are taken into account and their relative contribution calculated by multiple linear regression after minimisation of the residuals. As a result, all spectral interferences and mass bias are corrected internally allowing the fast and independent quantification of natural abundance selenium ((nat)Se) and enriched (77)Se. In this sense, the calculation of the tracer/tracee ratio in each sample is straightforward. The method has been applied to study the time-related tissue incorporation of (77)Se in male Wistar rats while maintaining the (nat)Se steady-state conditions. Additionally, metabolically relevant information such as selenoprotein synthesis and selenium elimination in urine could be studied using the proposed methodology. In this case, serum proteins were separated by affinity chromatography while reverse phase was employed for urine metabolites. In both cases, (74)Se was used as a post-column isotope dilution spike. The application of multiple linear regression to the whole chromatogram allowed us to calculate the contribution of bromine hydride, selenium hydride, argon polyatomics and mass bias on the observed selenium isotope patterns. By minimising the square sum of residuals for the whole chromatogram, internal correction of spectral interferences and mass

  12. Stochastic Vortex Dynamics in Two-Dimensional Easy Plane Ferromagnets: Multiplicative Versus Additive Noise

    SciTech Connect

    Kamppeter, T.; Mertens, F.G.; Moro, E.; Sanchez, A.; Bishop, A.R.

    1998-09-01

    We study how thermal fluctuations affect the dynamics of vortices in the two-dimensional anisotropic Heisenberg model depending on their additive or multiplicative character. Using a collective coordinate theory, we analytically show that multiplicative noise, arising from fluctuations in the local field term of the Landau-Lifshitz equations, and Langevin-like additive noise have the same effect on vortex dynamics (within a very plausible assumption consistent with the collective coordinate approach). This is a highly non-trivial result as multiplicative and additive noises usually modify the dynamics in very different ways. We also carry out numerical simulations of both versions of the model finding that they indeed give rise to very similar vortex dynamics.

  13. Effect of multiplicative and additive noise on genetic transcriptional regulatory mechanism

    NASA Astrophysics Data System (ADS)

    Liu, Xue-Mei; Xie, Hui-Zhang; Liu, Liang-Gang; Li, Zhi-Bing

    2009-02-01

    A multiplicative noise and an additive noise are introduced in the kinetic model of Smolen-Baxter-Byrne [P. Smolen, D.A. Baxter, J.H. Byrne, Amer. J. Physiol. Cell. Physiol. 274 (1998) 531], in which the expression of gene is controlled by protein concentration of transcriptional activator. The Fokker-Planck equation is solved and the steady-state probability distribution is obtained numerically. It is found that the multiplicative noise converts the bistability to monostability that can be regarded as a noise-induced transition. The additive noise reduces the transcription efficiency. The correlation between the multiplicative noise and the additive noise works as a genetic switch and regulates the gene transcription effectively.

  14. The use of artificial neural networks and multiple linear regression to predict rate of medical waste generation

    SciTech Connect

    Jahandideh, Sepideh; Jahandideh, Samad; Asadabadi, Ebrahim Barzegari; Askarian, Mehrdad; Movahedi, Mohammad Mehdi; Hosseini, Somayyeh; Jahandideh, Mina

    2009-11-15

    Prediction of the amount of hospital waste produced will be helpful in the storage, transportation and disposal stages of hospital waste management. Based on this fact, two predictor models, artificial neural networks (ANNs) and multiple linear regression (MLR), were applied to predict the rate of medical waste generation, both in total and for the separate types of sharp, infectious and general waste. In this study, a 5-fold cross-validation procedure on a database containing a total of 50 hospitals of Fars province (Iran) was used to verify the performance of the models. Three performance measures, MAR, RMSE and R², were used to evaluate the performance of the models. The MLR, as a conventional model, obtained poor prediction performance measure values. However, MLR distinguished hospital capacity and bed occupancy as the more significant parameters. On the other hand, ANNs, as a more powerful model that had not previously been introduced for predicting the rate of medical waste generation, showed high performance measure values, especially an R² value of 0.99, confirming the good fit of the data. Such satisfactory results could be attributed to the non-linear nature of ANNs in problem solving, which provides the opportunity for relating independent variables to dependent ones non-linearly. In conclusion, the obtained results showed that our ANN-based model approach is very promising and may play a useful role in developing a better cost-effective strategy for waste management in future.
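
    A 5-fold cross-validation of a multiple linear regression, scored by RMSE and R², might look like the sketch below. It is an assumption-laden illustration rather than the study's procedure: the data are synthetic, the two predictors merely stand in for quantities such as hospital capacity and bed occupancy, and `kfold_mlr` is a hypothetical helper name.

```python
import numpy as np

def kfold_mlr(X, y, k=5, seed=0):
    """Return pooled out-of-fold RMSE and R^2 for ordinary least squares."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    pred = np.empty(n)
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        # Fit intercept and slopes on the training folds only.
        A = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred[test] = coef[0] + X[test] @ coef[1:]
    rmse = np.sqrt(np.mean((y - pred) ** 2))
    r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
    return rmse, r2

# Synthetic data for 50 hypothetical hospitals.
rng = np.random.default_rng(1)
X = rng.uniform(size=(50, 2))
y = 3.0 + X @ np.array([2.0, -1.0]) + 0.05 * rng.normal(size=50)

rmse, r2 = kfold_mlr(X, y)
print(f"RMSE = {rmse:.3f}, R^2 = {r2:.3f}")
```

    Because every prediction is made by a model that never saw that observation, the pooled scores are less optimistic than in-sample fit statistics, which matters with only 50 records.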

  15. Multiple regression analysis of relationship between frontal lobe phosphorus metabolism and clinical symptoms in patients with schizophrenia.

    PubMed

    Shioiri, T; Someya, T; Murashita, J; Kato, T; Hamakawa, H; Fujii, K; Inubushi, T

    1997-12-30

    We investigated the differences among diagnostic types of 36 schizophrenic patients in the brain phosphorus metabolism in the frontal lobe. We performed phosphorus-31 magnetic resonance spectroscopy (31P-MRS) in the frontal region in patients with schizophrenia of the catatonic (n = 4), disorganized (n = 8), paranoid (n = 10) and undifferentiated (n = 14) types. In the disorganized type, the PME level was significantly decreased compared to those in the other three types, while the phosphodiester (PDE) level tended to be higher, although not significantly, than those in the other types. Using multiple regression analysis, we investigated whether or not the clinical symptoms were correlated with the brain phosphorus metabolism. An increased motor retardation factor score was significantly correlated with decreased PME level, whereas more severe emotional withdrawal and blunted affect were associated with increased PDE level. These results suggest that altered membrane phospholipid metabolism in the frontal region may be associated with negative symptoms and that schizophrenia of the disorganized type is associated with more severe negative symptoms and may present more severe brain abnormalities compared to the other types.

  16. Comparing Effects of Biologic Agents in Treating Patients with Rheumatoid Arthritis: A Multiple Treatment Comparison Regression Analysis

    PubMed Central

    Tvete, Ingunn Fride; Natvig, Bent; Gåsemyr, Jørund; Meland, Nils; Røine, Marianne; Klemp, Marianne

    2015-01-01

    Rheumatoid arthritis patients have been treated with disease modifying anti-rheumatic drugs (DMARDs) and the newer biologic drugs. We sought to compare and rank the biologics with respect to efficacy. We performed a literature search identifying 54 publications encompassing 9 biologics. We conducted a multiple treatment comparison regression analysis letting the number experiencing a 50% improvement on the ACR score be dependent upon dose level and disease duration for assessing the comparable relative effect between biologics and placebo or DMARD. The analysis embraced all treatment and comparator arms over all publications. Hence, all measured effects of any biologic agent contributed to the comparison of all biologic agents relative to each other either given alone or combined with DMARD. We found the drug effect to be dependent on dose level, but not on disease duration, and the impact of a high versus low dose level was the same for all drugs (higher doses indicated a higher frequency of ACR50 scores). The ranking of the drugs when given without DMARD was certolizumab (ranked highest), etanercept, tocilizumab/ abatacept and adalimumab. The ranking of the drugs when given with DMARD was certolizumab (ranked highest), tocilizumab, anakinra, rituximab, golimumab/ infliximab/ abatacept, adalimumab/ etanercept. Still, all drugs were effective. All biologic agents were effective compared to placebo, with certolizumab the most effective and adalimumab (without DMARD treatment) and adalimumab/ etanercept (combined with DMARD treatment) the least effective. The drugs were in general more effective, except for etanercept, when given together with DMARDs. PMID:26356639

  17. Crude oil price forecasting based on hybridizing wavelet multiple linear regression model, particle swarm optimization techniques, and principal component analysis.

    PubMed

    Shabri, Ani; Samsudin, Ruhaidah

    2014-01-01

    Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing subseries data in MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to adopt the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily West Texas Intermediate (WTI) crude oil market has been used as the case study. The time series prediction performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series.

  18. QSAR study of HCV NS5B polymerase inhibitors using the genetic algorithm-multiple linear regression (GA-MLR)

    PubMed Central

    Rafiei, Hamid; Khanzadeh, Marziyeh; Mozaffari, Shahla; Bostanifar, Mohammad Hassan; Avval, Zhila Mohajeri; Aalizadeh, Reza; Pourbasheer, Eslam

    2016-01-01

    A quantitative structure-activity relationship (QSAR) study has been employed for predicting the inhibitory activities of the Hepatitis C virus (HCV) NS5B polymerase inhibitors. A data set consisting of 72 compounds was selected, and then different types of molecular descriptors were calculated. The whole data set was split into a training set (80% of the dataset) and a test set (20% of the dataset) using principal component analysis. The stepwise (SW) and the genetic algorithm (GA) techniques were used as variable selection tools. The multiple linear regression method was then used to linearly correlate the selected descriptors with inhibitory activities. Several validation techniques, including leave-one-out and leave-group-out cross-validation and the Y-randomization method, were used to evaluate the internal capability of the derived models. The external prediction ability of the derived models was further analyzed using modified r2, concordance correlation coefficient values and the Golbraikh and Tropsha acceptable model criteria. Based on the derived results (GA-MLR), some new insights toward molecular structural requirements for obtaining better inhibitory activity were obtained. PMID:27065774
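
    Y-randomization, one of the internal validation checks named above, can be sketched as follows: the response vector is shuffled many times, the regression is refitted each time, and the R² of the real model is compared with the distribution obtained from the shuffled responses. The descriptors and activities here are synthetic, and the sample size of 58 (roughly 80% of 72 compounds), the four descriptors, and the 200 shuffles are assumptions chosen for illustration.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(7)
X = rng.normal(size=(58, 4))                      # synthetic descriptors
y = X @ np.array([0.8, -0.6, 0.4, 0.0]) + 0.3 * rng.normal(size=58)

r2_true = r_squared(X, y)
# Shuffling y breaks any real descriptor-activity relationship.
r2_shuffled = np.array([r_squared(X, rng.permutation(y)) for _ in range(200)])

print(f"R^2 (real y)       = {r2_true:.3f}")
print(f"R^2 (shuffled, max) = {r2_shuffled.max():.3f}")
```

    A model whose R² is not clearly separated from the shuffled distribution is likely fitting chance correlations, a particular risk when descriptors are chosen by stepwise or genetic-algorithm selection.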

  19. QSAR study of HCV NS5B polymerase inhibitors using the genetic algorithm-multiple linear regression (GA-MLR).

    PubMed

    Rafiei, Hamid; Khanzadeh, Marziyeh; Mozaffari, Shahla; Bostanifar, Mohammad Hassan; Avval, Zhila Mohajeri; Aalizadeh, Reza; Pourbasheer, Eslam

    2016-01-01

    A quantitative structure-activity relationship (QSAR) study was carried out to predict the inhibitory activities of hepatitis C virus (HCV) NS5B polymerase inhibitors. A data set of 72 compounds was selected, and different types of molecular descriptors were calculated. The whole data set was split into a training set (80% of the data set) and a test set (20% of the data set) using principal component analysis. Stepwise (SW) and genetic algorithm (GA) techniques were used as variable selection tools. Multiple linear regression was then used to linearly correlate the selected descriptors with inhibitory activities. Several validation techniques, including leave-one-out and leave-group-out cross-validation and the Y-randomization method, were used to evaluate the internal capability of the derived models. The external prediction ability of the derived models was further analyzed using modified r², concordance correlation coefficient values, and the Golbraikh and Tropsha acceptable model criteria. Based on the derived results (GA-MLR), some new insights into the molecular structural requirements for better inhibitory activity were obtained. PMID:27065774
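
    The validation workflow described in this abstract (an 80/20 split, an MLR fit, and leave-one-out cross-validation) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data are synthetic, the function names `fit_mlr`, `predict_mlr`, and `loo_q2` are hypothetical, and a random split stands in for the PCA-based split used by the authors.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply fitted coefficients to a descriptor matrix."""
    return coef[0] + X @ coef[1:]

def loo_q2(X, y):
    """Leave-one-out Q^2 = 1 - PRESS / total sum of squares."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i            # drop compound i, refit on the rest
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict_mlr(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in for 72 compounds with 3 selected descriptors (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(72, 3))
y = 1.0 + X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.1, size=72)

# Random split of roughly 80/20 (58 training, 14 test compounds).
idx = rng.permutation(72)
train, test = idx[:58], idx[58:]

coef = fit_mlr(X[train], y[train])
q2 = loo_q2(X[train], y[train])            # internal validation
resid = y[test] - predict_mlr(coef, X[test])
r2_test = 1.0 - np.sum(resid ** 2) / np.sum((y[test] - y[test].mean()) ** 2)  # external validation
```

    Q² is computed only on the training set, so the test set remains untouched for the external check.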

  20. Risk Assessment and Prediction of Flyrock Distance by Combined Multiple Regression Analysis and Monte Carlo Simulation of Quarry Blasting

    NASA Astrophysics Data System (ADS)

    Armaghani, Danial Jahed; Mahdiyar, Amir; Hasanipanah, Mahdi; Faradonbeh, Roohollah Shirani; Khandelwal, Manoj; Amnieh, Hassan Bakhshandeh

    2016-09-01

    Flyrock is considered one of the main causes of human injury, fatalities, and structural damage among all undesirable environmental impacts of blasting. Proper prediction and simulation of flyrock is therefore essential, especially for determining the blast safety area. If proper control measures are taken, the flyrock distance can be controlled and, in turn, the risk of damage can be reduced or eliminated. The first objective of this study was to develop a predictive model for flyrock estimation based on multiple regression (MR) analyses; the developed MR model was then used to simulate the flyrock phenomenon with the Monte Carlo (MC) approach. To achieve these objectives, 62 blasting operations were investigated in the Ulu Tiram quarry, Malaysia, and some controllable and uncontrollable factors were carefully recorded or calculated. The results of MC modeling indicated that this approach is capable of simulating flyrock ranges with a good level of accuracy. The mean flyrock distance simulated by MC was 236.3 m, compared with 238.6 m for the measured values. Furthermore, a sensitivity analysis was conducted to investigate the effects of model inputs on the output of the system. The analysis demonstrated that powder factor is the most influential parameter on flyrock among all model inputs. It should be noted that the proposed MR and MC models should be used only in the studied area; their direct use under other conditions is not recommended.
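
    The idea of propagating input uncertainty through a fitted regression equation by Monte Carlo sampling can be sketched as follows. This is only an illustration under stated assumptions: the coefficients `b0`, `b1`, `b2`, the input distributions, and the second input `other_input` are all hypothetical and are not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws = 10_000

# Hypothetical regression equation (NOT the paper's model):
#   flyrock = b0 + b1 * powder_factor + b2 * other_input
b0, b1, b2 = 50.0, 200.0, 10.0

# Hypothetical input distributions, chosen only for illustration.
powder_factor = rng.uniform(0.5, 1.0, n_draws)
other_input = rng.normal(3.0, 0.5, n_draws)

# Monte Carlo step: evaluate the regression equation for every sampled input set.
flyrock = b0 + b1 * powder_factor + b2 * other_input

mean_flyrock = float(flyrock.mean())
p95_flyrock = float(np.percentile(flyrock, 95))

# Simple sensitivity measure: correlation of each input with the simulated output.
inputs = {"powder_factor": powder_factor, "other_input": other_input}
sensitivity = {name: float(np.corrcoef(x, flyrock)[0, 1]) for name, x in inputs.items()}
```

    In this toy setup the input with the larger contribution to output variance has the higher correlation, which is the kind of ranking a sensitivity analysis reports.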

  1. Ranking contributing areas of salt and selenium in the Lower Gunnison River Basin, Colorado, using multiple linear regression models

    USGS Publications Warehouse

    Linard, Joshua I.

    2013-01-01

    Mitigating the effects of salt and selenium on water quality in the Grand Valley and lower Gunnison River Basin in western Colorado is a major concern for land managers. Previous modeling indicated means to improve the models by including more detailed geospatial data and a more rigorous method for developing the models. After evaluating all possible combinations of geospatial variables, four multiple linear regression models resulted that could estimate irrigation-season salt yield, nonirrigation-season salt yield, irrigation-season selenium yield, and nonirrigation-season selenium yield. The adjusted r-squared and the residual standard error (in units of log-transformed yield) of the models were, respectively, 0.87 and 2.03 for the irrigation-season salt model, 0.90 and 1.25 for the nonirrigation-season salt model, 0.85 and 2.94 for the irrigation-season selenium model, and 0.93 and 1.75 for the nonirrigation-season selenium model. The four models were used to estimate yields and loads from contributing areas corresponding to 12-digit hydrologic unit codes in the lower Gunnison River Basin study area. Each of the 175 contributing areas was ranked according to its estimated mean seasonal yield of salt and selenium.

  2. Crude oil price forecasting based on hybridizing wavelet multiple linear regression model, particle swarm optimization techniques, and principal component analysis.

    PubMed

    Shabri, Ani; Samsudin, Ruhaidah

    2014-01-01

    Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing subseries data in MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to obtain the optimal parameters of the MLR model. To assess the effectiveness of this model, daily West Texas Intermediate (WTI) crude oil prices are used as the case study. The time series prediction performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series. PMID:24895666

  3. A multiple imputation approach to the analysis of interval-censored failure time data with the additive hazards model

    PubMed Central

    Chen, Ling; Sun, Jianguo

    2013-01-01

    This paper discusses regression analysis of interval-censored failure time data, which occur in many fields including demographical, epidemiological, financial, medical, and sociological studies. For the problem, we focus on the situation where the survival time of interest can be described by the additive hazards model and a multiple imputation approach is presented for inference. A major advantage of the approach is its simplicity and it can be easily implemented by using the existing software packages for right-censored failure time data. Extensive simulation studies are conducted which indicate that the approach performs well for practical situations and is comparable to the existing methods. The methodology is applied to a set of interval-censored failure time data arising from an AIDS clinical trial. PMID:25419022

  4. Estimating Dbh of Trees Employing Multiple Linear Regression of the best Lidar-Derived Parameter Combination Automated in Python in a Natural Broadleaf Forest in the Philippines

    NASA Astrophysics Data System (ADS)

    Ibanez, C. A. G.; Carcellar, B. G., III; Paringit, E. C.; Argamosa, R. J. L.; Faelga, R. A. G.; Posilero, M. A. V.; Zaragosa, G. P.; Dimayacyac, N. A.

    2016-06-01

    Diameter at breast height (DBH) estimation is a prerequisite for various allometric equations that estimate important forestry indices such as stem volume, basal area, biomass, and carbon stock. LiDAR technology can directly obtain different forest parameters, except DBH, from the behavior and characteristics of point clouds unique to different forest classes. An extensive tree inventory was done on a two-hectare established sample plot in Mt. Makiling, Laguna, for a natural growth forest. Coordinates, height, and canopy cover were measured and species were identified for comparison with LiDAR derivatives. Multiple linear regression was used to obtain LiDAR-derived DBH by integrating field-derived DBH and 27 LiDAR-derived parameters at 20 m, 10 m, and 5 m grid resolutions. To find the best combination of parameters for DBH estimation, all possible combinations of parameters were generated automatically using Python scripts and regression-related libraries such as NumPy, SciPy, and scikit-learn. The combination that yielded the highest r-squared (coefficient of determination) and the lowest AIC (Akaike's Information Criterion) and BIC (Bayesian Information Criterion) was determined to be the best equation. The best equation uses 11 parameters at the 10 m grid size, with an r-squared of 0.604, an AIC of 154.04, and a BIC of 175.08. The combination of parameters may differ among forest classes, which is left for further studies. Additional statistical tests, such as the Kaiser-Meyer-Olkin (KMO) coefficient and Bartlett's Test of Sphericity (BTS), can be added to help determine the correlation among parameters.
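
    An exhaustive search over predictor combinations, scored by r-squared, AIC, and BIC, might look like the sketch below. It is not the authors' script: the data are synthetic, the helper names `ols_scores` and `best_subset` are hypothetical, and the AIC and BIC expressions are the common least-squares forms (up to an additive constant), which may differ from the formulas used in the study.

```python
from itertools import combinations
import numpy as np

def ols_scores(X, y):
    """Fit OLS with an intercept and return r-squared, AIC, and BIC."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    k = A.shape[1]                      # number of fitted coefficients
    return {
        "r2": 1.0 - rss / tss,
        "aic": n * np.log(rss / n) + 2 * k,
        "bic": n * np.log(rss / n) + k * np.log(n),
    }

def best_subset(X, y, criterion="bic"):
    """Try every non-empty subset of columns; keep the one with the lowest criterion."""
    best_cols, best_scores = None, None
    for size in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), size):
            scores = ols_scores(X[:, list(cols)], y)
            if best_scores is None or scores[criterion] < best_scores[criterion]:
                best_cols, best_scores = cols, scores
    return best_cols, best_scores

# Synthetic example: 5 candidate predictors, only columns 0 and 2 matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=100)

best_cols, best_scores = best_subset(X, y, criterion="bic")
full_scores = ols_scores(X, y)
```

    Because r-squared never decreases when a predictor is added, it cannot choose between subsets of different sizes on its own; AIC and BIC add a penalty for the number of coefficients. The number of subsets grows as 2^p, so an exhaustive search over 27 parameters is far more expensive than this five-column example.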

  5. An evaluation of logic regression-based biomarker discovery across multiple intergenic regions for predicting host specificity in Escherichia coli.

    PubMed

    Zhi, Shuai; Li, Qiaozhi; Yasui, Yutaka; Banting, Graham; Edge, Thomas A; Topp, Edward; McAllister, Tim A; Neumann, Norman F

    2016-10-01

    Several studies have demonstrated that E. coli appears to display some level of host adaptation and specificity. Recent studies in our laboratory support these findings as determined by logic regression modeling of single nucleotide polymorphisms (SNP) in intergenic regions (ITGRs). We sought to determine the degree of host-specific information encoded in various ITGRs across a library of animal E. coli isolates using both whole genome analysis and a targeted ITGR sequencing approach. Our findings demonstrated that ITGRs across the genome encode various degrees of host-specific information. Incorporating multiple ITGRs (i.e., concatenation) into logic regression model building resulted in greater host-specificity and sensitivity outcomes in biomarkers, but the overall level of polymorphism in an ITGR did not correlate with the degree of host-specificity encoded in the ITGR. This suggests that distinct SNPs in ITGRs may be more important in defining host-specificity than overall sequence variation, explaining why traditional unsupervised learning phylogenetic approaches may be less informative in terms of revealing host-specific information encoded in DNA sequence. In silico analysis of 80 candidate ITGRs from publicly available E. coli genomes was performed as a tool for discovering highly host-specific ITGRs. In one ITGR (ydeR-yedS) we identified a SNP biomarker that was 98% specific for cattle and for which 92% of all E. coli isolates originating from cattle carried this unique biomarker. In the case of humans, a host-specific biomarker (98% specificity) was identified in the concatenated ITGR sequences of rcsD-ompC, ydeR-yedS, and rclR-ykgE, and for which 78% of E. coli originating from humans carried this biomarker. Interestingly, human-specific biomarkers were dominant in ITGRs regulating antibiotic resistance, whereas in cattle host-specific biomarkers were found in ITGRs involved in stress regulation. These data suggest that evolution towards host

  6. An evaluation of logic regression-based biomarker discovery across multiple intergenic regions for predicting host specificity in Escherichia coli.

    PubMed

    Zhi, Shuai; Li, Qiaozhi; Yasui, Yutaka; Banting, Graham; Edge, Thomas A; Topp, Edward; McAllister, Tim A; Neumann, Norman F

    2016-10-01

    Several studies have demonstrated that E. coli appears to display some level of host adaptation and specificity. Recent studies in our laboratory support these findings as determined by logic regression modeling of single nucleotide polymorphisms (SNP) in intergenic regions (ITGRs). We sought to determine the degree of host-specific information encoded in various ITGRs across a library of animal E. coli isolates using both whole genome analysis and a targeted ITGR sequencing approach. Our findings demonstrated that ITGRs across the genome encode various degrees of host-specific information. Incorporating multiple ITGRs (i.e., concatenation) into logic regression model building resulted in greater host-specificity and sensitivity outcomes in biomarkers, but the overall level of polymorphism in an ITGR did not correlate with the degree of host-specificity encoded in the ITGR. This suggests that distinct SNPs in ITGRs may be more important in defining host-specificity than overall sequence variation, explaining why traditional unsupervised learning phylogenetic approaches may be less informative in terms of revealing host-specific information encoded in DNA sequence. In silico analysis of 80 candidate ITGRs from publicly available E. coli genomes was performed as a tool for discovering highly host-specific ITGRs. In one ITGR (ydeR-yedS) we identified a SNP biomarker that was 98% specific for cattle and for which 92% of all E. coli isolates originating from cattle carried this unique biomarker. In the case of humans, a host-specific biomarker (98% specificity) was identified in the concatenated ITGR sequences of rcsD-ompC, ydeR-yedS, and rclR-ykgE, and for which 78% of E. coli originating from humans carried this biomarker. Interestingly, human-specific biomarkers were dominant in ITGRs regulating antibiotic resistance, whereas in cattle host-specific biomarkers were found in ITGRs involved in stress regulation. These data suggest that evolution towards host

  7. Additives

    NASA Technical Reports Server (NTRS)

    Smalheer, C. V.

    1973-01-01

    The chemistry of lubricant additives is discussed to show what the additives are chemically and what functions they perform in the lubrication of various kinds of equipment. Current theories regarding the mode of action of lubricant additives are presented. The additive groups discussed include the following: (1) detergents and dispersants, (2) corrosion inhibitors, (3) antioxidants, (4) viscosity index improvers, (5) pour point depressants, and (6) antifouling agents.

  8. Are major behavioral and sociodemographic risk factors for mortality additive or multiplicative in their effects?

    PubMed

    Mehta, Neil; Preston, Samuel

    2016-04-01

    All individuals are subject to multiple risk factors for mortality. In this paper, we consider the nature of interactions between certain major sociodemographic and behavioral risk factors associated with all-cause mortality in the United States. We develop the formal logic pertaining to two forms of interaction between risk factors, additive and multiplicative relations. We then consider the general circumstances in which additive or multiplicative relations might be expected. We argue that expectations about interactions among socio-demographic variables, and their relation to behavioral variables, have been stated in terms of additivity. However, the statistical models typically used to estimate the relation between risk factors and mortality assume that risk factors act multiplicatively. We examine empirically the nature of interactions among five major risk factors associated with all-cause mortality: smoking, obesity, race, sex, and educational attainment. Data were drawn from the cross-sectional NHANES III (1988-1994) and NHANES 1999-2010 surveys, linked to death records through December 31, 2011. Our analytic sample comprised 35,604 respondents and 5369 deaths. We find that obesity is additive with each of the remaining four variables. We speculate that its additivity is a reflection of the fact that obese status is generally achieved later in life. For all pairings of socio-demographic variables, risks are multiplicative. For survival chances, it is much more dangerous to be poorly educated if you are black or if you are male. And it is much riskier to be a male if you are black. These traits, established at birth or during childhood, literally result in deadly combinations. We conclude that the identification of interactions among risk factors can cast valuable light on the nature of the process being studied. It also has public health implications by identifying especially vulnerable groups and by properly identifying the proportion of deaths

  9. Spectroscopic determination of leaf biochemistry using band-depth analysis of absorption features and stepwise multiple linear regression

    USGS Publications Warehouse

    Kokaly, R.F.; Clark, R.N.

    1999-01-01

    We develop a new method for estimating the biochemistry of plant material using spectroscopy. Normalized band depths calculated from the continuum-removed reflectance spectra of dried and ground leaves were used to estimate their concentrations of nitrogen, lignin, and cellulose. Stepwise multiple linear regression was used to select wavelengths in the broad absorption features centered at 1.73 μm, 2.10 μm, and 2.30 μm that were highly correlated with the chemistry of samples from eastern U.S. forests. Band depths of absorption features at these wavelengths were found to also be highly correlated with the chemistry of four other sites. A subset of data from the eastern U.S. forest sites was used to derive linear equations that were applied to the remaining data to successfully estimate their nitrogen, lignin, and cellulose concentrations. Correlations were highest for nitrogen (R² from 0.75 to 0.94). The consistent results indicate the possibility of establishing a single equation capable of estimating the chemical concentrations in a wide variety of species from the reflectance spectra of dried leaves. The extension of this method to remote sensing was investigated. The effects of leaf water content, sensor signal-to-noise and bandpass, atmospheric effects, and background soil exposure were examined. Leaf water was found to be the greatest challenge to extending this empirical method to the analysis of fresh whole leaves and complete vegetation canopies. The influence of leaf water on reflectance spectra must be removed to within 10%. Other effects were reduced by continuum removal and normalization of band depths. If the effects of leaf water can be compensated for, it might be possible to extend this method to remote sensing data acquired by imaging spectrometers to give estimates of nitrogen, lignin, and cellulose concentrations over large areas for use in ecosystem studies.

  10. Predicting Distribution and Inter-Annual Variability of Tropical Cyclone Intensity from a Stochastic, Multiple-Linear Regression Model

    NASA Astrophysics Data System (ADS)

    Lee, C. Y.; Tippett, M. K.; Sobel, A. H.; Camargo, S. J.

    2014-12-01

    We are working towards the development of a new statistical-dynamical downscaling system to study the influence of climate on tropical cyclones (TCs). The first step is development of an appropriate model for TC intensity as a function of environmental variables. We approach this issue with a stochastic model consisting of a multiple linear regression model (MLR) for 12-hour intensity forecasts as a deterministic component, and a random error generator as a stochastic component. Similar to the operational Statistical Hurricane Intensity Prediction Scheme (SHIPS), MLR relates the surrounding environment to storm intensity, but with only essential predictors calculated from monthly-mean NCEP reanalysis fields (potential intensity, shear, etc.) and from persistence. The deterministic MLR is developed with data from 1981-1999 and tested with data from 2000-2012 for the Atlantic, Eastern North Pacific, Western North Pacific, Indian Ocean, and Southern Hemisphere basins. While the global MLR's skill is comparable to that of the operational statistical models (e.g., SHIPS), the distribution of the predicted maximum intensity from deterministic results has a systematic low bias compared to observations; the deterministic MLR creates almost no storms with intensities greater than 100 kt. The deterministic MLR can be significantly improved by adding the stochastic component, based on the distribution of random forecasting errors from the deterministic model compared to the training data. This stochastic component may be thought of as representing the component of TC intensification that is not linearly related to the environmental variables. We find that in order for the stochastic model to accurately capture the observed distribution of maximum storm intensities, the stochastic component must be auto-correlated across 12-hour time steps. 
    This presentation also includes a detailed discussion of the distributions of other TC-intensity related quantities, as well as the inter

  11. Multi-stratified multiple regression tests of the linear/no-threshold theory of radon-induced lung cancer

    SciTech Connect

    Cohen, B.L.

    1992-12-31

    A plot of lung-cancer rates versus radon exposures in 965 US counties, or in all US states, has a strong negative slope, b, in sharp contrast to the strong positive slope predicted by linear/no-threshold theory. The discrepancy between these slopes exceeds 20 standard deviations (SD). Including smoking frequency in the analysis substantially improves fits to a linear relationship but has little effect on the discrepancy in b, because correlations between smoking frequency and radon levels are quite weak. Including 17 socioeconomic variables (SEV) in multiple regression analysis reduces the discrepancy to 15 SD. Data were divided into segments by stratifying on each SEV in turn, and on geography, and on both simultaneously, giving over 300 data sets to be analyzed individually, but negative slopes predominated. The slope is negative whether one considers only the most urban counties or only the most rural; only the richest or only the poorest; only the richest in the South Atlantic region or only the poorest in that region, etc.; and for all the strata in between. Since this is an ecological study, the well-known problems with ecological studies were investigated and found not to be applicable here. The "ecological fallacy" was shown not to apply in testing a linear/no-threshold theory, and the vulnerability to confounding is greatly reduced when confounding factors are only weakly correlated with radon levels, as is generally the case here. All confounding factors known to correlate with radon and with lung cancer were investigated quantitatively and found to have little effect on the discrepancy.

  12. Prediction of the processing factor for pesticides in apple juice by principal component analysis and multiple linear regression.

    PubMed

    Martin, L; Mezcua, M; Ferrer, C; Gil Garcia, M D; Malato, O; Fernandez-Alba, A R

    2013-01-01

    The main objective of this work was to establish a mathematical function that correlates pesticide residue levels in apple juice with the levels of the pesticides applied on the raw fruit, taking into account some of their physicochemical properties such as water solubility, the octanol/water partition coefficient, the organic carbon partition coefficient, vapour pressure and density. A mixture of 12 pesticides was applied to an apple tree; apples were collected 10 days after application. After harvest, apples were treated with a mixture of three post-harvest pesticides and the fruits were then processed in order to obtain apple juice following a routine industrial process. The pesticide residue levels in the apple samples were analysed using two multi-residue methods based on LC-MS/MS and GC-MS/MS. The concentration of pesticides was determined in samples derived from the different steps of processing. The processing factors (the coefficient between residue level in the processed commodity and the residue level in the commodity to be processed) obtained for the full juicing process were found to vary among the different pesticides studied. In order to investigate the relationships between the levels of pesticide residue found in apple juice samples and their physicochemical properties, principal component analysis (PCA) was performed using two sets of samples (one of them using experimental data obtained in this work and the other including the data taken from the literature). In both cases, a correlation was found between the processing factors of pesticides in the apple juice and the negative logarithms (base 10) of the water solubility, octanol/water partition coefficient and organic carbon partition coefficient. The linear correlation between these physicochemical properties and the processing factor was established using a multiple linear regression technique.

  13. Predicting punching acceleration from selected strength and power variables in elite karate athletes: a multiple regression analysis.

    PubMed

    Loturco, Irineu; Artioli, Guilherme Giannini; Kobal, Ronaldo; Gil, Saulo; Franchini, Emerson

    2014-07-01

    This study investigated the relationship between punching acceleration and selected strength and power variables in 19 professional karate athletes from the Brazilian National Team (9 men and 10 women; age, 23 ± 3 years; height, 1.71 ± 0.09 m; and body mass [BM], 67.34 ± 13.44 kg). Punching acceleration was assessed under 4 different conditions in a randomized order: (a) fixed distance aiming to attain maximum speed (FS), (b) fixed distance aiming to attain maximum impact (FI), (c) self-selected distance aiming to attain maximum speed, and (d) self-selected distance aiming to attain maximum impact. The selected strength and power variables were as follows: maximal dynamic strength in bench press and squat-machine, squat and countermovement jump height, mean propulsive power in bench throw and jump squat, and mean propulsive velocity in jump squat with 40% of BM. Upper- and lower-body power and maximal dynamic strength variables were positively correlated to punch acceleration in all conditions. Multiple regression analysis also revealed predictive variables: relative mean propulsive power in squat jump (W·kg⁻¹), and maximal dynamic strength 1 repetition maximum in both bench press and squat-machine exercises. An impact-oriented instruction and a self-selected distance to start the movement seem to be crucial to reach the highest acceleration during punching execution. This investigation, while demonstrating strong correlations between punching acceleration and strength-power variables, also provides important information for coaches, especially for designing better training strategies to improve punching speed.

  14. Multiple regression analysis to assess the role of plankton on the distribution and speciation of mercury in water of a contaminated lagoon.

    PubMed

    Stoichev, T; Tessier, E; Amouroux, D; Almeida, C M; Basto, M C P; Vasconcelos, V M

    2016-11-15

    A study of the spatial and seasonal variation of aqueous concentrations and distributions of mercury species was carried out during six sampling campaigns at four locations within Laranjo Bay, the most mercury-contaminated area of the Aveiro Lagoon (Portugal). Inorganic mercury (IHg(II)) and methylmercury (MeHg) were determined in filter-retained (IHgPART, MeHgPART) and filtered (<0.45 μm) fractions (IHg(II)DISS, MeHgDISS). The concentrations of IHgPART depended on site and on dilution with downstream particles. Similar processes were evidenced for MeHgPART; however, its concentrations increased for particles rich in phaeophytin (Pha). The concentrations of MeHgDISS, and especially those of IHg(II)DISS, increased with Pha concentrations in the water. Multiple regression models are able to depict MeHgPART, IHg(II)DISS and MeHgDISS concentrations with salinity and Pha concentrations exhibiting additive statistical effects and allowing separation of possible addition and removal processes. A link between phytoplankton/algae and consumers' grazing pressure in the contaminated area may be involved in increasing concentrations of IHg(II)DISS and MeHgPART. These processes could lead to suspended particles enriched with MeHg and to the enhancement of IHg(II) and MeHg availability in surface waters and higher transfer to the food web. PMID:27484944

  15. Quantification of Treatment Effect Modification on Both an Additive and Multiplicative Scale

    PubMed Central

    Girerd, Nicolas; Rabilloud, Muriel; Pibarot, Philippe; Mathieu, Patrick; Roy, Pascal

    2016-01-01

    Background: In both observational and randomized studies, associations with overall survival are by and large assessed on a multiplicative scale using the Cox model. However, clinicians and clinical researchers have an ardent interest in assessing absolute benefit associated with treatments. In older patients, some studies have reported lower relative treatment effect, which might translate into similar or even greater absolute treatment effect given their high baseline hazard for clinical events. Methods: The effect of treatment and the effect modification of treatment were respectively assessed using a multiplicative and an additive hazard model in an analysis adjusted for propensity score in the context of coronary surgery. Results: The multiplicative model yielded a lower relative hazard reduction with bilateral internal thoracic artery grafting in older patients (Hazard ratio for interaction/year = 1.03, 95%CI: 1.00 to 1.06, p = 0.05) whereas the additive model reported a similar absolute hazard reduction with increasing age (Delta for interaction/year = 0.10, 95%CI: -0.27 to 0.46, p = 0.61). The number needed to treat derived from the propensity score-adjusted multiplicative model was remarkably similar at the end of the follow-up in patients aged ≤60 and in patients >70. Conclusions: The present example demonstrates that a lower treatment effect in older patients on a relative scale can conversely translate into a similar treatment effect on an additive scale due to large baseline hazard differences. Importantly, absolute risk reduction, either crude or adjusted, can be calculated from multiplicative survival models. We advocate for a wider use of the absolute scale, especially using additive hazard models, to assess treatment effect and treatment effect modification. PMID:27045168
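
    The contrast between relative and absolute effects can be made concrete with a small worked example. The sketch below assumes constant hazards, so that the cumulative risk over t years is 1 - exp(-h·t); the baseline hazards, hazard ratios, follow-up time, and the function names `risk` and `absolute_risk_reduction` are all hypothetical and are not taken from the study.

```python
import math

def risk(hazard, years):
    """Cumulative event risk under a constant hazard (exponential survival)."""
    return 1.0 - math.exp(-hazard * years)

def absolute_risk_reduction(baseline_hazard, hazard_ratio, years):
    """Risk without treatment minus risk with treatment."""
    treated_hazard = baseline_hazard * hazard_ratio
    return risk(baseline_hazard, years) - risk(treated_hazard, years)

# Hypothetical groups: (baseline hazard per year, hazard ratio under treatment).
# The older group has a WEAKER relative effect (HR closer to 1) but a higher baseline hazard.
groups = {"younger": (0.02, 0.60), "older": (0.08, 0.80)}
years = 10

results = {}
for name, (h0, hr) in groups.items():
    arr = absolute_risk_reduction(h0, hr, years)
    results[name] = {
        "hazard_difference": h0 * (1.0 - hr),   # additive-scale effect on the hazard
        "arr": arr,                             # absolute risk reduction over the follow-up
        "nnt": 1.0 / arr,                       # number needed to treat
    }
```

    With these illustrative numbers, the older group has the smaller relative hazard reduction (20% versus 40%) yet a slightly larger absolute risk reduction over 10 years (about 0.078 versus 0.068), because its baseline hazard is four times higher.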

  16. An Investigation of the Relationship of Intellective and Personality Variables to Success in an Independent Study Science Course Through the Use of a Modified Multiple Regression Model.

    ERIC Educational Resources Information Center

    Szabo, Michael; Feldhusen, John F.

    This is an empirical study of selected learner characteristics and their relation to academic success, as indicated by course grades, in a structured independent study learning program. This program, called the Audio-Tutorial System, was utilized in an undergraduate college course in the biological sciences. By use of multiple regression analysis,…

  17. Selective impairments for addition, subtraction and multiplication. Implications for the organisation of arithmetical facts.

    PubMed

    van Harskamp, N J; Cipolotti, L

    2001-06-01

    This study reports for the first time a selective impairment for simple addition in patient FS. Moreover, patient VP presented with a selective impairment for simple multiplication and patient DT with a selective impairment for simple subtraction. These findings are discussed in the context of two of the most influential models for the organisation of arithmetical facts in memory (Dehaene and Cohen, 1995, 1997, and Dagenbach and McCloskey, 1992). Dehaene and Cohen (1995, 1997) have proposed that dissociations between arithmetical facts result from a selective impairment to two different types of processing: rote verbal memory for multiplication and simple addition vs. quantity processing for subtraction and division. Dagenbach and McCloskey (1992) suggest that dissociations between arithmetical facts result from selective damage to segregated memory networks specific for each operation. We will argue that our findings are problematic for Dehaene's model and in good accord with McCloskey's view. PMID:11485063

  18. Multiplicative noise effects on electroconvection in controlling additive noise by a magnetic field

    NASA Astrophysics Data System (ADS)

    Huh, Jong-Hoon

    2015-12-01

    We report multiplicative noise-induced threshold shift of electroconvection (EC) in the presence of a magnetic field H. Controlling the thermal fluctuation (i.e., additive noise) of the rodlike molecules of nematic liquid crystals by H, the EC threshold is examined at various noise levels [characterized by their intensity and cutoff frequency (fc)]. For a sufficiently strong H (i.e., ignorable additive noise), a modified noise sensitivity characterizing the shift problem is in good agreement with experimental results for colored as well as white noise (fc→∞); until now, there was a large deviation for (sufficiently) colored noises. The present study shows that H provides us with ideal conditions for studying the corresponding Carr-Helfrich theory considering pure multiplicative noise.

  19. The mechanical properties of high speed GTAW weld and factors of nonlinear multiple regression model under external transverse magnetic field

    NASA Astrophysics Data System (ADS)

    Lu, Lin; Chang, Yunlong; Li, Yingmin; He, Youyou

    2013-05-01

    A transverse magnetic field was introduced to the arc plasma during high-speed tungsten inert gas (TIG) arc welding of stainless steel tubes without filler wire. The influence of the external magnetic field on welding quality was investigated. Nine sets of parameters were designed by means of an orthogonal experiment. The tensile strength of the welded joint and the form factor of the weld were regarded as the main measures of welding quality. A binary quadratic nonlinear regression equation was established with magnetic induction and Ar gas flow rate as the conditions. The residual standard deviation was calculated to adjust the accuracy of the regression model. The results showed that the regression model was correct and effective in calculating the tensile strength and aspect ratio of the weld. Two 3D regression models were designed, and the influence of magnetic induction on welding quality was then examined.

  20. Additivity of Feature-Based and Symmetry-Based Grouping Effects in Multiple Object Tracking

    PubMed Central

    Wang, Chundi; Zhang, Xuemin; Li, Yongna; Lyu, Chuang

    2016-01-01

    Multiple object tracking (MOT) is an attentional process wherein people track several moving targets among several distractors. Symmetry, an important indicator of regularity, is a general spatial pattern observed in natural and artificial scenes. According to the “laws of perceptual organization” proposed by Gestalt psychologists, regularity is a principle of perceptual grouping, such as similarity and closure. A great deal of research reported that feature-based similarity grouping (e.g., grouping based on color, size, or shape) among targets in MOT tasks can improve tracking performance. However, no additive feature-based grouping effects have been reported where the tracking objects had two or more features. “Additive effect” refers to a greater grouping effect produced by grouping based on multiple cues instead of one cue. Can spatial symmetry produce a grouping effect similar to that of feature similarity in MOT tasks? Are the grouping effects based on symmetry and feature similarity additive? This study includes four experiments to address these questions. The results of Experiments 1 and 2 demonstrated the automatic symmetry-based grouping effects. More importantly, an additive grouping effect of symmetry and feature similarity was observed in Experiments 3 and 4. Our findings indicate that symmetry can produce an enhanced grouping effect in MOT and facilitate the grouping effect based on color or shape similarity. The “where” and “what” pathways might have played an important role in the additive grouping effect. PMID:27199875

  1. Additivity of Feature-Based and Symmetry-Based Grouping Effects in Multiple Object Tracking.

    PubMed

    Wang, Chundi; Zhang, Xuemin; Li, Yongna; Lyu, Chuang

    2016-01-01

    Multiple object tracking (MOT) is an attentional process wherein people track several moving targets among several distractors. Symmetry, an important indicator of regularity, is a general spatial pattern observed in natural and artificial scenes. According to the "laws of perceptual organization" proposed by Gestalt psychologists, regularity is a principle of perceptual grouping, such as similarity and closure. A great deal of research reported that feature-based similarity grouping (e.g., grouping based on color, size, or shape) among targets in MOT tasks can improve tracking performance. However, no additive feature-based grouping effects have been reported where the tracking objects had two or more features. "Additive effect" refers to a greater grouping effect produced by grouping based on multiple cues instead of one cue. Can spatial symmetry produce a grouping effect similar to that of feature similarity in MOT tasks? Are the grouping effects based on symmetry and feature similarity additive? This study includes four experiments to address these questions. The results of Experiments 1 and 2 demonstrated the automatic symmetry-based grouping effects. More importantly, an additive grouping effect of symmetry and feature similarity was observed in Experiments 3 and 4. Our findings indicate that symmetry can produce an enhanced grouping effect in MOT and facilitate the grouping effect based on color or shape similarity. The "where" and "what" pathways might have played an important role in the additive grouping effect.

  2. Spontaneous regression of multiple pulmonary nodules in a patient with unclassified renal cell carcinoma following laparoscopic partial nephrectomy: A case report

    PubMed Central

    UEDA, KOSUKE; SUEKANE, SHIGETAKA; MITANI, TOMOTARO; CHIKUI, KATSUAKI; EJIMA, KAZUHISA; SUYAMA, SHUNSUKE; NAKIRI, MAKOTO; NISHIHARA, KIYOAKI; MATSUO, MITSUNORI; IGAWA, TSUKASA

    2016-01-01

    Spontaneous regression of metastatic renal cell carcinoma (RCC) is rare, but well-documented in clear cell RCC. However, there are no reports on spontaneous regression of unclassified RCC. Since the radiological findings of pulmonary infarcts and inflammatory pseudotumors are similar to those of metastases from RCC, a definitive diagnosis is difficult without performing a histological examination. A 56-year-old woman underwent medical examination by a physician. An abdominal computed tomography (CT) scan revealed a 22-mm mass with a cystic area in the right kidney, as well as multiple enlarged lymph nodes in the common iliac, external iliac and groin areas, bilaterally. A chest CT revealed multiple pulmonary nodules bilaterally, the largest measuring 15 mm. Since the right renal tumor was suspected to be an RCC, laparoscopic partial nephrectomy was performed. The final pathological diagnosis of the renal tumor was unclassified RCC. One month following surgery, a CT scan revealed spontaneous regression of the pulmonary nodules. We herein present a rare case of spontaneous regression of pulmonary nodules in a patient with unclassified RCC following laparoscopic partial nephrectomy. To the best of our knowledge, this is the first case of spontaneous regression in unclassified RCC. PMID:27330764

  3. Using multiple regression, Bayesian networks and artificial neural networks for prediction of total egg production in European quails based on earlier expressed phenotypes.

    PubMed

    Felipe, Vivian P S; Silva, Martinho A; Valente, Bruno D; Rosa, Guilherme J M

    2015-04-01

    The prediction of total egg production (TEP) potential in poultry is an important task to aid optimized management decisions in commercial enterprises. The objective of the present study was to compare different modeling approaches for prediction of TEP in meat type quails (Coturnix coturnix coturnix) using phenotypes such as weight, weight gain, egg production and egg quality measurements. Phenotypic data on 30 traits from two lines (L1, n=180; and L2, n=205) of quail were modeled to predict TEP. Prediction models included multiple linear regression and artificial neural network (ANN). Moreover, Bayesian network (BN) and a stepwise approach were used as variable selection methods. BN results showed that TEP is independent from other earlier expressed traits when conditioned on egg production from 35 to 80 days of age (EP1). In addition, the prediction accuracy was much lower when EP1 was not included in the model. The best predictive model was ANN, after feature selection, showing prediction correlations of r=0.792 and r=0.714 for L1 and L2, respectively. In conclusion, machine learning methods may be useful, but reasonable prediction accuracies are obtained only when partial egg production measurements are included in the model.

  4. Development of multiple linear regression models as predictive tools for fecal indicator concentrations in a stretch of the lower Lahn River, Germany.

    PubMed

    Herrig, Ilona M; Böer, Simone I; Brennholt, Nicole; Manz, Werner

    2015-11-15

    Since rivers are typically subject to rapid changes in microbiological water quality, tools are needed to allow timely water quality assessment. A promising approach is the application of predictive models. In our study, we developed multiple linear regression (MLR) models in order to predict the abundance of the fecal indicator organisms Escherichia coli (EC), intestinal enterococci (IE) and somatic coliphages (SC) in the Lahn River, Germany. The models were developed on the basis of an extensive set of environmental parameters collected during a 12-month monitoring period. Two models were developed for each type of indicator: 1) an extended model including the maximum number of variables significantly explaining variations in indicator abundance and 2) a simplified model reduced to the three most influential explanatory variables, thus obtaining a model which is less resource-intensive with regard to required data. Both approaches have the ability to model multiple sites within one river stretch. The three most important predictive variables in the optimized models for the bacterial indicators were NH4-N, turbidity and global solar irradiance, whereas chlorophyll a content, discharge and NH4-N were reliable model variables for somatic coliphages. Depending on indicator type, the extended models also included the additional variables rainfall, O2 content, pH and chlorophyll a. The extended models could explain 69% (EC), 74% (IE) and 72% (SC) of the observed variance in fecal indicator concentrations. The optimized models explained 65% (EC), 70% (IE) and 68% (SC) of the observed variance in fecal indicator concentrations. Site-specific efficiencies ranged up to 82% (EC) and 81% (IE, SC). Our results suggest that MLR models are a promising tool for a timely water quality assessment in the Lahn area. PMID:26318647

  5. Partial Least Squares Regression Can Aid in Detecting Differential Abundance of Multiple Features in Sets of Metagenomic Samples.

    PubMed

    Libiger, Ondrej; Schork, Nicholas J

    2015-01-01

    It is now feasible to examine the composition and diversity of microbial communities (i.e., "microbiomes") that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology "Metastats" across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions obtained

  6. Partial Least Squares Regression Can Aid in Detecting Differential Abundance of Multiple Features in Sets of Metagenomic Samples

    PubMed Central

    Libiger, Ondrej; Schork, Nicholas J.

    2015-01-01

    It is now feasible to examine the composition and diversity of microbial communities (i.e., “microbiomes”) that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology “Metastats” across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions

  7. Guide to using Multiple Regression in Excel (MRCX v.1.1) for Removal of River Stage Effects from Well Water Levels

    SciTech Connect

    Mackley, Rob D.; Spane, Frank A.; Pulsipher, Trenton C.; Allwardt, Craig H.

    2010-09-01

    A software tool was created in Fiscal Year 2010 (FY10) that enables multiple-regression correction of well water levels for river-stage effects. This task was conducted as part of the Remediation Science and Technology project of CH2MHILL Plateau Remediation Company (CHPRC). This document contains an overview of the correction methodology and a user’s manual for Multiple Regression in Excel (MRCX) v.1.1. It also contains a step-by-step tutorial that shows users how to use MRCX to correct river effects in two different wells. This report is accompanied by an enclosed CD that contains the MRCX installer application and files used in the tutorial exercises.

  8. Multiple Linear Regression Analysis of Factors Affecting Real Property Price Index From Case Study Research In Istanbul/Turkey

    NASA Astrophysics Data System (ADS)

    Denli, H. H.; Koc, Z.

    2015-12-01

    Valuing real properties against fixed standards is difficult to apply consistently across time and location. Regression analysis constructs mathematical models that describe or explain relationships that may exist between variables. The problem of identifying price differences between properties to obtain a price index can be converted into a regression problem, and standard techniques of regression analysis can be used to estimate the index. Applied to real estate valuation, with properties taken from the actual market together with their current characteristics and quantifiers, the method helps to find the factors or variables that are effective in the formation of value. In this study, prices of housing for sale in Zeytinburnu, a district in Istanbul, are associated with the characteristics of the housing to find a price index, based on information received from a real estate web page. The variables used for the analysis are age, size in m², number of floors in the building, floor number of the estate and number of rooms. The price of the estate is the dependent variable, whereas the rest are independent variables. Prices from 60 real estates have been used for the analysis. Locations of equal price value have been found and plotted on the map, and equivalence curves have been drawn identifying the same-valued zones as lines.
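
    The regression set-up described above (price as the dependent variable, five property characteristics as independent variables, 60 listings) can be sketched as an ordinary least squares fit. This is only an illustration under stated assumptions: the listings are synthetic and noise-free, the coefficients in TRUE_BETA are invented, and the helper names are hypothetical; none of it reproduces the study's data or results.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented coefficients: intercept, age, size, building floors, floor number, rooms.
TRUE_BETA = [50.0, -0.8, 1.5, 2.0, 1.0, 10.0]

rng = random.Random(0)
listings = []
for _ in range(60):  # 60 synthetic listings, matching the sample size above
    floors = rng.randint(1, 10)
    listings.append((
        rng.uniform(0, 40),      # age in years
        rng.uniform(50, 200),    # size in square metres
        floors,                  # number of floors in the building
        rng.randint(0, floors),  # floor number of the unit
        rng.randint(1, 5),       # number of rooms
    ))
prices = [TRUE_BETA[0] + sum(b * x for b, x in zip(TRUE_BETA[1:], row))
          for row in listings]

beta_hat = fit_ols(listings, prices)
print([round(b, 3) for b in beta_hat])
```

    Because the synthetic prices contain no noise, the fitted coefficients should match the invented ones up to rounding error; with real listings the fit would be approximate and each coefficient would need a significance check.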

  9. Statistically Differentiating between Interaction and Nonlinearity in Multiple Regression Analysis: A Monte Carlo Investigation of a Recommended Strategy.

    ERIC Educational Resources Information Center

    Kromrey, Jeffrey D.; Foster-Johnson, Lynn

    1999-01-01

    Shows that the procedure recommended by D. Lubinski and L. Humphreys (1990) for differentiating between moderated and nonlinear regression models evidences statistical problems characteristic of stepwise procedures. Interprets Monte Carlo results in terms of the researchers' need to differentiate between exploratory and confirmatory aspects of…

  10. Multiple Regression with Varying Levels of Correlation among Predictors: Monte Carlo Sampling from Normal and Non-Normal Populations.

    ERIC Educational Resources Information Center

    Vasu, Ellen Storey

    1978-01-01

    The effects of the violation of the assumption of normality in the conditional distributions of the dependent variable, coupled with the condition of multicollinearity upon the outcome of testing the hypothesis that the regression coefficient equals zero, are investigated via a Monte Carlo study. (Author/JKS)

  11. Photocatalyzed multiple additions of amines to α,β-unsaturated esters and nitriles

    SciTech Connect

    Das, S.; Kumar, J.S.D.; Thomas, K.G.; Shivaramayya, K.; George, M.V.

    1994-02-11

    Photoelectron-transfer-catalyzed intermolecular carbon-carbon bond formation of primary, secondary, and tertiary amines with α,β-unsaturated esters and nitriles using photosensitizers such as anthraquinone, acridone, and dicyanoanthracene has been investigated. The addition of α-aminoalkyl radicals, generated via photoelectron-transfer processes, to olefinic substrates and the subsequent 1,5-hydrogen abstraction reactions of the amine-olefin adduct radicals lead to a number of interesting multiple-olefin-added products. The adducts of the primary and secondary amines with α,β-unsaturated esters undergo further cyclizations to give spiro and cyclic lactams, respectively.

  12. Recursive ideal observer detection of known M-ary signals in multiplicative and additive Gaussian noise.

    NASA Technical Reports Server (NTRS)

    Painter, J. H.; Gupta, S. C.

    1973-01-01

    This paper presents the derivation of the recursive algorithms necessary for real-time digital detection of M-ary known signals that are subject to independent multiplicative and additive Gaussian noises. The motivating application is minimum probability of error detection of digital data-link messages aboard civil aircraft in the earth reflection multipath environment. For each known signal, the detector contains one Kalman filter and one probability computer. The filters estimate the multipath disturbance. The estimates and the received signal drive the probability computers. Outputs of all the computers are compared in amplitude to give the signal decision. The practicality and usefulness of the detector are extensively discussed.

  13. Comparison of multiple linear regression, partial least squares and artificial neural networks for prediction of gas chromatographic relative retention times of trimethylsilylated anabolic androgenic steroids.

    PubMed

    Fragkaki, A G; Farmaki, E; Thomaidis, N; Tsantili-Kakoulidou, A; Angelis, Y S; Koupparis, M; Georgakopoulos, C

    2012-09-21

    The comparison among different modelling techniques, such as multiple linear regression, partial least squares and artificial neural networks, has been performed in order to construct and evaluate models for prediction of gas chromatographic relative retention times of trimethylsilylated anabolic androgenic steroids. The performance of the quantitative structure-retention relationship study, using the multiple linear regression and partial least squares techniques, has been previously conducted. In the present study, artificial neural networks models were constructed and used for the prediction of relative retention times of anabolic androgenic steroids, while their efficiency is compared with that of the models derived from the multiple linear regression and partial least squares techniques. For overall ranking of the models, a novel procedure [Trends Anal. Chem. 29 (2010) 101-109] based on sum of ranking differences was applied, which permits the best model to be selected. The suggested models are considered useful for the estimation of relative retention times of designer steroids for which no analytical data are available.

  14. Modulation of orientation-selective neurons by motion: when additive, when multiplicative?

    PubMed Central

    Lüdge, Torsten; Urbanczik, Robert; Senn, Walter

    2014-01-01

    The recurrent interaction among orientation-selective neurons in the primary visual cortex (V1) is suited to enhance contours in a noisy visual scene. Motion is known to have a strong pop-up effect in perceiving contours, but how motion-sensitive neurons in V1 support contour detection remains vastly elusive. Here we suggest how the various types of motion-sensitive neurons observed in V1 should be wired together in a micro-circuitry to optimally extract contours in the visual scene. Motion-sensitive neurons can be selective about the direction of motion occurring at some spot or respond equally to all directions (pandirectional). We show that, in the light of figure-ground segregation, direction-selective motion neurons should additively modulate the corresponding orientation-selective neurons with preferred orientation orthogonal to the motion direction. In turn, to maximally enhance contours, pandirectional motion neurons should multiplicatively modulate all orientation-selective neurons with co-localized receptive fields. This multiplicative modulation amplifies the local V1-circuitry among co-aligned orientation-selective neurons for detecting elongated contours. We suggest that the additive modulation by direction-specific motion neurons is achieved through synaptic projections to the somatic region, and the multiplicative modulation by pandirectional motion neurons through projections to the apical region of orientation-specific pyramidal neurons. For the purpose of contour detection, the V1-intrinsic integration of motion information is advantageous over a downstream integration as it exploits the recurrent V1-circuitry designed for that task. PMID:24999328

  15. Application of least squares support vector regression and linear multiple regression for modeling removal of methyl orange onto tin oxide nanoparticles loaded on activated carbon and activated carbon prepared from Pistacia atlantica wood.

    PubMed

    Ghaedi, M; Rahimi, Mahmoud Reza; Ghaedi, A M; Tyagi, Inderjeet; Agarwal, Shilpi; Gupta, Vinod Kumar

    2016-01-01

    Two novel and eco-friendly adsorbents, namely tin oxide nanoparticles loaded on activated carbon (SnO2-NP-AC) and activated carbon prepared from the wood of the tree Pistacia atlantica (AC-PAW), were used for the rapid removal and fast adsorption of methyl orange (MO) from the aqueous phase. The dependency of MO removal on various influential adsorption parameters was well modeled and optimized using multiple linear regression (MLR) and least squares support vector regression (LSSVR). The optimal parameters for the LSSVR model were found to be a γ value of 0.76 and a σ² of 0.15. For the testing data set, a mean square error (MSE) of 0.0010 and a coefficient of determination (R²) of 0.976 were obtained for the LSSVR model, and an MSE of 0.0037 and an R² of 0.897 were obtained for the MLR model. The adsorption equilibrium and kinetic data were found to be well fitted by, and in good agreement with, the Langmuir isotherm model and the second-order equation and intra-particle diffusion models, respectively. A small amount of the proposed SnO2-NP-AC and AC-PAW (0.015 g and 0.08 g) is sufficient for successful rapid removal of methyl orange (>95%). The maximum adsorption capacities of SnO2-NP-AC and AC-PAW were 250 mg g⁻¹ and 125 mg g⁻¹, respectively. PMID:26414425
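
    The two fit statistics quoted above, MSE and the coefficient of determination, can be computed as in the sketch below. It assumes the common definition R² = 1 - SS_res / SS_tot, which may differ from the definition the authors used (for example a squared correlation), and the observed and predicted values are made up for illustration.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up test-set values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
mse = mean_squared_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"MSE = {mse:.4f}, R^2 = {r2:.3f}")
```

    Comparing two models on the same held-out data with these two numbers is the kind of comparison the abstract reports for LSSVR and MLR.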

  16. Is the structural diversity of tripeptides sufficient for developing functional food additives with satisfactory multiple bioactivities?

    NASA Astrophysics Data System (ADS)

    Wang, Jian-Hui; Liu, Yong-Le; Ning, Jing-Heng; Yu, Jian; Li, Xiang-Hong; Wang, Fa-Xiang

    2013-05-01

    Multifunctional peptides have attracted increasing attention in the food science community because of their therapeutic potential, low toxicity and rapid intestinal absorption. However, previous study demonstrated that the limited structural variations make it difficult to optimize dipeptide molecules in a good balance between desirable and undesirable properties (F. Tian, P. Zhou, F. Lv, R. Song, Z. Li, J. Pept. Sci. 13 (2007) 549-566). In the present work, we attempt to answer whether the structural diversity is sufficient for a tripeptide to have satisfactory multiple bioactivities. Statistical test, structural examination and energetic analysis confirm that peptides of three amino acids long can bind tightly to human angiotensin converting enzyme (ACE) and thus exert significant antihypertensive efficacy. Further quantitative structure-activity relationship (QSAR) modeling and prediction of all 8000 possible tripeptides reveal that their ACE-inhibitory potency exhibits a good (positive) relationship to antioxidative activity, but has only a quite modest correlation with bitterness. This means that it is possible to find certain tripeptide entities possessing the optimal combination of strong ACE-inhibitory potency, high antioxidative activity and weak bitter taste, which are the promising candidates for developing multifunctional food additives with satisfactory multiple bioactivities. The marked difference between dipeptide and tripeptide can be attributed to the fact that the structural diversity of peptides increases dramatically with a slight change in sequence length.
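
    The figure of 8000 possible tripeptides, and the jump in structural diversity from dipeptides to tripeptides, follows from simple counting if the 20 standard amino acids are assumed as building blocks (an assumption; the abstract does not list the alphabet). A short sketch, with a hypothetical helper name:

```python
from itertools import product

# One-letter codes of the 20 standard amino acids (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_peptides(length):
    """Return every sequence of the given length over the assumed alphabet."""
    return ["".join(residues) for residues in product(AMINO_ACIDS, repeat=length)]


dipeptides = enumerate_peptides(2)
tripeptides = enumerate_peptides(3)
print(len(dipeptides), len(tripeptides))  # 20**2 and 20**3
```

    Adding one residue multiplies the number of candidate sequences by 20, which is the sense in which diversity increases dramatically with a slight change in sequence length.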

  17. A new approach to handle additive and multiplicative uncertainties in the measurement for H∞ LPV filtering

    NASA Astrophysics Data System (ADS)

    Lacerda, Márcio J.; Tognetti, Eduardo S.; Oliveira, Ricardo C. L. F.; Peres, Pedro L. D.

    2016-04-01

    This paper presents a general framework to cope with full-order H∞ linear parameter-varying (LPV) filter design subject to inexactly measured parameters. The main novelty is the ability of handling additive and multiplicative uncertainties in the measurements, for both continuous and discrete-time LPV systems, in a unified approach. By conveniently modelling scheduling parameters and uncertainties affecting the measurements, the H∞ filter design problem can be expressed in terms of robust matrix inequalities that become linear when two scalar parameters are fixed. Therefore, the proposed conditions can be efficiently solved through linear matrix inequality relaxations based on polynomial solutions. Numerical examples are presented to illustrate the improved efficiency of the proposed approach when compared to other methods and, more important, its capability to deal with scenarios where the available strategies in the literature cannot be used.

  18. Generalized Additive Mixed-Models for Pharmacology Using Integrated Discrete Multiple Organ Co-Culture

    PubMed Central

    Ingersoll, Thomas; Cole, Stephanie; Madren-Whalley, Janna; Booker, Lamont; Dorsey, Russell; Li, Albert; Salem, Harry

    2016-01-01

    Integrated Discrete Multiple Organ Co-culture (IDMOC) is emerging as an in-vitro alternative to in-vivo animal models for pharmacology studies. IDMOC allows dose-response relationships to be investigated at the tissue and organoid levels, yet, these relationships often exhibit responses that are far more complex than the binary responses often measured in whole animals. To accommodate departure from binary endpoints, IDMOC requires an expansion of analytic techniques beyond simple linear probit and logistic models familiar in toxicology. IDMOC dose-responses may be measured at continuous scales, exhibit significant non-linearity such as local maxima or minima, and may include non-independent measures. Generalized additive mixed-modeling (GAMM) provides an alternative description of dose-response that relaxes assumptions of independence and linearity. We compared GAMMs to traditional linear models for describing dose-response in IDMOC pharmacology studies. PMID:27110941

  19. Improved spatial regression analysis of diffusion tensor imaging for lesion detection during longitudinal progression of multiple sclerosis in individual subjects

    NASA Astrophysics Data System (ADS)

    Liu, Bilan; Qiu, Xing; Zhu, Tong; Tian, Wei; Hu, Rui; Ekholm, Sven; Schifitto, Giovanni; Zhong, Jianhui

    2016-03-01

    Subject-specific longitudinal DTI study is vital for investigation of pathological changes of lesions and disease evolution. Spatial Regression Analysis of Diffusion tensor imaging (SPREAD) is a non-parametric permutation-based statistical framework that combines spatial regression and resampling techniques to achieve effective detection of localized longitudinal diffusion changes within the whole brain at individual level without a priori hypotheses. However, boundary blurring and dislocation limit its sensitivity, especially towards detecting lesions of irregular shapes. In the present study, we propose an improved SPREAD (dubbed improved SPREAD, or iSPREAD) method by incorporating a three-dimensional (3D) nonlinear anisotropic diffusion filtering method, which provides edge-preserving image smoothing through a nonlinear scale space approach. The statistical inference based on iSPREAD was evaluated and compared with the original SPREAD method using both simulated and in vivo human brain data. Results demonstrated that the sensitivity and accuracy of the SPREAD method has been improved substantially by adapting nonlinear anisotropic filtering. iSPREAD identifies subject-specific longitudinal changes in the brain with improved sensitivity, accuracy, and enhanced statistical power, especially when the spatial correlation is heterogeneous among neighboring image pixels in DTI.

  20. Improved spatial regression analysis of diffusion tensor imaging for lesion detection during longitudinal progression of multiple sclerosis in individual subjects.

    PubMed

    Liu, Bilan; Qiu, Xing; Zhu, Tong; Tian, Wei; Hu, Rui; Ekholm, Sven; Schifitto, Giovanni; Zhong, Jianhui

    2016-03-21

    Subject-specific longitudinal DTI study is vital for investigation of pathological changes of lesions and disease evolution. Spatial Regression Analysis of Diffusion tensor imaging (SPREAD) is a non-parametric permutation-based statistical framework that combines spatial regression and resampling techniques to achieve effective detection of localized longitudinal diffusion changes within the whole brain at individual level without a priori hypotheses. However, boundary blurring and dislocation limit its sensitivity, especially towards detecting lesions of irregular shapes. In the present study, we propose an improved SPREAD (dubbed improved SPREAD, or iSPREAD) method by incorporating a three-dimensional (3D) nonlinear anisotropic diffusion filtering method, which provides edge-preserving image smoothing through a nonlinear scale space approach. The statistical inference based on iSPREAD was evaluated and compared with the original SPREAD method using both simulated and in vivo human brain data. Results demonstrated that the sensitivity and accuracy of the SPREAD method have been improved substantially by adopting nonlinear anisotropic filtering. iSPREAD identifies subject-specific longitudinal changes in the brain with improved sensitivity, accuracy, and enhanced statistical power, especially when the spatial correlation is heterogeneous among neighboring image pixels in DTI. PMID:26948513

  1. Study relationship between inorganic and organic coal analysis with gross calorific value by multiple regression and ANFIS

    USGS Publications Warehouse

    Chelgani, S.C.; Hart, B.; Grady, W.C.; Hower, J.C.

    2011-01-01

    The relationship between maceral content plus mineral matter and gross calorific value (GCV) for a wide range of West Virginia coal samples (from 6518 to 15330 BTU/lb; 15.16 to 35.66 MJ/kg) has been investigated by multivariable regression and adaptive neuro-fuzzy inference system (ANFIS). A stepwise least-squares comparison of liptinite, vitrinite, plus mineral matter as input data sets with measured GCV gave a nonlinear correlation coefficient (R2) of 0.83. Using the same data set, the correlation between the GCV predicted by the ANFIS model and the actual GCV gave an R2 value of 0.96. It was determined that the GCV-based prediction methods, as used in this article, can provide a reasonable estimation of GCV. Copyright © Taylor & Francis Group, LLC.

  2. Multiple Linkage Disequilibrium Mapping Methods to Validate Additive Quantitative Trait Loci in Korean Native Cattle (Hanwoo).

    PubMed

    Li, Yi; Kim, Jong-Joo

    2015-07-01

    The efficiency of genome-wide association analysis (GWAS) depends on power of detection for quantitative trait loci (QTL) and precision for QTL mapping. In this study, three different strategies for GWAS were applied to detect QTL for carcass quality traits in the Korean cattle, Hanwoo: a linkage disequilibrium single locus regression method (LDRM), a combined linkage and linkage disequilibrium analysis (LDLA) and a BayesCπ approach. The phenotypes of 486 steers were collected for weaning weight (WWT), yearling weight (YWT), carcass weight (CWT), backfat thickness (BFT), longissimus dorsi muscle area, and marbling score (Marb). Also the genotype data for the steers and their sires were scored with the Illumina bovine 50K single nucleotide polymorphism (SNP) chips. For the two former GWAS methods, threshold values were set at false discovery rate <0.01 on a chromosome-wide level, while a cut-off threshold value was set in the latter model, such that the top five windows, each of which comprised 10 adjacent SNPs, were chosen with significant variation for the phenotype. Four major additive QTL from these three methods had high concordance, found in 64.1 to 64.9 Mb for Bos taurus autosome (BTA) 7 for WWT, 24.3 to 25.4 Mb for BTA14 for CWT, 0.5 to 1.5 Mb for BTA6 for BFT and 26.3 to 33.4 Mb for BTA29 for BFT. Several candidate genes (i.e. glutamate receptor, ionotropic, AMPA 1 [GRIA1], family with sequence similarity 110, member B [FAM110B], and thymocyte selection-associated high mobility group box [TOX]) may be identified close to these QTL. Our result suggests that the use of different linkage disequilibrium mapping approaches can provide more reliable chromosome regions to further pinpoint DNA markers or causative genes in these regions.

  3. Multiple Linkage Disequilibrium Mapping Methods to Validate Additive Quantitative Trait Loci in Korean Native Cattle (Hanwoo).

    PubMed

    Li, Yi; Kim, Jong-Joo

    2015-07-01

    The efficiency of genome-wide association analysis (GWAS) depends on power of detection for quantitative trait loci (QTL) and precision for QTL mapping. In this study, three different strategies for GWAS were applied to detect QTL for carcass quality traits in the Korean cattle, Hanwoo: a linkage disequilibrium single locus regression method (LDRM), a combined linkage and linkage disequilibrium analysis (LDLA) and a BayesCπ approach. The phenotypes of 486 steers were collected for weaning weight (WWT), yearling weight (YWT), carcass weight (CWT), backfat thickness (BFT), longissimus dorsi muscle area, and marbling score (Marb). Also the genotype data for the steers and their sires were scored with the Illumina bovine 50K single nucleotide polymorphism (SNP) chips. For the two former GWAS methods, threshold values were set at false discovery rate <0.01 on a chromosome-wide level, while a cut-off threshold value was set in the latter model, such that the top five windows, each of which comprised 10 adjacent SNPs, were chosen with significant variation for the phenotype. Four major additive QTL from these three methods had high concordance, found in 64.1 to 64.9 Mb for Bos taurus autosome (BTA) 7 for WWT, 24.3 to 25.4 Mb for BTA14 for CWT, 0.5 to 1.5 Mb for BTA6 for BFT and 26.3 to 33.4 Mb for BTA29 for BFT. Several candidate genes (i.e. glutamate receptor, ionotropic, AMPA 1 [GRIA1], family with sequence similarity 110, member B [FAM110B], and thymocyte selection-associated high mobility group box [TOX]) may be identified close to these QTL. Our result suggests that the use of different linkage disequilibrium mapping approaches can provide more reliable chromosome regions to further pinpoint DNA markers or causative genes in these regions. PMID:26104396

  4. Application of the deletion/substitution/addition algorithm to selecting land use regression models for interpolating air pollution measurements in California

    NASA Astrophysics Data System (ADS)

    Beckerman, Bernardo S.; Jerrett, Michael; Martin, Randall V.; van Donkelaar, Aaron; Ross, Zev; Burnett, Richard T.

    2013-10-01

    Land use regression (LUR) models are widely employed in health studies to characterize chronic exposure to air pollution. The LUR is essentially an interpolation technique that employs the pollutant of interest as the dependent variable with proximate land use, traffic, and physical environmental variables used as independent predictors. Two major limitations with this method have not been addressed: (1) variable selection in the model building process, and (2) dealing with unbalanced repeated measures. In this paper, we address these issues with a modeling framework that implements the deletion/substitution/addition (DSA) machine learning algorithm that uses a generalized linear model to average over unbalanced temporal observations. Models were derived for fine particulate matter with aerodynamic diameter of 2.5 microns or less (PM2.5) and nitrogen dioxide (NO2) using monthly observations. We used 4119 observations at 108 sites and 15,301 observations at 138 sites for PM2.5 and NO2, respectively. We derived models with good predictive capacity (cross-validated-R2 values were 0.65 and 0.71 for PM2.5 and NO2, respectively). By addressing these two shortcomings in current approaches to LUR modeling, we have developed a framework that minimizes arbitrary decisions during the model selection process. We have also demonstrated how to integrate temporally unbalanced data in a theoretically sound manner. These developments could have widespread applicability for future LUR modeling efforts.

  5. Forecasting hourly PM(10) concentration in Cyprus through artificial neural networks and multiple regression models: implications to local environmental management.

    PubMed

    Paschalidou, Anastasia K; Karakitsios, Spyridon; Kleanthous, Savvas; Kassomenos, Pavlos A

    2011-02-01

    In the present work, two types of artificial neural network (NN) models using the multilayer perceptron (MLP) and the radial basis function (RBF) techniques, as well as a model based on principal component regression analysis (PCRA), are employed to forecast hourly PM(10) concentrations in four urban areas (Larnaca, Limassol, Nicosia and Paphos) in Cyprus. The model development is based on a variety of meteorological and pollutant parameters corresponding to the 2-year period between July 2006 and June 2008, and the model evaluation is achieved through the use of a series of well-established evaluation instruments and methodologies. The evaluation reveals that the MLP NN models display the best forecasting performance with R² values ranging between 0.65 and 0.76, whereas the RBF NNs and the PCRA models reveal a rather weak performance with R² values between 0.37-0.43 and 0.33-0.38, respectively. The derived MLP models are also used to forecast Saharan dust episodes with remarkable success (probability of detection ranging between 0.68 and 0.71). On the whole, the analysis shows that the models introduced here could provide local authorities with reliable and precise predictions and alarms about air quality if used on an operational basis. PMID:20652425

  6. Forecasting hourly PM(10) concentration in Cyprus through artificial neural networks and multiple regression models: implications to local environmental management.

    PubMed

    Paschalidou, Anastasia K; Karakitsios, Spyridon; Kleanthous, Savvas; Kassomenos, Pavlos A

    2011-02-01

    In the present work, two types of artificial neural network (NN) models using the multilayer perceptron (MLP) and the radial basis function (RBF) techniques, as well as a model based on principal component regression analysis (PCRA), are employed to forecast hourly PM(10) concentrations in four urban areas (Larnaca, Limassol, Nicosia and Paphos) in Cyprus. The model development is based on a variety of meteorological and pollutant parameters corresponding to the 2-year period between July 2006 and June 2008, and the model evaluation is achieved through the use of a series of well-established evaluation instruments and methodologies. The evaluation reveals that the MLP NN models display the best forecasting performance with R² values ranging between 0.65 and 0.76, whereas the RBF NNs and the PCRA models reveal a rather weak performance with R² values between 0.37-0.43 and 0.33-0.38, respectively. The derived MLP models are also used to forecast Saharan dust episodes with remarkable success (probability of detection ranging between 0.68 and 0.71). On the whole, the analysis shows that the models introduced here could provide local authorities with reliable and precise predictions and alarms about air quality if used on an operational basis.

  7. Radiologic assessment of third molar tooth and spheno-occipital synchondrosis for age estimation: a multiple regression analysis study.

    PubMed

    Demirturk Kocasarac, Husniye; Sinanoglu, Alper; Noujeim, Marcel; Helvacioglu Yigit, Dilek; Baydemir, Canan

    2016-05-01

    For forensic age estimation, radiographic assessment of third molar mineralization is important between 14 and 21 years, which coincides with the legal age in most countries. The spheno-occipital synchondrosis (SOS) is an important growth site during development, and its use for age estimation is beneficial when combined with other markers. In this study, we aimed to develop a regression model to estimate and narrow the age range based on the radiologic assessment of third molar and SOS in a Turkish subpopulation. Panoramic radiographs and cone beam CT scans of 349 subjects (182 males, 167 females) aged between 8 and 25 were evaluated. A four-stage system was used to evaluate the degree of SOS fusion, and Demirjian's eight stages of development were used for third molar calcification. The Pearson correlation indicated a strong positive relationship between age and third molar calcification for both sexes (r = 0.850 for females, r = 0.839 for males, P < 0.001) and also between age and SOS fusion for females (r = 0.814), but a moderate relationship was found for males (r = 0.599; P < 0.001). Based on the results obtained, an age determination formula using these scores was established.

  8. Measuring decision weights in recognition experiments with multiple response alternatives: comparing the correlation and multinomial-logistic-regression methods.

    PubMed

    Dai, Huanping; Micheyl, Christophe

    2012-11-01

    Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.

  9. Development of a regression model to predict copper toxicity to Daphnia magna and site-specific copper criteria across multiple surface-water drainages in an arid landscape.

    PubMed

    Fulton, Barry A; Meyer, Joseph S

    2014-08-01

    The water effect ratio (WER) procedure developed by the US Environmental Protection Agency is commonly used to derive site-specific criteria for point-source metal discharges into perennial waters. However, experience is limited with this method in the ephemeral and intermittent systems typical of arid climates. The present study presents a regression model to develop WER-based site-specific criteria for a network of ephemeral and intermittent streams influenced by nonpoint sources of Cu in the southwestern United States. Acute (48-h) Cu toxicity tests were performed concurrently with Daphnia magna in site water samples and hardness-matched laboratory waters. Median effect concentrations (EC50s) for Cu in site water samples (n=17) varied by more than 12-fold, and the range of calculated WER values was similar. Statistically significant (α=0.05) univariate predictors of site-specific Cu toxicity included (in sequence of decreasing significance) dissolved organic carbon (DOC), hardness/alkalinity ratio, alkalinity, K, and total dissolved solids. A multiple-regression model developed from a combination of DOC and alkalinity explained 85% of the toxicity variability in site water samples, providing a strong predictive tool that can be used in the WER framework when site-specific criteria values are derived. The biotic ligand model (BLM) underpredicted toxicity in site waters by more than 2-fold. Adjustments to the default BLM parameters improved the model's performance but did not provide a better predictive tool compared with the regression model developed from DOC and alkalinity.

  10. Binary Logistic Regression Versus Boosted Regression Trees in Assessing Landslide Susceptibility for Multiple-Occurring Regional Landslide Events: Application to the 2009 Storm Event in Messina (Sicily, southern Italy).

    NASA Astrophysics Data System (ADS)

    Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.

    2014-12-01

    This study aims at comparing the performances of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams, both stretching for a few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian sea. This area was struck on 1 October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of the debris flow and debris avalanche types, involving the weathered layer of a low to high grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; in addition, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-fold tests so as to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT models reached a higher prediction performance than BLR models for RP-based modelling, whilst for the SP-based models the difference in predictive skill between the two methods dropped drastically, converging to an analogous excellent performance. 
However, when looking at the precision of the probability estimates, BLR demonstrated to produce more robust

  11. Automated microbial metabolism laboratory. [design of advanced labeled release experiment based on single addition of soil and multiple sequential additions of media into test chambers

    NASA Technical Reports Server (NTRS)

    1974-01-01

    The design and rationale of an advanced labeled release experiment based on single addition of soil and multiple sequential additions of media into each of four test chambers are outlined. The feasibility for multiple addition tests was established and various details of the methodology were studied. The four chamber battery of tests includes: (1) determination of the effect of various atmospheric gases and selection of that gas which produces an optimum response; (2) determination of the effect of incubation temperature and selection of the optimum temperature for performing Martian biochemical tests; (3) sterile soil is dosed with a battery of C-14 labeled substrates and subjected to experimental temperature range; and (4) determination of the possible inhibitory effects of water on Martian organisms is performed initially by dosing with 0.01 ml and 0.5 ml of medium, respectively. A series of specifically labeled substrates are then added to obtain patterns in metabolic 14CO2 evolution.

  12. Diplotype Trend Regression Analysis of the ADH Gene Cluster and the ALDH2 Gene: Multiple Significant Associations with Alcohol Dependence

    PubMed Central

    Luo, Xingguang; Kranzler, Henry R.; Zuo, Lingjun; Wang, Shuang; Schork, Nicholas J.; Gelernter, Joel

    2006-01-01

    The set of alcohol-metabolizing enzymes has considerable genetic and functional complexity. The relationships between some alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH) genes and alcohol dependence (AD) have long been studied in many populations, but not comprehensively. In the present study, we genotyped 16 markers within the ADH gene cluster (including the ADH1A, ADH1B, ADH1C, ADH5, ADH6, and ADH7 genes), 4 markers within the ALDH2 gene, and 38 unlinked ancestry-informative markers in a case-control sample of 801 individuals. Associations between markers and disease were analyzed by a Hardy-Weinberg equilibrium (HWE) test, a conventional case-control comparison, a structured association analysis, and a novel diplotype trend regression (DTR) analysis. Finally, the disease alleles were fine mapped by a Hardy-Weinberg disequilibrium (HWD) measure (J). All markers were found to be in HWE in controls, but some markers showed HWD in cases. Genotypes of many markers were associated with AD. DTR analysis showed that ADH5 genotypes and diplotypes of ADH1A, ADH1B, ADH7, and ALDH2 were associated with AD in European Americans and/or African Americans. The risk-influencing alleles were fine mapped from among the markers studied and were found to coincide with some well-known functional variants. We demonstrated that DTR was more powerful than many other conventional association methods. We also found that several ADH genes and the ALDH2 gene were susceptibility loci for AD, and the associations were best explained by several independent risk genes. PMID:16685648

  13. Investigating the possible effects of trauma experiences and 5-HTT on the dissociative experiences of patients with OCD using path analysis and multiple regression.

    PubMed

    Lochner, Christine; Seedat, Soraya; Hemmings, Sian M J; Moolman-Smook, Johanna C; Kidd, Martin; Stein, Dan J

    2007-01-01

    Dissociation is defined as the disruption of the usually integrated functions of consciousness, such as memory, identity, and perceptions of the environment. Causes include various psychological, neurological and neurobiological mechanisms, none of which have been consistently supported. To our knowledge, the role of gene-environment interactions in dissociative experiences in obsessive-compulsive disorder (OCD) has not previously been investigated. Eighty-three Caucasian patients (29 male, 54 female) with a principal diagnosis of OCD were included. The Dissociative Experiences Scale was used to assess dissociation. The role of childhood trauma (assessed with the Childhood Trauma Questionnaire), and a functional 44-bp insertion/deletion polymorphism in the promoter region of the serotonin transporter, or 5-HTT, in mediating dissociation, was investigated using multiple regression analysis and path analysis using the partial least squares model. Both analyses indicated that an interaction between physical neglect and the S/S genotype of the 5-HTT gene significantly predicted dissociation in patients with OCD. Dissociation may be a predictor of poorer treatment outcome in patients with OCD; therefore, a better understanding of the mechanisms that underlie this phenomenon may be useful. Here, two different but related statistical techniques (multiple regression and partial least squares), confirmed that physical neglect and the 5-HTT genotype jointly play a role in predicting dissociation in OCD. PMID:17943026

  14. Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada

    NASA Astrophysics Data System (ADS)

    Adamowski, Jan; Fung Chan, Hiu; Prasher, Shiv O.; Ozga-Zielinski, Bogdan; Sliusarieva, Anna

    2012-01-01

    Daily water demand forecasts are an important component of cost-effective and sustainable management and optimization of urban water supply systems. In this study, a method based on coupling discrete wavelet transforms (WA) and artificial neural networks (ANNs) for urban water demand forecasting applications is proposed and tested. Multiple linear regression (MLR), multiple nonlinear regression (MNLR), autoregressive integrated moving average (ARIMA), ANN and WA-ANN models for urban water demand forecasting at lead times of one day for the summer months (May to August) were developed, and their relative performance was compared using the coefficient of determination, root mean square error, relative root mean square error, and efficiency index. The key variables used to develop and validate the models were daily total precipitation, daily maximum temperature, and daily water demand data from 2001 to 2009 in the city of Montreal, Canada. The WA-ANN models were found to provide more accurate urban water demand forecasts than the MLR, MNLR, ARIMA, and ANN models. The results of this study indicate that coupled wavelet-neural network models are a potentially promising new method of urban water demand forecasting that merit further study.

  15. Investigating the possible effects of trauma experiences and 5-HTT on the dissociative experiences of patients with OCD using path analysis and multiple regression.

    PubMed

    Lochner, Christine; Seedat, Soraya; Hemmings, Sian M J; Moolman-Smook, Johanna C; Kidd, Martin; Stein, Dan J

    2007-01-01

    Dissociation is defined as the disruption of the usually integrated functions of consciousness, such as memory, identity, and perceptions of the environment. Causes include various psychological, neurological and neurobiological mechanisms, none of which have been consistently supported. To our knowledge, the role of gene-environment interactions in dissociative experiences in obsessive-compulsive disorder (OCD) has not previously been investigated. Eighty-three Caucasian patients (29 male, 54 female) with a principal diagnosis of OCD were included. The Dissociative Experiences Scale was used to assess dissociation. The role of childhood trauma (assessed with the Childhood Trauma Questionnaire), and a functional 44-bp insertion/deletion polymorphism in the promoter region of the serotonin transporter, or 5-HTT, in mediating dissociation, was investigated using multiple regression analysis and path analysis using the partial least squares model. Both analyses indicated that an interaction between physical neglect and the S/S genotype of the 5-HTT gene significantly predicted dissociation in patients with OCD. Dissociation may be a predictor of poorer treatment outcome in patients with OCD; therefore, a better understanding of the mechanisms that underlie this phenomenon may be useful. Here, two different but related statistical techniques (multiple regression and partial least squares), confirmed that physical neglect and the 5-HTT genotype jointly play a role in predicting dissociation in OCD.

  16. Recognition of extensive freshwater and brackish marshes and of multiple transgressions and regressions: The Holocene wetlands of the Delaware Bay and Atlantic Ocean coasts

    SciTech Connect

    Yi, H.I. (Dept. of Geology)

    1992-01-01

    Extensive and closely spaced cores (204) were analyzed to find detailed facies (microfacies) and paleoenvironments in the subsurface sediments along the Delaware Bay and Atlantic Ocean. To determine detailed facies and paleoenvironments, several composite methods were employed: traditional lithological analysis, botanical identification, macro- and micro-paleontological analysis, grain size analysis, organic and inorganic content, water content, mineral composition, particulate plant, and C-14 dating. Twenty-two sedimentary microfacies were identified in the surface and subsurface sediments of the study area. Most of the lower section of the Holocene sediments contained freshwater and brackish marsh microfacies which alternated or intercalated with fluvial microfacies or brackish tidal flat/tidal stream microfacies. After tides encroached upon the freshwater marshes and swamps, several events of transgression and regression were recorded in the stratigraphic section. Finally, saline paleoenvironments predominated at the top section of subsurface sediments. Within saline facies, three subgroups of salt marsh microfacies were identified: high salt marsh sub-microfacies, middle salt marsh sub-microfacies, and low salt marsh sub-microfacies. The major controlling factors of these paleoenvironmental changes were local relative sea-level fluctuations, sediment supply, pre-Holocene configuration, fluvial activity, groundwater influence, climatic change, sediment compaction, tectonics, isostasy and biological competition. Ten events of transgression and regression were found in some areas in about 2,000 years, but other areas apparently contained no evidence of multiple events of transgression and regression. Some other areas showed one or two distinctive events of transgression and regression. Therefore, further investigation is necessary to understand the details of these records.

  17. Does NASA's Constellation Architecture Offer Opportunities to Achieve Multiple Additional Goals in Space?

    NASA Technical Reports Server (NTRS)

    Thronson, Harley A.; Lester, Daniel F.

    2008-01-01

    Every major NASA human spaceflight program in the last four decades has been modified to achieve goals in space not incorporated within the original design goals: the Apollo Applications Program, Skylab, Space Shuttle, and International Space Station. Several groups in the US have been identifying major future science goals, the science facilities necessary to investigate them, as well as possible roles for augmented versions of elements of NASA's Constellation program. Specifically, teams in the astronomy community have been developing concepts for very capable missions to follow the James Webb Space Telescope that could take advantage of - or require - free-space operations by astronauts and/or robots. Take as one example the Single-Aperture Far-InfraRed (SAFIR) telescope, with an approx. 10+ m aperture, proposed for operation in the 2020 timeframe. According to current NASA plans, the Ares V launch vehicle (or a variant) will be available about the same time, as will the capability to transport astronauts to the vicinity of the Moon via the Orion Crew Exploration Vehicle and associated systems. [As the lunar surface offers no advantages - and major disadvantages - for most major optical systems, the expensive system for landing and operating on the lunar surface is not required.] Although as currently conceived, SAFIR and other astronomical missions will operate at the Sun-Earth L2 location, it appears trivial to travel for servicing to the more accessible Earth-Moon L1,2 locations. Moreover, as the recent Orbital Express and Automated Transfer Vehicle missions have demonstrated, future robotic capabilities should offer capabilities that would (remotely) extend human presence far beyond the vicinity of the Earth. 
In addition to multiplying the value of NASA's architecture for future human spaceflight to achieve the goals of multiple major stakeholders, if humans one day travel beyond the Earth-Moon system - say, to Mars - technologies and capabilities for operating

  18. Does NASA's Constellation Architecture Offer Opportunities to Achieve Multiple Additional Goals in Space?

    NASA Technical Reports Server (NTRS)

    Thronson, Harley; Lester, Daniel F.

    2008-01-01

    Every major NASA human spaceflight program in the last four decades has been modified to achieve goals in space not incorporated within the original design goals: the Apollo Applications Program, Skylab, Space Shuttle, and International Space Station. Several groups in the US have been identifying major future science goals, the science facilities necessary to investigate them, as well as possible roles for augmented versions of elements of NASA's Constellation program. Specifically, teams in the astronomy community have been developing concepts for very capable missions to follow the James Webb Space Telescope that could take advantage of - or require - free-space operations by astronauts and/or robots. Take as one example the Single-Aperture Far-InfraRed (SAFIR) telescope, with an approximately 10+ m aperture, proposed for operation in the 2020 timeframe. According to current NASA plans, the Ares V launch vehicle (or a variant) will be available about the same time, as will the capability to transport astronauts to the vicinity of the Moon via the Orion Crew Exploration Vehicle and associated systems. [As the lunar surface offers no advantages - and major disadvantages - for most major optical systems, the expensive system for landing and operating on the lunar surface is not required.] Although as currently conceived, SAFIR and other astronomical missions will operate at the Sun-Earth L2 location, it appears trivial to travel for servicing to the more accessible Earth-Moon L1,2 locations. Moreover, as the recent Orbital Express and Automated Transfer Vehicle missions have demonstrated, future robotic systems should offer capabilities that would (remotely) extend human presence far beyond the vicinity of the Earth. In addition to multiplying the value of NASA's architecture for future human spaceflight to achieve the goals of multiple major stakeholders, if humans one day travel beyond the Earth-Moon system - say, to Mars - technologies and capabilities for operating

  19. Transitioning from Additive to Multiplicative Thinking: A Design and Teaching Experiment with Third through Fifth Graders

    ERIC Educational Resources Information Center

    Brickwedde, James

    2011-01-01

    The maturation of multiplicative thinking is key to student progress in middle school as rational number, ratio, and proportion concepts are encountered. But many students arrive from the intermediate grades and falter in developing this essential disposition. Elementary students have historically learned multiplication and division as operation…

  20. In vitro additive effect of imipenem combined with vancomycin against multiple-drug resistant, coagulase-negative Staphylococci.

    PubMed

    Traub, W H; Spohr, M; Bauer, D

    1986-09-01

    Imipenem combined with vancomycin resulted in a marked additive effect in vitro against 9 clinical isolates of multiple-drug resistant (MDR), coagulase-negative staphylococci, including strains resistant against imipenem. The additive effect was documented with the aid of checkerboard MIC determinations and with time kill curve experiments. In contrast, imipenem combined with vancomycin merely yielded weak additive or indifferent effects against 10 MDR isolates of Staphylococcus aureus, all of which were susceptible to imipenem.

  1. Assessing the impact of local meteorological variables on surface ozone in Hong Kong during 2000-2015 using quantile and multiple line regression models

    NASA Astrophysics Data System (ADS)

    Zhao, Wei; Fan, Shaojia; Guo, Hai; Gao, Bo; Sun, Jiaren; Chen, Laiguo

    2016-11-01

    The quantile regression (QR) method has been increasingly introduced to atmospheric environmental studies to explore the non-linear relationship between local meteorological conditions and ozone mixing ratios. In this study, we applied QR for the first time, together with multiple linear regression (MLR), to analyze the dominant meteorological parameters influencing the mean, 10th percentile, 90th percentile and 99th percentile of maximum daily 8-h average (MDA8) ozone concentrations in 2000-2015 in Hong Kong. Dominance analysis (DA) was used to assess the relative importance of meteorological variables in the regression models. Results showed that the MLR models worked better at suburban and rural sites than at urban sites, and worked better in winter than in summer. QR models performed better in summer for the 99th and 90th percentiles and performed better in autumn and winter for the 10th percentile. QR models also performed better in suburban and rural areas for the 10th percentile. The top 3 dominant variables associated with MDA8 ozone concentrations, changing with seasons and regions, were frequently associated with the six meteorological parameters: boundary layer height, humidity, wind direction, surface solar radiation, total cloud cover and sea level pressure. Temperature rarely became a significant variable in any season, which could partly explain the peak of monthly average ozone concentrations in October in Hong Kong. We also found that the effect of solar radiation would be enhanced during extreme ozone pollution episodes (i.e., the 99th percentile). Finally, meteorological effects on MDA8 ozone had no significant changes before and after the 2010 Asian Games.
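
    As a rough illustration of why quantile regression can target the 10th or 90th percentile rather than the mean, the sketch below minimizes the standard pinball (check) loss over a single constant. The ozone values and the names pinball_loss and best_constant are hypothetical and are not taken from the study; a real QR model would minimize the same loss over regression coefficients.

```python
def pinball_loss(y, prediction, tau):
    """Mean pinball (check) loss of a constant prediction at quantile level tau."""
    total = 0.0
    for value in y:
        residual = value - prediction
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(y)


def best_constant(y, tau):
    """Observed value that minimizes the pinball loss (a sample tau-quantile)."""
    return min(sorted(set(y)), key=lambda c: pinball_loss(y, c, tau))


# Hypothetical MDA8 ozone values, for illustration only.
ozone = [20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 60.0, 80.0, 100.0, 120.0]

low = best_constant(ozone, 0.10)
median = best_constant(ozone, 0.50)
high = best_constant(ozone, 0.90)
print(low, median, high)
```

    Changing tau moves the fitted value toward the lower or upper tail, which is how separate models can describe typical days and extreme pollution episodes.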

  2. Fundamental Analysis of the Linear Multiple Regression Technique for Quantification of Water Quality Parameters from Remote Sensing Data. Ph.D. Thesis - Old Dominion Univ.

    NASA Technical Reports Server (NTRS)

    Whitlock, C. H., III

    1977-01-01

    Constituents with linear radiance gradients with concentration may be quantified from signals which contain nonlinear atmospheric and surface reflection effects for both homogeneous and non-homogeneous water bodies provided accurate data can be obtained and nonlinearities are constant with wavelength. Statistical parameters must be used which give an indication of bias as well as total squared error to insure that an equation with an optimum combination of bands is selected. It is concluded that the effect of error in upwelled radiance measurements is to reduce the accuracy of the least square fitting process and to increase the number of points required to obtain a satisfactory fit. The problem of obtaining a multiple regression equation that is extremely sensitive to error is discussed.

  3. Using multiple linear regression and physicochemical changes of amino acid mutations to predict antigenic variants of influenza A/H3N2 viruses.

    PubMed

    Cui, Haibo; Wei, Xiaomei; Huang, Yu; Hu, Bin; Fang, Yaping; Wang, Jia

    2014-01-01

    Among human influenza viruses, strain A/H3N2 accounts for over a quarter of a million deaths annually. Antigenic variants of these viruses often render current vaccinations ineffective and lead to repeated infections. In this study, a computational model was developed to predict antigenic variants of the A/H3N2 strain. First, 18 critical antigenic amino acids in the hemagglutinin (HA) protein were recognized using a scoring method combining phi (ϕ) coefficient and information entropy. Next, a prediction model was developed by integrating the multiple linear regression method with eight types of physicochemical changes in critical amino acid positions. When compared to three other known models, our prediction model achieved the best performance not only on the training dataset but also on the commonly-used testing dataset composed of 31878 antigenic relationships of the H3N2 influenza virus.

  4. Verifying the performance of artificial neural network and multiple linear regression in predicting the mean seasonal municipal solid waste generation rate: A case study of Fars province, Iran.

    PubMed

    Azadi, Sama; Karimi-Jashni, Ayoub

    2016-02-01

    Predicting the mass of solid waste generation plays an important role in integrated solid waste management plans. In this study, the performance of two predictive models, Artificial Neural Network (ANN) and Multiple Linear Regression (MLR) was verified to predict mean Seasonal Municipal Solid Waste Generation (SMSWG) rate. The accuracy of the proposed models is illustrated through a case study of 20 cities located in Fars Province, Iran. Four performance measures, MAE, MAPE, RMSE and R were used to evaluate the performance of these models. The MLR, as a conventional model, showed poor prediction performance. On the other hand, the results indicated that the ANN model, as a non-linear model, has a higher predictive accuracy when it comes to prediction of the mean SMSWG rate. As a result, in order to develop a more cost-effective strategy for waste management in the future, the ANN model could be used to predict the mean SMSWG rate. PMID:26482809
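
    The four performance measures named above can be sketched as follows. This is a minimal illustration under stated assumptions: "R" is taken to be the Pearson correlation between observed and predicted values, MAPE is expressed in percent, and the numbers are hypothetical rather than data from the study.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (undefined if an observed value is 0)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    spread_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (spread_t * spread_p)


# Hypothetical observed and predicted waste generation rates.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(mae(observed, predicted), mape(observed, predicted))
print(rmse(observed, predicted), pearson_r(observed, predicted))
```

    Lower MAE, MAPE and RMSE and an R closer to 1 indicate a better fit, so two models such as ANN and MLR can be compared on the same held-out data.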

  6. Development and application of a multiple linear regression model to consider the impact of weekly waste container capacity on the yield from kerbside recycling programmes in Scotland.

    PubMed

    Baird, Jim; Curry, Robin; Reid, Tim

    2013-03-01

    This article describes the development and application of a multiple linear regression model to identify how the key elements of waste and recycling infrastructure, namely container capacity and frequency of collection, affect the yield from municipal kerbside recycling programmes. The overall aim of the research was to gain an understanding of the factors affecting the yield from municipal kerbside recycling programmes in Scotland with an underlying objective to evaluate the efficacy of the model as a decision-support tool for informing the design of kerbside recycling programmes. The study isolates the principal kerbside collection service offered by all 32 councils across Scotland, eliminating those recycling programmes associated with flatted properties or multi-occupancies. The results of the regression analysis model have identified three principal factors which explain 80% of the variability in the average yield of the principal dry recyclate services: weekly residual waste capacity, number of materials collected and the weekly recycling capacity. The use of the model has been evaluated and recommendations made on ongoing methodological development and the use of the results in informing the design of kerbside recycling programmes. We hope that the research can provide insights for the further development of methods to optimise the design and operation of kerbside recycling programmes.

  7. Multiple Linear Regression Analysis Indicates Association of P-Glycoprotein Substrate or Inhibitor Character with Bitterness Intensity, Measured with a Sensor.

    PubMed

    Yano, Kentaro; Mita, Suzune; Morimoto, Kaori; Haraguchi, Tamami; Arakawa, Hiroshi; Yoshida, Miyako; Yamashita, Fumiyoshi; Uchida, Takahiro; Ogihara, Takuo

    2015-09-01

    P-glycoprotein (P-gp) regulates absorption of many drugs in the gastrointestinal tract and their accumulation in tumor tissues, but the basis of substrate recognition by P-gp remains unclear. Bitter-tasting phenylthiocarbamide, which stimulates taste receptor 2 member 38 (T2R38), increases P-gp activity and is a substrate of P-gp. This led us to hypothesize that bitterness intensity might be a predictor of P-gp-inhibitor/substrate status. Here, we measured the bitterness intensity of a panel of P-gp substrates and nonsubstrates with various taste sensors, and used multiple linear regression analysis to examine the relationship between P-gp-inhibitor/substrate status and various physical properties, including intensity of bitter taste measured with the taste sensor. We calculated the first principal component analysis score (PC1) as the representative value of bitterness, as the outputs of all taste sensors were significantly correlated. The P-gp substrates showed remarkably greater mean bitterness intensity than non-P-gp substrates. We found that Km values of P-gp substrates were correlated with molecular weight, log P, and PC1 value, and the coefficient of determination (R²) of the linear regression equation was 0.63. This relationship might be useful as an aid to predict P-gp substrate status at an early stage of drug discovery.

  8. A note on the relationships between multiple imputation, maximum likelihood and fully Bayesian methods for missing responses in linear regression models

    PubMed Central

    Ibrahim, Joseph G.

    2014-01-01

    Multiple Imputation, Maximum Likelihood and Fully Bayesian methods are the three most commonly used model-based approaches in missing data problems. Although it is easy to show that when the responses are missing at random (MAR), the complete case analysis is unbiased and efficient, the aforementioned methods are still commonly used in practice for this setting. To examine the performance of and relationships between these three methods in this setting, we derive and investigate small sample and asymptotic expressions of the estimates and standard errors, and fully examine how these estimates are related for the three approaches in the linear regression model when the responses are MAR. We show that when the responses are MAR in the linear model, the estimates of the regression coefficients using these three methods are asymptotically equivalent to the complete case estimates under general conditions. One simulation and a real data set from a liver cancer clinical trial are given to compare the properties of these methods when the responses are MAR. PMID:25309677

  9. A flexible mixed-effect negative binomial regression model for detecting unusual increases in MRI lesion counts in individual multiple sclerosis patients.

    PubMed

    Kondo, Yumi; Zhao, Yinshan; Petkau, John

    2015-06-15

    We develop a new modeling approach to enhance a recently proposed method to detect increases of contrast-enhancing lesions (CELs) on repeated magnetic resonance imaging, which have been used as an indicator for potential adverse events in multiple sclerosis clinical trials. The method signals patients with unusual increases in CEL activity by estimating the probability of observing CEL counts as large as those observed on a patient's recent scans conditional on the patient's CEL counts on previous scans. This conditional probability index (CPI), computed based on a mixed-effect negative binomial regression model, can vary substantially depending on the choice of distribution for the patient-specific random effects. Therefore, we relax this parametric assumption to model the random effects with an infinite mixture of beta distributions, using the Dirichlet process, which effectively allows any form of distribution. To our knowledge, no previous literature considers a mixed-effect regression for longitudinal count variables where the random effect is modeled with a Dirichlet process mixture. As our inference is in the Bayesian framework, we adopt a meta-analytic approach to develop an informative prior based on previous clinical trials. This is particularly helpful at the early stages of trials when less data are available. Our enhanced method is illustrated with CEL data from 10 previous multiple sclerosis clinical trials. Our simulation study shows that our procedure estimates the CPI more accurately than parametric alternatives when the patient-specific random effect distribution is misspecified and that an informative prior improves the accuracy of the CPI estimates. PMID:25784219

  10. Modeling the dependence of respiration and photosynthesis upon light, acetate, carbon dioxide, nitrate and ammonium in Chlamydomonas reinhardtii using design of experiments and multiple regression

    PubMed Central

    2014-01-01

    Background: In photosynthetic organisms, the influence of light, carbon and inorganic nitrogen sources on the cellular bioenergetics has extensively been studied independently, but little information is available on the cumulative effects of these factors. Here, sequential statistical analyses based on design of experiments (DOE) coupled to standard least squares multiple regression have been undertaken to model the dependence of respiratory and photosynthetic responses (assessed by oxymetric and chlorophyll fluorescence measurements) upon the concomitant modulation of light intensity as well as acetate, CO2, nitrate and ammonium concentrations in the culture medium of Chlamydomonas reinhardtii. The main goals of these analyses were to explain response variability (i.e. bioenergetic plasticity) and to characterize quantitatively the influence of the major explanatory factor(s). Results: For each response, 2 successive rounds of multiple regression coupled to one-way ANOVA F-tests have been undertaken to select the major explanatory factor(s) (1st-round) and mathematically simulate their influence (2nd-round). These analyses reveal that a maximal number of 3 environmental factors over 5 is sufficient to explain most of the response variability, and interestingly highlight quadratic effects and second-order interactions in some cases. In parallel, the predictive ability of the 2nd-round models has also been investigated by k-fold cross-validation and experimental validation tests on new random combinations of factors. These validation procedures tend to indicate that the 2nd-round models can also be used to predict the responses with an inherent deviation quantified by the analytical error of the models. Conclusions: Altogether, the results of the 2 rounds of modeling provide an overview of the bioenergetic adaptations of C. reinhardtii to changing environmental conditions and point out promising tracks for future in-depth investigations of the molecular mechanisms

  11. Multiple regression and inverse moments improve the characterization of the spatial scaling behavior of daily streamflows in the Southeast United States

    USGS Publications Warehouse

    Farmer, William H.; Over, Thomas M.; Vogel, Richard M.

    2015-01-01

    Understanding the spatial structure of daily streamflow is essential for managing freshwater resources, especially in poorly-gaged regions. Spatial scaling assumptions are common in flood frequency prediction (e.g., index-flood method) and the prediction of continuous streamflow at ungaged sites (e.g. drainage-area ratio), with simple scaling by drainage area being the most common assumption. In this study, scaling analyses of daily streamflow from 173 streamgages in the southeastern US resulted in three important findings. First, the use of only positive integer moment orders, as has been done in most previous studies, captures only the probabilistic and spatial scaling behavior of flows above an exceedance probability near the median; negative moment orders (inverse moments) are needed for lower streamflows. Second, assessing scaling by using drainage area alone is shown to result in a high degree of omitted-variable bias, masking the true spatial scaling behavior. Multiple regression is shown to mitigate this bias, controlling for regional heterogeneity of basin attributes, especially those correlated with drainage area. Previous univariate scaling analyses have neglected the scaling of low-flow events and may have produced biased estimates of the spatial scaling exponent. Third, the multiple regression results show that mean flows scale with an exponent of one, low flows scale with spatial scaling exponents greater than one, and high flows scale with exponents less than one. The relationship between scaling exponents and exceedance probabilities may be a fundamental signature of regional streamflow. This signature may improve our understanding of the physical processes generating streamflow at different exceedance probabilities. 
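
    The omitted-variable bias described above can be sketched with a small log-log regression. Everything here is hypothetical: the basins, the "precip" attribute and the helper names are illustrative assumptions, and ordinary least squares via the normal equations stands in for whatever estimation the authors actually used. The flows are generated with a true drainage-area exponent of one, and the second attribute is deliberately correlated with area.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical basins: flow = 0.5 * area^1.0 * precip^2.0, with no noise.
areas = [10.0, 50.0, 120.0, 300.0, 800.0, 1500.0]
precip = [1.6, 0.9, 1.1, 0.7, 1.0, 0.6]  # tends to be lower in larger basins
flows = [0.5 * a * p ** 2 for a, p in zip(areas, precip)]

log_a = [math.log(a) for a in areas]
log_p = [math.log(p) for p in precip]
log_q = [math.log(q) for q in flows]

univariate = ols([[la] for la in log_a], log_q)                     # area only
multiple = ols([[la, lp] for la, lp in zip(log_a, log_p)], log_q)   # area + attribute

print("area exponent, area only:", univariate[1])
print("area exponent, with attribute:", multiple[1])
```

    With area alone the fitted exponent drifts well below one because the omitted attribute is correlated with area; adding it as a second regressor recovers the exponent used to generate the data.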

  12. USING DOSE ADDITION TO ESTIMATE CUMULATIVE RISKS FROM EXPOSURES TO MULTIPLE CHEMICALS

    EPA Science Inventory

    The Food Quality Protection Act (FQPA) of 1996 requires the EPA to consider the cumulative risk from exposure to multiple chemicals that have a common mechanism of toxicity. Three methods, hazard index (HI), point-of-departure index (PODI), and toxicity equivalence factor (TEF), ...

  13. Non-destructive evaluation of chlorophyll content in quinoa and amaranth leaves by simple and multiple regression analysis of RGB image components.

    PubMed

    Riccardi, M; Mele, G; Pulvento, C; Lavini, A; d'Andria, R; Jacobsen, S-E

    2014-06-01

    Leaf chlorophyll content provides valuable information about physiological status of plants; it is directly linked to photosynthetic potential and primary production. In vitro assessment by wet chemical extraction is the standard method for leaf chlorophyll determination. This measurement is expensive, laborious, and time consuming. Over the years alternative methods, rapid and non-destructive, have been explored. The aim of this work was to evaluate the applicability of a fast and non-invasive field method for estimation of chlorophyll content in quinoa and amaranth leaves based on RGB components analysis of digital images acquired with a standard SLR camera. Digital images of leaves from different genotypes of quinoa and amaranth were acquired directly in the field. Mean values of each RGB component were evaluated via image analysis software and correlated to leaf chlorophyll provided by standard laboratory procedure. Single and multiple regression models using RGB color components as independent variables have been tested and validated. The performance of the proposed method was compared to that of the widely used non-destructive SPAD method. Sensitivity of the best regression models for different genotypes of quinoa and amaranth was also checked. Color data acquisition of the leaves in the field with a digital camera was quick, more effective, and lower cost than SPAD. The proposed RGB models provided better correlation (highest R²) and prediction (lowest RMSEP) of the true value of foliar chlorophyll content and had a lower amount of noise in the whole range of chlorophyll studied compared with SPAD and other leaf image processing based models when applied to quinoa and amaranth.

  14. Efficiency and Adaptiveness of Multiple School-Taught Strategies in the Domain of Simple Addition

    ERIC Educational Resources Information Center

    Torbeyns, Joke; Verschaffel, Lieven; Ghesquiere, Pol

    2004-01-01

    This study investigated the fluency with which first-graders with strong, moderate, or weak mathematical abilities apply the decomposition-to-10 and tie strategy on almost-tie sums with bridge over 10. It also assessed children's memorized knowledge of additions up to 20. Children's strategies were analysed in terms of Lemaire and Siegler's model…

  15. Multiple linear regression model for bromate formation based on the survey data of source waters from geographically different regions across China.

    PubMed

    Yu, Jianwei; Liu, Juan; An, Wei; Wang, Yongjing; Zhang, Junzhi; Wei, Wei; Su, Ming; Yang, Min

    2015-01-01

    A total of 86 source water samples from 38 cities across major watersheds of China were collected for a bromide (Br⁻) survey, and the bromate (BrO₃⁻) formation potentials (BFPs) of 41 samples with Br⁻ concentration >20 μg L⁻¹ were evaluated using a batch ozonation reactor. Statistical analyses indicated that higher alkalinity, hardness, and pH of water samples could lead to higher BFPs, with alkalinity as the most important factor. Based on the survey data, a multiple linear regression (MLR) model including three parameters (alkalinity, ozone dose, and total organic carbon (TOC)) was established with a relatively good prediction performance (model selection criterion = 2.01, R² = 0.724), using logarithmic transformation of the variables. Furthermore, a contour plot was used to interpret the influence of alkalinity and TOC on BrO₃⁻ formation with prediction accuracy as high as 71%, suggesting that these two parameters, apart from ozone dosage, were the most important ones affecting the BFPs of source waters with Br⁻ concentration >20 μg L⁻¹. The model could be a useful tool for the prediction of the BFPs of source water.

  16. Multiple Regression Analysis of the Variable Component in the Near-Infrared Region for Type 1 AGN MCG +08-11-011

    NASA Astrophysics Data System (ADS)

    Tomita, Hiroyuki; Yoshii, Yuzuru; Kobayashi, Yukiyasu; Minezaki, Takeo; Enya, Keigo; Suganuma, Masahiro; Aoki, Tsutomu; Koshida, Shintaro; Yamauchi, Masahiro

    2006-11-01

    We propose a new method of analyzing a variable component for type 1 active galactic nuclei (AGNs) in the near-infrared wavelength region. This analysis uses a multiple regression technique and divides the variable component into two components originating in the accretion disk at the center of an AGN and from the dust torus that far surrounds the disk. Applying this analysis to the long-term VHK monitoring data of MCG +08-11-011 that were obtained by the MAGNUM project, we found that the (H-K) color temperature of the dust component is T = 1635 ± 20 K, which agrees with the sublimation temperature of dust grains, and that the time delay of K to H variations is Δt ≈ 6 days, which indicates the existence of a radial temperature gradient in the dust torus. As for the disk component, we found that the power-law spectrum of fν ∝ ν^α in the V to near-infrared HK bands varies with a fixed index of α ≈ -0.1 to +0.4, which is broadly consistent with the irradiated standard disk model. The outer part of the disk therefore extends out to a radial distance where the temperature decreases to radiate the light in the near-infrared.

  17. Examining the full effects of landscape heterogeneity on spatial genetic variation: a multiple matrix regression approach for quantifying geographic and ecological isolation.

    PubMed

    Wang, Ian J

    2013-12-01

    Understanding the effects of landscape heterogeneity on spatial genetic variation is a primary goal of landscape genetics. Ecological and geographic variables can contribute to genetic structure through geographic isolation, in which geographic barriers and distances restrict gene flow, and ecological isolation, in which gene flow among populations inhabiting different environments is limited by selection against dispersers moving between them. Although methods have been developed to study geographic isolation in detail, ecological isolation has received much less attention, partly because disentangling the effects of these mechanisms is inherently difficult. Here, I describe a novel approach for quantifying the effects of geographic and ecological isolation using multiple matrix regression with randomization. I explored the parameter space over which this method is effective using a series of individual-based simulations and found that it accurately describes the effects of geographic and ecological isolation over a wide range of conditions. I also applied this method to a set of real-world datasets to show that ecological isolation is an often overlooked but important contributor to patterns of spatial genetic variation and to demonstrate how this analysis can provide new insights into how landscapes contribute to the evolution of genetic variation in nature.

  18. Application of Multiple Linear Regression and Extended Principal-Component Analysis to Determination of the Acid Dissociation Constant of 7-Hydroxycoumarin in Water/AOT/Isooctane Reverse Micelles.

    PubMed

    Caselli; Daniele; Mangone; Paolillo

    2000-01-15

    The apparent pKa of dyes in water-in-oil microemulsions depends on the charge of the acid and base forms of the buffers present in the water pool. Extended principal-component analysis allows the precise determination of the apparent pKa and of the spectra of the acid and base forms of the dye. Combination with multiple linear regression increases the precision. The pKa of 7-hydroxycoumarin (umbelliferone) was spectrophotometrically measured in a water/AOT/isooctane microemulsion in the presence of a series of buffers carrying different charges at various different water/surfactant ratios. The spectra of the acid and base forms of the dye in the microemulsion are very similar to those in bulk water in the presence of Tris and ammonia. The presence of carbonate changes somewhat the spectrum of the acid form. Results are discussed taking into account the profile of the electrostatic potential drop in the water pool and the possible partition of umbelliferone between the aqueous core and the surfactant. The pKa values corrected for these effects are independent of w0 and are close to the value of the pKa in bulk water. Copyright 2000 Academic Press.

  19. Determination of the acid dissociation constant of bromocresol green and cresol red in water/AOT/isooctane reverse micelles by multiple linear regression and extended principal component analysis.

    PubMed

    Caselli, Maurizio; Mangone, Annarosa; Paolillo, Paola; Traini, Angela

    2002-01-01

    The pKa of 3',3",5',5"-tetrabromo-m-cresolsulfonephtalein (Bromocresol Green) and o-cresolsulphonephtalein (Cresol Red) was spectrophotometrically measured in a water/AOT/isooctane microemulsion in the presence of a series of buffers carrying different charges at different water/surfactant ratios. Extended Principal Component Analysis was used for a precise determination of the apparent pKa and of the spectra of the acid and base forms of the dye. The apparent pKa of dyes in water-in-oil microemulsions depends on the charge of the acid and base forms of the buffers present in the water pool. Combination with multiple linear regression increases the precision. Results are discussed taking into account the profile of the electrostatic potential in the water pool and the possible partition of the indicator between the aqueous core and the surfactant. The pKa values corrected for these effects are independent of w0 and are close to the value of the pKa in bulk water. On the basis of a tentative hypothesis it is possible to calculate the true pKa of the buffer in the pool.

  20. The role of chemometrics in single and sequential extraction assays: a review. Part II. Cluster analysis, multiple linear regression, mixture resolution, experimental design and other techniques.

    PubMed

    Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo

    2011-03-01

    Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. PMID:21334477

  1. Logistic regression analysis of multiple noninvasive tests for the prediction of the presence and extent of coronary artery disease in men

    SciTech Connect

    Hung, J.; Chaitman, B.R.; Lam, J.; Lesperance, J.; Dupras, G.; Fines, P.; Cherkaoui, O.; Robert, P.; Bourassa, M.G.

    1985-08-01

    The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.

  2. Multiple Pathways Suppress Telomere Addition to DNA Breaks in the Drosophila Germline

    PubMed Central

    Beaucher, Michelle; Zheng, Xiao-Feng; Amariei, Flavia; Rong, Yikang S.

    2012-01-01

    Telomeres protect chromosome ends from being repaired as double-strand breaks (DSBs). Just as DSB repair is suppressed at telomeres, de novo telomere addition is suppressed at the site of DSBs. To identify factors responsible for this suppression, we developed an assay to monitor de novo telomere formation in Drosophila, an organism in which telomeres can be established on chromosome ends with essentially any sequence. Germline expression of the I-SceI endonuclease resulted in precise telomere formation at its cut site with high efficiency. Using this assay, we quantified the frequency of telomere formation in different genetic backgrounds with known or possible defects in DNA damage repair. We showed that disruption of DSB repair factors (Rad51 or DNA ligase IV) or DSB sensing factors (ATRIP or MDC1) resulted in more efficient telomere formation. Interestingly, partial disruption of factors that normally regulate telomere protection (ATM or NBS) also led to higher frequencies of telomere formation, suggesting that these proteins have opposing roles in telomere maintenance vs. establishment. In the ku70 mutant background, telomere establishment was preceded by excessive degradation of DSB ends, which were stabilized upon telomere formation. Most strikingly, the removal of ATRIP caused a dramatic increase in telomeric retrotransposon attachment to broken ends. Our study identifies several pathways that suppress telomere addition at DSBs, paving the way for future mechanistic studies. PMID:22446318

  3. Protein-Protein Interaction Analysis Highlights Additional Loci of Interest for Multiple Sclerosis

    PubMed Central

    Ragnedda, Giammario; Disanto, Giulio; Giovannoni, Gavin; Ebers, George C.; Sotgiu, Stefano; Ramagopalan, Sreeram V.

    2012-01-01

    Genetic factors play an important role in determining the risk of multiple sclerosis (MS). The strongest genetic association in MS is located within the major histocompatibility complex class II region (MHC), but more than 50 MS loci of modest effect located outside the MHC have now been identified. However, the relative candidate genes that underlie these associations and their functions are largely unknown. We conducted a protein-protein interaction (PPI) analysis of gene products coded in loci recently reported to be MS associated at the genome-wide significance level and in loci suggestive of MS association. Our aim was to identify which suggestive regions are more likely to be truly associated, which genes are mostly implicated in the PPI network and their expression profile. From three recent independent association studies, SNPs were considered and divided into significant and suggestive depending on the strength of the statistical association. Using the Disease Association Protein-Protein Link Evaluator tool we found that direct interactions among genetic products were significantly higher than expected by chance when considering both significant regions alone (p<0.0002) and significant plus suggestive (p<0.007). The number of genes involved in the network was 43. Of these, 23 were located within suggestive regions and many of them directly interacted with proteins coded within significant regions. These included genes such as SYK, IL-6, CSF2RB, FCLR3, EIF4EBP2 and CHST12. Using the gene portal BioGPS, we tested the expression of these genes in 24 different tissues and found the highest values among immune-related cells as compared to non-immune tissues (p<0.001). A gene ontology analysis confirmed the immune-related functions of these genes. 
    In conclusion, loci currently suggestive of MS association interact with and have similar expression profiles and function as those significantly associated, highlighting the fact that more common variants remain to be

  4. Autistic Regression

    ERIC Educational Resources Information Center

    Matson, Johnny L.; Kozlowski, Alison M.

    2010-01-01

    Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…

  5. Investigation of the relationship between very warm days in Romania and large-scale atmospheric circulation using multiple linear regression approach

    NASA Astrophysics Data System (ADS)

    Barbu, N.; Cuculeanu, V.; Stefan, S.

    2016-10-01

    The aim of this study is to investigate the relationship between the frequency of very warm days (TX90p) in Romania and large-scale atmospheric circulation for winter (December-February) and summer (June-August) between 1962 and 2010. In order to achieve this, two catalogues from COST733Action were used to derive daily circulation types. Seasonal occurrence frequencies of the circulation types were calculated and have been utilized as predictors within the multiple linear regression model (MLRM) for the estimation of winter and summer TX90p values for 85 synoptic stations covering all of Romania. A forward selection procedure has been utilized to find adequate predictor combinations and those predictor combinations were tested for collinearity. The performance of the MLRMs has been quantified based on the explained variance. Furthermore, the leave-one-out cross-validation procedure was applied and the root-mean-squared error skill score was calculated at station level in order to obtain reliable evidence of MLRM robustness. From this analysis, it can be stated that the MLRM performance is higher in winter compared to summer. This is due to the annual cycle of incoming insolation and to the local factors such as orography and surface albedo variations. The MLRM performances exhibit distinct variations between regions with high performance in wintertime for the eastern and southern part of the country and in summertime for the western part of the country. One can conclude that the MLRM generally captures quite well the TX90p variability and reveals the potential for statistical downscaling of TX90p values based on circulation types.
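
    The leave-one-out validation of a multiple linear regression described in this record can be illustrated with a minimal sketch in plain Python. It is not the authors' code. The skill score is assumed here to be 1 - RMSE_model / RMSE_reference, with the mean of the training sample as the reference forecast; the abstract does not state the formula, and every function name below is hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, x):
    """Predict one response from coefficients [intercept, b1, b2, ...]."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def loo_rmse_skill(X, y):
    """Leave-one-out RMSE skill score against a training-mean reference (assumed form)."""
    n = len(y)
    se_model = se_ref = 0.0
    for i in range(n):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        beta = fit_mlr(X_tr, y_tr)  # refit without observation i
        se_model += (predict(beta, X[i]) - y[i]) ** 2
        se_ref += (sum(y_tr) / len(y_tr) - y[i]) ** 2
    return 1.0 - (se_model / n) ** 0.5 / (se_ref / n) ** 0.5
```

    A score near 1 means the regression predicts held-out seasons far better than the reference; a score at or below 0 means it adds nothing over the training mean. The normal equations are used only to keep the sketch dependency-free; a QR-based solver would be the more stable choice for collinear predictors.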

  6. Taking into account latency, amplitude, and morphology: improved estimation of single-trial ERPs by wavelet filtering and multiple linear regression

    PubMed Central

    Hu, L.; Liang, M.; Mouraux, A.; Wise, R. G.; Hu, Y.

    2011-01-01

    Across-trial averaging is a widely used approach to enhance the signal-to-noise ratio (SNR) of event-related potentials (ERPs). However, across-trial variability of ERP latency and amplitude may contain physiologically relevant information that is lost by across-trial averaging. Hence, we aimed to develop a novel method that uses 1) wavelet filtering (WF) to enhance the SNR of ERPs and 2) a multiple linear regression with a dispersion term (MLRd) that takes into account shape distortions to estimate the single-trial latency and amplitude of ERP peaks. Using simulated ERP data sets containing different levels of noise, we provide evidence that, compared with other approaches, the proposed WF+MLRd method yields the most accurate estimate of single-trial ERP features. When applied to a real laser-evoked potential data set, the WF+MLRd approach provides reliable estimation of single-trial latency, amplitude, and morphology of ERPs and thereby allows performing meaningful correlations at single-trial level. We obtained three main findings. First, WF significantly enhances the SNR of single-trial ERPs. Second, MLRd effectively captures and measures the variability in the morphology of single-trial ERPs, thus providing an accurate and unbiased estimate of their peak latency and amplitude. Third, intensity of pain perception significantly correlates with the single-trial estimates of N2 and P2 amplitude. These results indicate that WF+MLRd can be used to explore the dynamics between different ERP features, behavioral variables, and other neuroimaging measures of brain activity, thus providing new insights into the functional significance of the different brain processes underlying the brain responses to sensory stimuli. PMID:21880936

  7. Effect of fat additions to diets of dairy cattle on milk production and components: a meta-analysis and meta-regression.

    PubMed

    Rabiee, A R; Breinhild, K; Scott, W; Golder, H M; Block, E; Lean, I J

    2012-06-01

    The objectives of this study were to critically review randomized controlled trials, and quantify, using meta-analysis and meta-regression, the effects of supplementation with fats on milk production and components by dairy cows. We reviewed 59 papers, of which 38 (containing 86 comparisons) met eligibility criteria. Five groups of fats were evaluated: tallows, calcium salts of palm fat (Megalac, Church and Dwight Co. Inc., Princeton, NJ), oilseeds, prilled fat, and other calcium salts. Milk production responses to fats were significant, and the estimated mean difference was 1.05 kg/cow per day, but results were heterogeneous. Milk yield increased with increased difference in dry matter intake (DMI) between treatment and control groups, decreased with predicted metabolizable energy (ME) balance between these groups, and decreased with increased difference in soluble protein percentage of the diet between groups. Decreases in DMI were significant for Megalac, oilseeds, and other Ca salts, and approached significance for tallow. Feeding fat for a longer period increased DMI, as did greater differences in the amount of soluble protein percentage of the diet between control and treatment diets. Tallow, oilseeds, and other Ca salts reduced, whereas Megalac increased, milk fat percentage. Milk fat percentage effects were heterogeneous for fat source. Differences between treatment and control groups in duodenal concentrations of C18:2 and C18:0 fatty acids and Mg percentage reduced the milk fat percentage standardized mean difference. Milk fat yield responses to fat treatments were very variable. The other Ca salts substantially decreased, and Megalac and oilseeds increased, fat yield. Fat yield increased with increased DMI difference between groups and was lower with an increased estimated ME balance between treatment and control groups, indicating increased partitioning of fat to body tissue reserves.
    Feeding fats decreased milk protein percentage, but results were

  8. Robust Regression.

    PubMed

    Huang, Dong; Cabral, Ricardo; De la Torre, Fernando

    2016-02-01

    Discriminative methods (e.g., kernel regression, SVM) have been extensively used to solve problems such as object recognition, image alignment and pose estimation from images. These methods typically map image features (X) to continuous (e.g., pose) or discrete (e.g., object category) values. A major drawback of existing discriminative methods is that samples are directly projected onto a subspace and hence fail to account for outliers common in realistic training sets due to occlusion, specular reflections or noise. It is important to notice that existing discriminative approaches assume the input variables X to be noise free. Thus, discriminative methods experience significant performance degradation when gross outliers are present. Despite its obvious importance, the problem of robust discriminative learning has been relatively unexplored in computer vision. This paper develops the theory of robust regression (RR) and presents an effective convex approach that uses recent advances on rank minimization. The framework applies to a variety of problems in computer vision including robust linear discriminant analysis, regression with missing data, and multi-label classification. Several synthetic and real examples with applications to head pose estimation from images, image and video classification and facial attribute classification with missing data are used to illustrate the benefits of RR. PMID:26761740

  9. Using multiple calibration sets to improve the quantitative accuracy of partial least squares (PLS) regression on open-path Fourier transform infrared (OP/FT-IR) spectra of ammonia over wide concentration ranges

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A technique of using multiple calibration sets in partial least squares regression (PLS) was proposed to improve the quantitative determination of ammonia from open-path Fourier transform infrared spectra. The spectra were measured near animal farms, and the path-integrated concentration of ammonia...

  10. Morse–Smale Regression

    SciTech Connect

    Gerber, Samuel; Rubel, Oliver; Bremer, Peer -Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-19

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.

  11. New Insights into Trace Element Partitioning in Amphibole from Multiple Regression Analysis, with Application to the Magma Plumbing System of Mt. Lamington (Papua New Guinea)

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Humphreys, M.; Cooper, G.; Davidson, J.; Macpherson, C.

    2015-12-01

    We present a new multiple regression (MR) analysis of published amphibole-melt trace element partitioning data, with the aim of retrieving robust relationships between amphibole crystal-chemical compositions and trace element partition coefficients (D). We examined experimental data for calcic amphiboles of kaersutite, pargasite, tschermakite (Tsch), magnesiohornblende (MgHbl) and magnesiohastingsite (MgHst) compositions crystallized from basanitic-rhyolitic melts (n = 150). The MR analysis demonstrates the varying significance of amphibole major element components assigned to different crystallographic sites (T, M1-3, M4, A) as independent variables in controlling D, and it allows us to retrieve statistically significant relationships for REE, Y, Rb, Sr, Pb, Ti, Zr, Nb (n > 25, R² > 0.6, p-value < 0.05). For example, DLREE are controlled by SiT, M1-3 site components and CaM4, whereas DMREE-HREE are controlled solely by M1-3 site components. Our overall results for the REE are supported by application of the lattice strain model (Blundy & Wood, 1994). A significant advantage of our study over previous work linking D to melt polymerization (e.g. Tiepolo et al., 2007) is the ability to reconstruct melt compositions from in situ amphibole compositional analyses and published D data. We applied our MR analysis to Mt. Lamington (PNG), where Mg-Hst in quenched mafic enclaves are juxtaposed with MgHbl-Tsch phenocrysts from andesitic host lavas. The results indicate that MgHbl-Tsch are crystallized from a cool, rhyolitic melt (800-900±50 °C, 70-77±5 wt % SiO2; Ridolfi & Renzulli 2012) with lower Rb and Sr and higher Pb, relative to a hot, andesitic-dacitic melt (950-1,000±50 °C; 60-70±5 wt % SiO2) where MgHst are crystallized. REE and Nb contents are similar in both types of melts despite higher REE and Nb in MgHbl-Tsch. Therefore, the REE compositional disparity between MgHst and MgHbl-Tsch is driven by the difference in the DREE, rather than the melt REE

  12. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…

  13. Building Regression Models: The Importance of Graphics.

    ERIC Educational Resources Information Center

    Dunn, Richard

    1989-01-01

    Points out reasons for using graphical methods to teach simple and multiple regression analysis. Argues that a graphically oriented approach has considerable pedagogic advantages in the exposition of simple and multiple regression. Shows that graphical methods may play a central role in the process of building regression models. (Author/LS)

  14. Molecular cytogenetic identification of a wheat-rye 1R addition line with multiple spikelets and resistance to powdery mildew.

    PubMed

    Yang, Wujuan; Wang, Changyou; Chen, Chunhuan; Wang, Yajuan; Zhang, Hong; Liu, Xinlun; Ji, Wanquan

    2016-04-01

    Alien addition lines are important for transferring useful genes from alien species into common wheat. Rye is an important and valuable gene resource for improving wheat disease resistance, yield, and environment adaptation. A new wheat-rye addition line, N9436B, was developed from the progeny of the cross of common wheat (Triticum aestivum L., 2n = 6x = 42, AABBDD) cultivar Shaanmai 611 and rye (Secale cereale L., 2n = 2x = 14, RR) accession Austrian rye. We characterized this new line by cytology, genomic in situ hybridization (GISH), fluorescence in situ hybridization (FISH), molecular markers, and disease resistance screening. N9436B was stable in morphology and cytology, with a chromosome composition of 2n = 42 + 2t = 22II. GISH investigations showed that this line contained two rye chromosomes. GISH, FISH, and molecular marker identification suggested that the introduced R chromosome and the missing wheat chromosome arms were 1R chromosome and 2DL chromosome arm, respectively. N9436B exhibited 30-37 spikelets per spike and a high level of resistance to powdery mildew (Blumeria graminis f. sp. tritici, Bgt) isolate E09 at the seedling stage. N9436B was cytologically stable, had the trait of multiple spikelets, and was resistant to powdery mildew; this line should thus be useful in wheat improvement.

  15. Preventing Return of Fear in an Animal Model of Anxiety: Additive Effects of Massive Extinction and Extinction in Multiple Contexts

    PubMed Central

    Laborda, Mario A.; Miller, Ralph R.

    2013-01-01

    Fear conditioning and experimental extinction have been presented as models of anxiety disorders and exposure therapy, respectively. Moreover, the return of fear serves as a model of relapse after exposure therapy. Here we present two experiments, with rats as subjects in a lick suppression preparation, in which we assessed the additive effects of two different treatments to attenuate the return of fear. First, we evaluated whether two phenomena known to generate return of fear (i.e., spontaneous recovery and renewal) summate to produce a stronger reappearance of extinguished fear. At test, rats evaluated outside the extinction context following a long delay after extinction (i.e., a delayed context shift) exhibited greater return of extinguished fear than rats evaluated outside the extinction context alone, but return of extinguished fear following a delayed context shift did not significantly differ from the return of fear elicited in rats tested following a long delay after extinction alone. Additionally, extinction in multiple contexts and a massive extinction treatment each attenuated the strong return of fear produced by a delayed context shift. Moreover, the conjoint action of these treatments was significantly more successful in preventing the reappearance of extinguished fear, suggesting that extensive cue exposure administered in several different therapeutic settings has the potential to reduce relapse after therapy for anxiety disorders, more than either manipulation alone. PMID:23611075

  16. The effectiveness of selected feed and water additives for reducing Salmonella spp. of public health importance in broiler chickens: a systematic review, meta-analysis, and meta-regression approach.

    PubMed

    Totton, Sarah C; Farrar, Ashley M; Wilkins, Wendy; Bucher, Oliver; Waddell, Lisa A; Wilhelm, Barbara J; McEwen, Scott A; Rajić, Andrijana

    2012-10-01

    Eating inappropriately prepared poultry meat is a major cause of foodborne salmonellosis. Our objectives were to determine the efficacy of feed and water additives (other than competitive exclusion and antimicrobials) on reducing Salmonella prevalence or concentration in broiler chickens using systematic review-meta-analysis and to explore sources of heterogeneity found in the meta-analysis through meta-regression. Six electronic databases were searched (Current Contents (1999-2009), Agricola (1924-2009), MEDLINE (1860-2009), Scopus (1960-2009), Centre for Agricultural Bioscience (CAB) (1913-2009), and CAB Global Health (1971-2009)), five topic experts were contacted, and the bibliographies of review articles and a topic-relevant textbook were manually searched to identify all relevant research. Study inclusion criteria comprised: English-language primary research investigating the effects of feed and water additives on the Salmonella prevalence or concentration in broiler chickens. Data extraction and study methodological assessment were conducted by two reviewers independently using pretested forms. Seventy challenge studies (n=910 unique treatment-control comparisons), seven controlled studies (n=154), and one quasi-experiment (n=1) met the inclusion criteria. Compared to an assumed control group prevalence of 44 of 1000 broilers, random-effects meta-analysis indicated that the Salmonella cecal colonization in groups with prebiotics (fructooligosaccharide, lactose, whey, dried milk, lactulose, lactosucrose, sucrose, maltose, mannanoligosaccharide) added to feed or water was 15 out of 1000 broilers; with lactose added to feed or water it was 10 out of 1000 broilers; with experimental chlorate product (ECP) added to feed or water it was 21 out of 1000. For ECP the concentration of Salmonella in the ceca was decreased by 0.61 log(10)cfu/g in the treated group compared to the control group. Significant heterogeneity (Cochran's Q-statistic p≤0.10) was observed
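
    For readers unfamiliar with the quantities named in this record, the following is a minimal sketch of Cochran's Q and a random-effects pooled estimate. The abstract does not say which between-study variance estimator was used; the DerSimonian-Laird estimator below is an assumption chosen for illustration, and the function names are hypothetical.

```python
def cochran_q(effects, variances):
    """Return Cochran's Q and the fixed-effect (inverse-variance) pooled estimate."""
    w = [1.0 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    return q, pooled


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 (assumed estimator)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    q, _ = cochran_q(effects, variances)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, truncated at 0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2, q
```

    Under homogeneity, Q is referred to a chi-squared distribution with k - 1 degrees of freedom. When Q exceeds k - 1, tau^2 is positive and the random-effects weights become more equal across studies than the fixed-effect weights.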

  17. Discounting of monetary rewards that are both delayed and probabilistic: delay and probability combine multiplicatively, not additively.

    PubMed

    Vanderveldt, Ariana; Green, Leonard; Myerson, Joel

    2015-01-01

    The value of an outcome is affected both by the delay until its receipt (delay discounting) and by the likelihood of its receipt (probability discounting). Despite being well-described by the same hyperboloid function, delay and probability discounting involve fundamentally different processes, as revealed, for example, by the differential effects of reward amount. Previous research has focused on the discounting of delayed and probabilistic rewards separately, with little research examining more complex situations in which rewards are both delayed and probabilistic. In 2 experiments, participants made choices between smaller rewards that were both immediate and certain and larger rewards that were both delayed and probabilistic. Analyses revealed significant interactions between delay and probability factors inconsistent with an additive model. In contrast, a hyperboloid discounting model in which delay and probability were combined multiplicatively provided an excellent fit to the data. These results suggest that the hyperboloid is a good descriptor of decision making in complicated monetary choice situations like those people encounter in everyday life.
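
    The multiplicative combination described in this record can be sketched as follows. The abstract does not give the equation; the form assumed here is the hyperboloid commonly used in the discounting literature, V = A / (1 + bX)^s, applied once to delay and once to the odds against receipt, (1 - p) / p. The parameter names (k, h, s_d, s_p) and function names are illustrative.

```python
def hyperboloid_factor(x, rate, exponent=1.0):
    """Discount factor 1 / (1 + rate * x) ** exponent (assumed hyperboloid form)."""
    return 1.0 / (1.0 + rate * x) ** exponent


def odds_against(p):
    """Odds against receiving the reward, (1 - p) / p, for 0 < p <= 1."""
    return (1.0 - p) / p


def multiplicative_value(amount, delay, p, k, h, s_d=1.0, s_p=1.0):
    """Subjective value when delay and probability discounting multiply."""
    return (amount
            * hyperboloid_factor(delay, k, s_d)
            * hyperboloid_factor(odds_against(p), h, s_p))
```

    With an immediate, certain reward (delay 0, p = 1) both factors equal 1 and the value is the nominal amount. Because the factors multiply, the proportional effect of probability is the same at every delay, which is what separates this form from an additive combination.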

  18. Discounting of Monetary Rewards that are Both Delayed and Probabilistic: Delay and Probability Combine Multiplicatively, not Additively

    PubMed Central

    Vanderveldt, Ariana; Green, Leonard; Myerson, Joel

    2014-01-01

    The value of an outcome is affected both by the delay until its receipt (delay discounting) and by the likelihood of its receipt (probability discounting). Despite being well-described by the same hyperboloid function, delay and probability discounting involve fundamentally different processes, as revealed, for example, by the differential effects of reward amount. Previous research has focused on the discounting of delayed and probabilistic rewards separately, with little research examining more complex situations in which rewards are both delayed and probabilistic. In two experiments, participants made choices between smaller rewards that were both immediate and certain and larger rewards that were both delayed and probabilistic. Analyses revealed significant interactions between delay and probability factors inconsistent with an additive model. In contrast, a hyperboloid discounting model in which delay and probability were combined multiplicatively provided an excellent fit to the data. These results suggest that the hyperboloid is a good descriptor of decision making in complicated monetary choice situations like those people encounter in everyday life. PMID:24933696

  19. Stochastic resonance in a piecewise nonlinear model driven by multiplicative non-Gaussian noise and additive white noise

    NASA Astrophysics Data System (ADS)

    Guo, Yongfeng; Shen, Yajun; Tan, Jianguo

    2016-09-01

    The phenomenon of stochastic resonance (SR) in a piecewise nonlinear model driven by a periodic signal and correlated noises for the cases of a multiplicative non-Gaussian noise and an additive Gaussian white noise is investigated. Applying the path integral approach, the unified colored noise approximation and the two-state model theory, the analytical expression of the signal-to-noise ratio (SNR) is derived. It is found that conventional stochastic resonance exists in this system. From numerical computations we obtain that: (i) As a function of the non-Gaussian noise intensity, the SNR is increased when the non-Gaussian noise deviation parameter q is increased. (ii) As a function of the Gaussian noise intensity, the SNR is decreased when q is increased. This demonstrates that the effect of the non-Gaussian noise on SNR is different from that of the Gaussian noise in this system. Moreover, we further discuss the effect of the correlation time of the non-Gaussian noise, cross-correlation strength, the amplitude and frequency of the periodic signal on SR.

  20. Confidence Intervals, Power Calculation, and Sample Size Estimation for the Squared Multiple Correlation Coefficient under the Fixed and Random Regression Models: A Computer Program and Useful Standard Tables.

    ERIC Educational Resources Information Center

    Mendoza, Jorge L.; Stafford, Karen L.

    2001-01-01

    Introduces a computer package written for Mathematica, the purpose of which is to perform a number of difficult iterative functions with respect to the squared multiple correlation coefficient under the fixed and random models. These functions include computation of the confidence interval upper and lower bounds, power calculation, calculation of…

  1. Verification of a New Biocompatible Single-Use Film Formulation with Optimized Additive Content for Multiple Bioprocess Applications

    PubMed Central

    Jurkiewicz, Elke; Husemann, Ute; Greller, Gerhard; Barbaroux, Magali; Fenge, Christel

    2014-01-01

    Single-use bioprocessing bags and bioreactors gained significant importance in the industry as they offer a number of advantages over traditional stainless steel solutions. However, there is continued concern that the plastic materials might release potentially toxic substances negatively impacting cell growth and product titers, or even compromise drug safety when using single-use bags for intermediate or drug substance storage. In this study, we have focused on the in vitro detection of potentially cytotoxic leachables originating from the recently developed new polyethylene (PE) multilayer film called S80. This new film was developed to guarantee biocompatibility for multiple bioprocess applications, for example, storage of process fluids, mixing, and cell culture bioreactors. For this purpose, we examined a protein-free cell culture medium that had been used to extract leachables from freshly gamma-irradiated sample bags in a standardized cell culture assay. We investigated sample bags from films generated to establish the operating ranges of the film extrusion process. Further, we studied sample bags of different age after gamma-irradiation and finally, we performed extended media extraction trials at cold room conditions using sample bags. In contrast to a nonoptimized film formulation, our data demonstrate no cytotoxic effect of the S80 polymer film formulation under any of the investigated conditions. The S80 film formulation is based on an optimized PE polymer composition and additive package. Full traceability alongside specifications and controls of all critical raw materials, and process controls of the manufacturing process, that is, film extrusion and gamma-irradiation, have been established to ensure lot-to-lot consistency. © 2014 American Institute of Chemical Engineers Biotechnol. Prog., 30:1171–1176, 2014 PMID:24850537

  2. An improved algorithm of temperature compensation for a near infrared multiple-acquisition system based on two-dimensional regression analysis.

    PubMed

    Yu, Xu-yao; An, Jia-bao; Yu, Hui; Shi, Yao; Deng, Yong; Zhou, Jia-lu; Xu, Ke-xin

    2015-08-01

    The near infrared (NIR) spectroscopy analytical technique is one of the most advanced and promising tools in many domains. NIR acquisition is easily influenced by temperature, which affects both qualitative and quantitative analyses. In this paper, a temperature compensation model was established between NIR signals and output voltage values based on two-dimensional regression analysis. The effectiveness of the proposed compensation scheme was experimentally demonstrated by the measurement of six superluminescent diode sources at 293-313 K. The coefficient of variation was decreased 2-fold with this compensation algorithm. After being applied to an acousto-optic-tunable-filter system, the results indicated that the algorithm was suitable for various NIR spectral acquisition systems, with lower complexity and a higher signal-to-noise ratio. PMID:26329222

  3. Multiple molecular and cellular changes associated with tumour stasis and regression during IL-12 therapy of a murine breast cancer model.

    PubMed

    Dias, S; Thomas, H; Balkwill, F

    1998-01-01

    IL-12 treatment of a murine transplantable breast carcinoma (HTH-K) led to tumour regression and cure which was related to the duration of treatment. We studied the sequential molecular and phenotypic changes in IL-12-treated tumours. IFN-gamma mRNA was detected 8 hr after the first treatment. mRNA expression for the IFN-gamma-inducible genes beta 2-microglobulin and indoleamine dioxygenase (IDO) was induced subsequently, together with the chemokine IP-10. IL-12-treated tumours had an abundant cellular infiltrate, consisting mainly of CD8+ T cells. mRNA for granzyme B and perforin also could be detected, suggesting that those cells were activated. After 7 days of daily therapy, tumours in IL-12-treated mice had a significant reduction in vasculature. Finally, the number of apoptotic tumour cells increased throughout IL-12 treatment. We compared the anti-tumour effects of IL-12 to those induced by IFN-gamma therapy, which caused initial tumour stasis but subsequent tumour progression. IFN-gamma induced beta 2-microglobulin and IDO over a 7-day period, but IP-10 was induced only transiently. IFN-gamma caused a lesser cellular infiltrate, a minor anti-angiogenic effect and a transient apoptotic effect. The success of IL-12 may be due to its ability to produce a distinct sequence of molecular and phenotypic changes in tumours, leading to an anti-tumour immune response, toxicity against tumour cells and an anti-angiogenic effect. Other cytokines, such as IFN-gamma, induce some, but not all, of these actions. Comparison of IL-12 and IFN-gamma suggests that sustained induction of IP-10 and activation of a resulting cellular infiltrate may be key changes in regressing tumours. PMID:9426704

  4. Regression: A Bibliography.

    ERIC Educational Resources Information Center

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  5. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate simple linear regression. The first is based on Galton's famous data set on heredity. We use the R command lm to obtain coefficient estimates, the standard error of the error, R2, residuals … In the second example, devoted to data on the vapor tension of mercury, we fit a simple linear regression, predict values, and look ahead to multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
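
    The quantities named in this abstract (coefficient estimates, standard error of the error, R2, residuals) can be sketched in a few lines. The record itself uses R's lm; the Python version below is only an illustrative stand-in, the function name simple_linear_regression is hypothetical, and the five data points are made up for the example (they are not Galton's data).

```python
import numpy as np

def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_mean, y_mean = x.mean(), y.mean()
    sxx = np.sum((x - x_mean) ** 2)
    # Closed-form least-squares estimates for one predictor.
    slope = np.sum((x - x_mean) * (y - y_mean)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = y - (intercept + slope * x)
    rss = np.sum(residuals ** 2)          # residual sum of squares
    tss = np.sum((y - y_mean) ** 2)       # total sum of squares
    sigma = np.sqrt(rss / (n - 2))        # residual standard error
    return {
        "intercept": intercept,
        "slope": slope,
        "sigma": sigma,
        "r2": 1.0 - rss / tss,
        "residuals": residuals,
    }

# Made-up illustration data (an assumption, not from the cited session).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
fit = simple_linear_regression(x, y)
print(fit["intercept"], fit["slope"], fit["sigma"], fit["r2"])
```

    Predicting a new value then amounts to evaluating fit["intercept"] + fit["slope"] * x_new.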

  6. A new three-dimensional magnetopause model with a support vector regression machine and a large database of multiple spacecraft observations

    NASA Astrophysics Data System (ADS)

    Wang, Y.; Sibeck, D. G.; Merka, J.; Boardsen, S. A.; Karimabadi, H.; Sipes, T. B.; Šafránková, J.; Jelínek, K.; Lin, R.

    2013-05-01

    We present results from a new three-dimensional empirical magnetopause model based on 15,089 magnetopause crossings from 23 spacecraft. To construct the model, we introduce a Support Vector Regression Machine (SVRM) technique with a systematic approach that balances model smoothness with fitting accuracy to produce a model that reveals the manner in which the size and shape of the magnetopause depend upon various control parameters without any assumptions concerning the analytical shape of the magnetopause. The new model fits the data used in the modeling very accurately, and can guarantee a similar accuracy when predicting unseen observations within the applicable range of control parameters. We introduce a new error analysis technique based upon the SVRM that enables us to obtain model errors appropriate to different locations and control parameters. We find significant east-west elongations in the magnetopause shape for many combinations of control parameters. Variations in the Earth's dipole tilt can cause significant magnetopause north/south asymmetries and deviation of the magnetopause nose from the Sun-Earth line nonlinearly by as much as 5 Re. Subsolar magnetopause erosion effect under southward IMF is seen which is strongly affected by solar wind dynamic pressure. Further, we find significant shrinking of high-latitude magnetopause with decreased magnetopause flaring angle during northward IMF.

  7. Dentist and practice characteristics associated with restorative treatment of enamel caries in permanent teeth: multiple-regression modeling of observational clinical data from The National Dental PBRN

    PubMed Central

    Fellows, Jeffrey L; Gordan, Valeria V.; Gilbert, Gregg H.; Rindal, D. Brad; Qvist, Vibeke; Litaker, Mark S.; Benjamin, Paul; Flink, Håkan; Pihlstrom, Daniel J.; Johnson, Neil

    2014-01-01

    Purpose: Current evidence in dentistry recommends non-surgical treatment to manage enamel caries lesions. However, surveyed practitioners report they would restore enamel lesions that are confined to the enamel. We used actual clinical data to evaluate patient, dentist, and practice characteristics associated with restoration of enamel caries, while accounting for other factors. Methods: We combined data from a National Dental Practice-Based Research Network observational study of consecutive restorations placed in previously unrestored permanent tooth surfaces and practice/demographic data from 229 participating network dentists. Analysis of variance and logistic regression, using generalized estimating equations (GEE) and variable selection within blocks, were used to test the hypothesis that patient, dentist, and practice characteristics were associated with variations in enamel restorations of occlusal and proximal caries compared to dentin lesions, accounting for dentist and patient clustering. Results: Network dentists from 5 regions placed 6,891 restorations involving occlusal and/or proximal caries lesions. Enamel restorations accounted for 16% of enrolled occlusal caries lesions and 6% of enrolled proximal caries lesions. Enamel occlusal restorations varied significantly (p<0.05) by patient age and race/ethnicity, dentist use of caries risk assessment, network region, and practice type. Enamel proximal restorations varied significantly (p<0.05) by dentist race/ethnicity, network region, and practice type. Clinical significance: Identifying patient, dentist, and practice characteristics associated with enamel caries restorations can guide strategies to improve provider adherence to evidence-based clinical recommendations. PMID:25000667

  8. Additive transgene expression and genetic introgression in multiple green-fluorescent protein transgenic crop x weed hybrid generations.

    PubMed

    Halfhill, M D; Millwood, R J; Weissinger, A K; Warwick, S I; Stewart, C N

    2003-11-01

    The level of transgene expression in crop x weed hybrids and the degree to which crop-specific genes are integrated into hybrid populations are important factors in assessing the potential ecological and agricultural risks of gene flow associated with genetic engineering. The average transgene zygosity and genetic structure of transgenic hybrid populations change with the progression of generations, and the green fluorescent protein (GFP) transgene is an ideal marker to quantify transgene expression in advancing populations. The homozygous T(1) single-locus insert GFP/ Bacillus thuringiensis (Bt) transgenic canola ( Brassica napus, cv Westar) with two copies of the transgene fluoresced twice as much as hemizygous individuals with only one copy of the transgene. These data indicate that the expression of the GFP gene was additive, and fluorescence could be used to determine zygosity status. Several hybrid generations (BC(1)F(1), BC(2)F(1)) were produced by backcrossing various GFP/Bt transgenic canola ( B. napus, cv Westar) and birdseed rape ( Brassica rapa) hybrid generations onto B. rapa. Intercrossed generations (BC(2)F(2) Bulk) were generated by crossing BC(2)F(1) individuals in the presence of a pollinating insect ( Musca domestica L.). The ploidy of plants in the BC(2)F(2) Bulk hybrid generation was identical to the weedy parental species, B. rapa. AFLP analysis was used to quantify the degree of B. napus introgression into multiple backcross hybrid generations with B. rapa. The F(1) hybrid generations contained 95-97% of the B. napus-specific AFLP markers, and each successive backcross generation demonstrated a reduction of markers resulting in the 15-29% presence in the BC(2)F(2) Bulk population. Average fluorescence of each successive hybrid generation was analyzed, and homozygous canola lines and hybrid populations that contained individuals homozygous for GFP (BC(2)F(2) Bulk) demonstrated significantly higher fluorescence than hemizygous hybrid

  10. Retro-regression--another important multivariate regression improvement.

    PubMed

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA. PMID:11410035

  11. Trends in Mathematics and Science Performance in 18 Countries: Multiple Regression Analysis of the Cohort Effects of TIMSS 1995-2007

    ERIC Educational Resources Information Center

    Hong, Hee Kyung

    2012-01-01

    The purpose of this study was to simultaneously examine relationships between teacher quality and instructional time and mathematics and science achievement of 8th grade cohorts in 18 advanced and developing economies. In addition, the study examined changes in mathematics and science performance across the two groups of economies over time using…

  12. Regression Analysis: Legal Applications in Institutional Research

    ERIC Educational Resources Information Center

    Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.

    2008-01-01

    This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…

  13. "Bunched Black Swans" in Complex Geosystems: Cross-Disciplinary Approaches to the Additive and Multiplicative Modelling of Correlated Extreme Bursts

    NASA Astrophysics Data System (ADS)

    Watkins, N. W.; Rypdal, M.; Lovsletten, O.

    2012-12-01

    -stationarity explicitly built in. In record breaking statistics, a record is defined in the sense used in everyday language, to be the largest value yet recorded in a time series, for example, the 2004 Sumatran Boxing Day earthquake was at the time the largest to be digitally recorded. The third group of approaches (e.g. avalanches) are explicitly spatiotemporal and so also include spatial structure. This presentation will discuss two examples of our recent work on the burst problem. We will show numerical results extending the preliminary results presented in [Watkins et al, PRE, 2009] using a standard additive model, linear fractional stable motion (LFSM). LFSM explicitly includes both heavy tails and long range dependence, allowing us to study how these 2 effects compete in determining the burst duration and size exponent probability distributions. We will contrast these simulations with new analytical studies of bursts in a multiplicative process, the multifractal random walk (MRW). We will present an analytical derivation for the scaling of the burst durations and make a preliminary comparison with data from the AE index from solar-terrestrial physics. We believe our result is more generally applicable than the MRW model, and that it applies to a broad class of multifractal processes.

  14. The Regression Trunk Approach to Discover Treatment Covariate Interaction

    ERIC Educational Resources Information Center

    Dusseldorp, Elise; Meulman, Jacqueline J.

    2004-01-01

    The regression trunk approach (RTA) is an integration of regression trees and multiple linear regression analysis. In this paper RTA is used to discover treatment covariate interactions, in the regression of one continuous variable on a treatment variable with "multiple" covariates. The performance of RTA is compared to the classical method of…

  15. Regression and Data Mining Methods for Analyses of Multiple Rare Variants in the Genetic Analysis Workshop 17 Mini-Exome Data

    PubMed Central

    Bailey-Wilson, Joan E.; Brennan, Jennifer S.; Bull, Shelley B; Culverhouse, Robert; Kim, Yoonhee; Jiang, Yuan; Jung, Jeesun; Li, Qing; Lamina, Claudia; Liu, Ying; Mägi, Reedik; Niu, Yue S.; Simpson, Claire L.; Wang, Libo; Yilmaz, Yildiz E.; Zhang, Heping; Zhang, Zhaogong

    2012-01-01

    Group 14 of Genetic Analysis Workshop 17 examined several issues related to analysis of complex traits using DNA sequence data. These issues included novel methods for analyzing rare genetic variants in an aggregated manner (often termed collapsing rare variants), evaluation of various study designs to increase power to detect effects of rare variants, and the use of machine learning approaches to model highly complex heterogeneous traits. Various published and novel methods for analyzing traits with extreme locus and allelic heterogeneity were applied to the simulated quantitative and disease phenotypes. Overall, we conclude that power is (as expected) dependent on locus-specific heritability or contribution to disease risk, large samples will be required to detect rare causal variants with small effect sizes, extreme phenotype sampling designs may increase power for smaller laboratory costs, methods that allow joint analysis of multiple variants per gene or pathway are more powerful in general than analyses of individual rare variants, population-specific analyses can be optimal when different subpopulations harbor private causal mutations, and machine learning methods may be useful for selecting subsets of predictors for follow-up in the presence of extreme locus heterogeneity and large numbers of potential predictors. PMID:22128066

  16. A Comparison of Seven Cox Regression-Based Models to Account for Heterogeneity Across Multiple HIV Treatment Cohorts in Latin America and the Caribbean

    PubMed Central

    Giganti, Mark J.; Luz, Paula M.; Caro-Vega, Yanink; Cesar, Carina; Padgett, Denis; Koenig, Serena; Echevarria, Juan; McGowan, Catherine C.; Shepherd, Bryan E.

    2015-01-01

    Many studies of HIV/AIDS aggregate data from multiple cohorts to improve power and generalizability. There are several analysis approaches to account for cross-cohort heterogeneity; we assessed how different approaches can impact results from an HIV/AIDS study investigating predictors of mortality. Using data from 13,658 HIV-infected patients starting antiretroviral therapy from seven Latin American and Caribbean cohorts, we illustrate the assumptions of seven readily implementable approaches to account for across cohort heterogeneity with Cox proportional hazards models, and we compare hazard ratio estimates across approaches. As a sensitivity analysis, we modify cohort membership to generate specific heterogeneity conditions. Hazard ratio estimates varied slightly between the seven analysis approaches, but differences were not clinically meaningful. Adjusted hazard ratio estimates for the association between AIDS at treatment initiation and death varied from 2.00 to 2.20 across approaches that accounted for heterogeneity; the adjusted hazard ratio was estimated as 1.73 in analyses that ignored across cohort heterogeneity. In sensitivity analyses with more extreme heterogeneity, we noted a slightly greater distinction between approaches. Despite substantial heterogeneity between cohorts, the impact of the specific approach to account for heterogeneity was minimal in our case study. Our results suggest that it is important to account for across cohort heterogeneity in analyses, but that the specific technique for addressing heterogeneity may be less important. Because of their flexibility in accounting for cohort heterogeneity, we prefer stratification or meta-analysis methods, but we encourage investigators to consider their specific study conditions and objectives. PMID:25647087

  17. A Comparison of Seven Cox Regression-Based Models to Account for Heterogeneity Across Multiple HIV Treatment Cohorts in Latin America and the Caribbean.

    PubMed

    Giganti, Mark J; Luz, Paula M; Caro-Vega, Yanink; Cesar, Carina; Padgett, Denis; Koenig, Serena; Echevarria, Juan; McGowan, Catherine C; Shepherd, Bryan E

    2015-05-01

    Many studies of HIV/AIDS aggregate data from multiple cohorts to improve power and generalizability. There are several analysis approaches to account for cross-cohort heterogeneity; we assessed how different approaches can impact results from an HIV/AIDS study investigating predictors of mortality. Using data from 13,658 HIV-infected patients starting antiretroviral therapy from seven Latin American and Caribbean cohorts, we illustrate the assumptions of seven readily implementable approaches to account for across cohort heterogeneity with Cox proportional hazards models, and we compare hazard ratio estimates across approaches. As a sensitivity analysis, we modify cohort membership to generate specific heterogeneity conditions. Hazard ratio estimates varied slightly between the seven analysis approaches, but differences were not clinically meaningful. Adjusted hazard ratio estimates for the association between AIDS at treatment initiation and death varied from 2.00 to 2.20 across approaches that accounted for heterogeneity; the adjusted hazard ratio was estimated as 1.73 in analyses that ignored across cohort heterogeneity. In sensitivity analyses with more extreme heterogeneity, we noted a slightly greater distinction between approaches. Despite substantial heterogeneity between cohorts, the impact of the specific approach to account for heterogeneity was minimal in our case study. Our results suggest that it is important to account for across cohort heterogeneity in analyses, but that the specific technique for addressing heterogeneity may be less important. Because of their flexibility in accounting for cohort heterogeneity, we prefer stratification or meta-analysis methods, but we encourage investigators to consider their specific study conditions and objectives.

  18. Joint regression analysis and AMMI model applied to oat improvement

    NASA Astrophysics Data System (ADS)

    Oliveira, A.; Oliveira, T. A.; Mejza, S.

    2012-09-01

    In our work we present an application of some biometrical methods useful in genotype stability evaluation, namely the AMMI model, Joint Regression Analysis (JRA) and multiple comparison tests. A genotype stability analysis of oat (Avena sativa L.) grain yield was carried out using data from the Portuguese Plant Breeding Board for a sample of 22 different genotypes during the years 2002, 2003 and 2004 in six locations. Ferreira et al. (2006) state the relevance of regression models and of the Additive Main Effects and Multiplicative Interactions (AMMI) model for studying and estimating phenotypic stability effects. As computational techniques we use the Zigzag algorithm to estimate the regression coefficients and the agricolae package available in R for AMMI model analysis.

  19. Cactus: An Introduction to Regression

    ERIC Educational Resources Information Center

    Hyde, Hartley

    2008-01-01

    When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…

  20. Tactics for modeling multiple salivary analyte data in relation to behavior problems: Additive, ratio, and interaction effects.

    PubMed

    Chen, Frances R; Raine, Adrian; Granger, Douglas A

    2015-01-01

    Individual differences in the psychobiology of the stress response have been linked to behavior problems in youth yet most research has focused on single signaling molecules released by either the hypothalamic-pituitary-adrenal axis or the autonomic nervous system. As our understanding about biobehavioral relationships develops it is clear that multiple signals from the biological stress systems work in coordination to affect behavior problems. Questions are raised as to whether coordinated effects should be statistically represented as ratio or interactive terms. We address this knowledge gap by providing a theoretical overview of the concepts and rationales, and illustrating the analytical tactics. Salivary samples collected from 446 youth aged 11-12 were assayed for salivary alpha-amylase (sAA), dehydroepiandrosterone-sulfate (DHEA-s) and cortisol. Coordinated effect of DHEA-s and cortisol, and coordinated effect of sAA and cortisol on externalizing and internalizing problems (Child Behavior Checklist) were tested with the ratio and the interaction approaches using multi-group path analysis. Findings consistent with previous studies include a positive association between cortisol/DHEA-s ratio and internalizing problems; and a negative association between cortisol and externalizing problems conditional on low levels of sAA. This study highlights the importance of matching analytical strategy with research hypothesis when integrating salivary bioscience into research in behavior problems. Recommendations are made for investigating multiple salivary analytes in relation to behavior problems. PMID:25462892
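
    The contrast between a ratio term and an interaction term can be made concrete with two design matrices. The sketch below uses plain ordinary least squares rather than the multi-group path analysis used in the study, and the variables a and b are simulated stand-ins for two analytes (not the study's salivary data); the function names are hypothetical.

```python
import numpy as np

def design_ratio(a, b):
    """Intercept plus a single ratio term a / b."""
    return np.column_stack([np.ones_like(a), a / b])

def design_interaction(a, b):
    """Intercept, centered main effects, and their product."""
    a_c, b_c = a - a.mean(), b - b.mean()
    return np.column_stack([np.ones_like(a), a_c, b_c, a_c * b_c])

def fit(X, y):
    """Least-squares coefficients and residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    return coef, rss

# Simulated, strictly positive analyte levels (an assumption for illustration).
rng = np.random.default_rng(1)
a = rng.uniform(1.0, 3.0, size=100)
b = rng.uniform(1.0, 3.0, size=100)
a_c, b_c = a - a.mean(), b - b.mean()
# Outcome generated from an interaction model without noise.
y = 1.0 + 0.5 * a_c - 0.3 * b_c - 0.8 * a_c * b_c

coef_int, rss_int = fit(design_interaction(a, b), y)
coef_ratio, rss_ratio = fit(design_ratio(a, b), y)
print(coef_int, rss_int, rss_ratio)
```

    Because the outcome here was generated from the interaction form, only that specification reproduces it exactly; the ratio model imposes a different functional shape, which is the sense in which the analytical strategy has to match the hypothesis.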

  2. Formation of peptides from amino acids by single or multiple additions of ATP to suspensions of nucleoproteinoid microparticles

    NASA Technical Reports Server (NTRS)

    Nakashima, T.; Fox, S. W.

    1981-01-01

    The synthesis of peptides from individual amino acids or pairs of amino acids and ATP in the presence of catalysis by nucleoproteinoid microparticles is investigated. Experiments were performed with suspensions formed from the condensation of lysine-rich and acidic proteinoids with polyadenylic acid, to which were added glycine, phenylalanine, proline, lysine or glycine-phenylalanine mixtures, and ATP either at once or serially. Peptide yields are found to be greatest for equal amounts of acidic and basic proteinoids. The addition of imidazole is found to alter the preference of glycine-phenylalanine mixtures to form mixed heteropeptides rather than homopeptides. A rapid ATP decay in the peptide synthesis reaction is observed, and a greater yield is obtained for repeated small additions than for a single addition of ATP. The experimental system has properties similar to modern cells, and represents an organizational unit ready for the evolution of associated biochemical pathways.

  3. Wrong Signs in Regression Coefficients

    NASA Technical Reports Server (NTRS)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
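
    One of the listed causes, a missing variable, can be reproduced with simulated data. The sketch below is an illustration under stated assumptions: the predictors x1 and x2, their coefficients (+1 and -3), and the helper ols are all invented for the example and do not come from the paper.

```python
import numpy as np

def ols(columns, y):
    """Least-squares coefficients; an intercept column is added first."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # x2 is strongly correlated with x1
y = 1.0 * x1 - 3.0 * x2              # true effect of x1 is positive

full = ols([x1, x2], y)   # both predictors included
short = ols([x1], y)      # x2 omitted

print("full model coefficients: ", full)
print("x1 coefficient without x2:", short[1])
```

    With x2 left out, the x1 coefficient absorbs most of the negative effect of its correlated partner and turns negative, even though its true effect is positive.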

  4. Multiple predictor smoothing methods for sensitivity analysis.

    SciTech Connect

    Helton, Jon Craig; Storlie, Curtis B.

    2006-08-01

    The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present.
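
    The first technique in the list, locally weighted regression, can be sketched for a single predictor as follows. This is a minimal local-linear version with tricube weights, not the stepwise multi-predictor procedure of the report; the function name loess_point, the frac parameter, and the noisy sine data are assumptions made for illustration, and the code assumes distinct x values.

```python
import numpy as np

def loess_point(x0, x, y, frac=0.5):
    """Local-linear LOESS estimate at x0 using tricube weights.

    frac is the fraction of the data used as the local neighborhood.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    k = max(3, int(np.ceil(frac * n)))        # neighborhood size
    d = np.abs(x - x0)
    h = np.sort(d)[k - 1]                     # distance to the k-th nearest point
    w = np.clip(1.0 - (d / h) ** 3, 0.0, 1.0) ** 3   # tricube weights
    # Weighted least squares on (1, x - x0); the intercept is the fit at x0.
    X = np.column_stack([np.ones(n), x - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0]

# Simulated nonlinear input-output relationship (illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=x.size)
smooth = np.array([loess_point(x0, x, y, frac=0.3) for x0 in x])
print(smooth[:5])
```

    A global straight line fitted to these data would miss most of the relationship between x and y, while the local fits follow it; this is the kind of nonlinear input-output dependence the abstract refers to.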

  5. Predicting Counselor Effectiveness: A Multiple Regression Approach.

    ERIC Educational Resources Information Center

    Mendoza, Buena Flor H.

    This study attempted to determine whether counselor effectiveness designated by a high level of performance in a first counseling practicum as ranked by faculty supervisors, can be predicted with a knowledge of the extent to which the individual possesses the personal qualities of open-mindedness, tolerance for ambiguity, general mental health,…

  6. Combined action of time-delay and colored cross-associated multiplicative and additive noises on stability and stochastic resonance for a stochastic metapopulation system

    NASA Astrophysics Data System (ADS)

    Wang, Kang-Kang; Zong, De-Cai; Wang, Ya-Jun; Li, Sheng-Hong

    2016-05-01

    In this paper, the transition between the stable state of a big density and the extinction state and stochastic resonance (SR) for a time-delayed metapopulation system disturbed by colored cross-correlated noises are investigated. By applying the fast descent method, the small time-delay approximation and McNamara and Wiesenfeld's SR theory, we investigate the impacts of time-delay, the multiplicative, additive noises and colored cross-correlated noise on the SNR and the shift between the two states of the system. Numerical results show that the multiplicative, additive noises and time-delay can all speed up the transition from the stable state to the extinction state, while the correlation noise and its correlation time can slow down the extinction process of the population system. With respect to SNR, the multiplicative noise always weakens the SR effect, while noise correlation time plays a dual role in motivating the SR phenomenon. Meanwhile, time-delay mainly plays a negative role in stimulating the SR phenomenon. Conversely, it could motivate the SR effect to increase the strength of the cross-correlation noise in the SNR-β plot, while the increase of additive noise intensity will firstly excite SR, and then suppress the SR effect.

  7. Survival Data and Regression Models

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time models (AFT), which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results of ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.

  8. Interquantile Shrinkage in Regression Models

    PubMed Central

    Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.

    2012-01-01

    Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of common features is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods will shrink the slopes towards constant and thus improve the estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimations with competitive or higher efficiency than the standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546

  9. LRGS: Linear Regression by Gibbs Sampling

    NASA Astrophysics Data System (ADS)

    Mantz, Adam B.

    2016-02-01

    LRGS (Linear Regression by Gibbs Sampling) implements a Gibbs sampler to solve the problem of multivariate linear regression with uncertainties in all measured quantities and intrinsic scatter. LRGS extends an algorithm by Kelly (2007) that used Gibbs sampling for performing linear regression in fairly general cases in two ways: generalizing the procedure for multiple response variables, and modeling the prior distribution of covariates using a Dirichlet process.

  10. Multiple Stressors in Agricultural Streams: A Mesocosm Study of Interactions among Raised Water Temperature, Sediment Addition and Nutrient Enrichment

    PubMed Central

    Piggott, Jeremy J.; Lange, Katharina; Townsend, Colin R.; Matthaei, Christoph D.

    2012-01-01

    Changes to land use affect streams through nutrient enrichment, increased inputs of sediment and, where riparian vegetation has been removed, raised water temperature. We manipulated all three stressors in experimental streamside channels for 30 days and determined the individual and pair-wise combined effects on benthic invertebrate and algal communities and on leaf decay, a measure of ecosystem functioning. We added nutrients (phosphorus+nitrogen; high, intermediate, natural) and/or sediment (grain size 0.2 mm; high, intermediate, natural) to 18 channels supplied with water from a nearby stream. Temperature was increased by 1.4°C in half the channels, simulating the loss of upstream and adjacent riparian shade. Sediment affected 93% of all biological response variables (either as an individual effect or via an interaction with another stressor) generally in a negative manner, while nutrient enrichment affected 59% (mostly positive) and raised temperature 59% (mostly positive). More of the algal components of the community responded to stressors acting individually than did invertebrate components, whereas pair-wise stressor interactions were more common in the invertebrate community. Stressors interacted often and in a complex manner, with interactions between sediment and temperature most common. Thus, the negative impact of high sediment on taxon richness of both algae and invertebrates was stronger at raised temperature, further reducing biodiversity. In addition, the decay rate of leaf material (strength loss) accelerated with nutrient enrichment at ambient but not at raised temperature. A key implication of our findings for resource managers is that the removal of riparian shading from streams already subjected to high sediment inputs, or land-use changes that increase erosion or nutrient runoff in a landscape without riparian buffers, may have unexpected effects on stream health. 
    We highlight the likely importance of intact or restored buffer strips, both…

  12. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  13. NCCS Regression Test Harness

    SciTech Connect

    Tharrington, Arnold N.

    2015-09-09

    The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.

  15. Fully Regressive Melanoma

    PubMed Central

    Ehrsam, Eric; Kallini, Joseph R.; Lebas, Damien; Modiano, Philippe; Cotten, Hervé

    2016-01-01

    Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. All of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis in a patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis. PMID:27672418

  16. Research of the Additional Losses Occurring in Optical Fiber at its Multiple Bends in the Range Waves 1310nm, 1550nm and 1625nm Long

    NASA Astrophysics Data System (ADS)

    Yurchenko, A. V.; Gorlov, N. I.; Alkina, A. D.; Mekhtiev, A. D.; Kovtun, A. A.

    2016-01-01

    This article investigates the additional losses that occur in optical fiber subjected to multiple bends at wavelengths of 1310 nm, 1550 nm and 1625 nm. It is aimed at developing methods for estimating and eliminating the negative influence of external factors. An automated method for calculating bend losses is developed. The results are used by engineers of “Kazaktelekom” AS to determine losses under service conditions in practice. The Wolfram|Alpha environment, a knowledge base and a set of computing algorithms, was chosen for modeling. The greatest losses are observed at wavelengths of 1310 nm and 1625 nm. All dependences are nonlinear. Losses with each successive bend are multiplicative.

  17. Hybrid fuzzy regression with trapezoidal fuzzy data

    NASA Astrophysics Data System (ADS)

    Razzaghnia, T.; Danesh, S.; Maleki, A.

    2011-12-01

    This research deals with a method for hybrid fuzzy least-squares regression. The extension of symmetric triangular fuzzy coefficients to asymmetric trapezoidal fuzzy coefficients is considered as an effective measure for removing unnecessary fuzziness from the linear fuzzy model. First, a trapezoidal fuzzy variable is applied to derive a bivariate regression model. Next, normal equations are formulated to solve for the four parts of the hybrid regression coefficients. The model is also extended to multiple regression analysis. Finally, the method is compared with Y-H.O. Chang's model.

  18. Improved Regression Calibration

    ERIC Educational Resources Information Center

    Skrondal, Anders; Kuha, Jouni

    2012-01-01

    The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…

  19. Comparing Predictors in Multivariate Regression Models: An Extension of Dominance Analysis

    ERIC Educational Resources Information Center

    Azen, Razia; Budescu, David V.

    2006-01-01

    Dominance analysis (DA) is a method used to compare the relative importance of predictors in multiple regression. DA determines the dominance of one predictor over another by comparing their additional R[squared] contributions across all subset models. In this article DA is extended to multivariate models by identifying a minimal set of criteria…
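
    The subset-model comparison described above can be sketched for the ordinary multiple regression case. This is a minimal illustration of one summary, general dominance (the average additional R-squared of each predictor across subset sizes), on synthetic data; the function names and data are assumptions made for this sketch, and the multivariate extension proposed in the article is not implemented.

```python
from itertools import combinations

import numpy as np


def r_squared(X, y, cols):
    """R-squared of an intercept-plus-subset linear model (0 for the empty subset)."""
    if not cols:
        return 0.0
    design = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def general_dominance(X, y):
    """Average additional R-squared of each predictor, averaged within then across subset sizes."""
    p = X.shape[1]
    scores = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        per_size = []
        for size in range(p):
            gains = [r_squared(X, y, s + (j,)) - r_squared(X, y, s)
                     for s in combinations(others, size)]
            per_size.append(np.mean(gains))
        scores[j] = np.mean(per_size)
    return scores


# Synthetic example: the third predictor is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=200)

scores = general_dominance(X, y)
full_r2 = r_squared(X, y, (0, 1, 2))
print("general dominance:", np.round(scores, 3), "full R^2:", round(full_r2, 3))
```

    A useful check on such a sketch is that the general dominance values sum to the R-squared of the full model.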

  20. Regression problems for magnitudes

    NASA Astrophysics Data System (ADS)

    Castellaro, S.; Mulargia, F.; Kagan, Y. Y.

    2006-06-01

    Least-squares linear regression is so popular that it is sometimes applied without checking whether its basic requirements are satisfied. In particular, in studying earthquake phenomena, the conditions (a) that the uncertainty on the independent variable is at least one order of magnitude smaller than the one on the dependent variable, (b) that both data and uncertainties are normally distributed and (c) that residuals are constant are at times disregarded. This may easily lead to wrong results. As an alternative to least squares, when the ratio between errors on the independent and the dependent variable can be estimated, orthogonal regression can be applied. We test the performance of orthogonal regression in its general form against Gaussian and non-Gaussian data and error distributions and compare it with standard least-square regression. General orthogonal regression is found to be superior or equal to the standard least squares in all the cases investigated and its use is recommended. We also compare the performance of orthogonal regression versus standard regression when, as often happens in the literature, the ratio between errors on the independent and the dependent variables cannot be estimated and is arbitrarily set to 1. We apply these results to magnitude scale conversion, which is a common problem in seismology, with important implications in seismic hazard evaluation, and analyse it through specific tests. Our analysis concludes that the commonly used standard regression may induce systematic errors in magnitude conversion as high as 0.3-0.4, and, even more importantly, this can introduce apparent catalogue incompleteness, as well as a heavy bias in estimates of the slope of the frequency-magnitude distributions. All this can be avoided by using the general orthogonal regression in magnitude conversions.
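
    The contrast between standard and orthogonal regression can be sketched with the textbook Deming estimator, in which the ratio of error variances enters explicitly. This is not necessarily the authors' exact formulation; the function name, the synthetic "magnitude" data and the noise levels are assumptions made for illustration.

```python
import numpy as np


def deming_fit(x, y, error_ratio=1.0):
    """Slope and intercept when both x and y carry measurement error.

    error_ratio = var(error in y) / var(error in x); 1.0 gives orthogonal
    regression. Assumes the sample covariance of x and y is non-zero.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.mean((x - x.mean()) ** 2)
    syy = np.mean((y - y.mean()) ** 2)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    diff = syy - error_ratio * sxx
    slope = (diff + np.sqrt(diff ** 2 + 4.0 * error_ratio * sxy ** 2)) / (2.0 * sxy)
    return slope, y.mean() - slope * x.mean()


# Synthetic example: two noisy magnitude scales measuring the same events.
rng = np.random.default_rng(0)
truth = np.linspace(4.0, 7.0, 200)
x = truth + rng.normal(scale=0.3, size=truth.size)
y = 0.5 + 0.9 * truth + rng.normal(scale=0.3, size=truth.size)

ols_slope = np.polyfit(x, y, 1)[0]                      # ignores error in x
orth_slope, orth_intercept = deming_fit(x, y, error_ratio=1.0)
print("least-squares slope:", round(ols_slope, 3))
print("orthogonal slope:   ", round(orth_slope, 3))
```

    Because ordinary least squares attributes all error to the dependent variable, its slope is attenuated when the independent variable is also noisy; the orthogonal fit treats the two variables symmetrically.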

  1. A Gibbs sampler for multivariate linear regression

    NASA Astrophysics Data System (ADS)

    Mantz, Adam B.

    2016-04-01

    Kelly described an efficient algorithm, using Gibbs sampling, for performing linear regression in the fairly general case where non-zero measurement errors exist for both the covariates and response variables, where these measurements may be correlated (for the same data point), where the response variable is affected by intrinsic scatter in addition to measurement error, and where the prior distribution of covariates is modelled by a flexible mixture of Gaussians rather than assumed to be uniform. Here, I extend the Kelly algorithm in two ways. First, the procedure is generalized to the case of multiple response variables. Secondly, I describe how to model the prior distribution of covariates using a Dirichlet process, which can be thought of as a Gaussian mixture where the number of mixture components is learned from the data. I present an example of multivariate regression using the extended algorithm, namely fitting scaling relations of the gas mass, temperature, and luminosity of dynamically relaxed galaxy clusters as a function of their mass and redshift. An implementation of the Gibbs sampler in the R language, called LRGS, is provided.

  2. Mapping geogenic radon potential by regression kriging.

    PubMed

    Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos

    2016-02-15

    Radon ((222)Rn) gas is produced in the radioactive decay chain of uranium ((238)U) which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. PMID:26706761
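
    The two-part structure of RK described above is commonly written as follows. The notation is an assumption made for this note rather than the authors' own: q_k are the auxiliary covariates (with q_0 = 1 for the intercept), the beta-hat terms are the fitted regression coefficients, e(s_i) is the regression residual at measured site s_i, and the lambda_i are kriging weights derived from the residual variogram.

```latex
\[
\hat{z}(s_0)
  = \underbrace{\sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_0)}_{\text{regression (deterministic) part}}
  + \underbrace{\sum_{i=1}^{n} \lambda_i \, e(s_i)}_{\text{kriged residual (stochastic) part}},
\qquad
e(s_i) = z(s_i) - \sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_i).
\]
```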

  4. Multivariate Regression with Calibration*

    PubMed Central

    Liu, Han; Wang, Lie; Zhao, Tuo

    2014-01-01

    We propose a new method named calibrated multivariate regression (CMR) for fitting high dimensional multivariate regression models. Compared to existing methods, CMR calibrates the regularization for each regression task with respect to its noise level so that it is simultaneously tuning insensitive and achieves an improved finite-sample performance. Computationally, we develop an efficient smoothed proximal gradient algorithm which has a worst-case iteration complexity O(1/ε), where ε is a pre-specified numerical accuracy. Theoretically, we prove that CMR achieves the optimal rate of convergence in parameter estimation. We illustrate the usefulness of CMR by thorough numerical simulations and show that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR on a brain activity prediction problem and find that CMR is as competitive as the handcrafted model created by human experts. PMID:25620861

  5. Metamorphic geodesic regression.

    PubMed

    Hong, Yi; Joshi, Sarang; Sanchez, Mar; Styner, Martin; Niethammer, Marc

    2012-01-01

    We propose a metamorphic geodesic regression approach approximating spatial transformations for image time-series while simultaneously accounting for intensity changes. Such changes occur for example in magnetic resonance imaging (MRI) studies of the developing brain due to myelination. To simplify computations we propose an approximate metamorphic geodesic regression formulation that only requires pairwise computations of image metamorphoses. The approximated solution is an appropriately weighted average of initial momenta. To obtain initial momenta reliably, we develop a shooting method for image metamorphosis.

  6. Use of probabilistic weights to enhance linear regression myoelectric control

    NASA Astrophysics Data System (ADS)

    Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.

    2015-12-01

    Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.

  7. Regression Calibration with Heteroscedastic Error Variance

    PubMed Central

    Spiegelman, Donna; Logan, Roger; Grove, Douglas

    2011-01-01

    The problem of covariate measurement error with heteroscedastic measurement error variance is considered. Standard regression calibration assumes that the measurement error has a homoscedastic measurement error variance. An estimator is proposed to correct regression coefficients for covariate measurement error with heteroscedastic variance. Point and interval estimates are derived. Validation data containing the gold standard must be available. This estimator is a closed-form correction of the uncorrected primary regression coefficients, which may be of logistic or Cox proportional hazards model form, and is closely related to the version of regression calibration developed by Rosner et al. (1990). The primary regression model can include multiple covariates measured without error. The use of these estimators is illustrated in two data sets, one taken from occupational epidemiology (the ACE study) and one taken from nutritional epidemiology (the Nurses’ Health Study). In both cases, although there was evidence of moderate heteroscedasticity, there was little difference in estimation or inference using this new procedure compared to standard regression calibration. It is shown theoretically that unless the relative risk is large or measurement error severe, standard regression calibration approximations will typically be adequate, even with moderate heteroscedasticity in the measurement error model variance. In a detailed simulation study, standard regression calibration performed either as well as or better than the new estimator. When the disease is rare and the errors normally distributed, or when measurement error is moderate, standard regression calibration remains the method of choice. PMID:22848187

  8. Multiple Logistic Regression Analysis of Risk Factors Associated with Denture Plaque and Staining in Chinese Removable Denture Wearers over 40 Years Old in Xi’an – a Cross-Sectional Study

    PubMed Central

    Chai, Zhiguo; Chen, Jihua; Zhang, Shaofeng

    2014-01-01

    Background. Removable dentures are subject to plaque and/or staining problems. Denture hygiene habits and risk factors differ among countries and regions. The aims of this study were to assess hygiene habits and denture plaque and staining risk factors in Chinese removable denture wearers aged >40 years in Xi’an through multiple logistic regression analysis (MLRA). Methods. Questionnaires were administered to 222 patients whose removable dentures were examined clinically to assess wear status and levels of plaque and staining. Univariate analyses were performed to identify potential risk factors for denture plaque/staining. MLRA was performed to identify significant risk factors. Results. Brushing (77.93%) was the most prevalent cleaning method in the present study. Only 16.4% of patients regularly used commercial cleansers. Most (81.08%) patients removed their dentures overnight. MLRA indicated that potential risk factors for denture plaque were the duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 4.155, P = 0.001; >5 years: OR = 7.238, P<0.001) and cleaning method (reference, chemical cleanser; running water: OR = 7.081, P = 0.010; brushing: OR = 3.567, P = 0.005). Potential risk factors for denture staining were female gender (OR = 0.377, P = 0.013), smoking (OR = 5.471, P = 0.031), tea consumption (OR = 3.957, P = 0.002), denture scratching (OR = 4.557, P = 0.036), duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 7.899, P = 0.001; >5 years: OR = 27.226, P<0.001), and cleaning method (reference, chemical cleanser; running water: OR = 29.184, P<0.001; brushing: OR = 4.236, P = 0.007). Conclusion. Denture hygiene habits need further improvement. An understanding of the risk factors for denture plaque and staining may provide the basis for preventive efforts. PMID:24498369
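
    For readers less familiar with how the reported odds ratios relate to a logistic regression fit, the standard relationship is sketched below. The symbols (p for the probability of the outcome, x_j for the predictors, beta_j for their coefficients) are generic notation assumed for this note, not taken from the study.

```latex
\[
\log\frac{p}{1-p} = \beta_0 + \sum_{j} \beta_j x_j ,
\qquad
\mathrm{OR}_j = e^{\beta_j}.
\]
% Worked reading of two reported values:
%   OR = 7.238 (plaque, use > 5 years)  corresponds to  \beta = \ln 7.238 \approx 1.98
%   OR = 0.377 (staining, female)       corresponds to  \beta = \ln 0.377 \approx -0.98
```

    An odds ratio above 1 therefore corresponds to a positive coefficient (higher odds than the reference category), and an odds ratio below 1 to a negative coefficient.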

  9. Using Leverage and Influence to Introduce Regression Diagnostics.

    ERIC Educational Resources Information Center

    Hoaglin, David C.

    1988-01-01

    Techniques for teaching linear regression are provided. Discussed are leverage and the hat matrix in simple regression, residuals, the notion of leaving out each observation individually, and use of this to study influence on fitted values and to define residuals. Finally, corresponding diagnostics for multiple regression are discussed. (MNS)
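
    The quantities named above can be computed directly for a simple regression. The sketch below uses invented data with one deliberately extreme x value; it forms the hat matrix, reads the leverages off its diagonal, and uses the standard identity that the leave-one-out (deleted) residual equals the ordinary residual divided by one minus the leverage.

```python
import numpy as np

# Invented data; the last point sits far from the others in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 12.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 9.0])

X = np.column_stack([np.ones_like(x), x])      # design matrix with intercept
H = X @ np.linalg.inv(X.T @ X) @ X.T           # hat matrix: fitted = H @ y

leverage = np.diag(H)                          # h_ii, one value per observation
resid = y - H @ y                              # ordinary residuals
deleted_resid = resid / (1.0 - leverage)       # residuals with point i left out

for i in range(len(y)):
    print(f"obs {i}: leverage={leverage[i]:.3f} "
          f"residual={resid[i]:+.3f} deleted residual={deleted_resid[i]:+.3f}")
```

    The leverages sum to the number of fitted coefficients (two here), and the observation with the extreme x value carries the largest leverage, so its deleted residual differs most from its ordinary residual.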

  10. Latent Regression Analysis.

    PubMed

    Tarpey, Thaddeus; Petkova, Eva

    2010-07-01

    Finite mixture models have come to play a very prominent role in modelling data. The finite mixture model is predicated on the assumption that distinct latent groups exist in the population. The finite mixture model therefore is based on a categorical latent variable that distinguishes the different groups. Often in practice distinct sub-populations do not actually exist. For example, disease severity (e.g. depression) may vary continuously and therefore, a distinction of diseased and not-diseased may not be based on the existence of distinct sub-populations. Thus, what is needed is a generalization of the finite mixture's discrete latent predictor to a continuous latent predictor. We cast the finite mixture model as a regression model with a latent Bernoulli predictor. A latent regression model is proposed by replacing the discrete Bernoulli predictor by a continuous latent predictor with a beta distribution. Motivation for the latent regression model arises from applications where distinct latent classes do not exist, but instead individuals vary according to a continuous latent variable. The shapes of the beta density are very flexible and can approximate the discrete Bernoulli distribution. Examples and a simulation are provided to illustrate the latent regression model. In particular, the latent regression model is used to model placebo effect among drug treated subjects in a depression study. PMID:20625443

  11. Semiparametric Regression Pursuit.

    PubMed

    Huang, Jian; Wei, Fengrong; Ma, Shuangge

    2012-10-01

    The semiparametric partially linear model allows flexible modeling of covariate effects on the response variable in regression. It combines the flexibility of nonparametric regression and parsimony of linear regression. The most important assumption in the existing methods for the estimation in this model is to assume a priori that it is known which covariates have a linear effect and which do not. However, in applied work, this is rarely known in advance. We consider the problem of estimation in the partially linear models without assuming a priori which covariates have linear effects. We propose a semiparametric regression pursuit method for identifying the covariates with a linear effect. Our proposed method is a penalized regression approach using a group minimax concave penalty. Under suitable conditions we show that the proposed approach is model-pursuit consistent, meaning that it can correctly determine which covariates have a linear effect and which do not with high probability. The performance of the proposed method is evaluated using simulation studies, which support our theoretical results. A real data example is used to illustrate the application of the proposed method. PMID:23559831

  12. [Understanding logistic regression].

    PubMed

    El Sanharawi, M; Naudet, F

    2013-10-01

    Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors likely to influence it (explanatory variables). The choice of explanatory variables that should be included in the logistic regression model is based on prior knowledge of the disease pathophysiology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.

  13. Logistic regression: a brief primer.

    PubMed

    Stoltzfus, Jill C

    2011-10-01

    Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). 
The resulting logistic regression model
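
    The events-per-variable rule of thumb quoted in this abstract comes down to simple arithmetic. A minimal sketch follows, assuming the common convention that the number of "events" is the size of the less frequent outcome class; the function name and the counts are hypothetical and do not come from the article.

```python
def events_per_variable(n_events, n_nonevents, n_predictors):
    """Events per variable (EPV): the rarer outcome class divided by the number of predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return min(n_events, n_nonevents) / n_predictors

# Hypothetical study: 60 events, 340 non-events, 8 candidate predictors.
epv = events_per_variable(60, 340, 8)
meets_lower_bound = epv >= 10   # lower end of the 10 to 20 range quoted in the abstract
```

    With these made-up counts the ratio is 7.5, below the lower bound, which would suggest either fewer predictors or more data before fitting the model.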

  14. Practical Session: Logistic Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate logistic regression. It investigates the different risk factors for the occurrence of coronary heart disease. The exercise was proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  15. An Effect Size for Regression Predictors in Meta-Analysis

    ERIC Educational Resources Information Center

    Aloe, Ariel M.; Becker, Betsy Jane

    2012-01-01

    A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as r[subscript sp], is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…
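
    The semipartial correlation described in this abstract can be computed from two ordinary least-squares fits. The sketch below is illustrative only: it assumes the usual identity that the squared semipartial correlation of a predictor equals the drop in R-squared when that predictor is removed, and the function names and simulated data are hypothetical rather than taken from the article.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]
    resid = y - X1 @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def semipartial_r(X, y, j):
    """Semipartial correlation of predictor j with y, signed like its regression slope."""
    gain = r_squared(X, y) - r_squared(np.delete(X, j, axis=1), y)
    X1 = np.column_stack([np.ones(len(y)), X])
    slope_j = np.linalg.lstsq(X1, y, rcond=None)[0][j + 1]
    return np.sign(slope_j) * np.sqrt(max(gain, 0.0))

# Simulated data with two correlated predictors (hypothetical values).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.6 * x1 + rng.normal(size=200)
y = 1.0 + 0.8 * x1 - 0.5 * x2 + rng.normal(size=200)
X = np.column_stack([x1, x2])
r_sp = [semipartial_r(X, y, j) for j in range(X.shape[1])]
```

    The same value results from correlating the outcome with the part of the predictor that is left after regressing it on the other predictors, which gives a useful cross-check.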

  16. Modelling of filariasis in East Java with Poisson regression and generalized Poisson regression models

    NASA Astrophysics Data System (ADS)

    Darnah

    2016-04-01

    Poisson regression is used when the response variable is count data that follow the Poisson distribution. The Poisson distribution assumes equidispersion, that is, a variance equal to the mean. In practice, count data are often overdispersed or underdispersed, which makes Poisson regression inappropriate: it may underestimate the standard errors and overstate the significance of the regression parameters, and consequently give misleading inference about them. This paper suggests the generalized Poisson regression model for handling overdispersion and underdispersion in the Poisson regression model. The Poisson regression model and the generalized Poisson regression model are applied to the number of filariasis cases in East Java. Based on the Poisson regression model, the factors influencing filariasis are the percentage of families who do not practice clean and healthy living and the percentage of families who do not have a healthy house. Because the Poisson regression model exhibits overdispersion, generalized Poisson regression is used. The best generalized Poisson regression model shows that the factor influencing filariasis is the percentage of families who do not have a healthy house. The model is interpreted as follows: each additional 1 percent of families who do not have a healthy house adds 1 filariasis patient.
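
    One common way to check the equidispersion assumption discussed here is the Pearson dispersion statistic of a fitted Poisson model. The sketch below is an assumption-laden illustration, not the paper's procedure: it fits a log-link Poisson regression by Newton-Raphson on simulated data, and the function names, seed, and coefficients are hypothetical. It does not implement the generalized Poisson model itself.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Poisson regression with a log link, fitted by Newton-Raphson (equivalently IRLS)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X1.shape[1])
    beta[0] = np.log(y.mean())                  # start at the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X1 @ beta)
        # Newton step: (X' W X)^{-1} X' (y - mu) with W = diag(mu)
        step = np.linalg.solve(X1.T @ (mu[:, None] * X1), X1.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta, np.exp(X1 @ beta)

def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square over residual degrees of freedom; about 1 under equidispersion."""
    return np.sum((y - mu) ** 2 / mu) / (len(y) - n_params)

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
mean = np.exp(0.5 + 0.3 * x)
y_pois = rng.poisson(mean)                                               # equidispersed counts
y_over = rng.poisson(mean * rng.gamma(shape=2.0, scale=0.5, size=2000))  # gamma-mixed counts

phi_pois = pearson_dispersion(y_pois, fit_poisson(x, y_pois)[1], 2)
phi_over = pearson_dispersion(y_over, fit_poisson(x, y_over)[1], 2)
```

    A statistic well above 1, as for the gamma-mixed counts, signals overdispersion; a value well below 1 would signal underdispersion. Either case is the situation in which the abstract recommends moving beyond the plain Poisson model.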

  17. Explorations in Statistics: Regression

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas

    2011-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection.…

  18. Modern Regression Discontinuity Analysis

    ERIC Educational Resources Information Center

    Bloom, Howard S.

    2012-01-01

    This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…

  19. Mechanisms of neuroblastoma regression

    PubMed Central

    Brodeur, Garrett M.; Bagatell, Rochelle

    2014-01-01

    Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179

  20. A New Sample Size Formula for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.; Barcikowski, Robert S.

    The focus of this research was to determine the efficacy of a new method of selecting sample sizes for multiple linear regression. A Monte Carlo simulation was used to study both empirical predictive power rates and empirical statistical power rates of the new method and seven other methods: those of C. N. Park and A. L. Dudycha (1974); J. Cohen…

  1. Commonality Analysis for the Regression Case.

    ERIC Educational Resources Information Center

    Murthy, Kavita

    Commonality analysis is a procedure for decomposing the coefficient of determination (R superscript 2) in multiple regression analyses into the percent of variance in the dependent variable associated with each independent variable uniquely, and the proportion of explained variance associated with the common effects of predictors in various…
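
    For two predictors, the decomposition described in this abstract reduces to three differences of R-squared values. The following sketch assumes that two-predictor case only; the function names and the simulated data are hypothetical and are not drawn from the paper.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]
    resid = y - X1 @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def commonality_two_predictors(x1, x2, y):
    """Split the full-model R^2 into two unique components and one common component."""
    r2_1 = r_squared(x1, y)
    r2_2 = r_squared(x2, y)
    r2_12 = r_squared(np.column_stack([x1, x2]), y)
    return {
        "unique_x1": r2_12 - r2_2,      # variance explained only by x1
        "unique_x2": r2_12 - r2_1,      # variance explained only by x2
        "common": r2_1 + r2_2 - r2_12,  # variance shared by x1 and x2
        "total": r2_12,
    }

rng = np.random.default_rng(3)
x1 = rng.normal(size=300)
x2 = 0.7 * x1 + rng.normal(size=300)
y = x1 + x2 + rng.normal(size=300)
parts = commonality_two_predictors(x1, x2, y)
```

    The three components sum to the full-model R-squared by construction. The common component can be negative when suppression is present, so it is best read as a bookkeeping term rather than a proportion.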

  2. Using Regression Analysis: A Guided Tour.

    ERIC Educational Resources Information Center

    Shelton, Fred Ames

    1987-01-01

    Discusses the use and interpretation of multiple regression analysis with computer programs and presents a flow chart of the process. A general explanation of the flow chart is provided, followed by an example showing the development of a linear equation which could be used in estimating manufacturing overhead cost. (Author/LRW)

  3. Ridge Regression Signal Processing

    NASA Technical Reports Server (NTRS)

    Kuhl, Mark R.

    1990-01-01

    The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
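
    The basic ridge principle referred to in this abstract can be shown with the batch closed-form estimator. This is a sketch under stated assumptions: it is the textbook estimator (X'X + kI)^{-1} X'y, not the linearized recursive ridge estimator derived in the report, and the nearly collinear data, the ridge constant k, and the function name are hypothetical.

```python
import numpy as np

def ridge(X, y, k):
    """Closed-form ridge estimate (X'X + k I)^{-1} X'y; k = 0 gives ordinary least squares."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Two nearly collinear columns stand in for a poorly conditioned ("poor geometry") problem.
rng = np.random.default_rng(4)
x1 = rng.normal(size=100)
x2 = x1 + 0.05 * rng.normal(size=100)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=100)

beta_ols = ridge(X, y, 0.0)
beta_ridge = ridge(X, y, 10.0)
```

    Increasing k shrinks the coefficient vector toward zero, trading a small bias for a large reduction in variance when the columns of X are close to linearly dependent.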

  4. Different sleep onset criteria at the multiple sleep latency test (MSLT): an additional marker to differentiate central nervous system (CNS) hypersomnias.

    PubMed

    Pizza, Fabio; Vandi, Stefano; Detto, Stefania; Poli, Francesca; Franceschini, Christian; Montagna, Pasquale; Plazzi, Giuseppe

    2011-03-01

    Excessive daytime sleepiness (EDS) has different correlates in non-rapid eye movement (NREM) [idiopathic hypersomnia (IH) without long sleep time] and REM sleep [narcolepsy without cataplexy (NwoC) and narcolepsy with cataplexy (NC)]-related hypersomnias of central origin. We analysed sleep onset characteristics at the multiple sleep latency test (MSLT) applying simultaneously two sleep onset criteria in 44 NC, seven NwoC and 16 IH consecutive patients referred for subjective EDS complaint. Sleep latency (SL) at MSLT was assessed both as the time elapsed to the occurrence of a single epoch of sleep Stage 1 NREM (SL) and of unequivocal sleep [three sleep Stage 1 NREM epochs or any other sleep stage epoch, sustained SL (SusSL)]. Idiopathic hypersomnia patients showed significantly (P<0.0001) longer SusSL than SL (7.7±2.5 versus 5.6±1.3 min, respectively) compared to NwoC (5.8±2.5 versus 5.3±2.2 min) and NC patients (4.1±3 versus 3.9±3 min). A mean difference threshold between SusSL and SL ≥27 s reached a diagnostic value to discriminate IH versus NC and NwoC sufferers (sensitivity 88%; specificity 82%). Moreover, NC patients showed better subjective sleepiness perception than NwoC and IH cases in the comparison between naps with or without sleep occurrence. Simultaneous application of the two widely used sleep onset criteria differentiates IH further from NC and NwoC patients: IH fluctuate through a wake-Stage 1 NREM sleep state before the onset of sustained sleep, while NC and NwoC shift abruptly into a sustained sleep. The combination of SusSL and SL determination at MSLT should be tested as an additional objective differential criterion for EDS disorders.

  5. Effect of sucrose availability on wheel-running as an operant and as a reinforcing consequence on a multiple schedule: Additive effects of extrinsic and automatic reinforcement.

    PubMed

    Belke, Terry W; Pierce, W David

    2015-07-01

    As a follow-up to Belke and Pierce's (2014) study, we assessed the effects of repeated presentation and removal of sucrose solution on the behavior of rats responding on a two-component multiple schedule. Rats completed 15 wheel turns (FR 15) for either 15% or 0% sucrose solution in the manipulated component and lever pressed 10 times on average (VR 10) for an opportunity to complete 15 wheel turns (FR 15) in the other component. In contrast to our earlier study, the components advanced based on time (every 8 min) rather than completed responses. Results showed that in the manipulated component wheel-running rates were higher and the latency to initiate running longer when sucrose was present (15%) compared to absent (0% or water); the number of obtained outcomes (sucrose/water), however, did not differ with the presentation and withdrawal of sucrose. For the wheel-running as reinforcement component, rates of wheel turns, overall lever-pressing rates, and obtained wheel-running reinforcements were higher, and postreinforcement pauses shorter, when sucrose was present (15%) than absent (0%) in the manipulated component. Overall, our findings suggest that wheel-running rate regardless of its function (operant or reinforcement) is maintained by automatically generated consequences (automatic reinforcement) and is increased as an operant by adding experimentally arranged sucrose reinforcement (extrinsic reinforcement). This additive effect on operant wheel-running generalizes through induction or arousal to the wheel-running as reinforcement component, increasing the rate of responding for opportunities to run and the rate of wheel-running per opportunity.

  6. Regression modeling of ground-water flow

    USGS Publications Warehouse

    Cooley, R.L.; Naff, R.L.

    1985-01-01

    Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)

  7. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
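
    Both fits named in this abstract have short closed forms. The sketch below is a hedged illustration rather than the article's own presentation: it takes the major axis to be the line minimizing squared perpendicular distances, obtained from the leading singular vector of the centred data, and the reduced major axis slope to be sign(r) * s_y / s_x. Function names and data are hypothetical.

```python
import numpy as np

def major_axis(x, y):
    """Slope and intercept of the line minimising squared perpendicular distances."""
    centred = np.column_stack([x - x.mean(), y - y.mean()])
    # The first right singular vector points along the direction of greatest variance.
    direction = np.linalg.svd(centred, full_matrices=False)[2][0]
    slope = direction[1] / direction[0]          # assumes the fitted line is not vertical
    return slope, y.mean() - slope * x.mean()

def reduced_major_axis(x, y):
    """Slope sign(r) * s_y / s_x with the intercept passing through the centroid."""
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * y.std(ddof=1) / x.std(ddof=1)
    return slope, y.mean() - slope * x.mean()

x = np.arange(5.0)
y = 2.0 * x + 1.0            # points lying exactly on y = 2x + 1
ma_slope, ma_intercept = major_axis(x, y)
rma_slope, rma_intercept = reduced_major_axis(x, y)
```

    On data that fall exactly on a line the two methods agree with ordinary least squares; they diverge once both variables carry scatter, because each treats the error in x differently.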

  8. Regression Segmentation for M³ Spinal Images.

    PubMed

    Wang, Zhijie; Zhen, Xiantong; Tay, KengYeow; Osman, Said; Romano, Walter; Li, Shuo

    2015-08-01

    Clinical routine often requires analyzing spinal images of multiple anatomic structures in multiple anatomic planes from multiple imaging modalities (M(3)). Unfortunately, existing methods for segmenting spinal images are still limited to one specific structure, in one specific plane or from one specific modality (S(3)). In this paper, we propose a novel approach, Regression Segmentation, that is for the first time able to segment M(3) spinal images in one single unified framework. This approach formulates the segmentation task innovatively as a boundary regression problem: modeling a highly nonlinear mapping function from substantially diverse M(3) images directly to desired object boundaries. Leveraging the advancement of sparse kernel machines, regression segmentation is fulfilled by a multi-dimensional support vector regressor (MSVR) which operates in an implicit, high dimensional feature space where M(3) diversity and specificity can be systematically categorized, extracted, and handled. The proposed regression segmentation approach was thoroughly tested on images from 113 clinical subjects including both disc and vertebral structures, in both sagittal and axial planes, and from both MRI and CT modalities. The overall result reaches a high dice similarity index (DSI) of 0.912 and a low boundary distance (BD) of 0.928 mm. With our unified and expandable framework, an efficient clinical tool for M(3) spinal image segmentation can be easily achieved, and will substantially benefit the diagnosis and treatment of spinal diseases.

  9. Structural regression trees

    SciTech Connect

    Kramer, S.

    1996-12-31

    In many real-world domains the task of machine learning algorithms is to learn a theory for predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with nondeterminate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems. SRT integrates the statistical method of regression trees into ILP. It constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that the advantages are not at the expense of predictive accuracy.

  10. Spontaneous hypnotic age regression: case report.

    PubMed

    Spiegel, D; Rosenfeld, A

    1984-12-01

    Age regression--reliving the past as though it were occurring in the present, with age appropriate vocabulary, mental content, and affect--can occur with instruction in highly hypnotizable individuals, but has rarely been reported to occur spontaneously, especially as a primary symptom. The psychiatric presentation and treatment of a 16-year-old girl with spontaneous age regressions accessible and controllable with hypnosis and psychotherapy are described. Areas of overlap and divergence between this patient's symptoms and those found in patients with hysterical fugue and multiple personality syndrome are also discussed.

  11. Influence of Al³⁺ addition on the flocculation and sedimentation of activated sludge: comparison of single and multiple dosing patterns.

    PubMed

    Wen, Yue; Zheng, Wanlin; Yang, Yundi; Cao, Asheng; Zhou, Qi

    2015-05-15

    In this study, the flocculation and sedimentation performance of activated sludge (AS) with single and multiple dosing of trivalent aluminum (Al(3+)) were studied. The AS samples were cultivated in sequencing batch reactors at 22 °C. The dosages of Al(3+) were 0.00, 0.125, 0.5, 1.0, and 1.5 meq/L for single dosing, and 0.1 meq/L for multiple dosing. Under single dosing conditions, as Al(3+) dosage increased, the zeta potential, total interaction energy, and effluent turbidity decreased, whereas the sludge volume index (SVI) increased, indicating that single Al(3+) dosing could enhance sludge flocculation, but deteriorate sedimentation. By comparison, adding an equal amount of Al(3+) through multiple dosing achieved a similar reduction in turbidity, but the zeta potential was higher, while the loosely bound extracellular polymeric substances (LB-EPS) content and SVI remarkably declined. Although the difference in the flocculation performances between the two dosing patterns was not significant, the underlying mechanisms were quite distinct: the interaction energy played a more important role under single dosing conditions, whereas multiple dosing was more effective in reducing the EPS content. Multiple dosing, which allows sufficient time for sludge restructuring and floc aggregation, could simultaneously optimize sludge flocculation and sedimentation.

  12. CSWS-related autistic regression versus autistic regression without CSWS.

    PubMed

    Tuchman, Roberto

    2009-08-01

    Continuous spike-waves during slow-wave sleep (CSWS) and Landau-Kleffner syndrome (LKS) are two clinical epileptic syndromes that are associated with the electroencephalography (EEG) pattern of electrical status epilepticus during slow wave sleep (ESES). Autistic regression occurs in approximately 30% of children with autism and is associated with an epileptiform EEG in approximately 20%. The behavioral phenotypes of CSWS, LKS, and autistic regression overlap. However, the differences in age of regression, degree and type of regression, and frequency of epilepsy and EEG abnormalities suggest that these are distinct phenotypes. CSWS with autistic regression is rare, as is autistic regression associated with ESES. The pathophysiology and as such the treatment implications for children with CSWS and autistic regression are distinct from those with autistic regression without CSWS.

  13. Spatial vulnerability assessments by regression kriging

    NASA Astrophysics Data System (ADS)

    Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor

    2016-04-01

    information representing IEW or GRP forming environmental factors were taken into account to support the spatial inference of the locally experienced IEW frequency and measured GRP values respectively. An efficient spatial prediction methodology was applied to construct reliable maps, namely regression kriging (RK) using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. First, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component and are interpolated by kriging. The final map is the sum of the two component predictions. Application of RK also provides the possibility of inherent accuracy assessment. The resulting maps are characterized by global and local measures of their accuracy. Additionally, the method enables interval estimation for spatial extension of the areas of predefined risk categories. All of these outputs provide a useful contribution to spatial planning, action planning and decision making. Acknowledgement: Our work was partly supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).

  14. ANALYSIS OF THE MOTOR NEUROTOXICITY INDUCED BY ACUTE ORAL EXPOSURE TO MULTIPLE PYRETHROID COMPOUNDS IN THE RAT USING AN ADDITIVITY MODEL.

    EPA Science Inventory

    Use of pyrethroids has increased in the last decade, and co-exposure to multiple pyrethroids has been reported in humans. Pyrethroids produce neurotoxicity in mammals at dosages far below those producing lethality. The Food Quality Protection Act requires the EPA to consider cumu...

  15. Multiple Sclerosis

    MedlinePlus

    ... NINDS Multiple Sclerosis Information Page Condensed from Multiple Sclerosis: Hope Through ... What is Multiple Sclerosis? An unpredictable disease of the central nervous system, ...

  16. Genetic Programming Transforms in Linear Regression Situations

    NASA Astrophysics Data System (ADS)

    Castillo, Flor; Kordon, Arthur; Villa, Carlos

    The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transforms selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.

  17. Quantile Regression in the Study of Developmental Sciences

    ERIC Educational Resources Information Center

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of…
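
    What distinguishes quantile regression from least squares is its loss function. The sketch below illustrates that loss in the simplest possible setting, an intercept-only model, where minimizing it picks out a sample quantile instead of the mean. It is an illustration under that assumption; the function names and the toy data are hypothetical and not from the article.

```python
import numpy as np

def check_loss(residuals, tau):
    """Quantile-regression ('pinball') loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    u = np.asarray(residuals, dtype=float)
    return np.sum(u * (tau - (u < 0)))

def best_constant(y, tau):
    """Data value that minimises the check loss, i.e. an intercept-only quantile fit."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau) for c in y]
    return y[int(np.argmin(losses))]

y = [1, 2, 3, 4, 100]
median_fit = best_constant(y, 0.5)   # 3.0, unaffected by the outlying value
upper_fit = best_constant(y, 0.9)    # 100.0, the fit moves to the upper tail
```

    Replacing the constant with a linear function of the predictors and minimizing the same loss for several values of tau gives the set of fits across the outcome distribution that the abstract describes.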

  18. Regression Commonality Analysis: A Technique for Quantitative Theory Building

    ERIC Educational Resources Information Center

    Nimon, Kim; Reio, Thomas G., Jr.

    2011-01-01

    When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

  19. Least-Squares Linear Regression and Schrodinger's Cat: Perspectives on the Analysis of Regression Residuals.

    ERIC Educational Resources Information Center

    Hecht, Jeffrey B.

    The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one independent)…

  20. Wild bootstrap for quantile regression.

    PubMed

    Feng, Xingdong; He, Xuming; Hu, Jianhua

    2011-12-01

    The existing theory of the wild bootstrap has focused on linear estimators. In this note, we broaden its validity by providing a class of weight distributions that is asymptotically valid for quantile regression estimators. As most weight distributions in the literature lead to biased variance estimates for nonlinear estimators of linear regression, we propose a modification of the wild bootstrap that admits a broader class of weight distributions for quantile regression. A simulation study on median regression is carried out to compare various bootstrap methods. With a simple finite-sample correction, the wild bootstrap is shown to account for general forms of heteroscedasticity in a regression model with fixed design points.
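
    For orientation, the classical wild bootstrap for a least-squares fit, the linear-estimator case this abstract starts from, can be sketched as follows. This is not the modified procedure the authors propose for quantile regression: it assumes ordinary least squares with Rademacher (plus or minus one) weights, and the function name, seed, and simulated heteroscedastic data are hypothetical.

```python
import numpy as np

def wild_bootstrap_se(x, y, n_boot=500, seed=0):
    """OLS coefficients and wild-bootstrap standard errors using Rademacher weights."""
    rng = np.random.default_rng(seed)
    X1 = np.column_stack([np.ones(len(y)), x])
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]
    fitted = X1 @ beta
    resid = y - fitted
    draws = np.empty((n_boot, X1.shape[1]))
    for b in range(n_boot):
        w = rng.choice([-1.0, 1.0], size=len(y))   # random sign flip per observation
        y_star = fitted + w * resid                # design points stay fixed
        draws[b] = np.linalg.lstsq(X1, y_star, rcond=None)[0]
    return beta, draws.std(axis=0, ddof=1)

# Simulated data whose error spread grows with x (heteroscedastic by construction).
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 3.0, size=200)
y = 1.0 + 2.0 * x + x * rng.normal(size=200)
beta, se = wild_bootstrap_se(x, y)
```

    Because each residual stays attached to its own design point and only its sign is randomized, the resampled data keep the observation-specific error scale, which is what lets the method cope with heteroscedasticity.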

  2. Direct comparison between genomic constitution and flavonoid contents in Allium multiple alien addition lines reveals chromosomal locations of genes related to biosynthesis from dihydrokaempferol to quercetin glucosides in scaly leaf of shallot (Allium cepa L.).

    PubMed

    Masuzaki, S; Shigyo, M; Yamauchi, N

    2006-02-01

    The extrachromosome 5A of shallot (Allium cepa L., genomes AA) has an important role in flavonoid biosynthesis in the scaly leaf of Allium fistulosum-shallot monosomic addition lines (FF+nA). This study deals with the production and biochemical characterisation of A. fistulosum-shallot multiple alien addition lines carrying at least 5A to determine the chromosomal locations of genes for quercetin formation. The multiple alien additions were selected from the crossing between allotriploid FFA (female symbol) and A. fistulosum (male symbol). The 113 plants obtained from this cross were analysed by a chromosome 5A-specific PGI isozyme marker of shallot. Thirty plants were preliminarily selected for an alien addition carrying 5A. The chromosome numbers of the 30 plants varied from 18 to 23. The other extrachromosomes in 19 plants were completely identified by using seven other chromosome markers of shallot. High-performance liquid chromatography analyses of the 19 multiple additions were conducted to identify the flavonoid compounds produced in the scaly leaves. Direct comparisons between the chromosomal constitution and the flavonoid contents of the multiple alien additions revealed that a flavonoid 3'-hydroxylase (F3'H) gene for the synthesis of quercetin from kaempferol was located on 7A and that an anonymous gene involved in the glucosidation of quercetin was on 3A or 4A. As a result of supplemental SCAR analyses by using genomic DNAs from two complete sets of A. fistulosum-shallot monosomic additions, we have assigned F3'H to 7A and flavonol synthase to 4A. PMID:16411131

  3. Representation of exposures in regression analysis and interpretation of regression coefficients: basic concepts and pitfalls.

    PubMed

    Leffondré, Karen; Jager, Kitty J; Boucquemont, Julie; Stel, Vianda S; Heinze, Georg

    2014-10-01

    Regression models are being used to quantify the effect of an exposure on an outcome, while adjusting for potential confounders. While the type of regression model to be used is determined by the nature of the outcome variable, e.g. linear regression has to be applied for continuous outcome variables, all regression models can handle any kind of exposure variables. However, some fundamentals of representation of the exposure in a regression model and also some potential pitfalls have to be kept in mind in order to obtain meaningful interpretation of results. The objective of this educational paper was to illustrate these fundamentals and pitfalls, using various multiple regression models applied to data from a hypothetical cohort of 3000 patients with chronic kidney disease. In particular, we illustrate how to represent different types of exposure variables (binary, categorical with two or more categories and continuous), and how to interpret the regression coefficients in linear, logistic and Cox models. We also discuss the linearity assumption in these models, and show how wrongly assuming linearity may produce biased results and how flexible modelling using spline functions may provide better estimates.

  4. The comparison of robust partial least squares regression with robust principal component regression on a real

    NASA Astrophysics Data System (ADS)

    Polat, Esra; Gunay, Suleyman

    2013-10-01

    One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes overestimation of the regression parameters and increases the variance of these parameters. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; then the dependent variables are regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of the RPCR and RSIMPLS methods on an econometric data set by comparing the two methods on an inflation model of Turkey. The considered methods are compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.

  5. Evaluating differential effects using regression interactions and regression mixture models

    PubMed Central

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing results to those obtained using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903

  6. Censored partial regression.

    PubMed

    Orbe, Jesus; Ferreira, Eva; Núñez-Antón, Vicente

    2003-01-01

    In this work we study the effect of several covariates on a censored response variable with unknown probability distribution. A semiparametric model is proposed to consider situations where the functional form of the effect of one or more covariates is unknown, as is the case in the application presented in this work. We provide its estimation procedure and, in addition, a bootstrap technique to make inference on the parameters. A simulation study has been carried out to show the good performance of the proposed estimation process and to analyse the effect of the censorship. Finally, we present the results when the methodology is applied to AIDS diagnosed patients.

  7. Linear regression in astronomy. II

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  8. Quantile regression for climate data

    NASA Astrophysics Data System (ADS)

    Marasinghe, Dilhani Shalika

    Quantile regression is a developing statistical tool which is used to explain the relationship between response and predictor variables. This thesis describes two examples of climatology using quantile regression. Our main goal is to estimate derivatives of a conditional mean and/or conditional quantile function. We introduce a method to handle autocorrelation in the framework of quantile regression and use it with the temperature data. We also explain some properties of the tornado data, which are non-normally distributed. Even though quantile regression provides a more comprehensive view, when the residuals satisfy the normality and constant variance assumptions, we would prefer least squares regression for our temperature analysis. When dealing with non-normality and non-constant variance, quantile regression is a better candidate for the estimation of the derivative.
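
    As a rough illustration of the idea behind quantile regression, the sketch below covers only the intercept-only special case: the value that minimizes the check (pinball) loss at level tau is the sample tau-quantile. The function names and the temperature values are hypothetical and are not taken from the thesis.

```python
def pinball_loss(q, ys, tau):
    """Average check (pinball) loss of candidate quantile q at level tau."""
    total = 0.0
    for y in ys:
        u = y - q
        # Points above q are weighted by tau, points below by (1 - tau).
        total += u * tau if u >= 0 else u * (tau - 1.0)
    return total / len(ys)


def sample_quantile(ys, tau):
    """Return the observed value that minimizes the pinball loss."""
    return min(sorted(ys), key=lambda q: pinball_loss(q, ys, tau))


# Hypothetical temperature anomalies, for illustration only.
temps = [-1.2, 0.3, 0.8, 1.1, 1.9, 2.4, 7.5]
median = sample_quantile(temps, 0.5)
upper = sample_quantile(temps, 0.9)
```

    A full quantile regression would replace the single candidate q with a linear predictor and minimize the same loss over its coefficients.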

  9. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  10. Addition of low concentrations of an ionic liquid to a base oil reduces friction over multiple length scales: a combined nano- and macrotribology investigation.

    PubMed

    Li, Hua; Somers, Anthony E; Howlett, Patrick C; Rutland, Mark W; Forsyth, Maria; Atkin, Rob

    2016-03-01

    The efficacy of ionic liquids (ILs) as lubricant additives to a model base oil has been probed at the nanoscale and macroscale as a function of IL concentration using the same materials. Silica surfaces lubricated with mixtures of the IL trihexyl(tetradecyl)phosphonium bis(2,4,4-trimethylpentyl)phosphinate and hexadecane are probed using atomic force microscopy (AFM) (nanoscale) and ball-on-disc tribometer (macroscale). At both length scales the pure IL is a much more effective lubricant than hexadecane. At the nanoscale, 2.0 mol% IL (and above) in hexadecane lubricates the silica as well as the pure IL due to the formation of a robust IL boundary layer that separates the sliding surfaces. At the macroscale the lubrication is highly load dependent; at low loads all the mixtures lubricate as effectively as the pure IL, whereas at higher loads rather high concentrations are required to provide IL like lubrication. Wear is also pronounced at high loads, for all cases except the pure IL, and a tribofilm is formed. Together, the nano- and macroscales results reveal that the IL is an effective lubricant additive - it reduces friction - in both the boundary regime at the nanoscale and mixed regime at the macroscale.

  11. Assessing risk factors for periodontitis using regression

    NASA Astrophysics Data System (ADS)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.

  12. Can luteal regression be reversed?

    PubMed Central

    Telleria, Carlos M

    2006-01-01

    The corpus luteum is an endocrine gland whose limited lifespan is hormonally programmed. This debate article summarizes findings of our research group that challenge the principle that the end of function of the corpus luteum or luteal regression, once triggered, cannot be reversed. Overturning luteal regression by pharmacological manipulations may be of critical significance in designing strategies to improve fertility efficacy. PMID:17074090

  13. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  14. Wild bootstrap for quantile regression.

    PubMed

    Feng, Xingdong; He, Xuming; Hu, Jianhua

    2011-12-01

    The existing theory of the wild bootstrap has focused on linear estimators. In this note, we broaden its validity by providing a class of weight distributions that is asymptotically valid for quantile regression estimators. As most weight distributions in the literature lead to biased variance estimates for nonlinear estimators of linear regression, we propose a modification of the wild bootstrap that admits a broader class of weight distributions for quantile regression. A simulation study on median regression is carried out to compare various bootstrap methods. With a simple finite-sample correction, the wild bootstrap is shown to account for general forms of heteroscedasticity in a regression model with fixed design points. PMID:23049133
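
    For orientation, the sketch below shows the classical wild bootstrap for an ordinary least-squares slope, that is, the linear case the abstract starts from, not the modified scheme the authors propose for quantile regression. The Rademacher weights, function names and simulated data are assumptions made for illustration.

```python
import random
import statistics


def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def wild_bootstrap_slope_se(xs, ys, n_boot=500, seed=0):
    """Standard error of the OLS slope via a Rademacher wild bootstrap."""
    rng = random.Random(seed)
    a, b = ols_fit(xs, ys)
    fitted = [a + b * x for x in xs]
    resid = [y - f for y, f in zip(ys, fitted)]
    slopes = []
    for _ in range(n_boot):
        # Flip the sign of each residual independently; the design stays fixed.
        y_star = [f + rng.choice((-1.0, 1.0)) * e
                  for f, e in zip(fitted, resid)]
        slopes.append(ols_fit(xs, y_star)[1])
    return statistics.stdev(slopes)


# Simulated heteroscedastic data (assumption: noise sd grows with x).
rng = random.Random(42)
xs = [i / 10 for i in range(1, 101)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.2 * x) for x in xs]
se = wild_bootstrap_slope_se(xs, ys)
```

    Because each residual keeps its own magnitude and only its sign is resampled, the bootstrap draws retain the observation-specific noise level, which is why the approach is used under heteroscedasticity.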

  15. [Regression grading in gastrointestinal tumors].

    PubMed

    Tischoff, I; Tannapfel, A

    2012-02-01

    Preoperative neoadjuvant chemoradiation therapy is a well-established and essential part of the interdisciplinary treatment of gastrointestinal tumors. Neoadjuvant treatment leads to regressive changes in tumors. To evaluate the histological tumor response, different scoring systems describing regressive changes are used; these are known as tumor regression grading. Tumor regression grading is usually based on the presence of residual vital tumor cells in proportion to the total tumor size. Currently, no nationally or internationally accepted grading systems exist. In general, common guidelines should be used in the pathohistological diagnostics of tumors after neoadjuvant therapy. In particular, the standard tumor grading will be replaced by tumor regression grading. Furthermore, tumors after neoadjuvant treatment are marked with the prefix "y" in the TNM classification. PMID:22293790

  16. Fungible weights in logistic regression.

    PubMed

    Jones, Jeff A; Waller, Niels G

    2016-06-01

    In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights.

  17. Representing Multiplication

    ERIC Educational Resources Information Center

    Harries, Tony; Barmby, Patrick

    2008-01-01

    In this study, the authors wish to explore the use of visual representations in facilitating the understanding of multiplication. In doing so, they examine the different aspects of multiplication that they can access through different representations. In addition, they draw on a study that they have been carrying out looking at pupils' actual use…

  18. Does finger sense predict addition performance?

    PubMed

    Newman, Sharlene D

    2016-05-01

    The impact of fingers on numerical and mathematical cognition has received a great deal of attention recently. However, the precise role that fingers play in numerical cognition is unknown. The current study explores the relationship between finger sense, arithmetic and general cognitive ability. Seventy-six children between the ages of 5 and 12 participated in the study. The results of stepwise multiple regression analyses demonstrated that while general cognitive ability including language processing was a predictor of addition performance, finger sense was not. The impact of age on the relationship between finger sense, and addition was further examined. The participants were separated into two groups based on age. The results showed that finger gnosia score impacted addition performance in the older group but not the younger group. These results appear to support the hypothesis that fingers provide a scaffold for calculation and that if that scaffold is not properly built, it has continued differential consequences to mathematical cognition. PMID:26993292

  19. A tutorial on Bayesian Normal linear regression

    NASA Astrophysics Data System (ADS)

    Klauenberg, Katy; Wübbeler, Gerd; Mickan, Bodo; Harris, Peter; Elster, Clemens

    2015-12-01

    Regression is a common task in metrology and often applied to calibrate instruments, evaluate inter-laboratory comparisons or determine fundamental constants, for example. Yet, a regression model cannot be uniquely formulated as a measurement function, and consequently the Guide to the Expression of Uncertainty in Measurement (GUM) and its supplements are not applicable directly. Bayesian inference, however, is well suited to regression tasks, and has the advantage of accounting for additional a priori information, which typically robustifies analyses. Furthermore, it is anticipated that future revisions of the GUM shall also embrace the Bayesian view. Guidance on Bayesian inference for regression tasks is largely lacking in metrology. For linear regression models with Gaussian measurement errors this tutorial gives explicit guidance. Divided into three steps, the tutorial first illustrates how a priori knowledge, which is available from previous experiments, can be translated into prior distributions from a specific class. These prior distributions have the advantage of yielding analytical, closed form results, thus avoiding the need to apply numerical methods such as Markov Chain Monte Carlo. Secondly, formulas for the posterior results are given, explained and illustrated, and software implementations are provided. In the third step, Bayesian tools are used to assess the assumptions behind the suggested approach. These three steps (prior elicitation, posterior calculation, and robustness to prior uncertainty and model adequacy) are critical to Bayesian inference. The general guidance given here for Normal linear regression tasks is accompanied by a simple, but real-world, metrological example. The calibration of a flow device serves as a running example and illustrates the three steps. It is shown that prior knowledge from previous calibrations of the same sonic nozzle enables robust predictions even for extrapolations.
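
    The closed-form posterior update mentioned above can be sketched in a deliberately simpler setting than the tutorial's: a single slope through the origin, a Normal prior on that slope, and a known noise standard deviation. The function name and calibration numbers are hypothetical.

```python
def posterior_slope(xs, ys, prior_mean, prior_sd, noise_sd):
    """Posterior mean and sd of beta in y = beta * x + noise (known noise sd)."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = sum(x * x for x in xs) / noise_sd ** 2
    post_prec = prior_prec + data_prec
    # Precision-weighted combination of prior mean and data information.
    post_mean = (prior_prec * prior_mean
                 + sum(x * y for x, y in zip(xs, ys)) / noise_sd ** 2) / post_prec
    return post_mean, post_prec ** -0.5


# Hypothetical calibration readings, for illustration only.
flow = [1.0, 2.0, 3.0, 4.0]
reading = [1.1, 1.9, 3.2, 3.9]
post_mean, post_sd = posterior_slope(flow, reading, prior_mean=1.0,
                                     prior_sd=0.05, noise_sd=0.1)
# Least-squares slope through the origin, for comparison.
ols = sum(x * y for x, y in zip(flow, reading)) / sum(x * x for x in flow)
```

    In this scalar case the posterior mean is a precision-weighted average of the prior mean and the least-squares slope, and the posterior standard deviation is always smaller than the prior one.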

  20. Modeling confounding by half-sibling regression.

    PubMed

    Schölkopf, Bernhard; Hogg, David W; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas

    2016-07-01

    We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as "half-sibling regression," is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application. PMID:27382154
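
    A minimal sketch of the idea, assuming a linear regression and a single "half-sibling" signal: an observation y is the latent quantity q plus a shared systematic effect, a second signal x is affected by the same systematic effect but is independent of q, and the estimate of q is what remains of y after regressing it on x. All names and the simulated data are assumptions for illustration; the published method is more general.

```python
import random
import statistics


def residual_after_regression(xs, ys):
    """Subtract the least-squares prediction of y from x; return residuals."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = (sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)) ** 0.5
    return num / den


rng = random.Random(7)
n = 500
systematic = [rng.gauss(0.0, 1.0) for _ in range(n)]   # shared confounder
q = [rng.gauss(0.0, 1.0) for _ in range(n)]            # latent quantity
x = [s + rng.gauss(0.0, 0.1) for s in systematic]      # half-sibling signal
y = [qi + 2.0 * s for qi, s in zip(q, systematic)]     # confounded observation

q_hat = residual_after_regression(x, y)  # estimate of q, up to its mean
```

    In this simulation the residual tracks the latent quantity much more closely than the raw observation does, because x carries information about the confounder but not about q.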

  1. Modeling confounding by half-sibling regression.

    PubMed

    Schölkopf, Bernhard; Hogg, David W; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas

    2016-07-01

    We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as "half-sibling regression," is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application.

  2. Modeling confounding by half-sibling regression

    PubMed Central

    Schölkopf, Bernhard; Hogg, David W.; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas

    2016-01-01

    We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as “half-sibling regression,” is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application. PMID:27382154

  3. Regression methods for spatial data

    NASA Technical Reports Server (NTRS)

    Yakowitz, S. J.; Szidarovszky, F.

    1982-01-01

    The kriging approach, a parametric regression method used by hydrologists and mining engineers, among others, also provides an error estimate of the integral of the regression function. The kriging method is explored and some of its statistical characteristics are described. The Watson method and theory are extended so that the kriging features are displayed. Theoretical and computational comparisons of the kriging and Watson approaches are offered.

  4. Basis Selection for Wavelet Regression

    NASA Technical Reports Server (NTRS)

    Wheeler, Kevin R.; Lau, Sonie (Technical Monitor)

    1998-01-01

    A wavelet basis selection procedure is presented for wavelet regression. Both the basis and the threshold are selected using cross-validation. The method includes the capability of incorporating prior knowledge on the smoothness (or shape of the basis functions) into the basis selection procedure. The results of the method are demonstrated on sampled functions widely used in the wavelet regression literature. The results of the method are contrasted with other published methods.

  5. Regression Discontinuity Designs in Epidemiology

    PubMed Central

    Moscoe, Ellen; Mutevedzi, Portia; Newell, Marie-Louise; Bärnighausen, Till

    2014-01-01

    When patients receive an intervention based on whether they score below or above some threshold value on a continuously measured random variable, the intervention will be randomly assigned for patients close to the threshold. The regression discontinuity design exploits this fact to estimate causal treatment effects. In spite of its recent proliferation in economics, the regression discontinuity design has not been widely adopted in epidemiology. We describe regression discontinuity, its implementation, and the assumptions required for causal inference. We show that regression discontinuity is generalizable to the survival and nonlinear models that are mainstays of epidemiologic analysis. We then present an application of regression discontinuity to the much-debated epidemiologic question of when to start HIV patients on antiretroviral therapy. Using data from a large South African cohort (2007–2011), we estimate the causal effect of early versus deferred treatment eligibility on mortality. Patients whose first CD4 count was just below the 200 cells/μL CD4 count threshold had a 35% lower hazard of death (hazard ratio = 0.65 [95% confidence interval = 0.45–0.94]) than patients presenting with CD4 counts just above the threshold. We close by discussing the strengths and limitations of regression discontinuity designs for epidemiology. PMID:25061922
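
    The mechanics can be sketched for a continuous outcome, which is simpler than the survival models used in the paper: fit a separate line on each side of the threshold within a bandwidth and compare the two fitted values at the cutoff. The cutoff of 200 mirrors the abstract; the bandwidth, effect size, function names and simulated data are assumptions.

```python
import random
import statistics


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


def rd_estimate(score, outcome, cutoff, bandwidth):
    """Outcome just below the cutoff minus outcome just above it."""
    below = [(s - cutoff, o) for s, o in zip(score, outcome)
             if cutoff - bandwidth <= s < cutoff]
    above = [(s - cutoff, o) for s, o in zip(score, outcome)
             if cutoff <= s <= cutoff + bandwidth]
    # Scores are centered at the cutoff, so each intercept is the fitted
    # outcome at the threshold itself.
    a_below, _ = fit_line(*zip(*below))
    a_above, _ = fit_line(*zip(*above))
    return a_below - a_above


# Simulated data: units scoring below 200 are eligible, with effect -0.5.
rng = random.Random(1)
true_effect = -0.5
score = [rng.uniform(0.0, 400.0) for _ in range(2000)]
outcome = [1.0 + 0.002 * s + (true_effect if s < 200.0 else 0.0)
           + rng.gauss(0.0, 0.1) for s in score]
estimate = rd_estimate(score, outcome, cutoff=200.0, bandwidth=50.0)
```

    The choice of bandwidth trades bias against variance: a narrow window relies less on the linearity assumption but uses fewer observations.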

  6. Spontaneous regression in advanced squamous cell lung carcinoma

    PubMed Central

    Park, Yeon Hee; Park, Bo Mi; Park, Se Yeon; Choi, Jae Woo; Kim, Sun Young; Kim, Ju Ock; Jung, Sung Soo; Park, Hee Sun; Moon, Jae Young

    2016-01-01

    Spontaneous regression of malignant tumors is rare, especially in lung tumors, and the biological mechanism of such remission has not been addressed. We report the case of a 79-year-old Korean patient with non-small cell lung cancer (squamous cell carcinoma), with a right hilar tumor, multiple lymph nodes and lung-to-lung metastasis, that spontaneously regressed without any therapy. He has sustained a state of partial remission for one year and eight months after the first histological diagnosis. PMID:27076978

  7. [Outline of political conclusions of multiple regressions: integrants and problems].

    PubMed

    Dixon, R B

    1978-01-01

    In this article the author criticizes the methodology and the findings of an article by Mauldin and Berelson which appeared in 1978 in Studies in Family Planning about population decrease in developing countries and about its implications on population policies. According to the author that article did not take into consideration: 1) the fact that socioeconomic conditions in a given country are more important than family planning programs for a decrease in fertility rate; 2) the fact that it is not known which kinds of family planning programs are more effective, and which kind of social level is more conducive to fertility decrease; and, 3) the status and educational level of women in the countries studied. In conclusion, the author states that the findings of Mauldin and Berelson, although interesting, imply arbitrary procedures and statistics, and cannot be used for the purpose of population policy.

  8. Dissociating Conflict Adaptation from Feature Integration: A Multiple Regression Approach

    ERIC Educational Resources Information Center

    Notebaert, Wim; Verguts, Tom

    2007-01-01

    Congruency effects are typically smaller after incongruent than after congruent trials. One explanation is in terms of higher levels of cognitive control after detection of conflict (conflict adaptation; e.g., M. M. Botvinick, T. S. Braver, D. M. Barch, C. S. Carter, & J. D. Cohen, 2001). An alternative explanation for these results is based on…

  9. Norming Clinical Questionnaires with Multiple Regression: The Pain Cognition List

    ERIC Educational Resources Information Center

    Van Breukelen, Gerard J. P.; Vlaeyen, Johan W. S.

    2005-01-01

    Questionnaires for measuring patients' feelings or beliefs are commonly used in clinical settings for diagnostic purposes, clinical decision making, or treatment evaluation. Raw scores of a patient can be evaluated by comparing them with norms based on a reference population. Using the Pain Cognition List (PCL-2003) as an example, this article…

  10. Estimating peak flow characteristics at ungaged sites by ridge regression

    USGS Publications Warehouse

    Tasker, Gary D.

    1982-01-01

    A regression simulation model is combined with a multisite streamflow generator to simulate a regional regression of 50-year peak discharge against a set of basin characteristics. Monte Carlo experiments are used to compare the unbiased ordinary least squares parameter estimator with Hoerl and Kennard's (1970a) ridge estimator in which the biasing parameter is that proposed by Hoerl, Kennard, and Baldwin (1975). The simulation results indicate a substantial improvement in parameter estimation using ridge regression when the correlation between basin characteristics is more than about 0.90. In addition, results indicate a strong potential for improving the mean square error of prediction of a peak-flow characteristic versus basin characteristics regression model when the basin characteristics are approximately collinear. The simulation covers a range of regression parameters, streamflow statistics, and basin characteristics commonly found in regional regression studies.
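
    The comparison described above can be sketched for two nearly collinear predictors. The ridge estimate is (X'X + kI)^-1 X'y on standardized predictors, and the biasing parameter is taken here as k = p s^2 / (b'b), with b the least-squares coefficients and s^2 the residual variance, which is the form usually attributed to Hoerl, Kennard, and Baldwin. The helper names and simulated data are assumptions, not the study's streamflow generator.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def standardize(col):
    """Center a column and scale it to unit length."""
    m = sum(col) / len(col)
    centered = [v - m for v in col]
    scale = sum(v * v for v in centered) ** 0.5
    return [v / scale for v in centered]


def ridge(cols, y, k):
    """Ridge coefficients for standardized columns and centered y."""
    p = len(cols)
    xtx = [[sum(a * b for a, b in zip(cols[i], cols[j])) + (k if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(a * b for a, b in zip(cols[i], y)) for i in range(p)]
    return solve(xtx, xty)


def hkb_k(cols, y):
    """Biasing parameter k = p * s^2 / (b'b) from the least-squares fit."""
    p, n = len(cols), len(y)
    b = ridge(cols, y, 0.0)
    resid = [yi - sum(b[j] * cols[j][i] for j in range(p))
             for i, yi in enumerate(y)]
    s2 = sum(r * r for r in resid) / (n - p - 1)
    return p * s2 / sum(bj * bj for bj in b)


# Simulated data: x2 is almost a copy of x1, so the predictors are collinear.
rng = random.Random(3)
x1 = [rng.gauss(0.0, 1.0) for _ in range(60)]
x2 = [v + 0.1 * rng.gauss(0.0, 1.0) for v in x1]
y_raw = [1.0 + a + b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)]
y_mean = sum(y_raw) / len(y_raw)
y = [v - y_mean for v in y_raw]
cols = [standardize(x1), standardize(x2)]

k = hkb_k(cols, y)
b_ols = ridge(cols, y, 0.0)
b_ridge = ridge(cols, y, k)
```

    With k greater than zero the ridge coefficient vector is always shorter than the least-squares one; whether that shrinkage lowers the mean square error depends on the degree of collinearity, which is what the Monte Carlo experiments in the abstract examine.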

  11. Food additives

    PubMed Central

    Spencer, Michael

    1974-01-01

    Food additives are discussed from the food technology point of view. The reasons for their use are summarized: (1) to protect food from chemical and microbiological attack; (2) to even out seasonal supplies; (3) to improve their eating quality; (4) to improve their nutritional value. The various types of food additives are considered, e.g. colours, flavours, emulsifiers, bread and flour additives, preservatives, and nutritional additives. The paper concludes with consideration of those circumstances in which the use of additives is (a) justified and (b) unjustified. PMID:4467857

  12. A Unified Approach to Power Calculation and Sample Size Determination for Random Regression Models

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2007-01-01

    The underlying statistical models for multiple regression analysis are typically attributed to two types of modeling: fixed and random. The procedures for calculating power and sample size under the fixed regression models are well known. However, the literature on random regression models is limited and has been confined to the case of all…

  13. An Explanation of the Effectiveness of Latent Semantic Indexing by Means of a Bayesian Regression Model.

    ERIC Educational Resources Information Center

    Story, Roger E.

    1996-01-01

    Discussion of the use of Latent Semantic Indexing to determine relevancy in information retrieval focuses on statistical regression and Bayesian methods. Topics include keyword searching; a multiple regression model; how the regression model can aid search methods; and limitations of this approach, including complexity, linearity, and…

  14. Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis

    ERIC Educational Resources Information Center

    Kim, Rae Seon

    2011-01-01

    When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

  15. Regional Regression Equations to Estimate Flow-Duration Statistics at Ungaged Stream Sites in Connecticut

    USGS Publications Warehouse

    Ahearn, Elizabeth A.

    2010-01-01

    Multiple linear regression equations for determining flow-duration statistics were developed to estimate select flow exceedances ranging from 25- to 99-percent for six 'bioperiods'-Salmonid Spawning (November), Overwinter (December-February), Habitat Forming (March-April), Clupeid Spawning (May), Resident Spawning (June), and Rearing and Growth (July-October)-in Connecticut. Regression equations also were developed to estimate the 25- and 99-percent flow exceedances without reference to a bioperiod. In total, 32 equations were developed. The predictive equations were based on regression analyses relating flow statistics from streamgages to GIS-determined basin and climatic characteristics for the drainage areas of those streamgages. Thirty-nine streamgages (and an additional 6 short-term streamgages and 28 partial-record sites for the non-bioperiod 99-percent exceedance) in Connecticut and adjacent areas of neighboring States were used in the regression analysis. Weighted least squares regression analysis was used to determine the predictive equations; weights were assigned based on record length. The basin characteristics-drainage area, percentage of area with coarse-grained stratified deposits, percentage of area with wetlands, mean monthly precipitation (November), mean seasonal precipitation (December, January, and February), and mean basin elevation-are used as explanatory variables in the equations. Standard errors of estimate of the 32 equations ranged from 10.7 to 156 percent with medians of 19.2 and 55.4 percent to predict the 25- and 99-percent exceedances, respectively. Regression equations to estimate high and median flows (25- to 75-percent exceedances) are better predictors (smaller variability of the residual values around the regression line) than the equations to estimate low flows (less than 75-percent exceedance). The Habitat Forming (March-April) bioperiod had the smallest standard errors of estimate, ranging from 10.7 to 20.9 percent. In
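
    Weighted least squares with weights tied to record length can be sketched as follows. This is an illustration under stated assumptions, not the equations from the report: the single explanatory variable, the data values, and the choice of using record length in years directly as the weight are all hypothetical.

```python
import numpy as np


def weighted_least_squares(X, y, w):
    """Solve min_b sum_i w_i * (y_i - [1, x_i] @ b)^2 and return b."""
    A = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    sw = np.sqrt(w)
    # Scaling each row by sqrt(w_i) turns the weighted problem into an
    # ordinary least squares problem.
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef


# Hypothetical streamgage data: one basin characteristic and one flow statistic.
basin_char = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
flow_stat = 0.5 + 0.9 * basin_char          # exact linear relation for clarity
record_years = np.array([10.0, 25.0, 40.0, 15.0, 60.0])

coef = weighted_least_squares(basin_char[:, None], flow_stat, record_years)
print(coef)  # intercept and slope
```

    Streamgages with longer records pull the fitted line toward their observations more strongly than short-record sites.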

  16. Functional Generalized Additive Models.

    PubMed

    McLean, Mathew W; Hooker, Giles; Staicu, Ana-Maria; Scheipl, Fabian; Ruppert, David

    2014-01-01

    We introduce the functional generalized additive model (FGAM), a novel regression model for association studies between a scalar response and a functional predictor. We model the link-transformed mean response as the integral with respect to t of F{X(t), t} where F(·,·) is an unknown regression function and X(t) is a functional covariate. Rather than having an additive model in a finite number of principal components as in Müller and Yao (2008), our model incorporates the functional predictor directly and thus our model can be viewed as the natural functional extension of generalized additive models. We estimate F(·,·) using tensor-product B-splines with roughness penalties. A pointwise quantile transformation of the functional predictor is also considered to ensure each tensor-product B-spline has observed data on its support. The methods are evaluated using simulated data and their predictive performance is compared with other competing scalar-on-function regression alternatives. We illustrate the usefulness of our approach through an application to brain tractography, where X(t) is a signal from diffusion tensor imaging at position, t, along a tract in the brain. In one example, the response is disease-status (case or control) and in a second example, it is the score on a cognitive test. R code for performing the simulations and fitting the FGAM can be found in supplemental materials available online.

  17. Regressive Evolution in Astyanax Cavefish

    PubMed Central

    Jeffery, William R.

    2013-01-01

    A diverse group of animals, including members of most major phyla, have adapted to life in the perpetual darkness of caves. These animals are united by the convergence of two regressive phenotypes, loss of eyes and pigmentation. The mechanisms of regressive evolution are poorly understood. The teleost Astyanax mexicanus is of special significance in studies of regressive evolution in cave animals. This species includes an ancestral surface dwelling form and many conspecific cave-dwelling forms, some of which have evolved their regressive phenotypes independently. Recent advances in Astyanax development and genetics have provided new information about how eyes and pigment are lost during cavefish evolution; namely, they have revealed some of the molecular and cellular mechanisms involved in trait modification, the number and identity of the underlying genes and mutations, the molecular basis of parallel evolution, and the evolutionary forces driving adaptation to the cave environment. PMID:19640230

  18. Laplace regression with censored data.

    PubMed

    Bottai, Matteo; Zhang, Jiajia

    2010-08-01

    We consider a regression model where the error term is assumed to follow a type of asymmetric Laplace distribution. We explore its use in the estimation of conditional quantiles of a continuous outcome variable given a set of covariates in the presence of random censoring. Censoring may depend on covariates. Estimation of the regression coefficients is carried out by maximizing a non-differentiable likelihood function. In the scenarios considered in a simulation study, the Laplace estimator showed correct coverage and shorter computation time than the alternative methods considered, some of which occasionally failed to converge. We illustrate the use of Laplace regression with an application to survival time in patients with small cell lung cancer.

  19. [Is regression of atherosclerosis possible?].

    PubMed

    Thomas, D; Richard, J L; Emmerich, J; Bruckert, E; Delahaye, F

    1992-10-01

    Experimental studies have shown the regression of atherosclerosis in animals given a cholesterol-rich diet and then given a normal diet or hypolipidemic therapy. Despite favourable results of clinical trials of primary prevention modifying the lipid profile, the concept of atherosclerosis regression in man remains very controversial. The methodological approach is difficult: this is based on angiographic data and requires strict standardisation of angiographic views and reliable quantitative techniques of analysis which are available with image processing. Several methodologically acceptable clinical coronary studies have shown not only stabilisation but also regression of atherosclerotic lesions with reductions of about 25% in total cholesterol levels and of about 40% in LDL cholesterol levels. These reductions were obtained either by drugs as in CLAS (Cholesterol Lowering Atherosclerosis Study), FATS (Familial Atherosclerosis Treatment Study) and SCOR (Specialized Center of Research Intervention Trial), by profound modifications in dietary habits as in the Lifestyle Heart Trial, or by surgery (ileo-caecal bypass) as in POSCH (Program On the Surgical Control of the Hyperlipidemias). On the other hand, trials with non-lipid lowering drugs such as the calcium antagonists (INTACT, MHIS) have not shown significant regression of existing atherosclerotic lesions but only a decrease in the number of new lesions. The clinical benefits of these regression studies are difficult to demonstrate given the limited period of observation, relatively small population numbers and the fact that in some cases the subjects were asymptomatic. The decrease in the number of cardiovascular events therefore seems relatively modest and concerns essentially subjects who were symptomatic initially. The clinical repercussion of studies of prevention involving a single lipid factor is probably partially due to the reduction in progression and anatomical regression of the atherosclerotic plaque

  20. Progression and regression of the atherosclerotic plaque.

    PubMed

    de Feyter, P J; Vos, J; Deckers, J W

    1995-08-01

    In animals in which atherosclerosis was induced experimentally (by a high cholesterol diet) regression of the atherosclerotic lesion was demonstrated after serum cholesterol was reduced by cholesterol-lowering drugs or a low-fat diet. Whether regression of advanced coronary artery lesions also takes place in humans after a similar intervention remains conjectural. However, several randomized studies, primarily employing lipid-lowering intervention or comprehensive changes in lifestyle, have demonstrated, using serial angiograms, that it is possible to achieve less progression, arrest or even (small) regression of atherosclerotic lesions. The lipid-lowering trials (NHLBI, CLAS, POSCH, FATS, SCOR and STARS) studied 1240 symptomatic patients, mostly men, with moderately elevated cholesterol levels and moderately severe angiographic-proven coronary artery disease. A variety of lipid-lowering drugs, in addition to a diet, were used over an intervention period ranging from 2 to 3 years. In all but one study (NHLBI), the progression of coronary atherosclerosis was less in the treated group, but regression was induced in only a few patients. The overall relative risk of progression of coronary atherosclerosis was 0.62 and 2.13, respectively. The induced angiographic differences were small and did not produce any significant haemodynamic benefit. The most important result was that the disease process could be stabilized in the majority of patients. Three comprehensive lifestyle change trials (the Lifestyle Heart study, STARS and the Heidelberg Study) studied 183 patients, who were subjected to stress management, and/or intensive exercise, in addition to a low fat diet, over a period ranging from 1 to 3 years. All three trials demonstrated less progression, and more regression with overall relative risks of 0.40 and 2.35 respectively, in the intervention groups. Angiographic trials demonstrated that retardation or arrest of coronary atherosclerosis was possible

  1. Correcting Regression Equations for Restriction of Range: Effects on Veterinary Candidate Selection.

    ERIC Educational Resources Information Center

    Stuck, Ivan A.

    Predictor weights estimated by using multiple linear regression are biased when there is restriction in the range (RR) of the dependent variable. Standardized multiple regression yields partial correlations as weights for the predictors, and these can be corrected for range difference between calibration and application samples. However,…

  2. Weighting Regressions by Propensity Scores

    ERIC Educational Resources Information Center

    Freedman, David A.; Berk, Richard A.

    2008-01-01

    Regressions can be weighted by propensity scores in order to reduce bias. However, weighting is likely to increase random error in the estimates, and to bias the estimated standard errors downward, even when selection mechanisms are well understood. Moreover, in some cases, weighting will increase the bias in estimated causal parameters. If…

  3. Ridge Regression for Interactive Models.

    ERIC Educational Resources Information Center

    Tate, Richard L.

    1988-01-01

    An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are favorable to…
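
    The effect of centering on non-essential multicollinearity can be illustrated numerically. The sketch below is an assumption-laden toy example (synthetic, independent normal predictors with non-zero means; hypothetical variable names), not a reproduction of the study: it compares the correlation between a linear term and the product term before and after centering.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Two independent predictors with non-zero means (illustrative values).
x = rng.normal(loc=10.0, scale=1.0, size=n)
z = rng.normal(loc=5.0, scale=1.0, size=n)

# Correlation between the linear term x and the raw product term x*z.
r_raw = np.corrcoef(x, x * z)[0, 1]

# Center both predictors before forming the product term.
xc = x - x.mean()
zc = z - z.mean()
r_centered = np.corrcoef(xc, xc * zc)[0, 1]

print(f"raw:      corr(x, x*z)    = {r_raw:.3f}")
print(f"centered: corr(xc, xc*zc) = {r_centered:.3f}")
```

    The raw correlation arises only because the predictors have non-zero means; centering removes that component, leaving whatever "essential" collinearity the data actually contain.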

  4. Quantile Regression with Censored Data

    ERIC Educational Resources Information Center

    Lin, Guixian

    2009-01-01

    The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitations due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…

  5. Modeling Polytomous Item Responses Using Simultaneously Estimated Multinomial Logistic Regression Models

    ERIC Educational Resources Information Center

    Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.

    2010-01-01

    Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…

  6. Regression Verification Using Impact Summaries

    NASA Technical Reports Server (NTRS)

    Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana

    2013-01-01

    Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an off-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version. Regression verification addresses the problem of proving equivalence of closely related program

  7. Decreasing Multicollinearity: A Method for Models with Multiplicative Functions.

    ERIC Educational Resources Information Center

    Smith, Kent W.; Sasaki, M. S.

    1979-01-01

    A method is proposed for overcoming the problem of multicollinearity in multiple regression equations where multiplicative independent terms are entered. The method is not a ridge regression solution. (JKS)

  8. Embedded Sensors for Measuring Surface Regression

    NASA Technical Reports Server (NTRS)

    Gramer, Daniel J.; Taagen, Thomas J.; Vermaak, Anton G.

    2006-01-01

    non-eroding end of the sensor. The sensor signal can be transmitted from inside a high-pressure chamber to the ambient environment, using commercially available feedthrough connectors. Miniaturized internal recorders or wireless data transmission could also potentially be employed to eliminate the need for producing penetrations in the chamber case. The rungs are designed so that as each successive rung is eroded away, the resistance changes by an amount that yields a readily measurable signal larger than the background noise. (In addition, signal-conditioning techniques are used in processing the resistance readings to mitigate the effect of noise.) Hence, each discrete change of resistance serves to indicate the arrival of the regressing host material front at the known depth of the affected resistor rung. The average rate of regression between two adjacent resistors can be calculated simply as the distance between the resistors divided by the time interval between their resistance jumps. Advanced data reduction techniques have also been developed to establish the instantaneous surface position and regression rate when the regressing front is between rungs.
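
    The average-rate calculation described above (distance between adjacent rungs divided by the time between their resistance jumps) can be written directly. The function name, units, and sample values below are hypothetical and serve only to illustrate the arithmetic.

```python
def average_regression_rates(rung_depths_mm, jump_times_s):
    """Average regression rate (mm/s) between each pair of adjacent rungs.

    rung_depths_mm: known depth of each resistor rung, in erosion order.
    jump_times_s:   time at which each rung's resistance jump was observed.
    """
    if len(rung_depths_mm) != len(jump_times_s):
        raise ValueError("need one jump time per rung")
    rates = []
    for i in range(1, len(rung_depths_mm)):
        distance = rung_depths_mm[i] - rung_depths_mm[i - 1]
        interval = jump_times_s[i] - jump_times_s[i - 1]
        if interval <= 0:
            raise ValueError("jump times must be strictly increasing")
        rates.append(distance / interval)
    return rates


# Illustrative values only: four rungs spaced 2 mm apart.
depths = [0.0, 2.0, 4.0, 6.0]
times = [0.0, 4.0, 8.0, 10.0]
rates = average_regression_rates(depths, times)
print(rates)  # [0.5, 0.5, 1.0]
```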

  10. Convex Regression with Interpretable Sharp Partitions

    PubMed Central

    Petersen, Ashley; Simon, Noah; Witten, Daniela

    2016-01-01

    We consider the problem of predicting an outcome variable on the basis of a small number of covariates, using an interpretable yet non-additive model. We propose convex regression with interpretable sharp partitions (CRISP) for this task. CRISP partitions the covariate space into blocks in a data-adaptive way, and fits a mean model within each block. Unlike other partitioning methods, CRISP is fit using a non-greedy approach by solving a convex optimization problem, resulting in low-variance fits. We explore the properties of CRISP, and evaluate its performance in a simulation study and on a housing price data set. PMID:27635120

  11. Estimating the exceedance probability of rain rate by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.; Kedem, Benjamin

    1990-01-01

    Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
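
    A logistic model for the probability that rain rate exceeds a threshold can be sketched as below. This is a simplified illustration, not the method of the paper: it fits by ordinary maximum likelihood with Newton-Raphson on independent synthetic data, whereas the abstract handles dependence through partial likelihood. The single covariate and all names are hypothetical.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit P(y = 1 | x) = 1 / (1 + exp(-(b0 + x @ b))) by Newton-Raphson."""
    A = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        grad = A.T @ (y - p)                        # score vector
        hess = A.T @ (A * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


# Synthetic data: one covariate related to the chance of exceeding a threshold.
rng = np.random.default_rng(1)
covariate = rng.normal(size=500)
true_prob = 1.0 / (1.0 + np.exp(-(-0.5 + 2.0 * covariate)))
exceeds = (rng.uniform(size=500) < true_prob).astype(float)

beta = fit_logistic(covariate[:, None], exceeds)
prob = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * covariate)))
print("coefficients:", beta)
print("estimated fractional area above threshold:", prob.mean())
```

    Averaging the fitted pixel probabilities gives one simple estimate of the fraction of the area above the threshold.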

  12. Assessing Longitudinal Change: Adjustment for Regression to the Mean Effects

    ERIC Educational Resources Information Center

    Rocconi, Louis M.; Ethington, Corinna A.

    2009-01-01

    Pascarella (J Coll Stud Dev 47:508-520, 2006) has called for an increase in use of longitudinal data with pretest-posttest design when studying effects on college students. However, such designs that use multiple measures to document change are vulnerable to an important threat to internal validity, regression to the mean. Herein, we discuss a…

  13. Default Bayes Factors for Model Selection in Regression

    ERIC Educational Resources Information Center

    Rouder, Jeffrey N.; Morey, Richard D.

    2012-01-01

    In this article, we present a Bayes factor solution for inference in multiple regression. Bayes factors are principled measures of the relative evidence from data for various models or positions, including models that embed null hypotheses. In this regard, they may be used to state positive evidence for a lack of an effect, which is not possible…

  14. Validity Shrinkage in Ridge Regression: A Simulation Study.

    ERIC Educational Resources Information Center

    Faden, Vivian; Bobko, Philip

    1982-01-01

    Ridge regression offers advantages over ordinary least squares estimation when a validity shrinkage criterion is considered. Comparisons of cross-validated multiple correlations indicate that ridge estimation is superior when the predictors are multicollinear, the number of predictors is large relative to sample size, and the population multiple…

  15. Quantile Regression With Measurement Error

    PubMed Central

    Wei, Ying; Carroll, Raymond J.

    2010-01-01

    Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. PMID:20305802

  16. Precision and Recall for Regression

    NASA Astrophysics Data System (ADS)

    Torgo, Luis; Ribeiro, Rita

    Cost sensitive prediction is a key task in many real world applications. Most existing research in this area deals with classification problems. This paper addresses a related regression problem: the prediction of rare extreme values of a continuous variable. These values are often regarded as outliers and removed from posterior analysis. However, for many applications (e.g. in finance, meteorology, biology, etc.) these are the key values that we want to accurately predict. Any learning method obtains models by optimizing some preference criteria. In this paper we propose new evaluation criteria that are more adequate for these applications. We describe a generalization for regression of the concepts of precision and recall often used in classification. Using these new evaluation metrics we are able to focus the evaluation of predictive models on the cases that really matter for these applications. Our experiments indicate the advantages of the use of these new measures when comparing predictive models in the context of our target applications.

  17. Initial external validation of REGRESS in public health graduate students.

    PubMed

    Kidwell, Kelley M; Enders, Felicity B

    2014-12-01

    Linear regression is typically taught as a second and potentially last required (bio)statistics course for Public Health and Clinical and Translational Science students. There has been much research on the attitudes of students toward basic biostatistics, but there has not been much assessing students' understanding of critical regression topics. The REGRESS (REsearch on Global Regression Expectations in StatisticS) quiz developed at Mayo Clinic utilizes 27 questions to assess understanding for simple and multiple linear regression. We performed an initial external validation of this tool with 117 University of Michigan public health students. We compare the results of pre- and postcourse quiz scores from the Michigan cohort to scores of Mayo medical students and professional statisticians. University of Michigan students performed higher than Mayo students on the precourse quiz due to previous related coursework, but did not perform as high postcourse indicating the need for course modification. In the Michigan cohort, REGRESS scores improved by a mean (standard deviation) of 4.6 (3.4), p < 0.0001. Our results support the use of the REGRESS quiz as a learning tool for students and an evaluation tool to identify topics for curricular improvement for teachers, while we highlight future directions of research. PMID:25041650

  19. Shape regression for vertebra fracture quantification

    NASA Astrophysics Data System (ADS)

    Lund, Michael Tillge; de Bruijne, Marleen; Tanko, Laszlo B.; Nielsen, Mads

    2005-04-01

    Accurate and reliable identification and quantification of vertebral fractures constitute a challenge both in clinical trials and in diagnosis of osteoporosis. Various efforts have been made to develop reliable, objective, and reproducible methods for assessing vertebral fractures, but at present there is no consensus concerning a universally accepted diagnostic definition of vertebral fractures. In this project we want to investigate whether or not it is possible to accurately reconstruct the shape of a normal vertebra, using a neighbouring vertebra as prior information. The reconstructed shape can then be used to develop a novel vertebra fracture measure, by comparing the segmented vertebra shape with its reconstructed normal shape. The vertebrae in lateral x-rays of the lumbar spine were manually annotated by a medical expert. With this dataset we built a shape model, with equidistant point distribution between the four corner points. Based on the shape model, a multiple linear regression model of a normal vertebra shape was developed for each dataset using leave-one-out cross-validation. The reconstructed shape was calculated for each dataset using these regression models. The prediction error for the annotated shape was on average 3%.
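
    Leave-one-out cross-validation of a multiple linear regression with a multi-coordinate response can be sketched as follows. The data are synthetic stand-ins for shape coordinates (a "neighbour" shape predicting a "target" shape); the dimensions, noise level, and names are assumptions made for illustration and do not come from the study.

```python
import numpy as np


def loo_predictions(X, Y):
    """Predict each row of Y from a linear model fitted on all other rows."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])  # add an intercept column
    preds = np.empty_like(Y, dtype=float)
    for i in range(n):
        keep = np.arange(n) != i          # hold out sample i
        coef, *_ = np.linalg.lstsq(A[keep], Y[keep], rcond=None)
        preds[i] = A[i] @ coef
    return preds


# Synthetic example: 30 samples, 4 neighbour coordinates, 3 target coordinates.
rng = np.random.default_rng(3)
X = rng.normal(size=(30, 4))
true_map = rng.normal(size=(4, 3))
Y = 2.0 + X @ true_map + 0.01 * rng.normal(size=(30, 3))

preds = loo_predictions(X, Y)
mean_abs_error = np.abs(preds - Y).mean()
print("mean absolute LOO error:", mean_abs_error)
```

    Because every prediction comes from a model that never saw the held-out sample, the resulting error is an honest estimate of how well a normal shape would be reconstructed for a new case.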

  20. Regression Models For Saffron Yields in Iran

    NASA Astrophysics Data System (ADS)

    Sanaeinejad, S. H.; Hosseini, S. N.

    Saffron is an important crop, both socially and economically, in Khorassan Province (northeast Iran). In this research we evaluated trends in saffron yield in recent years and studied the relationship between saffron yield and climate change. A regression analysis was used to predict saffron yield based on 20 years of yield data in the cities of Birjand, Ghaen and Ferdows. Climatological data for the same periods were provided by the database of the Khorassan Climatology Center. The climatological data included temperature, rainfall, relative humidity and sunshine hours for Model I, and temperature and rainfall for Model II. The coefficients of determination for Birjand, Ferdows and Ghaen were 0.69, 0.50 and 0.81, respectively, for Model I, and 0.53, 0.50 and 0.72, respectively, for Model II. Multiple regression analysis indicated that, among weather variables, temperature was the key parameter for variation of saffron yield. It was concluded that increasing temperature in spring was the main cause of declining saffron yield during recent years across the province. Finally, the yield trend was predicted for the last 5 years using time series analysis.

  1. Phosphazene additives

    DOEpatents

    Harrup, Mason K; Rollins, Harry W

    2013-11-26

    An additive comprising a phosphazene compound that has at least two reactive functional groups and at least one capping functional group bonded to phosphorus atoms of the phosphazene compound. One of the at least two reactive functional groups is configured to react with cellulose and the other of the at least two reactive functional groups is configured to react with a resin, such as an amine resin or a polycarboxylic acid resin. The at least one capping functional group is selected from the group consisting of a short chain ether group, an alkoxy group, or an aryloxy group. Also disclosed are an additive-resin admixture, a method of treating a wood product, and a wood product.

  2. Potlining Additives

    SciTech Connect

    Rudolf Keller

    2004-08-10

    In this project, a concept to improve the performance of aluminum production cells by introducing potlining additives was examined and tested. Boron oxide was added to cathode blocks, and titanium was dissolved in the metal pool; this resulted in the formation of titanium diboride and caused the molten aluminum to wet the carbonaceous cathode surface. Such wetting reportedly leads to operational improvements and extended cell life. In addition, boron oxide suppresses cyanide formation. This final report presents and discusses the results of this project. Substantial economic benefits for the practical implementation of the technology are projected, especially for modern cells with graphitized blocks. For example, with an energy savings of about 5% and an increase in pot life from 1500 to 2500 days, a cost savings of $ 0.023 per pound of aluminum produced is projected for a 200 kA pot.

  3. Regression analysis of cytopathological data

    SciTech Connect

    Whittemore, A.S.; McLarty, J.W.; Fortson, N.; Anderson, K.

    1982-12-01

    Epithelial cells from the human body are frequently labelled according to one of several ordered levels of abnormality, ranging from normal to malignant. The label of the most abnormal cell in a specimen determines the score for the specimen. This paper presents a model for the regression of specimen scores against continuous and discrete variables, as in host exposure to carcinogens. Application to data and tests for adequacy of model fit are illustrated using sputum specimens obtained from a cohort of former asbestos workers.

  4. Regression analysis of reported earthquake precursors. I. Presentation of data

    NASA Astrophysics Data System (ADS)

    Niazi, Mansour

    1984-11-01

    Around 700 reported precursors of about 350 earthquakes, including the negative observations, have been compiled in 11 categories with 31 subdivisions. The data base is subjected to an initial sorting and screening by imposing three restrictions on the ranges of main shock magnitude (M ≥ 4.0), precursory time (t ≤ 20 years), and the epicentral distance of observation points (X_m ≤ 4 × 10^(0.3M)). Of the 31 subcategories of precursory phenomena, 18 with 9 data points or more are independently studied by regressing their precursory times against magnitude. The preliminary results tend to classify the precursors into three groups: 1. The precursors which show weak or no correlation between time and the magnitude of the eventual main shock. Examples of this group are foreshocks and precursory tilt. 2. The precursors which show clear scaling with magnitude. These include seismic velocity ratio (V_p/V_s), travel time delay, duration of seismic quiescence, and, to some degree, the variation of b-value, and anomalous seismicity. 3. The precursors which display clustering of precursory times around a mean value, which differs for different precursors from a few hours to a few years. Examples include the conductivity rate, geoelectric current and potential, strain, water well level, geochemical anomalies, change of focal mechanism, and the enhancement of seismicity reported only for larger earthquakes. Some of the precursors in this category, such as leveling changes and the occurrence of microseismicity, show bimodal patterns of precursory times and may partially be coseismic. In addition, each category with a sufficient number of reported estimates of distance and signal amplitude is subjected to multiple linear regression. The usefulness of these regressions at this stage appears to be limited to specifying which of the parameters shows a more significant correlation. Standard deviations of residuals of precursory time against magnitude are generally reduced when

  5. A rotor optimization using regression analysis

    NASA Technical Reports Server (NTRS)

    Giansante, N.

    1984-01-01

    The design and development of helicopter rotors is subject to the many design variables and their interactions that affect rotor operation. Until recently, selection of rotor design variables to achieve specified rotor operational qualities has been a costly, time-consuming, repetitive task. For the past several years, Kaman Aerospace Corporation has successfully applied multiple linear regression analysis, coupled with optimization and sensitivity procedures, in the analytical design of rotor systems. It is concluded that approximating equations can be developed rapidly for a multiplicity of objective and constraint functions and optimizations can be performed in a rapid and cost-effective manner; the number and/or range of design variables can be increased by expanding the data base and developing approximating functions to reflect the expanded design space; the order of the approximating equations can be expanded easily to improve correlation between analyzer results and the approximating equations; gradients of the approximating equations can be calculated easily and these gradients are smooth functions, reducing the risk of numerical problems in the optimization; the use of approximating functions allows the problem to be started easily and rapidly from various initial designs to enhance the probability of finding a global optimum; and the approximating equations are independent of the analysis or optimization codes used.
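
    The idea of an approximating equation with an easily computed gradient can be sketched for a single design variable. This is an illustrative simplification, not the procedure used in the report: the quadratic form, the function names, and the stand-in "analyzer" values are all assumptions.

```python
def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear(xtx, xty)

def gradient(coeffs, x):
    """Analytic derivative of the approximating equation."""
    return coeffs[1] + 2.0 * coeffs[2] * x

# Hypothetical design-variable samples and stand-in analyzer results.
design_values = [0.0, 1.0, 2.0, 4.0, 5.0, 6.0]
analyzer_results = [11.0 - 6.0 * x + x * x for x in design_values]

coeffs = fit_quadratic(design_values, analyzer_results)
# Stationary point of the fitted quadratic, where the gradient is zero.
x_opt = -coeffs[1] / (2.0 * coeffs[2])
```

    Because the surrogate is a polynomial, its gradient is smooth and cheap to evaluate, which is the property the abstract highlights for the optimization step.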

  6. Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method

    PubMed Central

    Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza

    2016-01-01

    Introduction: Birthweight is one of the most important predictors of health status in adulthood. Having a balanced birthweight is one of the priorities of the health system in most industrial and developed countries. This indicator is used to assess the growth and health status of infants. The aim of this study was to assess the birthweight of neonates by using quantile regression in Zanjan province. Methods: This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and the quantile regression method in SAS 9.2 statistical software. Results: Of 8456 newborn babies, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of neonates (p<0.05) and weight and educational level of the mothers (p<0.05) showed a significant linear relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). Conclusion: This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when an asymmetric response variable or data with outliers is available. PMID:26925889
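
    The sensitivity of least-squares estimates to outliers, and the robustness of quantile-based estimates, can be illustrated with the intercept-only case. This sketch is not the authors' SAS analysis: it uses the check (pinball) loss that underlies quantile regression, without covariates, and the weights below are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss used in quantile regression for quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def best_constant(values, tau):
    """Constant that minimises total pinball loss (searched over data points).

    For an intercept-only model the minimiser is a sample quantile, so it is
    enough to try each observed value as a candidate.
    """
    return min(values,
               key=lambda c: sum(pinball_loss(v - c, tau) for v in values))

# Hypothetical birthweights in grams, including one extreme low value.
weights = [2900, 3000, 3100, 3200, 3300, 3400, 900]

mean_weight = sum(weights) / len(weights)   # least-squares fit of a constant
median_fit = best_constant(weights, 0.5)    # tau = 0.5 gives the median
lower_fit = best_constant(weights, 0.25)    # a lower quantile of the response
```

    Here the single outlier pulls the mean well below the median, whereas the tau = 0.5 fit stays at the central observation; fitting several values of tau describes different parts of the response distribution.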

  7. Multiatlas Segmentation as Nonparametric Regression

    PubMed Central

    Awate, Suyash P.; Whitaker, Ross T.

    2015-01-01

    This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator’s convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528

  8. Variable Selection in ROC Regression

    PubMed Central

    2013-01-01

    Regression models are introduced into the receiver operating characteristic (ROC) analysis to accommodate effects of covariates, such as genes. If many covariates are available, the variable selection issue arises. The traditional induced methodology separately models outcomes of diseased and nondiseased groups; thus, separate application of variable selections to two models will bring barriers in interpretation, due to differences in selected models. Furthermore, in the ROC regression, the accuracy of area under the curve (AUC) should be the focus instead of aiming at the consistency of model selection or the good prediction performance. In this paper, we obtain one single objective function with the group SCAD to select grouped variables, which adapts to popular criteria of model selection, and propose a two-stage framework to apply the focused information criterion (FIC). Some asymptotic properties of the proposed methods are derived. Simulation studies show that the grouped variable selection is superior to separate model selections. Furthermore, the FIC improves the accuracy of the estimated AUC compared with other criteria. PMID:24312135

  9. Selection of Higher Order Regression Models in the Analysis of Multi-Factorial Transcription Data

    PubMed Central

    Prazeres da Costa, Olivia; Hoffman, Arthur; Rey, Johannes W.; Mansmann, Ulrich

    2014-01-01

    Introduction: Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding. Higher order linear regression models can account for these effects. We present a new methodology for linear model selection and apply it to microarray data of bone marrow-derived macrophages. This experiment investigates the influence of three variable factors: the genetic background of the mice from which the macrophages were obtained, Yersinia enterocolitica infection (two strains, and a mock control), and treatment/non-treatment with interferon-γ. Results: We set up four different linear regression models in a hierarchical order. We introduce the eruption plot as a new practical tool for model selection complementary to global testing. It visually compares the size and significance of effect estimates between two nested models. Using this methodology we were able to select the most appropriate model by keeping only relevant factors showing additional explanatory power. Application to experimental data allowed us to qualify the interaction of factors as either neutral (no interaction), alleviating (co-occurring effects are weaker than expected from the single effects), or aggravating (stronger than expected). We find a biologically meaningful gene cluster of putative C2TA target genes that appear to be co-regulated with MHC class II genes. Conclusions: We introduced the eruption plot as a tool for visual model comparison to identify relevant higher order interactions in the analysis of expression data obtained under the influence of multiple factors. We conclude that model selection in higher order linear regression models should generally be performed for the analysis of multi-factorial microarray data. PMID:24658540

  10. Isotope labeling studies on the formation of multiple addition products of alanine in the pyrolysis residue of glucose/alanine mixtures by high-resolution ESI-TOF-MS.

    PubMed

    Chu, Fong Lam; Sleno, Lekha; Yaylayan, Varoujan A

    2011-11-01

    Pyrolysis was used as a microscale sample preparation tool to generate glucose/alanine reaction products to minimize the use of expensive labeled precursors in isotope labeling studies. The residue remaining after the pyrolysis at 250 °C was analyzed by electrospray time-of-flight mass spectrometry (ESI-TOF-MS). It was observed that a peak at m/z 199.1445 in the ESI-TOF-MS spectrum appeared only when the model system contained at least 2-fold excess alanine. The accurate mass determination indeed indicated the presence of two nitrogen atoms in the molecular formula (C10H18N2O2). To verify the origin of the carbon atoms in this unknown compound, model studies with [13U6]glucose, [13C-1]alanine, [13C-2]alanine, [13C-3]alanine, and [15N]alanine were also performed. Glucose furnished six carbon atoms, and alanine provided four carbon (2 × C-2 and 2 × C-3) and two nitrogen atoms. When commercially available fructosylalanine (N-attached to C-1) was reacted with only 1 mol of alanine, a peak at m/z 199.1445 was once again observed. In addition, when 3-deoxyglucosone (3-DG) was reacted with a 2-fold excess of alanine, a peak at m/z 199.1433 was also generated, confirming the points of attachment of the two amino acids at C-1 and C-2 atoms of 3-DG. These studies have indicated that amino acids can undergo multiple addition reactions with 1,2-dicarbonyl compounds such as 3-deoxyglucosone and eventually form a tetrahydropyrazine moiety.

  11. Evaluating Additive Interaction Using Survival Percentiles.

    PubMed

    Bellavia, Andrea; Bottai, Matteo; Orsini, Nicola

    2016-05-01

    Evaluation of statistical interaction in time-to-event analysis is usually limited to the study of multiplicative interaction, via inclusion of a product term in a Cox proportional-hazard model. Measures of additive interaction are available but seldom used. All measures of interaction in survival analysis, whether additive or multiplicative, are in the metric of hazard, usually assuming that the interaction between two predictors of interest is constant during the follow-up period. We introduce a measure to evaluate additive interaction in survival analysis in the metric of time. This measure can be calculated by evaluating survival percentiles, defined as the time points by which different subpopulations reach the same incidence proportion. Using this approach, the probability of the outcome is fixed and the time variable is estimated. We also show that by using a regression model for the evaluation of conditional survival percentiles, including a product term between the two exposures in the model, interaction is evaluated as a deviation from additivity of the effects. In the simple case of two binary exposures, the product term is interpreted as excess/decrease in survival time (i.e., years, months, days) due to the presence of both exposures. This measure of interaction is dependent on the fraction of events being considered, thus allowing evaluation of how interaction changes during the observed follow-up. Evaluation of interaction in the context of survival percentiles allows deriving a measure of additive interaction without assuming a constant effect over time, overcoming two main limitations of commonly used approaches.
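
    For two binary exposures, the additive interaction on the time scale can be written as a contrast of survival percentiles across the four exposure groups. The sketch below is a simplified illustration, not the authors' regression model: it assumes no censoring, takes the percentile directly from sorted event times, and uses hypothetical data and function names.

```python
def survival_percentile(event_times, percent):
    """Time by which `percent` % of the group has experienced the event.

    Assumes every subject's event time is observed (no censoring).
    """
    ordered = sorted(event_times)
    n = len(ordered)
    k = -((-percent * n) // 100)  # integer ceiling of percent * n / 100
    return ordered[k - 1]

# Hypothetical event times in years for the four exposure combinations.
neither = [8.0, 10.0, 14.0, 16.0, 19.0]
only_a = [5.0, 7.0, 9.0, 12.0, 15.0]
only_b = [6.0, 8.0, 11.0, 13.0, 17.0]
both = [2.0, 3.0, 5.0, 6.0, 9.0]

percent = 40
t00 = survival_percentile(neither, percent)
t10 = survival_percentile(only_a, percent)
t01 = survival_percentile(only_b, percent)
t11 = survival_percentile(both, percent)

# Survival time expected for the doubly exposed if the effects were additive.
expected_additive = t00 + (t10 - t00) + (t01 - t00)
# Deviation from additivity; this is what a product term would estimate
# in a linear model for the percentile with two binary exposures.
interaction = t11 - t10 - t01 + t00
```

    With these numbers the doubly exposed group reaches 40% incidence two years earlier than additivity would predict. Repeating the calculation for other values of `percent` shows how the interaction changes over follow-up, as the abstract describes.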

  12. Probing for the Multiplicative Term in Modern Expectancy-Value Theory: A Latent Interaction Modeling Study

    ERIC Educational Resources Information Center

    Trautwein, Ulrich; Marsh, Herbert W.; Nagengast, Benjamin; Ludtke, Oliver; Nagy, Gabriel; Jonkmann, Kathrin

    2012-01-01

    In modern expectancy-value theory (EVT) in educational psychology, expectancy and value beliefs additively predict performance, persistence, and task choice. In contrast to earlier formulations of EVT, the multiplicative term Expectancy x Value in regression-type models typically plays no major role in educational psychology. The present study…

  13. Evaluation and application of regional turbidity-sediment regression models in Virginia

    USGS Publications Warehouse

    Hyer, Kenneth; Jastram, John D.; Moyer, Douglas; Webber, James; Chanat, Jeffrey G.

    2015-01-01

    Conventional thinking has long held that turbidity-sediment surrogate-regression equations are site specific and that regression equations developed at a single monitoring station should not be applied to another station; however, few studies have evaluated this issue in a rigorous manner. If robust regional turbidity-sediment models can be developed successfully, their applications could greatly expand the usage of these methods. Suspended sediment load estimation could occur as soon as flow and turbidity monitoring commence at a site, suspended sediment sampling frequencies for various projects potentially could be reduced, and special-project applications (sediment monitoring following dam removal, for example) could be significantly enhanced. The objective of this effort was to investigate the turbidity-suspended sediment concentration (SSC) relations at all available USGS monitoring sites within Virginia to determine whether meaningful turbidity-sediment regression models can be developed by combining the data from multiple monitoring stations into a single model, known as a “regional” model. Following the development of the regional model, additional objectives included a comparison of predicted SSCs between the regional model and commonly used site-specific models, as well as an evaluation of why specific monitoring stations did not fit the regional model.

  14. Understanding and Interpreting Regression Parameter Estimates in Given Contexts: A Monte Carlo Study of Characteristics of Regression and Structural Coefficients, Effect Size R Squared and Significance Level of Predictors.

    ERIC Educational Resources Information Center

    Jiang, Ying Hong; Smith, Philip L.

    This Monte Carlo study explored relationships among standard and unstandardized regression coefficients, structural coefficients, multiple R squared, and significance level of predictors for a variety of linear regression scenarios. Ten regression models with three predictors were included, and four conditions were varied that were expected to…

  15. Heritability Across the Distribution: An Application of Quantile Regression

    PubMed Central

    Petrill, Stephen A.; Hart, Sara A.; Schatschneider, Christopher; Thompson, Lee A.; Deater-Deckard, Kirby; DeThorne, Laura S.; Bartlett, Christopher

    2016-01-01

    We introduce a new method for analyzing twin data called quantile regression. Through the application presented here, quantile regression is able to assess the genetic and environmental etiology of any skill or ability, at multiple points in the distribution of that skill or ability. This method is compared to the Cherny et al. (Behav Genet 22:153–162, 1992) method in an application to four different reading-related outcomes in 304 pairs of first-grade same sex twins enrolled in the Western Reserve Reading Project. Findings across the two methods were similar; both indicated some variation across the distribution of the genetic and shared environmental influences on non-word reading. However, quantile regression provides more details about the location and size of the measured effect. Applications of the technique are discussed. PMID:21877231

  16. Semiparametric regression during 2003–2007

    PubMed Central

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2010-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800

  17. Nonlinear Identification Using Orthogonal Forward Regression With Nested Optimal Regularization.

    PubMed

    Hong, Xia; Chen, Sheng; Gao, Junbin; Harris, Chris J

    2015-12-01

    An efficient data-based modeling algorithm for nonlinear system identification is introduced for radial basis function (RBF) neural networks with the aim of maximizing generalization capability based on the concept of leave-one-out (LOO) cross validation. Each of the RBF kernels has its own kernel width parameter and the basic idea is to optimize the multiple pairs of regularization parameters and kernel widths, each of which is associated with a kernel, one at a time within the orthogonal forward regression (OFR) procedure. Thus, each OFR step consists of one model term selection based on the LOO mean square error (LOOMSE), followed by the optimization of the associated kernel width and regularization parameter, also based on the LOOMSE. Because the same LOOMSE is adopted for model selection as in our previous state-of-the-art local regularization assisted orthogonal least squares (LROLS) algorithm, our proposed new OFR algorithm is also capable of producing a very sparse RBF model with excellent generalization performance. Unlike our previous LROLS algorithm, which requires an additional iterative loop to optimize the regularization parameters as well as an additional procedure to optimize the kernel width, the proposed new OFR algorithm optimizes both the kernel widths and regularization parameters within the single OFR procedure, and consequently the required computational complexity is dramatically reduced. Nonlinear system identification examples are included to demonstrate the effectiveness of this new approach in comparison to the well-known approaches of support vector machine and least absolute shrinkage and selection operator as well as the LROLS algorithm.
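
    The role of the LOOMSE can be illustrated for a model with a single regularized RBF term, where the leave-one-out residuals are available in closed form without refitting. This is a sketch of the general idea only, not the authors' OFR algorithm: the grid search over widths, the fixed regularization value, the function names, and the data are all assumptions.

```python
import math

def rbf(x, centre, width):
    """Gaussian radial basis function."""
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))

def loo_mse(phi, y, lam):
    """Closed-form LOO mean square error for y ~ theta * phi with ridge lam."""
    a = sum(p * p for p in phi) + lam
    theta = sum(p * t for p, t in zip(phi, y)) / a
    total = 0.0
    for p, t in zip(phi, y):
        residual = t - p * theta
        leverage = p * p / a          # diagonal of the regularised hat matrix
        total += (residual / (1.0 - leverage)) ** 2
    return total / len(y)

def loo_mse_brute(phi, y, lam):
    """Same quantity computed by refitting with each point left out."""
    total = 0.0
    for i in range(len(y)):
        num = sum(p * t for j, (p, t) in enumerate(zip(phi, y)) if j != i)
        den = sum(p * p for j, p in enumerate(phi) if j != i) + lam
        total += (y[i] - phi[i] * num / den) ** 2
    return total / len(y)

# Hypothetical one-dimensional data and candidate kernel widths.
xs = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
ys = [0.15, 0.58, 0.90, 1.02, 0.86, 0.63, 0.12]
lam = 1e-3
widths = [0.3, 1.0, 3.0]

scores = {w: loo_mse([rbf(x, 0.0, w) for x in xs], ys, lam) for w in widths}
best_width = min(widths, key=lambda w: scores[w])
```

    The closed form avoids one refit per data point, which is why a LOO criterion can be evaluated cheaply inside a forward selection loop.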

  18. A Heterogeneous Bayesian Regression Model for Cross-Sectional Data Involving a Single Observation per Response Unit

    ERIC Educational Resources Information Center

    Fong, Duncan K. H.; Ebbes, Peter; DeSarbo, Wayne S.

    2012-01-01

    Multiple regression is frequently used across the various social sciences to analyze cross-sectional data. However, it can often be challenging to justify the assumption of common regression coefficients across all respondents. This manuscript presents a heterogeneous Bayesian regression model that enables the estimation of…

  19. [Spatial interpolation of soil organic matter using regression Kriging and geographically weighted regression Kriging].

    PubMed

    Yang, Shun-hua; Zhang, Hai-tao; Guo, Long; Ren, Yan

    2015-06-01

    Relative elevation and stream power index were selected as auxiliary variables based on correlation analysis for mapping soil organic matter. Geographically weighted regression Kriging (GWRK) and regression Kriging (RK) were used for spatial interpolation of soil organic matter and compared with ordinary Kriging (OK), which acts as a control. The results indicated that soil organic matter was significantly positively correlated with relative elevation whilst it had a significantly negative correlation with stream power index. Semivariance analysis showed that both soil organic matter content and its residuals (including ordinary least square regression residual and GWR residual) had strong spatial autocorrelation. Interpolation accuracies by different methods were estimated based on a data set of 98 validation samples. Results showed that the mean error (ME), mean absolute error (MAE) and root mean square error (RMSE) of RK were respectively 39.2%, 17.7% and 20.6% lower than the corresponding values of OK, with a relative improvement (RI) of 20.63. GWRK showed a similar tendency, with its ME, MAE and RMSE respectively 60.6%, 23.7% and 27.6% lower than those of OK, and an RI of 59.79. Therefore, both RK and GWRK significantly improved the accuracy of OK interpolation of soil organic matter due to their incorporation of auxiliary variables. In addition, GWRK performed obviously better than RK did in this study, and its improved performance should be attributed to the consideration of sample spatial locations. PMID:26572015
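
    The validation statistics named in this record (ME, MAE, RMSE) and a percentage reduction relative to a baseline method can be computed as in the sketch below. The data are hypothetical and the function names are illustrative; the abstract does not define its relative improvement (RI) index, so that quantity is not reproduced here.

```python
import math

def error_metrics(observed, predicted):
    """Return (ME, MAE, RMSE) of predictions against validation samples."""
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    me = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return me, mae, rmse

def percent_reduction(baseline, candidate):
    """How much lower `candidate` is than `baseline`, in percent."""
    return 100.0 * (baseline - candidate) / baseline

# Hypothetical soil organic matter values at four validation points.
observed = [20.0, 25.0, 30.0, 35.0]
ok_pred = [24.0, 22.0, 35.0, 33.0]   # stand-in for ordinary Kriging
rk_pred = [22.0, 24.0, 32.0, 34.0]   # stand-in for regression Kriging

ok_me, ok_mae, ok_rmse = error_metrics(observed, ok_pred)
rk_me, rk_mae, rk_rmse = error_metrics(observed, rk_pred)

mae_reduction = percent_reduction(ok_mae, rk_mae)
rmse_reduction = percent_reduction(ok_rmse, rk_rmse)
```

    Because ME can be negative or close to zero, a percentage reduction in ME is best computed on its absolute value and interpreted with care.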

  20. Developmental Regression in Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Rogers, Sally J.

    2004-01-01

    The occurrence of developmental regression in autism is one of the more puzzling features of this disorder. Although several studies have documented the validity of parental reports of regression using home videos, accumulating data suggest that most children who demonstrate regression also demonstrated previous, subtle, developmental differences.…