Regression Commonality Analysis: A Technique for Quantitative Theory Building
ERIC Educational Resources Information Center
Nimon, Kim; Reio, Thomas G., Jr.
2011-01-01
When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
Regression: The Apple Does Not Fall Far From the Tree.
Vetter, Thomas R; Schober, Patrick
2018-05-15
Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
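As a minimal illustration of the model families this tutorial names, the following sketch fits simple linear, multiple linear, logistic, and Poisson regressions on simulated data with Python's statsmodels; it is not from the article, and all variable names are invented.

```python
# Hedged sketch (simulated data): the regression types named in the tutorial, fit with statsmodels.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.normal(50, 10, n),
    "dose": rng.uniform(0, 10, n),
})
df["sbp"] = 110 + 0.5 * df["age"] + 2.0 * df["dose"] + rng.normal(0, 8, n)    # continuous outcome
df["event"] = rng.binomial(1, 1 / (1 + np.exp(-(-5 + 0.08 * df["age"]))), n)  # binary outcome
df["visits"] = rng.poisson(np.exp(0.01 * df["age"]), n)                       # count outcome

simple   = smf.ols("sbp ~ age", data=df).fit()               # simple linear regression
multiple = smf.ols("sbp ~ age + dose", data=df).fit()        # multiple linear regression
logistic = smf.logit("event ~ age + dose", data=df).fit()    # logistic regression
poisson  = smf.poisson("visits ~ age + dose", data=df).fit() # Poisson regression
# Ordinal regression is also available in recent statsmodels versions (OrderedModel), omitted here.

print(multiple.summary())
```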
Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis
ERIC Educational Resources Information Center
Kim, Rae Seon
2011-01-01
When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…
Beyond Multiple Regression: Using Commonality Analysis to Better Understand R² Results
ERIC Educational Resources Information Center
Warne, Russell T.
2011-01-01
Multiple regression is one of the most common statistical methods used in quantitative educational research. Despite the versatility and easy interpretability of multiple regression, it has some shortcomings in the detection of suppressor variables and for somewhat arbitrarily assigning values to the structure coefficients of correlated…
Hoch, Jeffrey S; Dewa, Carolyn S
2014-04-01
Economic evaluations commonly accompany trials of new treatments or interventions; however, regression methods and their corresponding advantages for the analysis of cost-effectiveness data are not well known. To illustrate regression-based economic evaluation, we present a case study investigating the cost-effectiveness of a collaborative mental health care program for people receiving short-term disability benefits for psychiatric disorders. We implement net benefit regression to illustrate its strengths and limitations. Net benefit regression offers a simple option for cost-effectiveness analyses of person-level data. By placing economic evaluation in a regression framework, regression-based techniques can facilitate the analysis and provide simple solutions to commonly encountered challenges. Economic evaluations of person-level data (eg, from a clinical trial) should use net benefit regression to facilitate analysis and enhance results.
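A hedged sketch of the net benefit regression idea described above, on simulated person-level data rather than the case-study data: each person's net benefit is willingness-to-pay times effect minus cost, and regressing it on a treatment indicator gives the incremental net benefit as the treatment coefficient. All numbers and variable names are illustrative.

```python
# Sketch of net benefit regression on simulated trial data (illustrative values only).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 300
treat = rng.integers(0, 2, n)                            # 1 = new program, 0 = usual care
effect = 0.60 + 0.10 * treat + rng.normal(0, 0.15, n)    # e.g. QALYs per person
cost = 4000 + 1500 * treat + rng.normal(0, 800, n)       # e.g. dollars per person

lambda_wtp = 50_000                                      # willingness to pay per unit of effect
net_benefit = lambda_wtp * effect - cost                 # person-level net benefit

X = sm.add_constant(treat.astype(float))
fit = sm.OLS(net_benefit, X).fit(cov_type="HC1")         # robust SEs are common for cost data
print(fit.params[1], fit.conf_int()[1])                  # incremental net benefit and its CI
```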
Stepwise versus Hierarchical Regression: Pros and Cons
ERIC Educational Resources Information Center
Lewis, Mitzi
2007-01-01
Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…
Common pitfalls in statistical analysis: Linear regression analysis
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR), which does not require ...
John W. Edwards; Susan C. Loeb; David C. Guynn
1994-01-01
Multiple regression and use-availability analyses are two methods for examining habitat selection. Use-availability analysis is commonly used to evaluate macrohabitat selection whereas multiple regression analysis can be used to determine microhabitat selection. We compared these techniques using behavioral observations (n = 5534) and telemetry locations (n = 2089) of...
Hierarchical Multiple Regression in Counseling Research: Common Problems and Possible Remedies.
ERIC Educational Resources Information Center
Petrocelli, John V.
2003-01-01
A brief content analysis was conducted on the use of hierarchical regression in counseling research published in the "Journal of Counseling Psychology" and the "Journal of Counseling & Development" during the years 1997-2001. Common problems are cited and possible remedies are described. (Contains 43 references and 3 tables.) (Author)
ERIC Educational Resources Information Center
Beauducel, Andre
2007-01-01
It was investigated whether commonly used factor score estimates lead to the same reproduced covariance matrix of observed variables. This was achieved by means of Schonemann and Steiger's (1976) regression component analysis, since it is possible to compute the reproduced covariance matrices of the regression components corresponding to different…
A Simulation Investigation of Principal Component Regression.
ERIC Educational Resources Information Center
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Interquantile Shrinkage in Regression Models
Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.
2012-01-01
Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of common features is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods will shrink the slopes towards constant and thus improve the estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimations with competitive or higher efficiency than the standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546
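The penalized estimators themselves are not reproduced here, but the following sketch (simulated, heteroscedastic data) shows the conventional quantile-by-quantile fitting they improve on, and the kind of interquantile commonality in the slope that the penalties are designed to exploit.

```python
# Illustrative sketch: the same quantile regression fit at several quantile levels,
# to inspect whether a slope is roughly constant over a region of quantiles.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(0, 10, n)
y = 1.0 + 0.5 * x + (0.2 + 0.05 * x) * rng.normal(size=n)   # heteroscedastic errors

X = sm.add_constant(x)
for tau in (0.1, 0.25, 0.5, 0.75, 0.9):
    slope = sm.QuantReg(y, X).fit(q=tau).params[1]
    print(f"quantile {tau:.2f}: slope = {slope:.3f}")
```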
The Variance Normalization Method of Ridge Regression Analysis.
ERIC Educational Resources Information Center
Bulcock, J. W.; And Others
The testing of contemporary sociological theory often calls for the application of structural-equation models to data which are inherently collinear. It is shown that simple ridge regression, which is commonly used for controlling the instability of ordinary least squares regression estimates in ill-conditioned data sets, is not a legitimate…
USDA-ARS?s Scientific Manuscript database
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...
Two Paradoxes in Linear Regression Analysis.
Feng, Ge; Peng, Jing; Tu, Dongke; Zheng, Julia Z; Feng, Changyong
2016-12-25
Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection.
ERIC Educational Resources Information Center
Li, Spencer D.
2011-01-01
Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…
Quantile Regression in the Study of Developmental Sciences
ERIC Educational Resources Information Center
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of…
John Hogland; Nedret Billor; Nathaniel Anderson
2013-01-01
Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Frndak, Seth E; Smerbeck, Audrey M; Irwin, Lauren N; Drake, Allison S; Kordovski, Victoria M; Kunker, Katrina A; Khan, Anjum L; Benedict, Ralph H B
2016-10-01
We endeavored to clarify how distinct co-occurring symptoms relate to the presence of negative work events in employed multiple sclerosis (MS) patients. Latent profile analysis (LPA) was utilized to elucidate common disability patterns by isolating patient subpopulations. Samples of 272 employed MS patients and 209 healthy controls (HC) were administered neuroperformance tests of ambulation, hand dexterity, processing speed, and memory. Regression-based norms were created from the HC sample. LPA identified latent profiles using the regression-based z-scores. Finally, multinomial logistic regression tested for negative work event differences among the latent profiles. Four profiles were identified via LPA: a common profile (55%) characterized by slightly below average performance in all domains, a broadly low-performing profile (18%), a poor motor abilities profile with average cognition (17%), and a generally high-functioning profile (9%). Multinomial regression analysis revealed that the uniformly low-performing profile demonstrated a higher likelihood of reported negative work events. Employed MS patients with co-occurring motor, memory and processing speed impairments were most likely to report a negative work event, classifying them as uniquely at risk for job loss.
Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression
DOE Office of Scientific and Technical Information (OSTI.GOV)
Verdoolaege, G., E-mail: geert.verdoolaege@ugent.be; Laboratory for Plasma Physics, Royal Military Academy, B-1000 Brussels; Shabbir, A.
Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.
NASA Astrophysics Data System (ADS)
Haddad, Khaled; Rahman, Ataur; A Zaman, Mohammad; Shrestha, Surendra
2013-03-01
In regional hydrologic regression analysis, model selection and validation are regarded as important steps. Here, the model selection is usually based on some measurements of goodness-of-fit between the model prediction and observed data. In Regional Flood Frequency Analysis (RFFA), leave-one-out (LOO) validation or a fixed percentage leave out validation (e.g., 10%) is commonly adopted to assess the predictive ability of regression-based prediction equations. This paper develops a Monte Carlo Cross Validation (MCCV) technique (which has widely been adopted in Chemometrics and Econometrics) in RFFA using Generalised Least Squares Regression (GLSR) and compares it with the most commonly adopted LOO validation approach. The study uses simulated and regional flood data from the state of New South Wales in Australia. It is found that when developing hydrologic regression models, application of the MCCV is likely to result in a more parsimonious model than the LOO. It has also been found that the MCCV can provide a more realistic estimate of a model's predictive ability when compared with the LOO.
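The contrast between the two validation schemes can be made concrete with a short sketch: leave-one-out holds out each observation once, while Monte Carlo cross-validation repeatedly holds out a random fraction. The data and model below are simulated stand-ins, not the New South Wales flood data or the GLSR model used in the paper.

```python
# Sketch comparing leave-one-out (LOO) validation with Monte Carlo cross-validation (MCCV)
# for a regression-based prediction equation (simulated data, ordinary least squares).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, ShuffleSplit, cross_val_score

rng = np.random.default_rng(3)
n = 60
X = rng.normal(size=(n, 3))                                   # e.g. log area, rainfall, slope
y = X @ np.array([0.8, 0.3, 0.1]) + rng.normal(0, 0.5, n)     # e.g. log flood quantile

model = LinearRegression()
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut(),
                             scoring="neg_mean_squared_error")
mccv = ShuffleSplit(n_splits=200, test_size=0.1, random_state=0)  # repeated random 10% hold-outs
mccv_scores = cross_val_score(model, X, y, cv=mccv,
                              scoring="neg_mean_squared_error")

print("LOO  RMSE:", np.sqrt(-loo_scores.mean()))
print("MCCV RMSE:", np.sqrt(-mccv_scores.mean()))
```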
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology, particularly for determining the associations among multiple constituents of surface water and landscape configuration. Common dat...
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises both multiplicative and additive interaction. The former is only of statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction, and the synergy index alongside the logistic and Cox regression interaction terms, and Wald, delta, and profile-likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions.
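The additive-interaction measures mentioned above can be computed from the coefficients of a logistic regression with two binary exposures and their product term. The sketch below uses the standard formulas for the relative excess risk due to interaction (RERI), attributable proportion (AP), and synergy index (S); treating the abstract's "contrast ratio" as the RERI is an assumption, the data are simulated, and the delta-method and profile-likelihood confidence intervals described in the paper are omitted.

```python
# Hedged sketch: additive interaction measures from a logistic regression (simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 2000
df = pd.DataFrame({"a": rng.integers(0, 2, n), "b": rng.integers(0, 2, n)})
logit_p = -2 + 0.7 * df.a + 0.5 * df.b + 0.6 * df.a * df.b
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

fit = smf.logit("y ~ a + b + a:b", data=df).fit(disp=False)
b1, b2, b3 = fit.params["a"], fit.params["b"], fit.params["a:b"]

or11, or10, or01 = np.exp(b1 + b2 + b3), np.exp(b1), np.exp(b2)
reri = or11 - or10 - or01 + 1                   # relative excess risk due to interaction
ap = reri / or11                                # attributable proportion due to interaction
s = (or11 - 1) / ((or10 - 1) + (or01 - 1))      # synergy index
print(reri, ap, s)
```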
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology to study the associations among constituents of surface water and landscapes. Common data problems in ecological studies include: s...
Using Robust Variance Estimation to Combine Multiple Regression Estimates with Meta-Analysis
ERIC Educational Resources Information Center
Williams, Ryan
2013-01-01
The purpose of this study was to explore the use of robust variance estimation for combining commonly specified multiple regression models and for combining sample-dependent focal slope estimates from diversely specified models. The proposed estimator obviates traditionally required information about the covariance structure of the dependent…
Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung
2015-12-01
This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloid and underlying diabetes mellitus. The logistic regression explained 66.7% of the variation in the data in terms of sensitivity and 88.9% in terms of specificity. The decision tree analysis accounted for 55.0% of the variation in the data in terms of sensitivity and 89.0% in terms of specificity. As for the overall classification accuracy, the logistic regression explained 88.0% and the decision tree analysis explained 87.2%. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy. Copyright © 2015 Elsevier Ltd. All rights reserved.
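The comparison described above can be sketched generically as follows: fit a logistic regression and a decision tree on the same data and compare sensitivity, specificity, and overall accuracy. The data, predictors, and tree depth below are simulated placeholders, not the hospital data or the SPSS/Modeler settings used in the study.

```python
# Sketch: logistic regression versus a decision tree, compared on sensitivity/specificity/accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(5)
n = 700
X = rng.normal(size=(n, 4))                      # e.g. drug-class indicators, comorbidity scores
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ [1.2, 0.8, 0.0, 0.3] - 0.5))))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
for name, clf in [("logistic", LogisticRegression(max_iter=1000)),
                  ("tree", DecisionTreeClassifier(max_depth=4, random_state=0))]:
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
    print(name,
          "sensitivity", tp / (tp + fn),
          "specificity", tn / (tn + fp),
          "accuracy", (tp + tn) / len(y_te))
```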
Vitamin D insufficiency and subclinical atherosclerosis in non-diabetic males living with HIV.
Portilla, Joaquín; Moreno-Pérez, Oscar; Serna-Candel, Carmen; Escoín, Corina; Alfayate, Rocio; Reus, Sergio; Merino, Esperanza; Boix, Vicente; Giner, Livia; Sánchez-Payá, José; Picó, Antonio
2014-01-01
Vitamin D insufficiency (VDI) has been associated with increased cardiovascular risk in the non-HIV population. This study evaluates the relationship among serum 25-hydroxyvitamin D [25(OH)D] levels, cardiovascular risk factors, adipokines, antiviral therapy (ART) and subclinical atherosclerosis in HIV-infected males. A cross-sectional study in ambulatory care was made in non-diabetic patients living with HIV. VDI was defined as 25(OH)D serum levels <75 nmol/L. Fasting lipids, glucose, inflammatory markers (tumour necrosis factor-α, interleukin-6, high-sensitivity C-reactive protein) and endothelial markers (plasminogen activator inhibitor-1, or PAI-I) were measured. The common carotid artery intima-media thickness (C-IMT) was determined. A multivariate logistic regression analysis was made to identify factors associated with the presence of VDI, while multivariate linear regression analysis was used to identify factors associated with common C-IMT. Eighty-nine patients were included (age 42 ± 8 years), 18.9% were in CDC (US Centers for Disease Control and Prevention) stage C and 75 were on ART. VDI was associated with ART exposure, sedentary lifestyle, higher triglycerides levels and PAI-I. In univariate analysis, VDI was associated with greater common C-IMT. The multivariate linear regression model, adjusted by confounding factors, revealed an independent association between common C-IMT and patient age, time of exposure to protease inhibitors (PIs) and impaired fasting glucose (IFG). In contrast, there were no independent associations between common C-IMT and VDI or inflammatory and endothelial markers. VDI was not independently associated with subclinical atherosclerosis in non-diabetic males living with HIV. Older age, a longer exposure to PIs, and IFG were independent factors associated with common C-IMT in this population.
Confidence Intervals for Squared Semipartial Correlation Coefficients: The Effect of Nonnormality
ERIC Educational Resources Information Center
Algina, James; Keselman, H. J.; Penfield, Randall D.
2010-01-01
The increase in the squared multiple correlation coefficient (ΔR²) associated with a variable in a regression equation is a commonly used measure of importance in regression analysis. Algina, Keselman, and Penfield found that intervals based on asymptotic principles were typically very inaccurate, even though the sample size…
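The point estimate of the quantity discussed above is simply the difference in R² between nested models; the sketch below (simulated data) computes it. Interval estimation, the focus of the abstract, would require bootstrap or noncentral-distribution methods and is not shown.

```python
# Minimal sketch: the squared semipartial correlation as an increase in R-squared.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 150
x1 = rng.normal(size=n)
x2 = 0.4 * x1 + rng.normal(size=n)
y = 0.5 * x1 + 0.3 * x2 + rng.normal(size=n)

r2_reduced = sm.OLS(y, sm.add_constant(np.column_stack([x1]))).fit().rsquared
r2_full    = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().rsquared
delta_r2 = r2_full - r2_reduced          # increase in R-squared attributable to x2
print(delta_r2)
```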
Early Home Activities and Oral Language Skills in Middle Childhood: A Quantile Analysis
ERIC Educational Resources Information Center
Law, James; Rush, Robert; King, Tom; Westrupp, Elizabeth; Reilly, Sheena
2018-01-01
Oral language development is a key outcome of elementary school, and it is important to identify factors that predict it most effectively. Commonly researchers use ordinary least squares regression with conclusions restricted to average performance conditional on relevant covariates. Quantile regression offers a more sophisticated alternative.…
Robust mislabel logistic regression without modeling mislabel probabilities.
Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun
2018-03-01
Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt robust M-estimation that down-weights suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
Handling nonnormality and variance heterogeneity for quantitative sublethal toxicity tests.
Ritz, Christian; Van der Vliet, Leana
2009-09-01
The advantages of using regression-based techniques to derive endpoints from environmental toxicity data are clear, and slowly, this superior analytical technique is gaining acceptance. As use of regression-based analysis becomes more widespread, some of the associated nuances and potential problems come into sharper focus. Looking at data sets that cover a broad spectrum of standard test species, we noticed that some model fits to data failed to meet two key assumptions-variance homogeneity and normality-that are necessary for correct statistical analysis via regression-based techniques. Failure to meet these assumptions often is caused by reduced variance at the concentrations showing severe adverse effects. Although commonly used with linear regression analysis, transformation of the response variable only is not appropriate when fitting data using nonlinear regression techniques. Through analysis of sample data sets, including Lemna minor, Eisenia andrei (terrestrial earthworm), and algae, we show that both the so-called Box-Cox transformation and use of the Poisson distribution can help to correct variance heterogeneity and nonnormality and so allow nonlinear regression analysis to be implemented. Both the Box-Cox transformation and the Poisson distribution can be readily implemented into existing protocols for statistical analysis. By correcting for nonnormality and variance heterogeneity, these two statistical tools can be used to encourage the transition to regression-based analysis and the depreciation of less-desirable and less-flexible analytical techniques, such as linear interpolation.
[How to fit and interpret multilevel models using SPSS].
Pardo, Antonio; Ruiz, Miguel A; San Martín, Rafael
2007-05-01
Hierarchical or multilevel models are used to analyse data when cases belong to known groups and sample units are selected both from the individual level and from the group level. In this work, the multilevel models most commonly discussed in the statistical literature are described, explaining how to fit these models using the SPSS program (any version from the 11th onward) and how to interpret the outcomes of the analysis. Five particular models are described, fitted, and interpreted: (1) one-way analysis of variance with random effects, (2) regression analysis with means-as-outcomes, (3) one-way analysis of covariance with random effects, (4) regression analysis with random coefficients, and (5) regression analysis with means- and slopes-as-outcomes. All models are explained, with the aim of making them understandable to researchers in the health and behaviour sciences.
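The article works entirely in SPSS; as a rough Python analogue (an assumption of this note, not the article's syntax), two of the five listed models, the random-effects ANOVA and the random-coefficients regression, can be sketched with statsmodels' mixed linear model on simulated grouped data.

```python
# Hedged Python analogue of two of the listed multilevel models (simulated data, statsmodels).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
groups, per = 30, 20
df = pd.DataFrame({
    "group": np.repeat(np.arange(groups), per),
    "x": rng.normal(size=groups * per),
})
u0 = np.repeat(rng.normal(0, 1.0, groups), per)      # group-level intercept deviations
u1 = np.repeat(rng.normal(0, 0.3, groups), per)      # group-level slope deviations
df["y"] = 2 + u0 + (0.5 + u1) * df["x"] + rng.normal(0, 1, len(df))

m1 = smf.mixedlm("y ~ 1", df, groups=df["group"]).fit()                   # one-way ANOVA, random effects
m4 = smf.mixedlm("y ~ x", df, groups=df["group"], re_formula="~x").fit()  # random coefficients
print(m4.summary())
```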
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Otwombe, Kennedy N.; Petzold, Max; Martinson, Neil; Chirwa, Tobias
2014-01-01
Background Research in the predictors of all-cause mortality in HIV-infected people has widely been reported in literature. Making an informed decision requires understanding the methods used. Objectives We present a review on study designs, statistical methods and their appropriateness in original articles reporting on predictors of all-cause mortality in HIV-infected people between January 2002 and December 2011. Statistical methods were compared between 2002–2006 and 2007–2011. Time-to-event analysis techniques were considered appropriate. Data Sources Pubmed/Medline. Study Eligibility Criteria Original English-language articles were abstracted. Letters to the editor, editorials, reviews, systematic reviews, meta-analysis, case reports and any other ineligible articles were excluded. Results A total of 189 studies were identified (n = 91 in 2002–2006 and n = 98 in 2007–2011) out of which 130 (69%) were prospective and 56 (30%) were retrospective. One hundred and eighty-two (96%) studies described their sample using descriptive statistics while 32 (17%) made comparisons using t-tests. Kaplan-Meier methods for time-to-event analysis were commonly used in the earlier period (n = 69, 76% vs. n = 53, 54%, p = 0.002). Predictors of mortality in the two periods were commonly determined using Cox regression analysis (n = 67, 75% vs. n = 63, 64%, p = 0.12). Only 7 (4%) used advanced survival analysis methods of Cox regression analysis with frailty in which 6 (3%) were used in the later period. Thirty-two (17%) used logistic regression while 8 (4%) used other methods. There were significantly more articles from the first period using appropriate methods compared to the second (n = 80, 88% vs. n = 69, 70%, p-value = 0.003). Conclusion Descriptive statistics and survival analysis techniques remain the most common methods of analysis in publications on predictors of all-cause mortality in HIV-infected cohorts while prospective research designs are favoured. Sophisticated techniques of time-dependent Cox regression and Cox regression with frailty are scarce. This motivates for more training in the use of advanced time-to-event methods. PMID:24498313
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
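For reference, the MMR model that the two-level approach generalizes is an ordinary regression with a product term; the sketch below fits it by least squares on simulated data. The NML two-level estimator proposed in the paper, and its R package, are not reproduced here.

```python
# Sketch of moderated multiple regression (MMR): the moderator enters through a product term.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
n = 400
df = pd.DataFrame({"x": rng.normal(size=n), "m": rng.normal(size=n)})
df["y"] = 0.4 * df.x + 0.2 * df.m + 0.3 * df.x * df.m + rng.normal(size=n)

mmr = smf.ols("y ~ x * m", data=df).fit()     # x * m expands to x + m + x:m
print(mmr.params["x:m"])                       # the moderation (interaction) effect
```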
Time series regression studies in environmental epidemiology.
Bhaskaran, Krishnan; Gasparrini, Antonio; Hajat, Shakoor; Smeeth, Liam; Armstrong, Ben
2013-08-01
Time series regression studies have been widely used in environmental epidemiology, notably in investigating the short-term associations between exposures such as air pollution, weather variables or pollen, and health outcomes such as mortality, myocardial infarction or disease-specific hospital admissions. Typically, for both exposure and outcome, data are available at regular time intervals (e.g. daily pollution levels and daily mortality counts) and the aim is to explore short-term associations between them. In this article, we describe the general features of time series data, and we outline the analysis process, beginning with descriptive analysis, then focusing on issues in time series regression that differ from other regression methods: modelling short-term fluctuations in the presence of seasonal and long-term patterns, dealing with time varying confounding factors and modelling delayed ('lagged') associations between exposure and outcome. We finish with advice on model checking and sensitivity analysis, and some common extensions to the basic model.
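A bare-bones version of the analysis outlined above regresses daily counts on a lagged exposure while controlling for season and long-term trend. The sketch uses simulated data, a single one-day lag, and simple harmonic terms for seasonality; real analyses typically use splines, distributed lags, and allowance for overdispersion, as the article discusses.

```python
# Hedged sketch: Poisson time series regression of daily counts on lagged exposure,
# with harmonic seasonality terms and a linear trend (simulated data).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
days = 3 * 365
t = np.arange(days)
sin365, cos365 = np.sin(2 * np.pi * t / 365), np.cos(2 * np.pi * t / 365)
pollution = 30 + 10 * sin365 + rng.normal(0, 5, days)                 # daily exposure series
deaths = rng.poisson(np.exp(3 + 0.002 * pollution + 0.1 * sin365))    # daily outcome counts

df = pd.DataFrame({"deaths": deaths, "t": t, "sin365": sin365, "cos365": cos365,
                   "poll_lag1": pd.Series(pollution).shift(1)}).dropna()

fit = smf.glm("deaths ~ poll_lag1 + sin365 + cos365 + t",
              data=df, family=sm.families.Poisson()).fit()
print(np.exp(fit.params["poll_lag1"]))   # rate ratio per unit increase in previous-day exposure
```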
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Kapes, Jerome T.; And Others
Three models of multiple regression analysis (MRA): single equation, commonality analysis, and path analysis, were applied to longitudinal data from the Pennsylvania Vocational Development Study. Variables influencing weekly income of vocational education students one year after high school graduation were examined: grade point averages (grades…
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
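A worked numerical sketch of the least squares method described above: the slope is the ratio of the predictor-outcome covariance to the predictor variance, and the intercept makes the line pass through the means. The numbers are illustrative, not from the article's clinical examples.

```python
# Worked sketch of the method of least squares for simple linear regression.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])          # predictor
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])          # outcome

slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()
residuals = y - (intercept + slope * x)

print(slope, intercept, np.sum(residuals ** 2))   # least squares minimizes this sum of squares
```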
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R(2) from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
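The binomial alternative described above can be sketched as follows: for each family, the observed count of medicinal species is tested against the expectation under the flora-wide selection rate. The family names and counts below are invented for illustration, not the Shuar data, and the sketch assumes scipy 1.7 or later for binomtest.

```python
# Hedged sketch of a per-family binomial test against the flora-wide medicinal-use rate.
from scipy.stats import binomtest

flora_total, medicinal_total = 3000, 600          # species in flora / species used medicinally
overall_rate = medicinal_total / flora_total

families = {"FamilyA": (150, 45), "FamilyB": (80, 30), "FamilyC": (120, 10)}  # (species, medicinal)
for name, (n_species, n_medicinal) in families.items():
    p = binomtest(n_medicinal, n_species, overall_rate).pvalue
    print(f"{name}: observed {n_medicinal}/{n_species}, "
          f"expected {overall_rate * n_species:.1f}, p = {p:.4f}")
```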
Zhang, Chao; Jia, Pengli; Yu, Liu; Xu, Chang
2018-05-01
Dose-response meta-analysis (DRMA) is widely applied to investigate the dose-specific relationship between independent and dependent variables. Such methods have been in use for over 30 years and are increasingly employed in healthcare and clinical decision-making. In this article, we give an overview of the methodology used in DRMA. We summarize the commonly used regression models and pooling methods in DRMA. We also use an example to illustrate how to employ a DRMA with these methods. Five regression models (linear regression, piecewise regression, natural polynomial regression, fractional polynomial regression, and restricted cubic spline regression) are illustrated in this article for fitting the dose-response relationship, and two types of pooling approaches, the one-stage approach and the two-stage approach, are illustrated for pooling the dose-response relationship across studies. The example showed similar results among these models. Several dose-response meta-analysis methods can be used for investigating the relationship between exposure level and the risk of an outcome. However, the methodology of DRMA still needs to be improved. © 2018 Chinese Cochrane Center, West China Hospital of Sichuan University and John Wiley & Sons Australia, Ltd.
NASA Astrophysics Data System (ADS)
Ferreira, Paulo; Kristoufek, Ladislav
2017-11-01
We analyse the covered interest parity (CIP) using two novel regression frameworks based on cross-correlation analysis (detrended cross-correlation analysis and detrending moving-average cross-correlation analysis), which allow for studying the relationships at different scales and work well under non-stationarity and heavy tails. CIP is a measure of capital mobility commonly used to analyse financial integration, which remains an interesting subject of study in the context of the European Union. The importance of this feature is related to the fact that the adoption of a common currency is associated with some benefits for countries, but also involves some risks such as the loss of economic instruments to face possible asymmetric shocks. While studying the Eurozone members could explain some problems in the common currency, studying the non-Euro countries is important to analyse whether they are fit to reap the possible benefits. Our results point to the CIP verification mainly in the Central European countries while in the remaining countries, the verification of the parity is only residual.
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
A single determinant dominates the rate of yeast protein evolution.
Drummond, D Allan; Raval, Alpan; Wilke, Claus O
2006-02-01
A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
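The remedy named above, principal component regression, is the generic technique of regressing the outcome on the leading principal components of the predictors rather than on the raw, intercorrelated predictors themselves. The sketch below shows this on simulated correlated predictors with scikit-learn; it is not the yeast analysis.

```python
# Sketch of principal component regression (PCR) on simulated, highly correlated predictors.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(10)
n, p = 300, 7
latent = rng.normal(size=(n, 1))
X = latent + 0.3 * rng.normal(size=(n, p))        # predictors share one dominant component
y = 2.0 * latent.ravel() + rng.normal(0, 1, n)

pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression()).fit(X, y)
print(pcr.named_steps["pca"].explained_variance_ratio_)   # variance captured by the components
print(pcr.score(X, y))                                     # R-squared of the component regression
```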
Regression Analysis of Mixed Panel Count Data with Dependent Terminal Events
Yu, Guanglei; Zhu, Liang; Li, Yang; Sun, Jianguo; Robison, Leslie L.
2017-01-01
Event history studies are commonly conducted in many fields and a great deal of literature has been established for the analysis of the two types of data commonly arising from these studies: recurrent event data and panel count data. The former arises if all study subjects are followed continuously, while the latter means that each study subject is observed only at discrete time points. In reality, a third type of data, a mixture of the two types of the data above, may occur and furthermore, as with the first two types of the data, there may exist a dependent terminal event, which may preclude the occurrences of recurrent events of interest. This paper discusses regression analysis of mixed recurrent event and panel count data in the presence of a terminal event and an estimating equation-based approach is proposed for estimation of regression parameters of interest. In addition, the asymptotic properties of the proposed estimator are established and a simulation study conducted to assess the finite-sample performance of the proposed method suggests that it works well in practical situations. Finally the methodology is applied to a childhood cancer study that motivated this study. PMID:28098397
A general framework for the use of logistic regression models in meta-analysis.
Simmonds, Mark C; Higgins, Julian Pt
2016-12-01
Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy. © The Author(s) 2014.
Biomass relations for components of five Minnesota shrubs.
Richard R. Buech; David J. Rugg
1995-01-01
Presents equations for estimating biomass of six components on five species of shrubs common to northeastern Minnesota. Regression analysis is used to compare the performance of three estimators of biomass.
ERIC Educational Resources Information Center
Gelman, Andrew; Imbens, Guido
2014-01-01
It is common in regression discontinuity analysis to control for high order (third, fourth, or higher) polynomials of the forcing variable. We argue that estimators for causal effects based on such methods can be misleading, and we recommend researchers do not use them, and instead use estimators based on local linear or quadratic polynomials or…
Testing Interaction Effects without Discarding Variance.
ERIC Educational Resources Information Center
Lopez, Kay A.
Analysis of variance (ANOVA) and multiple regression are two of the most commonly used methods of data analysis in behavioral science research. Although ANOVA was intended for use with experimental designs, educational researchers have used ANOVA extensively in aptitude-treatment interaction (ATI) research. This practice tends to make researchers…
A Regression Framework for Effect Size Assessments in Longitudinal Modeling of Group Differences
Feingold, Alan
2013-01-01
The use of growth modeling analysis (GMA)--particularly multilevel analysis and latent growth modeling--to test the significance of intervention effects has increased exponentially in prevention science, clinical psychology, and psychiatry over the past 15 years. Model-based effect sizes for differences in means between two independent groups in GMA can be expressed in the same metric (Cohen’s d) commonly used in classical analysis and meta-analysis. This article first reviews conceptual issues regarding calculation of d for findings from GMA and then introduces an integrative framework for effect size assessments that subsumes GMA. The new approach uses the structure of the linear regression model, from which effect sizes for findings from diverse cross-sectional and longitudinal analyses can be calculated with familiar statistics, such as the regression coefficient, the standard deviation of the dependent measure, and study duration. PMID:23956615
Modeling Outcomes with Floor or Ceiling Effects: An Introduction to the Tobit Model
ERIC Educational Resources Information Center
McBee, Matthew
2010-01-01
In gifted education research, it is common for outcome variables to exhibit strong floor or ceiling effects due to insufficient range of measurement of many instruments when used with gifted populations. Common statistical methods (e.g., analysis of variance, linear regression) produce biased estimates when such effects are present. In practice,…
Factor Retention in Exploratory Factor Analysis: A Comparison of Alternative Methods.
ERIC Educational Resources Information Center
Mumford, Karen R.; Ferron, John M.; Hines, Constance V.; Hogarty, Kristine Y.; Kromrey, Jeffery D.
This study compared the effectiveness of 10 methods of determining the number of factors to retain in exploratory common factor analysis. The 10 methods included the Kaiser rule and a modified Kaiser criterion, 3 variations of parallel analysis, 4 regression-based variations of the scree procedure, and the minimum average partial procedure. The…
NASA Astrophysics Data System (ADS)
Jiang, Weiping; Ma, Jun; Li, Zhao; Zhou, Xiaohui; Zhou, Boye
2018-05-01
Analysing the correlations between the noise in different components of GPS stations is valuable for obtaining more accurate uncertainties for station velocities. Previous research into noise in GPS position time series focused mainly on single component evaluation, which affects the acquisition of precise station positions, the velocity field, and its uncertainty. In this study, before and after removing the common-mode error (CME), we performed one-dimensional linear regression analysis of the noise amplitude vectors in different components of 126 GPS stations with a combination of white noise, flicker noise, and random walk noise in Southern California. The results show that, on the one hand, there are above-moderate degrees of correlation between the white noise amplitude vectors in all components of the stations before and after removal of the CME, while the correlations between flicker noise amplitude vectors in horizontal and vertical components are enhanced from un-correlated to moderately correlated by removing the CME. On the other hand, the significance tests show that all of the obtained linear regression equations, which represent a unique function of the noise amplitude in any two components, are of practical value after removing the CME. According to the noise amplitude estimates in two components and the linear regression equations, more accurate noise amplitudes can be acquired in the two components.
Regression analysis of mixed panel count data with dependent terminal events.
Yu, Guanglei; Zhu, Liang; Li, Yang; Sun, Jianguo; Robison, Leslie L
2017-05-10
Event history studies are commonly conducted in many fields, and a great deal of literature has been established for the analysis of the two types of data commonly arising from these studies: recurrent event data and panel count data. The former arises if all study subjects are followed continuously, while the latter means that each study subject is observed only at discrete time points. In reality, a third type of data, a mixture of the two types of the data earlier, may occur and furthermore, as with the first two types of the data, there may exist a dependent terminal event, which may preclude the occurrences of recurrent events of interest. This paper discusses regression analysis of mixed recurrent event and panel count data in the presence of a terminal event and an estimating equation-based approach is proposed for estimation of regression parameters of interest. In addition, the asymptotic properties of the proposed estimator are established, and a simulation study conducted to assess the finite-sample performance of the proposed method suggests that it works well in practical situations. Finally, the methodology is applied to a childhood cancer study that motivated this study. Copyright © 2017 John Wiley & Sons, Ltd.
SPSS and SAS programs for comparing Pearson correlations and OLS regression coefficients.
Weaver, Bruce; Wuensch, Karl L
2013-09-01
Several procedures that use summary data to test hypotheses about Pearson correlations and ordinary least squares regression coefficients have been described in various books and articles. To our knowledge, however, no single resource describes all of the most common tests. Furthermore, many of these tests have not yet been implemented in popular statistical software packages such as SPSS and SAS. In this article, we describe all of the most common tests and provide SPSS and SAS programs to perform them. When they are applicable, our code also computes 100 × (1 - α)% confidence intervals corresponding to the tests. For testing hypotheses about independent regression coefficients, we demonstrate one method that uses summary data and another that uses raw data (i.e., Potthoff analysis). When the raw data are available, the latter method is preferred, because use of summary data entails some loss of precision due to rounding.
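One of the most common summary-data tests the article's programs cover, comparing OLS coefficients from two independent samples, reduces to a simple z statistic; the sketch below uses placeholder numbers. As the abstract notes, a Potthoff-style analysis of the raw data is preferred when the raw data are available.

```python
# Minimal sketch: z test for a difference between two independent regression coefficients
# using only summary data (placeholder coefficients and standard errors).
import math
from scipy.stats import norm

b1, se1 = 0.42, 0.10     # coefficient and SE from sample 1
b2, se2 = 0.15, 0.08     # coefficient and SE from sample 2

z = (b1 - b2) / math.sqrt(se1**2 + se2**2)
p = 2 * (1 - norm.cdf(abs(z)))
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```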
Determining Predictor Importance in Hierarchical Linear Models Using Dominance Analysis
ERIC Educational Resources Information Center
Luo, Wen; Azen, Razia
2013-01-01
Dominance analysis (DA) is a method used to evaluate the relative importance of predictors that was originally proposed for linear regression models. This article proposes an extension of DA that allows researchers to determine the relative importance of predictors in hierarchical linear models (HLM). Commonly used measures of model adequacy in…
Digression and Value Concatenation to Enable Privacy-Preserving Regression.
Li, Xiao-Bai; Sarkar, Sumit
2014-09-01
Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals' sensitive data. This problem, which we call a "regression attack," has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression, which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis.
Hu, Yannan; van Lenthe, Frank J; Hoffmann, Rasmus; van Hedel, Karen; Mackenbach, Johan P
2017-04-20
The scientific evidence-base for policies to tackle health inequalities is limited. Natural policy experiments (NPE) have drawn increasing attention as a means to evaluating the effects of policies on health. Several analytical methods can be used to evaluate the outcomes of NPEs in terms of average population health, but it is unclear whether they can also be used to assess the outcomes of NPEs in terms of health inequalities. The aim of this study therefore was to assess whether, and to demonstrate how, a number of commonly used analytical methods for the evaluation of NPEs can be applied to quantify the effect of policies on health inequalities. We identified seven quantitative analytical methods for the evaluation of NPEs: regression adjustment, propensity score matching, difference-in-differences analysis, fixed effects analysis, instrumental variable analysis, regression discontinuity and interrupted time-series. We assessed whether these methods can be used to quantify the effect of policies on the magnitude of health inequalities either by conducting a stratified analysis or by including an interaction term, and illustrated both approaches in a fictitious numerical example. All seven methods can be used to quantify the equity impact of policies on absolute and relative inequalities in health by conducting an analysis stratified by socioeconomic position, and all but one (propensity score matching) can be used to quantify equity impacts by inclusion of an interaction term between socioeconomic position and policy exposure. Methods commonly used in economics and econometrics for the evaluation of NPEs can also be applied to assess the equity impact of policies, and our illustrations provide guidance on how to do this appropriately. The low external validity of results from instrumental variable analysis and regression discontinuity makes these methods less desirable for assessing policy effects on population-level health inequalities. Increased use of the methods in social epidemiology will help to build an evidence base to support policy making in the area of health inequalities.
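As a rough illustration of the interaction-term approach described above, the sketch below fits a difference-in-differences model with a policy × period × socioeconomic-position interaction using statsmodels; the data file and variable names (health, policy, post, low_sep) are hypothetical and the model is only a simplified stand-in for a full NPE evaluation.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical individual-level data: a health outcome, an indicator for the policy-exposed
# region, a pre/post indicator, and an indicator for low socioeconomic position.
df = pd.read_csv("npe_data.csv")  # columns: health, policy, post, low_sep

# Difference-in-differences with an equity interaction: the three-way term estimates
# whether the policy effect (policy:post) differs by socioeconomic position.
model = smf.ols("health ~ policy * post * low_sep", data=df).fit(cov_type="HC1")
print(model.summary())
print("Equity impact (extra policy effect in the low-SEP group):",
      model.params["policy:post:low_sep"])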
Logistic regression applied to natural hazards: rare event logistic regression with replications
NASA Astrophysics Data System (ADS)
Guns, M.; Vanacker, V.
2012-06-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulation into rare event logistic regression. This technique, which we call rare event logistic regression with replications, combines the strengths of probabilistic and statistical methods and overcomes some of the limitations of previous developments through robust variable selection. The technique was developed here for the analysis of landslide controlling factors, but the concept is widely applicable to statistical analyses of natural hazards.
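The abstract does not give implementation details, but one plausible reading of the replication idea is to repeatedly refit the logistic model on resampled data and retain only predictors that are selected consistently. The sketch below implements that interpretation with ordinary logistic regression; the resampling scheme and variable setup are assumptions, not the authors' code.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

def replicate_logistic(X, y, n_rep=200, alpha=0.05):
    """Refit a logistic model on bootstrap replicates and count how often
    each predictor is significant, as a crude stability measure."""
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_rep):
        idx = rng.integers(0, n, n)                    # bootstrap resample
        res = sm.Logit(y[idx], sm.add_constant(X[idx])).fit(disp=0)
        counts += (res.pvalues[1:] < alpha).astype(float)
    return counts / n_rep                              # selection frequency per predictor

# Hypothetical landslide data: X holds terrain covariates, y is the rare event indicator.
X = rng.normal(size=(500, 4)); y = (rng.random(500) < 0.05).astype(int)
print(replicate_logistic(X, y))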
Length bias correction in gene ontology enrichment analysis using logistic regression.
Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H
2012-01-01
When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
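A minimal sketch of the adjustment described above: logistic regression of GO-category membership on the differential-expression result, with (log) transcript length as a covariate. The input file and column names are hypothetical.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical gene-level table: one row per gene, with GO membership (0/1),
# a differential-expression significance score, and transcript length in bp.
genes = pd.read_csv("gene_table.csv")  # columns: in_go_category, de_score, tx_length
genes["log_length"] = np.log(genes["tx_length"])

# Unadjusted test: is DE significance associated with GO membership?
unadj = smf.logit("in_go_category ~ de_score", data=genes).fit(disp=0)
# Length-adjusted test: the de_score coefficient is now conditional on transcript length.
adj = smf.logit("in_go_category ~ de_score + log_length", data=genes).fit(disp=0)
print(unadj.params["de_score"], adj.params["de_score"])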
Prunier, J G; Colyn, M; Legendre, X; Nimon, K F; Flamand, M C
2015-01-01
Direct gradient analyses in spatial genetics provide unique opportunities to describe the inherent complexity of genetic variation in wildlife species and are the subject of many methodological developments. However, multicollinearity among explanatory variables is a systemic issue in multivariate regression analyses and is likely to cause serious difficulties in properly interpreting results of direct gradient analyses, with the risk of erroneous conclusions, misdirected research and inefficient or counterproductive conservation measures. Using simulated data sets along with linear and logistic regressions on distance matrices, we illustrate how commonality analysis (CA), a detailed variance-partitioning procedure that was recently introduced in the field of ecology, can be used to deal with nonindependence among spatial predictors. By decomposing model fit indices into unique and common (or shared) variance components, CA makes it possible to identify the location and magnitude of multicollinearity, reveal spurious correlations, and thus substantially improve the interpretation of multivariate regressions. Despite a few inherent limitations, especially in the case of resistance model optimization, this review highlights the great potential of CA to account for complex multicollinearity patterns in spatial genetics and identifies future applications and lines of research. We strongly urge spatial geneticists to systematically investigate commonalities when performing direct gradient analyses. © 2014 John Wiley & Sons Ltd.
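For readers unfamiliar with the variance partition, the sketch below shows the basic commonality analysis for a two-predictor linear regression: the unique and common components follow from the R-squared values of the full and reduced models. The data are simulated purely for illustration.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)   # collinear predictors
y = 0.5 * x1 + 0.3 * x2 + rng.normal(size=n)

def r2(*cols):
    X = sm.add_constant(np.column_stack(cols))
    return sm.OLS(y, X).fit().rsquared

r2_full, r2_x1, r2_x2 = r2(x1, x2), r2(x1), r2(x2)
unique_x1 = r2_full - r2_x2              # variance explained only by x1
unique_x2 = r2_full - r2_x1              # variance explained only by x2
common = r2_full - unique_x1 - unique_x2  # variance shared by x1 and x2
print(r2_full, unique_x1, unique_x2, common)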
Tuuli, Methodius G; Odibo, Anthony O
2011-08-01
The objective of this article is to discuss the rationale for common statistical tests used for the analysis and interpretation of prenatal diagnostic imaging studies. Examples from the literature are used to illustrate descriptive and inferential statistics. The uses and limitations of linear and logistic regression analyses are discussed in detail.
Flora, David B.; LaBrish, Cathy; Chalmers, R. Philip
2011-01-01
We provide a basic review of the data screening and assumption testing issues relevant to exploratory and confirmatory factor analysis along with practical advice for conducting analyses that are sensitive to these concerns. Historically, factor analysis was developed for explaining the relationships among many continuous test scores, which led to the expression of the common factor model as a multivariate linear regression model with observed, continuous variables serving as dependent variables, and unobserved factors as the independent, explanatory variables. Thus, we begin our paper with a review of the assumptions for the common factor model and data screening issues as they pertain to the factor analysis of continuous observed variables. In particular, we describe how principles from regression diagnostics also apply to factor analysis. Next, because modern applications of factor analysis frequently involve the analysis of the individual items from a single test or questionnaire, an important focus of this paper is the factor analysis of items. Although the traditional linear factor model is well-suited to the analysis of continuously distributed variables, commonly used item types, including Likert-type items, almost always produce dichotomous or ordered categorical variables. We describe how relationships among such items are often not well described by product-moment correlations, which has clear ramifications for the traditional linear factor analysis. An alternative, non-linear factor analysis using polychoric correlations has become more readily available to applied researchers and thus more popular. Consequently, we also review the assumptions and data-screening issues involved in this method. Throughout the paper, we demonstrate these procedures using an historic data set of nine cognitive ability variables. PMID:22403561
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
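As a small companion to the interpretation theme above, the sketch below fits a logistic regression and converts its coefficients to odds ratios with confidence intervals; the respiratory-style file and variable names are hypothetical.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: wheeze (0/1), smoking status (0/1), and age in years.
df = pd.read_csv("resp_study.csv")  # columns: wheeze, smoker, age
fit = smf.logit("wheeze ~ smoker + age", data=df).fit(disp=0)

# exp(coefficient) is the odds ratio; exponentiating the CI bounds gives its CI.
odds_ratios = pd.DataFrame({"OR": np.exp(fit.params),
                            "CI_lower": np.exp(fit.conf_int()[0]),
                            "CI_upper": np.exp(fit.conf_int()[1])})
print(odds_ratios)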
Lin, Meihua; Li, Haoli; Zhao, Xiaolei; Qin, Jiheng
2013-01-01
Genome-wide analysis of gene-gene interactions has been recognized as a powerful avenue to identify the missing genetic components that cannot be detected by current single-point association analysis. Recently, several model-free methods (e.g., the commonly used information-based metrics and several logistic regression-based metrics) were developed for detecting non-linear dependence between genetic loci, but they potentially risk inflated false-positive error, in particular when the main effects at one or both loci are salient. In this study, we proposed two conditional entropy-based metrics to address this limitation. Extensive simulations demonstrated that the two proposed metrics, provided the disease is rare, maintain a consistently correct false-positive rate. In the scenarios for a common disease, our proposed metrics achieved better or comparable control of false-positive error, compared with four previously proposed model-free metrics. In terms of power, our methods outperformed several competing metrics in a range of common disease models. Furthermore, in real data analyses, both metrics succeeded in detecting interactions and were competitive with the originally reported results or the logistic regression approaches. In conclusion, the proposed conditional entropy-based metrics are promising alternatives to current model-based approaches for detecting genuine epistatic effects. PMID:24339984
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. © 2013, The International Biometric Society.
Ondeck, Nathaniel T; Fu, Michael C; Skrip, Laura A; McLynn, Ryan P; Su, Edwin P; Grauer, Jonathan N
2018-03-01
Despite the advantages of large, national datasets, one continuing concern is missing data values. Complete case analysis, where only cases with complete data are analyzed, is commonly used rather than more statistically rigorous approaches such as multiple imputation. This study characterizes the potential selection bias introduced using complete case analysis and compares the results of common regressions using both techniques following unicompartmental knee arthroplasty. Patients undergoing unicompartmental knee arthroplasty were extracted from the 2005 to 2015 National Surgical Quality Improvement Program. As examples, the demographics of patients with and without missing preoperative albumin and hematocrit values were compared. Missing data were then treated with both complete case analysis and multiple imputation (an approach that reproduces the variation and associations that would have been present in a full dataset) and the conclusions of common regressions for adverse outcomes were compared. A total of 6117 patients were included, of which 56.7% were missing at least one value. Younger, female, and healthier patients were more likely to have missing preoperative albumin and hematocrit values. The use of complete case analysis removed 3467 patients from the study in comparison with multiple imputation which included all 6117 patients. The 2 methods of handling missing values led to differing associations of low preoperative laboratory values with commonly studied adverse outcomes. The use of complete case analysis can introduce selection bias and may lead to different conclusions in comparison with the statistically rigorous multiple imputation approach. Joint surgeons should consider the methods of handling missing values when interpreting arthroplasty research. Copyright © 2017 Elsevier Inc. All rights reserved.
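The sketch below contrasts complete case analysis with a simple multiple-imputation-style analysis (repeated stochastic imputation and averaging of estimates) using scikit-learn's IterativeImputer; the NSQIP-style file and column names are hypothetical, and the pooling shown is a simplified stand-in for Rubin's rules.

import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

df = pd.read_csv("uka_cases.csv")  # hypothetical columns: adverse_event, albumin, hematocrit, age
covs = ["albumin", "hematocrit", "age"]

# Complete case analysis: drop any row with a missing laboratory value.
cc = df.dropna()
cc_fit = sm.Logit(cc["adverse_event"], sm.add_constant(cc[covs])).fit(disp=0)

# Multiple imputation: impute several times with posterior sampling and average the estimates.
betas = []
for seed in range(10):
    imp = IterativeImputer(sample_posterior=True, random_state=seed)
    filled = df.copy()
    filled[covs] = imp.fit_transform(df[covs])
    fit = sm.Logit(filled["adverse_event"], sm.add_constant(filled[covs])).fit(disp=0)
    betas.append(fit.params)
print(cc_fit.params, pd.concat(betas, axis=1).mean(axis=1), sep="\n")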
Tukiendorf, Andrzej; Mansournia, Mohammad Ali; Wydmański, Jerzy; Wolny-Rokicka, Edyta
2017-04-01
Background: Clinical datasets for epithelial ovarian cancer brain metastatic patients are usually small in size. When adequate case numbers are lacking, resulting estimates of regression coefficients may demonstrate bias. One of the direct approaches to reduce such sparse-data bias is based on penalized estimation. Methods: A reanalysis of formerly reported hazard ratios in diagnosed patients was performed using penalized Cox regression with a popular SAS package providing additional software codes for a statistical computational procedure. Results: It was found that the penalized approach can readily diminish sparse data artefacts and radically reduce the magnitude of estimated regression coefficients. Conclusions: It was confirmed that classical statistical approaches may exaggerate regression estimates or distort study interpretations and conclusions. The results support the thesis that penalization via weak informative priors and data augmentation are the safest approaches to shrink sparse data artefacts frequently occurring in epidemiological research. Creative Commons Attribution License
Tzeng, Jung-Ying; Zhang, Daowen; Pongpanich, Monnat; Smith, Chris; McCarthy, Mark I.; Sale, Michèle M.; Worrall, Bradford B.; Hsu, Fang-Chi; Thomas, Duncan C.; Sullivan, Patrick F.
2011-01-01
Genomic association analyses of complex traits demand statistical tools that are capable of detecting small effects of common and rare variants and modeling complex interaction effects and yet are computationally feasible. In this work, we introduce a similarity-based regression method for assessing the main genetic and interaction effects of a group of markers on quantitative traits. The method uses genetic similarity to aggregate information from multiple polymorphic sites and integrates adaptive weights that depend on allele frequencies to accommodate common and uncommon variants. Collapsing information at the similarity level instead of the genotype level avoids canceling signals that have the opposite etiological effects and is applicable to any class of genetic variants without the need for dichotomizing the allele types. To assess gene-trait associations, we regress trait similarities for pairs of unrelated individuals on their genetic similarities and assess association by using a score test whose limiting distribution is derived in this work. The proposed regression framework allows for covariates, has the capacity to model both main and interaction effects, can be applied to a mixture of different polymorphism types, and is computationally efficient. These features make it an ideal tool for evaluating associations between phenotype and marker sets defined by linkage disequilibrium (LD) blocks, genes, or pathways in whole-genome analysis. PMID:21835306
Arano, Ichiro; Sugimoto, Tomoyuki; Hamasaki, Toshimitsu; Ohno, Yuko
2010-04-23
Survival analysis methods such as the Kaplan-Meier method, log-rank test, and Cox proportional hazards regression (Cox regression) are commonly used to analyze data from randomized withdrawal studies in patients with major depressive disorder. Unfortunately, such common methods may be inappropriate when a long-term censored relapse-free time appears in the data, as the methods assume that if complete follow-up were possible for all individuals, each would eventually experience the event of interest. In this paper, to analyse data including such a long-term censored relapse-free time, we discuss a semi-parametric cure regression (Cox cure regression), which combines a logistic formulation for the probability of occurrence of an event with a Cox proportional hazards specification for the time of occurrence of the event. In specifying the treatment's effect on disease-free survival, we consider the fraction of long-term survivors and the risks associated with a relapse of the disease. In addition, we develop a tree-based method for time-to-event data to identify groups of patients with differing prognoses (cure survival CART). Although analysis methods typically adapt the log-rank statistic for recursive partitioning procedures, the method applied here uses a likelihood ratio (LR) test statistic from fitting cure survival regressions assuming exponential and Weibull distributions for the latency time to relapse. The method is illustrated using data from a sertraline randomized withdrawal study in patients with major depressive disorder. We conclude that Cox cure regression reveals who may be cured and how the treatment and other factors affect both the cure incidence and the relapse time of uncured patients, and that the cure survival CART output provides easily understandable and interpretable information, useful both for identifying groups of patients with differing prognoses and for building Cox cure regression models that lead to meaningful interpretations.
Improving power and robustness for detecting genetic association with extreme-value sampling design.
Chen, Hua Yun; Li, Mingyao
2011-12-01
Extreme-value sampling design that samples subjects with extremely large or small quantitative trait values is commonly used in genetic association studies. Samples in such designs are often treated as "cases" and "controls" and analyzed using logistic regression. Such a case-control analysis ignores the potential dose-response relationship between the quantitative trait and the underlying trait locus and thus may lead to loss of power in detecting genetic association. An alternative approach to analyzing such data is to model the dose-response relationship by a linear regression model. However, parameter estimation from this model can be biased, which may lead to inflated type I errors. We propose a robust and efficient approach that takes into consideration of both the biased sampling design and the potential dose-response relationship. Extensive simulations demonstrate that the proposed method is more powerful than the traditional logistic regression analysis and is more robust than the linear regression analysis. We applied our method to the analysis of a candidate gene association study on high-density lipoprotein cholesterol (HDL-C) which includes study subjects with extremely high or low HDL-C levels. Using our method, we identified several SNPs showing a stronger evidence of association with HDL-C than the traditional case-control logistic regression analysis. Our results suggest that it is important to appropriately model the quantitative traits and to adjust for the biased sampling when dose-response relationship exists in extreme-value sampling designs. © 2011 Wiley Periodicals, Inc.
Statistical correlations of crime with arrests
NASA Astrophysics Data System (ADS)
Kuelling, Albert C.
1997-01-01
Regression analysis shows that the overall crime rate correlates with the overall arrest rate. Violent crime only weakly correlates with the violent arrest rate, but strongly correlates with the property arrest rate. Contrary to common impressions, increasing arrest rates do not significantly increase loading on incarceration facilities.
Local linear regression for function learning: an analysis based on sample discrepancy.
Cervellera, Cristiano; Macciò, Danilo
2014-11-01
Local linear regression models, a kind of nonparametric structure that locally performs a linear estimation of the target function, are analyzed in the context of empirical risk minimization (ERM) for function learning. The analysis is carried out with emphasis on geometric properties of the available data. In particular, the discrepancy of the observation points used both to build the local regression models and compute the empirical risk is considered. This makes it possible to treat equally the case in which the samples come from a random external source and the one in which the input space can be freely explored. Both the consistency of the ERM procedure and the approximating capabilities of the estimator are analyzed, and conditions ensuring convergence are proved. Since the theoretical analysis shows that the estimation improves as the discrepancy of the observation points becomes smaller, low-discrepancy sequences, a family of sampling methods commonly employed for efficient numerical integration, are also analyzed. Simulation results involving two different examples of function learning are provided.
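A minimal numpy sketch of the kind of estimator analyzed above: at each query point, a weighted least squares line is fitted with kernel weights centered on that point. The kernel choice and bandwidth are illustrative assumptions.

import numpy as np

def local_linear(x_train, y_train, x_query, bandwidth=0.2):
    """Local linear regression with a Gaussian kernel."""
    preds = []
    for x0 in np.atleast_1d(x_query):
        w = np.exp(-0.5 * ((x_train - x0) / bandwidth) ** 2)   # kernel weights
        X = np.column_stack([np.ones_like(x_train), x_train - x0])
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y_train)  # weighted least squares
        preds.append(beta[0])                                   # local intercept = estimate at x0
    return np.array(preds)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=200)
print(local_linear(x, y, np.array([0.25, 0.5, 0.75])))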
Hip fractures are risky business: an analysis of the NSQIP data.
Sathiyakumar, Vasanth; Greenberg, Sarah E; Molina, Cesar S; Thakore, Rachel V; Obremskey, William T; Sethi, Manish K
2015-04-01
Hip fractures are one of the most common types of orthopaedic injury, with high rates of morbidity. Currently, no study has compared risk factors and adverse events following the different types of hip fracture surgeries. The purpose of this paper is to investigate the major and minor adverse events and risk factors for complication development associated with five common surgeries for the treatment of hip fractures using the NSQIP database. Using the ACS-NSQIP database, complications for five forms of hip surgeries were selected and categorized into major and minor adverse events. Demographics and clinical variables were collected, and unadjusted bivariate logistic regression analyses were performed to determine significant risk factors for adverse events. Separate multivariate regressions were run for each surgery, as well as a combined regression analysis. A total of 9640 patients undergoing surgery for hip fracture were identified, with an adverse event rate of 25.2% (n=2433). Open reduction and internal fixation of a femoral neck fracture had the greatest percentage of all major events (16.6%) and total adverse events (27.4%), whereas partial hip hemiarthroplasty had the greatest percentage of all minor events (11.6%). Mortality was the most common major adverse event (44.9-50.6%). For minor complications, urinary tract infections were the most common minor adverse event (52.7-62.6%). Significant risk factors for development of any adverse event included age, BMI, gender, race, active smoking status, history of COPD, history of CHF, ASA score, dyspnoea, and functional status, with various combinations of these factors significantly affecting complication development for the individual surgeries. Hip fractures are associated with significantly high numbers of adverse events. The type of surgery affects the type of complications developed and also has an effect on what risk factors significantly predict the development of a complication. Concerted efforts from orthopaedists should be made to identify higher risk patients and prevent the most common adverse events that occur postoperatively. Copyright © 2014 Elsevier Ltd. All rights reserved.
Gene set analysis using variance component tests.
Huang, Yen-Tsung; Lin, Xihong
2013-06-28
Gene set analyses have become increasingly important in genomic research, as many complex diseases arise from the joint contribution of alterations in numerous genes. Genes often coordinate as a functional repertoire, e.g., a biological pathway or network, and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to exploit this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects obtained by assuming a common distribution for the regression coefficients in the multivariate linear regression model, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that the type I error is protected under different choices of working covariance matrices and that power improves as the working covariance approaches the true covariance. The global test is a special case of TEGS in which the correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analysis method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and the global test, in both simulation and the diabetes microarray data.
A refined method for multivariate meta-analysis and meta-regression.
Jackson, Daniel; Riley, Richard D
2014-02-20
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.
Non-stationary hydrologic frequency analysis using B-spline quantile regression
NASA Astrophysics Data System (ADS)
Nasri, B.; Bouezmarni, T.; St-Hilaire, A.; Ouarda, T. B. M. J.
2017-11-01
Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide the basic information for planning, design and management of hydraulic and water resources systems under the assumption of stationarity. However, with increasing evidence of climate change, it is possible that the assumption of stationarity, which is a prerequisite for traditional frequency analysis, no longer holds, and hence the results of conventional analysis become questionable. In this study, we consider a framework for frequency analysis of extremes based on B-spline quantile regression, which makes it possible to model data in the presence of non-stationarity and/or dependence on covariates, with both linear and non-linear forms of dependence. A Markov Chain Monte Carlo (MCMC) algorithm was used to estimate quantiles and their posterior distributions. A coefficient of determination and the Bayesian information criterion (BIC) for quantile regression are used to select the best model, i.e., for each quantile we choose the degree and number of knots of the adequate B-spline quantile regression model. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are considered to describe the non-stationarity in the variable of interest and to estimate the quantiles in this case. The results show large differences between the non-stationary quantiles and their stationary equivalents for annual maximum and minimum discharges with high annual non-exceedance probabilities.
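The sketch below illustrates the core modeling idea rather than the Bayesian MCMC estimation used in the paper: a quantile regression of annual maxima on a B-spline expansion of a climate index, fitted with statsmodels. The file name, variable names, and spline settings are assumptions.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: annual maximum discharge and a climate index as covariate.
df = pd.read_csv("annual_maxima.csv")  # columns: qmax, climate_index

# B-spline quantile regression for the 0.95 quantile (a frequentist analogue of the
# Bayesian B-spline quantile regression described above).
fit = smf.quantreg("qmax ~ bs(climate_index, df=5, degree=3)", data=df).fit(q=0.95)
print(fit.summary())

# Non-stationary 0.95 quantile estimates as a function of the climate index.
print(fit.predict(df).head())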
Attrition in Psychotherapy: A Survival Analysis
ERIC Educational Resources Information Center
Roseborough, David John; McLeod, Jeffrey T.; Wright, Florence I.
2016-01-01
Purpose: Attrition is a common problem in psychotherapy and can be defined as clients ending treatment before achieving an optimal response. Method: This longitudinal, archival study utilized data for 3,728 clients, using the Outcome Questionnaire 45.2. A Cox regression proportional hazards (hazard ratios) model was used in order to better…
High Loading of Polygenic Risk for ADHD in Children With Comorbid Aggression
Hamshere, Marian L.; Langley, Kate; Martin, Joanna; Agha, Sharifah Shameem; Stergiakouli, Evangelia; Anney, Richard J.L.; Buitelaar, Jan; Faraone, Stephen V.; Lesch, Klaus-Peter; Neale, Benjamin M.; Franke, Barbara; Sonuga-Barke, Edmund; Asherson, Philip; Merwood, Andrew; Kuntsi, Jonna; Medland, Sarah E.; Ripke, Stephan; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Warnke, Andreas; Meyer, Jobst; Palmason, Haukur; Vasquez, Alejandro Arias; Lambregts-Rommelse, Nanda; Roeyers, Herbert; Biederman, Joseph; Doyle, Alysa E.; Hakonarson, Hakon; Rothenberger, Aribert; Banaschewski, Tobias; Oades, Robert D.; McGough, James J.; Kent, Lindsey; Williams, Nigel; Owen, Michael J.; Holmans, Peter
2013-01-01
Objective Although attention deficit hyperactivity disorder (ADHD) is highly heritable, genome-wide association studies (GWAS) have not yet identified any common genetic variants that contribute to risk. There is evidence that aggression or conduct disorder in children with ADHD indexes higher genetic loading and clinical severity. The authors examine whether common genetic variants considered en masse as polygenic scores for ADHD are especially enriched in children with comorbid conduct disorder. Method Polygenic scores derived from an ADHD GWAS meta-analysis were calculated in an independent ADHD sample (452 case subjects, 5,081 comparison subjects). Multivariate logistic regression analyses were employed to compare polygenic scores in the ADHD and comparison groups and test for higher scores in ADHD case subjects with comorbid conduct disorder relative to comparison subjects and relative to those without comorbid conduct disorder. Association with symptom scores was tested using linear regression. Results Polygenic risk for ADHD, derived from the meta-analysis, was higher in the independent ADHD group than in the comparison group. Polygenic score was significantly higher in ADHD case subjects with conduct disorder relative to ADHD case subjects without conduct disorder. ADHD polygenic score showed significant association with comorbid conduct disorder symptoms. This relationship was explained by the aggression items. Conclusions Common genetic variation is relevant to ADHD, especially in individuals with comorbid aggression. The findings suggest that the previously published ADHD GWAS meta-analysis contains weak but true associations with common variants, support for which falls below genome-wide significance levels. The findings also highlight the fact that aggression in ADHD indexes genetic as well as clinical severity. PMID:23599091
Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.
Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A
2016-01-01
Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
Skolasky, Richard L; Maggard, Anica M; Li, David; Riley, Lee H; Wegener, Stephen T
2015-07-01
To determine the effect of health behavior change counseling (HBCC) on patient activation and the influence of patient activation on rehabilitation engagement, and to identify common barriers to engagement among individuals undergoing surgery for degenerative lumbar spinal stenosis. Prospective clinical trial. Academic medical center. Consecutive lumbar spine surgery patients (N=122) defined in our companion article (Part I) were assigned to a control group (did not receive HBCC, n=59) or HBCC group (received HBCC, n=63). Brief motivational interviewing-based HBCC versus control (significance, P<.05). We assessed patient activation before and after intervention. Rehabilitation engagement was assessed using the physical therapist-reported Hopkins Rehabilitation Engagement Rating Scale and by a ratio of self-reported physical therapy and home exercise completion. Common barriers to rehabilitation engagement were identified through thematic analysis. Patient activation predicted engagement (standardized regression weight, .682; P<.001). Postintervention patient activation was predicted by baseline patient activation (standardized regression weight, .808; P<.001) and receipt of HBCC (standardized regression weight, .444; P<.001). The effect of HBCC on rehabilitation engagement was mediated by patient activation (standardized regression weight, .079; P=.395). One-third of the HBCC group did not show improvement compared with the control group. Thematic analysis identified 3 common barriers to engagement: (1) low self-efficacy because of lack of knowledge and support (62%); (2) anxiety related to fear of movement (57%); and (3) concern about pain management (48%). The influence of HBCC on rehabilitation engagement was mediated by patient activation. Despite improvements in patient activation, one-third of patients reported low rehabilitation engagement. Addressing these barriers should lead to greater improvements in rehabilitation engagement. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
The Relationship between Language Literacy and ELL Student Academic Performance in Mathematics
ERIC Educational Resources Information Center
Lawon, Molly A.
2017-01-01
This quantitative study used regression analysis to investigate the correlation of limited language proficiency and the performance of English Language Learner (ELL) students on two commonly used math assessments, namely the Smarter Balanced Assessment Consortium (SBAC) and the Measures of Academic Progress (MAP). Scores were analyzed for eighth…
Changing the Perspective on Early Development of Rett Syndrome
ERIC Educational Resources Information Center
Marschik, Peter B.; Kaufmann, Walter E.; Sigafoos, Jeff; Wolin, Thomas; Zhang, Dajie; Bartl-Pokorny, Katrin D.; Pini, Giorgio; Zappella, Michele; Tager-Flusberg, Helen; Einspieler, Christa; Johnston, Michael V.
2013-01-01
We delineated the achievement of early speech-language milestones in 15 young children with Rett syndrome ("MECP2" positive) in the first two years of life using retrospective video analysis. By contrast to the commonly accepted concept that these children are normal in the pre-regression period, we found markedly atypical development of…
Gradient descent for robust kernel-based regression
NASA Astrophysics Data System (ADS)
Guo, Zheng-Chu; Hu, Ting; Shi, Lei
2018-06-01
In this paper, we study the gradient descent algorithm generated by a robust loss function over a reproducing kernel Hilbert space (RKHS). The loss function is defined by a windowing function G and a scale parameter σ, and can include a wide range of commonly used robust losses for regression. There is still a gap between the theoretical analysis and the optimization process of empirical risk minimization based on this loss: the estimator needs to be globally optimal in the theoretical analysis, while the optimization method cannot ensure the global optimality of its solutions. In this paper, we aim to fill this gap by developing a novel theoretical analysis of the performance of estimators generated by the gradient descent algorithm. We demonstrate that, with an appropriately chosen scale parameter σ, the gradient update with early stopping rules can approximate the regression function. Our error analysis leads to convergence in the standard L2 norm and the strong RKHS norm, both of which are optimal in the minimax sense. We show that the scale parameter σ plays an important role in providing robustness as well as fast convergence. Numerical experiments on synthetic examples and a real data set also support our theoretical results.
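A small numpy sketch of the kind of algorithm analyzed above: gradient descent in an RKHS under a Gaussian-windowed (Welsch-type) robust loss, with early stopping acting as the regularizer. The specific windowing function, kernel, step size, and stopping rule are illustrative assumptions, not the paper's exact setup.

import numpy as np

def gaussian_kernel(A, B, gamma=10.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def robust_kernel_gd(X, y, sigma=1.0, step=0.5, n_iter=200):
    """Gradient descent on the Welsch-type loss l(r) = sigma^2/2 * (1 - exp(-r^2/sigma^2))
    over f(x) = sum_j alpha_j K(x, x_j); early stopping (n_iter) acts as regularization."""
    K = gaussian_kernel(X, X)
    alpha = np.zeros(len(y))
    for _ in range(n_iter):
        r = K @ alpha - y                                        # residuals
        grad = K.T @ (r * np.exp(-r ** 2 / sigma ** 2)) / len(y)  # gradient of empirical risk
        alpha -= step * grad
    return alpha, K

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (100, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.standard_t(df=2, size=100) * 0.1  # heavy-tailed noise
alpha, K = robust_kernel_gd(X, y)
print(np.mean((K @ alpha - y) ** 2))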
Regression with Small Data Sets: A Case Study using Code Surrogates in Additive Manufacturing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamath, C.; Fan, Y. J.
There has been an increasing interest in recent years in the mining of massive data sets whose sizes are measured in terabytes. While it is easy to collect such large data sets in some application domains, there are others where collecting even a single data point can be very expensive, so the resulting data sets have only tens or hundreds of samples. For example, when complex computer simulations are used to understand a scientific phenomenon, we want to run the simulation for many different values of the input parameters and analyze the resulting output. The data set relating the simulation inputs and outputs is typically quite small, especially when each run of the simulation is expensive. However, regression techniques can still be used on such data sets to build an inexpensive "surrogate" that could provide an approximate output for a given set of inputs. A good surrogate can be very useful in sensitivity analysis, uncertainty analysis, and in designing experiments. In this paper, we compare different regression techniques to determine how well they predict melt-pool characteristics in the problem domain of additive manufacturing. Our analysis indicates that some of the commonly used regression methods do perform quite well even on small data sets.
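A hedged sketch of the kind of comparison described above: several regression methods evaluated by leave-one-out cross-validation on a small simulated input-output data set standing in for expensive simulation runs; the melt-pool data themselves are not reproduced here and the models listed are generic choices, not the paper's exact set.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(size=(40, 3))                                      # 40 "simulation runs", 3 inputs
y = 2 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.05, size=40)   # stand-in simulation output

models = {"linear": LinearRegression(),
          "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
          "gaussian process": GaussianProcessRegressor(normalize_y=True)}

# Leave-one-out cross-validation is a natural choice with so few samples.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=LeaveOneOut(),
                             scoring="neg_mean_absolute_error")
    print(name, -scores.mean())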
Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data.
Mi, Gu; Di, Yanming; Schafer, Daniel W
2015-01-01
This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely-used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.
Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C
2014-12-01
It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and the Mantel χ² procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.
Breaking the solid ground of common sense: undoing "structure" with Michael Balint.
Bonomi, Carlo
2003-09-01
Balint's great merit was to question what, in the classical perspective, was assumed as a prerequisite for analysis and thus located beyond analysis: the maturity of the ego. A fundamental premise of his work was Ferenczi's distrust for the structural model, which praised the maturity of the ego and its verbal, social, and adaptive abilities. Ferenczi's view of ego maturation as a trauma derivative was strikingly different from the theories of all other psychoanalytic schools and seems to be responsible for Balint's understanding of regression as a sort of inverted process that enables the undoing of the sheltering structures of the mature mind. Balint's understanding of the relation between mature ego and regression diverged not only from the ego psychologists, who emphasized the idea of therapeutic alliance, but also from most of the authors who embraced the object-relational view, like Klein (who considered regression a manifestation of the patient's craving for oral gratification), Fairbairn (who gave up the notion of regression), and Guntrip (who viewed regression as a schizoid phenomenon related to the ego weakness). According to Balint, the clinical appearance of a regression would "depend also on the way the regression is recognized, is accepted, and is responded to by the analyst." In this respect, his position was close to Winnicott's reformulation of the therapeutic action. Yet, the work of Balint reflects the persuasion that the progressive fluidification of the solid structure could be enabled only by the analyst's capacity for becoming himself or herself [unsolid].
a Comparison Between Two Ols-Based Approaches to Estimating Urban Multifractal Parameters
NASA Astrophysics Data System (ADS)
Huang, Lin-Shan; Chen, Yan-Guang
Multifractal theory provides a new spatial analytical tool for urban studies, but many basic problems remain to be solved. Among the various pending issues, the most significant is how to obtain proper multifractal dimension spectra. If an algorithm is improperly used, the parameter spectra will be abnormal. This paper is devoted to investigating two ordinary least squares (OLS)-based approaches for estimating urban multifractal parameters. Using an empirical study and comparative analysis, we demonstrate how to utilize the appropriate linear regression to calculate multifractal parameters. The OLS regression analysis has two different approaches: in one, the intercept is fixed at zero, and in the other, the intercept is unconstrained. The results of the comparative study show that the zero-intercept regression yields proper multifractal parameter spectra within a certain scale range of moment order, while the common regression method often leads to abnormal multifractal parameter values. We conclude that fixing the intercept at zero is the more advisable regression method for multifractal parameter estimation, and that the shapes of the spectral curves and the value ranges of the fractal parameters can be employed to diagnose urban problems. This research is helpful for scientists to understand multifractal models and apply a more reasonable technique to multifractal parameter calculations.
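A small numpy sketch of the two regression variants compared above, applied to a generic log-log scaling relation: one fit with a free intercept and one with the intercept fixed at zero. The simulated data are only a stand-in for the multifractal measures.

import numpy as np

rng = np.random.default_rng(2)
log_eps = np.linspace(-5, -1, 9)                         # log of the measurement scale
log_M = 1.7 * log_eps + rng.normal(scale=0.05, size=9)   # stand-in log partition-function values

# Free-intercept OLS: regress log_M on [1, log_eps].
A_free = np.column_stack([np.ones_like(log_eps), log_eps])
b_free, slope_free = np.linalg.lstsq(A_free, log_M, rcond=None)[0]

# Zero-intercept OLS: regress log_M on log_eps alone.
slope_zero = np.linalg.lstsq(log_eps[:, None], log_M, rcond=None)[0][0]

print("free intercept:", b_free, slope_free)
print("zero intercept:", slope_zero)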
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws.
Xiao, Xiao; White, Ethan P; Hooten, Mevin B; Durham, Susan L
2011-10-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.
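A compact sketch of the two fitting approaches compared above for the power law y = a*x^b: linear regression on log-transformed data versus nonlinear least squares on the original scale. The simulated error here is multiplicative and lognormal, the case in which the abstract reports that log-transform LR performs better.

import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress

rng = np.random.default_rng(3)
x = rng.uniform(1, 100, 200)
y = 2.0 * x ** 0.75 * rng.lognormal(sigma=0.3, size=200)   # multiplicative, lognormal error

# Linear regression on log-transformed data: log y = log a + b log x.
lr = linregress(np.log(x), np.log(y))
a_lr, b_lr = np.exp(lr.intercept), lr.slope

# Nonlinear regression on the original scale.
(a_nlr, b_nlr), _ = curve_fit(lambda x, a, b: a * x ** b, x, y, p0=(1.0, 1.0))

print("log-transform LR:", a_lr, b_lr)
print("nonlinear LS:   ", a_nlr, b_nlr)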
Breeding habitat preference of preimaginal black flies (Diptera: Simuliidae) in Peninsular Malaysia.
Ya'cob, Zubaidah; Takaoka, Hiroyuki; Pramual, Pairot; Low, Van Lun; Sofian-Azirun, Mohd
2016-01-01
To investigate the breeding habitat preference of black flies, a comprehensive black fly survey was conducted for the first time in Peninsular Malaysia. Preimaginal black flies (pupae and larvae) were collected manually from 180 stream points encompassing the northern, southern, central and east coast regions of Peninsular Malaysia. A total of 47 black fly species were recorded in this study. The predominant species were Simulium trangense (36.7%) and Simulium angulistylum (33.3%). Relatively common species were Simulium cheongi (29.4%), Simulium tani (25.6%), Simulium nobile (16.2%), Simulium sheilae (14.5%) and Simulium bishopi (10.6%). Principal Component Analysis (PCA) of all stream variables revealed four PCs that accounted for 69.3% of the total intersite variance. Regression analysis revealed that high species richness is associated with larger, deeper, faster and higher-discharge streams with larger streambed particles, more riparian vegetation and low pH (F=22.7, d.f.=1, 173; P<0.001). The relationship between stream variables and the occurrence of the seven common species (present in >10% of the sampling sites) was also assessed. Forward logistic regression analysis indicated that four species were significantly related to the stream variables. S. nobile and S. tani prefer large, fast-flowing streams with higher pH, large streambed particles and riparian trees. S. bishopi was commonly found at high elevation with cooler stream, low conductivity, higher conductivity and more riparian trees. In contrast, S. sheilae was negatively correlated with PC-2; this species was commonly found at low elevations, in warmer streams with low conductivity and fewer riparian trees. The results of this study are consistent with previous studies from other geographic regions, which indicated that both physical and chemical stream conditions are key factors for black fly ecology. Copyright © 2015 Elsevier B.V. All rights reserved.
Quantitative analysis of the mixtures of illicit drugs using terahertz time-domain spectroscopy
NASA Astrophysics Data System (ADS)
Jiang, Dejun; Zhao, Shusen; Shen, Jingling
2008-03-01
A method was proposed to quantitatively inspect mixtures of illicit drugs with the terahertz time-domain spectroscopy technique. The mass percentages of all components in a mixture can be obtained by linear regression analysis, on the assumption that all components in the mixture and their absorption features are known. Because illicit drugs are scarce and expensive, we first used common chemicals (benzophenone, anthraquinone, pyridoxine hydrochloride and L-ascorbic acid) in the experiment. Then an illicit drug and a common adulterant, methamphetamine and flour, were selected for the experiment. Experimental results agreed well with the actual contents, suggesting that this could be an effective method for the quantitative identification of illicit drugs.
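A hedged sketch of the regression step described above: given reference absorption spectra of the pure components on a common frequency grid, the mixture spectrum is regressed on them (here with a non-negativity constraint) and the coefficients are normalized to percentages. The file names and the use of non-negative least squares are assumptions, not the authors' exact procedure.

import numpy as np
from scipy.optimize import nnls

# Hypothetical inputs: absorption spectra of pure components (columns) and of the mixture,
# all measured on the same terahertz frequency grid.
pure = np.loadtxt("pure_component_spectra.txt")   # shape (n_freqs, n_components)
mix = np.loadtxt("mixture_spectrum.txt")          # shape (n_freqs,)

# Linear regression with non-negative coefficients: mix ~= pure @ w.
w, residual = nnls(pure, mix)
component_percent = 100 * w / w.sum()             # normalize coefficients to percentages
print(component_percent, residual)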
NASA Astrophysics Data System (ADS)
Krivtsov, S. N.; Yakimov, I. V.; Ozornin, S. P.
2018-03-01
A mathematical model of a solenoid common rail fuel injector was developed. Its difference from existing models is that it simulates control valve wear. A Bosch common rail injector of the 0445110376 series (Cummins ISF 2.8 diesel engine) was used as the research object. Injector parameters (fuel delivery and back leakage) were determined by calculation and experimental methods. The GT-Suite model's average R2 is 0.93, which means that it predicts the injection rate shape very accurately (for both nominal and marginal technical conditions of the injector). Numerical analysis and experimental studies showed that control valve wear increases back leakage and fuel delivery (especially at 160 MPa). Regression models relating fuel delivery and back leakage to fuel pressure and energizing time were developed (for nominal and marginal technical conditions).
Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J
2015-12-01
In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. (c) 2015 APA, all rights reserved.
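A rough sketch of the weighting idea described above: a pruned classification tree predicts the probability of remaining in the study from baseline covariates, and the inverse of that probability is used as a sampling weight for the observed cases. The file name, variable names, pruning parameter, and weight clipping are assumptions.

import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("longitudinal_study.csv")  # hypothetical: baseline covariates + 'observed' flag
covariates = ["age", "education", "baseline_score"]

# Pruned CART model for the probability of being observed at follow-up.
tree = DecisionTreeClassifier(ccp_alpha=0.005, random_state=0)
tree.fit(df[covariates], df["observed"])
p_observed = tree.predict_proba(df[covariates])[:, 1]

# Inverse probability weights for the observed cases (clipped to avoid extreme weights).
df["ipw"] = np.where(df["observed"] == 1, 1.0 / np.clip(p_observed, 0.05, 1.0), 0.0)
print(df.loc[df["observed"] == 1, "ipw"].describe())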
Jackson, Dan; White, Ian R; Riley, Richard D
2013-01-01
Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
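For reference, the univariate DerSimonian and Laird moment estimator that the multivariate method reduces to in one dimension can be written in a few lines; the study-level effect estimates and within-study variances below are hypothetical.

import numpy as np

# Hypothetical study-level effect estimates and their within-study variances.
y = np.array([0.30, 0.15, 0.45, 0.20, 0.38])
v = np.array([0.02, 0.05, 0.03, 0.04, 0.02])

w = 1.0 / v
y_fixed = np.sum(w * y) / np.sum(w)                     # fixed-effect pooled estimate
Q = np.sum(w * (y - y_fixed) ** 2)                      # Cochran's Q
k = len(y)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))  # DL between-study variance

w_star = 1.0 / (v + tau2)
y_random = np.sum(w_star * y) / np.sum(w_star)          # random-effects pooled estimate
print(tau2, y_random)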
ERIC Educational Resources Information Center
Toutkoushian, Robert K.
This paper proposes a five-step process by which to analyze whether the salary ratio between junior and senior college faculty exhibits salary compression, a term used to describe an unusually small differential between faculty with different levels of experience. The procedure utilizes commonly used statistical techniques (multiple regression…
Hydrological predictions at a watershed scale are commonly based on extrapolation and upscaling of hydrological behavior at plot and hillslope scales. Yet, dominant hydrological drivers at a hillslope may not be as dominant at the watershed scale because of the heterogeneity of w...
Handling Missing Data: Analysis of a Challenging Data Set Using Multiple Imputation
ERIC Educational Resources Information Center
Pampaka, Maria; Hutcheson, Graeme; Williams, Julian
2016-01-01
Missing data is endemic in much educational research. However, practices such as step-wise regression common in the educational research literature have been shown to be dangerous when significant data are missing, and multiple imputation (MI) is generally recommended by statisticians. In this paper, we provide a review of these advances and their…
The Effect of Attending Tutoring on Course Grades in Calculus I
ERIC Educational Resources Information Center
Rickard, Brian; Mills, Melissa
2018-01-01
Tutoring centres are common in universities in the United States, but there are few published studies that statistically examine the effects of tutoring on student success. This study utilizes multiple regression analysis to model the effect of tutoring attendance on final course grades in Calculus I. Our model predicted that every three visits to…
Ryberg, Karen R.
2007-01-01
This report presents the results of a study by the U.S. Geological Survey, done in cooperation with the North Dakota State Water Commission, to estimate water-quality constituent concentrations at seven sites on the Sheyenne River, N. Dak. Regression analysis of water-quality data collected in 1980-2006 was used to estimate concentrations for hardness, dissolved solids, calcium, magnesium, sodium, and sulfate. The explanatory variables examined for the regression relations were continuously monitored streamflow, specific conductance, and water temperature. For the conditions observed in 1980-2006, streamflow was a significant explanatory variable for some constituents. Specific conductance was a significant explanatory variable for all of the constituents, and water temperature was not a statistically significant explanatory variable for any of the constituents in this study. The regression relations were evaluated using common measures of variability, including R2, the proportion of variability in the estimated constituent concentration explained by the explanatory variables and regression equation. R2 values ranged from 0.784 for calcium to 0.997 for dissolved solids. The regression relations also were evaluated by calculating the median relative percentage difference (RPD) between the measured constituent concentration and the constituent concentration estimated by the regression equations. Median RPDs ranged from 1.7 for dissolved solids to 11.5 for sulfate. The regression relations also may be used to estimate daily constituent loads. The relations should be monitored for change over time, especially at sites 2 and 3, which have a short period of record. In addition, caution should be used when the Sheyenne River is affected by ice or when upstream sites are affected by isolated storm runoff. Almost all of the outliers and highly influential samples removed from the analysis were collected during periods when the Sheyenne River might have been affected by ice.
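The report's published equations are not reproduced here, but the surrogate-regression idea can be sketched as a log-log regression of a constituent concentration on specific conductance; the model form and synthetic data are assumptions for illustration.

```python
# Illustrative surrogate regression of a constituent concentration on specific
# conductance (log-log linear form assumed; not the report's published equations).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
spec_cond = rng.uniform(400, 1500, 150)                          # microsiemens/cm (synthetic)
conc = 0.6 * spec_cond ** 0.95 * rng.lognormal(0, 0.05, 150)     # e.g. dissolved solids, mg/L

X = sm.add_constant(np.log(spec_cond))
fit = sm.OLS(np.log(conc), X).fit()
print("coefficients:", fit.params, "R^2 =", fit.rsquared)

# Relative percentage difference between measured and estimated concentrations
est = np.exp(fit.fittedvalues)
rpd = 100 * np.abs(conc - est) / ((conc + est) / 2)
print("median RPD:", np.median(rpd))
```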
Results of Database Studies in Spine Surgery Can Be Influenced by Missing Data.
Basques, Bryce A; McLynn, Ryan P; Fice, Michael P; Samuel, Andre M; Lukasiewicz, Adam M; Bohl, Daniel D; Ahn, Junyoung; Singh, Kern; Grauer, Jonathan N
2017-12-01
National databases are increasingly being used for research in spine surgery; however, one limitation of such databases that has received sparse mention is the frequency of missing data. Studies using these databases often do not emphasize the percentage of missing data for each variable used and do not specify how patients with missing data are incorporated into analyses. This study uses the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database to examine whether different treatments of missing data can influence the results of spine studies. (1) What is the frequency of missing data fields for demographics, medical comorbidities, preoperative laboratory values, operating room times, and length of stay recorded in ACS-NSQIP? (2) Using three common approaches to handling missing data, how frequently do those approaches agree in terms of finding particular variables to be associated with adverse events? (3) Do different approaches to handling missing data influence the outcomes and effect sizes of an analysis testing for an association with these variables with occurrence of adverse events? Patients who underwent spine surgery between 2005 and 2013 were identified from the ACS-NSQIP database. A total of 88,471 patients undergoing spine surgery were identified. The most common procedures were anterior cervical discectomy and fusion, lumbar decompression, and lumbar fusion. Demographics, comorbidities, and perioperative laboratory values were tabulated for each patient, and the percent of missing data was noted for each variable. These variables were tested for an association with "any adverse event" using three separate multivariate regressions that used the most common treatments for missing data. In the first regression, patients with any missing data were excluded. In the second regression, missing data were treated as a negative or "reference" value; for continuous variables, the mean of each variable's reference range was computed and imputed. In the third regression, any variables with > 10% rate of missing data were removed from the regression; among variables with ≤ 10% missing data, individual cases with missing values were excluded. The results of these regressions were compared to determine how the different treatments of missing data could affect the results of spine studies using the ACS-NSQIP database. Of the 88,471 patients, as many as 4441 (5%) had missing elements among demographic data, 69,184 (72%) among comorbidities, 70,892 (80%) among preoperative laboratory values, and 56,551 (64%) among operating room times. Considering the three different treatments of missing data, we found different risk factors for adverse events. Of 44 risk factors found to be associated with adverse events in any analysis, only 15 (34%) of these risk factors were common among the three regressions. The second treatment of missing data (assuming "normal" value) found the most risk factors (40) to be associated with any adverse event, whereas the first treatment (deleting patients with missing data) found the fewest associations at 20. Among the risk factors associated with any adverse event, the 10 with the greatest effect size (odds ratio) by each regression were ranked. Of the 15 variables in the top 10 for any regression, six of these were common among all three lists. Differing treatments of missing data can influence the results of spine studies using the ACS-NSQIP. 
The current study highlights the importance of considering how such missing data are handled. Until there are better guidelines on the best approaches to handling missing data, investigators should report how missing data were handled to increase the quality and transparency of orthopaedic database research. Readers of large database studies should note whether the handling of missing data was addressed and should consider the potential for bias when rates of missing data are high or when the methods used to handle them are unspecified or weak.
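A compact sketch of the three treatments of missing data compared above, applied to a hypothetical analysis data frame; the column names, reference values, and simulated missingness are assumptions, not ACS-NSQIP fields.

```python
# Three common treatments of missing data in a regression on "any adverse event".
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 1000
df = pd.DataFrame({
    "age": rng.normal(60, 12, n),
    "albumin": np.where(rng.random(n) < 0.3, np.nan, rng.normal(4.0, 0.5, n)),
    "diabetes": rng.binomial(1, 0.2, n).astype(float),
})
df["adverse_event"] = rng.binomial(1, 0.1, n)

formula = "adverse_event ~ age + albumin + diabetes"

# 1) Complete-case analysis: exclude patients with any missing data
fit1 = smf.logit(formula, data=df.dropna()).fit(disp=0)

# 2) Impute a "normal"/reference value (midpoint of an assumed reference range
#    for continuous variables; binary variables treated as negative)
df2 = df.fillna({"albumin": 4.0, "diabetes": 0})
fit2 = smf.logit(formula, data=df2).fit(disp=0)

# 3) Drop variables with >10% missingness, then exclude remaining missing cases
keep = [c for c in ["age", "albumin", "diabetes"] if df[c].isna().mean() <= 0.10]
fit3 = smf.logit("adverse_event ~ " + " + ".join(keep),
                 data=df[keep + ["adverse_event"]].dropna()).fit(disp=0)

for name, fit in [("complete case", fit1), ("reference imputation", fit2), ("drop >10% vars", fit3)]:
    print(name, dict(np.round(fit.params, 3)))
```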
Robustness of meta-analyses in finding gene × environment interactions
Shi, Gang; Nehorai, Arye
2017-01-01
Meta-analyses that synthesize statistical evidence across studies have become important analytical tools for genetic studies. Inspired by the success of genome-wide association studies of the genetic main effect, researchers are searching for gene × environment interactions. Confounders are routinely included in the genome-wide gene × environment interaction analysis as covariates; however, this does not control for any confounding effects on the results if covariate × environment interactions are present. We carried out simulation studies to evaluate the robustness to the covariate × environment confounder for meta-regression and joint meta-analysis, which are two commonly used meta-analysis methods for testing the gene × environment interaction or the genetic main effect and interaction jointly. Here we show that meta-regression is robust to the covariate × environment confounder while joint meta-analysis is subject to the confounding effect with inflated type I error rates. Given vast sample sizes employed in genome-wide gene × environment interaction studies, non-significant covariate × environment interactions at the study level could substantially elevate the type I error rate at the consortium level. When covariate × environment confounders are present, type I errors can be controlled in joint meta-analysis by including the covariate × environment terms in the analysis at the study level. Alternatively, meta-regression can be applied, which is robust to potential covariate × environment confounders. PMID:28362796
Incidence Trend and Epidemiology of Common Cancers in the Center of Iran.
Rafiemanesh, Hosein; Rajaei-Behbahani, Narjes; Khani, Yousef; Hosseini, Sayedehafagh; Pournamdar, Zahra; Mohammadian-Hafshejani, Abdollah; Soltani, Shahin; Hosseini, Seyedeh Akram; Khazaei, Salman; Salehiniya, Hamid
2015-07-13
Cancer is a major public health problem in Iran and many other parts of the world. Cancer incidence differs among countries and among the provinces within a country, and these geographical differences make epidemiological studies of the disease important. This study aimed to investigate cancer epidemiology and trends in the province of Qom, located in the center of Iran. This analytical cross-sectional study was carried out based on a re-analysis of cancer registry reports from the disease management center of the health ministry from 2004 to 2008 in the province of Qom. To describe incidence time trends, we carried out joinpoint regression analysis using the Joinpoint Regression Program software, Version 4.1.1.1. There were 3,029 registered cases of cancer during the 5 years studied. The sex ratio was 1.32 (male to female). Considering the frequency and mean standardized incidence, the most common cancers in women were breast, skin, colorectal, stomach, and esophagus, respectively, while in men the most common cancers were skin, stomach, colorectal, bladder, and prostate, respectively. There was a significant increasing trend for all-site cancer in women, with an annual percentage change (APC) of 8.08% (CI: 5.1-11.1). The incidence trend of all cancers combined was increasing in this area; hence, planning to identify risk factors and implementing programs for dealing with the disease are essential.
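The APC reported above comes from a log-linear model of the rate on calendar year. A single-segment version (the building block of joinpoint analysis, which additionally searches for change points) can be sketched as follows, with made-up rates.

```python
# Annual percent change (APC) from a log-linear regression of incidence rates on year.
import numpy as np
import statsmodels.api as sm

year = np.array([2004, 2005, 2006, 2007, 2008])
rate = np.array([95.0, 102.0, 110.0, 121.0, 130.0])   # illustrative rates per 100,000

X = sm.add_constant(year - year[0])
fit = sm.OLS(np.log(rate), X).fit()
slope = fit.params[1]
apc = 100 * (np.exp(slope) - 1)                        # APC = 100 * (e^slope - 1)
ci = 100 * (np.exp(fit.conf_int()[1]) - 1)
print(f"APC = {apc:.2f}% (95% CI {ci[0]:.1f} to {ci[1]:.1f})")
```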
Measuring missing heritability: Inferring the contribution of common variants
Golan, David; Lander, Eric S.; Rosset, Saharon
2014-01-01
Genome-wide association studies (GWASs), also called common variant association studies (CVASs), have uncovered thousands of genetic variants associated with hundreds of diseases. However, the variants that reach statistical significance typically explain only a small fraction of the heritability. One explanation for the “missing heritability” is that there are many additional disease-associated common variants whose effects are too small to detect with current sample sizes. It therefore is useful to have methods to quantify the heritability due to common variation, without having to identify all causal variants. Recent studies applied restricted maximum likelihood (REML) estimation to case–control studies for diseases. Here, we show that REML considerably underestimates the fraction of heritability due to common variation in this setting. The degree of underestimation increases with the rarity of disease, the heritability of the disease, and the size of the sample. Instead, we develop a general framework for heritability estimation, called phenotype correlation–genotype correlation (PCGC) regression, which generalizes the well-known Haseman–Elston regression method. We show that PCGC regression yields unbiased estimates. Applying PCGC regression to six diseases, we estimate the proportion of the phenotypic variance due to common variants to range from 25% to 56% and the proportion of heritability due to common variants from 41% to 68% (mean 60%). These results suggest that common variants may explain at least half the heritability for many diseases. PCGC regression also is readily applicable to other settings, including analyzing extreme-phenotype studies and adjusting for covariates such as sex, age, and population structure. PMID:25422463
Yang, Xiaowei; Nie, Kun
2008-03-15
Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.
Predicting recreational water quality advisories: A comparison of statistical methods
Brooks, Wesley R.; Corsi, Steven R.; Fienen, Michael N.; Carvin, Rebecca B.
2016-01-01
Epidemiological studies indicate that fecal indicator bacteria (FIB) in beach water are associated with illnesses among people having contact with the water. In order to mitigate public health impacts, many beaches are posted with an advisory when the concentration of FIB exceeds a beach action value. The most commonly used method of measuring FIB concentration takes 18–24 h before returning a result. In order to avoid the 24 h lag, it has become common to "nowcast" the FIB concentration using statistical regressions on environmental surrogate variables. Most commonly, nowcast models are estimated using ordinary least squares regression, but other regression methods from the statistical and machine learning literature are sometimes used. This study compares 14 regression methods across 7 Wisconsin beaches to identify which consistently produces the most accurate predictions. A random forest model is identified as the most accurate, followed by multiple regression fit using the adaptive LASSO.
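An illustrative comparison of two of the modelling approaches mentioned above, an ordinary LASSO-penalised regression (standing in for the adaptive LASSO) and a random forest, on synthetic surrogate variables; the variables and data-generating model are assumptions, not the study's data.

```python
# Nowcast-style comparison of a penalised linear model vs. a random forest.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(4)
n = 500
turbidity = rng.lognormal(1, 0.5, n)          # synthetic surrogate variables
rainfall = rng.gamma(2, 2, n)
wave_height = rng.uniform(0, 2, n)
log_fib = 1.0 + 0.8 * np.log(turbidity) + 0.3 * rainfall + rng.normal(0, 0.7, n)

X = np.column_stack([turbidity, rainfall, wave_height])
X_tr, X_te, y_tr, y_te = train_test_split(X, log_fib, random_state=0)

lasso = LassoCV(cv=5).fit(X_tr, y_tr)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)

for name, m in [("LASSO", lasso), ("random forest", rf)]:
    print(name, "test MSE:", mean_squared_error(y_te, m.predict(X_te)))
```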
A guide to understanding meta-analysis.
Israel, Heidi; Richter, Randy R
2011-07-01
With the focus on evidence-based practice in healthcare, a well-conducted systematic review that includes a meta-analysis where indicated represents a high level of evidence for treatment effectiveness. The purpose of this commentary is to assist clinicians in understanding meta-analysis as a statistical tool, using both published articles and explanations of components of the technique. We describe what meta-analysis is; what heterogeneity is and how it affects meta-analysis; effect size; the modeling techniques of meta-analysis; and the strengths and weaknesses of meta-analysis. Common components such as forest plot interpretation and available software, special cases of meta-analysis such as subgroup analysis, individual patient data, and meta-regression, and a discussion of criticisms are also included.
Patounakis, George; Hill, Micah J
2018-06-01
The purpose of the current review is to describe the common pitfalls in the design and statistical analysis of reproductive medicine studies. It serves to guide both authors and reviewers toward reducing the incidence of spurious statistical results and erroneous conclusions. The large amount of data gathered in IVF cycles leads to problems with multiplicity, multicollinearity, and overfitting of regression models. Furthermore, the use of the word 'trend' to describe nonsignificant results has increased in recent years. Finally, methods to accurately account for female age in infertility research models are becoming more common and necessary. The pitfalls of study design and analysis reviewed provide a framework for authors and reviewers to approach clinical research in the field of reproductive medicine. By providing a more rigorous approach to study design and analysis, the literature in reproductive medicine will have more reliable conclusions that can stand the test of time.
Robust Variable Selection with Exponential Squared Loss
Wang, Xueqin; Jiang, Yunlu; Huang, Mian; Zhang, Heping
2013-01-01
Robust variable selection procedures through penalized regression have been gaining increased attention in the literature. They can be used to perform variable selection and are expected to yield robust estimates. However, to the best of our knowledge, the robustness of those penalized regression procedures has not been well characterized. In this paper, we propose a class of penalized robust regression estimators based on exponential squared loss. The motivation for this new procedure is that it enables us to characterize its robustness that has not been done for the existing procedures, while its performance is near optimal and superior to some recently developed methods. Specifically, under defined regularity conditions, our estimators are n-consistent and possess the oracle property. Importantly, we show that our estimators can achieve the highest asymptotic breakdown point of 1/2 and that their influence functions are bounded with respect to the outliers in either the response or the covariate domain. We performed simulation studies to compare our proposed method with some recent methods, using the oracle method as the benchmark. We consider common sources of influential points. Our simulation studies reveal that our proposed method performs similarly to the oracle method in terms of the model error and the positive selection rate even in the presence of influential points. In contrast, other existing procedures have a much lower non-causal selection rate. Furthermore, we re-analyze the Boston Housing Price Dataset and the Plasma Beta-Carotene Level Dataset that are commonly used examples for regression diagnostics of influential points. Our analysis unravels the discrepancies of using our robust method versus the other penalized regression method, underscoring the importance of developing and applying robust penalized regression methods. PMID:23913996
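A minimal sketch of fitting a linear model by minimising the exponential squared loss 1 - exp(-r^2/gamma) on data with response outliers, compared with ordinary least squares. The fixed gamma, the synthetic data, and the omission of the penalty term and data-adaptive tuning used in the paper are simplifying assumptions.

```python
# Robust linear fit via the exponential squared loss vs. ordinary least squares.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(0, 0.5, n)
y[:10] += 15.0                                   # gross outliers in the response

gamma = 1.0                                      # fixed tuning value (illustrative)
def exp_sq_loss(beta):
    r = y - X @ beta
    return np.sum(1.0 - np.exp(-r**2 / gamma))   # bounded loss: outliers contribute at most 1

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
# Loss is nonconvex; the OLS fit is used only as a convenient starting value.
beta_rob = minimize(exp_sq_loss, x0=beta_ols, method="Nelder-Mead").x
print("OLS:   ", np.round(beta_ols, 3))
print("robust:", np.round(beta_rob, 3))
```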
Glass, Lisa M; Dickson, Rolland C; Anderson, Joseph C; Suriawinata, Arief A; Putra, Juan; Berk, Brian S; Toor, Arifa
2015-04-01
Given the rising epidemics of obesity and metabolic syndrome, nonalcoholic steatohepatitis (NASH) is now the most common cause of liver disease in the developed world. Effective treatment for NASH, either to reverse or to prevent the progression of hepatic fibrosis, is currently lacking. The aim of this study was to define the predictors associated with improved hepatic fibrosis in NASH patients undergoing serial liver biopsies at a prolonged biopsy interval. This is a cohort study of 45 NASH patients undergoing serial liver biopsies for clinical monitoring in a tertiary care setting. Biopsies were scored using the NASH Clinical Research Network guidelines. Fibrosis regression was defined as improvement in fibrosis score ≥1 stage. Univariate analysis utilized Fisher's exact or Student's t test. Multivariate regression models determined independent predictors of regression of fibrosis. Forty-five NASH patients with biopsies collected at a mean interval of 4.6 years (±1.4) were included. The mean initial fibrosis stage was 1.96, two patients had cirrhosis, and 12 patients (26.7 %) underwent bariatric surgery. There was a significantly higher rate of fibrosis regression among patients who lost ≥10 % total body weight (TBW) (63.2 vs. 9.1 %; p = 0.001) and among those who underwent bariatric surgery (47.4 vs. 4.5 %; p = 0.003). Factors such as age, gender, glucose intolerance, elevated ferritin, and A1AT heterozygosity did not influence fibrosis regression. On multivariate analysis, only weight loss of ≥10 % TBW predicted fibrosis regression [OR 8.14 (CI 1.08-61.17)]. The results indicate that regression of fibrosis in NASH is possible, even at advanced stages. Weight loss of ≥10 % TBW predicts fibrosis regression.
Guo, Fuyou; Shashikiran, Tagilapalli; Chen, Xi; Yang, Lei; Liu, Xianzhi; Song, Laijun
2015-01-01
Background: Deep venous thrombosis (DVT) contributes significantly to the morbidity and mortality of neurosurgical patients; however, no data regarding lower extremity DVT in postoperative Chinese neurosurgical patients have been reported. Materials and Methods: From January 2012 to December 2013, 196 patients without preoperative DVT who underwent neurosurgical operations were evaluated by color Doppler ultrasonography and D-dimer level measurements on the 3rd, 7th, and 14th days after surgery. Follow-up clinical data were recorded to determine the incidence of lower extremity DVT in postoperative neurosurgical patients and to analyze related clinical features. First, a single factor analysis, Chi-square test, was used to select statistically significant factors. Then, a multivariate analysis, binary logistic regression analysis, was used to determine risk factors for lower extremity DVT in postoperative neurosurgical patients. Results: Lower extremity DVT occurred in 61 patients, and the incidence of DVT was 31.1% in the enrolled Chinese neurosurgical patients. The common symptoms of DVT were limb swelling and lower extremity pain as well as increased soft tissue tension. The common sites of venous involvement were the calf muscle and peroneal and posterior tibial veins. The single factor analysis showed statistically significant differences in DVT risk factors, including age, hypertension, smoking status, operation time, a bedridden or paralyzed state, the presence of a tumor, postoperative dehydration, and glucocorticoid treatment, between the two groups (P < 0.05). The binary logistic regression analysis showed that an age greater than 50 years, hypertension, a bedridden or paralyzed state, the presence of a tumor, and postoperative dehydration were risk factors for lower extremity DVT in postoperative neurosurgical patients. Conclusions: Lower extremity DVT was a common complication following craniotomy in the enrolled Chinese neurosurgical patients. Multiple factors were identified as predictive of DVT in neurosurgical patients, including the presence of a tumor, an age greater than 50 years, hypertension, and immobility. PMID:26752303
ERIC Educational Resources Information Center
Bobbett, Gordon C.; And Others
The relationships among factors reported on school district (SD) report cards were studied for 121 Tennessee SDs. The report cards provided data on student outcomes (achievement test scores) and SD characteristics. Relationships were studied through linear regression, Pearson product moment correlation, and Guttman's partial correlation. Six…
A flexible count data regression model for risk analysis.
Guikema, Seth D; Coffelt, Jeremy P
2008-02-01
In many cases, risk and reliability analyses involve estimating the probabilities of discrete events such as hardware failures and occurrences of disease or death. There is often additional information in the form of explanatory variables that can be used to help estimate the likelihood of different numbers of events in the future through the use of an appropriate regression model, such as a generalized linear model. However, existing generalized linear models (GLM) are limited in their ability to handle the types of variance structures often encountered in using count data in risk and reliability analysis. In particular, standard models cannot handle both underdispersed data (variance less than the mean) and overdispersed data (variance greater than the mean) in a single coherent modeling framework. This article presents a new GLM based on a reformulation of the Conway-Maxwell Poisson (COM) distribution that is useful for both underdispersed and overdispersed count data and demonstrates this model by applying it to the assessment of electric power system reliability. The results show that the proposed COM GLM can fit data as well as the commonly used existing models for overdispersed data sets while outperforming these commonly used models for underdispersed data sets.
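For reference, the COM distribution underlying this GLM has probability mass proportional to lambda^y / (y!)^nu, with nu > 1 giving underdispersion and nu < 1 overdispersion; a small sketch of its normalisation and moments (with arbitrary parameter values) follows.

```python
# Conway-Maxwell-Poisson (COM) probability mass function and moments.
import numpy as np
from scipy.special import gammaln

def com_poisson_pmf(y, lam, nu, y_max=300):
    """P(Y = y) = lambda^y / (y!)^nu / Z, with Z approximated by a truncated sum."""
    ys = np.arange(y_max + 1)
    log_terms = ys * np.log(lam) - nu * gammaln(ys + 1)
    log_z = np.logaddexp.reduce(log_terms)          # log of the normalising constant Z
    return np.exp(y * np.log(lam) - nu * gammaln(y + 1) - log_z)

support = np.arange(0, 101)
for nu in (0.5, 1.0, 2.0):                          # over-, equi-, under-dispersed
    p = com_poisson_pmf(support, lam=5.0, nu=nu)
    mean = np.sum(support * p)
    var = np.sum((support - mean) ** 2 * p)
    print(f"nu = {nu}: mean = {mean:.2f}, variance = {var:.2f}")
```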
Factors associated with vocal fold pathologies in teachers.
Souza, Carla Lima de; Carvalho, Fernando Martins; Araújo, Tânia Maria de; Reis, Eduardo José Farias Borges Dos; Lima, Verônica Maria Cadena; Porto, Lauro Antonio
2011-10-01
To analyze factors associated with the prevalence of the medical diagnosis of vocal fold pathologies in teachers, a census-based epidemiological, cross-sectional study was conducted with 4,495 public primary and secondary school teachers in the city of Salvador, Northeastern Brazil, between March and April 2006. The dependent variable was the self-reported medical diagnosis of vocal fold pathologies and the independent variables were sociodemographic characteristics; professional activity; work organization/interpersonal relationships; physical work environment characteristics; frequency of common mental disorders, measured by the Self-Reporting Questionnaire-20 (SRQ-20 >7); and general health conditions. Descriptive statistical, bivariate, and multiple logistic regression analysis techniques were used. The prevalence of self-reported medical diagnosis of vocal fold pathologies was 18.9%. In the logistic regression analysis, the variables that remained associated with this medical diagnosis were as follows: being female, having worked as a teacher for more than seven years, excessive voice use, reporting more than five unfavorable physical work environment characteristics, and presence of common mental disorders. The presence of self-reported vocal fold pathologies was associated with factors that point to the need for actions promoting teachers' vocal health and for changes in the structure and organization of their work.
Al-Modallal, Hanan
2016-10-01
The purpose of this study was to examine the cumulative effect of childhood and adulthood violence on depressive symptoms in a sample of Jordanian college women. A snowball sampling technique was used to recruit the participants, who were heterosexual college women between the ages of 18 and 25. The participants were asked about their experiences of childhood violence (including physical violence, sexual violence, psychological violence, and witnessing parental violence), partner violence (including physical partner violence and sexual partner violence), experiences of depressive symptoms, and other demographic and familial factors as possible predictors of their complaints of depressive symptoms. Multiple linear regression analysis was implemented to identify demographic- and violence-related predictors of depressive symptoms. Logistic regression analysis was further performed to identify the possible type(s) of violence associated with an increased risk of depressive symptoms. The prevalence of depressive symptoms in this sample was 47.4%. Regarding violence experience, witnessing parental violence was the most common during childhood, experienced by 40 (41.2%) women, and physical partner violence was the most common in adulthood, experienced by 35 (36.1%) women. Results of the logistic regression analysis indicated that experiencing two types of violence (regardless of the time of occurrence) was significant in predicting depressive symptoms (odds ratio [OR] = 3.45, p < .05). Among the college women's demographic characteristics, marital status (single vs. engaged), mother's level of education, income, and smoking were significant in predicting depressive symptoms. Future studies should assess physical violence and depressive symptoms, including the cumulative impact of longer periods of violence on depressive symptoms. © The Author(s) 2015.
Byrne, Enda M; Gehrman, Philip R; Trzaskowski, Maciej; Tiemeier, Henning; Pack, Allan I
2016-10-01
We sought to examine how much of the heritability of self-report sleep duration is tagged by common genetic variation in populations of European ancestry and to test if the common variants contributing to sleep duration are also associated with other diseases and traits. We utilized linkage disequilibrium (LD)-score regression to estimate the heritability tagged by common single nucleotide polymorphisms (SNPs) in the CHARGE consortium genome-wide association study (GWAS) of self-report sleep duration. We also used bivariate LD-score regression to investigate the genetic correlation of sleep duration with other publicly available GWAS datasets. We show that 6% (SE = 1%) of the variance in self-report sleep duration in the CHARGE study is tagged by common SNPs in European populations. Furthermore, we find evidence of a positive genetic correlation (rG) between sleep duration and type 2 diabetes (rG = 0.26, P = 0.02), and between sleep duration and schizophrenia (rG = 0.19, P = 0.01). Our results show that increased sample sizes will identify more common variants for self-report sleep duration; however, the heritability tagged is small when compared to other traits and diseases. These results also suggest that those who carry variants that increase risk to type 2 diabetes and schizophrenia are more likely to report longer sleep duration. © 2016 Associated Professional Sleep Societies, LLC.
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Reporting of surgical complications is common, but few reports provide information about their severity or estimate risk factors for complications, and those that do often lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgical procedures at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model of risk factors for postoperative complications graded according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis; the 11 significant variables that entered the multivariate analysis were employed to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time, and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation p = exp(ΣβiXi)/(1 + exp(ΣβiXi)), a multivariate logistic regression model that calculates the risk of postoperative morbidity was developed: p = 1/(1 + e^(4.810 − 1.287X1 − 0.504X2 − 0.500X3 − 0.474X4 − 0.405X5 − 0.318X6 − 0.316X7 − 0.305X8 − 0.278X9 − 0.255X10 − 0.138X11)). The accuracy, sensitivity, and specificity of the model for predicting postoperative complications were 86.7%, 76.2%, and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, can serve as an accurate decision-making tool for estimating patients' risks and benefits of gastric surgery, and may serve as a template for the development of risk models for other surgical groups.
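Applying the reported equation to a new patient is straightforward; the sketch below assumes each Xi is a 0/1 indicator for the i-th risk factor (the paper defines the exact coding of each variable), and the example patient is hypothetical.

```python
# Risk calculator using the logistic equation reported in the abstract above.
import numpy as np

intercept = 4.810
coefs = np.array([1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                  0.316, 0.305, 0.278, 0.255, 0.138])     # for X1 ... X11

def predicted_morbidity(x):
    """x: length-11 array of risk-factor indicators X1..X11 (coding assumed 0/1)."""
    linpred = intercept - np.dot(coefs, x)   # matches p = 1/(1 + e^(4.810 - sum(b_i * X_i)))
    return 1.0 / (1.0 + np.exp(linpred))

x_example = np.array([1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1])   # hypothetical patient
print(f"predicted probability of postoperative complications: {predicted_morbidity(x_example):.3f}")
```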
A Comparative Study of Pairwise Learning Methods Based on Kernel Ridge Regression.
Stock, Michiel; Pahikkala, Tapio; Airola, Antti; De Baets, Bernard; Waegeman, Willem
2018-06-12
Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction, or network inference problems. During the past decade, kernel methods have played a dominant role in pairwise learning. They still obtain a state-of-the-art predictive performance, but a theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression, and a linear matrix filter arise naturally as a special case of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency, and spectral filtering properties. Our theoretical results provide valuable insights into assessing the advantages and limitations of existing pairwise learning methods.
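The shared building block of these methods is plain kernel ridge regression, which has a closed-form solution; a minimal numpy sketch (Gaussian kernel, synthetic data, and regularisation value are illustrative choices) follows.

```python
# Closed-form kernel ridge regression: alpha = (K + lambda*I)^{-1} y.
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * sigma**2))

rng = np.random.default_rng(6)
X = rng.uniform(-3, 3, (100, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 100)

lam = 0.1
K = gaussian_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)     # dual coefficients

X_new = np.linspace(-3, 3, 5)[:, None]
y_hat = gaussian_kernel(X_new, X) @ alpha                # f(x) = sum_i alpha_i * k(x, x_i)
print(np.round(y_hat, 3))
```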
Panayi, Efstathios; Peters, Gareth W; Kyriakides, George
2017-01-01
Quantifying the effects of environmental factors over the duration of the growing process on Agaricus Bisporus (button mushroom) yields has been difficult, as common functional data analysis approaches require fixed length functional data. The data available from commercial growers, however, is of variable duration, due to commercial considerations. We employ a recently proposed regression technique termed Variable-Domain Functional Regression in order to be able to accommodate these irregular-length datasets. In this way, we are able to quantify the contribution of covariates such as temperature, humidity and water spraying volumes across the growing process, and for different lengths of growing processes. Our results indicate that optimal oxygen and temperature levels vary across the growing cycle and we propose environmental schedules for these covariates to optimise overall yields.
NASA Astrophysics Data System (ADS)
Sulistianingsih, E.; Kiftiah, M.; Rosadi, D.; Wahyuni, H.
2017-04-01
Gross Domestic Product (GDP) is an indicator of economic growth in a region. GDP here takes the form of panel data, which combine cross-sectional and time series observations, and panel regression is a tool that can be utilised to analyse such data. There are three models in panel regression, namely the Common Effect Model (CEM), the Fixed Effect Model (FEM) and the Random Effect Model (REM); the model is chosen based on the results of the Chow Test, the Hausman Test and the Lagrange Multiplier Test. This research uses panel regression to analyse the effects of palm oil production, exports, and government consumption on the GDP of five districts in West Kalimantan, namely Sanggau, Sintang, Sambas, Ketapang and Bengkayang. Based on the results of the analyses, it is concluded that REM, whose adjusted coefficient of determination is 0.823, is the best model in this case. According to this model, only exports and government consumption influence the GDP of the districts.
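A rough sketch of two of the three models, the Common Effect Model as pooled OLS and the Fixed Effect Model as the within (entity-demeaned) estimator, on a made-up panel; the variable names, the simulated data, and the omission of the REM and specification tests are assumptions for illustration.

```python
# Common Effect Model (pooled OLS) vs. Fixed Effect Model (within estimator).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
districts = ["Sanggau", "Sintang", "Sambas", "Ketapang", "Bengkayang"]
rows = []
for d in districts:
    effect = rng.normal(0, 2)                       # unobserved district effect
    for t in range(2008, 2016):
        export = rng.uniform(1, 10)
        gov_cons = rng.uniform(1, 5)
        gdp = 3 + 0.5 * export + 0.8 * gov_cons + effect + rng.normal(0, 0.5)
        rows.append((d, t, gdp, export, gov_cons))
panel = pd.DataFrame(rows, columns=["district", "year", "gdp", "export", "gov_cons"])

# Common Effect Model: pooled OLS ignoring the panel structure
X = sm.add_constant(panel[["export", "gov_cons"]])
cem = sm.OLS(panel["gdp"], X).fit()

# Fixed Effect Model: demean every variable within each district (within estimator)
demeaned = panel.groupby("district")[["gdp", "export", "gov_cons"]].transform(lambda g: g - g.mean())
fem = sm.OLS(demeaned["gdp"], demeaned[["export", "gov_cons"]]).fit()

print("CEM:", cem.params.round(3).to_dict())
print("FEM:", fem.params.round(3).to_dict())
```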
“Smooth” Semiparametric Regression Analysis for Arbitrarily Censored Time-to-Event Data
Zhang, Min; Davidian, Marie
2008-01-01
Summary A general framework for regression analysis of time-to-event data subject to arbitrary patterns of censoring is proposed. The approach is relevant when the analyst is willing to assume that distributions governing model components that are ordinarily left unspecified in popular semiparametric regression models, such as the baseline hazard function in the proportional hazards model, have densities satisfying mild “smoothness” conditions. Densities are approximated by a truncated series expansion that, for fixed degree of truncation, results in a “parametric” representation, which makes likelihood-based inference coupled with adaptive choice of the degree of truncation, and hence flexibility of the model, computationally and conceptually straightforward with data subject to any pattern of censoring. The formulation allows popular models, such as the proportional hazards, proportional odds, and accelerated failure time models, to be placed in a common framework; provides a principled basis for choosing among them; and renders useful extensions of the models straightforward. The utility and performance of the methods are demonstrated via simulations and by application to data from time-to-event studies. PMID:17970813
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws
Xiao, X.; White, E.P.; Hooten, M.B.; Durham, S.L.
2011-01-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain. © 2011 by the Ecological Society of America.
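The two approaches compared above can be illustrated on one simulated power law y = a * x^b with multiplicative lognormal error; the parameter values are arbitrary.

```python
# Linear regression on log-transformed data vs. nonlinear regression for a power law.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress

rng = np.random.default_rng(8)
a, b = 2.0, 0.75
x = rng.uniform(1, 100, 300)
y = a * x**b * rng.lognormal(0, 0.3, 300)        # multiplicative lognormal error

# Linear regression on the log-log scale
res = linregress(np.log(x), np.log(y))
print("LR:  a =", np.exp(res.intercept).round(3), " b =", round(res.slope, 3))

# Nonlinear regression on the original scale
popt, _ = curve_fit(lambda x, a, b: a * x**b, x, y, p0=[1.0, 1.0])
print("NLR: a =", popt[0].round(3), " b =", popt[1].round(3))
```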
Non-ignorable missingness in logistic regression.
Wang, Joanna J J; Bartlett, Mark; Ryan, Louise
2017-08-30
Nonresponses and missing data are common in observational studies. Ignoring or inadequately handling missing data may lead to biased parameter estimation, incorrect standard errors and, as a consequence, incorrect statistical inference and conclusions. We present a strategy for modelling non-ignorable missingness where the probability of nonresponse depends on the outcome. Using a simple case of logistic regression, we quantify the bias in regression estimates and show the observed likelihood is non-identifiable under non-ignorable missing data mechanism. We then adopt a selection model factorisation of the joint distribution as the basis for a sensitivity analysis to study changes in estimated parameters and the robustness of study conclusions against different assumptions. A Bayesian framework for model estimation is used as it provides a flexible approach for incorporating different missing data assumptions and conducting sensitivity analysis. Using simulated data, we explore the performance of the Bayesian selection model in correcting for bias in a logistic regression. We then implement our strategy using survey data from the 45 and Up Study to investigate factors associated with worsening health from the baseline to follow-up survey. Our findings have practical implications for the use of the 45 and Up Study data to answer important research questions relating to health and quality-of-life. Copyright © 2017 John Wiley & Sons, Ltd.
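A small simulation in the spirit of the setting above, showing how a complete-case logistic regression is biased when the probability of a missing outcome depends on the outcome itself; the data-generating values are illustrative, and the Bayesian selection model itself is not shown.

```python
# Bias in complete-case logistic regression under outcome-dependent (non-ignorable) missingness.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 20000
x = rng.normal(size=n)
p_y = 1 / (1 + np.exp(-(-1.0 + 1.0 * x)))
y = rng.binomial(1, p_y)

# Outcome-dependent nonresponse: y = 1 is much more likely to be missing
p_miss = np.where(y == 1, 0.6, 0.1)
observed = rng.binomial(1, 1 - p_miss).astype(bool)

full_fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
cc_fit = sm.Logit(y[observed], sm.add_constant(x[observed])).fit(disp=0)
print("full data:    ", full_fit.params.round(3))
print("complete case:", cc_fit.params.round(3))   # intercept is markedly biased
```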
Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim
2014-01-01
The most common differential diagnosis of the β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations have been introduced in different studies for differential diagnosis between the β-thal trait and iron deficiency anemia. Because of genetic variation across regions, these equations are not useful in every population. The aim of this study was to determine, by logistic regression analysis, a native equation with high accuracy for the differential diagnosis of the β-thal trait and iron deficiency anemia in the Sistan and Baluchestan population. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis to determine the equation that best predicts the probability of the β-thal trait against iron deficiency anemia in our population. We compared the diagnostic values and receiver operating characteristic (ROC) curves of this equation and another 10 published equations in discriminating the β-thal trait from iron deficiency anemia. The binary logistic regression analysis yielded the best equation for predicting the probability of the β-thal trait against iron deficiency anemia, with an area under the curve (AUC) of 0.998. Based on the ROC curves and AUC, the Green & King, England & Frazer, and Sirdah indices, respectively, were the most accurate after our equation. We suggest that, to obtain the best equation and cut-off for each region, the specific characteristics of that region should be evaluated, particularly in areas where populations are homogeneous, to provide a specific formula for differentiating between the β-thal trait and iron deficiency anemia.
Vascular disease, ESRD, and death: interpreting competing risk analyses.
Grams, Morgan E; Coresh, Josef; Segev, Dorry L; Kucirka, Lauren M; Tighiouart, Hocine; Sarnak, Mark J
2012-10-01
Vascular disease, a common condition in CKD, is a risk factor for mortality and ESRD. Optimal patient care requires accurate estimation and ordering of these competing risks. This is a prospective cohort study of screened (n=885) and randomized participants (n=837) in the Modification of Diet in Renal Disease study (original study enrollment, 1989-1992), evaluating the association of vascular disease with ESRD and pre-ESRD mortality using standard survival analysis and competing risk regression. The method of analysis resulted in markedly different estimates. Cumulative incidence by standard analysis (censoring at the competing event) implied that, with vascular disease, the 15-year incidence was 66% and 51% for ESRD and pre-ESRD death, respectively. A more accurate representation of absolute risk was estimated with competing risk regression: 15-year incidence was 54% and 29% for ESRD and pre-ESRD death, respectively. For the association of vascular disease with pre-ESRD death, estimates of relative risk by the two methods were similar (standard survival analysis adjusted hazard ratio, 1.63; 95% confidence interval, 1.20-2.20; competing risk regression adjusted subhazard ratio, 1.57; 95% confidence interval, 1.15-2.14). In contrast, the hazard and subhazard ratios differed substantially for other associations, such as GFR and pre-ESRD mortality. When competing events exist, absolute risk is better estimated using competing risk regression, but etiologic associations by this method must be carefully interpreted. The presence of vascular disease in CKD decreases the likelihood of survival to ESRD, independent of age and other risk factors.
Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo
2011-03-04
Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. Copyright © 2010 Elsevier B.V. All rights reserved.
Atlas of climate change effects in 150 bird species of the Eastern United States
Stephen Matthews; Raymond O' Connor; Louis R. Iverson; Anantha M. Prasad
2004-01-01
This atlas documents the current and potential future distribution of 150 common bird species in the Eastern United States. Distribution data for individual species were derived from the Breeding Bird Survey (BBS) from 1981 to 1990. Regression tree analysis was used to model the BBS...
ERIC Educational Resources Information Center
Liu, Xing
2008-01-01
The proportional odds (PO) model, also called the cumulative odds model (Agresti, 1996, 2002; Armstrong & Sloan, 1989; Long, 1997; Long & Freese, 2006; McCullagh, 1980; McCullagh & Nelder, 1989; Powers & Xie, 2000; O'Connell, 2006), is one of the most commonly used models for the analysis of ordinal categorical data and comes from the class…
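For reference, one standard way to write the PO (cumulative logit) model, not reproduced from the article, is

```latex
\operatorname{logit}\, P(Y \le j \mid \mathbf{x}) \;=\; \alpha_j - \mathbf{x}^{\top}\boldsymbol{\beta}, \qquad j = 1, \dots, J-1,
```

where the proportional odds assumption is that the slope vector β is common to all J - 1 cumulative logits, so only the thresholds α_j vary across categories (some texts use the sign convention α_j + x'β instead).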
Equations relating compacted and uncompacted live crown ratio for common tree species in the South
KaDonna C. Randolph
2010-01-01
Species-specific equations to predict uncompacted crown ratio (UNCR) from compacted live crown ratio (CCR), tree length, and stem diameter were developed for 24 species and 12 genera in the southern United States. Using data from the US Forest Service Forest Inventory and Analysis program, nonlinear regression was used to model UNCR with a logistic function. Model...
Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T
2016-12-20
Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Locally Weighted Score Estimation for Quantile Classification in Binary Regression Models
Rice, John D.; Taylor, Jeremy M. G.
2016-01-01
One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this is given to us a priori, it is sensible to incorporate the threshold into our estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined. This work has much in common with robust estimation, but differs from previous approaches in this area in its focus on prediction, specifically classification into high- and low-risk groups. Simulation results are given showing the reduction in error rates that can be obtained with this method when compared with maximum likelihood estimation, especially under certain forms of model misspecification. Analysis of a melanoma data set is presented to illustrate the use of the method in practice. PMID:28018492
Two-dimensional advective transport in ground-water flow parameter estimation
Anderman, E.R.; Hill, M.C.; Poeter, E.P.
1996-01-01
Nonlinear regression is useful in ground-water flow parameter estimation, but problems of parameter insensitivity and correlation often exist given commonly available hydraulic-head and head-dependent flow (for example, stream and lake gain or loss) observations. To address this problem, advective-transport observations are added to the ground-water flow, parameter-estimation model MODFLOWP using particle-tracking methods. The resulting model is used to investigate the importance of advective-transport observations relative to head-dependent flow observations when either or both are used in conjunction with hydraulic-head observations in a simulation of the sewage-discharge plume at Otis Air Force Base, Cape Cod, Massachusetts, USA. The analysis procedure for evaluating the probable effect of new observations on the regression results consists of two steps: (1) parameter sensitivities and correlations calculated at initial parameter values are used to assess the model parameterization and expected relative contributions of different types of observations to the regression; and (2) optimal parameter values are estimated by nonlinear regression and evaluated. In the Cape Cod parameter-estimation model, advective-transport observations did not significantly increase the overall parameter sensitivity; however: (1) inclusion of advective-transport observations decreased parameter correlation enough for more unique parameter values to be estimated by the regression; (2) realistic uncertainties in advective-transport observations had a small effect on parameter estimates relative to the precision with which the parameters were estimated; and (3) the regression results and sensitivity analysis provided insight into the dynamics of the ground-water flow system, especially the importance of accurate boundary conditions. In this work, advective-transport observations improved the calibration of the model and the estimation of ground-water flow parameters, and use of regression and related techniques produced significant insight into the physical system.
Kennedy, Jeffrey R.; Paretti, Nicholas V.
2014-01-01
Flooding in urban areas routinely causes severe damage to property and often results in loss of life. To investigate the effect of urbanization on the magnitude and frequency of flood peaks, a flood frequency analysis was carried out using data from urbanized streamgaging stations in Phoenix and Tucson, Arizona. Flood peaks at each station were predicted using the log-Pearson Type III distribution, fitted using the expected moments algorithm and the multiple Grubbs-Beck low outlier test. The station estimates were then compared to flood peaks estimated by rural-regression equations for Arizona, and to flood peaks adjusted for urbanization using a previously developed procedure for adjusting U.S. Geological Survey rural regression peak discharges in an urban setting. Only smaller, more common flood peaks at the 50-, 20-, 10-, and 4-percent annual exceedance probabilities (AEPs) demonstrate any increase in magnitude as a result of urbanization; the 1-, 0.5-, and 0.2-percent AEP flood estimates are predicted without bias by the rural-regression equations. Percent imperviousness was determined not to account for the difference in estimated flood peaks between stations, either when adjusting the rural-regression equations or when deriving urban-regression equations to predict flood peaks directly from basin characteristics. Comparison with urban adjustment equations indicates that flood peaks are systematically overestimated if the rural-regression-estimated flood peaks are adjusted upward to account for urbanization. At nearly every streamgaging station in the analysis, adjusted rural-regression estimates were greater than the estimates derived using station data. One likely reason for the lack of increase in flood peaks with urbanization is the presence of significant stormwater retention and detention structures within the watershed used in the study.
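The station-level fitting step can be illustrated with a plain moment-based log-Pearson Type III fit in SciPy. This is only a sketch: it does not implement the expected moments algorithm or the multiple Grubbs-Beck low-outlier test used in the study, and the annual peak values are invented.

```python
# Sketch: moment-based log-Pearson Type III flood-frequency estimates.
# Does not implement the expected moments algorithm or the Grubbs-Beck test.
import numpy as np
from scipy import stats

peaks = np.array([120., 340., 95., 410., 180., 260., 530., 75., 300., 220.,
                  150., 480., 390., 210., 640., 130., 280., 360., 170., 450.])  # invented annual peaks
log_q = np.log10(peaks)

skew = stats.skew(log_q, bias=False)
lp3 = stats.pearson3(skew, loc=log_q.mean(), scale=log_q.std(ddof=1))

for aep in (0.50, 0.20, 0.10, 0.04, 0.01):      # annual exceedance probabilities
    q_est = 10 ** lp3.ppf(1.0 - aep)            # non-exceedance quantile, back-transformed
    print(f"{aep:.0%} AEP peak-flow estimate: {q_est:,.0f}")
```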
Markov chains and semi-Markov models in time-to-event analysis.
Abner, Erin L; Charnigo, Richard J; Kryscio, Richard J
2013-10-25
A variety of statistical methods are available to investigators for analysis of time-to-event data, often referred to as survival analysis. Kaplan-Meier estimation and Cox proportional hazards regression are commonly employed tools but are not appropriate for all studies, particularly in the presence of competing risks and when multiple or recurrent outcomes are of interest. Markov chain models can accommodate censored data, competing risks (informative censoring), multiple outcomes, recurrent outcomes, frailty, and non-constant survival probabilities. Markov chain models, though often overlooked by investigators in time-to-event analysis, have long been used in clinical studies and have widespread application in other fields.
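A minimal sketch of the kind of multi-state model described above: a discrete-time Markov chain with transient and absorbing states, iterated to obtain state-occupation probabilities over time. The states and transition probabilities below are invented for illustration.

```python
# Sketch: discrete-time Markov chain with transient and absorbing states,
# as used for multi-state time-to-event models. Transition probabilities
# below are invented for illustration.
import numpy as np

states = ["healthy", "impaired", "dementia", "dead"]
P = np.array([
    [0.85, 0.10, 0.02, 0.03],   # from healthy
    [0.05, 0.75, 0.12, 0.08],   # from impaired (back-transitions allowed)
    [0.00, 0.00, 1.00, 0.00],   # dementia treated as absorbing for simplicity
    [0.00, 0.00, 0.00, 1.00],   # dead is absorbing (a competing risk)
])

occupancy = np.array([1.0, 0.0, 0.0, 0.0])   # everyone starts healthy
for year in range(1, 11):
    occupancy = occupancy @ P                # state-occupation probabilities after `year` steps
    print(year, dict(zip(states, occupancy.round(3))))
```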
Model synthesis in frequency analysis of Missouri floods
Hauth, Leland D.
1974-01-01
Synthetic flood records for 43 small-stream sites aided in definition of techniques for estimating the magnitude and frequency of floods in Missouri. The long-term synthetic flood records were generated by use of a digital computer model of the rainfall-runoff process. A relatively short period of concurrent rainfall and runoff data observed at each of the 43 sites was used to calibrate the model, and rainfall records covering from 66 to 78 years for four Missouri sites and pan-evaporation data were used to generate the synthetic records. Flood magnitude and frequency characteristics of both the synthetic records and observed long-term flood records available for 109 large-stream sites were used in a multiple-regression analysis to define relations for estimating future flood characteristics at ungaged sites. That analysis indicated that drainage basin size and slope were the most useful estimating variables. It also indicated that a more complex regression model than the commonly used log-linear one was needed for the range of drainage basin sizes available in this study.
NASA Astrophysics Data System (ADS)
Yang, Peng; Xia, Jun; Zhang, Yongyong; Han, Jian; Wu, Xia
2017-11-01
Because drought is a very common and widespread natural disaster, it has attracted a great deal of academic interest. Based on 12-month time scale standardized precipitation indices (SPI12) calculated from precipitation data recorded between 1960 and 2015 at 22 weather stations in the Tarim River Basin (TRB), this study aims to identify the trends of SPI and drought duration, severity, and frequency at various quantiles and to perform cluster analysis of drought events in the TRB. The results indicated that (1) both precipitation and temperature at most stations in the TRB exhibited significant positive trends during 1960-2015; (2) multiple scales of SPIs changed significantly around 1986; (3) based on quantile regression analysis of temporal drought changes, the positive SPI slopes indicated less severe and less frequent droughts at lower quantiles, but clear variation was detected in the drought frequency; and (4) the trends in drought frequency differed significantly, particularly between severe and more common drought events.
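The quantile-regression trend analysis can be sketched as follows with statsmodels, assuming a simulated SPI12 series; the quantiles and variable names are illustrative, not the study's actual data.

```python
# Sketch: quantile regression of a drought index (e.g. SPI12) on time,
# as used to compare trends at different quantiles. Data are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
years = np.arange(1960, 2016)
spi = 0.01 * (years - 1960) + rng.normal(scale=1.0, size=years.size)  # weak wetting trend
df = pd.DataFrame({"year": years, "spi12": spi})

for q in (0.1, 0.5, 0.9):                      # lower quantiles ~ drought conditions
    fit = smf.quantreg("spi12 ~ year", df).fit(q=q)
    print(f"q={q}: slope per year = {fit.params['year']:.4f}")
```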
Park, Brian D; Azefor, Nchang; Huang, Chun-Chih; Ricotta, John J
2013-04-01
Our aim was to determine national trends in treatment of ruptured abdominal aortic aneurysm (RAAA), with specific emphasis on open surgical repair (OSR) and endovascular aneurysm repair (EVAR) and their impact on mortality and complications. Data from the Nationwide Inpatient Sample (NIS) from 2005 to 2009 were queried to identify patients older than 59 years with RAAA. Three groups were studied: nonoperative (NO), EVAR, and OSR. Chi-square analysis was used to determine the relationship between treatment type and patient demographics, clinical characteristics, and hospital type. The impact of EVAR compared with OSR on mortality and overall complications was examined using logistic regression analysis. We identified 21,206 patients with RAAA from 2005 to 2009, of which 16,558 (78.1%) underwent operative repair and 21.8% received no operative treatment. In the operative group, 12,761 (77.1%) underwent OSR and 3,796 (22.9%) underwent EVAR. Endovascular aneurysm repair was more common in teaching hospitals (29.1% vs 15.2%, p < .0001) and in urban versus rural settings. The nonoperative approach was twice as common in rural versus urban hospitals. Reduced mortality was seen in patients transferred from another institution (31.2% vs 39.4%, p = 0.014). Logistic regression analysis demonstrated a benefit of EVAR on both complication rate (OR = 0.492; CI, 0.380-0.636) and mortality (OR = 0.535; CI, 0.395-0.724). Endovascular aneurysm repair use is increasing for RAAA and is more common in urban teaching hospitals while NO therapy is more common in rural hospitals. Endovascular aneurysm repair is associated with reduced mortality and complications across all age groups. Efforts to reduce mortality from RAAA should concentrate on reducing NO and OSR in patients who are suitable for EVAR. Copyright © 2013 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
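A hedged sketch of how such adjusted odds ratios and confidence intervals are typically obtained from a logistic regression; the data are simulated rather than NIS records, and the covariate set is illustrative.

```python
# Sketch: logistic regression odds ratios for mortality, EVAR vs. OSR,
# adjusted for age. Data are simulated, not NIS records.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 2000
evar = rng.integers(0, 2, n)                    # 1 = EVAR, 0 = open repair
age = rng.normal(75, 8, n)
logit = -1.0 - 0.6 * evar + 0.03 * (age - 75)   # assumed protective EVAR effect
died = rng.uniform(size=n) < 1 / (1 + np.exp(-logit))
df = pd.DataFrame({"died": died.astype(int), "evar": evar, "age": age})

fit = smf.logit("died ~ evar + age", df).fit(disp=0)
or_ci = np.exp(pd.concat([fit.params, fit.conf_int()], axis=1))
or_ci.columns = ["OR", "2.5%", "97.5%"]
print(or_ci.loc[["evar"]])
```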
Waltemeyer, Scott D.
2008-01-01
Estimates of the magnitude and frequency of peak discharges are necessary for the reliable design of bridges and culverts, for open-channel hydraulic analysis, and for flood-hazard mapping in New Mexico and surrounding areas. The U.S. Geological Survey, in cooperation with the New Mexico Department of Transportation, updated estimates of peak-discharge magnitude for gaging stations in the region and updated regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites by use of data collected through 2004 for 293 gaging stations on unregulated streams that have 10 or more years of record. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to frequency analysis of 140 of the 293 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated peak discharges having a recurrence interval of less than 1.4 years from the probability-density function. Within each of the nine regions, logarithms of the maximum peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics by using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, then were applied to the same data used in the ordinary least-squares regression analyses. The average standard error of prediction, which includes average sampling error and average standard error of regression, ranged from 38 to 93 percent (mean value is 62, and median value is 59) for the 100-year flood. The standard error of prediction from the 1996 investigation ranged from 41 to 96 percent (mean value is 67, and median value is 68) for the 100-year flood, which was analyzed by using generalized least-squares regression analysis. Overall, the equations based on generalized least-squares regression techniques are more reliable than those in the 1996 report because of the increased length of record and an improved geographic information system (GIS) method to determine basin and climatic characteristics. Flood-frequency estimates can be made for ungaged sites upstream or downstream from gaging stations by using a method that transfers flood-frequency data at the gaging station to the ungaged site through a drainage-area ratio adjustment equation. The peak discharge for a given recurrence interval at the gaging station, the drainage-area ratio, and the drainage-area exponent from the regional regression equation of the respective region are used to transfer the peak discharge for the recurrence interval to the ungaged site. Maximum observed peak discharge as related to drainage area was determined for New Mexico. Extreme events are commonly used in the design and appraisal of bridge crossings and other structures. Bridge-scour evaluations are commonly made by using the 500-year peak discharge for these appraisals.
Peak-discharge data collected at 293 gaging stations and 367 miscellaneous sites were used to develop a maximum peak-discharge relation as an alternative method of estimating peak discharge of an extreme event such as a maximum probable flood.
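The drainage-area ratio transfer described above amounts to a simple power-law scaling; a sketch follows, with the discharge, areas, and exponent all invented for illustration (in practice the exponent comes from the regional regression equation for the region in question).

```python
# Sketch: transferring a gaged peak-discharge estimate to an ungaged site
# on the same stream via a drainage-area ratio adjustment.
def transfer_peak(q_gaged, area_gaged, area_ungaged, exponent):
    """Q_ungaged = Q_gaged * (A_ungaged / A_gaged) ** b,
    where b is the drainage-area exponent from the regional regression."""
    return q_gaged * (area_ungaged / area_gaged) ** exponent

# Illustrative values only (not from the report):
q100_gaged = 12000.0   # 100-year peak discharge at the gage, cfs
print(transfer_peak(q100_gaged, area_gaged=250.0, area_ungaged=180.0, exponent=0.55))
```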
Classical Testing in Functional Linear Models.
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.
Jung, Julia; Nitzsche, Anika; Ernstmann, Nicole; Driller, Elke; Wasem, Jürgen; Stieler-Lorenz, Brigitte; Pfaff, Holger
2011-03-01
This study examines the association between perceived social capital and health promotion willingness (HPW) of companies from a chief executive officer's perspective. Data for the cross-sectional study were collected through telephone interviews with one chief executive officer from randomly selected companies within the German information and communication technology sector. A hierarchical multivariate logistic regression analysis was performed. Results of the logistic regression analysis of data from a total of n = 522 interviews suggest that higher values of perceived social capital are associated with pronounced HPW in companies (odds ratio = 3.78; 95% confidence intervals, 2.24 to 6.37). Our findings suggest that characteristics of high social capital, such as an established environment of trust as well as a feeling of common values and convictions could help promote HPW.
Klimek, Ludger; Schumacher, Helmut; Schütt, Tanja; Gräter, Heidemarie; Mueck, Tobias; Michel, Martin C
2017-02-01
The aim of this study was to explore factors affecting efficacy of treatment of common cold symptoms with an over-the-counter ibuprofen/pseudoephedrine combination product. Data from an anonymous survey among 1770 pharmacy customers purchasing the combination product for treatment of own common cold symptoms underwent post-hoc descriptive analysis. Scores of symptoms typically responsive to ibuprofen (headache, pharyngeal pain, joint pain and fever), typically responsive to pseudoephedrine (congested nose, congested sinus and runny nose), considered non-specific (sneezing, fatigue, dry cough, cough with expectoration) and comprising all 11 symptoms were analysed. Multiple regression analysis was applied to explore factors associated with greater reduction in symptom intensity or greater probability of experiencing a symptom reduction of at least 50%. After intake of first dose of medication, typically ibuprofen-sensitive, pseudoephedrine-responsive, non-specific and total symptoms were reduced by 60.0%, 46.3%, 45.4% and 52.8%, respectively. A symptom reduction of at least 50% was reported by 73.6%, 55.1%, 50.9% and 61.6% of participants, respectively. A high baseline score was associated with greater reductions in symptom scores but smaller probability of achieving an improvement of at least 50%. Across both multiple regression approaches, two tablets at first dosing were more effective than one and (except for ibuprofen-sensitive symptoms) starting treatment later than day 2 of the cold was generally less effective. Efficacy of an ibuprofen/pseudoephedrine combination in the treatment of common cold symptoms was dose-dependent and greatest when treatment started within the first 2 days after onset of symptoms. © 2016 The Authors. International Journal of Clinical Practice Published by John Wiley & Sons Ltd.
Nationwide Multicenter Reference Interval Study for 28 Common Biochemical Analytes in China.
Xia, Liangyu; Chen, Ming; Liu, Min; Tao, Zhihua; Li, Shijun; Wang, Liang; Cheng, Xinqi; Qin, Xuzhen; Han, Jianhua; Li, Pengchang; Hou, Li'an; Yu, Songlin; Ichihara, Kiyoshi; Qiu, Ling
2016-03-01
A nationwide multicenter study was conducted in China to explore sources of variation of reference values and establish reference intervals for 28 common biochemical analytes, as a part of the International Federation of Clinical Chemistry and Laboratory Medicine, Committee on Reference Intervals and Decision Limits (IFCC/C-RIDL) global study on reference values. A total of 3148 apparently healthy volunteers were recruited in 6 cities covering a wide area in China. Blood samples were tested in 2 central laboratories using Beckman Coulter AU5800 chemistry analyzers. Certified reference materials and a value-assigned serum panel were used for standardization of test results. Multiple regression analysis was performed to explore sources of variation. Need for partition of reference intervals was evaluated based on 3-level nested ANOVA. After secondary exclusion using the latent abnormal values exclusion method, reference intervals were derived by a parametric method using the modified Box-Cox formula. Test results of 20 analytes were made traceable to reference measurement procedures. By the ANOVA, significant sex-related and age-related differences were observed in 12 and 12 analytes, respectively. A small regional difference was observed in the results for albumin, glucose, and sodium. Multiple regression analysis revealed BMI-related changes in results of 9 analytes for men and 6 for women. Reference intervals of 28 analytes were computed with 17 analytes partitioned by sex and/or age. In conclusion, reference intervals of 28 common chemistry analytes applicable to the Chinese Han population were established by use of the latest methodology. Reference intervals of 20 analytes traceable to reference measurement procedures can be used as common reference intervals, whereas others can be used as the assay system-specific reference intervals in China.
Reference-Free Removal of EEG-fMRI Ballistocardiogram Artifacts with Harmonic Regression
Krishnaswamy, Pavitra; Bonmassar, Giorgio; Poulsen, Catherine; Pierce, Eric T; Purdon, Patrick L.; Brown, Emery N.
2016-01-01
Combining electroencephalogram (EEG) recording and functional magnetic resonance imaging (fMRI) offers the potential for imaging brain activity with high spatial and temporal resolution. This potential remains limited by the significant ballistocardiogram (BCG) artifacts induced in the EEG by cardiac pulsation-related head movement within the magnetic field. We model the BCG artifact using a harmonic basis, pose the artifact removal problem as a local harmonic regression analysis, and develop an efficient maximum likelihood algorithm to estimate and remove BCG artifacts. Our analysis paradigm accounts for time-frequency overlap between the BCG artifacts and neurophysiologic EEG signals, and tracks the spatiotemporal variations in both the artifact and the signal. We evaluate performance on: simulated oscillatory and evoked responses constructed with realistic artifacts; actual anesthesia-induced oscillatory recordings; and actual visual evoked potential recordings. In each case, the local harmonic regression analysis effectively removes the BCG artifacts, and recovers the neurophysiologic EEG signals. We further show that our algorithm outperforms commonly used reference-based and component analysis techniques, particularly in low SNR conditions, the presence of significant time-frequency overlap between the artifact and the signal, and/or large spatiotemporal variations in the BCG. Because our algorithm does not require reference signals and has low computational complexity, it offers a practical tool for removing BCG artifacts from EEG data recorded in combination with fMRI. PMID:26151100
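The core regression step can be sketched as ordinary least squares on a cosine/sine basis at the cardiac fundamental and its harmonics within a short window; the published method is a full maximum likelihood estimator with spatiotemporal tracking, which this simulated single-channel example does not reproduce.

```python
# Sketch: remove a quasi-periodic artifact from one EEG channel by local
# harmonic regression (least squares on sines/cosines at cardiac harmonics).
# Simulated data; the published method is a full ML estimator with tracking.
import numpy as np

fs = 250.0                                  # sampling rate, Hz (assumed)
t = np.arange(0, 4.0, 1 / fs)               # one 4-s analysis window
f0 = 1.1                                    # cardiac fundamental, Hz (assumed known)
eeg = 2.0 * np.sin(2 * np.pi * 10 * t)      # "neural" 10 Hz oscillation
bcg = 8 * np.cos(2 * np.pi * f0 * t) + 4 * np.sin(2 * np.pi * 2 * f0 * t)
y = eeg + bcg + np.random.default_rng(3).normal(scale=0.5, size=t.size)

# Harmonic design matrix: cos/sin pairs at the first K cardiac harmonics.
K = 4
cols = [np.ones_like(t)]
for k in range(1, K + 1):
    cols += [np.cos(2 * np.pi * k * f0 * t), np.sin(2 * np.pi * k * f0 * t)]
X = np.column_stack(cols)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
cleaned = y - X @ beta                       # residual ~ neural signal + noise
print("artifact power removed:", round(float(np.var(X @ beta)), 2))
```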
Predictive equations for the estimation of body size in seals and sea lions (Carnivora: Pinnipedia)
Churchill, Morgan; Clementz, Mark T; Kohno, Naoki
2014-01-01
Body size plays an important role in pinniped ecology and life history. However, body size data is often absent for historical, archaeological, and fossil specimens. To estimate the body size of pinnipeds (seals, sea lions, and walruses) for today and the past, we used 14 commonly preserved cranial measurements to develop sets of single variable and multivariate predictive equations for pinniped body mass and total length. Principal components analysis (PCA) was used to test whether separate family specific regressions were more appropriate than single predictive equations for Pinnipedia. The influence of phylogeny was tested with phylogenetic independent contrasts (PIC). The accuracy of these regressions was then assessed using a combination of coefficient of determination, percent prediction error, and standard error of estimation. Three different methods of multivariate analysis were examined: bidirectional stepwise model selection using Akaike information criteria; all-subsets model selection using Bayesian information criteria (BIC); and partial least squares regression. The PCA showed clear discrimination between Otariidae (fur seals and sea lions) and Phocidae (earless seals) for the 14 measurements, indicating the need for family-specific regression equations. The PIC analysis found that phylogeny had a minor influence on relationship between morphological variables and body size. The regressions for total length were more accurate than those for body mass, and equations specific to Otariidae were more accurate than those for Phocidae. Of the three multivariate methods, the all-subsets approach required the fewest number of variables to estimate body size accurately. We then used the single variable predictive equations and the all-subsets approach to estimate the body size of two recently extinct pinniped taxa, the Caribbean monk seal (Monachus tropicalis) and the Japanese sea lion (Zalophus japonicus). Body size estimates using single variable regressions generally under or over-estimated body size; however, the all-subset regression produced body size estimates that were close to historically recorded body length for these two species. This indicates that the all-subset regression equations developed in this study can estimate body size accurately. PMID:24916814
Bäck, Leif J J; Aro, Katri; Tapiovaara, Laura; Vikatmaa, Pirkka; de Bree, Remco; Fernández-Álvarez, Verónica; Kowalski, Luiz P; Nixon, Iain J; Rinaldo, Alessandra; Rodrigo, Juan P; Robbins, K Thomas; Silver, Carl E; Snyderman, Carl H; Suárez, Carlos; Takes, Robert P; Ferlito, Alfio
2018-06-01
Sacrifice and reconstruction of the carotid artery in cases of head and neck carcinoma with invasion of the common or internal carotid artery is debated. We conducted a systematic search of electronic databases and provide a review and meta-analysis. Of the 72 articles identified, 24 met the inclusion criteria resulting in the inclusion of 357 patients. The overall perioperative 30-day mortality was 3.6% (13/357). Permanent cerebrovascular complications occurred in 3.6% (13/357). Carotid blowout episodes were encountered in 1.4% (5/357). The meta-regression analysis showed a significant difference in 1-year overall survival between reports published from 1981-1999 (37.0%) and 2001-2016 (65.4%; P = .02). This review provides evidence that sacrifice with extracranial reconstruction of common or internal carotid artery in selected patients with head and neck carcinoma may improve survival with acceptable complication rates. However, all of the published literature is retrospective involving selected series and, therefore, precludes determining the absolute effectiveness of the surgery. © 2018 Wiley Periodicals, Inc.
On the equivalence of case-crossover and time series methods in environmental epidemiology.
Lu, Yun; Zeger, Scott L
2007-04-01
The case-crossover design was introduced in epidemiology 15 years ago as a method for studying the effects of a risk factor on a health event using only cases. The idea is to compare a case's exposure immediately prior to or during the case-defining event with that same person's exposure at otherwise similar "reference" times. An alternative approach to the analysis of daily exposure and case-only data is time series analysis. Here, log-linear regression models express the expected total number of events on each day as a function of the exposure level and potential confounding variables. In time series analyses of air pollution, smooth functions of time and weather are the main confounders. Time series and case-crossover methods are often viewed as competing methods. In this paper, we show that case-crossover using conditional logistic regression is a special case of time series analysis when there is a common exposure such as in air pollution studies. This equivalence provides computational convenience for case-crossover analyses and a better understanding of time series models. Time series log-linear regression accounts for overdispersion of the Poisson variance, while case-crossover analyses typically do not. This equivalence also permits model checking for case-crossover data using standard log-linear model diagnostics.
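The time-series side of this equivalence can be sketched as a Poisson log-linear regression of daily event counts on exposure plus simple seasonal terms; real analyses use richer smooth functions of time and weather, and the data here are simulated.

```python
# Sketch: time-series log-linear (Poisson) regression of daily event counts
# on an air-pollution exposure, with simple seasonal controls. Simulated data.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
days = np.arange(365 * 3)
pm = 20 + 10 * np.sin(2 * np.pi * days / 365) + rng.normal(scale=5, size=days.size)
mu = np.exp(1.5 + 0.008 * pm + 0.1 * np.sin(2 * np.pi * days / 365))
counts = rng.poisson(mu)
df = pd.DataFrame({"count": counts, "pm": pm,
                   "sin1": np.sin(2 * np.pi * days / 365),
                   "cos1": np.cos(2 * np.pi * days / 365)})

fit = smf.glm("count ~ pm + sin1 + cos1", df, family=sm.families.Poisson()).fit()
print("log rate ratio per unit PM:", round(fit.params["pm"], 4))
```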
Azimian, Jalil; Piran, Pegah; Jahanihashemi, Hassan; Dehghankar, Leila
2017-04-01
Pressures in nursing can affect family life, contribute to marital and common social problems, increase work-family conflicts, and endanger general health. The aim was to determine marital satisfaction and its relationship with the job stress and general health of nurses. This descriptive, cross-sectional study was done in 2015 in medical educational centers of Qazvin using the ENRICH marital satisfaction scale and General Health and Job Stress questionnaires completed by 123 nurses. Analysis was done in SPSS version 19 using descriptive and analytical statistics (Pearson correlation, t-test, ANOVA, Chi-square, regression line, multiple regression analysis). The findings showed that 64.4% of nurses had marital satisfaction. There was a significant relationship between age (p=0.03), job experience (p=0.01), age of spouse (p=0.01) and marital satisfaction. The results showed a significant relationship between marital satisfaction and general health (p<0.0001). Multiple regression analysis showed a significant relationship of depression (p=0.012) and anxiety (p=0.001) with marital satisfaction. Given nurses' high levels of job stress, impaired general health, and low marital satisfaction, running health promotion programs that address these dimensions can help support the work and family health of nurses.
Categorical Data Analysis Using a Skewed Weibull Regression Model
NASA Astrophysics Data System (ADS)
Caron, Renault; Sinha, Debajyoti; Dey, Dipak; Polpo, Adriano
2018-03-01
In this paper, we present a Weibull link (skewed) model for categorical response data arising from binomial as well as multinomial models. We show that, for such types of categorical data, the most commonly used models (logit, probit and complementary log-log) can be obtained as limiting cases. We further compare the proposed model with some other asymmetrical models. The Bayesian as well as frequentist estimation procedures for binomial and multinomial data responses are presented in detail. Two data sets are analyzed to show the efficiency of the proposed model.
Fleetwood, V A; Gross, K N; Alex, G C; Cortina, C S; Smolevitz, J B; Sarvepalli, S; Bakhsh, S R; Poirier, J; Myers, J A; Singer, M A; Orkin, B A
2017-03-01
Anastomotic leak (AL) increases costs and cancer recurrence. Studies show decreased AL with side-to-side stapled anastomosis (SSA), but none identify risk factors within SSAs. We hypothesized that stapler characteristics and closure technique of the common enterotomy affect AL rates. Retrospective review of bowel SSAs was performed. Data included stapler brand, staple line oversewing, and closure method (handsewn, HC; linear stapler [Barcelona technique], BT; transverse stapler, TX). Primary endpoint was AL. Statistical analysis included Fisher's test and logistic regression. 463 patients were identified, 58.5% BT, 21.2% HC, and 20.3% TX. Covidien staplers comprised 74.9%, Ethicon 18.1%. There were no differences between stapler types (Covidien 5.8%, Ethicon 6.0%). However, AL rates varied by common side closure (BT 3.7% vs. TX 10.6%, p = 0.017), remaining significant on multivariate analysis. Closure method of the common side impacts AL rates. Barcelona technique has fewer leaks than transverse stapled closure. Further prospective evaluation is recommended. Copyright © 2017. Published by Elsevier Inc.
Using Remote Sensing Data to Evaluate Surface Soil Properties in Alabama Ultisols
NASA Technical Reports Server (NTRS)
Sullivan, Dana G.; Shaw, Joey N.; Rickman, Doug; Mask, Paul L.; Luvall, Jeff
2005-01-01
Evaluation of surface soil properties via remote sensing could facilitate soil survey mapping, erosion prediction and allocation of agrochemicals for precision management. The objective of this study was to evaluate the relationship between soil spectral signature and surface soil properties in conventionally managed row crop systems. High-resolution RS data were acquired over bare fields in the Coastal Plain, Appalachian Plateau, and Ridge and Valley provinces of Alabama using the Airborne Terrestrial Applications Sensor multispectral scanner. Soils ranged from sandy Kandiudults to fine textured Rhodudults. Surface soil samples (0-1 cm) were collected from 163 sampling points for soil organic carbon, particle size distribution, and citrate dithionite extractable iron content. Surface roughness, soil water content, and crusting were also measured during sampling. Two methods of analysis were evaluated: 1) multiple linear regression using common spectral band ratios, and 2) partial least squares regression. Our data show that thermal infrared spectra are highly, linearly related to soil organic carbon, sand and clay content. Soil organic carbon content was the most difficult to quantify in these highly weathered systems, where soil organic carbon was generally less than 1.2%. Estimates of sand and clay content were best using partial least squares regression at the Valley site, explaining 42-59% of the variability. In the Coastal Plain, sandy surfaces prone to crusting limited estimates of sand and clay content via partial least squares and regression with common band ratios. Estimates of iron oxide content were a function of mineralogy and best accomplished using specific band ratios, with regression explaining 36-65% of the variability at the Valley and Coastal Plain sites, respectively.
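A minimal sketch of the partial least squares step, assuming simulated band reflectances and a simulated clay-content response; the number of bands, samples, and components is illustrative, not taken from the study.

```python
# Sketch: PLS regression of a soil property (e.g. clay content) on band reflectances.
# Simulated data; band count and component number are illustrative.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n_samples, n_bands = 163, 15
bands = rng.normal(size=(n_samples, n_bands))
clay = 20 + 4 * bands[:, 3] - 3 * bands[:, 10] + rng.normal(scale=2, size=n_samples)

pls = PLSRegression(n_components=3)
r2_cv = cross_val_score(pls, bands, clay, cv=5)   # default scorer for regressors is R^2
print("cross-validated R^2 by fold:", r2_cv.round(2))
```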
Data Analysis of Criteria Governing Selection of Active Guard/Reserve Colonel
2014-09-01
Figure 7 graphically depicts the marital status breakdown of the packets submitted (Married, Divorced, Single, Widowed) and compares them to the number of packets selected within each group. … logistic regression to examine the determining factors of poverty in Kenya. The study digs deeper than the three indicators commonly thought to…
Inverse odds ratio-weighted estimation for causal mediation analysis.
Tchetgen Tchetgen, Eric J
2013-11-20
An important scientific goal of studies in the health and social sciences is increasingly to determine to what extent the total effect of a point exposure is mediated by an intermediate variable on the causal pathway between the exposure and the outcome. A causal framework has recently been proposed for mediation analysis, which gives rise to new definitions, formal identification results and novel estimators of direct and indirect effects. In the present paper, the author describes a new inverse odds ratio-weighted approach to estimate so-called natural direct and indirect effects. The approach, which uses as a weight the inverse of an estimate of the odds ratio function relating the exposure and the mediator, is universal in that it can be used to decompose total effects in a number of regression models commonly used in practice. Specifically, the approach may be used for effect decomposition in generalized linear models with a nonlinear link function, and in a number of other commonly used models such as the Cox proportional hazards regression for a survival outcome. The approach is simple and can be implemented in standard software provided a weight can be specified for each observation. An additional advantage of the method is that it easily incorporates multiple mediators of a categorical, discrete or continuous nature. Copyright © 2013 John Wiley & Sons, Ltd.
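A hedged sketch of the inverse odds ratio-weighting idea on simulated data with a binary exposure, a continuous mediator, and a linear outcome model; the exact estimator, weight stabilization, and variance estimation (e.g., bootstrap) described in the paper are not reproduced here, and all names are illustrative.

```python
# Sketch of inverse odds ratio weighting (IORW) for mediation, on simulated
# data with binary exposure A, continuous mediator M, outcome Y, covariate C.
# Illustrative decomposition on the linear scale only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 5000
C = rng.normal(size=n)
A = (rng.uniform(size=n) < 1 / (1 + np.exp(-0.3 * C))).astype(int)
M = 0.8 * A + 0.2 * C + rng.normal(size=n)          # mediator affected by exposure
Y = 1.0 * A + 0.5 * M + 0.3 * C + rng.normal(size=n)
df = pd.DataFrame({"A": A, "M": M, "C": C, "Y": Y})

# Step 1: model the exposure given mediator and covariates.
exp_fit = smf.logit("A ~ M + C", df).fit(disp=0)
# Step 2: weight exposed subjects by the inverse of the exposure-mediator
# odds ratio function (reference level M = 0); unexposed subjects get weight 1.
df["w"] = np.where(df.A == 1, np.exp(-exp_fit.params["M"] * df.M), 1.0)

total = smf.ols("Y ~ A + C", df).fit().params["A"]
direct = smf.wls("Y ~ A + C", df, weights=df.w).fit().params["A"]
print(f"total={total:.2f}  natural direct={direct:.2f}  indirect={total - direct:.2f}")
```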
Epistasis analysis for quantitative traits by functional regression model.
Zhang, Futao; Boerwinkle, Eric; Xiong, Momiao
2014-06-01
The critical barrier in interaction analysis for rare variants is that most traditional statistical methods for testing interactions were originally designed for testing the interaction between common variants and are difficult to apply to rare variants because of their prohibitive computational time and poor ability. The great challenges for successful detection of interactions with next-generation sequencing (NGS) data are (1) lack of methods for interaction analysis with rare variants, (2) severe multiple testing, and (3) time-consuming computations. To meet these challenges, we shift the paradigm of interaction analysis between two loci to interaction analysis between two sets of loci or genomic regions and collectively test interactions between all possible pairs of SNPs within two genomic regions. In other words, we take a genome region as a basic unit of interaction analysis and use high-dimensional data reduction and functional data analysis techniques to develop a novel functional regression model to collectively test interactions between all possible pairs of single nucleotide polymorphisms (SNPs) within two genome regions. By intensive simulations, we demonstrate that the functional regression models for interaction analysis of the quantitative trait have the correct type 1 error rates and a much better ability to detect interactions than the current pairwise interaction analysis. The proposed method was applied to exome sequence data from the NHLBI's Exome Sequencing Project (ESP) and CHARGE-S study. We discovered 27 pairs of genes showing significant interactions after applying the Bonferroni correction (P-values < 4.58 × 10⁻¹⁰) in the ESP, and 11 were replicated in the CHARGE-S study. © 2014 Zhang et al.; Published by Cold Spring Harbor Laboratory Press.
Bayesian Unimodal Density Regression for Causal Inference
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2011-01-01
Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…
Bennett, Derrick A; Landry, Denise; Little, Julian; Minelli, Cosetta
2017-09-19
Several statistical approaches have been proposed to assess and correct for exposure measurement error. We aimed to provide a critical overview of the most common approaches used in nutritional epidemiology. MEDLINE, EMBASE, BIOSIS and CINAHL were searched for reports published in English up to May 2016 in order to ascertain studies that described methods aimed to quantify and/or correct for measurement error for a continuous exposure in nutritional epidemiology using a calibration study. We identified 126 studies, 43 of which described statistical methods and 83 that applied any of these methods to a real dataset. The statistical approaches in the eligible studies were grouped into: a) approaches to quantify the relationship between different dietary assessment instruments and "true intake", which were mostly based on correlation analysis and the method of triads; b) approaches to adjust point and interval estimates of diet-disease associations for measurement error, mostly based on regression calibration analysis and its extensions. Two approaches (multiple imputation and moment reconstruction) were identified that can deal with differential measurement error. For regression calibration, the most common approach to correct for measurement error used in nutritional epidemiology, it is crucial to ensure that its assumptions and requirements are fully met. Analyses that investigate the impact of departures from the classical measurement error model on regression calibration estimates can be helpful to researchers in interpreting their findings. With regard to the possible use of alternative methods when regression calibration is not appropriate, the choice of method should depend on the measurement error model assumed, the availability of suitable calibration study data and the potential for bias due to violation of the classical measurement error model assumptions. On the basis of this review, we provide some practical advice for the use of methods to assess and adjust for measurement error in nutritional epidemiology.
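A minimal sketch of regression calibration, the approach most often used in this setting, assuming a calibration substudy with a near-gold-standard reference instrument; the data are simulated, and in practice standard errors should additionally account for the calibration step (e.g., by bootstrap).

```python
# Sketch of regression calibration for a continuous dietary exposure measured
# with (classical) error. Simulated data for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n_main, n_cal = 5000, 500
true_main = rng.normal(size=n_main)
ffq_main = true_main + rng.normal(scale=1.0, size=n_main)      # error-prone measure
outcome = 0.5 * true_main + rng.normal(size=n_main)

# Calibration substudy: both the error-prone measure and a reference instrument.
true_cal = rng.normal(size=n_cal)
ffq_cal = true_cal + rng.normal(scale=1.0, size=n_cal)
ref_cal = true_cal + rng.normal(scale=0.2, size=n_cal)          # near-gold-standard

# Step 1: calibration model E[reference | error-prone measure].
cal_fit = sm.OLS(ref_cal, sm.add_constant(ffq_cal)).fit()
# Step 2: replace the exposure in the main study by its calibrated prediction.
ffq_calibrated = cal_fit.predict(sm.add_constant(ffq_main))

naive = sm.OLS(outcome, sm.add_constant(ffq_main)).fit().params[1]
corrected = sm.OLS(outcome, sm.add_constant(ffq_calibrated)).fit().params[1]
print(f"naive slope={naive:.2f}  calibrated slope={corrected:.2f}  (truth 0.5)")
```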
Biostatistics Series Module 6: Correlation and Linear Regression
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175
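A minimal sketch of the quantities described above, computed with SciPy on simulated paired data: Pearson's r, Spearman's rho, and the least-squares line y = a + bx.

```python
# Sketch: Pearson and Spearman correlation plus a least-squares line y = a + b*x.
# Simulated paired data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.normal(50, 10, size=100)
y = 2.0 + 0.8 * x + rng.normal(scale=5, size=100)

r, p_r = stats.pearsonr(x, y)
rho, p_rho = stats.spearmanr(x, y)
fit = stats.linregress(x, y)                 # slope b, intercept a, r, p, stderr

print(f"Pearson r = {r:.2f} (P = {p_r:.3g}), r^2 = {r**2:.2f}")
print(f"Spearman rho = {rho:.2f}")
print(f"least-squares line: y = {fit.intercept:.2f} + {fit.slope:.2f} x")
```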
Variable Selection for Regression Models of Percentile Flows
NASA Astrophysics Data System (ADS)
Fouad, G.
2017-12-01
Percentile flows describe the flow magnitude equaled or exceeded for a given percent of time, and are widely used in water resource management. However, these statistics are normally unavailable since most basins are ungauged. Percentile flows of ungauged basins are often predicted using regression models based on readily observable basin characteristics, such as mean elevation. The number of these independent variables is too large to evaluate all possible models. A subset of models is typically evaluated using automatic procedures, like stepwise regression. This ignores a large variety of methods from the field of feature (variable) selection and physical understanding of percentile flows. A study of 918 basins in the United States was conducted to compare an automatic regression procedure to the following variable selection methods: (1) principal component analysis, (2) correlation analysis, (3) random forests, (4) genetic programming, (5) Bayesian networks, and (6) physical understanding. The automatic regression procedure only performed better than principal component analysis. Poor performance of the regression procedure was due to a commonly used filter for multicollinearity, which rejected the strongest models because they had cross-correlated independent variables. Multicollinearity did not decrease model performance in validation because of a representative set of calibration basins. Variable selection methods based strictly on predictive power (numbers 2-5 from above) performed similarly, likely indicating a limit to the predictive power of the variables. Similar performance was also reached using variables selected based on physical understanding, a finding that substantiates recent calls to emphasize physical understanding in modeling for predictions in ungauged basins. The strongest variables highlighted the importance of geology and land cover, whereas widely used topographic variables were the weakest predictors. Variables suffered from a high degree of multicollinearity, possibly illustrating the co-evolution of climatic and physiographic conditions. Given the ineffectiveness of many variables used here, future work should develop new variables that target specific processes associated with percentile flows.
Sperm Retrieval in Patients with Klinefelter Syndrome: A Skewed Regression Model Analysis.
Chehrazi, Mohammad; Rahimiforoushani, Abbas; Sabbaghian, Marjan; Nourijelyani, Keramat; Sadighi Gilani, Mohammad Ali; Hoseini, Mostafa; Vesali, Samira; Yaseri, Mehdi; Alizadeh, Ahad; Mohammad, Kazem; Samani, Reza Omani
2017-01-01
The most common chromosomal abnormality found in non-obstructive azoospermia (NOA) is Klinefelter syndrome (KS), which occurs in 1-1.72 out of 500-1000 male infants. The probability of retrieving sperm as the outcome could be asymmetrically different between patients with and without KS, therefore logistic regression analysis is not a well-qualified test for this type of data. This study has been designed to evaluate skewed regression model analysis for data collected from microsurgical testicular sperm extraction (micro-TESE) among azoospermic patients with and without non-mosaic KS syndrome. This cohort study compared the micro-TESE outcome between 134 men with classic KS and 537 men with NOA and normal karyotype who were referred to Royan Institute between 2009 and 2011. In addition to our main outcome, which was sperm retrieval, we also used logistic and skewed regression analyses to compare the following demographic and hormonal factors: age, level of follicle stimulating hormone (FSH), luteinizing hormone (LH), and testosterone between the two groups. A comparison of the micro-TESE between the KS and control groups showed a success rate of 28.4% (38/134) for the KS group and 22.2% (119/537) for the control group. In the KS group, a significant difference (P<0.001) existed between testosterone levels in the successful sperm retrieval group (3.4 ± 0.48 mg/mL) and the unsuccessful sperm retrieval group (2.33 ± 0.23 mg/mL). The quasi-Akaike information criterion (QAIC) goodness-of-fit index was 74 for the skewed model, lower than that of the logistic regression (QAIC=85). According to the results, skewed regression is more efficient in estimating sperm retrieval success when the data from patients with KS are analyzed. This finding should be investigated by conducting additional studies with different data structures.
Xu, Yun; Muhamadali, Howbeer; Sayqal, Ali; Dixon, Neil; Goodacre, Royston
2016-10-28
Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a "pure" regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding.
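The designed-Y idea can be sketched with scikit-learn's PLSRegression by stacking one target column per experimental factor; the coding below (a continuous dose column plus a ±1 group column) is an illustrative assumption, not the exact scheme proposed in the paper, and the "metabolite" matrix is simulated.

```python
# Sketch: PLS with a designed, multi-column Y block that encodes two experimental
# factors (a continuous level and a two-level group) instead of a single
# regression target or binary class membership. Data and coding are illustrative.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(9)
n, p = 48, 200
dose = np.repeat([0.0, 0.5, 1.0, 2.0], 12)     # continuous experimental factor
group = np.tile([0, 1], 24)                    # two-level factor (e.g. strain)
X = rng.normal(size=(n, p))                    # simulated "metabolite" matrix
X[:, :5] += dose[:, None]                      # a few features respond to dose
X[:, 5:10] += group[:, None]                   # a few features respond to group

# Hybrid target: one column per factor; dose kept continuous, group coded +/-1.
Y = np.column_stack([dose, 2 * group - 1])

pls = PLSRegression(n_components=4).fit(X, Y)
print("overall R^2 on the designed Y block:", round(pls.score(X, Y), 2))
```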
Zhang, Weihong; Xin, Linlin; Lu, Ying
2017-01-01
Background: Emerging data have established links between systemic metabolic dysfunction, such as diabetes and metabolic syndrome (MetS), and neurocognitive impairment, including dementia. The common gene signature and the associated signaling pathways of MetS, diabetes, and dementia have not been widely studied. Material/Methods: We exploited the translational bioinformatics approach to choose the common gene signatures for both dementia and MetS. For this we employed the “DisGeNET discovery platform”. Results: Gene mining analysis revealed a total of 173 genes (86 genes common to all three diseases), which comprised 43% of the total genes associated with dementia. The gene enrichment analysis showed that these genes were involved in dysregulation in the neurological system (23.2%) and the central nervous system (20.8%) phenotype processes. The network analysis revealed APOE, APP, PARK2, CEPBP, PARP1, MT-CO2, CXCR4, IGFIR, CCR5, and PIK3CD as important nodes with significant interacting partners. The meta-regression analysis showed a modest association of APOE with dementia and metabolic complications. The directionality of effects of the variants on Alzheimer disease is generally consistent with previous observations and did not differ by race/ethnicity (p>0.05), although our study had low power for this test. Conclusions: Our novel approach showed APOE as a common gene signature with a link to dementia, MetS, and diabetes. Future gene association studies should focus on the association of gene polymorphisms with multiple disease models to identify novel putative drug targets. PMID:29229897
Lipiäinen, Tiina; Pessi, Jenni; Movahedi, Parisa; Koivistoinen, Juha; Kurki, Lauri; Tenhunen, Mari; Yliruusi, Jouko; Juppo, Anne M; Heikkonen, Jukka; Pahikkala, Tapio; Strachan, Clare J
2018-04-03
Raman spectroscopy is widely used for quantitative pharmaceutical analysis, but a common obstacle to its use is sample fluorescence masking the Raman signal. Time-gating provides an instrument-based method for rejecting fluorescence through temporal resolution of the spectral signal and allows Raman spectra of fluorescent materials to be obtained. An additional practical advantage is that analysis is possible in ambient lighting. This study assesses the efficacy of time-gated Raman spectroscopy for the quantitative measurement of fluorescent pharmaceuticals. Time-gated Raman spectroscopy with a 128 × (2) × 4 CMOS SPAD detector was applied for quantitative analysis of ternary mixtures of solid-state forms of the model drug, piroxicam (PRX). Partial least-squares (PLS) regression allowed quantification, with Raman-active time domain selection (based on visual inspection) improving performance. Model performance was further improved by using kernel-based regularized least-squares (RLS) regression with greedy feature selection in which the data use in both the Raman shift and time dimensions was statistically optimized. Overall, time-gated Raman spectroscopy, especially with optimized data analysis in both the spectral and time dimensions, shows potential for sensitive and relatively routine quantitative analysis of photoluminescent pharmaceuticals during drug development and manufacturing.
Bartlett, Jonathan W; Keogh, Ruth H
2018-06-01
Bayesian approaches for handling covariate measurement error are well established and yet arguably are still relatively little used by researchers. For some this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper, we first give an overview of the Bayesian approach to handling covariate measurement error, and contrast it with regression calibration, arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to regression calibration and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next, we describe the closely related maximum likelihood and multiple imputation approaches and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of regression calibration and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey.
Template based rotation: A method for functional connectivity analysis with a priori templates☆
Schultz, Aaron P.; Chhatwal, Jasmeer P.; Huijbers, Willem; Hedden, Trey; van Dijk, Koene R.A.; McLaren, Donald G.; Ward, Andrew M.; Wigman, Sarah; Sperling, Reisa A.
2014-01-01
Functional connectivity magnetic resonance imaging (fcMRI) is a powerful tool for understanding the network level organization of the brain in research settings and is increasingly being used to study large-scale neuronal network degeneration in clinical trial settings. Presently, a variety of techniques, including seed-based correlation analysis and group independent components analysis (with either dual regression or back projection) are commonly employed to compute functional connectivity metrics. In the present report, we introduce template based rotation, a novel analytic approach optimized for use with a priori network parcellations, which may be particularly useful in clinical trial settings. Template based rotation was designed to leverage the stable spatial patterns of intrinsic connectivity derived from out-of-sample datasets by mapping data from novel sessions onto the previously defined a priori templates. We first demonstrate the feasibility of using previously defined a priori templates in connectivity analyses, and then compare the performance of template based rotation to seed based and dual regression methods by applying these analytic approaches to an fMRI dataset of normal young and elderly subjects. We observed that template based rotation and dual regression are approximately equivalent in detecting fcMRI differences between young and old subjects, demonstrating similar effect sizes for group differences and similar reliability metrics across 12 cortical networks. Both template based rotation and dual regression demonstrated larger effect sizes and comparable reliabilities as compared to seed based correlation analysis, though all three methods yielded similar patterns of network differences. When performing inter-network and sub-network connectivity analyses, we observed that template based rotation offered greater flexibility, larger group differences, and more stable connectivity estimates as compared to dual regression and seed based analyses. This flexibility owes to the reduced spatial and temporal orthogonality constraints of template based rotation as compared to dual regression. These results suggest that template based rotation can provide a useful alternative to existing fcMRI analytic methods, particularly in clinical trial settings where predefined outcome measures and conserved network descriptions across groups are at a premium. PMID:25150630
Supporting Regularized Logistic Regression Privately and Efficiently.
Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei
2016-01-01
As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nevertheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.
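For orientation, the non-private building block referred to above, L2-regularized logistic regression, takes only a few lines to fit locally; the sketch below uses scikit-learn on synthetic data and makes no attempt to reproduce the distributed, cryptographically protected protocol proposed in the paper.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 10))                      # synthetic individual-level features
    beta = np.zeros(10)
    beta[:3] = [1.5, -2.0, 1.0]                         # assumed true effects
    y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta))))

    # L2 (ridge) penalty; C is the inverse regularization strength
    model = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X, y)
    print(model.coef_.round(2))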
Forecasting Container Throughput at the Doraleh Port in Djibouti through Time Series Analysis
NASA Astrophysics Data System (ADS)
Mohamed Ismael, Hawa; Vandyck, George Kobina
The Doraleh Container Terminal (DCT) located in Djibouti has been noted as the most technologically advanced container terminal on the African continent. DCT's strategic location at the crossroads of the main shipping lanes connecting Asia, Africa and Europe puts it in a unique position to provide important shipping services to vessels plying that route. This paper aims to forecast container throughput at the Doraleh Container Terminal in Djibouti through time series analysis. A selection of univariate forecasting models was used, namely the Triple Exponential Smoothing Model, the Grey Model and the Linear Regression Model. These three models, and a combination of them, were used to forecast container throughput at the Doraleh port. The forecasting results of the three models and the combination forecast were then compared using the commonly used evaluation criteria Mean Absolute Deviation (MAD) and Mean Absolute Percentage Error (MAPE). The study found that the Linear Regression Model was the best method for forecasting container throughput, since its forecast error was the smallest. Based on the regression model, a ten (10) year forecast for container throughput at DCT has been made.
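As a rough illustration of the linear regression forecast and the MAD/MAPE evaluation criteria described above, the sketch below fits a trend line to an invented annual throughput series and extrapolates it ten years ahead; the figures are placeholders, not the Doraleh data.

    import numpy as np

    # Hypothetical annual container throughput (thousand TEUs); not the actual DCT figures
    years = np.arange(2009, 2017)
    teu = np.array([300.0, 340.0, 360.0, 410.0, 450.0, 470.0, 520.0, 560.0])

    train_y, test_y = years[:6], years[6:]
    train_t, test_t = teu[:6], teu[6:]

    coef = np.polyfit(train_y, train_t, 1)                  # linear trend: throughput = b*year + a
    pred = np.polyval(coef, test_y)

    mad = np.mean(np.abs(test_t - pred))                    # Mean Absolute Deviation
    mape = np.mean(np.abs((test_t - pred) / test_t)) * 100  # Mean Absolute Percentage Error
    print(f"MAD={mad:.1f}, MAPE={mape:.2f}%")

    print(np.round(np.polyval(coef, np.arange(2017, 2027))))  # ten-year extrapolation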
A PDE approach for quantifying and visualizing tumor progression and regression
NASA Astrophysics Data System (ADS)
Sintay, Benjamin J.; Bourland, J. Daniel
2009-02-01
Quantification of changes in tumor shape and size allows physicians to determine the effectiveness of various treatment options, adapt treatment, predict outcome, and map potential problem sites. Conventional methods are often based on metrics such as volume, diameter, or maximum cross-sectional area. This work seeks to improve the visualization and analysis of tumor changes by simultaneously analyzing changes in the entire tumor volume. This method utilizes an elliptic partial differential equation (PDE) to provide a roadmap of boundary displacement that does not suffer from the discontinuities associated with other measures such as Euclidean distance. Streamline pathways defined by Laplace's equation (a commonly used PDE) are used to track tumor progression and regression at the tumor boundary. Laplace's equation is particularly useful because it provides a smooth, continuous solution that can be evaluated with sub-pixel precision on variable grid sizes. Several metrics are demonstrated including maximum, average, and total regression and progression. This method provides many advantages over conventional means of quantifying change in tumor shape because it is observer independent, stable for highly unusual geometries, and provides an analysis of the entire three-dimensional tumor volume.
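A minimal numerical sketch of the idea: Laplace's equation is solved on the region between two boundaries (idealized here as nested circles rather than segmented tumor contours), with the potential fixed to 0 on the inner contour and 1 on the outer one; the gradient of the solution then defines the streamline directions used to measure boundary displacement.

    import numpy as np

    n = 160
    y, x = np.mgrid[0:n, 0:n]
    r = np.hypot(x - n / 2, y - n / 2)

    inner = r <= 30          # toy stand-in for the baseline tumor boundary
    outer = r >= 60          # toy stand-in for the follow-up boundary
    between = ~inner & ~outer

    u = np.zeros((n, n))
    u[outer] = 1.0           # Dirichlet boundary conditions: 0 inside, 1 outside

    # Jacobi relaxation of Laplace's equation on the region between the contours
    for _ in range(3000):
        u_avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                        np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u[between] = u_avg[between]      # boundary values stay fixed

    gy, gx = np.gradient(u)              # streamlines follow the gradient of u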
Comparison of Survival Models for Analyzing Prognostic Factors in Gastric Cancer Patients
Habibi, Danial; Rafiei, Mohammad; Chehrei, Ali; Shayan, Zahra; Tafaqodi, Soheil
2018-03-27
Objective: There are a number of models for determining risk factors for survival of patients with gastric cancer. This study was conducted to select the model showing the best fit with available data. Methods: Cox regression and parametric models (Exponential, Weibull, Gompertz, Log normal, Log logistic and Generalized Gamma) were utilized in unadjusted and adjusted forms to detect factors influencing mortality of patients. Comparisons were made with the Akaike Information Criterion (AIC) using Stata 13 and R 3.1.3 software. Results: The results of this study indicated that all parametric models outperform the Cox regression model. The Log normal, Log logistic and Generalized Gamma models provided the best performance in terms of AIC values (179.2, 179.4 and 181.1, respectively). On unadjusted analysis, the results of the Cox regression and parametric models indicated stage, grade, largest diameter of metastatic nest, largest diameter of LM, number of involved lymph nodes and the largest ratio of metastatic nests to lymph nodes to be variables influencing the survival of patients with gastric cancer. On adjusted analysis, according to the best model (log normal), grade was found to be the significant variable. Conclusion: The results suggested that all parametric models outperform the Cox model. The log normal model provides the best fit and is a good substitute for Cox regression.
What kind of sexual dysfunction is most common among overweight and obese women in reproductive age?
Rabiepoor, S; Khalkhali, H R; Sadeghi, E
2017-03-01
The aim of this study was to investigate the association between body mass index (BMI) and sexual health and determine what kind of sexual dysfunction is most common among overweight and obese women of reproductive age from Iran. A cross-sectional descriptive design was adopted. Data from 198 women who were referred to health centers in Iran during 2014-2015 were collected through convenience sampling. Data were collected using a demographic questionnaire and female sexual function and sexual satisfaction indexes. Participants' heights and weights were recorded in centimeters and kilograms. Data were analyzed using descriptive statistics, one-way analysis of variance, logistic regression analysis and χ2 tests. P-values <0.05 were considered significant. The mean age of the women was 29.89±7.01 years, and ages ranged from 17 to 45 years. Overall, 85.9% of the participants had sexual dysfunction, and 69.7% reported dissatisfaction or low satisfaction. Orgasm dysfunction was the most frequent problem, whereas desire dysfunction and pain dysfunction were the least frequent among overweight and obese women, respectively. Using logistic regression analysis, we showed that BMI affected sexual satisfaction, but there was no significant association between BMI and sexual function. This article concludes that all women, especially women with overweight and obesity, should be counseled about health outcomes related to sexual activity.
Sreeramareddy, Chandrashekhar T; Panduru, Kishore V; Verma, Sharat C; Joshi, Hari S; Bates, Michael N
2008-01-24
Studies from developed countries have reported on host-related risk factors for extra-pulmonary tuberculosis (EPTB). However, similar studies from high-burden countries like Nepal are lacking. Therefore, we carried out this study to compare demographic, life-style and clinical characteristics between EPTB and PTB patients. A retrospective analysis was carried out on 474 Tuberculosis (TB) patients diagnosed in a tertiary care hospital in western Nepal. Characteristics of demography, life-style and clinical features were obtained from medical case records. Risk factors for being an EPTB patient relative to a PTB patient were identified using logistic regression analysis. The age distribution of the TB patients had a bimodal distribution. The male to female ratio for PTB was 2.29. EPTB was more common at younger ages (< 25 years) and in females. Common sites for EPTB were lymph nodes (42.6%) and peritoneum and/or intestines (14.8%). By logistic regression analysis, age less than 25 years (OR 2.11 95% CI 1.12-3.68) and female gender (OR 1.69, 95% CI 1.12-2.56) were associated with EPTB. Smoking, use of immunosuppressive drugs/steroids, diabetes and past history of TB were more likely to be associated with PTB. Results suggest that younger age and female gender may be independent risk factors for EPTB in a high-burden country like Nepal. TB control programmes may target young and female populations for EPTB case-finding. Further studies are necessary in other high-burden countries to confirm our findings.
Wang, Haiyong; Zhang, Chenyue; Zhang, Jingze; Kong, Li; Zhu, Hui; Yu, Jinming
2017-04-18
Studies on the prognosis of different metastasis patterns in patients with different breast cancer subtypes (BCS) are limited. Therefore, we identified 7862 breast cancer patients with distant metastasis from 2010 to 2013 using Surveillance, Epidemiology, and End Results (SEER) population-based data. The results showed that bone was the most common and brain the least common metastatic site; patients with HR+/HER2- disease had the highest proportion of metastasis, while the lowest proportion was found in HR-/HER2+ patients. Univariate and multivariate logistic regression analyses were used to analyze the association, and significant differences in distant metastasis patterns were found among patients with different BCS (different P values). Importantly, univariate and multivariate Cox regression analyses were used to analyze the prognosis. It was shown that only bone metastasis was not a prognostic factor in the HR+/HER2-, HR+/HER2+ and HR-/HER2+ subgroups (all P > 0.05), and patients with brain metastasis had the worst cancer specific survival (CSS) in all the subgroups of BCS (all P < 0.01). Interestingly, for patients with two metastatic sites, those with bone and lung metastasis had the best CSS in the HR+/HER2- (P < 0.001) and HR+/HER2+ subgroups (P = 0.009). However, for patients with three and four metastatic sites, there was no statistical difference in their CSS (all P > 0.05).
A matching framework to improve causal inference in interrupted time-series analysis.
Linden, Ariel
2018-04-01
Interrupted time-series analysis (ITSA) is a popular evaluation methodology in which a single treatment unit's outcome is studied over time and the intervention is expected to "interrupt" the level and/or trend of the outcome, subsequent to its introduction. When ITSA is implemented without a comparison group, the internal validity may be quite poor. Therefore, adding a comparable control group to serve as the counterfactual is always preferred. This paper introduces a novel matching framework, ITSAMATCH, that creates a comparable control group by matching directly on covariates and then uses these matches in the outcome model. We evaluate the effect of California's Proposition 99 (passed in 1988) in reducing cigarette sales, by comparing California to other states not exposed to smoking reduction initiatives. We compare ITSAMATCH results to 2 commonly used approaches, synthetic controls (SYNTH) and regression adjustment; SYNTH reweights nontreated units to make them comparable to the treated unit, and regression adjusts for covariates directly. Methods are compared by assessing covariate balance and treatment effects. Both ITSAMATCH and SYNTH achieved covariate balance and estimated similar treatment effects. The regression model found no treatment effect and produced inconsistent covariate adjustment. While the matching framework achieved results comparable to SYNTH, it has the advantage of being technically less complicated and produces statistical estimates that are straightforward to interpret. Conversely, regression adjustment may "adjust away" a treatment effect. Given its advantages, ITSAMATCH should be considered as a primary approach for evaluating treatment effects in multiple-group time-series analysis.
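The interruption model itself is commonly fit as a segmented regression of the outcome on time, an intervention indicator, and time since the intervention; the sketch below shows that generic single-group ITSA specification with statsmodels on synthetic data, and does not reproduce ITSAMATCH or the Proposition 99 analysis.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    t = np.arange(1, 41)                             # 40 time points, intervention at t = 21
    post = (t >= 21).astype(int)
    y = 50 - 0.3 * t - 5 * post - 0.8 * post * (t - 20) + rng.normal(0, 2, t.size)

    df = pd.DataFrame({"y": y, "t": t, "post": post, "t_since": post * (t - 20)})

    # Coefficient on post = level change; coefficient on t_since = trend change
    model = smf.ols("y ~ t + post + t_since", data=df).fit()
    print(model.params)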
NASA Astrophysics Data System (ADS)
Hasan, Haliza; Ahmad, Sanizah; Osman, Balkish Mohd; Sapri, Shamsiah; Othman, Nadirah
2017-08-01
In regression analysis, missing covariate data are a common problem. Many researchers use ad hoc methods to overcome this problem due to the ease of implementation. However, these methods require assumptions about the data that rarely hold in practice. Model-based methods such as Maximum Likelihood (ML) using the expectation maximization (EM) algorithm and Multiple Imputation (MI) are more promising when dealing with difficulties caused by missing data. Even so, inappropriate missing-value imputation methods can lead to serious bias that severely affects the parameter estimates. The main objective of this study is to provide a better understanding of missing data concepts to assist researchers in selecting appropriate missing data imputation methods. A simulation study was performed to assess the effects of different missing data techniques on the performance of a regression model. The covariate data were generated using an underlying multivariate normal distribution, and the dependent variable was generated as a combination of explanatory variables. Missing values in a covariate were simulated using a mechanism called missing at random (MAR). Four levels of missingness (10%, 20%, 30% and 40%) were imposed. ML and MI techniques available within SAS software were investigated. A linear regression model was fitted, and the model performance measures, MSE and R-squared, were obtained. Results of the analysis showed that MI is superior in handling missing data, with the highest R-squared and lowest MSE, when the percentage of missingness is less than 30%. Neither method handled levels of missingness above 30% well.
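As a hedged sketch of the multiple imputation step (not the SAS workflow used in the study), the code below imposes missingness at random on one covariate, draws several completed data sets with scikit-learn's IterativeImputer, and pools the regression coefficients across imputations; all values are simulated.

    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(3)
    n = 500
    cov = np.array([[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]])
    X = rng.multivariate_normal([0, 0, 0], cov, n)
    y = 1 + X @ np.array([0.5, 1.0, -0.7]) + rng.normal(0, 1, n)

    # MAR missingness on the first covariate, driven by the second (roughly 30% missing)
    X_miss = X.copy()
    miss = rng.random(n) < 0.2 * (1 + (X[:, 1] > 0))
    X_miss[miss, 0] = np.nan

    # Multiple imputation: m completed data sets, pooled point estimates
    m, coefs = 20, []
    for k in range(m):
        imp = IterativeImputer(sample_posterior=True, random_state=k)
        coefs.append(LinearRegression().fit(imp.fit_transform(X_miss), y).coef_)
    print(np.mean(coefs, axis=0))        # compare with the true coefficients (0.5, 1.0, -0.7)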
Donovan, Heidi S; Hagan, Teresa L; Campbell, Grace B; Boisen, Michelle M; Rosenblum, Leah M; Edwards, Robert P; Bovbjerg, Dana H; Horn, Charles C
2016-06-01
Nausea is a common and potentially serious effect of cytotoxic chemotherapy for recurrent ovarian cancer and may function as a sentinel symptom reflecting adverse effects on the gut-brain axis (GBA) more generally, but research is scant. As a first exploratory test of this GBA hypothesis, we compared women reporting nausea to women not reporting nausea with regard to the severity of other commonly reported symptoms in this patient population. A secondary analysis of data systematically collected from women in active chemotherapy treatment for recurrent ovarian cancer (n = 158) was conducted. The Symptom Representation Questionnaire (SRQ) provided severity ratings for 22 common symptoms related to cancer and chemotherapy. Independent sample t tests and regression analyses were used to compare women with and without nausea with regard to their experience of other symptoms. Nausea was reported by 89 (56.2 %) women. Symptoms that were significantly associated with nausea in bivariate and regression analyses included abdominal bloating, bowel disturbances, dizziness, depression, drowsiness, fatigue, headache, lack of appetite, memory problems, mood swings, shortness of breath, pain, sleep disturbance, urinary problems, vomiting, and weight loss. Symptoms that were not associated with nausea included hair loss, numbness and tingling, sexuality concerns, and weight gain. Nausea experienced during chemotherapy for recurrent ovarian cancer may be an indicator of broader effects on the gut-brain axis. A better understanding of the mechanisms underlying these effects could lead to the development of novel supportive therapies to increase the tolerability and effectiveness of cancer treatment.
NASA Astrophysics Data System (ADS)
Wübbeler, Gerd; Bodnar, Olha; Elster, Clemens
2018-02-01
Weighted least-squares estimation is commonly applied in metrology to fit models to measurements that are accompanied by quoted uncertainties. The weights are chosen according to the quoted uncertainties. However, when data and model are inconsistent in view of the quoted uncertainties, this procedure does not yield adequate results. When it can be assumed that all uncertainties ought to be rescaled by a common factor, weighted least-squares estimation may still be used, provided that a simple correction of the uncertainty obtained for the estimated model is applied. We show that the resulting uncertainties and credible intervals are robust, as they do not rely on the assumption of a Gaussian distribution of the data. Hence, common software for weighted least-squares estimation may still safely be employed in such a case, followed by a simple modification of the uncertainties obtained by that software. We also provide means of checking the assumptions of such an approach. The Bayesian regression procedure is applied to analyze the CODATA values for the Planck constant published over the past decades in terms of three different models: a constant model, a straight line model and a spline model. Our results indicate that the CODATA values may not yet have stabilized.
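A minimal numerical sketch of the correction described above for a straight-line model: fit by weighted least squares, compute the reduced chi-square as a consistency check, and, if it exceeds one, rescale the parameter covariance (equivalently, all quoted uncertainties) by that common factor. The data values are invented.

    import numpy as np

    # Invented data: x, measured values y, and quoted standard uncertainties u
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.02, 1.48, 2.11, 2.40, 3.05, 3.62])
    u = np.full_like(x, 0.05)

    # Weighted least squares for y = a + b*x with weights 1/u^2
    A = np.column_stack([np.ones_like(x), x])
    W = np.diag(1.0 / u**2)
    cov = np.linalg.inv(A.T @ W @ A)
    beta = cov @ A.T @ W @ y

    # Consistency check: reduced chi-square (squared Birge ratio)
    resid = y - A @ beta
    chi2_red = (resid @ W @ resid) / (len(x) - 2)

    # Inflate the uncertainties by the common factor when the fit is inconsistent
    cov_corrected = cov * max(chi2_red, 1.0)
    print("slope =", beta[1], "+/-", np.sqrt(cov_corrected[1, 1]))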
[Hazard function and life table: an introduction to the failure time analysis].
Matsushita, K; Inaba, H
1987-04-01
Failure time analysis has become popular in demographic studies. It can be viewed as a part of regression analysis with limited dependent variables, as well as a special case of event history analysis and multistate demography. The ideas of the hazard function and failure time analysis, however, have not been properly introduced to, nor commonly discussed by, demographers in Japan. The concept of the hazard function is briefly described in comparison with life tables, where the force of mortality is interchangeable with the hazard rate. The basic idea of failure time analysis is summarized for the cases of the exponential distribution, the normal distribution, and proportional hazards models. The multiple decrement life table is also introduced as an example of lifetime data analysis with cause-specific hazard rates.
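For readers less familiar with the terminology, the hazard (force of mortality) referred to above is, in standard notation,

    h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
         = \frac{f(t)}{S(t)} = -\frac{d}{dt} \log S(t),
    \qquad S(t) = \exp\!\left( -\int_0^t h(s)\, ds \right).

Under the exponential model the hazard is a constant, h(t) = \lambda, which corresponds to a constant force of mortality in the life table, while the proportional hazards model assumes h(t \mid x) = h_0(t) \exp(x'\beta).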
Afshinnia, Farsad; Belanger, Karen; Palevsky, Paul M.; Young, Eric W.
2014-01-01
Background Hypocalcemia is very common in critically ill patients. While the effect of ionized calcium (iCa) on outcome is not well understood, manipulation of iCa in critically ill patients is a common practice. We analyzed all-cause mortality and several secondary outcomes in patients with acute kidney injury (AKI) by categories of serum iCa among participants in the Acute Renal Failure Trial Network (ATN) Study. Methods This is a post hoc secondary analysis of the ATN Study which was not preplanned in the original trial. Risk of mortality and renal recovery by categories of iCa were compared using multiple fixed and adjusted time-varying Cox regression models. Multiple linear regression models were used to explore the impact of baseline iCa on days free from ICU and hospital. Results A total of 685 patients were included in the analysis. Mean age was 60 (SD=15) years. There were 502 male patients (73.3%). Sixty-day all-cause mortality was 57.0%, 54.8%, and 54.4%, in patients with an iCa <1, 1–1.14, and ≥1.15 mmol/L, respectively (P=0.87). Mean of days free from ICU or hospital in all patients and the 28-day renal recovery in survivors to day 28 were not significantly different by categories of iCa. The hazard for death in a fully adjusted time-varying Cox regression survival model was 1.7 (95% CI: 1.3–2.4) comparing iCa <1 to iCa ≥1.15 mmol/L. No outcome was different for levels of iCa >1 mmol/L. Conclusion Severe hypocalcemia with iCa <1 mmol/L independently predicted mortality in patients with AKI needing renal replacement therapy. PMID:23992422
International consensus on preliminary definitions of improvement in adult and juvenile myositis.
Rider, Lisa G; Giannini, Edward H; Brunner, Hermine I; Ruperto, Nicola; James-Newton, Laura; Reed, Ann M; Lachenbruch, Peter A; Miller, Frederick W
2004-07-01
To use a core set of outcome measures to develop preliminary definitions of improvement for adult and juvenile myositis as composite end points for therapeutic trials. Twenty-nine experts in the assessment of myositis achieved consensus on 102 adult and 102 juvenile paper patient profiles as clinically improved or not improved. Two hundred twenty-seven candidate definitions of improvement were developed using the experts' consensus ratings as a gold standard and their judgment of clinically meaningful change in the core set of measures. Seventeen additional candidate definitions of improvement were developed from classification and regression tree analysis, a data-mining decision tree tool analysis. Six candidate definitions specifying percentage change or raw change in the core set of measures were developed using logistic regression analysis. Adult and pediatric working groups ranked the 13 top-performing candidate definitions for face validity, clinical sensibility, and ease of use, in which the sensitivity and specificity were ≥75% in adult, pediatric, and combined data sets. Nominal group technique was used to facilitate consensus formation. The definition of improvement (common to the adult and pediatric working groups) that ranked highest was 3 of any 6 of the core set measures improved by ≥20%, with no more than 2 worse by ≥25% (which could not include manual muscle testing to assess strength). Five and 4 additional preliminary definitions of improvement for adult and juvenile myositis, respectively, were also developed, with several definitions common to both groups. Participants also agreed to prospectively test 6 logistic regression definitions of improvement in clinical trials. Consensus preliminary definitions of improvement were developed for adult and juvenile myositis, and these incorporate clinically meaningful change in all myositis core set measures in a composite end point. These definitions require prospective validation, but they are now proposed for use as end points in all myositis trials.
Multivariate Bias Correction Procedures for Improving Water Quality Predictions from the SWAT Model
NASA Astrophysics Data System (ADS)
Arumugam, S.; Libera, D.
2017-12-01
Water quality observations are usually not available on a continuous basis for longer than 1-2 years at a time over a decadal period, given the labor requirements, which makes calibrating and validating mechanistic models difficult. Further, any physical model predictions inherently have bias (i.e., under/over estimation) and require post-simulation techniques to preserve the long-term mean monthly attributes. This study suggests a multivariate bias-correction technique and compares it with a common technique for improving the performance of the SWAT model in predicting daily streamflow and TN loads across the Southeast, based on split-sample validation. The proposed approach is a dimension-reduction technique, canonical correlation analysis (CCA), that regresses the observed multivariate attributes on the SWAT-simulated values. The common approach is a regression-based technique that uses ordinary least squares regression to adjust model values. The observed cross-correlation between loadings and streamflow is better preserved when using canonical correlation, while individual biases are simultaneously reduced. Additionally, canonical correlation analysis does a better job of preserving the observed joint likelihood of observed streamflow and loadings. These procedures were applied to 3 watersheds chosen from the Water Quality Network in the Southeast Region; specifically, watersheds with sufficiently large drainage areas and numbers of observed data points. The performance of these two approaches is compared for the observed period and over a multi-decadal period using loading estimates from the USGS LOADEST model. Lastly, the CCA technique is applied in a forecasting sense by using 1-month-ahead forecasts of P & T from ECHAM4.5 as forcings in the SWAT model. Skill in using the SWAT model for forecasting loadings and streamflow at the monthly and seasonal timescale is also discussed.
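As a rough sketch of the dimension-reduction idea only (not the full SWAT bias-correction workflow), canonical correlation analysis between simulated and observed attributes can be run with scikit-learn; the daily arrays below are placeholders standing in for streamflow and TN load series.

    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(4)
    n = 365
    obs = rng.normal(size=(n, 2))                           # placeholder observed [flow, TN load]
    sim = 0.7 * obs + rng.normal(scale=0.5, size=(n, 2))    # biased, noisy model output

    cca = CCA(n_components=2).fit(sim, obs)
    sim_c, obs_c = cca.transform(sim, obs)                  # canonical variates

    # CCA-based estimate of the observed attributes from the simulated ones
    corrected = cca.predict(sim)
    print(np.corrcoef(corrected[:, 0], obs[:, 0])[0, 1])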
Massad, L. Stewart; Xie, Xianhong; Darragh, Teresa; Minkoff, Howard; Levine, Alexandra M.; Watts, D. Heather; Wright, Rodney L.; D’Souza, Gypsyamber; Colie, Christine; Strickler, Howard D.
2011-01-01
Objective To describe the natural history of genital warts and vulvar intraepithelial neoplasia (VIN) in women with human immunodeficiency virus (HIV). Methods A cohort of 2,791 HIV infected and 953 uninfected women followed for up to 13 years had genital examinations at 6-month intervals, with biopsy for lesions suspicious for VIN. Results The prevalence of warts was 4.4% (5.3% for HIV seropositive women and 1.9% for seronegative women, P < 0.0001). The cumulative incidence of warts was 33% (95% C.I. 30, 36%) in HIV seropositive and 9% (95% C.I. 6, 12%) in seronegative women (P < 0.0001). In multivariable analysis, lower CD4 lymphocyte count, younger age, and current smoking were strongly associated with risk for incident warts. Among 501 HIV seropositive and 43 seronegative women, warts regressed in 410 (82%) seropositive and 41 (95%) seronegative women (P = 0.02), most in the first year after diagnosis. In multivariable analysis, regression was negatively associated with HIV status and lower CD4 count as well as older age. Incident VIN of any grade occurred more frequently among HIV seropositive than seronegative women: 0.42 (0.33 – 0.53) vs 0.07 (0.02 – 0.18)/100 person-years (P < 0.0001). VIN2+ was found in 58 women (55 with and 3 without HIV, P < 0.001). Two women with HIV developed stage IB squamous cell vulvar cancers. Conclusion While genital warts and VIN are more common among HIV seropositive than seronegative women, wart regression is common even in women with HIV, and cancers are infrequent. PMID:21934446
Serum Irisin Predicts Mortality Risk in Acute Heart Failure Patients.
Shen, Shutong; Gao, Rongrong; Bei, Yihua; Li, Jin; Zhang, Haifeng; Zhou, Yanli; Yao, Wenming; Xu, Dongjie; Zhou, Fang; Jin, Mengchao; Wei, Siqi; Wang, Kai; Xu, Xuejuan; Li, Yongqin; Xiao, Junjie; Li, Xinli
2017-01-01
Irisin is a peptide hormone cleaved from the plasma membrane protein fibronectin type III domain-containing protein 5 (FNDC5). Emerging studies have indicated an association between serum irisin and many major chronic diseases, including cardiovascular diseases. However, the role of serum irisin as a predictor of mortality risk in acute heart failure (AHF) patients is not clear. AHF patients were enrolled, serum was collected at admission, and all patients were followed up for 1 year. Enzyme-linked immunosorbent assay was used to measure serum irisin levels. To explore predictors of AHF mortality, univariate and multivariate logistic regression analyses and receiver operating characteristic (ROC) curve analysis were used. To determine the role of serum irisin levels in predicting survival, Kaplan-Meier survival analysis was used. In this study, 161 AHF patients were enrolled, and serum irisin level was found to be significantly higher in patients who died during 1-year follow-up. The univariate logistic regression analysis identified 18 variables associated with all-cause mortality in AHF patients, while the multivariate logistic regression analysis identified 2 variables, namely blood urea nitrogen and serum irisin. ROC curve analysis indicated that blood urea nitrogen and the most commonly used biomarker, NT-pro-BNP, displayed poor prognostic value for AHF (AUCs ≤ 0.700) compared to serum irisin (AUC = 0.753). Kaplan-Meier survival analysis demonstrated that AHF patients with higher serum irisin had significantly higher mortality (P<0.001). Collectively, our study identified serum irisin as a predictive biomarker for 1-year all-cause mortality in AHF patients, although large multicenter studies are still needed.
Drivers of wetland conversion: a global meta-analysis.
van Asselen, Sanneke; Verburg, Peter H; Vermaat, Jan E; Janse, Jan H
2013-01-01
Meta-analysis of case studies has become an important tool for synthesizing case study findings in land change. Meta-analyses of deforestation, urbanization, desertification and change in shifting cultivation systems have been published. This present study adds to this literature, with an analysis of the proximate causes and underlying forces of wetland conversion at a global scale using two complementary approaches of systematic review. Firstly, a meta-analysis of 105 case-study papers describing wetland conversion was performed, showing that different combinations of multiple-factor proximate causes, and underlying forces, drive wetland conversion. Agricultural development has been the main proximate cause of wetland conversion, and economic growth and population density are the most frequently identified underlying forces. Secondly, to add a more quantitative component to the study, a logistic meta-regression analysis was performed to estimate the likelihood of wetland conversion worldwide, using globally-consistent biophysical and socioeconomic location factor maps. Significant factors explaining wetland conversion, in order of importance, are market influence, total wetland area (lower conversion probability), mean annual temperature and cropland or built-up area. The regression analyses results support the outcomes of the meta-analysis of the processes of conversion mentioned in the individual case studies. In other meta-analyses of land change, similar factors (e.g., agricultural development, population growth, market/economic factors) are also identified as important causes of various types of land change (e.g., deforestation, desertification). Meta-analysis helps to identify commonalities across the various local case studies and identify which variables may lead to individual cases to behave differently. The meta-regression provides maps indicating the likelihood of wetland conversion worldwide based on the location factors that have determined historic conversions.
Agarwal, Parul; Sambamoorthi, Usha
2015-12-01
Depression is common among individuals with osteoarthritis and leads to increased healthcare burden. The objective of this study was to examine excess total healthcare expenditures associated with depression among individuals with osteoarthritis in the US. Adults with self-reported osteoarthritis (n = 1881) were identified using data from the 2010 Medical Expenditure Panel Survey (MEPS). Among those with osteoarthritis, chi-square tests and ordinary least squares (OLS) regressions were used to examine differences in healthcare expenditures between those with and without depression. A post-regression linear decomposition technique was used to estimate the relative contribution of different constructs of Anderson's behavioral model, i.e., predisposing, enabling, need, personal healthcare practices, and external environment factors, to the excess expenditures associated with depression among individuals with osteoarthritis. All analyses accounted for the complex survey design of the MEPS. Depression coexisted in 20.6% of adults with osteoarthritis. The average total healthcare expenditures were $13,684 among adults with depression compared to $9,284 among those without depression. Multivariable OLS regression revealed that adults with depression had 38.8% higher healthcare expenditures (p < 0.001) compared to those without depression. Post-regression linear decomposition analysis indicated that 50% of the difference in expenditures between adults with and without depression can be explained by differences in need factors. Among individuals with coexisting osteoarthritis and depression, excess healthcare expenditures associated with depression were mainly due to comorbid anxiety, chronic conditions and poor health status. These expenditures may potentially be reduced by providing timely intervention for need factors or by providing care under a collaborative care model.
Is adult gait less susceptible than paediatric gait to hip joint centre regression equation error?
Kiernan, D; Hosking, J; O'Brien, T
2016-03-01
Hip joint centre (HJC) regression equation error during paediatric gait has recently been shown to have clinical significance. In relation to adult gait, it has been inferred that errors in absolute HJC position comparable to those in children may in fact result in less significant kinematic and kinetic error. This study investigated the clinical agreement of three commonly used regression equation sets (Bell et al., Davis et al. and Orthotrak) for adult subjects against the equations of Harrington et al. The relationship between HJC position error and subject size was also investigated for the Davis et al. set. Full 3-dimensional gait analysis was performed on 12 healthy adult subjects, with data for each set compared to Harrington et al. The Gait Profile Score, Gait Variable Score and GDI-kinetic were used to assess clinical significance, while differences in HJC position between the Davis and Harrington sets were compared to leg length and subject height using regression analysis. A number of statistically significant differences were present in absolute HJC position. However, all sets fell below the clinically significant thresholds (GPS <1.6°, GDI-Kinetic <3.6 points). Linear regression revealed a statistically significant relationship for both increasing leg length and increasing subject height with decreasing error in the anterior/posterior and superior/inferior directions. Results confirm a negligible clinical error for adult subjects, suggesting that any of the examined sets could be used interchangeably. Decreasing error with both increasing leg length and increasing subject height suggests that the Davis set should be used cautiously with smaller subjects.
Howard, Elizabeth J; Harville, Emily; Kissinger, Patricia; Xiong, Xu
2013-07-01
There is growing interest in the application of propensity scores (PS) in epidemiologic studies, especially within the field of reproductive epidemiology. This retrospective cohort study assesses the impact of a short interpregnancy interval (IPI) on preterm birth and compares the results of the conventional logistic regression analysis with analyses utilizing a PS. The study included 96,378 singleton infants from Louisiana birth certificate data (1995-2007). Five regression models designed for methods comparison are presented. Ten percent (10.17%) of all births were preterm; 26.83% of births followed a short IPI. The PS-adjusted model produced a more conservative estimate of the exposure variable compared to the conventional logistic regression method (β-coefficient: 0.21 vs. 0.43), as well as a smaller standard error (0.024 vs. 0.028), odds ratio and 95% confidence interval [1.15 (1.09, 1.20) vs. 1.23 (1.17, 1.30)]. The inclusion of more covariate and interaction terms in the PS did not change the estimates of the exposure variable. This analysis indicates that PS-adjusted regression may be appropriate for validation of conventional methods in a large dataset with a fairly common outcome. PSs may be beneficial in producing more precise estimates, especially for models with many confounders and effect modifiers and where conventional adjustment with logistic regression is unsatisfactory. Short intervals between pregnancies are associated with preterm birth in this population, according to either technique. Birth spacing is an issue that women have some control over. Educational interventions, including birth control, should be applied during prenatal visits and following delivery.
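A generic sketch of the two-step propensity score adjustment compared above (not the Louisiana analysis, and with invented covariates and effect sizes): fit a logistic model for the exposure to obtain the PS, then adjust the outcome model for the PS rather than the raw covariates.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(5)
    n = 5000
    age = rng.normal(27, 6, n)
    parity = rng.poisson(1.5, n)

    # Hypothetical exposure (short IPI) and outcome (preterm birth)
    p_short = 1 / (1 + np.exp(-(-1.5 + 0.03 * (30 - age) + 0.2 * parity)))
    short_ipi = rng.binomial(1, p_short)
    p_ptb = 1 / (1 + np.exp(-(-2.3 + 0.3 * short_ipi + 0.02 * (30 - age) + 0.1 * parity)))
    ptb = rng.binomial(1, p_ptb)
    df = pd.DataFrame({"ptb": ptb, "short_ipi": short_ipi, "age": age, "parity": parity})

    # Step 1: propensity score = P(exposure | covariates)
    ps_model = smf.logit("short_ipi ~ age + parity", data=df).fit(disp=0)
    df["ps"] = ps_model.predict(df)

    # Step 2: outcome model adjusted for the PS instead of the individual covariates
    out = smf.logit("ptb ~ short_ipi + ps", data=df).fit(disp=0)
    print(out.params["short_ipi"])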
Sando, Roy; Chase, Katherine J.
2017-03-23
A common statistical procedure for estimating streamflow statistics at ungaged locations is to develop a relational model between streamflow and drainage basin characteristics at gaged locations using least squares regression analysis; however, least squares regression methods are parametric and make constraining assumptions about the data distribution. The random forest regression method provides an alternative nonparametric method for estimating streamflow characteristics at ungaged sites and requires that the data meet fewer statistical conditions than least squares regression methods.Random forest regression analysis was used to develop predictive models for 89 streamflow characteristics using Precipitation-Runoff Modeling System simulated streamflow data and drainage basin characteristics at 179 sites in central and eastern Montana. The predictive models were developed from streamflow data simulated for current (baseline, water years 1982–99) conditions and three future periods (water years 2021–38, 2046–63, and 2071–88) under three different climate-change scenarios. These predictive models were then used to predict streamflow characteristics for baseline conditions and three future periods at 1,707 fish sampling sites in central and eastern Montana. The average root mean square error for all predictive models was about 50 percent. When streamflow predictions at 23 fish sampling sites were compared to nearby locations with simulated data, the mean relative percent difference was about 43 percent. When predictions were compared to streamflow data recorded at 21 U.S. Geological Survey streamflow-gaging stations outside of the calibration basins, the average mean absolute percent error was about 73 percent.
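For illustration, the sketch below fits a random forest regression of a streamflow characteristic on a handful of basin characteristics using scikit-learn; the predictors, sample size and response are invented stand-ins, not the Precipitation-Runoff Modeling System data used in the report.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(6)
    n = 179
    # Placeholder basin characteristics: drainage area, mean elevation, mean annual precipitation
    X = np.column_stack([rng.lognormal(5, 1, n),
                         rng.normal(1200, 300, n),
                         rng.uniform(200, 600, n)])
    # Placeholder streamflow characteristic, nonlinear in the predictors
    y = 0.002 * X[:, 0] * (X[:, 2] / 400) ** 1.5 + rng.normal(0, 0.5, n)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)

    rmse = np.sqrt(np.mean((rf.predict(X_te) - y_te) ** 2))
    print("RMSE:", rmse.round(2), "importances:", rf.feature_importances_.round(2))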
High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis
Daye, Z. John; Chen, Jinbo; Li, Hongzhe
2011-01-01
Summary We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in and apply our method to an expression quantitative trait loci (eQTLs) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis. PMID:22547833
Azimian, Jalil; Piran, Pegah; Jahanihashemi, Hassan; Dehghankar, Leila
2017-01-01
Background Pressures in nursing can affect family life, contribute to marital and social problems, increase work-family conflict and endanger general health. Aim To determine marital satisfaction and its relationship with job stress and general health of nurses. Methods This descriptive, cross-sectional study was done in 2015 in medical education centers of Qazvin, using the ENRICH marital satisfaction scale and General Health and Job Stress questionnaires completed by 123 nurses. Analysis was done in SPSS version 19 using descriptive and analytical statistics (Pearson correlation, t-test, ANOVA, chi-square, linear regression and multiple regression analysis). Results The findings showed that 64.4% of nurses had marital satisfaction. There was a significant relationship between marital satisfaction and age (p=0.03), job experience (p=0.01) and age of spouse (p=0.01). The results also showed a significant relationship between marital satisfaction and general health (p<0.0001). Multiple regression analysis showed a significant relationship between marital satisfaction and both depression (p=0.012) and anxiety (p=0.001). Conclusions Given the high levels of job stress, impaired general health and low marital satisfaction among nurses, running health promotion programs and attending to these dimensions can support the work and family health of nurses. PMID:28607660
Nonlinear multivariate and time series analysis by neural network methods
NASA Astrophysics Data System (ADS)
Hsieh, William W.
2004-03-01
Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.
Modeling absolute differences in life expectancy with a censored skew-normal regression approach
Clough-Gorr, Kerri; Zwahlen, Marcel
2015-01-01
Parameter estimates from commonly used multivariable parametric survival regression models do not directly quantify differences in years of life expectancy. Gaussian linear regression models give results in terms of absolute mean differences, but are not appropriate for modeling life expectancy, because in many situations time to death has a negatively skewed distribution. A regression approach using a skew-normal distribution would be an alternative to parametric survival models in the modeling of life expectancy, because parameter estimates can be interpreted in terms of survival time differences while allowing for skewness of the distribution. In this paper we show how to use skew-normal regression so that censored and left-truncated observations are accounted for. With this, we model differences in life expectancy using data from the Swiss National Cohort Study and from official life expectancy estimates and compare the results with those derived from commonly used survival regression models. We conclude that a censored skew-normal survival regression approach for left-truncated observations can be used to model differences in life expectancy across covariates of interest. PMID:26339544
Bumps in river profiles: uncertainty assessment and smoothing using quantile regression techniques
NASA Astrophysics Data System (ADS)
Schwanghart, Wolfgang; Scherler, Dirk
2017-12-01
The analysis of longitudinal river profiles is an important tool for studying landscape evolution. However, characterizing river profiles based on digital elevation models (DEMs) suffers from errors and artifacts that particularly prevail along valley bottoms. The aim of this study is to characterize uncertainties that arise from the analysis of river profiles derived from different, near-globally available DEMs. We devised new algorithms - quantile carving and the CRS algorithm - that rely on quantile regression to enable hydrological correction and the uncertainty quantification of river profiles. We find that globally available DEMs commonly overestimate river elevations in steep topography. The distributions of elevation errors become increasingly wider and right skewed if adjacent hillslope gradients are steep. Our analysis indicates that the AW3D DEM has the highest precision and lowest bias for the analysis of river profiles in mountainous topography. The new 12 m resolution TanDEM-X DEM has a very low precision, most likely due to the combined effect of steep valley walls and the presence of water surfaces in valley bottoms. Compared to the conventional approaches of carving and filling, we find that our new approach is able to reduce the elevation bias and errors in longitudinal river profiles.
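The core trick can be sketched with an off-the-shelf quantile regression: fitting a low quantile of elevation against downstream distance tracks the lower envelope of the profile and is insensitive to the positive outliers ("bumps") caused by DEM errors. The sketch below uses an invented profile and a simple quadratic trend; the actual quantile carving and CRS algorithms additionally enforce downstream monotonicity and smoothness.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(7)
    d = np.linspace(0, 10, 400)                  # downstream distance (km)
    true_z = 500 * np.exp(-d / 6)                # idealized smooth profile (m)
    # DEM-like elevations: small noise plus occasional strong positive outliers
    z = (true_z + rng.normal(0, 2, d.size)
         + rng.binomial(1, 0.05, d.size) * rng.uniform(5, 30, d.size))
    df = pd.DataFrame({"z": z, "d": d})

    # Low-quantile (tau = 0.1) fit follows the lower envelope of the noisy profile
    qr = smf.quantreg("z ~ d + I(d ** 2)", data=df).fit(q=0.1)
    df["z_carved"] = qr.predict(df)
    print(df.head())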
Common mental disorders and associated factors: a study of women from a rural area.
Parreira, Bibiane Dias Miranda; Goulart, Bethania Ferreira; Haas, Vanderlei José; Silva, Sueli Riul da; Monteiro, Juliana Cristina Dos Santos; Gomes-Sponholz, Flávia Azevedo; Parreira, Bibiane Dias Miranda; Goulart, Bethania Ferreira; Haas, Vanderlei José; Silva, Sueli Riul da; Monteiro, Juliana Cristina Dos Santos; Gomes-Sponholz, Flávia Azevedo
2017-05-25
Identifying the prevalence of Common Mental Disorders and analyzing the influence of sociodemographic, economic, behavioral and reproductive health variables on Common Mental Disorders in women of childbearing age living in the rural area of Uberaba-MG, Brazil. An observational and cross-sectional study. Socio-demographic, economic, behavioral and reproductive health instruments were used, along with the Self-Reporting Questionnaire (SRQ-20) to identify common mental disorders. Multiple logistic regression was used for multivariate data analysis. 280 women participated in the study. The prevalence of Common Mental Disorders was 35.7%. In the logistic regression analysis, the variables of living with a partner and education level were associated with Common Mental Disorders, even after adjusting for the other variables. Our findings evidenced an association of social and behavioral factors with Common Mental Disorders among rural women. Identification and individualized care in primary health care are essential for the quality of life of these women.
SOCIAL STABILITY AND HIV RISK BEHAVIOR: EVALUATING THE ROLE OF ACCUMULATED VULNERABILITY
German, Danielle; Latkin, Carl A.
2011-01-01
This study evaluated a cumulative and syndromic relationship among commonly co-occurring vulnerabilities (homelessness, incarceration, low income, residential transition) in association with HIV-related risk behaviors among 635 low-income women in Baltimore. Analysis included descriptive statistics, logistic regression, latent class analysis and latent class regression. Both methods of assessing multidimensional instability showed significant associations with risk indicators. Risk of multiple partners, sex exchange, and drug use decreased significantly with each additional domain. Higher stability class membership (77%) was associated with decreased likelihood of multiple partners, exchange partners, recent drug use, and recent STI. Multidimensional social vulnerabilities were cumulatively and synergistically linked to HIV risk behavior. Independent instability measures may miss important contextual determinants of risk. Social stability offers a useful framework to understand the synergy of social vulnerabilities that shape sexual risk behavior. Social policies and programs aiming to enhance housing and overall social stability are likely to be beneficial for HIV prevention. PMID:21259043
Understanding software faults and their role in software reliability modeling
NASA Technical Reports Server (NTRS)
Munson, John C.
1994-01-01
This study is a direct result of an on-going project to model the reliability of a large real-time control avionics system. In previous modeling efforts with this system, hardware reliability models were applied in modeling the reliability behavior of this system. In an attempt to enhance the performance of the adapted reliability models, certain software attributes were introduced in these models to control for differences between programs and also sequential executions of the same program. As the basic nature of the software attributes that affect software reliability become better understood in the modeling process, this information begins to have important implications on the software development process. A significant problem arises when raw attribute measures are to be used in statistical models as predictors, for example, of measures of software quality. This is because many of the metrics are highly correlated. Consider the two attributes: lines of code, LOC, and number of program statements, Stmts. In this case, it is quite obvious that a program with a high value of LOC probably will also have a relatively high value of Stmts. In the case of low level languages, such as assembly language programs, there might be a one-to-one relationship between the statement count and the lines of code. When there is a complete absence of linear relationship among the metrics, they are said to be orthogonal or uncorrelated. Usually the lack of orthogonality is not serious enough to affect a statistical analysis. However, for the purposes of some statistical analysis such as multiple regression, the software metrics are so strongly interrelated that the regression results may be ambiguous and possibly even misleading. Typically, it is difficult to estimate the unique effects of individual software metrics in the regression equation. The estimated values of the coefficients are very sensitive to slight changes in the data and to the addition or deletion of variables in the regression equation. Since most of the existing metrics have common elements and are linear combinations of these common elements, it seems reasonable to investigate the structure of the underlying common factors or components that make up the raw metrics. The technique we have chosen to use to explore this structure is a procedure called principal components analysis. Principal components analysis is a decomposition technique that may be used to detect and analyze collinearity in software metrics. When confronted with a large number of metrics measuring a single construct, it may be desirable to represent the set by some smaller number of variables that convey all, or most, of the information in the original set. Principal components are linear transformations of a set of random variables that summarize the information contained in the variables. The transformations are chosen so that the first component accounts for the maximal amount of variation of the measures of any possible linear transform; the second component accounts for the maximal amount of residual variation; and so on. The principal components are constructed so that they represent transformed scores on dimensions that are orthogonal. Through the use of principal components analysis, it is possible to have a set of highly related software attributes mapped into a small number of uncorrelated attribute domains. This definitively solves the problem of multi-collinearity in subsequent regression analysis. 
There are many software metrics in the literature, but principal component analysis reveals that there are few distinct sources of variation, i.e. dimensions, in this set of metrics. It would appear perfectly reasonable to characterize the measurable attributes of a program with a simple function of a small number of orthogonal metrics each of which represents a distinct software attribute domain.
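To make the idea concrete, here is a minimal sketch (not the study's code) of mapping correlated software metrics onto orthogonal principal components and then regressing a quality measure on the component scores; the metric names, sample size, and fault measure are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200
loc = rng.normal(1000, 300, n)                  # lines of code (hypothetical)
stmts = 0.8 * loc + rng.normal(0, 20, n)        # statements, nearly collinear with LOC
branches = 0.1 * loc + rng.normal(0, 10, n)     # a third correlated metric
X = np.column_stack([loc, stmts, branches])
faults = 0.002 * loc + rng.normal(0, 0.5, n)    # hypothetical quality measure

# Standardize, then extract orthogonal components ordered by explained variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
pca = PCA().fit(Z)
scores = pca.transform(Z)
print("explained variance ratio:", pca.explained_variance_ratio_)

# Regress the quality measure on the leading, mutually uncorrelated components.
model = LinearRegression().fit(scores[:, :2], faults)
print("coefficients on the first two components:", model.coef_)
```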
Yoneoka, Daisuke; Henmi, Masayuki
2017-11-30
Recently, the number of clinical prediction models sharing the same regression task has increased in the medical literature. However, evidence synthesis methodologies that use the results of these regression models have not been sufficiently studied, particularly in meta-analysis settings where only regression coefficients are available. One of the difficulties lies in the differences between the categorization schemes of continuous covariates across different studies. In general, categorization methods using cutoff values are study specific across available models, even if they focus on the same covariates of interest. Differences in the categorization of covariates could lead to serious bias in the estimated regression coefficients and thus in subsequent syntheses. To tackle this issue, we developed synthesis methods for linear regression models with different categorization schemes of covariates. A 2-step approach to aggregate the regression coefficient estimates is proposed. The first step is to estimate the joint distribution of covariates by introducing a latent sampling distribution, which uses one set of individual participant data to estimate the marginal distribution of covariates with categorization. The second step is to use a nonlinear mixed-effects model with correction terms for the bias due to categorization to estimate the overall regression coefficients. Especially in terms of precision, numerical simulations show that our approach outperforms conventional methods, which only use studies with common covariates or ignore the differences between categorization schemes. The method developed in this study is also applied to a series of WHO epidemiologic studies on white blood cell counts. Copyright © 2017 John Wiley & Sons, Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nimbalkar, Sachin U.; Wenning, Thomas J.; Guo, Wei
In the United States, manufacturing facilities accounted for about 32% of total domestic energy consumption in 2014. Robust energy tracking methodologies are critical to understanding energy performance in manufacturing facilities. Due to its simplicity and intuitiveness, the classic energy intensity method (i.e., the ratio of total energy use to total production) is the most widely adopted. However, the classic energy intensity method does not take into account the variation of other relevant parameters (e.g., product type, feedstock type, weather). Furthermore, the energy intensity method assumes that a facility's base energy consumption (energy use at zero production) is zero, which rarely holds true. Therefore, it is commonly recommended to utilize regression models rather than the energy intensity approach for tracking improvements at the facility level. Unfortunately, many energy managers have difficulties understanding why regression models are statistically better than the classic energy intensity method. While anecdotes and qualitative information may convince some, many have major reservations about the accuracy of regression models and whether it is worth the time and effort to gather data and build quality regression models. This paper will explain why regression models are theoretically and quantitatively more accurate for tracking energy performance improvements. Based on the analysis of data from 114 manufacturing plants over 12 years, this paper will present quantitative results on the importance of utilizing regression models over the energy intensity methodology. This paper will also document scenarios where regression models do not have significant relevance over the energy intensity method.
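As an illustration of the argument, the sketch below (synthetic numbers, not the 114-plant data set) fits a facility-level regression of monthly energy use on production plus a weather proxy; the fitted intercept plays the role of the base load that a plain intensity ratio assumes away.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
production = rng.uniform(500, 1500, 36)          # units per month (hypothetical)
degree_days = rng.uniform(100, 800, 36)          # weather proxy (hypothetical)
energy = 2000 + 3.0 * production + 1.5 * degree_days + rng.normal(0, 150, 36)

X = sm.add_constant(np.column_stack([production, degree_days]))
fit = sm.OLS(energy, X).fit()
print(fit.params)                                # intercept ~ base load; slopes ~ marginal energy use

# The classic intensity ratio folds the base load and weather into one number:
print("mean energy intensity:", (energy / production).mean())
```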
Ranney, Megan L; Patena, John V; Nugent, Nicole; Spirito, Anthony; Boyer, Edward; Zatzick, Douglas; Cunningham, Rebecca
2016-01-01
Posttraumatic stress disorder (PTSD) is often underdiagnosed and undertreated among adolescents. The objective of this analysis was to describe the prevalence and correlates of symptoms consistent with PTSD among adolescents presenting to an urban emergency department (ED). A cross-sectional survey of adolescents aged 13-17 years presenting to the ED for any reason was conducted between August 2013 and March 2014. Validated self-report measures were used to measure mental health symptoms, violence exposure and risky behaviors. Multivariate logistic regression analysis was performed to determine adjusted differences in associations between symptoms consistent with PTSD and predicted correlates. Of 353 adolescents, 23.2% reported current symptoms consistent with PTSD, 13.9% had moderate or higher depressive symptoms and 11.3% reported past-year suicidal ideation. Adolescents commonly reported physical peer violence (46.5%), cyberbullying (46.7%) and exposure to community violence (58.9%). On multivariate logistic regression, physical peer violence, cyberbullying victimization, exposure to community violence, female gender and alcohol or other drug use positively correlated with symptoms consistent with PTSD. Among adolescents presenting to the ED for any reason, symptoms consistent with PTSD, depressive symptoms, physical peer violence, cyberbullying and community violence exposure are common and interrelated. Greater attention to PTSD, both disorder and symptom levels, and its cooccurring risk factors is needed. Copyright © 2016 Elsevier Inc. All rights reserved.
Yingyong, Penpimol
2010-11-01
Refractive error is one of the leading causes of visual impairment in children. An analysis of risk factors for refractive error is required to reduce and prevent this common eye disease. To identify the risk factors associated with refractive errors in primary school children (6-12 years old) in Nakhon Pathom province. A population-based cross-sectional analytic study was conducted between October 2008 and September 2009 in Nakhon Pathom. Refractive error, parental refractive status, and hours per week of near activities (studying, reading books, watching television, playing with video games, or working on the computer) were assessed in 377 children who participated in this study. The most common type of refractive error in primary school children was myopia. Myopic children were more likely to have parents with myopia. Children with myopia spent more time on near activities. The multivariate odds ratio (95% confidence interval) for two myopic parents was 6.37 (2.26-17.78) and for each diopter-hour per week of near work was 1.019 (1.005-1.033). Multivariate logistic regression models showed no confounding effects between parental myopia and near work, suggesting that each factor has an independent association with myopia. Statistical analysis by logistic regression revealed that family history of refractive error and hours of near work were significantly associated with refractive error in primary school children.
Hazardous alcohol consumption in non-aboriginal male inmates in New South Wales.
Field, Courtney
2018-03-12
Purpose The purpose of this paper is to examine correlates and predictors of hazardous drinking behaviour, which may be considered evidence of generalised strain, in a sample of incarcerated non-Aboriginal males in New South Wales, Australia. Design/methodology/approach Data were collected from 283 non-Aboriginal male inmates as part of a larger epidemiological survey of inmates in NSW undertaken in 2015 by the Justice Health and Forensic Mental Health Network. Data relating to a range of social factors were selected with reference to relevant literature and assessed with regard to their predictive value for scores from the Alcohol Use Disorders Identification Test (AUDIT). To facilitate regression analysis, variables were logically organised into historical factors or adult factors. Findings Almost all participants reported some history of alcohol consumption. Hazardous drinking was common among participants. While parental alcohol problems and adult drug use were the only correlates of AUDIT scores, parental misuse of alcohol was shown to be an important predictor of AUDIT scores in regression analysis. The role of parent gender was inconclusive. Previous incarceration as an adult, employment status, and drug use as an adult also predicted AUDIT scores. Originality/value Alcohol abuse is common among inmates and the use of alcohol is implicated in the commission of many offences. A better understanding of its genesis may inspire novel approaches to treatment, leading to improved health outcomes for inmates.
Novotny, Dalibor; Vaverkova, Helena; Karasek, David; Malina, Pavel
2014-08-01
The aim was to evaluate the relationships of the T-1131C (rs662799) polymorphism variants of the apolipoprotein A5 (Apo A5) gene and variants of the apolipoprotein E (Apo E) gene common polymorphism (rs429358, rs7412) to signs of metabolic syndrome (MetS). We examined 590 asymptomatic dyslipidemic patients divided into MetS+ (n=146) and MetS- (n=444) groups according to the criteria of the NCEP ATPIII Panel. We evaluated genotype frequencies and differences in MetS features between individual groups. Logistic regression analysis was used for the evaluation of Apo A5/Apo E variants as possible risk factors for MetS. We found no statistical differences in genotype or allele frequencies for either the Apo A5 or the Apo E polymorphism between the MetS+ and MetS- groups. In all subjects and in the MetS- group, we confirmed the well-known association of the -1131C Apo A5 minor allele with elevated triglycerides (TG, p<0.001). The Apo E gene E2 and E4 variants were associated with higher levels of TG (p<0.01) in comparison to the common E3/E3 variant. However, no statistical differences were observed in MetS+ subjects, despite the significantly higher TG levels in this group. Apo A5/Apo E variant analysis in all dyslipidemic patients revealed a significant increase in TG levels in all subgroups in comparison to common -1131T/E3 variant carriers, most markedly in the -1131C/E4 variant subgroup. Logistic regression models showed no association of Apo A5, Apo E, or combined Apo A5/Apo E variants with metabolic syndrome, even after adjustment for age and sex. Our study refined the role of Apo A5 and Apo E genetic variants in a group of adult dyslipidemic patients. We demonstrate that, except for TG, the Apo A5 T-1131C (rs662799) and Apo E (rs429358, rs7412) polymorphisms have no remarkable effect on MetS characteristics. Copyright © 2014 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
Classical Statistics and Statistical Learning in Imaging Neuroscience
Bzdok, Danilo
2017-01-01
Brain-imaging research has predominantly generated insight by means of classical statistics, including regression-type analyses and null-hypothesis testing using the t-test and ANOVA. In recent years, statistical learning methods have enjoyed increasing popularity, especially for applications to rich and complex data, including cross-validated out-of-sample prediction using pattern classification and sparsity-inducing regression. This concept paper discusses the implications of inferential justifications and algorithmic methodologies in common data analysis scenarios in neuroimaging. It retraces how classical statistics and statistical learning originated from different historical contexts, build on different theoretical foundations, make different assumptions, and evaluate different outcome metrics to permit differently nuanced conclusions. The present considerations should help reduce current confusion between model-driven classical hypothesis testing and data-driven learning algorithms for investigating the brain with imaging techniques. PMID:29056896
Zeng, Ni; Wu, Jun; Zhu, Wen-Chao; Shi, Bing; Jia, Zhong-Lin
2015-11-01
Non-syndromic orofacial clefts (NSOCs) are complex diseases involving genetic triggers, environmental factors, and their interplay. Recent studies demonstrated that EYA1, a member of the eyes absent gene family, might contribute to NSOCs. We investigated three single nucleotide polymorphisms (SNPs) and eight environmental factors (multivitamin, folic acid and calcium supplementation history, maternal alcohol consumption, common cold history, maternal smoking and environmental tobacco smoke in the first trimester, and paternal smoking in the 3 months before pregnancy) among 294 case-parent trios and 183 individual controls in western Han Chinese to evaluate the relationship between EYA1, environmental factors, and NSOCs. To better understand the gene's role in the etiology of NSOCs, we performed statistical analyses of several kinds, including the linkage disequilibrium test, transmission disequilibrium test, haplotype analysis, multiple logistic regression analysis, and conditional logistic regression analysis. Allele C at rs3779748 showed over-transmission in NSCL/P trios (P = 0.03), and genotype A/A at rs10094908 was under-transmitted among NSCL/P trios (P = 0.03) but over-transmitted among NSCPO trios (P = 0.02). The haplotype GC of rs10094908-rs3779748 was over-transmitted among NSCL/P trios (P = 0.05) and NSCPO trios (P = 0.05). Maternal common cold history, environmental tobacco smoke, and maternal alcohol consumption during the first trimester of pregnancy were risk factors for NSOCs, while calcium supplementation during the first trimester showed a protective effect. No evidence of interactions between EYA1 and environmental factors was found. These results revealed an association between EYA1, some environmental factors, and NSOCs in western Han Chinese. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Davies, Neil M; Jones, Tim; Kehoe, Patrick G; Martin, Richard M
2016-01-01
Introduction Current treatments for Alzheimer's and other neurodegenerative diseases have only limited effectiveness, meaning that there is an urgent need for new medications that could influence disease incidence and progression. We will investigate the potential of a selection of commonly prescribed drugs, as a more efficient and cost-effective method of identifying new drugs for the prevention or treatment of Alzheimer's disease, non-Alzheimer's disease dementias, Parkinson's disease and amyotrophic lateral sclerosis. Our research will focus on drugs used for the treatment of hypertension, hypercholesterolaemia and type 2 diabetes, all of which have previously been identified as potentially cerebroprotective and have variable levels of preclinical evidence suggesting they may have beneficial effects on various aspects of dementia pathology. Methods and analysis We will conduct a hypothesis-testing observational cohort study using data from the Clinical Practice Research Datalink (CPRD). Our analysis will consider four statistical methods, which take different approaches to modelling confounding: multivariable-adjusted Cox regression, propensity-matched regression, instrumental variable analysis, and marginal structural models. We will also use an intention-to-treat analysis, whereby we will define all exposures based on the first prescription observed in the database so that the target parameter is comparable to that estimated by a randomised controlled trial. Ethics and dissemination This protocol has been approved by the CPRD's Independent Scientific Advisory Committee (ISAC). We will publish the results of the study as open-access peer-reviewed publications and disseminate findings through national and international conferences as appropriate. PMID:27965247
Using Weighted Least Squares Regression for Obtaining Langmuir Sorption Constants
USDA-ARS?s Scientific Manuscript database
One of the most commonly used models for describing phosphorus (P) sorption to soils is the Langmuir model. To obtain model parameters, the Langmuir model is fit to measured sorption data using least squares regression. Least squares regression is based on several assumptions including normally dist...
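A hedged sketch of the general idea follows (it is not the manuscript's procedure or data): the Langmuir isotherm is linearized and fitted by weighted least squares, with an assumed weighting scheme standing in for whatever error model the analysis adopts.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical sorption data: solution P concentration C (mg/L) and sorbed P S (mg/kg).
C = np.array([1, 2, 5, 10, 20, 40, 80.0])
S = np.array([20, 35, 60, 85, 105, 120, 128.0])

# Linearized Langmuir: C/S = C/Smax + 1/(K*Smax); fit by weighted least squares.
y = C / S
X = sm.add_constant(C)
weights = S**2 / C**2      # assumed weighting; appropriate weights depend on the error model
fit = sm.WLS(y, X, weights=weights).fit()
intercept, slope = fit.params
Smax = 1 / slope
K = slope / intercept
print(f"Smax ≈ {Smax:.1f} mg/kg, K ≈ {K:.3f} L/mg")
```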
Fang, Ling; Gu, Caiyun; Liu, Xinyu; Xie, Jiabin; Hou, Zhiguo; Tian, Meng; Yin, Jia; Li, Aizhu; Li, Yubo
2017-01-01
Primary dysmenorrhea (PD) is a common gynecological disorder which, while not life-threatening, severely affects the quality of life of women. Most patients with PD suffer ovarian hormone imbalances caused by uterine contraction, which results in dysmenorrhea. PD patients may also suffer from increases in estrogen levels caused by increased levels of prostaglandin synthesis and release during luteal regression and early menstruation. Although PD pathogenesis has been reported on previously, those studies examined only the menstrual period and neglected the importance of the luteal regression stage. Therefore, the present study used urine metabolomics to examine changes in endogenous substances and detect urine biomarkers for PD during luteal regression. Ultra-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry was used to create metabolomic profiles for 36 patients with PD and 27 healthy controls. Principal component analysis and partial least squares discriminant analysis were used to investigate the metabolic alterations associated with PD. Ten biomarkers for PD were identified, including ornithine, dihydrocortisol, histidine, citrulline, sphinganine, phytosphingosine, progesterone, 17-hydroxyprogesterone, androstenedione, and 15-keto-prostaglandin F2α. The specificity and sensitivity of these biomarkers were assessed based on the area under the curve of receiver operating characteristic curves, which can be used to distinguish patients with PD from healthy controls. These results provide novel targets for the treatment of PD. PMID:28098892
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used to diagnose multicollinearity, the basic principle of principal component regression, and the determination of the 'best' equation. It uses an example to describe how to carry out principal component regression analysis with SPSS 10.0, covering the full calculation process of the principal component regression and the operation of the linear regression, factor analysis, descriptives, compute variable, and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance of multicollinearity, and with SPSS the analysis is simplified, faster, and accurate.
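The paper works through SPSS 10.0 menus; for readers outside SPSS, a rough Python equivalent of the same principal component regression idea (standardize, extract components, regress on the leading components) is sketched below with made-up data.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.05, size=100)     # nearly collinear predictor
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])
y = 2 * x1 + 3 * x3 + rng.normal(scale=0.5, size=100)

# Principal component regression: standardize, keep 2 components, regress on them.
pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression())
pcr.fit(X, y)
print("R^2 on training data:", pcr.score(X, y))
```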
Detection of crossover time scales in multifractal detrended fluctuation analysis
NASA Astrophysics Data System (ADS)
Ge, Erjia; Leung, Yee
2013-04-01
Fractal analysis is employed in this paper as a scale-based method for identifying the scaling behavior of time series. Many spatial and temporal processes exhibiting complex multi(mono)-scaling behaviors are fractals. One of the important concepts in fractals is the crossover time scale(s) that separates distinct regimes having different fractal scaling behaviors. A common estimation method is multifractal detrended fluctuation analysis (MF-DFA). The detection of crossover time scale(s) is, however, relatively subjective, since it has been made without rigorous statistical procedures and has generally been determined by eyeballing or subjective observation. Crossover time scales so determined may be spurious and problematic and may not reflect the genuine underlying scaling behavior of a time series. The purpose of this paper is to propose a statistical procedure to model complex fractal scaling behaviors and reliably identify the crossover time scales under MF-DFA. The scaling-identification regression model, grounded on a solid statistical foundation, is first proposed to describe the multi-scaling behaviors of fractals. Through regression analysis and statistical inference, we can (1) identify crossover time scales that cannot be detected by eyeballing, (2) determine the number and locations of the genuine crossover time scales, (3) give confidence intervals for the crossover time scales, and (4) establish a statistically significant regression model depicting the underlying scaling behavior of a time series. To substantiate our argument, the regression model is applied to analyze the multi-scaling behaviors of avian-influenza outbreaks, water consumption, daily mean temperature, and rainfall in Hong Kong. Through the proposed model, we can gain a deeper understanding of fractals in general and a statistical approach to identifying multi-scaling behavior under MF-DFA in particular.
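A much-simplified sketch of the core idea (not the authors' full scaling-identification model) is to fit a two-segment regression to the log-log fluctuation function and take the breakpoint minimizing the residual sum of squares as the crossover scale; the fluctuation values below are synthetic.

```python
import numpy as np

log_s = np.linspace(1.0, 3.0, 40)                                  # log10 of time scale
log_F = np.where(log_s < 2.0, 0.8 * log_s, 1.6 + 0.4 * (log_s - 2.0))
log_F = log_F + np.random.default_rng(3).normal(0, 0.02, log_s.size)

def two_segment_sse(breakpoint):
    # Fit separate straight lines below and above the candidate crossover.
    sse = 0.0
    for mask in (log_s <= breakpoint, log_s > breakpoint):
        if mask.sum() < 3:
            return np.inf
        coef = np.polyfit(log_s[mask], log_F[mask], 1)
        sse += np.sum((log_F[mask] - np.polyval(coef, log_s[mask])) ** 2)
    return sse

candidates = log_s[3:-3]
best = min(candidates, key=two_segment_sse)
print("estimated crossover at log10(scale) ≈", round(best, 2))
```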
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
NASA Astrophysics Data System (ADS)
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
Over the past decade, computational models such as multivariate linear regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, while nonlinear models commonly rely on complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures from the International Affective Picture System to induce thirty subjects' affective states and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method achieves correlation coefficients of 0.98 and 0.96 for the estimation of affective valence and arousal, respectively. Moreover, the method may provide some indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states.
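The following sketch illustrates the general approach with synthetic data (the authors' features, polynomial degree, and data are not reproduced here): two hypothetical skin-conductance features are expanded into higher-order polynomial terms and a linear model is fitted on the expanded features.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(4)
scl = rng.uniform(0, 1, (200, 2))          # hypothetical skin-conductance features
valence = (2 * scl[:, 0] - 1.5 * scl[:, 0]**2 + scl[:, 0] * scl[:, 1]
           + rng.normal(0, 0.05, 200))     # synthetic affective valence

# Expand inputs to degree-3 polynomial terms, then fit an ordinary linear model.
model = make_pipeline(PolynomialFeatures(degree=3, include_bias=False), LinearRegression())
model.fit(scl, valence)
pred = model.predict(scl)
print("correlation with observed valence:", np.corrcoef(pred, valence)[0, 1].round(3))
```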
Emerson, Douglas G.; Vecchia, Aldo V.; Dahl, Ann L.
2005-01-01
The drainage-area ratio method commonly is used to estimate streamflow for sites where no streamflow data were collected. To evaluate the validity of the drainage-area ratio method and to determine if an improved method could be developed to estimate streamflow, a multiple-regression technique was used to determine if drainage area, main channel slope, and precipitation were significant variables for estimating streamflow in the Red River of the North Basin. A separate regression analysis was performed for streamflow for each of three seasons: winter, spring, and summer. Drainage area and summer precipitation were the most significant variables. However, the regression equations generally overestimated streamflows for North Dakota stations and underestimated streamflows for Minnesota stations. To correct the bias in the residuals for the two groups of stations, indicator variables were included to allow both the intercept and the coefficient for the logarithm of drainage area to depend on the group. Drainage area was the only significant variable in the revised regression equations. The exponents for the drainage-area ratio were 0.85 for the winter season, 0.91 for the spring season, and 1.02 for the summer season.
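The regression form behind this result can be sketched as follows (hypothetical numbers, not the Red River data): log streamflow is regressed on log drainage area with an indicator variable so that both the intercept and the drainage-area exponent can differ by station group.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
log_area = rng.uniform(1, 4, 60)                 # log10 drainage area (hypothetical)
group = rng.integers(0, 2, 60)                   # 0 = one state group, 1 = the other
log_Q = (0.2 + 0.9 * log_area - 0.3 * group
         + 0.05 * group * log_area + rng.normal(0, 0.1, 60))

# Intercept, log area, group indicator, and group-by-log-area interaction.
X = sm.add_constant(np.column_stack([log_area, group, group * log_area]))
fit = sm.OLS(log_Q, X).fit()
print(fit.params)   # the log-area coefficient plays the role of the drainage-area-ratio exponent
```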
Smith, Joseph M.; Mather, Martha E.
2012-01-01
Ecological indicators are science-based tools used to assess how human activities have impacted environmental resources. For monitoring and environmental assessment, existing species assemblage data can be used to make these comparisons through time or across sites. An impediment to using assemblage data, however, is that these data are complex and need to be simplified in an ecologically meaningful way. Because multivariate statistics are mathematical relationships, statistical groupings may not make ecological sense and will not have utility as indicators. Our goal was to define a process to select defensible and ecologically interpretable statistical simplifications of assemblage data in which researchers and managers can have confidence. For this, we chose a suite of statistical methods, compared the groupings that resulted from these analyses, identified convergence among groupings, then we interpreted the groupings using species and ecological guilds. When we tested this approach using a statewide stream fish dataset, not all statistical methods worked equally well. For our dataset, logistic regression (Log), detrended correspondence analysis (DCA), cluster analysis (CL), and non-metric multidimensional scaling (NMDS) provided consistent, simplified output. Specifically, the Log, DCA, CL-1, and NMDS-1 groupings were ≥60% similar to each other, overlapped with the fluvial-specialist ecological guild, and contained a common subset of species. Groupings based on number of species (e.g., Log, DCA, CL and NMDS) outperformed groupings based on abundance [e.g., principal components analysis (PCA) and Poisson regression]. Although the specific methods that worked on our test dataset have generality, here we are advocating a process (e.g., identifying convergent groupings with redundant species composition that are ecologically interpretable) rather than the automatic use of any single statistical tool. We summarize this process in step-by-step guidance for the future use of these commonly available ecological and statistical methods in preparing assemblage data for use in ecological indicators.
A comparative analysis of readmission rates after outpatient cosmetic surgery.
Mioton, Lauren M; Alghoul, Mohammed S; Kim, John Y S
2014-02-01
Despite the increasing scrutiny of surgical procedures, outpatient cosmetic surgery has an established record of safety and efficacy. A key measure in assessing surgical outcomes is the examination of readmission rates. However, there is a paucity of data on unplanned readmission following cosmetic surgery procedures. The authors studied readmission rates for outpatient cosmetic surgery and compared the data with readmission rates for other surgical procedures. The 2011 National Surgical Quality Improvement Program (NSQIP) data set was queried for all outpatient procedures. Readmission rates were calculated for the 5 surgical specialties with the greatest number of outpatient procedures and for the overall outpatient cosmetic surgery population. Subgroup analysis was performed on the 5 most common cosmetic surgery procedures. Multivariate regression models were used to determine predictors of readmission for cosmetic surgery patients. The 2879 isolated outpatient cosmetic surgery cases had an associated 0.90% unplanned readmission rate. The 5 specialties with the highest number of outpatient surgical procedures were general, orthopedic, gynecologic, urologic, and otolaryngologic surgery; their unplanned readmission rates ranged from 1.21% to 3.73%. The 5 most common outpatient cosmetic surgery procedures and their associated readmission rates were as follows: reduction mammaplasty, 1.30%; mastopexy, 0.31%; liposuction, 1.13%; abdominoplasty, 1.78%; and breast augmentation, 1.20%. Multivariate regression analysis demonstrated that operating time (in hours) was an independent predictor of readmission (odds ratio, 1.40; 95% confidence interval, 1.08-1.81; P=.010). Rates of unplanned readmission with outpatient cosmetic surgery are low and compare favorably to those of other outpatient surgeries.
Fichman, Yoseph; Levi, Assi; Hodak, Emmilia; Halachmi, Shlomit; Mazor, Sigal; Wolf, Dana; Caplan, Orit; Lapidoth, Moshe
2018-05-01
Verruca vulgaris (VV) is a prevalent skin condition caused by various subtypes of human papilloma virus (HPV). The most common causes of non-genital lesions are HPV types 2 and 4, and to a lesser extent types 1, 3, 26, 29, and 57. Although numerous therapeutic modalities exist, none is universally effective or without adverse events (AE). Pulsed dye laser (PDL) is a favorable option due to its observed efficacy and relatively low AE rate. However, it is not known which verrucae are most likely to respond to PDL, or whether the causative viral subtype influences this response. The objective of this prospective blinded study was to assess whether the HPV subtype was predictive of response to PDL. To that end, 26 verrucae from 26 immunocompetent patients were biopsied prior to treatment by PDL. HPV coding sequences were isolated and genotyped using PCR analysis. Patients were treated by PDL (595 nm wavelength, 5 mm spot size, 1.5 ms pulse duration, 12 J/cm2 fluence) once a month for up to 6 months, and clinical response was assessed. Binary logistic regression analysis and linear logistic regression analysis were used to evaluate statistical significance. Different types of HPV were identified in 22 of 26 tissue samples. Response to treatment did not correlate with HPV type, age, or gender. As no association between HPV type and response to PDL therapy could be established, PDL appears to be equally effective for all HPV types and remains a favorable treatment option for all VV.
Preterm Versus Term Children: Analysis of Sedation/Anesthesia Adverse Events and Longitudinal Risk.
Havidich, Jeana E; Beach, Michael; Dierdorf, Stephen F; Onega, Tracy; Suresh, Gautham; Cravero, Joseph P
2016-03-01
Preterm and former preterm children frequently require sedation/anesthesia for diagnostic and therapeutic procedures. Our objective was to determine the age at which children who are born <37 weeks gestational age are no longer at increased risk for sedation/anesthesia adverse events. Our secondary objective was to describe the nature and incidence of adverse events. This is a prospective observational study of children receiving sedation/anesthesia for diagnostic and/or therapeutic procedures outside of the operating room by the Pediatric Sedation Research Consortium. A total of 57,227 patients 0 to 22 years of age were eligible for this study. All adverse events and descriptive terms were predefined. Logistic regression and locally weighted scatterplot regression were used for analysis. Preterm and former preterm children had higher adverse event rates (14.7% vs 8.5%) compared with children born at term. Our analysis revealed a biphasic pattern for the development of adverse sedation/anesthesia events. Airway and respiratory adverse events were most commonly reported. MRI scans were the most commonly performed procedures in both categories of patients. Patients born preterm are nearly twice as likely to develop sedation/anesthesia adverse events, and this risk continues up to 23 years of age. We recommend obtaining birth history during the formulation of an anesthetic/sedation plan, with heightened awareness that preterm and former preterm children may be at increased risk. Further prospective studies focusing on the etiology and prevention of adverse events in former preterm patients are warranted. Copyright © 2016 by the American Academy of Pediatrics.
Sreeramareddy, Chandrashekhar T; Panduru, Kishore V; Verma, Sharat C; Joshi, Hari S; Bates, Michael N
2008-01-01
Background Studies from developed countries have reported on host-related risk factors for extra-pulmonary tuberculosis (EPTB). However, similar studies from high-burden countries like Nepal are lacking. Therefore, we carried out this study to compare demographic, life-style and clinical characteristics between EPTB and PTB patients. Methods A retrospective analysis was carried out on 474 tuberculosis (TB) patients diagnosed in a tertiary care hospital in western Nepal. Characteristics of demography, life-style and clinical features were obtained from medical case records. Risk factors for being an EPTB patient relative to a PTB patient were identified using logistic regression analysis. Results The age distribution of the TB patients was bimodal. The male to female ratio for PTB was 2.29. EPTB was more common at younger ages (< 25 years) and in females. Common sites for EPTB were lymph nodes (42.6%) and peritoneum and/or intestines (14.8%). By logistic regression analysis, age less than 25 years (OR 2.11, 95% CI 1.12–3.68) and female gender (OR 1.69, 95% CI 1.12–2.56) were associated with EPTB. Smoking, use of immunosuppressive drugs/steroids, diabetes and past history of TB were more likely to be associated with PTB. Conclusion Results suggest that younger age and female gender may be independent risk factors for EPTB in a high-burden country like Nepal. TB control programmes may target young and female populations for EPTB case-finding. Further studies are necessary in other high-burden countries to confirm our findings. PMID:18218115
Pariser, Joseph J; Pearce, Shane M; Patel, Sanjay G; Bales, Gregory T
2015-10-01
To examine the national trends of simple prostatectomy (SP) for benign prostatic hyperplasia (BPH) focusing on perioperative outcomes and risk factors for complications. The National Inpatient Sample (2002-2012) was utilized to identify patients with BPH undergoing SP. Analysis included demographics, hospital details, associated procedures, and operative approach (open, robotic, or laparoscopic). Outcomes included complications, length of stay, charges, and mortality. Multivariate logistic regression was used to determine the risk factors for perioperative complications. Linear regression was used to assess the trends in the national annual utilization of SP. The study population included 35,171 patients. Median length of stay was 4 days (interquartile range 3-6). Cystolithotomy was performed concurrently in 6041 patients (17%). The overall complication rate was 28%, with bleeding occurring most commonly. In total, 148 (0.4%) patients experienced in-hospital mortality. On multivariate analysis, older age, black race, and overall comorbidity were associated with greater risk of complications while the use of a minimally invasive approach and concurrent cystolithotomy had a decreased risk. Over the study period, the national use of simple prostatectomy decreased, on average, by 145 cases per year (P = .002). By 2012, 135/2580 procedures (5%) were performed using a minimally invasive approach. The nationwide utilization of SP for BPH has decreased. Bleeding complications are common, but perioperative mortality is low. Patients who are older, black race, or have multiple comorbidities are at higher risk of complications. Minimally invasive approaches, which are becoming increasingly utilized, may reduce perioperative morbidity. Copyright © 2015 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Neher, Christopher; Duffield, John; Patterson, David
2013-09-01
The National Park Service (NPS) currently manages a large and diverse system of park units nationwide, which received an estimated 279 million recreational visits in 2011. This article uses park visitor data collected by the NPS Visitor Services Project to estimate a consistent set of count data travel cost models of park visitor willingness to pay (WTP). Models were estimated using 58 different park unit survey datasets. WTP estimates for these 58 park surveys were used within a meta-regression analysis model to predict average and total WTP for NPS recreational visitation system-wide. Estimated WTP per NPS visit in 2011 averaged $102 system-wide, and ranged across park units from $67 to $288. Total 2011 visitor WTP for the NPS system is estimated at $28.5 billion, with a 95% confidence interval of $19.7-$43.1 billion. The estimation of a meta-regression model using consistently collected data and identical specification of visitor WTP models greatly reduces problems common to meta-regression models, including sample selection bias, primary data heterogeneity, and heteroskedasticity, as well as some aspects of panel effects. The article provides the first estimate of total annual NPS visitor WTP within the literature directly based on NPS visitor survey data.
Iturriaga, H; Hirsch, S; Bunout, D; Díaz, M; Kelly, M; Silva, G; de la Maza, M P; Petermann, M; Ugarte, G
1993-04-01
Looking for a noninvasive method to predict liver histologic alterations in alcoholic patients without clinical signs of liver failure, we studied 187 recently abstinent chronic alcoholics, divided into 2 series. In the model series (n = 94), several clinical variables and results of common laboratory tests were compared with the findings of liver biopsies. These were classified into 3 groups: 1. Normal liver; 2. Moderate alterations; 3. Marked alterations, including alcoholic hepatitis and cirrhosis. The multivariate methods used were logistic regression analysis and a classification and regression tree (CART). Both methods entered gamma-glutamyltransferase (GGT), aspartate-aminotransferase (AST), weight and age as significant and independent variables. Univariate analyses with GGT and AST at different cutoffs were also performed. To predict the presence of any kind of damage (Groups 2 and 3), CART and AST > 30 IU showed the highest sensitivity, specificity and correct prediction, both in the model and validation series. For prediction of marked liver damage, a score based on logistic regression and GGT > 110 IU had the highest efficiencies. It is concluded that GGT and AST are good markers of alcoholic liver damage and that, using sample cutoffs, histologic diagnosis can be correctly predicted in 80% of recently abstinent asymptomatic alcoholics.
Bayesian Analysis of High Dimensional Classification
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Subhadeep; Liang, Faming
2009-12-01
Modern data mining and bioinformatics have presented an important playground for statistical learning techniques, where the number of input variables is possibly much larger than the sample size of the training data. In supervised learning, logistic regression or probit regression can be used to model a binary output and form perceptron classification rules based on Bayesian inference. In these cases, there is considerable interest in searching for sparse models in the high-dimensional regression (or classification) setup. We first discuss two common challenges for analyzing high-dimensional data. The first is the curse of dimensionality: the complexity of many existing algorithms scales exponentially with the dimensionality of the space, so these algorithms soon become computationally intractable and therefore inapplicable in many real applications. The second is multicollinearity among the predictors, which severely slows down the algorithms. To make Bayesian analysis operational in high dimensions, we propose a novel hierarchical stochastic approximation Monte Carlo algorithm (HSAMC), which overcomes the curse of dimensionality and the multicollinearity of predictors in high dimensions and also possesses a self-adjusting mechanism to avoid local minima separated by high energy barriers. Models and methods are illustrated by simulations inspired by the field of genomics. Numerical results indicate that HSAMC can work as a general model selection sampler in high-dimensional complex model spaces.
Common and Distinctive Patterns of Cognitive Dysfunction in Children With Benign Epilepsy Syndromes.
Cheng, Dazhi; Yan, Xiuxian; Gao, Zhijie; Xu, Keming; Zhou, Xinlin; Chen, Qian
2017-07-01
Childhood absence epilepsy and benign childhood epilepsy with centrotemporal spikes are the most common forms of benign epilepsy syndromes. Although cognitive dysfunctions occur in children with both childhood absence epilepsy and benign childhood epilepsy with centrotemporal spikes, the similarity between their patterns of underlying cognitive impairments is not well understood. To describe these patterns, we examined multiple cognitive functions in children with childhood absence epilepsy and benign childhood epilepsy with centrotemporal spikes. In this study, 43 children with childhood absence epilepsy, 47 children with benign childhood epilepsy with centrotemporal spikes, and 64 control subjects were recruited; all received a standardized assessment (i.e., computerized test battery) assessing processing speed, spatial skills, calculation, language ability, intelligence, visual attention, and executive function. Groups were compared in these cognitive domains. Simple regression analysis was used to analyze the effects of epilepsy-related clinical variables on cognitive test scores. Compared with control subjects, children with childhood absence epilepsy and benign childhood epilepsy with centrotemporal spikes showed cognitive deficits in intelligence and executive function, but performed normally in language processing. Impairment in visual attention was specific to patients with childhood absence epilepsy, whereas impaired spatial ability was specific to the children with benign childhood epilepsy with centrotemporal spikes. Simple regression analysis showed syndrome-related clinical variables did not affect cognitive functions. This study provides evidence of both common and distinctive cognitive features underlying the relative cognitive difficulties in children with childhood absence epilepsy and benign childhood epilepsy with centrotemporal spikes. Our data suggest that clinicians should pay particular attention to the specific cognitive deficits in children with childhood absence epilepsy and benign childhood epilepsy with centrotemporal spikes, to allow for more discriminative and potentially more effective interventions. Copyright © 2017 Elsevier Inc. All rights reserved.
Soil salt content estimation in the Yellow River delta with satellite hyperspectral data
Weng, Yongling; Gong, Peng; Zhu, Zhi-Liang
2008-01-01
Soil salinization is one of the most common land degradation processes and is a severe environmental hazard. The primary objective of this study is to investigate the potential of predicting salt content in soils with hyperspectral data acquired with EO-1 Hyperion. Both partial least-squares regression (PLSR) and conventional multiple linear regression (MLR), such as stepwise regression (SWR), were tested as the prediction model. PLSR is commonly used to overcome the problem caused by high-dimensional and correlated predictors. Chemical analysis of 95 samples collected from the top layer of soils in the Yellow River delta area shows that salt content was high on average, and the dominant chemicals in the saline soil were NaCl and MgCl2. Multivariate models were established between soil salt content and hyperspectral data. Our results indicate that the PLSR technique with laboratory spectral data has a strong prediction capacity. Spectral bands at 1487-1527, 1971-1991, 2032-2092, and 2163-2355 nm possessed large absolute values of regression coefficients, with the largest coefficient at 2203 nm. We obtained a root mean squared error (RMSE) for calibration (with 61 samples) of RMSEC = 0.753 (R2 = 0.893) and a root mean squared error for validation (with 30 samples) of RMSEV = 0.574. The prediction model was applied on a pixel-by-pixel basis to a Hyperion reflectance image to yield a quantitative surface distribution map of soil salt content. The result was validated successfully with 38 sampling points. We obtained an RMSE estimate of 1.037 (R2 = 0.784) for the soil salt content map derived by the PLSR model. The salinity map derived from the SWR model shows that the predicted values are higher than the true values. These results demonstrate that the PLSR method is a more suitable technique than stepwise regression for quantitative estimation of soil salt content over a large area. © 2008 CASI.
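A hedged sketch of a PLSR calibration/validation workflow of the kind described above is given below; the spectra are synthetic, and the band indices, component count, and sample split are assumptions rather than the study's values.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(6)
n_bands = 150
spectra = rng.normal(size=(91, n_bands))                       # synthetic reflectance spectra
salt = spectra[:, 40] * 0.8 - spectra[:, 120] * 0.5 + rng.normal(0, 0.3, 91)

cal, val = slice(0, 61), slice(61, 91)                         # calibration / validation split
pls = PLSRegression(n_components=5).fit(spectra[cal], salt[cal])
rmsec = mean_squared_error(salt[cal], pls.predict(spectra[cal]).ravel()) ** 0.5
rmsev = mean_squared_error(salt[val], pls.predict(spectra[val]).ravel()) ** 0.5
print(f"RMSEC = {rmsec:.3f}, RMSEV = {rmsev:.3f}")
```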
Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay
2009-06-03
Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models such as artificial neural networks (ANN) and genetic algorithms (GA) may also be useful for interpreting medical data. The purpose of this study was to apply artificial intelligence models to a medical data set and compare them with logistic regression. ANN, GA, and logistic regression analyses were carried out on the data set of a previously published article regarding patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules for predicting urinary stones. Rule 1 consisted of being male, pain not spreading to the back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced the absence of fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was 0.867, 0.720, 0.715, and 0.713, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and for constituting clinical decision rules. They may be an alternative to the conventional multivariate analysis applications used in biostatistics.
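For readers who want to reproduce the comparison metrics on their own data, the sketch below (synthetic data and a hypothetical predictor coding, not the study's data sheet) fits a logistic regression and reports sensitivity, specificity, and AUC.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(7)
n = 227
X = rng.integers(0, 2, size=(n, 3))        # e.g., male sex, no fever, dilatation (hypothetical coding)
logit = -1 + 1.2 * X[:, 0] + 0.8 * X[:, 1] + 1.5 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # synthetic stone / no-stone outcome

clf = LogisticRegression().fit(X, y)
pred = clf.predict(X)
tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))
print("AUC:", roc_auc_score(y, clf.predict_proba(X)[:, 1]))
```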
Erdoğan, Sinem B; Tong, Yunjie; Hocke, Lia M; Lindsey, Kimberly P; deB Frederick, Blaise
2016-01-01
Resting state functional connectivity analysis is a widely used method for mapping intrinsic functional organization of the brain. Global signal regression (GSR) is commonly employed for removing systemic global variance from resting state BOLD-fMRI data; however, recent studies have demonstrated that GSR may introduce spurious negative correlations within and between functional networks, calling into question the meaning of anticorrelations reported between some networks. In the present study, we propose that global signal from resting state fMRI is composed primarily of systemic low frequency oscillations (sLFOs) that propagate with cerebral blood circulation throughout the brain. We introduce a novel systemic noise removal strategy for resting state fMRI data, "dynamic global signal regression" (dGSR), which applies a voxel-specific optimal time delay to the global signal prior to regression from voxel-wise time series. We test our hypothesis on two functional systems that are suggested to be intrinsically organized into anticorrelated networks: the default mode network (DMN) and task positive network (TPN). We evaluate the efficacy of dGSR and compare its performance with the conventional "static" global regression (sGSR) method in terms of (i) explaining systemic variance in the data and (ii) enhancing specificity and sensitivity of functional connectivity measures. dGSR increases the amount of BOLD signal variance being modeled and removed relative to sGSR while reducing spurious negative correlations introduced in reference regions by sGSR, and attenuating inflated positive connectivity measures. We conclude that incorporating time delay information for sLFOs into global noise removal strategies is of crucial importance for optimal noise removal from resting state functional connectivity maps.
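A toy illustration of the dGSR idea, as described in the abstract, is sketched below with one-dimensional "voxels" and integer-sample lags; the published method estimates voxel-specific optimal delays far more carefully, so this is only a conceptual sketch.

```python
import numpy as np

rng = np.random.default_rng(8)
T, n_vox, max_lag = 300, 50, 10
# A slowly varying systemic oscillation that arrives at each "voxel" with a delay.
slfo = np.convolve(rng.normal(size=T + 20), np.ones(10) / 10, mode="same")[:T]
true_lags = rng.integers(0, 8, n_vox)
data = np.stack([np.roll(slfo, d) for d in true_lags], axis=1) + 0.5 * rng.normal(size=(T, n_vox))
data = data - data.mean(axis=0)            # demean so a slope-only regression suffices
global_sig = data.mean(axis=1)

cleaned = np.empty_like(data)
for v in range(n_vox):
    # Choose the delay that maximizes correlation between the shifted global signal and this voxel.
    corrs = [np.corrcoef(np.roll(global_sig, d), data[:, v])[0, 1] for d in range(max_lag)]
    g = np.roll(global_sig, int(np.argmax(corrs)))
    beta = g @ data[:, v] / (g @ g)        # regress the voxel on the delayed global signal
    cleaned[:, v] = data[:, v] - beta * g

print("fraction of variance removed:", 1 - cleaned.var() / data.var())
```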
NASA Astrophysics Data System (ADS)
Rajab, Jasim M.; MatJafri, M. Z.; Lim, H. S.
2013-06-01
This study encompasses columnar ozone modelling in peninsular Malaysia. A data set of eight atmospheric parameters [air surface temperature (AST), carbon monoxide (CO), methane (CH4), water vapour (H2Ovapour), skin surface temperature (SSKT), atmosphere temperature (AT), relative humidity (RH), and mean surface pressure (MSP)], retrieved from NASA's Atmospheric Infrared Sounder (AIRS) for the period 2003-2008, was employed to develop models to predict the value of columnar ozone (O3) in the study area. A combined method, based on multiple regression together with principal component analysis (PCA) modelling, was used to predict columnar ozone. This combined approach was utilized to improve the prediction accuracy of columnar ozone. Separate analyses were carried out for the north east monsoon (NEM) and south west monsoon (SWM) seasons. O3 was negatively correlated with CH4, H2Ovapour, RH, and MSP, whereas it was positively correlated with CO, AST, SSKT, and AT during both the NEM and SWM season periods. Multiple regression analysis was used to fit the columnar ozone data using the atmospheric parameters as predictors. A variable selection method based on high loadings of varimax-rotated principal components was used to acquire subsets of the predictor variables to be included in the linear regression model of the atmospheric parameters. It was found that an increase in the columnar O3 value is associated with an increase in the values of AST, SSKT, AT, and CO and with a drop in the levels of CH4, H2Ovapour, RH, and MSP. Fitting the best models for the columnar O3 value using eight of the independent variables gave about the same values of R (≈0.93) and R2 (≈0.86) for both the NEM and SWM seasons. The common variables that appeared in both regression equations were SSKT, CH4 and RH, and the principal precursor of the columnar O3 value in both the NEM and SWM seasons was SSKT.
Methodology for Estimation of Flood Magnitude and Frequency for New Jersey Streams
Watson, Kara M.; Schopp, Robert D.
2009-01-01
Methodologies were developed for estimating flood magnitudes at the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals for unregulated or slightly regulated streams in New Jersey. Regression equations that incorporate basin characteristics were developed to estimate flood magnitude and frequency for streams throughout the State by use of a generalized least squares regression analysis. Relations between flood-frequency estimates based on streamflow-gaging-station discharge and basin characteristics were determined by multiple regression analysis, and weighted by effective years of record. The State was divided into five hydrologically similar regions to refine the regression equations. The regression analysis indicated that flood discharge, as determined by the streamflow-gaging-station annual peak flows, is related to the drainage area, main channel slope, percentage of lake and wetland areas in the basin, population density, and the flood-frequency region, at the 95-percent confidence level. The standard errors of estimate for the various recurrence-interval floods ranged from 48.1 to 62.7 percent. Annual-maximum peak flows observed at streamflow-gaging stations through water year 2007 and basin characteristics determined using geographic information system techniques for 254 streamflow-gaging stations were used for the regression analysis. Drainage areas of the streamflow-gaging stations range from 0.18 to 779 mi2. Peak-flow data and basin characteristics for 191 streamflow-gaging stations located in New Jersey were used, along with peak-flow data for stations located in adjoining States, including 25 stations in Pennsylvania, 17 stations in New York, 16 stations in Delaware, and 5 stations in Maryland. Streamflow records for selected stations outside of New Jersey were included in the present study because hydrologic, physiographic, and geologic boundaries commonly extend beyond political boundaries. The StreamStats web application was developed cooperatively by the U.S. Geological Survey and the Environmental Systems Research Institute, Inc., and was designed for national implementation. This web application has been recently implemented for use in New Jersey. This program used in conjunction with a geographic information system provides the computation of values for selected basin characteristics, estimates of flood magnitudes and frequencies, and statistics for stream locations in New Jersey chosen by the user, whether the site is gaged or ungaged.
Moro, Marilyn; Goparaju, Balaji; Castillo, Jelina; Alameddine, Yvonne; Bianchi, Matt T
2016-01-01
Introduction Periodic limb movements of sleep (PLMS) may increase cardiovascular and cerebrovascular morbidity. However, most people with PLMS are either asymptomatic or have nonspecific symptoms. Therefore, predicting elevated PLMS in the absence of restless legs syndrome remains an important clinical challenge. Methods We undertook a retrospective analysis of demographic data, subjective symptoms, and objective polysomnography (PSG) findings in a clinical cohort with or without obstructive sleep apnea (OSA) from our laboratory (n=443 with OSA, n=209 without OSA). Correlation analysis and regression modeling were performed to determine predictors of periodic limb movement index (PLMI). Markov decision analysis with TreeAge software compared strategies to detect PLMS: in-laboratory PSG, at-home testing, and a clinical prediction tool based on the regression analysis. Results Elevated PLMI values (>15 per hour) were observed in >25% of patients. PLMI values in No-OSA patients correlated with age, sex, self-reported nocturnal leg jerks, restless legs syndrome symptoms, and hypertension. In OSA patients, PLMI correlated only with age and self-reported psychiatric medications. Regression models indicated only a modest predictive value of demographics, symptoms, and clinical history. Decision modeling suggests that at-home testing is favored as the pretest probability of PLMS increases, given plausible assumptions regarding PLMS morbidity, costs, and assumed benefits of pharmacological therapy. Conclusion Although elevated PLMI values were commonly observed, routinely acquired clinical information had only weak predictive utility. As the clinical importance of elevated PLMI continues to evolve, it is likely that objective measures such as PSG or at-home PLMS monitors will prove increasingly important for clinical and research endeavors. PMID:27540316
Zhu, Hongxiao; Morris, Jeffrey S; Wei, Fengrong; Cox, Dennis D
2017-07-01
Many scientific studies measure different types of high-dimensional signals or images from the same subject, producing multivariate functional data. These functional measurements carry different types of information about the scientific process, and a joint analysis that integrates information across them may provide new insights into the underlying mechanism for the phenomenon under study. Motivated by fluorescence spectroscopy data in a cervical pre-cancer study, a multivariate functional response regression model is proposed, which treats multivariate functional observations as responses and a common set of covariates as predictors. This novel modeling framework simultaneously accounts for correlations between functional variables and for potential multi-level structures in the data that are induced by the experimental design. The model is fitted by performing a two-stage linear transformation: a basis expansion of each functional variable followed by principal component analysis of the concatenated basis coefficients. This transformation effectively reduces the intra- and inter-function correlations and facilitates fast and convenient calculation. A fully Bayesian approach is adopted to sample the model parameters in the transformed space, and posterior inference is performed after inverse-transforming the regression coefficients back to the original data domain. The proposed approach produces functional tests that flag local regions on the functional effects, while controlling the overall experiment-wise error rate or false discovery rate. It also enables functional discriminant analysis through posterior predictive calculation. Analysis of the fluorescence spectroscopy data reveals local regions with differential expression across the pre-cancer and normal samples. These regions may serve as biomarkers for prognosis and disease assessment.
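A bare-bones sketch of the first-stage transformation described above is given below (basis expansion of each functional variable, then PCA on the concatenated coefficients); the Fourier basis and synthetic curves are stand-in assumptions, and the Bayesian regression stage is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(9)
t = np.linspace(0, 1, 100)
n_subj, n_func = 30, 2

# Two synthetic functional variables per subject (e.g., two spectra), 100 points each.
curves = [np.sin(2 * np.pi * t) * rng.normal(1.0, 0.2, (n_subj, 1))
          + 0.1 * rng.normal(size=(n_subj, t.size)) for _ in range(n_func)]

# Stage 1a: expand each curve in a small Fourier basis (an assumed basis choice).
freqs = np.arange(1, 5)
design = np.column_stack([np.ones_like(t)]
                         + [f(2 * np.pi * k * t) for k in freqs for f in (np.sin, np.cos)])
coefs = [np.linalg.lstsq(design, c.T, rcond=None)[0].T for c in curves]   # (n_subj, n_basis) each

# Stage 1b: PCA on the concatenated basis coefficients across functional variables.
scores = PCA(n_components=5).fit_transform(np.hstack(coefs))
print("PC score matrix shape:", scores.shape)   # these scores would feed the regression stage
```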
Liu, Zhi-yu; Zhong, Meng; Hai, Yan; Du, Qi-yun; Wang, Ai-hua; Xie, Dong-hua
2012-11-01
To understand the prevalence of depression and its related influencing factors among medical staff in Hunan province. Data were collected through multi-stage stratified cluster random sampling. The Wilcoxon rank sum test, Kruskal-Wallis H test, and ordinal regression analysis were used for data analysis with SPSS 17.0 software. The survey included 16,000 medical personnel and yielded 14,988 valid questionnaires, an effective response rate of 93.68%. Single-factor analysis showed that factors including hospital grade, gender, education background, age, occupation, title, department, amount of continuing education, income, weekly overtime, frequency of night work, and number of patients treated in the emergency room had statistically significant associations with depression (P < 0.05). Ordinal regression showed that clinicians and nurses were 1.58 times more likely to suffer from depression than pharmacists (OR = 1.58, 95%CI: 1.30 - 1.92). The probability among those whose income was less than 2000 Yuan/month was 2.19 times that of those who earned more than 3000 Yuan/month (OR = 2.19, 95%CI: 2.05 - 2.35). More days of overtime per week, more frequent night work, and larger numbers of patients treated in the emergency room were all associated with a higher probability of depression. Depression appeared to be common among doctors and nurses. We suggest that the government increase monthly income, reduce workload and intensity, and lessen overtime, among other measures.
Regression Analysis by Example. 5th Edition
ERIC Educational Resources Information Center
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Sun, Tao; Wang, Lingxiang; Guo, Changzhi; Zhang, Guochuan; Hu, Wenhai
2017-05-02
Malignant tumors in the proximal fibula are rare but life-threatening; however, biopsy is not routine due to the high risk of peroneal nerve injury. Our aim was to determine preoperative clinical indicators of malignancy. Between 2004 and 2016, 52 consecutive patients with proximal fibular tumors were retrospectively reviewed. Details of the clinicopathological characteristics, including age, gender, location of tumors, presenting symptoms, duration of symptoms, and pathological diagnosis, were collected. Descriptive statistics were calculated, and univariate and multivariate regression were performed. Of these 52 patients, 84.6% had benign tumors and 15.4% malignant tumors. The most common benign tumors were osteochondromas (46.2%), followed by enchondromas (13.5%) and giant cell tumors (13.5%). The most common malignancy was osteosarcoma (11.5%). The most common presenting symptoms were a palpable mass (52.0%) and pain (46.2%). Pain was the most sensitive indicator (100%) and the fourth most specific (64%); both high skin temperature and peroneal nerve compression had the highest specificity (98%) and the third-highest sensitivity (64%); change in symptoms had the second-highest specificity (89%) with 50% sensitivity. Using multivariate regression, palpable pain, high skin temperature, and peroneal nerve compression symptoms were predictors of malignancy. Most tumors in the proximal fibula are benign, and malignancy is rare. Palpable pain, peroneal nerve compression symptoms, and high skin temperature were specific in predicting malignancy.
Spatial Assessment of Model Errors from Four Regression Techniques
Lianjun Zhang; Jeffrey H. Gove; Jeffrey H. Gove
2005-01-01
Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...
Simultaneous Estimation of Regression Functions for Marine Corps Technical Training Specialties.
ERIC Educational Resources Information Center
Dunbar, Stephen B.; And Others
This paper considers the application of Bayesian techniques for simultaneous estimation to the specification of regression weights for selection tests used in various technical training courses in the Marine Corps. Results of a method for m-group regression developed by Molenaar and Lewis (1979) suggest that common weights for training courses…
Regression Effects in Angoff Ratings: Examples from Credentialing Exams
ERIC Educational Resources Information Center
Wyse, Adam E.
2018-01-01
This article discusses regression effects that are commonly observed in Angoff ratings where panelists tend to think that hard items are easier than they are and easy items are more difficult than they are in comparison to estimated item difficulties. Analyses of data from two credentialing exams illustrate these regression effects and the…
Zaggia, Luca; Lorenzetti, Giuliano; Manfé, Giorgia; Scarpa, Gian Marco; Molinaroli, Emanuela; Parnell, Kevin Ellis; Rapaglia, John Paul; Gionta, Maria; Soomere, Tarmo
2017-01-01
An investigation based on in-situ surveys combined with remote sensing and GIS analysis revealed fast shoreline retreat on the side of a major waterway, the Malamocco Marghera Channel, in the Lagoon of Venice, Italy. Monthly and long-term regression rates caused by ship wakes in a reclaimed industrial area were considered. The short-term analysis, based on field surveys carried out between April 2014 and January 2015, revealed that the speed of shoreline regression was not significantly dependent on the distance from the navigation channel, but was not constant through time. Periods of high water levels due to tidal forcing or storm surges, more common in the winter season, are characterized by faster regression rates. The retreat is a discontinuous process in time and space depending on the morpho-stratigraphy and the vegetation cover of the artificial deposits. A GIS analysis performed with the available imagery shows an average retreat of 3-4 m/yr in the period between 1974 and 2015. Digitization of historical maps and bathymetric surveys made in April 2015 enabled the construction of two digital terrain models for both past and present situations. The two models have been used to calculate the total volume of sediment lost during the period 1968-2015 (1.19×10⁶ m³). The results show that in the presence of heavy ship traffic, ship-channel interactions can dominate the morphodynamics of a waterway and its margins. The analysis enables a better understanding of how shallow-water systems react to human activities in the post-industrial period. An adequate evaluation of the temporal and spatial variation of shoreline position is also crucial for the development of future scenarios and for the sustainable management of port traffic worldwide.
Use of psychotherapy in a representative adult community sample in São Paulo, Brazil
Blay, Sergio L.; Fillenbaum, Gerda G.; da Silva, Paula Freitas R.; Peluso, Erica T.
2014-01-01
Little is known about the use of psychotherapy to treat common mental disorders in a major city in a middle income country. Data come from in-home interviews with a stratified random sample of 2,000 community residents age 18–65 in the city of São Paulo, Brazil. The information obtained included sociodemographic characteristics; psychotropic drugs; mental status; and lifetime, previous 12 months, and current use of psychotherapy. Logistic regression was used to examine determinants of use of psychotherapy. Of the sample, 22.7% met General Health Questionnaire-12 criteria for common mental disorders. Lifetime, previous 12 months, and current use of psychotherapy were reported by 14.6%, 4.6%, and 2.3% of the sample respectively. Users were typically women, more educated, higher income, not married, unemployed, with common mental disorders. Further analysis found that 47% (with higher education and income) paid out-of-pocket, and 53% used psychotropic medication. Psychotherapy does not appear to be the preferred treatment for common mental disorders. PMID:25118139
Thieler, E. Robert; Himmelstoss, Emily A.; Zichichi, Jessica L.; Ergul, Ayhan
2009-01-01
The Digital Shoreline Analysis System (DSAS) version 4.0 is a software extension to ESRI ArcGIS v.9.2 and above that enables a user to calculate shoreline rate-of-change statistics from multiple historic shoreline positions. A user-friendly interface of simple buttons and menus guides the user through the major steps of shoreline change analysis. Components of the extension and user guide include (1) instruction on the proper way to define a reference baseline for measurements, (2) automated and manual generation of measurement transects and metadata based on user-specified parameters, and (3) output of calculated rates of shoreline change and other statistical information. DSAS computes shoreline rates of change using four different methods: (1) endpoint rate, (2) simple linear regression, (3) weighted linear regression, and (4) least median of squares. The standard error, correlation coefficient, and confidence interval are also computed for the simple and weighted linear-regression methods. The results of all rate calculations are output to a table that can be linked to the transect file by a common attribute field. DSAS is intended to facilitate the shoreline change-calculation process and to provide rate-of-change information and the statistical data necessary to establish the reliability of the calculated results. The software is also suitable for any generic application that calculates positional change over time, such as assessing rates of change of glacier limits in sequential aerial photos, river edge boundaries, land-cover changes, and so on.
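The endpoint-rate and the simple and weighted linear-regression rates that DSAS reports can be sketched for a single transect as below; the survey dates, shoreline positions, and positional uncertainties are hypothetical, and statsmodels stands in for the computations DSAS performs internally.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical shoreline positions (m) along one transect, with survey dates
# in decimal years and a positional uncertainty (m) for each survey.
years = np.array([1974.5, 1988.2, 2002.7, 2010.3, 2015.4])
position = np.array([120.0, 95.0, 68.0, 55.0, 42.0])
uncert = np.array([10.0, 8.0, 5.0, 3.0, 1.0])

# End point rate: change between the oldest and most recent shoreline only.
epr = (position[-1] - position[0]) / (years[-1] - years[0])

# Simple linear regression rate (ordinary least squares slope).
X = sm.add_constant(years)
ols = sm.OLS(position, X).fit()

# Weighted linear regression: weights are the inverse position variances.
wls = sm.WLS(position, X, weights=1.0 / uncert**2).fit()

print(f"EPR {epr:.2f} m/yr, LRR {ols.params[1]:.2f} m/yr, "
      f"WLR {wls.params[1]:.2f} m/yr, R2 {ols.rsquared:.2f}")
```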
Lorio, Morgan; Martinson, Melissa; Ferrara, Lisa
2016-01-01
Minimally invasive sacroiliac joint arthrodesis ("MI SIJ fusion") received a Category I CPT® code (27279) effective January 1, 2015 and was assigned a work relative value unit ("RVU") of 9.03. The International Society for the Advancement of Spine Surgery ("ISASS") conducted a study consisting of a Rasch analysis of two separate surveys of surgeons to assess the accuracy of the assigned work RVU. A survey was developed and sent to ninety-three ISASS surgeon committee members. Respondents were asked to compare CPT® 27279 to ten other comparator CPT® codes reflective of common spine surgeries. The survey presented each comparator CPT® code with its code descriptor as well as the description of CPT® 27279 and asked respondents to indicate whether CPT® 27279 was greater, equal, or less in terms of work effort than the comparator code. A second survey was sent to 557 U.S.-based spine surgeon members of ISASS and 241 spine surgeon members of the Society for Minimally Invasive Spine Surgery ("SMISS"). The design of the second survey mirrored that of the first survey except for the use of a broader set of comparator CPT® codes (27 vs. 10). Using the work RVUs of the comparator codes, a Rasch analysis was performed to estimate the relative difficulty of CPT® 27279, after which the work RVU of CPT® 27279 was estimated by regression analysis. Twenty surgeons responded to the first survey and thirty-four surgeons responded to the second survey. The results of the regression analysis of the first survey indicate a work RVU for CPT® 27279 of 14.36 and the results of the regression analysis of the second survey indicate a work RVU for CPT® 27279 of 14.1. The Rasch analysis indicates that the current work RVU assigned to CPT® 27279 is undervalued at 9.03. Averaging the results of the regression analyses of the two surveys indicates a work RVU for CPT® 27279 of 14.23.
Hemilä, Harri; Fitzgerald, James T; Petrus, Edward J; Prasad, Ananda
2017-01-01
A previous meta-analysis of 3 zinc acetate lozenge trials estimated that colds were on average 40% shorter for the zinc groups. However, the duration of colds is a time outcome, and survival analysis may be a more informative approach. The objective of this individual patient data (IPD) meta-analysis was to estimate the effect of zinc acetate lozenges on the rate of recovery from colds. We analyzed IPD for 3 randomized placebo-controlled trials in which 80-92 mg/day of elemental zinc were administered as zinc acetate lozenges to 199 common cold patients. We used mixed-effects Cox regression to estimate the effect of zinc. Patients administered zinc lozenges recovered faster by rate ratio 3.1 (95% confidence interval, 2.1-4.7). The effect was not modified by age, sex, race, allergy, smoking, or baseline common cold severity. On the 5th day, 70% of the zinc patients had recovered compared with 27% of the placebo patients. Accordingly, 2.6 times more patients were cured in the zinc group. The difference also corresponds to the number needed to treat of 2.3 on the 5th day. None of the studies observed serious adverse effects of zinc. The 3-fold increase in the rate of recovery from the common cold is a clinically important effect. The optimal formulation of zinc lozenges and an ideal frequency of their administration should be examined. Given the evidence of efficacy, common cold patients may be instructed to try zinc acetate lozenges within 24 hours of onset of symptoms. © The Author 2017. Published by Oxford University Press on behalf of Infectious Diseases Society of America.
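A hedged sketch of a Cox regression for time to recovery, fitted with the lifelines package on invented data; the published analysis used a mixed-effects Cox model across the three trials, which this simplified, single-sample version does not reproduce.

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical patient-level data: days until recovery, a recovery indicator
# (0 = still symptomatic at last follow-up), and treatment (1 = zinc lozenge).
df = pd.DataFrame({
    "days_to_recovery": [3, 4, 5, 7, 9, 4, 6, 8, 10, 12],
    "recovered":        [1, 1, 1, 1, 0, 1, 1, 1, 1, 0],
    "zinc":             [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="days_to_recovery", event_col="recovered")
cph.print_summary()  # exp(coef) for "zinc" is the recovery rate ratio
```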
The Relationship between Grandiose and Vulnerable (Hypersensitive) Narcissism
Jauk, Emanuel; Weigle, Elena; Lehmann, Konrad; Benedek, Mathias; Neubauer, Aljoscha C.
2017-01-01
Narcissistic grandiosity is characterized by overt expressions of feelings of superiority and entitlement, while narcissistic vulnerability reflects hypersensitivity and introversive self-absorbedness. Clinical evidence suggests that grandiosity is accompanied by vulnerable aspects, pointing to a common foundation. Subclinical personality research, however, views grandiose and vulnerable narcissism as independent traits. Grandiose narcissism displays substantial correlation with extraversion, while vulnerable narcissism correlates highly with introversion. We investigated if (1) controlling for intro-/extraversion might reveal a “common core” of grandiose and vulnerable narcissism, and if (2) the correlation between both aspects might be higher at higher levels of narcissism. Latent variable structural equation modeling and segmented regression analysis confirmed these hypotheses in a large non-clinical sample (N = 1,006). Interindividual differences in intro-/extraversion mask the common core of grandiose and vulnerable narcissism. The association between both aspects increases at high levels (upper 10%) of grandiose narcissism, which suggests a possible transition to clinically relevant (pathological) narcissism. PMID:28955288
Common Scientific and Statistical Errors in Obesity Research
George, Brandon J.; Beasley, T. Mark; Brown, Andrew W.; Dawson, John; Dimova, Rositsa; Divers, Jasmin; Goldsby, TaShauna U.; Heo, Moonseong; Kaiser, Kathryn A.; Keith, Scott; Kim, Mimi Y.; Li, Peng; Mehta, Tapan; Oakes, J. Michael; Skinner, Asheley; Stuart, Elizabeth; Allison, David B.
2015-01-01
We identify 10 common errors and problems in the statistical analysis, design, interpretation, and reporting of obesity research and discuss how they can be avoided. The 10 topics are: 1) misinterpretation of statistical significance, 2) inappropriate testing against baseline values, 3) excessive and undisclosed multiple testing and “p-value hacking,” 4) mishandling of clustering in cluster randomized trials, 5) misconceptions about nonparametric tests, 6) mishandling of missing data, 7) miscalculation of effect sizes, 8) ignoring regression to the mean, 9) ignoring confirmation bias, and 10) insufficient statistical reporting. We hope that discussion of these errors can improve the quality of obesity research by helping researchers to implement proper statistical practice and to know when to seek the help of a statistician. PMID:27028280
[A competency model of rural general practitioners: theory construction and empirical study].
Yang, Xiu-Mu; Qi, Yu-Long; Shne, Zheng-Fu; Han, Bu-Xin; Meng, Bei
2015-04-01
To perform theory construction and an empirical study of a competency model for rural general practitioners. Through literature study, job analysis, interviews, and expert team discussion, a questionnaire of rural general practitioner competency was constructed. A total of 1458 rural general practitioners in 6 central provinces were surveyed with the questionnaire. The common factors were constructed using the principal component method of exploratory factor analysis and confirmatory factor analysis. The influence of the competency characteristics on work performance was analyzed using regression analysis. The Cronbach's alpha coefficient of the questionnaire was 0.974. The model consisted of 9 dimensions and 59 items. The 9 competency dimensions included basic public health service ability, basic clinical skills, system analysis capability, information management capability, communication and cooperation ability, occupational moral ability, non-medical professional knowledge, personal traits, and psychological adaptability. The cumulative explained variance was 76.855%. The model fit indices were χ²/df = 1.88, GFI = 0.94, NFI = 0.96, NNFI = 0.98, PNFI = 0.91, RMSEA = 0.068, CFI = 0.97, IFI = 0.97, RFI = 0.96, suggesting good model fit. Regression analysis showed that the competency characteristics had a significant effect on job performance. The rural general practitioner competency model provides a reference for rural doctor training, order-oriented training of medical students for rural service, and competency-based performance management of rural general practitioners.
An unjustified benefit: immortal time bias in the analysis of time-dependent events.
Gleiss, Andreas; Oberbauer, Rainer; Heinze, Georg
2018-02-01
Immortal time bias is a problem arising from methodologically wrong analyses of time-dependent events in survival analyses. We illustrate the problem by analysis of a kidney transplantation study. Following patients from transplantation to death, groups defined by the occurrence or nonoccurrence of graft failure during follow-up seemingly had equal overall mortality. Such naive analysis assumes that patients were assigned to the two groups at time of transplantation, which actually are a consequence of occurrence of a time-dependent event later during follow-up. We introduce landmark analysis as the method of choice to avoid immortal time bias. Landmark analysis splits the follow-up time at a common, prespecified time point, the so-called landmark. Groups are then defined by time-dependent events having occurred before the landmark, and outcome events are only considered if occurring after the landmark. Landmark analysis can be easily implemented with common statistical software. In our kidney transplantation example, landmark analyses with landmarks set at 30 and 60 months clearly identified graft failure as a risk factor for overall mortality. We give further typical examples from transplantation research and discuss strengths and limitations of landmark analysis and other methods to address immortal time bias such as Cox regression with time-dependent covariables. © 2017 Steunstichting ESOT.
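A minimal sketch of a landmark analysis on hypothetical transplant data with a 30-month landmark; all column names and values are invented, and the lifelines package is assumed for the Cox fit.

```python
import pandas as pd
from lifelines import CoxPHFitter

LANDMARK = 30.0  # months after transplantation (example landmark)

# Hypothetical follow-up data: time to death or censoring (months), a death
# indicator, and the time of graft failure (None if it never occurred).
df = pd.DataFrame({
    "months_to_death_or_censor": [20, 45, 60, 80, 35, 90, 50, 70],
    "died":                      [1,  1,  0,  1,  0,  0,  1,  0],
    "months_to_graft_failure":   [10, 40, None, 25, None, 55, 20, None],
})

# Keep only patients still under observation at the landmark.
lm = df[df["months_to_death_or_censor"] > LANDMARK].copy()

# Group membership is fixed by what happened BEFORE the landmark.
lm["failure_by_landmark"] = (
    lm["months_to_graft_failure"].notna()
    & (lm["months_to_graft_failure"] <= LANDMARK)
).astype(int)

# Restart the clock at the landmark and model only the later outcomes.
lm["time_from_landmark"] = lm["months_to_death_or_censor"] - LANDMARK
cph = CoxPHFitter()
cph.fit(lm[["time_from_landmark", "died", "failure_by_landmark"]],
        duration_col="time_from_landmark", event_col="died")
cph.print_summary()
```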
Rajbongshi, Nijara; Mahanta, Lipi B; Nath, Dilip C
2015-06-01
Breast cancer is the most commonly diagnosed cancer among the female population of Assam, India. Chewing of betel quid, with or without tobacco, is a common practice among the female population of this region. Moreover, the method of preparing the betel quid differs from other parts of the country. A matched case-control study was therefore conducted to analyse whether betel quid chewing plays a significant role in the high incidence of breast cancer in Assam. Controls were matched to the cases by age at diagnosis (±5 years), family income, and place of residence, with a matching ratio of 1:1. Conditional logistic regression models and odds ratios (OR) were used to draw conclusions. Cases were more habituated to chewing than the controls. Further, the conditional logistic regression analysis revealed that betel quid chewers face 2.353 times the risk of breast cancer of non-chewers, with a p value of 0.0003 (95% CI 1.334-4.150). Though the female population in Assam usually does not smoke, the addictive habits typical to this region have a comparable effect on the occurrence of breast cancer.
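For a 1:1 matched design with a single binary exposure, the conditional logistic odds ratio reduces to the ratio of exposure-discordant pairs, which can be sketched as below; the pair counts are hypothetical, and a model with additional covariates would instead be fitted with conditional logistic regression software (for example statsmodels' ConditionalLogit).

```python
from scipy.stats import binomtest

# Hypothetical discordant-pair counts from a 1:1 matched case-control study.
case_exposed_only = 40     # case chews betel quid, matched control does not
control_exposed_only = 17  # control chews, matched case does not

odds_ratio = case_exposed_only / control_exposed_only
# Exact test: under H0 the discordant pairs split 50/50 (McNemar exact test).
p = binomtest(case_exposed_only,
              case_exposed_only + control_exposed_only, p=0.5).pvalue
print(f"matched odds ratio = {odds_ratio:.2f}, exact p = {p:.4f}")
```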
Wolf, Alexander; Leucht, Stefan; Pajonk, Frank-Gerald
2017-04-01
Behavioural and psychological symptoms in dementia (BPSD) are common and often treated with antipsychotics, which are known to have small efficacy and to cause many side effects. One potential side effect might be cognitive decline. We searched MEDLINE, Scopus, CENTRAL and www.ClincalStudyResult.org for randomized, double-blind, placebo-controlled trials that used antipsychotics for treating BPSD and evaluated cognitive functioning. The studies identified were summarized in a meta-analysis with the standardized mean difference (SMD, Hedges's g) as the effect size. Meta-regression was additionally performed to identify associated factors. Ten studies provided data on the course of cognitive functioning. The random-effects model of the pooled analysis showed a nonsignificant effect (SMD = -0.065, 95% CI -0.186 to 0.057, I² = 41%). Meta-regression revealed a significant correlation between cognitive impairment and treatment duration (R² = 0.78, p < 0.02) as well as baseline MMSE (R² = 0.92, p < 0.005). These correlations depend on only two out of ten studies and should be interpreted cautiously.
Patterson, Megan S; Goodson, Patricia
2017-05-01
Compulsive exercise, a form of unhealthy exercise often associated with prioritizing exercise and feeling guilty when exercise is missed, is a common precursor to and symptom of eating disorders. College-aged women are at high risk of exercising compulsively compared with other groups. Social network analysis (SNA) is a theoretical perspective and methodology allowing researchers to observe the effects of relational dynamics on the behaviors of people. SNA was used to assess the relationship between compulsive exercise and body dissatisfaction, physical activity, and network variables. Descriptive statistics were conducted using SPSS, and quadratic assignment procedure (QAP) analyses were conducted using UCINET. QAP regression analysis revealed a statistically significant model (R² = .375, P < .0001) predicting compulsive exercise behavior. Physical activity, body dissatisfaction, and network variables were statistically significant predictor variables in the QAP regression model. In our sample, women who are connected to "important" or "powerful" people in their network are likely to have higher compulsive exercise scores. This result provides healthcare practitioners key target points for intervention within similar groups of women. For scholars researching eating disorders and associated behaviors, this study supports looking into group dynamics and network structure in conjunction with body dissatisfaction and exercise frequency.
Duda, Piotr; Jaworski, Maciej; Rutkowski, Leszek
2018-03-01
One of the greatest challenges in data mining is related to processing and analysis of massive data streams. Contrary to traditional static data mining problems, data streams require that each element is processed only once, the amount of allocated memory is constant and the models incorporate changes of investigated streams. A vast majority of available methods have been developed for data stream classification and only a few of them attempted to solve regression problems, using various heuristic approaches. In this paper, we develop mathematically justified regression models working in a time-varying environment. More specifically, we study incremental versions of generalized regression neural networks, called IGRNNs, and we prove their tracking properties: weak (in probability) and strong (with probability one) convergence assuming various concept drift scenarios. First, we present the IGRNNs, based on the Parzen kernels, for modeling stationary systems under nonstationary noise. Next, we extend our approach to modeling time-varying systems under nonstationary noise. We present several types of concept drift to be handled by our approach in such a way that weak and strong convergence holds under certain conditions. Finally, in a series of simulations, we compare our method with commonly used heuristic approaches, based on forgetting mechanisms or sliding windows, to deal with concept drift. We then apply our concept in a real-life scenario, solving the problem of currency exchange rate prediction.
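The kernel regression idea underlying a GRNN can be sketched in batch form as a Nadaraya-Watson estimate with Gaussian (Parzen) kernels; the data and bandwidth below are illustrative, and the incremental updating and drift handling of the IGRNNs are only indicated in the final comment.

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, sigma=0.5):
    """Nadaraya-Watson / GRNN-style estimate: a Gaussian-kernel-weighted
    average of the training targets around each query point."""
    d2 = (x_query[:, None] - x_train[None, :]) ** 2   # squared distances
    w = np.exp(-d2 / (2.0 * sigma ** 2))              # kernel weights
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(1)
x = rng.uniform(0, 2 * np.pi, 300)
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)    # noisy regression target

x_new = np.linspace(0, 2 * np.pi, 5)
print(np.round(grnn_predict(x, y, x_new), 3))
# An incremental (streaming) variant would update the kernel sums as new
# elements arrive, optionally down-weighting old data under concept drift.
```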
ERIC Educational Resources Information Center
Shear, Benjamin R.; Zumbo, Bruno D.
2013-01-01
Type I error rates in multiple regression, and hence the chance for false positive research findings, can be drastically inflated when multiple regression models are used to analyze data that contain random measurement error. This article shows the potential for inflated Type I error rates in commonly encountered scenarios and provides new…
What Are the Odds of that? A Primer on Understanding Logistic Regression
ERIC Educational Resources Information Center
Huang, Francis L.; Moon, Tonya R.
2013-01-01
The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…
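A brief illustration of the core idea on simulated data: fit a logistic regression and exponentiate the coefficients to read them as odds ratios. The variables are invented and do not correspond to the NELS:88 measures used in the brief.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
# Hypothetical student-level data: a dichotomous outcome generated from a
# test score and an indicator for female students.
score = rng.normal(50, 10, n)
female = rng.integers(0, 2, n)
true_logit = -6 + 0.1 * score + 0.3 * female
y = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

X = sm.add_constant(pd.DataFrame({"score": score, "female": female}))
fit = sm.Logit(y, X).fit(disp=0)
# exp(coefficient) = multiplicative change in the odds per one-unit increase.
print(np.exp(fit.params).round(2))
```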
Global Prevalence of Elder Abuse: A Meta-analysis and Meta-regression.
Ho, C Sh; Wong, S Y; Chiu, M M; Ho, R Cm
2017-06-01
Elder abuse is increasingly recognised as a global public health and social problem. There has been limited inter-study comparison of the prevalence and risk factors for elder abuse. This study aimed to estimate the pooled and subtype prevalence of elder abuse worldwide and identify significant associated risk factors. We conducted a meta-analysis and meta-regression of 34 population-based and 17 non-population-based studies. The pooled prevalences of elder abuse were 10.0% (95% confidence interval, 5.2%-18.6%) and 34.3% (95% confidence interval, 22.9%-47.8%) in population-based studies and third party- or caregiver-reported studies, respectively. Being in a marital relationship was found to be a significant moderator using a random-effects model. This meta-analysis revealed that third parties or caregivers were more likely to report abuse than older abused adults. Subgroup analyses showed that females and those resident in non-western countries were more likely to be abused. Emotional abuse was the most prevalent elder abuse subtype and financial abuse was less commonly reported by third parties or caregivers. Heterogeneity in the prevalence was due to the high proportion of married older adults in the sample. Subgroup analysis showed that cultural factors, subtypes of abuse, and gender also contributed to heterogeneity in the pooled prevalence of elder abuse.
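A sketch of random-effects pooling of prevalences in the widely used DerSimonian-Laird form, with hypothetical study prevalences pooled on the logit scale; the published meta-analysis may have used a different transformation or software.

```python
import numpy as np

def dersimonian_laird(y, v):
    """DerSimonian-Laird random-effects pooling of estimates y with
    within-study variances v; returns the pooled estimate, its SE, and tau^2."""
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)
    Q = np.sum(w * (y - fixed) ** 2)                 # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (len(y) - 1)) / c)          # between-study variance
    w_star = 1.0 / (v + tau2)
    pooled = np.sum(w_star * y) / np.sum(w_star)
    return pooled, np.sqrt(1.0 / np.sum(w_star)), tau2

# Hypothetical study prevalences and sample sizes, pooled on the logit scale.
p = np.array([0.06, 0.10, 0.14, 0.08, 0.21])
n = np.array([900, 1200, 400, 2100, 350])
y = np.log(p / (1 - p))
v = 1.0 / (n * p) + 1.0 / (n * (1 - p))              # variance of a logit

est, se, tau2 = dersimonian_laird(y, v)

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

print(f"pooled prevalence {expit(est):.3f} "
      f"(95% CI {expit(est - 1.96 * se):.3f}-{expit(est + 1.96 * se):.3f})")
```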
Smith, E M D; Jorgensen, A L; Beresford, M W
2017-10-01
Background: Lupus nephritis (LN) affects up to 80% of juvenile-onset systemic lupus erythematosus (JSLE) patients. The value of commonly available biomarkers, such as anti-dsDNA antibodies, complement (C3/C4), ESR and full blood count parameters in the identification of active LN remains uncertain. Methods: Participants from the UK JSLE Cohort Study, aged <16 years at diagnosis, were categorized as having active or inactive LN according to the renal domain of the British Isles Lupus Assessment Group score. Classic biomarkers: anti-dsDNA, C3, C4, ESR, CRP, haemoglobin, total white cells, neutrophils, lymphocytes, platelets and immunoglobulins were assessed for their ability to identify active LN using binary logistic regression modeling, with the stepAIC function applied to select a final model. Receiver-operating curve analysis was used to assess diagnostic accuracy. Results: A total of 370 patients were recruited; 191 (52%) had active LN and 179 (48%) had inactive LN. Binary logistic regression modeling demonstrated a combination of ESR, C3, white cell count, neutrophils, lymphocytes and IgG to be best for the identification of active LN (area under the curve 0.724). Conclusions: At best, combining common classic blood biomarkers of lupus activity using multivariate analysis provides a 'fair' ability to identify active LN. Urine biomarkers were not included in these analyses. These results add to the concern that classic blood biomarkers are limited in monitoring discrete JSLE manifestations such as LN.
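A minimal sketch of the model-then-ROC workflow on simulated biomarker data, using scikit-learn's logistic regression with cross-validated probabilities rather than the stepAIC variable selection reported above; all values are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(3)
n = 370
# Hypothetical standardized biomarker panel (ESR, C3, white cell count,
# neutrophils, lymphocytes, IgG) and an active-LN indicator.
X = rng.normal(size=(n, 6))
true_logit = -0.2 + X @ np.array([0.5, -0.6, 0.3, 0.3, -0.2, 0.4])
y = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

model = LogisticRegression(max_iter=1000)
# Cross-validated probabilities give a less optimistic AUC than scoring the
# model on the same data it was fitted to.
proba = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
print(f"area under the ROC curve = {roc_auc_score(y, proba):.3f}")
```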
Professional Burnout and Concurrent Health Complaints in Neonatal Nursing.
Skorobogatova, Natalija; Žemaitienė, Nida; Šmigelskas, Kastytis; Tamelienė, Rasa
2017-01-01
The aim of this study was to analyze nurses' professional burnout and health complaints and the relationship between the two components. The anonymous survey included 94 neonatal intensive care nurses from two centers of perinatology. The Maslach Burnout Inventory-Human Services Survey (MBI-HSS) was used to evaluate professional burnout; it consisted of 3 components, Emotional Exhaustion, Depersonalization, and Personal Accomplishments, with 22 items in total. Health complaints were evaluated by 21 items, where nurses were asked to report the occurrence of symptoms within the last year. Scale means were presented with standard deviations (SD). Inferential analysis was conducted with multivariate logistic regression, adjusting for age, residence, and work experience. The mean score of professional burnout on the Emotional Exhaustion subscale was 14.4 (SD=7.91), Depersonalization 3.8 (SD=4.75), and Personal Accomplishment 29.1 (SD=10.12). The health assessment revealed that sleeplessness, lack of rest, nervousness, and tiredness were the most common complaints. The regression analysis revealed that tiredness was independently associated with significantly increased odds of professional burnout (OR=4.1). In our study, more than half of the nurses in neonatal intensive care had moderate or high levels of emotional exhaustion, while levels of depersonalization were significantly lower. In contrast, the level of personal accomplishment was low in more than half of the nurses. The most common health complaints were sleep disturbances, nervousness, and tiredness. Tiredness was most strongly associated with professional burnout.
Hong, J; Chen, J; Sun, X; Deng, S X; Chen, L; Gong, L; Cao, W; Yu, X; Xu, J
2012-01-01
Purpose: The purpose of this study was to review the microbiological profile, in vitro antibiotic susceptibility and visual outcomes of paediatric microbial keratitis in Shanghai, China over the past 6 years. Methods: Medical records of patients aged ≤16 years who were diagnosed as having bacterial keratitis between 1 January 2005 and 31 December 2010 were reviewed. Bacterial culture results and in vitro antibiotic susceptibility were analysed. A logistic regression analysis was conducted to evaluate the relationship between visual impairment and possible risk factors. Results: Eighty consecutive cases of paediatric bacterial keratitis were included, among which 59 had positive cultures. Staphylococcus epidermidis was the most commonly isolated organism (n=23; 39.0%), followed by Streptococcus pneumoniae (n=11; 18.6%) and Pseudomonas aeruginosa (n=6; 10.2%). Antibiotic sensitivities revealed that the tested bacteria had low resistance rates to fluoroquinolones and aminoglycosides (8.3–18.4% and 12.5–24.4%, respectively). Multivariate logistic regression analysis showed that visual impairment was significantly associated with Gram-negative bacterial infection (odds ratio (OR)=7.626; P=0.043) and an increasing number of resistant antibiotics (OR=0.385; P=0.040). Conclusions: S. epidermidis was the most common isolated organism in Shanghai paediatric keratitis. The fluoroquinolones and aminoglycosides remained good choices for treating these patients. Gram-negative bacterial infection and an increasing number of resistant antibiotics were associated with worse visual prognoses in paediatric keratitis. PMID:23079751
Roine, Antti; Saviauk, Taavi; Kumpulainen, Pekka; Karjalainen, Markus; Tuokko, Antti; Aittoniemi, Janne; Vuento, Risto; Lekkala, Jukka; Lehtimäki, Terho; Tammela, Teuvo L; Oksala, Niku K J
2014-01-01
Urinary tract infection (UTI) is a common disease with significant morbidity and economic burden, accounting for a significant part of the workload in clinical microbiology laboratories. Current clinical chemistry point-of-care diagnostics rely on imperfect dipstick analysis, which only provides indirect and insensitive evidence of urinary bacterial pathogens. An electronic nose (eNose) is a handheld device mimicking mammalian olfaction that potentially offers affordable and rapid analysis of samples without preparation at atmospheric pressure. In this study we demonstrate the applicability of an ion mobility spectrometry (IMS)-based eNose to discriminate the most common UTI pathogens from the gaseous headspace of culture plates rapidly and without sample preparation. We gathered a total of 101 culture samples containing the four most common UTI bacteria (E. coli, S. saprophyticus, E. faecalis, and Klebsiella spp.) as well as sterile culture plates. The samples were analyzed using the ChemPro 100i device, consisting of an IMS cell and six semiconductor sensors. Data analysis was conducted by linear discriminant analysis (LDA) and logistic regression (LR). The results were validated by leave-one-out and 5-fold cross validation analysis. In discrimination of sterile and bacterial samples, a sensitivity of 95% and specificity of 97% were achieved. The bacterial species were identified with a sensitivity of 95% and specificity of 96% using the eNose, as compared with urine bacterial cultures. These findings strongly demonstrate the ability of our eNose to discriminate bacterial cultures and provide a proof of principle for using this method in urinalysis of UTI.
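A sketch of the classification-with-cross-validation step using linear discriminant analysis in scikit-learn; the feature matrix and labels below are simulated placeholders for the sensor data, and the logistic regression arm of the analysis is omitted.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(8)
# Hypothetical eNose feature matrix: 101 headspace samples x 20 sensor
# features, labeled as one of four bacterial species or a sterile plate.
X = rng.normal(size=(101, 20))
y = rng.integers(0, 5, size=101)

lda = LinearDiscriminantAnalysis()
acc_5fold = cross_val_score(lda, X, y, cv=5).mean()
acc_loo = cross_val_score(lda, X, y, cv=LeaveOneOut()).mean()
print(f"5-fold accuracy {acc_5fold:.2f}, leave-one-out accuracy {acc_loo:.2f}")
```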
Martins, Filipe C; Santiago, Ines de; Trinh, Anne; Xian, Jian; Guo, Anne; Sayal, Karen; Jimenez-Linan, Mercedes; Deen, Suha; Driver, Kristy; Mack, Marie; Aslop, Jennifer; Pharoah, Paul D; Markowetz, Florian; Brenton, James D
2014-12-17
TP53 and BRCA1/2 mutations are the main drivers in high-grade serous ovarian carcinoma (HGSOC). We hypothesise that combining tissue phenotypes from image analysis of tumour sections with genomic profiles could reveal other significant driver events. Automatic estimates of stromal content combined with genomic analysis of TCGA HGSOC tumours show that stroma strongly biases estimates of PTEN expression. Tumour-specific PTEN expression was tested in two independent cohorts using tissue microarrays containing 521 cases of HGSOC. PTEN loss or downregulation occurred in 77% of the first cohort by immunofluorescence and 52% of the validation group by immunohistochemistry, and is associated with worse survival in a multivariate Cox-regression model adjusted for study site, age, stage and grade. Reanalysis of TCGA data shows that hemizygous loss of PTEN is common (36%) and expression of PTEN and expression of androgen receptor are positively associated. Low androgen receptor expression was associated with reduced survival in data from TCGA and immunohistochemical analysis of the first cohort. PTEN loss is a common event in HGSOC and defines a subgroup with significantly worse prognosis, suggesting the rational use of drugs to target PI3K and androgen receptor pathways for HGSOC. This work shows that integrative approaches combining tissue phenotypes from images with genomic analysis can resolve confounding effects of tissue heterogeneity and should be used to identify new drivers in other cancers.
Krasikova, Dina V; Le, Huy; Bachura, Eric
2018-06-01
To address a long-standing concern regarding a gap between organizational science and practice, scholars called for more intuitive and meaningful ways of communicating research results to users of academic research. In this article, we develop a common language effect size index (CLβ) that can help translate research results to practice. We demonstrate how CLβ can be computed and used to interpret the effects of continuous and categorical predictors in multiple linear regression models. We also elaborate on how the proposed CLβ index is computed and used to interpret interactions and nonlinear effects in regression models. In addition, we test the robustness of the proposed index to violations of normality and provide means for computing standard errors and constructing confidence intervals around its estimates. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M
2016-05-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. We reviewed a representative sample of articles indexed in MEDLINE (n = 428) with an observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimates, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles, and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature.
Common mental disorders and intimate partner violence in pregnancy.
Ludermir, Ana Bernarda; Valongueiro, Sandra; Araújo, Thália Velho Barreto de
2014-02-01
To investigate the association between common mental disorders and intimate partner violence during pregnancy. A cross-sectional study was carried out with 1,120 pregnant women aged 18-49 years old, who were registered in the Family Health Program in the city of Recife, Northeastern Brazil, between 2005 and 2006. Common mental disorders were assessed using the Self-Reporting Questionnaire (SRQ-20). Intimate partner violence was defined as psychologically, physically and sexually abusive acts committed against women by their partners. Crude and adjusted odds ratios were estimated for the association studied utilizing logistic regression analysis. The most common form of partner violence was psychological. The prevalence of common mental disorders was 71.0% among women who reported all forms of violence in pregnancy and 33.8% among those who did not report intimate partner violence. Common mental disorders were associated with psychological violence (OR 2.49, 95%CI 1.8;3.5), even without physical or sexual violence. When psychological violence was combined with physical or sexual violence, the risk of common mental disorders was even higher (OR 3.45; 95%CI 2.3;5.2). Being assaulted by someone with whom you are emotionally involved can trigger feelings of helplessness, low self-esteem and depression. The pregnancy probably increased women's vulnerability to common mental disorders.
Low-flow, base-flow, and mean-flow regression equations for Pennsylvania streams
Stuckey, Marla H.
2006-01-01
Low-flow, base-flow, and mean-flow characteristics are an important part of assessing water resources in a watershed. These streamflow characteristics can be used by watershed planners and regulators to determine water availability, water-use allocations, assimilative capacities of streams, and aquatic-habitat needs. Streamflow characteristics are commonly predicted by use of regression equations when a nearby streamflow-gaging station is not available. Regression equations for predicting low-flow, base-flow, and mean-flow characteristics for Pennsylvania streams were developed from data collected at 293 continuous- and partial-record streamflow-gaging stations with flow unaffected by upstream regulation, diversion, or mining. Continuous-record stations used in the regression analysis had 9 years or more of data, and partial-record stations used had seven or more measurements collected during base-flow conditions. The state was divided into five low-flow regions and regional regression equations were developed for the 7-day, 10-year; 7-day, 2-year; 30-day, 10-year; 30-day, 2-year; and 90-day, 10-year low flows using generalized least-squares regression. Statewide regression equations were developed for the 10-year, 25-year, and 50-year base flows using generalized least-squares regression. Statewide regression equations were developed for harmonic mean and mean annual flow using weighted least-squares regression. Basin characteristics found to be significant explanatory variables at the 95-percent confidence level for one or more regression equations were drainage area, basin slope, thickness of soil, stream density, mean annual precipitation, mean elevation, and the percentage of glaciation, carbonate bedrock, forested area, and urban area within a basin. Standard errors of prediction ranged from 33 to 66 percent for the n-day, T-year low flows; 21 to 23 percent for the base flows; and 12 to 38 percent for the mean annual flow and harmonic mean, respectively. The regression equations are not valid in watersheds with upstream regulation, diversions, or mining activities. Watersheds with karst features need close examination as to the applicability of the regression-equation results.
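A hedged sketch of the kind of log-linear, weighted regional regression described here, on hypothetical station data; record length stands in as a simple weight in place of the full generalized least-squares weighting used in the report, and all values are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical stations: a low-flow statistic (cfs), drainage area (mi^2),
# mean annual precipitation (in), and record length used to form weights.
df = pd.DataFrame({
    "q7_10":  [1.2, 0.4, 6.8, 2.5, 0.9, 14.0],
    "darea":  [35.0, 12.0, 180.0, 60.0, 25.0, 420.0],
    "precip": [40.0, 38.0, 44.0, 41.0, 37.0, 46.0],
    "years":  [12, 9, 35, 20, 15, 52],
})

# Log-log model, the conventional form for regional streamflow regressions.
X = sm.add_constant(np.log10(df[["darea", "precip"]]))
y = np.log10(df["q7_10"])

fit = sm.WLS(y, X, weights=df["years"]).fit()
# 10**const is the equation's coefficient; the other parameters are the
# exponents on drainage area and precipitation.
print(fit.params)
```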
Kemper, Claudia; Koller, Daniela; Glaeske, Gerd; van den Bussche, Hendrik
2011-01-01
Aphasia, dementia, and depression are important and common neurological and neuropsychological disorders after ischemic stroke. We estimated the frequency of these comorbidities and their impact on mortality and nursing care dependency. Data from a German statutory health insurance fund were analyzed for people aged 50 years and older with a first ischemic stroke. Aphasia, dementia, and depression were defined on the basis of outpatient medical diagnoses within 1 year after stroke. Logistic regression models for mortality and nursing care dependency were calculated and adjusted for age, sex, and other relevant comorbidity. Of 977 individuals with a first ischemic stroke, 14.8% suffered from aphasia, 12.5% became demented, and 22.4% became depressed. The regression model for mortality showed a significant influence of age, aphasia, and other relevant comorbidity. In the regression model for nursing care dependency, the factors age, aphasia, dementia, depression, and other relevant comorbidity were significant. Aphasia has a high impact on mortality and nursing care dependency after ischemic stroke, while dementia and depression are strongly associated with increasing nursing care dependency.
Taljaard, Monica; McKenzie, Joanne E; Ramsay, Craig R; Grimshaw, Jeremy M
2014-06-19
An interrupted time series design is a powerful quasi-experimental approach for evaluating effects of interventions introduced at a specific point in time. To utilize the strength of this design, a modification to standard regression analysis, such as segmented regression, is required. In segmented regression analysis, the change in intercept and/or slope from pre- to post-intervention is estimated and used to test causal hypotheses about the intervention. We illustrate segmented regression using data from a previously published study that evaluated the effectiveness of a collaborative intervention to improve quality in pre-hospital ambulance care for acute myocardial infarction (AMI) and stroke. In the original analysis, a standard regression model was used with time as a continuous variable. We contrast the results from this standard regression analysis with those from segmented regression analysis. We discuss the limitations of the former and advantages of the latter, as well as the challenges of using segmented regression in analysing complex quality improvement interventions. Based on the estimated change in intercept and slope from pre- to post-intervention using segmented regression, we found insufficient evidence of a statistically significant effect on quality of care for stroke, although potential clinically important effects for AMI cannot be ruled out. Segmented regression analysis is the recommended approach for analysing data from an interrupted time series study. Several modifications to the basic segmented regression analysis approach are available to deal with challenges arising in the evaluation of complex quality improvement interventions.
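A minimal segmented-regression sketch for an interrupted time series on simulated monthly data: the coefficient on the post-intervention indicator estimates the immediate level change, and the coefficient on the post-intervention time counter estimates the change in slope. Variable names and values are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical monthly quality-of-care indicator: 24 months before and
# 24 months after the intervention.
n_pre, n_post = 24, 24
t = np.arange(n_pre + n_post)
post = (t >= n_pre).astype(int)
time_after = np.where(post == 1, t - n_pre + 1, 0)

rng = np.random.default_rng(4)
y = 60 + 0.2 * t + 5 * post + 0.4 * time_after + rng.normal(0, 2, t.size)

df = pd.DataFrame({"y": y, "time": t, "post": post, "time_after": time_after})
fit = smf.ols("y ~ time + post + time_after", data=df).fit()
print(fit.params)  # post = level change, time_after = slope change
```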
Ferreira, Ana P; Tobyn, Mike
2015-01-01
In the pharmaceutical industry, chemometrics is rapidly establishing itself as a tool that can be used at every step of product development and beyond: from early development to commercialization. This set of multivariate analysis methods allows the extraction of information contained in large, complex data sets, thus contributing to increased product and process understanding, which is at the core of the Food and Drug Administration's Process Analytical Technology (PAT) Guidance for Industry and the International Conference on Harmonisation's Pharmaceutical Development guideline (Q8). This review is aimed at providing pharmaceutical industry professionals with an introduction to multivariate analysis and how it is being adopted and implemented by companies in the transition from "quality-by-testing" to "quality-by-design". It starts with an introduction to multivariate analysis and the two methods most commonly used, principal component analysis and partial least squares regression, their advantages, common pitfalls and requirements for their effective use. That is followed by an overview of the diverse areas of application of multivariate analysis in the pharmaceutical industry: from the development of real-time analytical methods to definition of the design space and control strategy, and from formulation optimization during development to the application of quality-by-design principles to improve manufacture of existing commercial products.
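A short sketch of the two workhorse methods named above, principal component analysis and partial least squares regression, applied to simulated spectra with scikit-learn; the data, dimensions, and component counts are arbitrary assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
# Hypothetical spectra (200 samples x 600 wavelengths) and a quality
# attribute (e.g. an assay value) to be predicted from them.
X = rng.normal(size=(200, 600))
y = X[:, :5] @ np.array([2.0, -1.0, 0.5, 1.5, -0.8]) + rng.normal(0, 0.5, 200)

# PCA: an unsupervised overview of the dominant spectral variation.
scores = PCA(n_components=3).fit_transform(X)

# PLS: regression on latent variables chosen to maximize the covariance
# between the spectra and the response.
pls = PLSRegression(n_components=5)
cv_r2 = cross_val_score(pls, X, y, cv=5)   # default score for PLS is R^2
print(scores.shape, cv_r2.mean().round(3))
```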
Halvorsen, Peder A.; Wennevold, Katrine; Fleten, Nils; Muras, Magdalena; Kowalczyk, Anna; Godycki-Cwirko, Maciek; Melbye, Hasse
2011-01-01
Objective: To explore whether frequency and duration of sick-leave certification for acute airway infections differ between general practitioners (GPs) in Poland and Norway. Design: Cross-sectional survey. Setting: Educational courses for GPs. Intervention: We used a questionnaire with four vignettes presenting patients with symptoms consistent with pneumonia, sinusitis, common cold, and exacerbation of chronic obstructive pulmonary disease (COPD), respectively. For each vignette GPs were asked whether they would offer a sick-leave note, and if so, for how many days. Subjects: Convenience samples of GPs in Poland (n = 216) and Norway (n = 171). Main outcome measures: Proportion of GPs offering a sick-leave certificate. Duration of sick-leave certification. Results: In Poland 100%, 95%, 87%, and 94% of GPs would offer sick leave for pneumonia, sinusitis, common cold, and exacerbation of COPD, respectively. Corresponding figures in Norway were 97%, 83%, 60%, and 90%. Regression analysis adjusting for the GPs' sex, speciality, experience, and workload indicated that relative risks for offering sick leave (Poland versus Norway) were 1.16 (95% CI 1.07–1.26) for sinusitis and 1.50 (1.28–1.75) for common cold. Among GPs who offered sick leave for pneumonia, sinusitis, common cold, and exacerbation of COPD, mean duration was 8.9, 7.5, 5.1, and 6.9 days (Poland) versus 6.6, 4.3, 3.1, and 6.1 days (Norway), respectively. In regression analyses the differences between the Polish and Norwegian samples in duration of sick leave were statistically significant for all vignettes. A pattern of offering sick leave for three, five, seven, 10, or 14 days was observed in both countries. Conclusion: In the Polish sample GPs were more likely to offer sick-leave notes for sinusitis and common cold. GPs in Poland offered sick leaves of longer duration for pneumonia, sinusitis, common colds, and exacerbation of COPD compared with GPs in the Norwegian sample. PMID:21323635
Brain enlargement is associated with regression in preschool-age boys with autism spectrum disorders
Nordahl, Christine Wu; Lange, Nicholas; Li, Deana D.; Barnett, Lou Ann; Lee, Aaron; Buonocore, Michael H.; Simon, Tony J.; Rogers, Sally; Ozonoff, Sally; Amaral, David G.
2011-01-01
Autism is a heterogeneous disorder with multiple behavioral and biological phenotypes. Accelerated brain growth during early childhood is a well-established biological feature of autism. Onset pattern, i.e., early onset or regressive, is an intensely studied behavioral phenotype of autism. There is currently little known, however, about whether, or how, onset status maps onto the abnormal brain growth. We examined the relationship between total brain volume and onset status in a large sample of 2- to 4-y-old boys and girls with autism spectrum disorder (ASD) [n = 53, no regression (nREG); n = 61, regression (REG)] and a comparison group of age-matched typically developing controls (n = 66). We also examined retrospective head circumference measurements from birth through 18 mo of age. We found that abnormal brain enlargement was most commonly found in boys with regressive autism. Brain size in boys without regression did not differ from controls. Retrospective head circumference measurements indicate that head circumference in boys with regressive autism is normal at birth but diverges from the other groups around 4–6 mo of age. There were no differences in brain size in girls with autism (n = 22, ASD; n = 24, controls). These results suggest that there may be distinct neural phenotypes associated with different onsets of autism. For boys with regressive autism, divergence in brain size occurs well before loss of skills is commonly reported. Thus, rapid head growth may be a risk factor for regressive autism. PMID:22123952
Effects of eye artifact removal methods on single trial P300 detection, a comparative study.
Ghaderi, Foad; Kim, Su Kyoung; Kirchner, Elsa Andrea
2014-01-15
Electroencephalographic signals are commonly contaminated by eye artifacts, even if recorded under controlled conditions. The objective of this work was to quantitatively compare standard artifact removal methods (regression, filtered regression, Infomax, and second order blind identification (SOBI)) and two artifact identification approaches for independent component analysis (ICA) methods, i.e. ADJUST and correlation. To this end, eye artifacts were removed and the cleaned datasets were used for single trial classification of P300 (a type of event related potential elicited using the oddball paradigm). Statistical analysis of the results confirms that the combination of Infomax and ADJUST provides a relatively better performance (0.6% improvement on average across all subjects), while the combination of SOBI and correlation performs the worst. Low-pass filtering the data at lower cutoffs (here 4 Hz) can also improve the classification accuracy. Without requiring any artifact reference channel, the combination of Infomax and ADJUST improves the classification performance more than the other methods for both examined filtering cutoffs, i.e., 4 Hz and 25 Hz. Copyright © 2013 Elsevier B.V. All rights reserved.
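A sketch of the plain regression approach to ocular artifact removal (without the low-pass step of the filtered-regression variant and not the ICA-based methods): estimate by least squares how the EOG propagates into each EEG channel and subtract that contribution. All signals below are simulated.

```python
import numpy as np

def regress_out_eog(eeg, eog):
    """Estimate EOG propagation coefficients for each EEG channel by ordinary
    least squares and subtract the fitted ocular contribution.
    eeg: (n_channels, n_samples), eog: (n_eog_channels, n_samples)."""
    beta, *_ = np.linalg.lstsq(eog.T, eeg.T, rcond=None)  # (n_eog, n_channels)
    return eeg - beta.T @ eog

rng = np.random.default_rng(6)
n_samples = 2000
blink = np.convolve(rng.binomial(1, 0.01, n_samples),
                    np.hanning(150), mode="same")          # blink-like bursts
eog = np.vstack([blink + 0.05 * rng.normal(size=n_samples)])
brain = 0.5 * rng.normal(size=(8, n_samples))              # "true" EEG
eeg = brain + np.outer(np.linspace(1.0, 0.2, 8), blink)    # contaminated EEG

clean = regress_out_eog(eeg, eog)
print(round(float(np.corrcoef(clean[0], blink)[0, 1]), 3))  # near zero
```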
Shen, Chung-Wei; Chen, Yi-Hau
2015-10-01
Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. In contrast, naive procedures that ignore such complexity in the data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Ifoulis, A A; Savopoulou-Soultani, M
2006-10-01
The purpose of this research was to quantify the spatial pattern and develop a sampling program for larvae of Lobesia botrana Denis and Schiffermüller (Lepidoptera: Tortricidae), an important vineyard pest in northern Greece. Taylor's power law and Iwao's patchiness regression were used to model the relationship between the mean and the variance of larval counts. Analysis of covariance was carried out, separately for infestation and injury, with combined second and third generation data, for vine and half-vine sample units. Common regression coefficients were estimated to permit use of the sampling plan over a wide range of conditions. Optimum sample sizes for infestation and injury, at three levels of precision, were developed. An investigation of a multistage sampling plan with a nested analysis of variance showed that if the goal of sampling is focusing on larval infestation, three grape clusters should be sampled in a half-vine; if the goal of sampling is focusing on injury, then two grape clusters per half-vine are recommended.
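A sketch of fitting Taylor's power law and deriving an optimum sample size for a target precision D, taken here as the ratio of the standard error to the mean; the counts are hypothetical and the Student's t multiplier sometimes included in this formula is omitted.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical per-date larval counts: mean and variance of larvae per
# sample unit across several sampling dates.
mean = np.array([0.8, 1.5, 3.2, 5.0, 9.4, 15.1])
var = np.array([1.1, 2.9, 8.5, 16.0, 43.0, 92.0])

# Taylor's power law: log(variance) = log(a) + b * log(mean).
fit = sm.OLS(np.log(var), sm.add_constant(np.log(mean))).fit()
a, b = np.exp(fit.params[0]), fit.params[1]

def optimum_n(m, D):
    # From var(sample mean) = a * m**b / n and D = SE / mean:
    # n = a * m**(b - 2) / D**2.
    return a * m ** (b - 2) / D ** 2

print(f"a = {a:.2f}, b = {b:.2f}, "
      f"n for mean 2 larvae at D = 0.25: {optimum_n(2.0, 0.25):.0f}")
```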
Nelson, Jon P
2014-01-01
Precise estimates of price elasticities are important for alcohol tax policy. Using meta-analysis, this paper corrects average beer elasticities for heterogeneity, dependence, and publication selection bias. A sample of 191 estimates is obtained from 114 primary studies. Simple and weighted means are reported. Dependence is addressed by restricting the number of estimates per study, author-restricted samples, and author-specific variables. Publication bias is addressed using a funnel graph, trim-and-fill, and Egger's intercept model. Heterogeneity and selection bias are examined jointly in meta-regressions containing moderator variables for econometric methodology, primary data, and precision of estimates. Results for fixed- and random-effects regressions are reported. Country-specific effects and sample time periods are unimportant, but several methodology variables help explain the dispersion of estimates. In models that correct for selection bias and heterogeneity, the average beer price elasticity is about -0.20, which is about 50% less elastic than values commonly used in alcohol tax policy simulations. Copyright © 2013 Elsevier B.V. All rights reserved.
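The Egger intercept test mentioned above is simple to implement: regress each study's standardized effect on its precision and test whether the intercept differs from zero. The sketch below uses simulated elasticity estimates and standard errors, not the paper's 191-estimate sample.

```python
# Minimal sketch of Egger's regression test for publication bias:
# regress the standardized effect (estimate / SE) on precision (1 / SE);
# an intercept significantly different from zero suggests funnel-plot asymmetry.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
se = rng.uniform(0.02, 0.25, size=50)                 # placeholder standard errors
elasticity = -0.20 + rng.standard_normal(50) * se     # simulated study estimates

z = elasticity / se                                   # standardized effects
precision = 1.0 / se
X = sm.add_constant(precision)
egger = sm.OLS(z, X).fit()
print("Egger intercept:", egger.params[0], "p =", egger.pvalues[0])
```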
Fibromyalgia in 300 adult index patients with primary immunodeficiency.
Barton, James C; Bertoli, Luigi F; Barton, Jackson C; Acton, Ronald T
2017-01-01
We sought to determine the prevalence and clinical and laboratory associations of fibromyalgia in adults with primary immunodeficiency (immunoglobulin (Ig) G subclass deficiency (IgGSD) and common variable immunodeficiency (CVID)). We performed a retrospective analysis of these observations in 300 non-Hispanic white adult index patients with recurrent/severe respiratory tract infections and IgGSD or CVID: age; sex; IgGSD; fibromyalgia; chronic fatigue; autoimmune conditions (ACs); interstitial cystitis (IC); diabetes; body mass index; serum Ig isotypes; blood lymphocytes and subsets; and human leukocyte antigen (HLA)-A and -B types and haplotypes. We performed univariate comparisons, logistic multivariable regressions, and an analysis of covariance. Mean age was 49 ± 12 (standard deviation) y. There were 246 women (82.0%). IgGSD was diagnosed in 276 patients (92.0%). Fifty-six patients had fibromyalgia (18.7%; female:male 13:1). Other characteristics included: chronic fatigue, 63.0%; aggregate ACs, 35.3%; Sjögren's syndrome, 8.0%; IC, 3.0%; diabetes, 10.3%; and HLA-A*29, B*44 positivity, 9.7%. Prevalences of female sex; chronic fatigue; IC; and HLA-A*29, B*44 positivity were greater in patients with fibromyalgia. Logistic regression on fibromyalgia revealed three positive associations: chronic fatigue (p=0.0149; odds ratio 2.6 [95% confidence interval 1.2, 5.6]); Sjögren's syndrome (p=0.0004; 5.2 [2.1, 13.2]); and IC (p=0.0232; 5.7 [1.3, 25.7]). In an analysis of covariance, there were significant interactions of chronic fatigue, Sjögren's syndrome, and interstitial cystitis on fibromyalgia. Fibromyalgia is common in non-Hispanic white adult index patients with primary immunodeficiency, especially women. Chronic fatigue, Sjögren's syndrome, and IC are significantly associated with fibromyalgia after adjustment for other independent variables.
Nivard, Michel G; Gage, Suzanne H; Hottenga, Jouke J; van Beijsterveldt, Catharina E M; Abdellaoui, Abdel; Bartels, Meike; Baselmans, Bart M L; Ligthart, Lannie; Pourcain, Beate St; Boomsma, Dorret I; Munafò, Marcus R; Middeldorp, Christel M
2017-10-21
Several nonpsychotic psychiatric disorders in childhood and adolescence can precede the onset of schizophrenia, but the etiology of this relationship remains unclear. We investigated to what extent the association between schizophrenia and psychiatric disorders in childhood is explained by correlated genetic risk factors. Polygenic risk scores (PRS), reflecting an individual's genetic risk for schizophrenia, were constructed for 2588 children from the Netherlands Twin Register (NTR) and 6127 from the Avon Longitudinal Study of Parents And Children (ALSPAC). The associations between schizophrenia PRS and measures of anxiety, depression, attention deficit hyperactivity disorder (ADHD), and oppositional defiant disorder/conduct disorder (ODD/CD) were estimated at age 7, 10, 12/13, and 15 years in the 2 cohorts. Results were then meta-analyzed, and a meta-regression analysis was performed to test differences in effect sizes over age and disorders. Schizophrenia PRS were associated with childhood and adolescent psychopathology. Meta-regression analysis showed differences in the associations over disorders, with the strongest association with childhood and adolescent depression and a weaker association for ODD/CD at age 7. The associations increased with age and this increase was steepest for ADHD and ODD/CD. Genetic correlations varied between 0.10 and 0.25. By optimally using longitudinal data across diagnoses in a multivariate meta-analysis, this study sheds light on the development of childhood disorders into severe adult psychiatric disorders. The results are consistent with a common genetic etiology of schizophrenia and developmental psychopathology as well as with a stronger shared genetic etiology between schizophrenia and adolescent onset psychopathology. © The Author 2017. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. For permissions, please email: journals.permissions@oup.com
Real-life assessment of the validity of patient global impression of change in fibromyalgia.
Rampakakis, Emmanouil; Ste-Marie, Peter A; Sampalis, John S; Karellis, Angeliki; Shir, Yoram; Fitzcharles, Mary-Ann
2015-01-01
Patient Global Rating of Change (GRC) scales are commonly used in routine clinical care given their ease of use, availability and short completion time. This analysis aimed to assess the validity of the Patient Global Impression of Change (PGIC), a GRC scale commonly used in fibromyalgia, in a Canadian real-life setting. 167 fibromyalgia patients with available PGIC data were recruited in 2005-2013 from a Canadian tertiary-care multidisciplinary clinic. In addition to PGIC, disease severity was assessed with: pain visual analogue scale (VAS); Patient Global Assessment (PGA); Fibromyalgia Impact Questionnaire (FIQ); Health Assessment Questionnaire (HAQ); McGill Pain Questionnaire; body map. Multivariate linear regression assessed the PGIC relationship with disease parameter improvement while adjusting for follow-up duration and baseline parameter levels. Spearman's rank coefficient assessed parameter correlations. Higher PGIC scores were significantly (p<0.001) associated with greater improvement in pain, PGA, FIQ, HAQ and the body map. A statistically significant moderate positive correlation was observed between PGIC and FIQ improvement (r=0.423; p<0.001); correlation with all remaining disease severity measures was weak. Regression analysis confirmed a significant (p<0.001) positive association between improvement in all disease severity measures and PGIC. Baseline disease severity and follow-up duration were identified as significant independent predictors of PGIC rating. Although only a weak correlation was identified between PGIC and improvement in standard fibromyalgia outcomes, in the absence of objective outcomes PGIC remains a clinically relevant tool to assess the perceived impact of disease management. However, our analysis suggests that outcome measures data should not be considered in isolation but within the global clinical context.
Walker, Venexia M; Davies, Neil M; Jones, Tim; Kehoe, Patrick G; Martin, Richard M
2016-12-13
Current treatments for Alzheimer's and other neurodegenerative diseases have only limited effectiveness, meaning that there is an urgent need for new medications that could influence disease incidence and progression. We will investigate the potential of a selection of commonly prescribed drugs as a more efficient and cost-effective method of identifying new drugs for the prevention or treatment of Alzheimer's disease, non-Alzheimer's disease dementias, Parkinson's disease and amyotrophic lateral sclerosis. Our research will focus on drugs used for the treatment of hypertension, hypercholesterolaemia and type 2 diabetes, all of which have previously been identified as potentially cerebroprotective and have variable levels of preclinical evidence that suggest they may have beneficial effects for various aspects of dementia pathology. We will conduct a hypothesis testing observational cohort study using data from the Clinical Practice Research Datalink (CPRD). Our analysis will consider four statistical methods, which have different approaches for modelling confounding. These are multivariable adjusted Cox regression, propensity matched regression, instrumental variable analysis, and marginal structural models. We will also use an intention-to-treat analysis, whereby we will define all exposures based on the first prescription observed in the database so that the target parameter is comparable to that estimated by a randomised controlled trial. This protocol has been approved by the CPRD's Independent Scientific Advisory Committee (ISAC). We will publish the results of the study as open-access peer-reviewed publications and disseminate findings through national and international conferences as appropriate. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
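Of the four listed methods, the multivariable adjusted Cox regression is the most standard and can be sketched with the lifelines package. The column names and simulated data below are hypothetical stand-ins for CPRD variables, not the protocol's actual specification.

```python
# Hedged sketch of a multivariable adjusted Cox regression (one of the four
# approaches named in the protocol) using lifelines; data are simulated.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(3)
n = 500
df = pd.DataFrame({
    "followup_years": rng.exponential(5.0, n),
    "dementia": rng.integers(0, 2, n),          # event indicator (hypothetical)
    "antihypertensive": rng.integers(0, 2, n),  # exposure, e.g. first prescription observed
    "age": rng.normal(70, 8, n),
    "sex": rng.integers(0, 2, n),
})

cph = CoxPHFitter()
cph.fit(df, duration_col="followup_years", event_col="dementia")
cph.print_summary()   # hazard ratios for the exposure and covariates
```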
Risk factors for falls in older patients with cancer.
Zhang, Xiaotao; Sun, Ming; Liu, Suyu; Leung, Cheuk Hong; Pang, Linda; Popat, Uday R; Champlin, Richard; Holmes, Holly M; Valero, Vicente; Dinney, Colin P; Tripathy, Debu; Edwards, Beatrice J
2018-03-01
A rising number of patients with cancer are older adults (65 years of age and older), and this proportion will increase to 70% by the year 2020. Falls are a common condition in older adults. We sought to assess the prevalence and risk factors for falls in older patients with cancer. This is a single-site, retrospective cohort study. Patients who were receiving cancer care underwent a comprehensive geriatric assessment, including cognitive, functional, nutritional and physical status, falls in the prior 6 months, and comorbidity. Vitamin D and bone densitometry were performed. Descriptive statistics and multivariable logistic regression were used. A total of 304 patients aged 65 or above were enrolled in this study. The mean age was 78.4±6.9 years. They had haematological, gastrointestinal, urological, breast, lung and gynaecological cancers. A total of 215 patients with available information about falls within the past 6 months were included for final analysis. Seventy-seven (35.8%) patients had at least one fall in the preceding 6 months. Functional impairment (p=0.048), frailty (p<0.001), dementia (p=0.021), major depression (p=0.010) and low social support (p=0.045) were significantly associated with fall status in the univariate analysis. Multivariate logistic regression analysis identified frailty and functional impairment as independent risk factors for falls. Falls are common in older patients with cancer and lead to adverse clinical outcomes. Major depression, functional impairment, frailty, dementia and low social support were risk factors for falls. Heightened awareness and targeted interventions can prevent falls in older patients with cancer. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Empirical Assessment of Spatial Prediction Methods for Location Cost Adjustment Factors
Migliaccio, Giovanni C.; Guindani, Michele; D'Incognito, Maria; Zhang, Linlin
2014-01-01
In the feasibility stage, the correct prediction of construction costs ensures that budget requirements are met from the start of a project's lifecycle. A very common approach for performing quick order-of-magnitude estimates is based on using Location Cost Adjustment Factors (LCAFs) that compute historically based costs by project location. Nowadays, numerous LCAF datasets are commercially available in North America, but, obviously, they do not include all locations. Hence, LCAFs for unsampled locations need to be inferred through spatial interpolation or prediction methods. Currently, practitioners tend to select the value for a location using only one variable, namely the nearest linear distance between two sites. However, construction costs could be affected by socio-economic variables as suggested by macroeconomic theories. Using a commonly used set of LCAFs, the City Cost Indexes (CCI) by RSMeans, and the socio-economic variables included in the ESRI Community Sourcebook, this article provides several contributions to the body of knowledge. First, the accuracy of various spatial prediction methods in estimating LCAF values for unsampled locations was evaluated with respect to spatial interpolation methods. Two regression-based prediction models were selected: a global regression analysis and a geographically weighted regression analysis (GWR). Once these models were compared against interpolation methods, the results showed that GWR is the most appropriate way to model CCI as a function of multiple covariates. The outcome of GWR, for each covariate, was studied for all 48 states in the contiguous US. As a direct consequence of spatial non-stationarity, it was possible to discuss how the influence of each covariate differs from state to state. In addition, the article includes a first attempt to determine if the observed variability in cost index values could be, at least partially, explained by independent socio-economic variables. PMID:25018582
Comparison of statistical tests for association between rare variants and binary traits.
Bacanu, Silviu-Alin; Nelson, Matthew R; Whittaker, John C
2012-01-01
Genome-wide association studies have found thousands of common genetic variants associated with a wide variety of diseases and other complex traits. However, a large portion of the predicted genetic contribution to many traits remains unknown. One plausible explanation is that some of the missing variation is due to the effects of rare variants. Nonetheless, the statistical analysis of rare variants is challenging. A commonly used method is to contrast, within the same region (gene), the frequency of minor alleles at rare variants between cases and controls. However, this strategy is most useful under the assumption that the tested variants have similar effects. We previously proposed a method that can accommodate heterogeneous effects in the analysis of quantitative traits. Here we extend this method to binary traits and allow for covariates. We use simulations for a variety of causal and covariate impact scenarios to compare the performance of the proposed method to standard logistic regression, C-alpha, SKAT, and EREC. We found that (i) logistic regression methods perform well when the heterogeneity of the effects is not extreme and (ii) SKAT and EREC have good performance under all tested scenarios but they can be computationally intensive. Consequently, it would be more computationally desirable to use a two-step strategy by (i) selecting promising genes by faster methods and (ii) analyzing selected genes using SKAT/EREC. To select promising genes one can use (1) regression methods when effect heterogeneity is assumed to be low and the covariates explain a non-negligible part of trait variability, (2) C-alpha when heterogeneity is assumed to be large and covariates explain a small fraction of the trait's variability, and (3) the proposed trend and heterogeneity test when the heterogeneity is assumed to be non-trivial and the covariates explain a large fraction of trait variability.
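The collapsing strategy described above (contrasting the frequency of minor alleles within a gene between cases and controls) is often implemented as a burden-style logistic regression. The sketch below is illustrative only, with simulated genotypes and covariates; as the abstract notes, it is most powerful when variant effects point in the same direction.

```python
# Sketch of a simple burden-style test: collapse rare variants in a gene
# into a per-subject count of minor alleles and test it with logistic
# regression, adjusting for covariates. Genotypes and covariates are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n, n_variants = 1000, 20
genotypes = rng.binomial(2, 0.005, size=(n, n_variants))   # rare variant dosages
burden = genotypes.sum(axis=1)                              # collapsed burden score
age = rng.normal(50, 10, n)
logit_p = -1.0 + 0.4 * burden + 0.01 * (age - 50)
case = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(np.column_stack([burden, age]))
fit = sm.Logit(case, X).fit(disp=False)
print(fit.summary())   # the burden coefficient is the association test of interest
```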
Onset patterns in autism: Variation across informants, methods, and timing.
Ozonoff, Sally; Gangi, Devon; Hanzel, Elise P; Hill, Alesha; Hill, Monique M; Miller, Meghan; Schwichtenberg, A J; Steinfeld, Mary Beth; Parikh, Chandni; Iosif, Ana-Maria
2018-05-01
While previous studies suggested that regressive forms of onset were not common in autism spectrum disorder (ASD), more recent investigations suggest that the rates are quite high and may be under-reported using certain methods. The current study undertook a systematic investigation of how rates of regression differed by measurement method. Infants with (n = 147) and without a family history of ASD (n = 83) were seen prospectively for up to 7 visits in the first three years of life. Reports of symptom onset were collected using four measures that systematically varied the informant (examiner vs. parent), the decision type (categorical [regression absent or present] vs. dimensional [frequency of social behaviors]), and the timing of the assessment (retrospective vs. prospective). Latent class growth models were used to classify individual trajectories to see whether regressive onset patterns were infrequent or widespread within the ASD group. A majority of the sample was classified as having a regressive onset using either examiner (88%) or parent (69%) prospective dimensional ratings. Rates of regression were much lower using retrospective or categorical measures (from 29 to 47%). Agreement among different measurement methods was low. Declining trajectories of development, consistent with a regressive onset pattern, are common in children with ASD and may be more the rule than the exception. The accuracy of widely used methods of measuring onset is questionable and the present findings argue against their widespread use. Autism Res 2018, 11: 788-797. © 2018 International Society for Autism Research, Wiley Periodicals, Inc.
2014-01-01
Background: Greater use of antibiotics during the past 50 years has exerted selective pressure on susceptible bacteria and may have favoured the survival of resistant strains. Existing information on antibiotic resistance patterns from pathogens circulating among community-based patients is substantially less than from hospitalized patients on whom guidelines are often based. We therefore chose to assess the relationship between the antibiotic resistance pattern of bacteria circulating in the community and the consumption of antibiotics in the community. Methods: Both gray literature and published scientific literature in English and other European languages were examined. Multiple regression analysis was used to analyse whether studies found a positive relationship between antibiotic consumption and resistance. A subsequent meta-analysis and meta-regression were conducted for studies for which a common effect size measure (odds ratio) could be calculated. Results: Electronic searches identified 974 studies but only 243 studies were considered eligible for inclusion by the two independent reviewers who extracted the data. A binomial test revealed a positive relationship between antibiotic consumption and resistance (p < .001) but multiple regression modelling did not produce any significant predictors of study outcome. The meta-analysis generated a significant pooled odds ratio of 2.3 (95% confidence interval 2.2 to 2.5) with a meta-regression producing several significant predictors (F(10,77) = 5.82, p < .01). Countries in southern Europe produced a stronger link between consumption and resistance than other regions. Conclusions: Using a large set of studies, we found that antibiotic consumption is associated with the development of antibiotic resistance. A subsequent meta-analysis, with a subsample of the studies, generated several significant predictors. Countries in southern Europe produced a stronger link between consumption and resistance than other regions so efforts at reducing antibiotic consumption may need to be strengthened in this area. Increased consumption of antibiotics may not only produce greater resistance at the individual patient level but may also produce greater resistance at the community, country, and regional levels, which can harm individual patients. PMID:24405683
Biodiversity patterns along ecological gradients: unifying β-diversity indices.
Szava-Kovats, Robert C; Pärtel, Meelis
2014-01-01
Ecologists have developed an abundance of conceptions and mathematical expressions to define β-diversity, the link between local (α) and regional-scale (γ) richness, in order to characterize patterns of biodiversity along ecological (i.e., spatial and environmental) gradients. These patterns are often realized by regression of β-diversity indices against one or more ecological gradients. This practice, however, is subject to two shortcomings that can undermine the validity of the biodiversity patterns. First, many β-diversity indices are constrained to range between fixed lower and upper limits. As such, regression analysis of β-diversity indices against ecological gradients can result in regression curves that extend beyond these mathematical constraints, thus creating an interpretational dilemma. Second, despite being a function of the same measured α- and γ-diversity, the resultant biodiversity pattern depends on the choice of β-diversity index. We propose a simple logistic transformation that rids beta-diversity indices of their mathematical constraints, thus eliminating the possibility of an uninterpretable regression curve. Moreover, this transformation results in identical biodiversity patterns for three commonly used classical beta-diversity indices. As a result, this transformation eliminates the difficulties of both shortcomings, while allowing the researcher to use whichever beta-diversity index deemed most appropriate. We believe this method can help unify the study of biodiversity patterns along ecological gradients.
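The key idea, removing the fixed bounds of a β-diversity index before regressing it on a gradient, can be sketched with a generic logit transformation. This is an illustration under assumptions, not necessarily the authors' exact formula: the index is taken to be rescaled to the open interval (0, 1), and the gradient and data are simulated.

```python
# Illustrative sketch (not the authors' exact transformation): apply a logit
# transform to a beta-diversity index bounded on (0, 1) before regressing it
# against an ecological gradient, so the fitted curve cannot exceed the
# index's mathematical limits after back-transformation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
gradient = np.linspace(0, 1, 60)                                  # e.g., an environmental gradient
beta_prop = np.clip(0.2 + 0.5 * gradient + rng.normal(0, 0.05, 60), 0.01, 0.99)

beta_logit = np.log(beta_prop / (1 - beta_prop))                  # removes the (0, 1) bounds
fit = sm.OLS(beta_logit, sm.add_constant(gradient)).fit()
print(fit.params)
# Back-transform predictions with the inverse logit to stay within (0, 1):
pred = 1 / (1 + np.exp(-fit.fittedvalues))
```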
Statistical downscaling modeling with quantile regression using lasso to estimate extreme rainfall
NASA Astrophysics Data System (ADS)
Santri, Dewi; Wigena, Aji Hamim; Djuraidah, Anik
2016-02-01
Rainfall is a climatic element with high variability, and extreme rainfall in particular has many negative impacts. Therefore, several methods are required to minimize the damage that may occur. So far, global circulation models (GCMs) are the best method to forecast global climate changes, including extreme rainfall. Statistical downscaling (SD) is a technique for developing the relationship between GCM output as global-scale independent variables and rainfall as a local-scale response variable. Using GCM output directly is difficult when assessed against observations because it is high dimensional and the variables are multicollinear. The methods commonly used to handle this problem are principal component analysis (PCA) and partial least squares regression; a newer alternative is the lasso. The lasso has the advantage of simultaneously controlling the variance of the fitted coefficients and performing automatic variable selection. Quantile regression can be used to detect extreme rainfall at both the dry and wet extremes. The objective of this study is to model SD using quantile regression with lasso to predict extreme rainfall in Indramayu. The results showed that extreme rainfall (extreme wet in January, February, and December) in Indramayu could be predicted properly by the model at the 90th quantile.
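The core model, an L1-penalized quantile regression linking many correlated GCM-style predictors to local rainfall at an upper quantile, can be sketched with scikit-learn. The predictors, rainfall series, and penalty strength below are simulated assumptions, not the study's data or tuning.

```python
# Sketch of lasso-penalized quantile regression at the 90th percentile,
# standing in for the statistical downscaling model; data are simulated.
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(6)
n, p = 300, 25                       # many correlated GCM grid-cell predictors
X = rng.standard_normal((n, p))
rain = 5 + 2.0 * X[:, 0] + 1.5 * X[:, 3] + rng.gumbel(0, 3, n)   # heavy right tail

model = QuantileRegressor(quantile=0.9, alpha=0.1)   # alpha is the L1 penalty
model.fit(X, rain)
print("nonzero coefficients:", np.flatnonzero(model.coef_))
extreme_pred = model.predict(X[:5])                  # predicted 90th-percentile rainfall
```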
Four Major South Korea's Rivers Using Deep Learning Models.
Lee, Sangmok; Lee, Donghyun
2018-06-24
Harmful algal blooms are an annual phenomenon that cause environmental damage, economic losses, and disease outbreaks. A fundamental solution to this problem is still lacking; thus, the best option for counteracting the effects of algal blooms is to improve advance warnings (predictions). However, existing physical prediction models have difficulties setting a clear coefficient indicating the relationship between each factor when predicting algal blooms, and many variable data sources are required for the analysis. These limitations are accompanied by high time and economic costs. Meanwhile, artificial intelligence and deep learning methods have become increasingly common in scientific research; attempts to apply the long short-term memory (LSTM) model to environmental research problems are increasing because the LSTM model exhibits good performance for time-series data prediction. However, few studies have applied deep learning models or LSTM to algal bloom prediction, especially in South Korea, where algal blooms occur annually. Therefore, we employed the LSTM model for algal bloom prediction in four major rivers of South Korea. We conducted short-term (one week) predictions by employing regression analysis and deep learning techniques on a newly constructed water quality and quantity dataset drawn from 16 dammed pools on the rivers. Three deep learning models (multilayer perceptron, MLP; recurrent neural network, RNN; and long short-term memory, LSTM) were used to predict chlorophyll-a, a recognized proxy for algal activity. The results were compared to those from OLS (ordinary least squares) regression analysis and actual data based on the root mean square error (RMSE). The LSTM model showed the highest prediction rate for harmful algal blooms and all deep learning models outperformed the OLS regression analysis. Our results reveal the potential for predicting algal blooms using LSTM and deep learning.
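A one-week-ahead LSTM predictor of chlorophyll-a can be sketched in Keras as below. The window length, number of input features, layer sizes, and training data are assumptions for illustration, not the paper's configuration.

```python
# Hedged sketch of a one-week-ahead chlorophyll-a predictor with a single
# LSTM layer; all data and hyperparameters are illustrative placeholders.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(7)
n_weeks, window, n_features = 400, 8, 6       # water quality/quantity features (assumed)
series = rng.standard_normal((n_weeks, n_features))
chl_a = series[:, 0] * 0.8 + rng.standard_normal(n_weeks) * 0.2

# Build sliding windows of past observations as inputs.
X = np.stack([series[i:i + window] for i in range(n_weeks - window)])
y = chl_a[window:]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(window, n_features)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
rmse = float(np.sqrt(model.evaluate(X, y, verbose=0)))   # RMSE on the training windows
print("training RMSE:", rmse)
```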
NiftyNet: a deep-learning platform for medical imaging.
Gibson, Eli; Li, Wenqi; Sudre, Carole; Fidon, Lucas; Shakir, Dzhoshkun I; Wang, Guotai; Eaton-Rosen, Zach; Gray, Robert; Doel, Tom; Hu, Yipeng; Whyntie, Tom; Nachev, Parashkev; Modat, Marc; Barratt, Dean C; Ourselin, Sébastien; Cardoso, M Jorge; Vercauteren, Tom
2018-05-01
Medical image analysis and computer-assisted intervention problems are increasingly being addressed with deep-learning-based solutions. Established deep-learning platforms are flexible but do not provide specific functionality for medical image analysis, and adapting them for this domain of application requires substantial implementation effort. Consequently, there has been substantial duplication of effort and incompatible infrastructure developed across many research groups. This work presents the open-source NiftyNet platform for deep learning in medical imaging. The ambition of NiftyNet is to accelerate and simplify the development of these solutions, and to provide a common mechanism for disseminating research outputs for the community to use, adapt and build upon. The NiftyNet infrastructure provides a modular deep-learning pipeline for a range of medical imaging applications including segmentation, regression, image generation and representation learning applications. Components of the NiftyNet pipeline including data loading, data augmentation, network architectures, loss functions and evaluation metrics are tailored to, and take advantage of, the idiosyncrasies of medical image analysis and computer-assisted intervention. NiftyNet is built on the TensorFlow framework and supports features such as TensorBoard visualization of 2D and 3D images and computational graphs by default. We present three illustrative medical image analysis applications built using NiftyNet infrastructure: (1) segmentation of multiple abdominal organs from computed tomography; (2) image regression to predict computed tomography attenuation maps from brain magnetic resonance images; and (3) generation of simulated ultrasound images for specified anatomical poses. The NiftyNet infrastructure enables researchers to rapidly develop and distribute deep learning solutions for segmentation, regression, image generation and representation learning applications, or extend the platform to new applications. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Milner, Allison; Aitken, Zoe; Kavanagh, Anne; LaMontagne, Anthony D; Pega, Frank; Petrie, Dennis
2017-06-23
Previous studies suggest that poor psychosocial job quality is a risk factor for mental health problems, but they use conventional regression analytic methods that cannot rule out reverse causation, unmeasured time-invariant confounding and reporting bias. This study combines two quasi-experimental approaches to improve causal inference by better accounting for these biases: (i) linear fixed effects regression analysis and (ii) linear instrumental variable analysis. We extract 13 annual waves of national cohort data including 13 260 working-age (18-64 years) employees. The exposure variable is self-reported level of psychosocial job quality. The instruments used are two common workplace entitlements. The outcome variable is the Mental Health Inventory (MHI-5). We adjust for measured time-varying confounders. In the fixed effects regression analysis adjusted for time-varying confounders, a 1-point increase in psychosocial job quality is associated with a 1.28-point improvement in mental health on the MHI-5 scale (95% CI: 1.17, 1.40; P < 0.001). When the fixed effects analysis was combined with the instrumental variable analysis, a 1-point increase in psychosocial job quality was related to a 1.62-point improvement on the MHI-5 scale (95% CI: -0.24, 3.48; P = 0.088). Our quasi-experimental results provide evidence to confirm job stressors as risk factors for mental ill health using methods that improve causal inference. © The Author 2017. Published by Oxford University Press on behalf of Faculty of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
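The linear fixed effects (within-person) step, which removes time-invariant confounders by demeaning within individuals, can be sketched as follows. Variable names and data are hypothetical, and the instrumental-variable stage is omitted.

```python
# Sketch of a linear fixed effects (within) regression: demean the outcome
# and exposure within person so time-invariant confounders drop out.
# Data and variable names are hypothetical placeholders.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(8)
n_person, n_waves = 200, 5
df = pd.DataFrame({
    "person": np.repeat(np.arange(n_person), n_waves),
    "job_quality": rng.normal(0, 1, n_person * n_waves),
})
person_effect = np.repeat(rng.normal(0, 2, n_person), n_waves)  # unobserved stable traits
df["mhi5"] = 60 + 1.3 * df["job_quality"] + person_effect + rng.normal(0, 3, len(df))

# Within transformation: subtract person-specific means.
demeaned = df.groupby("person")[["mhi5", "job_quality"]].transform(lambda s: s - s.mean())
fit = sm.OLS(demeaned["mhi5"], demeaned[["job_quality"]]).fit()
print(fit.params)   # within-person effect of job quality on MHI-5
```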
Grogan-Kaylor, Andrew; Perron, Brian E.; Kilbourne, Amy M.; Woltmann, Emily; Bauer, Mark S.
2013-01-01
Objective: Prior meta-analysis indicates that collaborative chronic care models (CCMs) improve mental and physical health outcomes for individuals with mental disorders. This study aimed to investigate the stability of evidence over time and identify patient and intervention factors associated with CCM effects in order to facilitate implementation and sustainability of CCMs in clinical practice. Method: We reviewed 53 CCM trials that analyzed depression, mental quality of life (QOL), or physical QOL outcomes. Cumulative meta-analysis and meta-regression were supplemented by descriptive investigations across and within trials. Results: Most trials targeted depression in the primary care setting, and cumulative meta-analysis indicated that effect sizes favoring CCM quickly achieved significance for depression outcomes, and more recently achieved significance for mental and physical QOL. Four of six CCM elements (patient self-management support, clinical information systems, system redesign, and provider decision support) were common among reviewed trials, while two elements (healthcare organization support and linkages to community resources) were rare. No single CCM element was statistically associated with the success of the model. Similarly, meta-regression did not identify specific factors associated with CCM effectiveness. Nonetheless, results within individual trials suggest that increased illness severity predicts CCM outcomes. Conclusions: Significant CCM trials have been derived primarily from four original CCM elements. Nonetheless, implementing and sustaining this established model will require healthcare organization support. While CCMs have typically been tested as population-based interventions, evidence supports stepped care application to more severely ill individuals. Future priorities include developing implementation strategies to support adoption and sustainability of the model in clinical settings while maximizing fit of this multi-component framework to local contextual factors. PMID:23938600
Regression in autistic spectrum disorders.
Stefanatos, Gerry A
2008-12-01
A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously acquired skills. This may involve a loss of speech or social responsivity, but often entails both. This paper critically reviews the phenomenon of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.
Unified Computational Methods for Regression Analysis of Zero-Inflated and Bound-Inflated Data
Yang, Yan; Simpson, Douglas
2010-01-01
Bounded data with excess observations at the boundary are common in many areas of application. Various individual cases of inflated mixture models have been studied in the literature for bound-inflated data, yet the computational methods have been developed separately for each type of model. In this article we use a common framework for computing these models, and expand the range of models for both discrete and semi-continuous data with point inflation at the lower boundary. The quasi-Newton and EM algorithms are adapted and compared for estimation of model parameters. The numerical Hessian and generalized Louis method are investigated as means for computing standard errors after optimization. Correlated data are included in this framework via generalized estimating equations. The estimation of parameters and effectiveness of standard errors are demonstrated through simulation and in the analysis of data from an ultrasound bioeffect study. The unified approach enables reliable computation for a wide class of inflated mixture models and comparison of competing models. PMID:20228950
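One member of the inflated-mixture family described above, a zero-inflated Poisson regression, is available in statsmodels and can be sketched as follows. The data are simulated and the inflation part uses only an intercept; this is an illustration of the model class, not the article's unified computational framework.

```python
# Sketch of a zero-inflated Poisson regression fit by maximum likelihood;
# data are simulated and the zero-inflation model is intercept-only.
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(9)
n = 1000
x = rng.normal(size=n)
lam = np.exp(0.5 + 0.6 * x)
structural_zero = rng.random(n) < 0.3            # excess observations at the zero boundary
y = np.where(structural_zero, 0, rng.poisson(lam))

exog = sm.add_constant(x)
zip_model = ZeroInflatedPoisson(y, exog, exog_infl=np.ones((n, 1)))
zip_fit = zip_model.fit(maxiter=500, disp=False)
print(zip_fit.summary())
```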
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Smallman-Raynor, M R; Cliff, A D; Ord, J K
2010-08-01
Although the involvement of common childhood infections in the aetiology of acute appendicitis has long been conjectured, supporting evidence is largely restricted to a disparate set of clinical case reports. A systematic population-based analysis of the implied comorbid associations is lacking in the literature. Drawing on a classic epidemiological dataset, assembled by the School Epidemics Committee of the United Kingdom's Medical Research Council (MRC) in the 1930s, this paper presents a historical analysis of the association between termly outbreaks of each of six common childhood infections (chickenpox, measles, mumps, rubella, scarlet fever and whooping cough) and operated cases of acute appendicitis in 27 English public boarding schools. After controlling for the potential confounding effects of school, year and season, multivariate negative binomial regression revealed a positive association between the level of appendicitis activity and the recorded rate of mumps (beta=0.15, 95% CI 0.07-0.24, P<0.001). Non-significant associations were identified between appendicitis and the other sample infectious diseases. Subject to data caveats, our findings suggest that further studies are required to determine whether the comorbid association between mumps and appendicitis is causal.
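A negative binomial regression of this kind, counts of appendicitis cases regressed on an infection rate with categorical adjustment terms, can be sketched with statsmodels. The data, column names, and dispersion value below are illustrative, not the MRC dataset.

```python
# Sketch of a negative binomial regression of termly appendicitis counts on a
# mumps rate with a categorical adjustment factor; data are simulated.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(10)
n = 400
df = pd.DataFrame({
    "mumps_rate": rng.gamma(2, 1, n),
    "season": rng.choice(["autumn", "spring", "summer"], n),
    "appendicitis": rng.poisson(1.5, n),
})

nb = smf.glm("appendicitis ~ mumps_rate + C(season)", data=df,
             family=sm.families.NegativeBinomial(alpha=1.0)).fit()
print(nb.params["mumps_rate"])   # analogous to the reported beta for mumps
```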
Ergonomics and musculoskeletal disorder: as an occupational hazard in dentistry.
Gopinadh, Anne; Devi, Kolli Naga Neelima; Chiramana, Sandeep; Manne, Prakash; Sampath, Anche; Babu, Muvva Suresh
2013-03-01
Musculoskeletal disorders (MSDs) are commonly experienced in dentistry. The objective of this study is to determine awareness of ergonomics and the prevalence of MSDs among dental professionals. A cross-sectional survey was conducted among 170 dentists of different specialties. The questionnaire gathered information regarding demographic details, MSDs, work duration, working status, awareness of ergonomics, etc. Data were analyzed using SPSS version 15.0. Student's t-test and analysis of variance (ANOVA) were used to compare mean scores. Stepwise multiple linear regression analysis was used to assess the independent variables that significantly influenced the variance in the dependent variable (pain). It was found that 73.9% of the participants reported musculoskeletal pain, and the most common painful sites were the neck and back. More than half of the participants, i.e., 232 (59.3%), were aware of correct ergonomic posture in dentistry. The percentage reporting pain increased significantly with age and working time. Among all specialties, prosthodontists were found to have the highest prevalence of MSDs. The appearance of musculoskeletal symptoms among dental professionals was quite common. This suggests that ergonomics should be covered in the educational system to reduce risks to dental practitioners.
NASA Astrophysics Data System (ADS)
Gürcan, Eser Kemal
2017-04-01
The most commonly used methods for analyzing time-dependent data are multivariate analysis of variance (MANOVA) and nonlinear regression models. The aim of this study was to compare some MANOVA techniques and a nonlinear mixed modeling approach for investigation of growth differentiation in female and male Japanese quail. Weekly individual body weight data of 352 male and 335 female quail from hatch to 8 weeks of age were used to perform the analyses. When all the analyses are evaluated, nonlinear mixed modeling is superior to the other techniques because it also reveals individual variation. In addition, the profile analysis also provides important information.
Jung, Taejin; Youn, Hyunsook; McClung, Steven
2007-02-01
The main purposes of this study are to find out individuals' motives and interpersonal self-presentation strategies on constructing Korean weblog format personal homepage (e.g., "Cyworld mini-homepage"). The study also attempts to find predictor motives that lead to the activities of posting and maintaining a homepage and compare the self-presentation strategies used on the Web with those commonly used in interpersonal situations. By using a principal component factor analysis, four salient self-presentation strategy factors and five interpretable mini-homepage hosting motive factors were identified. Accompanying multiple regression analysis shows that entertainment and personal income factors are major predictors in explaining homepage maintenance expenditures and frequencies of updating.
Intermediate and advanced topics in multilevel logistic regression analysis.
Austin, Peter C; Merlo, Juan
2017-09-10
Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
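Two of the summary measures described above, the median odds ratio and the latent-scale variance partition coefficient, follow directly from the between-cluster variance of a fitted random-intercept logistic model. The sketch below uses the standard formulas with an illustrative variance value.

```python
# Sketch of the median odds ratio (MOR) and variance partition coefficient
# (VPC) computed from a between-cluster variance; the variance is illustrative.
# MOR = exp( sqrt(2 * var_u) * Phi^{-1}(0.75) )
# VPC (latent scale) = var_u / (var_u + pi^2 / 3)
import numpy as np
from scipy.stats import norm

var_u = 0.35                                   # cluster-level (random intercept) variance
mor = np.exp(np.sqrt(2 * var_u) * norm.ppf(0.75))
vpc = var_u / (var_u + np.pi**2 / 3)
print(f"Median odds ratio: {mor:.2f}")
print(f"Variance partition coefficient: {vpc:.3f}")
```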
Assessment of traffic noise levels in urban areas using different soft computing techniques.
Tomić, J; Bogojević, N; Pljakić, M; Šumarac-Pavlović, D
2016-10-01
Available traffic noise prediction models are usually based on regression analysis of experimental data, and this paper presents the application of soft computing techniques in traffic noise prediction. Two mathematical models are proposed and their predictions are compared to data collected by traffic noise monitoring in urban areas, as well as to predictions of commonly used traffic noise models. The results show that application of evolutionary algorithms and neural networks may improve the development process and the accuracy of traffic noise prediction.
General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies
Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong
2013-01-01
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variant tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
The evolution of floral scent and insect chemical communication.
Schiestl, Florian P
2010-05-01
Plants have evolved a range of strategies to manipulate the behaviour of their insect partners. One powerful strategy is to produce signals that already have a role in the animals' own communication systems. To investigate to what extent the evolution of floral scents is correlated with chemical communication in insects, I analyse the occurrence, commonness, and evolutionary patterns of the 71 most common 'floral' volatile organic compounds (VOCs) in 96 plant families and 87 insect families. I found an overlap of 87% in VOCs produced by plants and insects. 'Floral' monoterpenes showed strong positive correlation in commonness between plants (both gymnosperms and angiosperms) and herbivores, whereas the commonness of 'floral' aromatics was positively correlated between angiosperms and both pollinators and herbivores. According to a multivariate regression analysis the commonness of 'floral' aromatics was best explained by their commonness in pollinators, whereas monoterpenes were best explained by herbivores. Among pollinator orders, aromatics were significantly more common in Lepidoptera than in Hymenoptera, whereas monoterpenes showed no difference among the two orders. Collectively, these patterns suggest that plants and insects converge in overall patterns of volatile production, both for attraction and defence. Monoterpenes seem to have evolved primarily for defence under selection by herbivores, whereas aromatics evolved signalling functions in angiosperms, primarily for pollinator attraction.
Linden, Ariel
2018-04-01
Interrupted time series analysis (ITSA) is an evaluation methodology in which a single treatment unit's outcome is studied over time and the intervention is expected to "interrupt" the level and/or trend of the outcome. The internal validity is strengthened considerably when the treated unit is contrasted with a comparable control group. In this paper, we introduce a robust evaluation framework that combines the synthetic controls method (SYNTH) to generate a comparable control group and ITSA regression to assess covariate balance and estimate treatment effects. We evaluate the effect of California's Proposition 99 for reducing cigarette sales, by comparing California to other states not exposed to smoking reduction initiatives. SYNTH is used to reweight nontreated units to make them comparable to the treated unit. These weights are then used in ITSA regression models to assess covariate balance and estimate treatment effects. Covariate balance was achieved for all but one covariate. While California experienced a significant decrease in the annual trend of cigarette sales after Proposition 99, there was no statistically significant treatment effect when compared to synthetic controls. The advantage of using this framework over regression alone is that it ensures that a comparable control group is generated. Additionally, it offers a common set of statistical measures familiar to investigators, the capability for assessing covariate balance, and enhancement of the evaluation with a comprehensive set of postestimation measures. Therefore, this robust framework should be considered as a primary approach for evaluating treatment effects in multiple group time series analysis. © 2018 John Wiley & Sons, Ltd.
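The single-group ITSA regression component can be sketched as a segmented regression with a level-change indicator and a trend-change term. The sketch below omits the synthetic-control reweighting step, and the sales series and intervention year are simulated stand-ins rather than the actual Proposition 99 data.

```python
# Minimal single-group ITSA sketch (without synthetic controls): level and
# trend changes are captured by an intervention indicator and a
# time-since-intervention term. Data are simulated placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
years = np.arange(1970, 2001)
t0 = 1989                                        # assumed intervention year
df = pd.DataFrame({"year": years})
df["time"] = df["year"] - years[0]
df["post"] = (df["year"] >= t0).astype(int)
df["time_since"] = np.clip(df["year"] - t0, 0, None)
df["sales"] = (120 - 1.5 * df["time"] - 8 * df["post"] - 2 * df["time_since"]
               + rng.normal(0, 3, len(df)))

itsa = smf.ols("sales ~ time + post + time_since", data=df).fit()
print(itsa.params)   # 'post' = level change, 'time_since' = trend change
```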
Oh, Eric J; Shepherd, Bryan E; Lumley, Thomas; Shaw, Pamela A
2018-04-15
For time-to-event outcomes, a rich literature exists on the bias introduced by covariate measurement error in regression models, such as the Cox model, and methods of analysis to address this bias. By comparison, less attention has been given to understanding the impact or addressing errors in the failure time outcome. For many diseases, the timing of an event of interest (such as progression-free survival or time to AIDS progression) can be difficult to assess or reliant on self-report and therefore prone to measurement error. For linear models, it is well known that random errors in the outcome variable do not bias regression estimates. With nonlinear models, however, even random error or misclassification can introduce bias into estimated parameters. We compare the performance of 2 common regression models, the Cox and Weibull models, in the setting of measurement error in the failure time outcome. We introduce an extension of the SIMEX method to correct for bias in hazard ratio estimates from the Cox model and discuss other analysis options to address measurement error in the response. A formula to estimate the bias induced into the hazard ratio by classical measurement error in the event time for a log-linear survival model is presented. Detailed numerical studies are presented to examine the performance of the proposed SIMEX method under varying levels and parametric forms of the error in the outcome. We further illustrate the method with observational data on HIV outcomes from the Vanderbilt Comprehensive Care Clinic. Copyright © 2017 John Wiley & Sons, Ltd.
Robertson, Dale M.; Saad, D.A.; Heisey, D.M.
2006-01-01
Various approaches are used to subdivide large areas into regions containing streams that have similar reference or background water quality and that respond similarly to different factors. For many applications, such as establishing reference conditions, it is preferable to use physical characteristics that are not affected by human activities to delineate these regions. However, most approaches, such as ecoregion classifications, rely on land use to delineate regions or have difficulties compensating for the effects of land use. Land use not only directly affects water quality, but it is often correlated with the factors used to define the regions. In this article, we describe modifications to SPARTA (spatial regression-tree analysis), a relatively new approach applied to water-quality and environmental characteristic data to delineate zones with similar factors affecting water quality. In this modified approach, land-use-adjusted (residualized) water quality and environmental characteristics are computed for each site. Regression-tree analysis is applied to the residualized data to determine the most statistically important environmental characteristics describing the distribution of a specific water-quality constituent. Geographic information for small basins throughout the study area is then used to subdivide the area into relatively homogeneous environmental water-quality zones. For each zone, commonly used approaches are subsequently used to define its reference water quality and how its water quality responds to changes in land use. SPARTA is used to delineate zones of similar reference concentrations of total phosphorus and suspended sediment throughout the upper Midwestern part of the United States. © 2006 Springer Science+Business Media, Inc.
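The residualize-then-partition idea can be sketched generically: remove the land-use signal from the water-quality variable with a linear regression, then let a regression tree split sites into candidate zones using environmental characteristics only. This is a simplified illustration with simulated stand-in variables, not the SPARTA implementation.

```python
# Sketch of residualizing water quality on land use, then applying a
# regression tree to environmental characteristics to suggest zones.
# All variables are simulated stand-ins.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(12)
n = 500
land_use = rng.uniform(0, 1, (n, 2))               # e.g., % agriculture, % urban
environment = rng.uniform(0, 1, (n, 3))            # e.g., soil, slope, surficial deposits
phosphorus = (0.8 * land_use[:, 0] + 0.5 * environment[:, 0]
              + rng.normal(0, 0.1, n))

# Land-use-adjusted (residualized) water quality.
residual = phosphorus - LinearRegression().fit(land_use, phosphorus).predict(land_use)

tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=30)
tree.fit(environment, residual)
print(export_text(tree, feature_names=["soil", "slope", "deposits"]))  # candidate zones
```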
Bayesian function-on-function regression for multilevel functional data.
Meyer, Mark J; Coull, Brent A; Versace, Francesco; Cinciripini, Paul; Morris, Jeffrey S
2015-09-01
Medical and public health research increasingly involves the collection of complex and high dimensional data. In particular, functional data-where the unit of observation is a curve or set of curves that are finely sampled over a grid-is frequently obtained. Moreover, researchers often sample multiple curves per person resulting in repeated functional measures. A common question is how to analyze the relationship between two functional variables. We propose a general function-on-function regression model for repeatedly sampled functional data on a fine grid, presenting a simple model as well as a more extensive mixed model framework, and introducing various functional Bayesian inferential procedures that account for multiple testing. We examine these models via simulation and a data analysis with data from a study that used event-related potentials to examine how the brain processes various types of images. © 2015, The International Biometric Society.
Hyperopic photorefractive keratectomy and central islands
NASA Astrophysics Data System (ADS)
Gobbi, Pier Giorgio; Carones, Francesco; Morico, Alessandro; Vigo, Luca; Brancato, Rosario
1998-06-01
We have evaluated the refractive evolution in patients treated with hyperopic PRK to assess the extent of the initial overcorrection and the time constant of regression. To this end, the time history of the refractive error (i.e. the difference between achieved and intended refractive correction) has been fitted by means of an exponential statistical model, giving information characterizing the surgical procedure with a direct clinical meaning. Both hyperopic and myopic PRK procedures have been analyzed by this method. The analysis of the fitting model parameters shows that hyperopic PRK patients exhibit a markedly higher initial overcorrection than myopic ones, and a regression time constant which is much longer. A common mechanism is proposed to be responsible for the refractive outcomes in hyperopic treatments and in myopic patients exhibiting significant central islands. The interpretation is in terms of superhydration of the central cornea, and is based on a simple physical model evaluating the amount of centripetal compression in the apical cornea.
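The exponential fit described can be sketched as a nonlinear least-squares problem: refractive error as a function of time, err(t) = A·exp(-t/tau) + C, where A reflects the initial overcorrection and tau the regression time constant. The time points and diopter values below are illustrative, not the study's measurements.

```python
# Sketch of fitting an exponential regression model to refractive error over
# time after PRK; data points and starting values are illustrative.
import numpy as np
from scipy.optimize import curve_fit

def refractive_error(t, A, tau, C):
    # A = initial overcorrection, tau = regression time constant, C = residual offset
    return A * np.exp(-t / tau) + C

months = np.array([0.5, 1, 2, 3, 6, 9, 12])
err_d = np.array([1.6, 1.4, 1.0, 0.8, 0.4, 0.3, 0.25])   # diopters, hypothetical

(A, tau, C), _ = curve_fit(refractive_error, months, err_d, p0=(1.5, 3.0, 0.2))
print(f"initial overcorrection ~ {A:.2f} D, time constant ~ {tau:.1f} months")
```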
A Statistical Theory of Bidirectionality
NASA Technical Reports Server (NTRS)
DeLoach, Richard; Ulbrich, Norbert
2013-01-01
Original concepts related to the quantification and assessment of bidirectionality in strain-gage balances were introduced by Ulbrich in 2012. These concepts are extended here in three ways: 1) the metric originally proposed by Ulbrich is normalized, 2) a categorical variable is introduced in the regression analysis to account for load polarity, and 3) the uncertainty in both normalized and non-normalized bidirectionality metrics is quantified. These extensions are applied to four representative balances to assess the bidirectionality characteristics of each. The paper is tutorial in nature, featuring reviews of certain elements of regression and formal inference. Principal findings are that bidirectionality appears to be a common characteristic of most balance outputs and that unless it is taken into account, it is likely to consume the entire error budget of a typical balance calibration experiment. Data volume and correlation among calibration loads are shown to have a significant impact on the precision with which bidirectionality metrics can be assessed.
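The second extension, a categorical variable for load polarity, can be illustrated with a toy calibration regression in which a sign indicator enters the design matrix alongside the load term. The model form and data below are invented for illustration and are not the strain-gage balance calibration math model used in the paper.

```python
# Hedged sketch: a categorical load-polarity indicator added to a simple regression,
# illustrating extension (2) above. Data and model form are invented, not the
# actual balance calibration model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
load = rng.uniform(-100, 100, 200)                         # simulated applied load
polarity = (load >= 0).astype(float)                       # categorical: 1 = positive load
response = 0.05 * load + 0.4 * polarity + rng.normal(0, 0.1, 200)  # simulated gage output

X = sm.add_constant(np.column_stack([load, polarity]))
fit = sm.OLS(response, X).fit()
print(fit.params)   # intercept, load sensitivity, and the polarity (bidirectional) shift
```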
Outcomes of Sinonasal Cancer Treated With Proton Therapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dagan, Roi, E-mail: rdagan@floridaproton.org; Department of Radiation Oncology, University of Florida, Jacksonville, Florida; Bryant, Curtis
Purpose: To report disease outcomes after proton therapy (PT) for sinonasal cancer. Methods and Materials: Eighty-four adult patients without metastases received primary (13%) or adjuvant (87%) PT for sinonasal cancers (excluding melanoma, sarcoma, and lymphoma). Common histologies were olfactory neuroblastoma (23%), squamous cell carcinoma (22%), and adenoid cystic carcinoma (17%). Advanced stage (T3 in 25% and T4 in 69%) and high-grade histology (51%) were common. Surgical procedures included endoscopic resection alone (45%), endoscopic resection with craniotomy (12%), or open resection (30%). Gross residual disease was present in 26% of patients. Most patients received hyperfractionated PT (1.2 Gy [relative biological effectiveness (RBE)] twice daily, 99%) and chemotherapy (75%). The median PT dose was 73.8 Gy (RBE), with 85% of patients receiving more than 70 Gy (RBE). Prognostic factors were analyzed using Kaplan-Meier analysis and proportional hazards regression for multiple regression. Dosimetric parameters were evaluated using logistic regression. Serious, late grade 3 or higher toxicity was reported using the National Cancer Institute Common Terminology Criteria for Adverse Events, version 4. The median follow-up was 2.4 years for all patients and 2.7 years among living patients. Results: The local control (LC), neck control, freedom from distant metastasis, disease-free survival, cause-specific survival, and overall survival rates were 83%, 94%, 73%, 63%, 70%, and 68%, respectively, at 3 years. Gross total resection and PT resulted in a 90% 3-year LC rate. The 3-year LC rate was 61% for primary radiation therapy and 59% for patients with gross disease. Gross disease was the only significant factor for LC on multivariate analysis, whereas grade and continuous LC were prognostic for overall survival. Six of 12 local recurrences were marginal. Dural dissemination represented 26% of distant recurrences. Late toxicity occurred in 24% of patients (with grade 3 or higher unilateral vision loss in 2%). Conclusions: Dose-intensified, hyperfractionated PT with or without concurrent chemotherapy results in excellent LC after gross total resection, and results in patients with gross disease are encouraging. Patients with high-grade histology are at greater risk of death from distant dissemination. Continuous LC is a major determinant of survival justifying aggressive local therapy in nearly all cases.
Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating
ERIC Educational Resources Information Center
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei
2013-01-01
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Ryberg, Karen R.
2006-01-01
This report presents the results of a study by the U.S. Geological Survey, done in cooperation with the Bureau of Reclamation, U.S. Department of the Interior, to estimate water-quality constituent concentrations in the Red River of the North at Fargo, North Dakota. Regression analysis of water-quality data collected in 2003-05 was used to estimate concentrations and loads for alkalinity, dissolved solids, sulfate, chloride, total nitrite plus nitrate, total nitrogen, total phosphorus, and suspended sediment. The explanatory variables examined for regression relation were continuously monitored physical properties of water: streamflow, specific conductance, pH, water temperature, turbidity, and dissolved oxygen. For the conditions observed in 2003-05, streamflow was a significant explanatory variable for all estimated constituents except dissolved solids. pH, water temperature, and dissolved oxygen were not statistically significant explanatory variables for any of the constituents in this study. Specific conductance was a significant explanatory variable for alkalinity, dissolved solids, sulfate, and chloride. Turbidity was a significant explanatory variable for total phosphorus and suspended sediment. For the nutrients, total nitrite plus nitrate, total nitrogen, and total phosphorus, cosine and sine functions of time also were used to explain the seasonality in constituent concentrations. The regression equations were evaluated using common measures of variability, including R2, or the proportion of variability in the estimated constituent explained by the regression equation. R2 values ranged from 0.703 for total nitrogen concentration to 0.990 for dissolved-solids concentration. The regression equations also were evaluated by calculating the median relative percentage difference (RPD) between measured constituent concentration and the constituent concentration estimated by the regression equations. Median RPDs ranged from 1.1 for dissolved solids to 35.2 for total nitrite plus nitrate. Regression equations also were used to estimate daily constituent loads. Load estimates can be used by water-quality managers for comparison of current water-quality conditions to water-quality standards expressed as total maximum daily loads (TMDLs). TMDLs are a measure of the maximum amount of chemical constituents that a water body can receive and still meet established water-quality standards. The peak loads generally occurred in June and July when streamflow also peaked.
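A minimal sketch of this kind of regression, with log streamflow, log turbidity, and sine/cosine seasonal terms as explanatory variables, is shown below on simulated data; the variables mirror the report's approach, but the coefficients and R2 are not the report's fitted values.

```python
# Hedged sketch of a water-quality surrogate regression with seasonal terms, in the
# spirit of the report. All data are simulated; this is not the report's equation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 150
t_decimal = rng.uniform(0, 3, n)                           # decimal years
log_q = rng.normal(4, 1, n)                                # log streamflow
log_turb = rng.normal(2, 0.7, n)                           # log turbidity
season = np.column_stack([np.sin(2 * np.pi * t_decimal), np.cos(2 * np.pi * t_decimal)])
log_tp = 0.4 * log_q + 0.6 * log_turb + 0.3 * season[:, 0] + rng.normal(0, 0.2, n)

X = sm.add_constant(np.column_stack([log_q, log_turb, season]))
fit = sm.OLS(log_tp, X).fit()
print(f"R2 = {fit.rsquared:.3f}")                          # analogous to the report's R2
```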
Shah, Kalpit N; Defroda, Steven F; Wang, Bo; Weiss, Arnold-Peter C
2017-12-01
The first carpometacarpal (CMC) joint is a common site of osteoarthritis, with arthroplasty being a common procedure to provide pain relief and improve function with low complications. However, little is known about risk factors that may predispose a patient for postoperative complications. All CMC joint arthroplasties from 2005 to 2015 in the prospectively collected American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database were identified. Bivariate testing and multiple logistic regressions were performed to determine which patient demographics, surgical variables and medical comorbidities were significant predictors for complications. These included wound-related, cardiopulmonary, neurological and renal complications, return to the operating room (OR) and readmission. A total of 3344 patients were identified from the database. Of those, 45 patients (1.3%) experienced a complication including wound issues (0.66%), return to the OR (0.15%) and readmission (0.27%) amongst others. When performing bivariate analysis, age over 65, American Society of Anesthesiologists (ASA) Class, diabetes and renal dialysis were significant risk factors. Multiple logistic regression after adjusting for confounding factors demonstrated that insulin-dependent diabetes and ASA Class 4 had a strong trend while renal dialysis was a significant risk factor. CMC arthroplasty has a very low overall complication rate of 1.3% and wound complication rate of 0.66%. Diabetes requiring insulin and ASA Class 4 trended towards significance while renal dialysis was found to be a significant risk factor in logistic regression. This information may be useful for preoperative counseling and discussion with patients who have these risk factors.
Text Mining of Journal Articles for Sleep Disorder Terminologies.
Lam, Calvin; Lai, Fu-Chih; Wang, Chia-Hui; Lai, Mei-Hsin; Hsu, Nanly; Chung, Min-Huey
2016-01-01
Research on publication trends in journal articles on sleep disorders (SDs) and the associated methodologies by using text mining has been limited. The present study involved text mining for terms to determine the publication trends in sleep-related journal articles published during 2000-2013 and to identify associations between SD and methodology terms as well as conducting statistical analyses of the text mining findings. SD and methodology terms were extracted from 3,720 sleep-related journal articles in the PubMed database by using MetaMap. The extracted data set was analyzed using hierarchical cluster analyses and adjusted logistic regression models to investigate publication trends and associations between SD and methodology terms. MetaMap had a text mining precision, recall, and false positive rate of 0.70, 0.77, and 11.51%, respectively. The most common SD term was breathing-related sleep disorder, whereas narcolepsy was the least common. Cluster analyses showed similar methodology clusters for each SD term, except narcolepsy. The logistic regression models showed an increasing prevalence of insomnia, parasomnia, and other sleep disorders but a decreasing prevalence of breathing-related sleep disorder during 2000-2013. Different SD terms were positively associated with different methodology terms regarding research design terms, measure terms, and analysis terms. Insomnia-, parasomnia-, and other sleep disorder-related articles showed an increasing publication trend, whereas those related to breathing-related sleep disorder showed a decreasing trend. Furthermore, experimental studies more commonly focused on hypersomnia and other SDs and less commonly on insomnia, breathing-related sleep disorder, narcolepsy, and parasomnia. Thus, text mining may facilitate the exploration of the publication trends in SDs and the associated methodologies.
Epidemiologic Evaluation of Measurement Data in the Presence of Detection Limits
Lubin, Jay H.; Colt, Joanne S.; Camann, David; Davis, Scott; Cerhan, James R.; Severson, Richard K.; Bernstein, Leslie; Hartge, Patricia
2004-01-01
Quantitative measurements of environmental factors greatly improve the quality of epidemiologic studies but can pose challenges because of the presence of upper or lower detection limits or interfering compounds, which do not allow for precise measured values. We consider the regression of an environmental measurement (dependent variable) on several covariates (independent variables). Various strategies are commonly employed to impute values for interval-measured data, including assignment of one-half the detection limit to nondetected values or of “fill-in” values randomly selected from an appropriate distribution. On the basis of a limited simulation study, we found that the former approach can be biased unless the percentage of measurements below detection limits is small (5–10%). The fill-in approach generally produces unbiased parameter estimates but may produce biased variance estimates and thereby distort inference when 30% or more of the data are below detection limits. Truncated data methods (e.g., Tobit regression) and multiple imputation offer two unbiased approaches for analyzing measurement data with detection limits. If interest resides solely on regression parameters, then Tobit regression can be used. If individualized values for measurements below detection limits are needed for additional analysis, such as relative risk regression or graphical display, then multiple imputation produces unbiased estimates and nominal confidence intervals unless the proportion of missing data is extreme. We illustrate various approaches using measurements of pesticide residues in carpet dust in control subjects from a case–control study of non-Hodgkin lymphoma. PMID:15579415
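A small sketch of the censored-regression idea follows: a left-censored (Tobit-type) likelihood maximized numerically, compared with the half-detection-limit substitution the authors caution against. The data, detection limit, and coefficients are simulated assumptions, not the pesticide-residue data from the study.

```python
# Hedged sketch: Tobit-type (left-censored) regression versus naive half-detection-
# limit substitution, on simulated data with a lower detection limit.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(3)
n, dl = 400, 1.0                                           # sample size, detection limit
x = rng.normal(0, 1, n)
y_true = 1.0 + 0.8 * x + rng.normal(0, 1, n)
censored = y_true < dl
y_obs = np.where(censored, dl, y_true)

def negloglik(theta):
    b0, b1, log_sigma = theta
    sigma = np.exp(log_sigma)
    mu = b0 + b1 * x
    ll_obs = norm.logpdf(y_obs[~censored], mu[~censored], sigma)      # detected values
    ll_cens = norm.logcdf((dl - mu[censored]) / sigma)                # P(Y < detection limit)
    return -(ll_obs.sum() + ll_cens.sum())

tobit = minimize(negloglik, x0=np.zeros(3), method="Nelder-Mead")
naive_slope = np.polyfit(x, np.where(censored, dl / 2, y_obs), 1)[0]  # half-DL fill-in
print("Tobit slope:", round(tobit.x[1], 3), "| naive half-DL slope:", round(naive_slope, 3))
```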
Peak oxygen consumption measured during the stair-climbing test in lung resection candidates.
Brunelli, Alessandro; Xiumé, Francesco; Refai, Majed; Salati, Michele; Di Nunzio, Luca; Pompili, Cecilia; Sabbatini, Armando
2010-01-01
The stair-climbing test is commonly used in the preoperative evaluation of lung resection candidates, but it is difficult to standardize and provides little physiologic information on the performance. To verify the association between the altitude and the V(O2peak) measured during the stair-climbing test. 109 consecutive candidates for lung resection performed a symptom-limited stair-climbing test with direct breath-by-breath measurement of V(O2peak) by a portable gas analyzer. Stepwise logistic regression and bootstrap analyses were used to verify the association of several perioperative variables with a V(O2peak) <15 ml/kg/min. Subsequently, multiple regression analysis was also performed to develop an equation to estimate V(O2peak) from stair-climbing parameters and other patient-related variables. 56% of patients climbing <14 m had a V(O2peak) <15 ml/kg/min, whereas 98% of those climbing >22 m had a V(O2peak) >15 ml/kg/min. The altitude reached at the stair-climbing test was the only significant predictor of a V(O2peak) <15 ml/kg/min in the logistic regression analysis. Multiple regression analysis yielded an equation to estimate V(O2peak) factoring altitude (p < 0.0001), speed of ascent (p = 0.005) and body mass index (p = 0.0008). There was an association between altitude and V(O2peak) measured during the stair-climbing test. Most of the patients climbing more than 22 m are able to generate high values of V(O2peak) and can proceed to surgery without any additional tests. All others need to be referred for a formal cardiopulmonary exercise test. In addition, we were able to generate an equation to estimate V(O2peak), which could assist in streamlining the preoperative workup and could be used across different settings to standardize this test. Copyright (c) 2010 S. Karger AG, Basel.
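A hedged sketch of such an estimating equation, regressing V(O2peak) on altitude climbed, speed of ascent, and body mass index, is given below; the data and coefficients are invented, and the published equation is not reproduced here.

```python
# Hedged sketch of a multiple regression estimating VO2peak from stair-climbing
# altitude, speed of ascent, and BMI. Data and coefficients are invented, not the
# study's published equation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 109
altitude = rng.uniform(8, 30, n)                           # metres climbed (assumed range)
speed = rng.uniform(0.2, 0.6, n)                           # metres per second (assumed)
bmi = rng.uniform(19, 35, n)
vo2 = 5 + 0.6 * altitude + 10 * speed - 0.2 * bmi + rng.normal(0, 2, n)

X = sm.add_constant(np.column_stack([altitude, speed, bmi]))
print(sm.OLS(vo2, X).fit().params)                         # intercept and three slopes
```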
Jung, Jiwon; Moon, Song Mi; Jang, Hee-Chang; Kang, Cheol-In; Jun, Jae-Bum; Cho, Yong Kyun; Kang, Seung-Ji; Seo, Bo-Jeong; Kim, Young-Joo; Park, Seong-Beom; Lee, Juneyoung; Yu, Chang Sik; Kim, Sung-Han
2018-01-01
The aim of this study was to investigate the incidence and risk factors of postoperative pneumonia (POP) within 1 year after cancer surgery in patients with the five most common cancers (gastric, colorectal, lung, breast cancer, and hepatocellular carcinoma [HCC]) in South Korea. This was a multicenter and retrospective cohort study performed at five nationwide cancer centers. The number of cancer patients in each center was allocated by the proportion of cancer surgery. Adult patients were randomly selected according to the allocated number, among those who underwent cancer surgery from January to December 2014 within 6 months after diagnosis of cancer. One-year cumulative incidence of POP was estimated using Kaplan-Meier analysis. A univariable Cox proportional hazards regression analysis was performed to identify risk factors for POP development. As a multivariable analysis, confounders were adjusted using a multiple Cox proportional hazards regression model. Among the total 2000 patients, the numbers of patients with gastric cancer, colorectal cancer, lung cancer, breast cancer, and HCC were 497 (25%), 525 (26%), 277 (14%), 552 (28%), and 149 (7%), respectively. Overall, the 1-year cumulative incidence of POP was 2.0% (95% CI, 1.4-2.6). The 1-year cumulative incidences in each cancer were as follows: lung 8.0%, gastric 1.8%, colorectal 1.0%, HCC 0.7%, and breast 0.4%. In multivariable analysis, older age, higher Charlson comorbidity index (CCI) score, ulcer disease, history of pneumonia, and smoking were related to POP development. In conclusion, the 1-year cumulative incidence of POP in the five most common cancers was 2%. Older age, higher CCI scores, smoking, ulcer disease, and previous pneumonia history increased the risk of POP development in cancer patients. © 2017 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
Applied Multiple Linear Regression: A General Research Strategy
ERIC Educational Resources Information Center
Smith, Brandon B.
1969-01-01
Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)
Salamo, Oriana; Roghaee, Shiva; Schweitzer, Michael D; Mantero, Alejandro; Shafazand, Shirin; Campos, Michael; Mirsaeidi, Mehdi
2018-05-03
Sarcoidosis commonly affects the lung. Lung transplantation (LT) is required when there is severe and refractory involvement. We compared post-transplant survival rates of sarcoidosis patients with those of patients with chronic obstructive pulmonary disease (COPD) and idiopathic pulmonary fibrosis (IPF). We also explored whether the race and age of the donor, and double lung transplant, have any effect on survival in the post-transplant setting. We analyzed 9,727 adult patients with sarcoidosis, COPD, and IPF who underwent LT worldwide between 2005 and 2015, based on the United Network for Organ Sharing (UNOS) database. Survival rates were compared with Kaplan-Meier analysis, and risk factors were investigated by Cox regression analysis. 469 (5%) were transplanted because of sarcoidosis, 3,688 (38%) for COPD and 5,570 (57%) for IPF. Unadjusted survival analysis showed a better post-transplant survival rate for patients with sarcoidosis (p < 0.001, Log-rank test). In Cox regression analysis, double lung transplant and white race of the lung donor were associated with a significant survival advantage. Since double lung transplant recipients, younger patients, and those with a lower Lung Allocation Score (LAS) at the time of transplant have a survival advantage, we suggest double lung transplantation as the procedure of choice, especially in younger sarcoidosis subjects with lower LAS scores.
Elevated tumour necrosis factor-alpha was associated with intima thickening in obese children.
Bo, Luo; Yi-Can, Yang; Qing, Zhou; Xiao-Hui, Wu; Ke, Huang; Chao-Chun, Zou
2017-04-01
This study investigated the relationship between intima-media thickness (IMT) and immune parameters in obese children from five to 16 years of age. We enrolled 185 obese children with a mean age of 10.65 ± 2.10 years and 211 controls with a mean age of 10.32 ± 1.81 years. Glycometabolism, lipid metabolism, sex hormones, immune indices and carotid IMT were measured. Serum interleukin (IL)-6, IL-10, tumour necrosis factor (TNF)-alpha, white blood cells and common and internal carotid artery IMTs in the obese group were higher than those in the control group (p < 0.05, respectively). Bivariate correlation analysis showed that the common carotid arterial IMT was positively correlated with alanine aminotransferase, triglyceride, uric acid, apolipoprotein B, IL-6, IL-10, TNF-alpha, follicle-stimulating hormone and testosterone. Internal carotid artery IMT was positively correlated with alanine aminotransferase and follicle-stimulating hormone. Both common and internal carotid artery IMTs were inversely correlated with apolipoprotein A1 (p < 0.05, respectively). Stepwise multiple regression analysis showed that testosterone, alanine aminotransferase and TNF-alpha were the independent determinants of common carotid arterial IMT. Tumour necrosis factor-alpha, alanine aminotransferase and testosterone were associated with intima thickening in the early life in obese children and may increase later risks of premature atherogenicity and adult cardio-cerebrovascular diseases. ©2016 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
Austin, Peter C; Wagner, Philippe; Merlo, Juan
2017-03-15
Multilevel data occurs frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models (MLRM). MLRM incorporate cluster-specific random effects which allow one to partition the total individual variance into between-cluster variation and between-individual variation. Statistically, MLRM account for the dependency of the data within clusters and provide correct estimates of uncertainty around regression coefficients. Substantively, the magnitude of the effect of clustering provides a measure of the General Contextual Effect (GCE). When outcomes are binary, the GCE can also be quantified by measures of heterogeneity like the Median Odds Ratio (MOR) calculated from a multilevel logistic regression model. Time-to-event outcomes within a multilevel structure occur commonly in epidemiological and medical research. However, the Median Hazard Ratio (MHR) that corresponds to the MOR in multilevel (i.e., 'frailty') Cox proportional hazards regression is rarely used. Analogously to the MOR, the MHR is the median relative change in the hazard of the occurrence of the outcome when comparing identical subjects from two randomly selected different clusters that are ordered by risk. We illustrate the application and interpretation of the MHR in a case study analyzing the hazard of mortality in patients hospitalized for acute myocardial infarction at hospitals in Ontario, Canada. We provide R code for computing the MHR. The MHR is a useful and intuitive measure for expressing cluster heterogeneity in the outcome and, thereby, estimating general contextual effects in multilevel survival analysis. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
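The authors provide R code; as a rough Python equivalent, the MHR for a normally distributed cluster random effect on the log-hazard scale can be computed from the cluster-level variance with the same closed form used for the MOR, MHR = exp(sqrt(2 * variance) * z_0.75). The variance value in the example below is hypothetical.

```python
# Hedged sketch: Median Hazard Ratio from the cluster-level variance of a multilevel
# (frailty) Cox model, by analogy with the Median Odds Ratio. The paper ships R code;
# this Python version assumes a normal random effect on the log-hazard scale, and the
# example variance (0.25) is hypothetical.
import numpy as np
from scipy.stats import norm

def median_hazard_ratio(cluster_var: float) -> float:
    """MHR = exp(sqrt(2 * var) * z_0.75)."""
    return float(np.exp(np.sqrt(2.0 * cluster_var) * norm.ppf(0.75)))

print(round(median_hazard_ratio(0.25), 2))   # hypothetical hospital-level variance -> ~1.61
```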
Chang, Brian A; Pearson, William S; Owusu-Edusei, Kwame
2017-04-01
We used a combination of hot spot analysis (HSA) and spatial regression to examine county-level hot spot correlates for the most commonly reported nonviral sexually transmitted infections (STIs) in the 48 contiguous states in the United States (US). We obtained reported county-level total case rates of chlamydia, gonorrhea, and primary and secondary (P&S) syphilis in all counties in the 48 contiguous states from national surveillance data and computed temporally smoothed rates using 2008-2012 data. Covariates were obtained from county-level multiyear (2008-2012) American Community Surveys from the US census. We conducted HSA to identify hot spot counties for all three STIs. We then applied spatial logistic regression with the spatial error model to determine the association between the identified hot spots and the covariates. HSA indicated that ≥84% of hot spots for each STI were in the South. Spatial regression results indicated that, a 10-unit increase in the percentage of Black non-Hispanics was associated with ≈42% (P < 0.01) [≈22% (P < 0.01), for Hispanics] increase in the odds of being a hot spot county for chlamydia and gonorrhea, and ≈27% (P < 0.01) [≈11% (P < 0.01) for Hispanics] for P&S syphilis. Compared with the other regions (West, Midwest, and Northeast), counties in the South were 6.5 (P < 0.01; chlamydia), 9.6 (P < 0.01; gonorrhea), and 4.7 (P < 0.01; P&S syphilis) times more likely to be hot spots. Our study provides important information on hot spot clusters of nonviral STIs in the entire United States, including associations between hot spot counties and sociodemographic factors. Published by Elsevier Inc.
van der Meer, D; Hoekstra, P J; van Donkelaar, M; Bralten, J; Oosterlaan, J; Heslenfeld, D; Faraone, S V; Franke, B; Buitelaar, J K; Hartman, C A
2017-01-01
Identifying genetic variants contributing to attention-deficit/hyperactivity disorder (ADHD) is complicated by the involvement of numerous common genetic variants with small effects, interacting with each other as well as with environmental factors, such as stress exposure. Random forest regression is well suited to explore this complexity, as it allows for the analysis of many predictors simultaneously, taking into account any higher-order interactions among them. Using random forest regression, we predicted ADHD severity, measured by Conners’ Parent Rating Scales, from 686 adolescents and young adults (of which 281 were diagnosed with ADHD). The analysis included 17 374 single-nucleotide polymorphisms (SNPs) across 29 genes previously linked to hypothalamic–pituitary–adrenal (HPA) axis activity, together with information on exposure to 24 individual long-term difficulties or stressful life events. The model explained 12.5% of variance in ADHD severity. The most important SNP, which also showed the strongest interaction with stress exposure, was located in a region regulating the expression of telomerase reverse transcriptase (TERT). Other high-ranking SNPs were found in or near NPSR1, ESR1, GABRA6, PER3, NR3C2 and DRD4. Chronic stressors were more influential than single, severe, life events. Top hits were partly shared with conduct problems. We conclude that random forest regression may be used to investigate how multiple genetic and environmental factors jointly contribute to ADHD. It is able to implicate novel SNPs of interest, interacting with stress exposure, and may explain inconsistent findings in ADHD genetics. This exploratory approach may be best combined with more hypothesis-driven research; top predictors and their interactions with one another should be replicated in independent samples. PMID:28585928
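A minimal sketch of random forest regression in this spirit, with many simulated SNP predictors plus a stress-exposure count and variable importances to rank them, appears below; it is not the study's genotype data or its exact analysis pipeline.

```python
# Hedged sketch of random forest regression for a quantitative trait from many SNPs
# (coded 0/1/2) plus a stress-exposure count, ranked by variable importance.
# All data are simulated; this is not the study's data or exact analysis pipeline.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
n, n_snps = 686, 200
snps = rng.integers(0, 3, size=(n, n_snps)).astype(float)  # simulated allele counts
stress = rng.poisson(3, size=(n, 1)).astype(float)         # simulated stressor count
X = np.hstack([snps, stress])
# one SNP interacts with stress exposure, echoing the gene-environment interaction theme
y = 0.5 * snps[:, 0] * stress[:, 0] + rng.normal(0, 1, n)

rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:5]
print("top-ranked predictors (column 200 = stress):", top)
```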
NASA Astrophysics Data System (ADS)
Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.
2017-05-01
The braking system is one of the most important and complex subsystems of railway vehicles, especially when it comes to safety. Therefore, installing efficient, safe brakes on modern railway vehicles is essential. Nowadays, attention is devoted to solving problems connected with the use of high-performance brake materials and their impact on the thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, because the interaction between the brake block and the wheel produces complex thermo-mechanical phenomena. In this work, the investigated subjects are cast-iron brake shoes, which are still widely used on freight wagons. Cast-iron brake shoes - with lamellar graphite and with a high content of phosphorus (0.8-1.1%) - therefore need special investigation. In order to establish the optimal condition for the cast-iron brake shoes, we proposed a mathematical modelling study using statistical analysis and multiple regression equations. Multivariate research is important in cast-iron brake shoe manufacturing, because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulty comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. To capture the multiple correlation between the hardness of the cast-iron brake shoes and the chemical composition elements, several types of regression models were proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representations of the regression equations in order to explain the interaction of the variables and locate the optimal level of each variable for maximal response. The regression coefficients, dispersion and correlation coefficients were calculated using the software Matlab.
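A small sketch of the regression step, relating hardness to composition variables and fitting a simple response surface, is shown below in Python (the study used Matlab); apart from the quoted phosphorus range, the composition ranges, coefficients, and data are invented.

```python
# Hedged sketch of a hardness-versus-composition regression with a quadratic term,
# i.e. a simple response surface. Only the phosphorus range comes from the abstract;
# the carbon range, coefficients, and data are invented, and Python stands in for
# the Matlab workflow described.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 80
phosphorus = rng.uniform(0.8, 1.1, n)                      # % P (range quoted above)
carbon = rng.uniform(3.0, 3.6, n)                          # % C (assumed range)
hardness = (180 + 40 * phosphorus + 15 * carbon
            - 20 * (phosphorus - 0.95) ** 2 + rng.normal(0, 3, n))   # simulated response

X = sm.add_constant(np.column_stack([phosphorus, carbon, phosphorus ** 2]))
fit = sm.OLS(hardness, X).fit()
print(fit.params)                                          # response-surface coefficients
```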
Leukemia in Iran: Epidemiology and Morphology Trends.
Koohi, Fatemeh; Salehiniya, Hamid; Shamlou, Reza; Eslami, Soheyla; Ghojogh, Ziyaeddin Mahery; Kor, Yones; Rafiemanesh, Hosein
2015-01-01
Leukemia accounts for 8% of total cancer cases, involves all age groups with different prevalence and incidence rates in Iran and worldwide, and causes a significant death toll and heavy expenses for diagnosis and treatment. This study was done to evaluate the epidemiology and morphology of blood cancer during 2003-2008. This cross-sectional study was carried out based on re-analysis of the Cancer Registry Center report of the Health Deputy in Iran during a 6-year period (2003 - 2008). Statistical analysis for incidence time trends and morphology change percentage was performed with joinpoint regression analysis using the software Joinpoint Regression Program. During the studied years, a total of 18,353 hematopoietic and reticuloendothelial system cancers were recorded. A chi-square test showed a significant difference between sex and morphological types of blood cancer (P-value<0.001). Joinpoint analysis showed a significant increasing trend for the adjusted standard incidence rate (ASIR) for both sexes (P-value<0.05). Annual percent changes (APC) for women and men were 18.7 and 19.9, respectively. The most common morphological blood cancers were ALL, ALM, MM and CLL, which accounted for 60% of total hematopoietic system cancers. Joinpoint analysis showed a significant decreasing trend for ALM in both sexes (P-value<0.05). Hematopoietic system cancers in Iran demonstrate an increasing trend for incidence rate and a decreasing trend for ALL, ALM and CLL morphology.
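The annual percent change reported by joinpoint software is, within each segment, derived from a log-linear regression of the rate on calendar year; a minimal sketch of that calculation on invented rates follows (it does not implement the joinpoint change-point search itself).

```python
# Hedged sketch: annual percent change (APC) from a log-linear regression of rate on
# year, the per-segment building block of joinpoint analysis. Rates are invented and
# the joinpoint change-point search is not implemented here.
import numpy as np

years = np.arange(2003, 2009)
asir = np.array([4.1, 4.9, 5.8, 7.0, 8.3, 9.9])            # hypothetical standardized rates
slope, _ = np.polyfit(years, np.log(asir), 1)
apc = 100 * (np.exp(slope) - 1)                            # APC = 100 * (e^b - 1)
print(f"APC ~ {apc:.1f}% per year")
```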
Quotation accuracy in medical journal articles-a systematic review and meta-analysis.
Jergas, Hannah; Baethge, Christopher
2015-01-01
Background. Quotations and references are an indispensable element of scientific communication. They should support what authors claim or provide important background information for readers. Studies indicate, however, that quotations not serving their purpose-quotation errors-may be prevalent. Methods. We carried out a systematic review, meta-analysis and meta-regression of quotation errors, taking account of differences between studies in error ascertainment. Results. Out of 559 studies screened we included 28 in the main analysis, and estimated major, minor and total quotation error rates of 11.9%, 95% CI [8.4, 16.6], 11.5% [8.3, 15.7], and 25.4% [19.5, 32.4], respectively. While heterogeneity was substantial, even the lowest estimate of total quotation errors was considerable (6.7%). Indirect references accounted for less than one sixth of all quotation problems. The findings remained robust in a number of sensitivity and subgroup analyses (including risk of bias analysis) and in meta-regression. There was no indication of publication bias. Conclusions. Readers of medical journal articles should be aware of the fact that quotation errors are common. Measures against quotation errors include spot checks by editors and reviewers, correct placement of citations in the text, and declarations by authors that they have checked cited material. Future research should elucidate if and to what degree quotation errors are detrimental to scientific progress.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tatiana G. Levitskaia; James M. Peterson; Emily L. Campbell
2013-12-01
In liquid–liquid extraction separation processes, accumulation of organic solvent degradation products is detrimental to the process robustness, and frequent solvent analysis is warranted. Our research explores the feasibility of online monitoring of the organic solvents relevant to used nuclear fuel reprocessing. This paper describes the first phase of developing a system for monitoring the tributyl phosphate (TBP)/n-dodecane solvent commonly used to separate used nuclear fuel. In this investigation, the effect of extraction of nitric acid from aqueous solutions of variable concentrations on the quantification of TBP and its major degradation product dibutylphosphoric acid (HDBP) was assessed. Fourier transform infrared (FTIR) spectroscopy was used to discriminate between HDBP and TBP in the nitric acid-containing TBP/n-dodecane solvent. Multivariate analysis of the spectral data facilitated the development of regression models for HDBP and TBP quantification in real time, enabling online implementation of the monitoring system. The predictive regression models were validated using TBP/n-dodecane solvent samples subjected to high-dose external γ-irradiation. The predictive models were translated to flow conditions using a hollow fiber FTIR probe installed in a centrifugal contactor extraction apparatus, demonstrating the applicability of the FTIR technique coupled with multivariate analysis for the online monitoring of the organic solvent degradation products.
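The abstract does not name the specific multivariate regression algorithm; a common choice for spectral calibration is partial least squares, sketched below on simulated spectra purely as an illustration of the "multivariate analysis plus regression model" step.

```python
# Hedged sketch of a multivariate spectral calibration. The abstract does not name the
# algorithm; partial least squares (a common choice for FTIR data) is used here on
# simulated spectra and concentrations, purely for illustration.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n, n_points = 120, 300
conc = rng.uniform(0, 5, n)                                # simulated HDBP concentration
peak = np.exp(-0.5 * ((np.arange(n_points) - 150) / 10) ** 2)   # synthetic absorption band
spectra = np.outer(conc, peak) + rng.normal(0, 0.05, (n, n_points))

pls = PLSRegression(n_components=3)
r2 = cross_val_score(pls, spectra, conc, cv=5, scoring="r2").mean()
print(f"cross-validated R2 ~ {r2:.3f}")
```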
DOE Office of Scientific and Technical Information (OSTI.GOV)
Levitskaia, Tatiana G.; Peterson, James M.; Campbell, Emily L.
2013-11-05
In liquid-liquid extraction separation processes, accumulation of organic solvent degradation products is detrimental to the process robustness, and frequent solvent analysis is warranted. Our research explores the feasibility of online monitoring of the organic solvents relevant to used nuclear fuel reprocessing. This paper describes the first phase of developing a system for monitoring the tributyl phosphate (TBP)/n-dodecane solvent commonly used to separate used nuclear fuel. In this investigation, the effect of extraction of nitric acid from aqueous solutions of variable concentrations on the quantification of TBP and its major degradation product dibutyl phosphoric acid (HDBP) was assessed. Fourier transform infrared (FTIR) spectroscopy was used to discriminate between HDBP and TBP in the nitric acid-containing TBP/n-dodecane solvent. Multivariate analysis of the spectral data facilitated the development of regression models for HDBP and TBP quantification in real time, enabling online implementation of the monitoring system. The predictive regression models were validated using TBP/n-dodecane solvent samples subjected to high-dose external gamma irradiation. The predictive models were translated to flow conditions using a hollow fiber FTIR probe installed in a centrifugal contactor extraction apparatus, demonstrating the applicability of the FTIR technique coupled with multivariate analysis for the online monitoring of the organic solvent degradation products.
Fischer, Thomas; Fischer, Susanne; Himmel, Wolfgang; Kochen, Michael M; Hummers-Pradier, Eva
2008-01-01
The influence of patient characteristics on family practitioners' (FPs') diagnostic decision making has mainly been investigated using indirect methods such as vignettes or questionnaires. Direct observation-borrowed from social and cultural anthropology-may be an alternative method for describing FPs' real-life behavior and may help in gaining insight into how FPs diagnose respiratory tract infections, which are frequent in primary care. To clarify FPs' diagnostic processes when treating patients suffering from symptoms of respiratory tract infection. This direct observation study was performed in 30 family practices using a checklist for patient complaints, history taking, physical examination, and diagnoses. The influence of patients' symptoms and complaints on the FPs' physical examination and diagnosis was calculated by logistic regression analyses. Dummy variables based on combinations of symptoms and complaints were constructed and tested against saturated (full) and backward regression models. In total, 273 patients (median age 37 years, 51% women) were included. The median number of symptoms described was 4 per patient, and most information was provided at the patients' own initiative. Multiple logistic regression analysis showed a strong association between patients' complaints and the physical examination. Frequent diagnoses were upper respiratory tract infection (URTI)/common cold (43%), bronchitis (26%), sinusitis (12%), and tonsillitis (11%). There were no significant statistical differences between "simple heuristic" models and saturated regression models in the diagnoses of bronchitis, sinusitis, and tonsillitis, indicating that simple heuristics are probably used by the FPs, whereas "URTI/common cold" was better explained by the full model. FPs tended to make their diagnosis based on a few patient symptoms and a limited physical examination. Simple heuristic models were almost as powerful in explaining most diagnoses as saturated models. Direct observation allowed for the study of decision making under real conditions, yielding both quantitative data and "qualitative" information about the FPs' performance. It is important for investigators to be aware of the specific disadvantages of the method (e.g., a possible observer effect).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hahn, M.; Walton, R.
2007-01-01
Common wood-nymph butterflies are found throughout the United States and Canada. However, not much is known about how they overwinter or their preferences for particular grasses and habitats. In this study, the impact of prairie management plans on the abundance of the wood-nymph population was assessed, as well as the preference of these butterflies for areas with native or non-native grasses. The abundance of common wood-nymph butterflies was determined using Pollard walks; more common wood-nymph butterflies were found in the European grasses than were found in the burned and unburned prairie sites. The majority of the vegetation at each of the three sites was identified and documented. Using a 1 × 3 ANOVA analysis, it was determined there were significantly more butterflies in the European grasses than in the burned and unburned prairie sites (p < 0.0005). There was no significant difference between the burned and unburned treatments of the prairie on the common wood-nymph population. A multiple variable linear regression model described the effect of temperature and wind speed on the number of observed common wood-nymph butterflies per hour (p = 0.026). These preliminary results need to be supplemented with future studies. Quadrat analysis of the vegetation from all three sites should be done to search for a correlation between common wood-nymph butterfly abundance per hour and the specific types or quantity of vegetation at each site. The effect of vegetation height and density on the observer's visual field should also be assessed.
[Common mental disorders and the use of psychoactive drugs: the impact of socioeconomic conditions].
Lima, Maria Cristina Pereira; Menezes, Paulo Rossi; Carandina, Luana; Cesar, Chester Luiz Galvão; Barros, Marilisa Berti de Azevedo; Goldbaum, Moisés
2008-08-01
To evaluate the influence of socioeconomic conditions on the association between common mental disorders and the use of health services and psychoactive drugs. This was a population-based cross-sectional study conducted in the city of Botucatu, Southeastern Brazil. The sample was probabilistic, stratified and cluster-based. Interviews with 1,023 subjects aged 15 years or over were held in their homes between 2001 and 2002. Common mental disorders were evaluated using the Self-Reporting Questionnaire (SRQ-20). The use of services was investigated in relation to the fortnight preceding the interview and the use of psychotropic drugs, over the preceding three days. Logistic regression was used for multivariable analysis, and the design effect was taken into consideration. Out of the whole sample, 13.4% (95% CI: 10.7;16.0) had sought health services over the fortnight preceding the interview. Seeking health services was associated with female gender (OR=2.0) and the presence of common mental disorders (OR=2.2). 13.3% of the sample (95% CI: 9.2;17.5) said they had used at least one psychotropic drug, especially antidepressives (5.0%) and benzodiazepines (3.1%). In the multivariable analysis, female gender and the presence of common mental disorders remained associated with the use of benzodiazepines. Per capita income presented a direct and independent association with the use of psychoactive drugs: the greater the income, the greater the use of these drugs was. Lower income was associated with the presence of common mental disorders, but not with the use of psychotropic drugs. The association of common mental disorders and the use of psychotropic drugs in relation to higher income strengthens the hypothesis that inequality of access to medical services exists among this population.
Evaluation of Cardiopulmonary Resuscitation (CPR) for Patient Outcomes and their Predictors
Singh, Swati; Grewal, Anju; Gautam, Parshotam L; Luthra, Neeru; Tanwar, Gayatri; Kaur, Amarpreet
2016-01-01
Introduction Cardiac arrest continues to be a common cause of in-hospital deaths. Even small improvements in survival can translate into thousands of lives saved every year. Aim The aim of our prospective observational study was to elicit the outcomes and predictors of in-hospital cardiopulmonary resuscitation among adult patients. Settings and Design All in-hospital adult patients (age >14) who suffered cardiac arrest & were attended by a Code Blue Team between 1st January 2012 & 30th April 2013 were part of the study. Materials and Methods The cardiopulmonary resuscitation (CPR) was assessed in terms of: Response time, Presenting initial rhythm, Time to first defibrillation, Duration of CPR and Outcome (Return of spontaneous circulation (ROSC), Glasgow outcome scale (GOS) at discharge). Statistical Analysis Age, GOS and mean response time were analysed using t-test and ANOVA. Logistic regression was applied to determine the significance of the various factors in determining mortality. Results ROSC was achieved in 44% of a total of 127 patients included in our study. Asystole/Pulseless electrical activity (PEA) was the most common presenting rhythm (87.5%). The survival to discharge was seen in 7.1% patients of whom only 3.9% patients had good neurological outcome. Regression and survival analysis depicted achievement of ROSC during CPR, absence of co-morbidities and shorter response time of code blue team as predictors of good outcome. Conclusion We found poor outcome of CPR after in-hospital cardiac arrest. This was mainly attributed to an initial presenting rhythm of Asystole/PEA in most cases and delayed response times. PMID:26894150
Ai Er Ken, Ai Bi Bai; Ma, Zhi-Hua; Xiong, Dai-Qin; Xu, Pei-Ru
2017-04-01
To investigate the clinical features of invasive candidiasis in children and the risk factors for Candida bloodstream infection. A retrospective study was performed on 134 children with invasive candidiasis and hospitalized in 5 tertiary hospitals in Urumqi, China, between January 2010 and December 2015. The Candida species distribution was investigated. The clinical data were compared between the patients with and without Candida bloodstream infection. The risk factors for Candida bloodstream infection were investigated using multivariate logistic regression analysis. A total of 134 Candida strains were isolated from 134 children with invasive candidiasis, and non-albicans Candida (NAC) accounted for 53.0%. The incidence of invasive candidiasis in the PICU and other pediatric wards were 41.8% and 48.5% respectively. Sixty-eight patients (50.7%) had Candida bloodstream infection, and 45 patients (33.6%) had Candida urinary tract infection. There were significant differences in age, rate of use of broad-spectrum antibiotics, and incidence rates of chronic renal insufficiency, heart failure, urinary catheterization, and NAC infection between the patients with and without Candida bloodstream infection (P<0.05). The multivariate logistic regression analysis showed that younger age (1-24 months) (OR=6.027) and NAC infection (OR=1.020) were the independent risk factors for Candida bloodstream infection. The incidence of invasive candidiasis is similar between the PICU and other pediatric wards. NAC is the most common species of invasive candidiasis. Candida bloodstream infection is the most common invasive infection. Younger age (1-24 months) and NAC infection are the risk factors for Candida bloodstream infection.
Antonioli, P; Manzalini, M C; Stefanati, A; Bonato, B; Verzola, A; Formaglio, A; Gabutti, G
2016-09-01
Healthcare-associated infections (HAIs) and misuse of antimicrobials (AMs) represent a growing public health problem. Point Prevalence Surveys (PPSs) provide information that can be used for specific targeted interventions and to evaluate their effects. The objective of this study was to estimate the prevalence of HAIs and AM use, to describe types of infections and causative pathogens, and to compare data collected through three PPSs in Ferrara University Hospital (FUH), repeated in 3 different years (2011-2013). The population-based sample consists of all patients admitted to every acute care and rehabilitation Department in a single day. The study used the ECDC Protocol and Form for PPS of HAI and AM use, Version 4.2, July 2011. Risk factor analysis was performed using logistic regression. 1,239 patients were observed. Overall, HAI prevalence was 9.6%; prevalence was higher in Intensive Care Units; urinary tract infections were the most common HAIs in all 3 surveys; E. coli was the most common pathogen; AM use prevalence was 51.1%; AMs most frequently administered were fluoroquinolones, combinations of penicillins and third-generation cephalosporins. According to the regression model, urinary catheter (OR: 2.5) and invasive respiratory device (OR: 2.3) are significantly associated risk factors for HAIs (p < 0.05). PPSs are a sensitive and effective method of analysis. Yearly repetition is a useful way to maintain focus on the topic of HAIs and AM use, highlighting how changes in practices impact on the outcome of care and providing useful information to implement intervention programs targeted on specific issues.
Espigares, Miguel; Lardelli, Pablo; Ortega, Pedro
2003-10-01
The presence of trihalomethanes (THMs) in potable-water sources is an issue of great interest because of the negative impact THMs have on human health. The objective of this study was to correlate the presence of trihalomethanes with more routinely monitored parameters of water quality, in order to facilitate THM control. Water samples taken at various stages of treatment from a water treatment plant were analyzed for the presence of trihalomethanes with the Fujiwara method. The data collected from these determinations were compared with the values obtained for free-residual-chlorine and combined-residual-chlorine levels as well as standard physico-chemical and microbiological indicators such as chemical oxygen demand (by the KMnO4 method), total chlorophyll, conductivity, pH, alkalinity, turbidity, chlorides, sulfates, nitrates, nitrites, phosphates, ammonia, calcium, magnesium, heterotrophic bacteria count, Pseudomonas spp., total and fecal coliforms, and fecal streptococci. The data from these determinations were compiled, and statistical analysis was performed to determine which variables correlate best with the presence and quantity of trihalomethanes in the samples. Levels of THMs in water seem to correlate directly with levels of combined residual chlorine and nitrates, and inversely with the level of free residual chlorine. Statistical analysis with multiple linear regression was conducted to determine the best-fitting models. The models chosen incorporate between two and four independent variables and include chemical oxygen demand, nitrites, and ammonia. These indicators, which are commonly determined during the water treatment process, demonstrate the strongest correlation with the levels of trihalomethanes in water and offer great utility as an accessible method for THM detection and control.
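A toy version of such a model, regressing THM levels on a handful of the routinely monitored indicators named in the abstract, is sketched below; the values and fitted coefficients are simulated and are not the study's models.

```python
# Hedged sketch: multiple linear regression of trihalomethane levels on routinely
# monitored indicators (combined residual chlorine, COD, nitrites, ammonia). All
# values are simulated; this is not one of the study's fitted models.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 100
combined_cl = rng.uniform(0.1, 1.0, n)
cod = rng.uniform(1, 6, n)
nitrites = rng.uniform(0.0, 0.3, n)
ammonia = rng.uniform(0.0, 0.5, n)
thm = 20 + 60 * combined_cl + 5 * cod + 30 * nitrites + 10 * ammonia + rng.normal(0, 5, n)

X = sm.add_constant(np.column_stack([combined_cl, cod, nitrites, ammonia]))
fit = sm.OLS(thm, X).fit()
print(round(fit.rsquared, 3), fit.params)
```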
Paffer, Adriana Toledo de; Ferreira, Haroldo da Silva; Cabral Júnior, Cyro Rego; Miranda, Claudio Torres de
2012-01-01
Compromised maternal mental health (MMH) is considered to be a risk factor for child malnutrition in low income areas. Psychosocial variables associated with MMH are potentially different between urban and rural environments. The aim here was to investigate whether associations existed between MMH and selected sociodemographic risk factors, and whether these associations were specific to urban or rural settings. Cross-sectional study on a representative population sample of mothers from the semiarid region of Alagoas. Multistage sampling was used. The subjects were mothers of children aged up to 60 months. MMH was evaluated through the Self-Reporting Questionnaire-20. Mothers' nutritional status was assessed using the body mass index and waist circumference. Univariate analysis used odds ratios (OR) and chi-square. Logistic regression was performed separately for urban and rural subsamples using MMH as the dependent variable. The sample comprised 288 mothers. The prevalences of common mental disorders (CMD) in rural and urban areas were 56.2% and 43.8%, respectively (OR = 1.03; 95% CI: 0.64-1.63). In univariate analysis and logistic regression, the variable of education remained associated with MMH (OR = 2.2; 95% CI: 1.03-4.6) in urban areas. In rural areas, the variable of lack of partner remained associated (OR = 2.6; 95% CI: 1.01-6.7). The prevalence of CMD is high among mothers of children aged up to two years in the semiarid region of Alagoas. This seems to be associated with lower educational level in urban settings and lack of partner in rural settings.
Wood, Jonathan S; Donnell, Eric T; Porter, Richard J
2015-02-01
A variety of different study designs and analysis methods have been used to evaluate the performance of traffic safety countermeasures. The most common study designs and methods include observational before-after studies using the empirical Bayes method and cross-sectional studies using regression models. The propensity scores-potential outcomes framework has recently been proposed as an alternative traffic safety countermeasure evaluation method to address the challenges associated with selection biases that can be part of cross-sectional studies. Crash modification factors derived from the application of all three methods have not yet been compared. This paper compares the results of retrospective, observational evaluations of a traffic safety countermeasure using both before-after and cross-sectional study designs. The paper describes the strengths and limitations of each method, focusing primarily on how each addresses site selection bias, which is a common issue in observational safety studies. The Safety Edge paving technique, which seeks to mitigate crashes related to roadway departure events, is the countermeasure used in the present study to compare the alternative evaluation methods. The results indicated that all three methods yielded results that were consistent with each other and with previous research. The empirical Bayes results had the smallest standard errors. It is concluded that the propensity scores with potential outcomes framework is a viable alternative analysis method to the empirical Bayes before-after study. It should be considered whenever a before-after study is not possible or practical. Copyright © 2014 Elsevier Ltd. All rights reserved.
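To make the propensity-scores-with-potential-outcomes idea concrete, here is a toy sketch: the probability of a site receiving the countermeasure is modeled from site characteristics, and inverse-probability-of-treatment weights are then used to compare crash outcomes. The covariates, selection mechanism, and outcome model are invented; this is not the Safety Edge data or the paper's exact estimator.

```python
# Hedged sketch of propensity-score (inverse probability of treatment) weighting for a
# countermeasure evaluation with selection bias. All data and models are simulated and
# simplified; this is not the Safety Edge evaluation or the paper's exact estimator.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
n = 500
aadt = rng.lognormal(8, 0.5, n)                            # simulated traffic volume
lane_width = rng.uniform(3.0, 3.7, n)
p_treat = 1 / (1 + np.exp(-(np.log(aadt) - 8)))            # treatment depends on volume
treated = rng.binomial(1, p_treat, n)                      # -> selection bias
crashes = rng.poisson(np.exp(1.0 + 0.02 * np.sqrt(aadt) - 0.3 * treated))

X = np.column_stack([np.log(aadt), lane_width])
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]   # propensity scores
w = np.where(treated == 1, 1 / ps, 1 / (1 - ps))                   # IPT weights
effect = (np.average(crashes[treated == 1], weights=w[treated == 1])
          - np.average(crashes[treated == 0], weights=w[treated == 0]))
print("weighted crash difference (treated - untreated):", round(effect, 2))
```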
Modeling the compliance of polyurethane nanofiber tubes for artificial common bile duct
NASA Astrophysics Data System (ADS)
Moazeni, Najmeh; Vadood, Morteza; Semnani, Dariush; Hasani, Hossein
2018-02-01
The common bile duct is one of the body's most sensitive organs, and a polyurethane nanofiber tube can be used as a prosthesis for the common bile duct. Compliance is one of the most important properties of such a prosthesis, which should remain adequately compliant for as long as possible to preserve its behavioral integrity. In the present paper, the prosthetic compliance was measured and modeled using a regression method and an artificial neural network (ANN) based on the electrospinning process parameters such as polymer concentration, voltage, tip-to-collector distance and flow rate. Because the ANN model contains several parameters that directly affect prediction accuracy, a genetic algorithm (GA) was used to optimize the ANN parameters. Finally, it was observed that the GA-optimized ANN model can predict the compliance with high accuracy (mean absolute percentage error = 8.57%). Moreover, the contribution of the variables to the compliance was investigated through relative importance analysis, and the optimum values of the parameters for ideal compliance were determined.
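A minimal sketch of the ANN part of this modelling, predicting compliance from the four electrospinning parameters and scoring it by mean absolute percentage error, is given below; the GA hyperparameter optimization is omitted, and all data, ranges, and the network size are assumptions rather than the paper's measurements or final model.

```python
# Hedged sketch of an ANN predicting compliance from electrospinning parameters, scored
# by mean absolute percentage error. The GA hyperparameter search is omitted; data,
# parameter ranges, and network size are assumptions, not the paper's values.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(10)
n = 120
concentration = rng.uniform(8, 14, n)                      # polymer concentration (%)
voltage = rng.uniform(10, 25, n)                           # applied voltage (kV)
distance = rng.uniform(10, 20, n)                          # tip-to-collector distance (cm)
flow = rng.uniform(0.5, 2.0, n)                            # flow rate (mL/h)
compliance = (2 + 0.3 * concentration - 0.05 * voltage + 0.1 * distance + 0.4 * flow
              + rng.normal(0, 0.2, n))                     # simulated compliance response

X = np.column_stack([concentration, voltage, distance, flow])
X_tr, X_te, y_tr, y_te = train_test_split(X, compliance, random_state=0)
ann = MLPRegressor(hidden_layer_sizes=(8,), solver="lbfgs", max_iter=5000,
                   random_state=0).fit(X_tr, y_tr)
mape = 100 * np.mean(np.abs((ann.predict(X_te) - y_te) / y_te))
print(f"MAPE ~ {mape:.1f}%")
```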
NASA Technical Reports Server (NTRS)
Parsons, Vickie s.
2009-01-01
The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center (NESC) on September 20, 2005. The NESC team performed an independent review of regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.
Sui, Meili; Huang, Xueyong; Li, Yi; Ma, Xiaomei; Zhang, Chao; Li, Xingle; Chen, Zhijuan; Feng, Huifen; Ren, Jingchao; Wang, Fang; Xu, Bianli; Duan, Guangcai
2016-01-01
In recent years, the prevalence of hand-foot-mouth disease (HFMD) in China and some other countries has caused worldwide concern. Mild cases tend to recover within a week, while severe cases may progress rapidly and tend to have poor outcomes. There is no vaccine for HFMD, and anti-inflammatory treatment is not ideal. In this study, we aimed to establish a valid forecasting model for severe HFMD using common laboratory parameters. Retrospectively, 77 severe HFMD cases from Zhengzhou Children's Hospital during the peak periods between 2013 and 2015 were collected, together with 77 mild HFMD cases from the same area. The study recorded common laboratory parameters to assist in establishing the severe HFMD model. After screening the important variables using the Mann-Whitney U test, the study fitted logistic regression (LR), discriminant analysis (DA), and decision tree (DT) models for comparison. Compared with the mild group, serum levels of WBC, PLT, PCT, MCV, MCH, LCR, SCR, LCC, GLO, CK-MB, K, S100, and B in the severe group were higher (p < 0.05), while MCR, EOR, BASOR, SCC, MCC, EO, BASO, NA, CL, T, Th, and Th/Ts were lower (p < 0.05). Five indicators (MCR, LCC, Th, CK-MB, and CL) were selected by both LR and DA, and five variables (EO, LCC, CL, GLO, and MCC) were selected by DT. The area under the curve (AUC) of LR, DA, and DT was 0.805, 0.779 and 0.864, respectively. Common laboratory indexes thus effectively distinguished mild from severe HFMD cases using LR, DA, and DT, with DT showing the best classification performance (AUC = 0.864).
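For readers unfamiliar with how such a three-way comparison is set up, the following sketch fits logistic regression, linear discriminant analysis, and a decision tree to synthetic data and compares their AUCs; the data and hyperparameters are illustrative assumptions, not the study's laboratory parameters.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the laboratory parameters (154 cases, as in the abstract)
X, y = make_classification(n_samples=154, n_features=10, n_informative=5, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=1)

for name, clf in [("LR", LogisticRegression(max_iter=1000)),
                  ("DA", LinearDiscriminantAnalysis()),
                  ("DT", DecisionTreeClassifier(max_depth=3, random_state=1))]:
    clf.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])  # area under the ROC curve
    print(f"{name}: AUC = {auc:.3f}")
```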
Shah, Mansi; Tilton, Jessica; Kim, Shiyun
2016-04-01
In 2001, the University of Illinois Hospital and Health Sciences System (UI Health) established a pharmacist-run, referral-based medication therapy management clinic (MTMC). Referrals are obtained from any UI Health provider or by self-referral. Although there is a high volume of referrals, a large percentage of patients do not enroll. This study was designed to determine the various factors that influence patient enrollment in the MTMC. This study was a retrospective chart review of demographic and patient variable data during 2010 and 2011. Disabilities, distance from the MTMC, mode of transportation, past medical history, and appointment dates were extracted from the medical records. Results were analyzed using descriptive statistics and logistic regression analysis. A total of 103 referrals were made; however, only 17% of patients remained enrolled in the MTMC. The baseline demographics included a mean age of 63 years, 68% female, 70% African American, and 81% English speaking. Patients lived an average of 8 miles from the MTMC; most utilized public or government-supplemented transport services; 24% of patients reported some type of disability, most commonly requiring a walker or a wheelchair. On average, patients were prescribed 13 medications, with hypertension (70%), diabetes (56%), and hyperlipidemia (48%) being the most common chronic disease states. Reasons for referral included medication management, education, medication reconciliation, and disease state management. Five patients were unable to be contacted to schedule an initial appointment. Additionally, 18 patients missed their scheduled initial appointment and did not reschedule. Logistic regression analysis demonstrated that distance traveled to the clinic, age, and history of hypertension affected the probability of patients showing for their appointments (chi-square = 19.7, P < .001). This study demonstrated that distance from the MTMC is the most common barrier to patient enrollment; therefore, strategies to improve patient access are necessary. © The Author(s) 2014.
Nettleton, Jennifer A; Steffen, Lyn M; Schulze, Matthias B; Jenny, Nancy S; Barr, R Graham; Bertoni, Alain G; Jacobs, David R
2010-01-01
Background The association between diet and cardiovascular disease (CVD) may be mediated partly through inflammatory processes and reflected by markers of subclinical atherosclerosis. Objective We investigated whether empirically derived dietary patterns are associated with coronary artery calcium (CAC) and common and internal carotid artery intima media thickness (IMT) and whether prior information about inflammatory processes would increase the strength of the associations. Design At baseline, dietary patterns were derived with the use of a food-frequency questionnaire, and inflammatory biomarkers, CAC, and IMT were measured in 5089 participants aged 45–84 y, who had no clinical CVD or diabetes, in the Multi-Ethnic Study of Atherosclerosis. Dietary patterns based on variations in C-reactive protein, interleukin-6, homocysteine, and fibrinogen concentrations were created with reduced rank regression (RRR). Dietary patterns based on variations in food group intake were created with principal components analysis (PCA). Results The primary RRR (RRR 1) and PCA (PCA factor 1) dietary patterns were high in total and saturated fat and low in fiber and micronutrients. However, the food sources of these nutrients differed between the dietary patterns. RRR 1 was positively associated with CAC [Agatston score >0: OR (95% CI) for quartile 5 compared with quartile 1 = 1.34 (1.05, 1.71); ln(Agatston score + 1): P for trend = 0.023] and with common carotid IMT [≥1.0 mm: OR (95% CI) for quartile 5 compared with quartile 1 = 1.33 (0.99, 1.79); ln(common carotid IMT): P for trend = 0.006]. PCA 1 was not associated with CAC or IMT. Conclusion The results suggest that subtle differences in dietary pattern composition, realized by incorporating measures of inflammatory processes, affect associations with markers of subclinical atherosclerosis. PMID:17556701
Liu, Sifei; Zhang, Guangrui; Qiu, Ying; Wang, Xiaobo; Guo, Lihan; Zhao, Yanxin; Tong, Meng; Wei, Lan; Sun, Lixin
2016-12-01
In this study, we aimed to establish a comprehensive and practical quality evaluation system for Shenmaidihuang pills. A simple and reliable high-performance liquid chromatography method coupled with photodiode array detection was developed for both fingerprint analysis and quantitative determination. In fingerprint analysis, relative retention time and relative peak area were used to identify the common peaks in 18 samples for investigation. Twenty-one peaks were selected as the common peaks to evaluate the similarities of 18 Shenmaidihuang pills samples with different manufacture dates. Furthermore, similarity analysis was applied to evaluate the similarity of samples. Hierarchical cluster analysis and principal component analysis were also performed to evaluate the variation of Shenmaidihuang pills. In quantitative analysis, linear regressions, injection precisions, recovery, repeatability and sample stability were all tested and good results were obtained for the simultaneous determination of the seven identified compounds, namely 5-hydroxymethylfurfural, morroniside, loganin, paeonol, paeoniflorin, psoralen and isopsoralen, in Shenmaidihuang pills. The contents of some analytes differed significantly between batches, especially 5-hydroxymethylfurfural. It was therefore concluded that the chromatographic fingerprint method obtained by high-performance liquid chromatography coupled with photodiode array detection, combined with multi-compound determination, is a powerful and meaningful tool for comprehensive quality control of Shenmaidihuang pills. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Ridge Regression for Interactive Models.
ERIC Educational Resources Information Center
Tate, Richard L.
1988-01-01
An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are…
Zhang, Nan; Yu, Cao; Wen, Denggui; Chen, Jun; Ling, Yiwei; Terajima, Kenshi; Akazawa, Kohei; Shan, Baoen; Wang, Shijie
2012-01-01
The incidence of esophageal squamous cell carcinoma (ESCC), which is the eighth most common malignancy worldwide, is highest in China. The purpose of this study was to investigate the association between nitrogen compounds in drinking water with the incidence of ESCC by geographical spatial analysis. The incidence of ESCC is high in Shexian county, China, and environmental factors, particularly nitrogen-contaminated drinking water, are the main suspected risk factors. This study focuses on three nitrogen compounds in drinking water, namely, nitrates, nitrites, and ammonia, all of which are derived mainly from domestic garbage and agricultural fertilizer. The study surveyed 48 villages in the Shexian area with a total population of 54,716 (661 adults with ESCC and 54,055 non-cancer subjects). Hot-spot analysis was used to identify spatial clusters with a high incidence of ESCC and a high concentration of nitrogen compounds. Logistic regression analysis was used to detect risk factors for ESCC incidence. Most areas with high concentrations of nitrate nitrogen in drinking water had a high incidence of ESCC. Correlation analysis revealed a significant positive relationship between nitrate concentration and ESCC (P = 0.01). Logistic regression analysis also confirmed that nitrate nitrogen has a significantly higher odds ratio. The results indicate that nitrate nitrogen is associated with ESCC incidence in Shexian county. In conclusion, high concentrations of nitrate nitrogen in drinking water may be a significant risk factor for the incidence of ESCC.
Dawson, Alistair
Photoperiodic control of reproduction in birds is based on two processes, a positive effect leading to gonadal maturation and an inhibitory effect subsequently inducing regression. Nonphotoperiodic cues can modulate photoperiodic control, particularly the inhibitory process. In previous studies of common starlings (Sturnus vulgaris), (1) restriction of food availability to 8 h after dawn had little effect on testicular maturation but dramatically delayed subsequent regression and (2) lower ambient temperature also had little effect during maturation but delayed regression. Could the effects of food restriction and temperature share a common underlying mechanism? Four groups of starlings were kept on a simulated natural cycle in photoperiod in a 2 × 2 factorial experimental design. Two groups were held under an ambient temperature of 16°C, and the other two were held under 6°C. One of each of these groups had food provided ad lib., and in the other two groups access to food was denied 7 h after dawn. In both the ad lib. food groups and the food-restricted groups, lower temperature had little effect on testicular maturation but delayed subsequent regression and molt. In both the 16°C groups and the 6°C groups, food restriction had no effect on testicular maturation but delayed regression and molt. The daily cycle in body temperature was recorded in all groups when the photoperiod had reached 12L∶12D, the photoperiod at which regression is initiated. In both 6°C groups, nighttime body temperature was lower than in the 16°C groups, a characteristic of shorter photoperiods. In the two ad lib. food groups high daytime temperature was maintained until dusk, whereas in the two food-restricted groups body temperature began to decrease after food withdrawal. Thus, both lower temperature and food restriction delayed regression, as if the photoperiod was shorter than it actually was, and both resulted in daily cycles in body temperature that reflected cycles under shorter photoperiods. This implies that the daily cycle in body temperature is possibly a common pathway through which nonphotoperiodic cues may operate.
Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)
1987-10-01
[Only table-of-contents fragments of this report were captured in extraction: sections on data analysis techniques (multiple linear regression), an example of regression analysis, regression results for all tasks, task grouping for analysis, and individual maintenance tasks such as repairing a FADAC printed circuit board and removing/replacing the H60A3 power pack.]
Faita, Francesco; Gemignani, Vincenzo; Bianchini, Elisabetta; Giannarelli, Chiara; Demi, Marcello
2006-01-01
The evaluation of the intima media thickness (IMT) of the common carotid artery (CCA) with B-mode ultrasonography represents an important index of cardiovascular risk. The IMT is defined as the distance between the leading edge of the lumen-intima interface and the leading edge of the media-adventitia interface. In order to evaluate the IMT, it is necessary to locate such edges. In this paper we developed an automatic real-time system to evaluate the IMT based on the first order absolute moment (FOAM), which is used as an edge detector, and on a pattern recognition approach. The IMT measurements were compared with manual measurements. We used regression analysis and Bland-Altman analysis to compare the results.
2013-01-01
[Abstract fragmentary after extraction; the recoverable content concerns application of the Hammett equation and its constants in the chemistry of organophosphorus (OP) compounds (Russ. Chem. Rev. 38 (1969) 795–811), the ability of oximes to reactivate OP-inhibited AChE, and multiple linear regression equations analyzed over oxime/phosphonate pairs, 21 oxime/phosphoramidate pairs and 12 oxime/phosphate pairs.]
Zdroik, Jennifer; Veliz, Philip
2016-12-01
School districts in the United States are turning toward new sources of revenue to maintain their interscholastic sports programs. One common revenue-generating policy is the implementation of participation fees, also known as pay-to-play. One concern about this growing trend is how it affects student participation opportunities. This study looks at how pay-to-play fees affect participation opportunities and participation rates in the state of Michigan. By merging three school-level data sets (the Civil Rights Data Collection, the Common Core of Data, and participation information from the MHSAA, the Michigan High School Athletic Association), we used bivariate analysis and ordinary least squares regression in our analysis. Our findings indicate that certain types of schools are able to support pay-to-play fees: relatively large schools located in suburban, white communities with relatively low poverty rates. We also found that participation fees are not decreasing the number of sport opportunities for students; participation opportunities are higher in schools with fees, but participation rates are similar between schools with and without fees. Participation fee policy implications are discussed, and we offer suggestions for future research.
Yılmaz Isıkhan, Selen; Karabulut, Erdem; Alpar, Celal Reha
2016-01-01
Background/Aim. Evaluating the success of dose prediction based on genetic or clinical data has substantially advanced recently. The aim of this study is to predict various clinical dose values from DNA gene expression datasets using data mining techniques. Materials and Methods. Eleven real gene expression datasets containing dose values were included. First, important genes for dose prediction were selected using iterative sure independence screening. Then, the performances of regression trees (RTs), support vector regression (SVR), RT bagging, SVR bagging, and RT boosting were examined. Results. The results demonstrated that a regression-based feature selection method substantially reduced the number of irrelevant genes from raw datasets. Overall, the best prediction performance in nine of 11 datasets was achieved using SVR; the second most accurate performance was provided by a gradient-boosting machine (GBM). Conclusion. Analysis of various dose values based on microarray gene expression data identified common genes found in our study and the referenced studies. According to our findings, SVR and GBM can be good predictors of dose-gene datasets. Another result of the study was the identification of a sample size of n = 25 as a cutoff point for RT bagging to outperform a single RT.
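A minimal sketch of the kind of comparison reported above, using scikit-learn regressors on synthetic high-dimensional data as a stand-in for gene expression; the model settings and data are assumptions, and the iterative sure independence screening step is omitted.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a gene expression matrix with a continuous dose outcome
X, y = make_regression(n_samples=50, n_features=200, n_informative=10, noise=5.0, random_state=0)

models = {
    "SVR": SVR(kernel="rbf", C=10.0),
    "GBM": GradientBoostingRegressor(random_state=0),
    "RT bagging": BaggingRegressor(DecisionTreeRegressor(), n_estimators=50, random_state=0),
}
for name, m in models.items():
    scores = cross_val_score(m, X, y, cv=5, scoring="r2")  # 5-fold cross-validated R^2
    print(f"{name}: mean CV R^2 = {scores.mean():.2f}")
```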
Are your covariates under control? How normalization can re-introduce covariate effects.
Pain, Oliver; Dudbridge, Frank; Ronald, Angelica
2018-04-30
Many statistical tests rely on the assumption that the residuals of a model are normally distributed. Rank-based inverse normal transformation (INT) of the dependent variable is one of the most popular approaches to satisfy the normality assumption. When covariates are included in the analysis, a common approach is to first adjust for the covariates and then normalize the residuals. This study investigated the effect of regressing covariates against the dependent variable and then applying rank-based INT to the residuals. The correlation between the dependent variable and covariates at each stage of processing was assessed. An alternative approach was tested in which rank-based INT was applied to the dependent variable before regressing covariates. Analyses based on both simulated and real data examples demonstrated that applying rank-based INT to the dependent variable residuals after regressing out covariates re-introduces a linear correlation between the dependent variable and covariates, increasing type-I errors and reducing power. On the other hand, when rank-based INT was applied prior to controlling for covariate effects, residuals were normally distributed and linearly uncorrelated with covariates. This latter approach is therefore recommended in situations where normality of the dependent variable is required.
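The ordering issue can be illustrated with a short simulation: apply a rank-based INT either to the residuals after covariate adjustment or to the dependent variable before adjustment, and compare the resulting correlations with the covariate. The data-generating model below is an assumption chosen so that the outcome is non-normally and non-linearly related to the covariate.

```python
import numpy as np
from scipy.stats import norm, pearsonr, rankdata

def rank_int(x):
    # Rank-based inverse normal transformation (Blom-type offset)
    ranks = rankdata(x)
    return norm.ppf((ranks - 0.5) / len(x))

rng = np.random.default_rng(0)
latent = rng.normal(size=5000)
covariate = latent + rng.normal(size=5000)      # covariate correlated with the latent trait
y = np.exp(latent)                              # skewed outcome, non-linearly related to the covariate

# Ordering criticised in the paper: regress out the covariate, then INT the residuals
slope, intercept = np.polyfit(covariate, y, 1)
int_after = rank_int(y - (intercept + slope * covariate))

# Recommended ordering: INT the outcome first, then regress out the covariate
int_y = rank_int(y)
s2, i2 = np.polyfit(covariate, int_y, 1)
resid_recommended = int_y - (i2 + s2 * covariate)

# The paper reports that the first ordering can re-introduce a linear correlation with the
# covariate, while the second leaves residuals essentially uncorrelated with it.
print("INT applied after adjustment:", round(pearsonr(covariate, int_after)[0], 3))
print("INT applied before adjustment:", round(pearsonr(covariate, resid_recommended)[0], 3))
```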
NASA Technical Reports Server (NTRS)
Rummler, D. R.
1976-01-01
The results of investigations into applying regression techniques to the development of methodology for creep-rupture data analysis are presented. Regression analysis techniques are applied to the explicit description of the creep behavior of materials for Space Shuttle thermal protection systems. A regression analysis technique is compared with five parametric methods for analyzing three simulated and twenty real data sets, and a computer program for the evaluation of creep-rupture data is presented.
Resting-state functional magnetic resonance imaging: the impact of regression analysis.
Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi
2015-01-01
To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
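As a rough sketch of what "regressing the global mean" means in this preprocessing context, the code below removes the global mean time course from synthetic voxel time series by least squares and compares a pairwise correlation before and after; the data and dimensions are invented for illustration, not an rsfMRI pipeline.

```python
import numpy as np

# Synthetic "voxel" time series sharing one global fluctuation (dimensions are invented)
rng = np.random.default_rng(0)
n_time, n_voxels = 200, 500
data = rng.normal(size=(n_time, n_voxels)) + rng.normal(size=(n_time, 1))

# Regress the global mean time course (plus an intercept) out of every voxel
global_signal = data.mean(axis=1)
design = np.column_stack([np.ones(n_time), global_signal])
beta, *_ = np.linalg.lstsq(design, data, rcond=None)
residuals = data - design @ beta

# Pairwise correlation before and after global signal regression; removing a shared
# component lowers correlations and can push some of them negative (anticorrelation)
before = np.corrcoef(data[:, 0], data[:, 1])[0, 1]
after = np.corrcoef(residuals[:, 0], residuals[:, 1])[0, 1]
print(f"connectivity before: {before:.2f}, after global regression: {after:.2f}")
```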
Kesselmeier, Miriam; Lorenzo Bermejo, Justo
2017-11-01
Logistic regression is the most common technique used for genetic case-control association studies. A disadvantage of standard maximum likelihood estimators of the genotype relative risk (GRR) is their strong dependence on outlier subjects, for example, patients diagnosed at unusually young age. Robust methods are available to constrain outlier influence, but they are scarcely used in genetic studies. This article provides a non-intimidating introduction to robust logistic regression, and investigates its benefits and limitations in genetic association studies. We applied the bounded Huber and extended the R package 'robustbase' with the re-descending Hampel functions to down-weight outlier influence. Computer simulations were carried out to assess the type I error rate, mean squared error (MSE) and statistical power according to major characteristics of the genetic study and investigated markers. Simulations were complemented with the analysis of real data. Both standard and robust estimation controlled type I error rates. Standard logistic regression showed the highest power but standard GRR estimates also showed the largest bias and MSE, in particular for associated rare and recessive variants. For illustration, a recessive variant with a true GRR=6.32 and a minor allele frequency=0.05 investigated in a 1000 case/1000 control study by standard logistic regression resulted in power=0.60 and MSE=16.5. The corresponding figures for Huber-based estimation were power=0.51 and MSE=0.53. Overall, Hampel- and Huber-based GRR estimates did not differ much. Robust logistic regression may represent a valuable alternative to standard maximum likelihood estimation when the focus lies on risk prediction rather than identification of susceptibility variants. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Wilson, Barry T.; Knight, Joseph F.; McRoberts, Ronald E.
2018-03-01
Imagery from the Landsat Program has been used frequently as a source of auxiliary data for modeling land cover, as well as a variety of attributes associated with tree cover. With ready access to all scenes in the archive since 2008 due to the USGS Landsat Data Policy, new approaches to deriving such auxiliary data from dense Landsat time series are required. Several methods have previously been developed for use with finer temporal resolution imagery (e.g. AVHRR and MODIS), including image compositing and harmonic regression using Fourier series. The manuscript presents a study using Minnesota, USA, during the years 2009-2013 as the study area and timeframe. The study examined the relative predictive power of land cover models, in particular those related to tree cover, using predictor variables based solely on composite imagery versus those using estimated harmonic regression coefficients. The study used two common non-parametric modeling approaches (i.e. k-nearest neighbors and random forests) for fitting classification and regression models of multiple attributes measured on USFS Forest Inventory and Analysis plots using all available Landsat imagery for the study area and timeframe. The estimated Fourier coefficients developed by harmonic regression of tasseled cap transformation time series data were shown to be correlated with land cover, including tree cover. Regression models using estimated Fourier coefficients as predictor variables showed a two- to threefold increase in explained variance for a small set of continuous response variables, relative to comparable models using monthly image composites. Similarly, the overall accuracies of classification models using the estimated Fourier coefficients were approximately 10-20 percentage points higher than the models using the image composites, with corresponding individual class accuracies between six and 45 percentage points higher.
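A minimal sketch of harmonic regression with a Fourier-series design matrix, of the kind used to derive per-pixel coefficients from a dense time series; the dates, period, and number of harmonics below are illustrative assumptions rather than the study's actual processing chain.

```python
import numpy as np

# Synthetic, irregularly sampled "pixel" time series over five years
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 5 * 365.25, 120))          # observation days
period = 365.25
signal = 0.3 + 0.2 * np.sin(2 * np.pi * t / period + 1.0) + rng.normal(0, 0.02, t.size)

# Design matrix: intercept plus first and second annual harmonics
X = np.column_stack([
    np.ones_like(t),
    np.cos(2 * np.pi * t / period), np.sin(2 * np.pi * t / period),
    np.cos(4 * np.pi * t / period), np.sin(4 * np.pi * t / period),
])
coeffs, *_ = np.linalg.lstsq(X, signal, rcond=None)
# These estimated Fourier coefficients would then serve as predictor variables
# in k-nearest neighbors or random forest models of land cover attributes.
print(coeffs)
```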
Reps, Jenna M; Aickelin, Uwe; Hubbard, Richard B
2016-02-01
To develop a framework for identifying and incorporating candidate confounding interaction terms into a regularised Cox regression analysis to refine adverse drug reaction signals obtained via longitudinal observational data. We considered six drug families that are commonly associated with myocardial infarction in observational healthcare data, but where the causal relationship ground truth is known (adverse drug reaction or not). We applied emergent pattern mining to find itemsets of drugs and medical events that are associated with the development of myocardial infarction. These are the candidate confounding interaction terms. We then implemented a cohort study design using regularised Cox regression that incorporated and accounted for the candidate confounding interaction terms. The methodology was able to account for signals generated due to confounding, with the elastic-net-regularised Cox regression correctly ranking the drug families known to be true adverse drug reactions above those that are not. This was not the case without the inclusion of the candidate confounding interaction terms, where confounding led to a non-adverse drug reaction being ranked highest. The methodology is efficient, can identify high-order confounding interactions and does not require expert input to specify outcome-specific confounders, so it can be applied to any outcome of interest to quickly refine its signals. The proposed method shows excellent potential to overcome some forms of confounding and therefore reduce the false positive rate for signal analysis using longitudinal data. Copyright © 2015 Elsevier Ltd. All rights reserved.
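The sketch below shows one way to fit a Cox model with elastic-net regularisation in Python using the lifelines package on its bundled Rossi recidivism data; this is only a stand-in for the paper's design (which also mines candidate confounding interaction terms), and the penalty settings are arbitrary assumptions.

```python
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

# Illustrative data bundled with lifelines, not the drug-safety cohort described above
df = load_rossi()

# Elastic-net penalised Cox regression: penalizer sets the overall strength,
# l1_ratio the mix between L1 (sparsity-inducing) and L2 penalties
cph = CoxPHFitter(penalizer=0.1, l1_ratio=0.5)
cph.fit(df, duration_col="week", event_col="arrest")
print(cph.summary[["coef", "exp(coef)"]])   # penalised coefficients and hazard ratios
```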
Determining association constants from titration experiments in supramolecular chemistry.
Thordarson, Pall
2011-03-01
The most common approach for quantifying interactions in supramolecular chemistry is a titration of the guest into a solution of the host, noting the changes in some physical property through NMR, UV-Vis, fluorescence or other techniques. Despite the apparent simplicity of this approach, there are several issues that need to be carefully addressed to ensure that the final results are reliable. These include the use of non-linear rather than linear regression methods, careful choice of stoichiometric binding model, the choice of method (e.g., NMR vs. UV-Vis) and concentration of host, the application of advanced data analysis methods such as global analysis, and finally the estimation of uncertainties and confidence intervals for the results obtained. This tutorial review gives a systematic overview of all these issues, highlighting some of the key messages with simulated data analysis examples.
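To make the non-linear regression point concrete, the sketch below fits the exact 1:1 host-guest binding isotherm to simulated titration data with scipy's curve_fit; the concentrations, the true Ka, and the noise level are assumptions for illustration, not values from the review.

```python
import numpy as np
from scipy.optimize import curve_fit

H0 = 1.0e-3  # fixed total host concentration (M); illustrative value

def delta_obs(G0, Ka, d_max):
    # Exact 1:1 binding isotherm: observed shift change as a function of total guest G0
    b = G0 + H0 + 1.0 / Ka
    HG = (b - np.sqrt(b**2 - 4.0 * G0 * H0)) / 2.0
    return d_max * HG / H0

# Simulated titration with an assumed true Ka of 500 M^-1 and small measurement noise
G0 = np.linspace(0.0, 1.0e-2, 15)
rng = np.random.default_rng(1)
observed = delta_obs(G0, 500.0, 1.2) + rng.normal(0.0, 0.01, G0.size)

# Non-linear least squares fit, with rough standard errors from the covariance matrix
popt, pcov = curve_fit(delta_obs, G0, observed, p0=[100.0, 1.0], bounds=([1.0, 0.0], [1e6, 10.0]))
perr = np.sqrt(np.diag(pcov))
print(f"Ka = {popt[0]:.0f} +/- {perr[0]:.0f} M^-1, d_max = {popt[1]:.2f} +/- {perr[1]:.2f}")
```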
Differential Language Influence on Math Achievement
ERIC Educational Resources Information Center
Chen, Fang
2010-01-01
New models are commonly designed to address certain limitations of existing ones. Quantile regression is introduced in this paper because it can provide information that a regular mean regression misses. This research aims to demonstrate its utility in the educational research and measurement field for questions that may not be detected otherwise.…
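A hedged example of the contrast drawn above: on heteroscedastic synthetic data, an ordinary least squares slope describes only the conditional mean, while quantile regression slopes at the 10th and 90th percentiles show how the relationship differs across the outcome distribution. The data are invented for demonstration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic achievement-style data whose spread grows with the predictor
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 2 + 0.5 * x + rng.normal(0, 0.3 + 0.2 * x, 500)
df = pd.DataFrame({"x": x, "y": y})

ols = smf.ols("y ~ x", df).fit()                 # mean regression
q10 = smf.quantreg("y ~ x", df).fit(q=0.10)      # 10th percentile
q90 = smf.quantreg("y ~ x", df).fit(q=0.90)      # 90th percentile
print(ols.params["x"], q10.params["x"], q90.params["x"])  # the slopes differ across quantiles
```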
USDA-ARS?s Scientific Manuscript database
Parametric non-linear regression (PNR) techniques are commonly used to develop weed seedling emergence models. Such techniques, however, require statistical assumptions that are difficult to meet. To examine and overcome these limitations, we compared PNR with a nonparametric estimation technique. F...
Schultz, K K; Bennett, T B; Nordlund, K V; Döpfer, D; Cook, N B
2016-09-01
Transition cow management has been tracked via the Transition Cow Index (TCI; AgSource Cooperative Services, Verona, WI) since 2006. Transition Cow Index was developed to measure the difference between actual and predicted milk yield at first test day to evaluate the relative success of the transition period program. This project aimed to assess TCI in relation to all commonly used Dairy Herd Improvement (DHI) metrics available through AgSource Cooperative Services. Regression analysis was used to isolate variables that were relevant to TCI, and then principal components analysis and network analysis were used to determine the relative strength and relatedness among variables. Finally, cluster analysis was used to segregate herds based on similarity of relevant variables. The DHI data were obtained from 2,131 Wisconsin dairy herds with test-day mean ≥30 cows, which were tested ≥10 times throughout the 2014 calendar year. The original list of 940 DHI variables was reduced through expert-driven selection and regression analysis to 23 variables. The K-means cluster analysis produced 5 distinct clusters. Descriptive statistics were calculated for the 23 variables per cluster grouping. Using principal components analysis, cluster analysis, and network analysis, 4 parameters were isolated as most relevant to TCI; these were energy-corrected milk, 3 measures of intramammary infection (dry cow cure rate, linear somatic cell count score in primiparous cows, and new infection rate), peak ratio, and days in milk at peak milk production. These variables together with cow and newborn calf survival measures form a group of metrics that can be used to assist in the evaluation of overall transition period performance. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
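A present-day equivalent of much of that regression output (coefficients, t-statistics and their probability levels, an analysis-of-variance style fit summary, and residuals for plotting against the variables) can be produced with statsmodels, as in the hedged sketch below on synthetic data; it is not a reimplementation of NEWRAP.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic independent and dependent variables (illustrative only)
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(40, 3)), columns=["x1", "x2", "x3"])
y = 1.0 + 2.0 * X["x1"] - 0.5 * X["x2"] + rng.normal(0, 0.3, 40)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.summary())      # coefficients, t-statistics, p-values, R-squared, F-statistic
residuals = model.resid     # available for residual plots against predictors and fitted values
```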
Belilovsky, Eugene; Gkirtzou, Katerina; Misyrlis, Michail; Konova, Anna B; Honorio, Jean; Alia-Klein, Nelly; Goldstein, Rita Z; Samaras, Dimitris; Blaschko, Matthew B
2015-12-01
We explore various sparse regularization techniques for analyzing fMRI data, such as the ℓ1 norm (often called LASSO in the context of a squared loss function), elastic net, and the recently introduced k-support norm. Employing sparsity regularization allows us to handle the curse of dimensionality, a problem commonly found in fMRI analysis. In this work we consider sparse regularization in both the regression and classification settings. We perform experiments on fMRI scans from cocaine-addicted as well as healthy control subjects. We show that in many cases, use of the k-support norm leads to better predictive performance, solution stability, and interpretability as compared to other standard approaches. We additionally analyze the advantages of using the absolute loss function versus the standard squared loss which leads to significantly better predictive performance for the regularization methods tested in almost all cases. Our results support the use of the k-support norm for fMRI analysis and on the clinical side, the generalizability of the I-RISA model of cocaine addiction. Copyright © 2015 Elsevier Ltd. All rights reserved.
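The k-support norm is not available in mainstream Python libraries, but the more standard sparse penalties mentioned above can be sketched with scikit-learn; the data below are synthetic stand-ins for high-dimensional voxel-wise fMRI features, and the penalty strengths are arbitrary assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.model_selection import cross_val_score

# Many more features than samples, as in voxel-wise fMRI regression
X, y = make_regression(n_samples=60, n_features=2000, n_informative=20, noise=10.0, random_state=0)

for name, model in [("lasso (l1)", Lasso(alpha=1.0)),
                    ("elastic net", ElasticNet(alpha=1.0, l1_ratio=0.5))]:
    r2 = cross_val_score(model, X, y, cv=5).mean()          # cross-validated R^2
    n_nonzero = (model.fit(X, y).coef_ != 0).sum()          # sparsity of the fitted solution
    print(f"{name}: CV R^2 = {r2:.2f}, non-zero coefficients = {n_nonzero}")
```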
Hotta, Takayuki; Nishiguchi, Shu; Fukutani, Naoto; Tashiro, Yuto; Adachi, Daiki; Morino, Saori; Aoyama, Tomoki
2016-09-01
Plantar heel pain (PHP) is a common complaint, and is most often caused by plantar fasciitis. Plantar fasciitis is reported to be associated with running surfaces; however, the association between PHP and running surfaces has not previously been examined in an epidemiological investigation. Therefore, the purpose of the current study was to examine the association between PHP and running surfaces. This is a cross-sectional study. A total of 347 competitive long-distance male runners participated in this study. The participants completed an original questionnaire, which included items assessing demographic characteristics, training characteristics focusing on running surfaces (soft surface, hard surface and tartan), and the prevalence of PHP during the previous 12 months. A logistic regression analysis was used to identify the effect of running surfaces on PHP. We found that 21.9% of participants had experienced PHP during the previous 12 months. The multivariate logistic regression analysis, after adjusting for demographic and training characteristics, revealed that running on tartan was associated with PHP (odds ratio 2.82, 95% confidence interval 1.42 to 5.61; P<0.01). Our findings suggest that running more than 25% on tartan is associated with PHP in competitive long-distance male runners.
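As a hedged sketch of this kind of analysis, the code below fits a multivariable logistic regression on synthetic data and reports adjusted odds ratios with 95% confidence intervals; the variable names and effect sizes are invented and only loosely echo the study.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 347
df = pd.DataFrame({
    "tartan": rng.binomial(1, 0.3, n),        # hypothetical: runs >25% on tartan (yes/no)
    "weekly_km": rng.normal(80, 20, n),       # hypothetical training covariate
})
logit_p = -2.0 + 1.0 * df["tartan"] + 0.01 * (df["weekly_km"] - 80)
df["php"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))   # simulated outcome

fit = smf.logit("php ~ tartan + weekly_km", df).fit(disp=0)
odds_ratios = np.exp(pd.concat([fit.params, fit.conf_int()], axis=1))
odds_ratios.columns = ["OR", "2.5%", "97.5%"]
print(odds_ratios)   # adjusted odds ratios with 95% confidence intervals
```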
Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.
Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu
2015-06-01
High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.
Estimating individual benefits of medical or behavioral treatments in severely ill patients.
Diaz, Francisco J
2017-01-01
There is a need for statistical methods appropriate for the analysis of clinical trials from a personalized-medicine viewpoint as opposed to the common statistical practice that simply examines average treatment effects. This article proposes an approach to quantifying, reporting and analyzing individual benefits of medical or behavioral treatments to severely ill patients with chronic conditions, using data from clinical trials. The approach is a new development of a published framework for measuring the severity of a chronic disease and the benefits treatments provide to individuals, which utilizes regression models with random coefficients. Here, a patient is considered to be severely ill if the patient's basal severity is close to one. This allows the derivation of a very flexible family of probability distributions of individual benefits that depend on treatment duration and the covariates included in the regression model. Our approach may enrich the statistical analysis of clinical trials of severely ill patients because it allows investigating the probability distribution of individual benefits in the patient population and the variables that influence it, and we can also measure the benefits achieved in specific patients including new patients. We illustrate our approach using data from a clinical trial of the anti-depressant imipramine.
Toyabe, Shin-ichi
2014-01-01
Inpatient falls are the most common adverse events that occur in a hospital, and about 3 to 10% of falls result in serious injuries such as bone fractures and intracranial haemorrhages. We previously reported that bone fractures and intracranial haemorrhages were the two major fall-related injuries and that the risk assessment score for osteoporotic bone fracture was significantly associated not only with bone fractures after falls but also with intracranial haemorrhage after falls. Based on these results, we tried to establish a risk assessment tool for predicting fall-related severe injuries in a hospital. Possible risk factors related to fall-related serious injuries were extracted from data on inpatients who were admitted to a tertiary-care university hospital by using multivariate Cox regression analysis and multiple logistic regression analysis. We found that fall risk score and fracture risk score were the two significant factors, and we constructed models to predict fall-related severe injuries incorporating these factors. When the prediction model was applied to another independent dataset, the constructed model could detect patients with fall-related severe injuries efficiently. The new assessment system could identify patients prone to severe injuries after falls in a reproducible fashion. PMID:25168984
The impact of depression on fatigue in patients with haemodialysis: a correlational study.
Bai, Yu-Ling; Lai, Liu-Yuan; Lee, Bih-O; Chang, Yong-Yuan; Chiou, Chou-Ping
2015-07-01
To investigate the fatigue levels and important fatigue predictors for patients undergoing haemodialysis. Fatigue is a common symptom for haemodialysis patients. With its debilitating and distressing effects, it impacts patients in terms of their quality of life while also increasing their mortality rate. A descriptive correlational study. Convenience sampling was conducted at six chosen haemodialysis centres in Southern Taiwan. Data were collected via a structured questionnaire from 193 haemodialysis patients. The scales involved in this study were socio-demographic details, the Center for Epidemiologic Studies Depression Scale, and the Fatigue Scale for haemodialysis patients. Data analysis included percentages, means, standard deviations and hierarchical multiple regression analysis. The fatigue level for haemodialysis patients was in the moderate range. Results from the hierarchical multiple regression analysis indicated that age, employment status, types of medications, physical activity and depression were significant. Of those variables, depression had the greatest impact on the patients' fatigue level, accounting for up to 30·6% of the explanatory power. The total explanatory power of the regression model was 64·2%. This study determined that for haemodialysis patients, unemployment, increased age, taking more medications or lower exercise frequencies resulted in more severe depression, which translated in turn to higher levels of fatigue. Among all these factors, depression had the greatest impact on the patients' fatigue levels. Not only is this finding beneficial to future studies on fatigue as a source of reference, it is also helpful in our understanding of important predictors relating to fatigue in the everyday lives of haemodialysis patients. It is recommended that when caring for fatigued patients, more care should be dedicated to their psychological states, and assistance should be provided in a timely way so as to reduce the amount of fatigue suffered. © 2015 John Wiley & Sons Ltd.
Factors Associated With Work Ability in Patients Undergoing Surgery for Cervical Radiculopathy.
Ng, Eunice; Johnston, Venerina; Wibault, Johanna; Löfgren, Håkan; Dedering, Åsa; Öberg, Birgitta; Zsigmond, Peter; Peolsson, Anneli
2015-08-15
Cross-sectional study. To investigate the factors associated with work ability in patients undergoing surgery for cervical radiculopathy. Surgery is a common treatment of cervical radiculopathy in people of working age. However, few studies have investigated the impact on the work ability of these patients. Patients undergoing surgery for cervical radiculopathy (n = 201) were recruited from spine centers in Sweden to complete a battery of questionnaires and physical measures the day before surgery. The associations between various individual, psychological, and work-related factors and self-reported work ability were investigated by Spearman rank correlation coefficient, multivariate linear regression, and forward stepwise regression analyses. Factors that were significant (P < 0.05) in each statistical analysis were entered into the successive analysis to reveal the factors most related to work ability. Work ability was assessed using the Work Ability Index. The mean Work Ability Index score was 28 (SD, 9.0). The forward stepwise regression analysis revealed 6 factors significantly associated with work ability, which explained 62% of the variance in the Work Ability Index. Factors highly correlated with greater work ability included greater self-efficacy in performing self-care, lower physical load on the neck at work, greater self-reported chance of being able to work in 6 months' time, greater use of active coping strategies, lower frequency of hand weakness, and higher health-related quality of life. Psychological, work-related and individual factors were significantly associated with work ability in patients undergoing surgery for cervical radiculopathy. High self-efficacy was most associated with greater work ability. Consideration of these factors by surgeons preoperatively may provide optimal return to work outcomes after surgery. Level of Evidence: 3.
Determinants of single family residential water use across scales in four western US cities.
Chang, Heejun; Bonnette, Matthew Ryan; Stoker, Philip; Crow-Miller, Britt; Wentz, Elizabeth
2017-10-15
A growing body of literature examines urban water sustainability, with increasing evidence that locally based physical and social spatial interactions contribute to water use. These studies, however, are based on single-city analyses and often fail to consider whether these interactions occur more generally. We examine a multi-city comparison using a common set of spatially explicit water, socioeconomic, and biophysical data. We investigate the relative importance of variables for explaining the variations of single family residential (SFR) water use at Census Block Group (CBG) and Census Tract (CT) scales in four representative western US cities (Austin, Phoenix, Portland, and Salt Lake City), which cover a wide range of climate and development density. We used both ordinary least squares regression and spatial error regression models to identify the influence of spatial dependence on water use patterns. Our results show that older downtown areas show lower water use than newer suburban areas in all four cities. Tax assessed value and building age are the main determinants of SFR water use across the four cities regardless of the scale. Impervious surface area becomes an important variable for summer water use in all cities, and it is important in all seasons for arid environments such as Phoenix. CT-level analysis shows better model predictability than CBG analysis. In all cities, seasons, and spatial scales, spatial error regression models better explain the variations of SFR water use. Such a spatially varying relationship of urban water consumption provides additional evidence for the need to integrate urban land use planning and municipal water planning. Copyright © 2017 Elsevier B.V. All rights reserved.
Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi
2013-09-01
Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed, and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression were negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch, together with a 53% saving in the lead-in work for each project (the preparation of working standard solutions, spiking, and aliquoting). Furthermore, examples are given of how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis, and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
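The comparison can be sketched numerically: fit one calibration line to all eight synthetic calibrators and another to only the lowest and highest, then back-calculate a mid-range QC sample from each. The concentrations, response function, and noise below are assumptions, not data from the projects described above.

```python
import numpy as np

# Eight synthetic calibration levels with roughly proportional (~5%) noise
conc = np.array([1, 2, 5, 10, 50, 100, 250, 500.0])
rng = np.random.default_rng(0)
response = 0.02 * conc + rng.normal(0.0, 0.02 * 0.05 * conc + 1e-4)

# Multi-point calibration versus a line through only the lowest and highest calibrators
slope_all, intercept_all = np.polyfit(conc, response, 1)
two = [0, -1]
slope_two, intercept_two = np.polyfit(conc[two], response[two], 1)

# Back-calculate a QC sample at a nominal 20 concentration units from each line
qc_response = 0.02 * 20.0
back_all = (qc_response - intercept_all) / slope_all
back_two = (qc_response - intercept_two) / slope_two
print(f"back-calculated QC: {back_all:.2f} (8-point) vs {back_two:.2f} (2-point)")
```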
Hunduma, Gari; Girma, Mulugeta; Digaffe, Tesfaye; Weldegebreal, Fitsum; Tola, Assefa
2017-01-01
Introduction: Common mental disorders, including depression, anxiety and somatoform disorders, are a public health problem in developed as well as developing countries. They represent psychiatric morbidity with significant prevalence, affecting all stages of life and causing suffering to individuals, their families and communities. Despite this fact, little information about the prevalence of common mental disorders is available from low- and middle-income countries, including Ethiopia. The aim of this study was to determine the magnitude of common mental disorders and their associated factors among adult residents of Harari Region. Methods: A comparative cross-sectional, quantitative community-based survey was conducted from February 1, 2016 to March 30, 2016 in Harari Regional State. A total of 968 residents were selected using a two-stage sampling technique, of whom 901 participated in the study. A validated and pretested self-reported questionnaire (SRQ-20) was used to determine the magnitude of common mental disorders. Data were entered and analyzed using Epi Info version 3.5.1 and SPSS-17 for Windows. Univariate, bivariate and multivariate logistic regression analyses with 95% CIs were employed to infer associations. Results: The prevalence of common mental disorders among adults in the study area was 14.9%. The most common neurotic symptoms were frequent headache (23.2%), sleeping badly (16%) and poor appetite (13.8%). Substance use, such as khat chewing (48.2%), tobacco use (38.2%) and alcohol use (10.5%), was a highly prevalent health problem among study participants. In multivariate logistic regression analysis, respondents aged 25-34, 35-44, 45-54 and 55+ years were 6.4 times (AOR 6.377; 95% CI: 2.280-17.835), 5.9 times (AOR 5.900; 95% CI: 2.243-14.859), 5.6 times (AOR 5.648; 95% CI: 2.200-14.50) and 4.1 times (AOR 4.110; 95% CI: 1.363-12.393) more likely to have common mental disorders than those aged 15-24 years, respectively. The occurrence of common mental disorders was twice as high (AOR 2.162; 95% CI: 1.254-3.728) among respondents earning less than the average monthly income as among those earning more. The odds of common mental disorders were 6.6 times (AOR 6.653; 95% CI: 1.640-6.992) higher among adults with medically confirmed physical disability than among those without. Similarly, adults who chewed khat were 2.3 times (AOR 2.305; 95% CI: 1.484-3.579) more likely to have common mental disorders than those who did not. Adults with emotional stress were twice as likely (AOR 2.063; 95% CI: 1.176-3.619) to have common mental disorders as adults without emotional stress. Conclusion: This study reveals that common mental disorders are a major public health problem. Advancing age, low average family monthly income, khat chewing and emotional stress were independent predictors of common mental disorders, whereas sex, place of residence, educational status, marital status, occupation, family size, financial stress, alcohol use, tobacco use and family history of mental illness were not statistically associated with common mental disorders.
[A SAS macro program for batch processing of univariate Cox regression analysis for large databases].
Yang, Rendong; Xiong, Jie; Peng, Yangqin; Peng, Xiaoning; Zeng, Xiaomin
2015-02-01
To realize batch processing of univariate Cox regression analysis for large databases with a SAS macro program. We wrote a SAS macro program in SAS 9.2 that can filter and integrate results and export P values to Excel. The program was used to screen survival-correlated RNA molecules in ovarian cancer. The SAS macro program was able to complete the batch processing of univariate Cox regression analyses, together with the selection and export of the results. The SAS macro program has potential applications in reducing the workload of statistical analysis and providing a basis for batch processing of univariate Cox regression analysis.
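The original tool is a SAS macro. As a hedged Python analogue of the same batch idea, the sketch below loops univariate Cox models over candidate variables with lifelines, collects hazard ratios and P values, and exports the table to Excel; the bundled Rossi data and the variable list stand in for the RNA-molecule screen.

```python
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()
candidates = ["fin", "age", "race", "wexp", "mar", "paro", "prio"]   # illustrative variable list

rows = []
for var in candidates:
    cph = CoxPHFitter()
    # Univariate model: survival time, event indicator, and one candidate variable at a time
    cph.fit(df[["week", "arrest", var]], duration_col="week", event_col="arrest")
    rows.append({"variable": var,
                 "HR": cph.hazard_ratios_[var],
                 "p": cph.summary.loc[var, "p"]})

results = pd.DataFrame(rows).sort_values("p")
results.to_excel("univariate_cox.xlsx", index=False)   # export of P values; requires openpyxl
print(results)
```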
The effect of attending tutoring on course grades in Calculus I
NASA Astrophysics Data System (ADS)
Rickard, Brian; Mills, Melissa
2018-04-01
Tutoring centres are common in universities in the United States, but there are few published studies that statistically examine the effects of tutoring on student success. This study utilizes multiple regression analysis to model the effect of tutoring attendance on final course grades in Calculus I. Our model predicted that every three visits to the tutoring centre are associated with an increase of one per cent in a student's course grade, after controlling for prior academic ability. We also found that for lower-achieving students, attending tutoring had a greater impact on final grades.
Residualization is not the answer: Rethinking how to address multicollinearity.
York, Richard
2012-11-01
Here I show that a commonly used procedure to address problems stemming from collinearity and multicollinearity among independent variables in regression analysis, "residualization", leads to biased coefficient and standard error estimates and does not address the fundamental problem of collinearity, which is a lack of information. I demonstrate this using visual representations of collinearity, hypothetical experimental designs, and analyses of both artificial and real world data. I conclude by noting the importance of examining methodological practices to ensure that their validity can be established based on rational criteria. Copyright © 2012 Elsevier Inc. All rights reserved.
Anselmi, Luciana; Barros, Fernando C; Minten, Gicele C; Gigante, Denise P; Horta, Bernardo L; Victora, Cesar G
2009-01-01
OBJECTIVE To estimate the prevalence of common mental disorders and assess its association with risk factors in a cohort of young adults. METHODS Cross-sectional study nested in a 1982 birth cohort study conducted in Pelotas, Southern Brazil. In 2004-5, 4,297 subjects were interviewed during home visits. Common mental disorders were assessed using the Self-Report Questionnaire. Risk factors included socioeconomic, demographic, perinatal, and environmental variables. The analysis was stratified by gender and crude and adjusted prevalence ratios were estimated by Poisson regression. RESULTS The overall prevalence of common mental disorders was 28.0%; 32.8% and 23.5% in women and men, respectively. Men and women who were poor in 2004-5, regardless of their poor status in 1982, had nearly 1.5-fold increased risk for common mental disorders (p≤0.001) when compared to those who have never been poor. Among women, being poor during childhood (p≤0.001) and black/mixed skin color (p=0.002) increased the risk for mental disorders. Low birth weight and duration of breastfeeding were not associated to the risk of these disorders. CONCLUSIONS Higher prevalence of common mental disorders among low-income groups and race-ethnic minorities suggests that social inequalities present at birth have a major impact on mental health, especially common mental disorders. PMID:19142342
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2009-01-01
In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
USDA-ARS?s Scientific Manuscript database
Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...
Lloyd-Jones, Luke R; Robinson, Matthew R; Yang, Jian; Visscher, Peter M
2018-04-01
Genome-wide association studies (GWAS) have identified thousands of loci that are robustly associated with complex diseases. The use of linear mixed model (LMM) methodology for GWAS is becoming more prevalent due to its ability to control for population structure and cryptic relatedness and to increase power. The odds ratio (OR) is a common measure of the association of a disease with an exposure (e.g., a genetic variant) and is readily available from logistic regression. However, when the LMM is applied to all-or-none traits, it provides estimates of genetic effects on the observed 0-1 scale, a different scale to that in logistic regression. This limits the comparability of results across studies, for example in a meta-analysis, and makes the interpretation of the magnitude of an effect from an LMM GWAS difficult. In this study, we derived transformations from the genetic effects estimated under the LMM to the OR that only rely on summary statistics. To test the proposed transformations, we used real genotypes from two large, publicly available data sets to simulate all-or-none phenotypes for a set of scenarios that differ in underlying model, disease prevalence, and heritability. Furthermore, we applied these transformations to GWAS summary statistics for type 2 diabetes generated from 108,042 individuals in the UK Biobank. In both simulation and real-data application, we observed very high concordance between the transformed OR from the LMM and either the simulated truth or estimates from logistic regression. The transformations derived and validated in this study improve the comparability of results from prospective and already performed LMM GWAS on complex diseases by providing a reliable transformation to a common comparative scale for the genetic effects. Copyright © 2018 by the Genetics Society of America.
Hirata, Makoto; Kamatani, Yoichiro; Nagai, Akiko; Kiyohara, Yutaka; Ninomiya, Toshiharu; Tamakoshi, Akiko; Yamagata, Zentaro; Kubo, Michiaki; Muto, Kaori; Mushiroda, Taisei; Murakami, Yoshinori; Yuji, Koichiro; Furukawa, Yoichi; Zembutsu, Hitoshi; Tanaka, Toshihiro; Ohnishi, Yozo; Nakamura, Yusuke; Matsuda, Koichi
2017-03-01
To implement personalized medicine, we established a large-scale patient cohort, BioBank Japan, in 2003. BioBank Japan contains DNA, serum, and clinical information derived from approximately 200,000 patients with 47 diseases. Serum and clinical information were collected annually until 2012. We analyzed clinical information of participants at enrollment, including age, sex, body mass index, hypertension, and smoking and drinking status, across 47 diseases, and compared the results with the Japanese database on Patient Survey and National Health and Nutrition Survey. We conducted multivariate logistic regression analysis, adjusting for sex and age, to assess the association between family history and disease development. Distribution of age at enrollment reflected the typical age of disease onset. Analysis of the clinical information revealed strong associations between smoking and chronic obstructive pulmonary disease, drinking and esophageal cancer, high body mass index and metabolic disease, and hypertension and cardiovascular disease. Logistic regression analysis showed that individuals with a family history of keloid exhibited a higher odds ratio than those without a family history, highlighting the strong impact of host genetic factor(s) on disease onset. Cross-sectional analysis of the clinical information of participants at enrollment revealed characteristics of the present cohort. Analysis of family history revealed the impact of host genetic factors on each disease. BioBank Japan, by publicly distributing DNA, serum, and clinical information, could be a fundamental infrastructure for the implementation of personalized medicine. Copyright © 2017 The Authors. Production and hosting by Elsevier B.V. All rights reserved.
AGSuite: Software to conduct feature analysis of artificial grammar learning performance.
Cook, Matthew T; Chubala, Chrissy M; Jamieson, Randall K
2017-10-01
To simplify the problem of studying how people learn natural language, researchers use the artificial grammar learning (AGL) task. In this task, participants study letter strings constructed according to the rules of an artificial grammar and subsequently attempt to discriminate grammatical from ungrammatical test strings. Although the data from these experiments are usually analyzed by comparing the mean discrimination performance between experimental conditions, this practice discards information about the individual items and participants that could otherwise help uncover the particular features of strings associated with grammaticality judgments. However, feature analysis is tedious to compute, often complicated, and ill-defined in the literature. Moreover, the data violate the assumption of independence underlying standard linear regression models, leading to Type I error inflation. To solve these problems, we present AGSuite, a free Shiny application for researchers studying AGL. The suite's intuitive Web-based user interface allows researchers to generate strings from a database of published grammars, compute feature measures (e.g., Levenshtein distance) for each letter string, and conduct a feature analysis on the strings using linear mixed effects (LME) analyses. The LME analysis solves the inflation of Type I errors that afflicts more common methods of repeated measures regression analysis. Finally, the software can generate a number of graphical representations of the data to support an accurate interpretation of results. We hope the ease and availability of these tools will encourage researchers to take full advantage of item-level variance in their datasets in the study of AGL. We moreover discuss the broader applicability of the tools for researchers looking to conduct feature analysis in any field.
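The item-level analysis described above can be sketched with a linear mixed-effects model in which string features enter as fixed effects and participants contribute random intercepts. AGSuite itself is an R Shiny application; the Python sketch below is only an analogous illustration, and the column names (endorsement, levenshtein, chunk_strength) are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical item-level AGL data: one row per participant x test string.
rng = np.random.default_rng(0)
n_participants, n_items = 20, 30
df = pd.DataFrame({
    "participant": np.repeat(np.arange(n_participants), n_items),
    "item": np.tile(np.arange(n_items), n_participants),
    "levenshtein": np.tile(rng.integers(0, 6, n_items), n_participants),
    "chunk_strength": np.tile(rng.normal(size=n_items), n_participants),
})
df["endorsement"] = (0.5 - 0.1 * df["levenshtein"] + 0.2 * df["chunk_strength"]
                     + rng.normal(scale=0.2, size=len(df)))

# String features as fixed effects, random intercepts for participants.
model = smf.mixedlm("endorsement ~ levenshtein + chunk_strength",
                    data=df, groups=df["participant"])
result = model.fit()
print(result.summary())
```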
Ernst, Anja F; Albers, Casper J
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated the employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongly held to be a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA recommendations. This paper appeals for a heightened awareness of and increased transparency in the reporting of statistical assumption checking.
Development of a User Interface for a Regression Analysis Software Tool
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and the graphical user interface are discussed in order to illustrate the user interface's overall design approach.
Regression Analysis and the Sociological Imagination
ERIC Educational Resources Information Center
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Li, Shi; Mukherjee, Bhramar; Taylor, Jeremy M G; Rice, Kenneth M; Wen, Xiaoquan; Rice, John D; Stringham, Heather M; Boehnke, Michael
2014-07-01
With challenges in data harmonization and environmental heterogeneity across various data sources, meta-analysis of gene-environment interaction studies can often involve subtle statistical issues. In this paper, we study the effect of environmental covariate heterogeneity (within and between cohorts) on two approaches for fixed-effect meta-analysis: the standard inverse-variance weighted meta-analysis and a meta-regression approach. Akin to the results in Simmonds and Higgins (), we obtain analytic efficiency results for both methods under certain assumptions. The relative efficiency of the two methods depends on the ratio of within- versus between-cohort variability of the environmental covariate. We propose an adaptively weighted estimator (AWE), which weights between meta-analysis and meta-regression, for the interaction parameter. The AWE retains full efficiency of the joint analysis using individual-level data under certain natural assumptions. Lin and Zeng (2010a, b) showed that a multivariate inverse-variance weighted estimator retains the full efficiency of joint analysis using individual-level data, if the estimates with full covariance matrices for all the common parameters are pooled across all studies. We show consistency of our work with Lin and Zeng (2010a, b). Without sacrificing much efficiency, the AWE uses only univariate summary statistics from each study, and bypasses issues with sharing individual-level data or full covariance matrices across studies. We compare the performance of the methods both analytically and numerically. The methods are illustrated through meta-analysis of interaction between single nucleotide polymorphisms in the FTO gene and body mass index on high-density lipoprotein cholesterol data from a set of eight studies of type 2 diabetes. © 2014 WILEY PERIODICALS, INC.
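The standard inverse-variance weighted fixed-effect meta-analysis mentioned above is easy to sketch from per-study summary statistics. The example below pools hypothetical study-level interaction estimates; the AWE described in the abstract additionally weights between this pooled estimate and a meta-regression estimate, which is not shown.

```python
import numpy as np

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis of per-study
    estimates (illustrative sketch)."""
    betas, ses = np.asarray(betas, float), np.asarray(ses, float)
    w = 1.0 / ses**2                        # inverse-variance weights
    beta_pooled = np.sum(w * betas) / np.sum(w)
    se_pooled = np.sqrt(1.0 / np.sum(w))
    return beta_pooled, se_pooled

# Example with three hypothetical study-level interaction estimates
beta, se = fixed_effect_meta([0.12, 0.08, 0.15], [0.05, 0.04, 0.07])
```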
Common mental disorders associated with tuberculosis: a matched case-control study.
de Araújo, Gleide Santos; Pereira, Susan Martins; dos Santos, Darci Neves; Marinho, Jamocyr Moura; Rodrigues, Laura Cunha; Barreto, Mauricio Lima
2014-01-01
Despite the availability of treatment and a vaccine, tuberculosis continues to be a public health problem worldwide. Mental disorders might contribute to the burden of the disease. The objective of this study was to investigate the association between common mental disorders and tuberculosis. A matched case-control study was conducted. The study population included symptomatic respiratory patients who attended three referral hospitals and six community clinics in the city of Salvador, Brazil. A doctor's diagnosis defined potential cases and controls. Cases were newly diagnosed tuberculosis cases, and controls were symptomatic respiratory patients for whom tuberculosis was excluded as a diagnosis by the attending physician. Cases and controls were ascertained in the same clinic. Data collection occurred between August 2008 and April 2010. The study instruments included a structured interview, a self-reporting questionnaire for the identification of common mental disorders, and a questionnaire for alcoholism. A univariate analysis included descriptive procedures (with chi-square statistics), and a multivariate analysis used conditional logistic regression. The mean age of the cases was 38 years, and 61% of the cases were males. After adjusting for potential confounders, the odds of tuberculosis were significantly higher in patients diagnosed with a common mental disorder (OR: 1.34; 95% CI 1.05-1.70). There appears to be a positive and independent association between common mental disorders and tuberculosis; further epidemiological studies are required to increase our understanding of the possible biological and social mechanisms responsible for this association. Independent of the direction of the association, this finding has implications for the provision of care for mental disorders and for tuberculosis.
Multivariate Regression Analysis and Slaughter Livestock,
Descriptors: agriculture; economics; meat production; multivariate analysis; regression analysis; animals; weight; costs; predictions; stability; mathematical models; storage; beef; pork; food; statistical data; accuracy
Generalized and synthetic regression estimators for randomized branch sampling
David L. R. Affleck; Timothy G. Gregoire
2015-01-01
In felled-tree studies, ratio and regression estimators are commonly used to convert more readily measured branch characteristics to dry crown mass estimates. In some cases, data from multiple trees are pooled to form these estimates. This research evaluates the utility of both tactics in the estimation of crown biomass following randomized branch sampling…
Fitting program for linear regressions according to Mahon (1996)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trappitsch, Reto G.
2018-01-09
This program takes the user's input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple interface.
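For readers who want to see what an errors-in-both-variables fit looks like in code, the sketch below uses scipy's orthogonal distance regression. Note that this is a generic errors-in-variables fit, not the Mahon (1996) prescription implemented by the program described above, and the data are made up.

```python
import numpy as np
from scipy import odr

# Illustrative data with uncertainties in both coordinates.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sx = np.full_like(x, 0.1)   # x uncertainties
sy = np.full_like(y, 0.3)   # y uncertainties

# Orthogonal distance regression accounts for errors in both variables.
linear = odr.Model(lambda beta, x: beta[0] * x + beta[1])
data = odr.RealData(x, y, sx=sx, sy=sy)
fit = odr.ODR(data, linear, beta0=[1.0, 0.0]).run()

slope, intercept = fit.beta
slope_err, intercept_err = fit.sd_beta
```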
Statistical Power for a Simultaneous Test of Factorial and Predictive Invariance
ERIC Educational Resources Information Center
Olivera-Aguilar, Margarita; Millsap, Roger E.
2013-01-01
A common finding in studies of differential prediction across groups is that although regression slopes are the same or similar across groups, group differences exist in regression intercepts. Building on earlier work by Birnbaum (1979), Millsap (1998) presented an invariant factor model that would explain such intercept differences as arising due…
NASA Astrophysics Data System (ADS)
Li, X.; Gao, M.
2017-12-01
The magnitude of an earthquake is one of its basic parameters and is a measure of its scale. It plays a significant role in seismology and earthquake engineering research, particularly in the calculations of the seismic rate and b value in earthquake prediction and seismic hazard analysis. However, several current types of magnitudes used in seismology research, such as local magnitude (ML), surface wave magnitude (MS), and body-wave magnitude (MB), have a common limitation, which is the magnitude saturation phenomenon. Fortunately, the problem of magnitude saturation was solved by a formula for calculating the moment magnitude (MW) from the seismic moment, which describes the seismic source strength, and the moment magnitude is now very commonly used in seismology research. However, in China, the earthquake scale is primarily based on local and surface-wave magnitudes. In the present work, we studied the empirical relationships between moment magnitude (MW) and local magnitude (ML) as well as surface wave magnitude (MS) in the Chinese Mainland. The China Earthquake Networks Center (CENC) ML catalog, China Seismograph Network (CSN) MS catalog, ANSS Comprehensive Earthquake Catalog (ComCat), and Global Centroid Moment Tensor (GCMT) catalog are adopted to regress the relationships using the orthogonal regression method. The obtained relationships are as follows: MW = 0.64 + 0.87 MS; MW = 1.16 + 0.75 ML. Therefore, in China, if the moment magnitude of an earthquake is not reported by any agency in the world, we can use the equations mentioned above for converting ML to MW and MS to MW. These relationships are very important, because they will allow the China earthquake catalogs to be used more effectively for seismic hazard analysis, earthquake prediction, and other seismology research. We also computed the relationships between the seismic moment Mo and ML, and between Mo and MS, by linear regression using the Global Centroid Moment Tensor catalog. The obtained relationships are as follows: log Mo = 18.21 + 1.05 ML; log Mo = 17.04 + 1.32 MS. These formulas can be used by seismologists to convert the ML/MS of Chinese mainland events into their seismic moments.
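The reported conversion relations are simple enough to apply directly; the short functions below encode the regression equations exactly as quoted in the abstract (the example numbers are hypothetical events, and the seismic-moment units follow the original regressions, which the abstract does not restate).

```python
def ms_to_mw(ms):
    """Moment magnitude from surface wave magnitude (reported relationship)."""
    return 0.64 + 0.87 * ms

def ml_to_mw(ml):
    """Moment magnitude from local magnitude (reported relationship)."""
    return 1.16 + 0.75 * ml

def ml_to_log_moment(ml):
    """log10 of seismic moment from local magnitude (reported relationship)."""
    return 18.21 + 1.05 * ml

def ms_to_log_moment(ms):
    """log10 of seismic moment from surface wave magnitude (reported relationship)."""
    return 17.04 + 1.32 * ms

# Example: a hypothetical Chinese Mainland event with ML 5.0 and MS 4.8
mw_from_ml = ml_to_mw(5.0)   # ≈ 4.91
mw_from_ms = ms_to_mw(4.8)   # ≈ 4.82
```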
Regression Analysis: Legal Applications in Institutional Research
ERIC Educational Resources Information Center
Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.
2008-01-01
This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…
RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,
This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variables, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)
Workplace bullying and common mental disorders: a follow-up study.
Lahelma, Eero; Lallukka, Tea; Laaksonen, Mikko; Saastamoinen, Peppiina; Rahkonen, Ossi
2012-06-01
Workplace bullying has been associated with poor mental health, but longitudinal studies confirming the association are lacking. This study examined the associations of workplace bullying with subsequent common mental disorders 5-7 years later, taking account of baseline common mental disorders and several covariates. Baseline questionnaire survey data were collected in 2000-2002 among municipal employees, aged 40-60 years (n=8960; 80% women; response rate 67%). Follow-up data were collected in 2007 (response rate 83%). The final data amounted to 6830 respondents. Workplace bullying was measured at baseline using an instructed question about being bullied currently, previously or never. Common mental disorders were measured at baseline and at follow-up using the 12-item version of the General Health Questionnaire. Those scoring 3-12 were classified as having common mental disorders. Covariates included bullying in childhood, occupational and employment position, work stress, obesity and limiting longstanding illness. Logistic regression analysis was used. After adjusting for age, being currently bullied at baseline was associated with common mental disorders at follow-up among women (OR 2.34, CI 1.81 to 3.02) and men (OR 3.64, CI 2.13 to 6.24). The association for the previously bullied was weaker. Adjusting for baseline common mental disorders, the association attenuated but remained. Adjusting for further covariates did not substantially alter the studied association. Conclusion: The study confirms that workplace bullying is likely to contribute to subsequent common mental disorders. Measures against bullying are needed at workplaces to prevent mental disorders.
Spatial regression analysis on 32 years of total column ozone data
NASA Astrophysics Data System (ADS)
Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.
2014-08-01
Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) ozone data (2009-2010). The two-dimensionality in this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on nonseasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño-Southern Oscillation (ENSO) and stratospheric alternative halogens which are parameterized by the effective equivalent stratospheric chlorine (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at mid- and high latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high northern latitudes, the effect of QBO is positive and negative in the tropics and mid- to high latitudes, respectively, and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific. The contribution of explanatory variables describing seasonal ozone variation is generally large at mid- to high latitudes. We observe ozone increases with potential vorticity and day length and ozone decreases with geopotential height and variable ozone effects due to the polar vortex in regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. The application of several trend models, each with their own pros and cons, yields a large range of recovery rate estimates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.
Bayesian multivariate hierarchical transformation models for ROC analysis.
O'Malley, A James; Zou, Kelly H
2006-02-15
A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
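A central ingredient of the model above is a Box-Cox transformation of the diagnostic outcomes within each cluster. The snippet below shows only that generic transformation step with scipy on simulated skewed scores; it does not attempt the Bayesian hierarchical estimation described in the abstract.

```python
import numpy as np
from scipy import stats

# Generic Box-Cox transformation of positive-valued test outcomes within one
# cluster; the BMHTM estimates such transformations jointly in a hierarchy.
rng = np.random.default_rng(0)
outcomes = rng.lognormal(mean=1.0, sigma=0.5, size=200)   # skewed diagnostic scores

transformed, lam = stats.boxcox(outcomes)   # maximum-likelihood estimate of lambda
print(f"estimated Box-Cox lambda: {lam:.2f}")
```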
A dynamic factor model of the evaluation of the financial crisis in Turkey.
Sezgin, F; Kinay, B
2010-01-01
Factor analysis has been widely used in economics and finance in situations where a relatively large number of variables are believed to be driven by a few common causes of variation. Dynamic factor analysis (DFA), which is a combination of factor and time series analysis, involves autocorrelation matrices calculated from multivariate time series. Dynamic factor models were traditionally used to construct economic indicators and for macroeconomic analysis, business cycle analysis and forecasting. In recent years, dynamic factor models have become more popular in empirical macroeconomics. They have several advantages over other methods in various respects. Factor models can, for instance, cope with many variables without running into the scarce degrees of freedom problems often faced in regression-based analysis. In this study, a model which determines the effect of the global crisis on Turkey is proposed. The main aim of the paper is to analyze how several macroeconomic quantities changed before the evolution of the crisis and to decide whether a crisis can be forecasted or not.
Prostate Cancer in Iran: Trends in Incidence and Morphological and Epidemiological Characteristics.
Pakzad, Reza; Rafiemanesh, Hosein; Ghoncheh, Mahshid; Sarmad, Arezoo; Salehiniya, Hamid; Hosseini, Sayedehafagh; Sepehri, Zahra; Afshari-Moghadam, Amin
2016-01-01
Prostate cancer is the second most common cancer in men worldwide, whereas it is the third most common cancer in men and the sixth most common cancer in Iran. Few studies have been conducted on the epidemiology of prostate cancer in Iran. Since the ethnicity of Iranian men differs from that of other Asian populations, and given the epidemiologic and demographic transition taking place in Iran, this study aimed to investigate trends in the incidence and morphology of prostate cancer during 2003-2008 in the country. Data were collected retrospectively by reviewing all new prostate cancer cases in the Cancer Registry Center of the Health Deputy for Iran during a 6-year period. Carcinoma, NOS and adenocarcinoma, NOS morphologies were also surveyed. Trends in incidence and morphology were analyzed by joinpoint regression. During the six years, a total of 16,071 cases of prostate cancer were recorded in Iran. Most (95.2 percent) were adenocarcinomas. Trend analysis of the age-standardized incidence rate (ASR) showed a significant increase, with an annual percentage change (APC) of 17.3%; for morphology, there was a significant decrease in adenocarcinoma, with an APC of -1.24%. Prostate cancer is a disease of older men, and its incidence is increasing in Iran. The most common morphology is adenocarcinoma, although this appears to be decreasing over time. Due to changing lifestyles and the aging of the population, epidemiological studies and planning for the assessment of the etiology of prostate cancer and its early detection are essential.
Distiller, Larry A; Joffe, Barry I; Melville, Vanessa; Welman, Tania; Distiller, Greg B
2006-01-01
The factors responsible for premature coronary atherosclerosis in patients with type 1 diabetes are ill-defined. We therefore assessed carotid intima-media complex thickness (IMT) in relatively long-surviving patients with type 1 diabetes as a marker of atherosclerosis and correlated this with traditional risk factors. This was a cross-sectional study of 148 relatively long-surviving patients (>18 years of type 1 diabetes; 76 men and 72 women) attending the Centre for Diabetes and Endocrinology, Johannesburg. The mean common carotid artery IMT and presence or absence of plaque were evaluated by high-resolution B-mode ultrasound. Their median age was 48 years and duration of diabetes 26 years (range 18-59 years). Traditional risk factors (age, duration of diabetes, glycemic control, hypertension, smoking and lipoprotein concentrations) were recorded. Three response variables were defined and modeled. Standard multiple regression was used for a continuous IMT variable, logistic regression for the presence/absence of plaque and ordinal logistic regression to model three categories of "risk." The median common carotid IMT was 0.62 mm (range 0.44-1.23 mm) with plaque detected in 28 cases. The multiple regression model found significant associations between IMT and current age (P=.001), duration of diabetes (P=.033), BMI (P=.008) and diagnosed hypertension (P=.046), with HDL showing a protective effect (P=.022). Current age (P=.001), diagnosed hypertension (P=.004), smoking (P=.008) and retinopathy (P=.033) were significant in the logistic regression model. Current age was also significant in the ordinal logistic regression model (P<.001), as were the total cholesterol/HDL ratio (P<.001) and mean HbA(1c) concentration (P=.073). The major factors influencing common carotid IMT in patients with relatively long-surviving type 1 diabetes are age, duration of diabetes, existing hypertension and HDL (protective), with a relatively minor role ascribed to long-term glycemic control.
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend in the incidence rate of acute myocardial infarction (AMI) in Tianjin from 1999 to 2013 using the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value
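For readers unfamiliar with the Cochran-Armitage trend test, the sketch below implements the textbook two-sided statistic for event counts across ordered groups (e.g., calendar years). The counts and denominators are hypothetical and this is not the authors' code.

```python
import numpy as np
from scipy.stats import norm

def cochran_armitage_trend(cases, totals, scores=None):
    """Two-sided Cochran-Armitage trend test: cases[i] events out of totals[i]
    at ordered score scores[i]. Illustrative textbook implementation."""
    cases = np.asarray(cases, float)
    totals = np.asarray(totals, float)
    scores = np.arange(len(cases), dtype=float) if scores is None else np.asarray(scores, float)

    p = cases.sum() / totals.sum()                     # overall event proportion
    t_stat = np.sum(scores * (cases - totals * p))     # trend statistic
    var = p * (1 - p) * (np.sum(totals * scores**2)
                         - np.sum(totals * scores)**2 / totals.sum())
    z = t_stat / np.sqrt(var)
    return z, 2 * norm.sf(abs(z))

# Example with hypothetical yearly AMI counts and population denominators
z, p_value = cochran_armitage_trend([120, 135, 150, 170], [100000] * 4)
```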
Using Quantile and Asymmetric Least Squares Regression for Optimal Risk Adjustment.
Lorenz, Normann
2017-06-01
In this paper, we analyze optimal risk adjustment for direct risk selection (DRS). Integrating insurers' activities for risk selection into a discrete choice model of individuals' health insurance choice shows that DRS has the structure of a contest. For the contest success function (csf) used in most of the contest literature (the Tullock-csf), optimal transfers for a risk adjustment scheme have to be determined by means of a restricted quantile regression, irrespective of whether insurers are primarily engaged in positive DRS (attracting low risks) or negative DRS (repelling high risks). This is at odds with the common practice of determining transfers by means of a least squares regression. However, this common practice can be rationalized for a new csf, but only if positive and negative DRSs are equally important; if they are not, optimal transfers have to be calculated by means of a restricted asymmetric least squares regression. Using data from German and Swiss health insurers, we find considerable differences between the three types of regressions. Optimal transfers therefore critically depend on which csf represents insurers' incentives for DRS and, if it is not the Tullock-csf, whether insurers are primarily engaged in positive or negative DRS. Copyright © 2016 John Wiley & Sons, Ltd.
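The two regression types contrasted in the abstract can be sketched briefly: quantile regression via statsmodels, and asymmetric least squares (expectile) regression via iteratively reweighted least squares. The data are simulated placeholders, and the sketch does not impose the restrictions that the paper derives for risk-adjustment transfers.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical risk-adjustment data: y = individual cost, X = risk adjusters.
rng = np.random.default_rng(1)
X = sm.add_constant(rng.normal(size=(500, 3)))
y = X @ np.array([1.0, 0.5, -0.3, 0.8]) + rng.standard_t(df=4, size=500)

# Quantile regression (here the median).
quant_fit = sm.QuantReg(y, X).fit(q=0.5)

# Asymmetric least squares (expectile) regression via iteratively reweighted
# least squares: residuals above the fit get weight tau, below get 1 - tau.
def expectile_regression(y, X, tau=0.7, n_iter=50):
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        resid = y - X @ beta
        w = np.where(resid > 0, tau, 1 - tau)
        beta = np.linalg.lstsq(X * w[:, None] ** 0.5, y * w ** 0.5, rcond=None)[0]
    return beta

beta_expectile = expectile_regression(y, X, tau=0.7)
```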
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
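As a companion to the calculator/spreadsheet route mentioned in the abstract, the sketch below computes ordinary least products (geometric mean) regression coefficients with a simple percentile bootstrap for the confidence intervals. It is an illustrative stand-in, not the smatr implementation cited above, and the example data are simulated.

```python
import numpy as np

def least_products_regression(x, y, n_boot=2000, seed=0):
    """Ordinary least products (geometric mean) Model II regression:
    slope = sign(r) * SD(y) / SD(x); CIs by percentile bootstrap (illustrative)."""
    x, y = np.asarray(x, float), np.asarray(y, float)

    def fit(xs, ys):
        r = np.corrcoef(xs, ys)[0, 1]
        slope = np.sign(r) * ys.std(ddof=1) / xs.std(ddof=1)
        return slope, ys.mean() - slope * xs.mean()

    slope, intercept = fit(x, y)

    rng = np.random.default_rng(seed)
    boots = []
    for _ in range(n_boot):
        idx = rng.integers(0, x.size, x.size)
        boots.append(fit(x[idx], y[idx]))
    ci_low, ci_high = np.percentile(boots, [2.5, 97.5], axis=0)  # (slope, intercept)
    return slope, intercept, (ci_low, ci_high)

# Example with both variables measured with error
rng = np.random.default_rng(1)
true_x = rng.normal(size=50)
x_obs = true_x + rng.normal(scale=0.2, size=50)
y_obs = 2.0 * true_x + 1.0 + rng.normal(scale=0.2, size=50)
slope, intercept, ci = least_products_regression(x_obs, y_obs)
```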
Morningstar, Rebecca J; Hamer, Gabriel L; Goldberg, Tony L; Huang, Shaoming; Andreadis, Theodore G; Walker, Edward D
2012-05-01
Analysis of molecular genetic diversity in nine marker regions of five genes within the bacteriophage WO genomic region revealed high diversity of the Wolbachia pipientis strain wPip in a population of Culex pipiens L. sampled in metropolitan Chicago, IL. From 166 blood-fed females, 50 distinct genetic profiles of wPip were identified. Rarefaction analysis suggested a maximum of 110 profiles out of a possible 512 predicted by combinations of the nine markers. A rank-abundance curve showed that few strains were common and most were rare. Multiple regression showed that markers associated with gene Gp2d, encoding a partial putative capsid protein, were significantly associated with ancestry of individuals to either form molestus or form pipiens, as determined by prior microsatellite allele frequency analysis. None of the other eight markers was associated with ancestry to either form, nor to ancestry to Cx. quinquefasciatus Say. Logistic regression of host choice (mammal vs. avian) as determined by bloodmeal analysis revealed that significantly fewer individuals that had fed on mammals had the Gp9a genetic marker (58.5%) compared with avian-fed individuals (88.1%). These data suggest that certain wPip molecular genetic types are associated with genetic admixture in the Cx. pipiens complex of metropolitan Chicago, IL, and that the association extends to phenotypic variation related to host preference.
A systematic evaluation of normalization methods in quantitative label-free proteomics.
Välikangas, Tommi; Suomi, Tomi; Elo, Laura L
2018-01-01
To date, mass spectrometry (MS) data remain inherently biased as a result of reasons ranging from sample handling to differences caused by the instrumentation. Normalization is the process that aims to account for the bias and make samples more comparable. The selection of a proper normalization method is a pivotal task for the reliability of the downstream analysis and results. Many normalization methods commonly used in proteomics have been adapted from the DNA microarray techniques. Previous studies comparing normalization methods in proteomics have focused mainly on intragroup variation. In this study, several popular and widely used normalization methods representing different strategies in normalization are evaluated using three spike-in and one experimental mouse label-free proteomic data sets. The normalization methods are evaluated in terms of their ability to reduce variation between technical replicates, their effect on differential expression analysis and their effect on the estimation of logarithmic fold changes. Additionally, we examined whether normalizing the whole data globally or in segments for the differential expression analysis has an effect on the performance of the normalization methods. We found that variance stabilization normalization (Vsn) reduced variation the most between technical replicates in all examined data sets. Vsn also performed consistently well in the differential expression analysis. Linear regression normalization and local regression normalization performed also systematically well. Finally, we discuss the choice of a normalization method and some qualities of a suitable normalization method in the light of the results of our evaluation. © The Author 2016. Published by Oxford University Press.
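One of the simpler strategies typically included in such comparisons is global median normalization, sketched below on a simulated log2 intensity matrix. Variance stabilization normalization (Vsn), which the evaluation found to perform best, is a different, model-based method and is not shown here.

```python
import numpy as np

def median_normalize(intensity_matrix):
    """Global median normalization of a peptide-intensity matrix
    (rows = peptides, columns = samples/technical replicates): log2-transform,
    then shift each sample so all sample medians coincide."""
    log_int = np.log2(intensity_matrix)
    sample_medians = np.nanmedian(log_int, axis=0)
    return log_int - (sample_medians - np.nanmean(sample_medians))

# Example: 4 samples of 1000 peptides with sample-specific intensity bias
rng = np.random.default_rng(2)
raw = rng.lognormal(mean=10, sigma=1, size=(1000, 4)) * np.array([1.0, 1.3, 0.8, 1.1])
normalized = median_normalize(raw)
```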
Water quality parameter measurement using spectral signatures
NASA Technical Reports Server (NTRS)
White, P. E.
1973-01-01
Regression analysis is applied to the problem of measuring water quality parameters from remote sensing spectral signature data. The equations necessary to perform regression analysis are presented and methods of testing the strength and reliability of a regression are described. An efficient algorithm for selecting an optimal subset of the independent variables available for a regression is also presented.
The impact of tobacco smoking on perinatal outcome among patients with gestational diabetes.
Contreras, K R; Kominiarek, M A; Zollinger, T W
2010-05-01
To determine the effects of tobacco use on perinatal outcomes among patients with gestational diabetes (GDM). This was a retrospective cohort study of singleton pregnancies with GDM and live births from 2003 to 2006. The primary outcome, large for gestational age (LGA) infants, was compared between smoking and nonsmoking groups. Secondary outcomes included cesarean deliveries, shoulder dystocia, birth trauma, peripartum complications, macrosomia, 5-min Apgar score ≤3, birth defects, and neonatal intensive care unit (NICU) admissions. Chi-square and Student's t-tests compared the two groups; a P-value <0.05 was considered statistically significant and odds ratios (OR) were reported with 95% confidence intervals (CI). A multivariate logistic regression analysis controlled for variables known to affect outcomes in GDM. We identified 915 patients with GDM, of whom 130 (14.2%) smoked during pregnancy. Women who smoked during pregnancy were less likely to have LGA infants (22.4 vs 31.2%; OR, 0.61; 95% CI, 0.39 to 0.95). In a logistic regression analysis, the inverse relationship between smoking and LGA persisted (OR, 0.59; 95% CI, 0.36 to 0.97) after controlling for maternal age, multiparity, ethnicity, weight status before pregnancy, weight gain during pregnancy, and male gender. Preterm labor, preeclampsia, Cesareans, shoulder dystocia, and birth trauma were similar in both groups. Preterm premature rupture of membranes (PPROM) was more likely to occur in nonsmokers (0 vs 4%, P=0.03), but postpartum hemorrhage was more common among smokers (OR, 2.3; 95% CI, 1.02 to 5.31). Macrosomia, low 5-min Apgar score, birth defects, and NICU admissions were similar between the groups. Patients with GDM who smoked during pregnancy were 40% less likely to have LGA infants. However, smoking was not protective of other common morbidities associated with GDM.
Wiernik, Emmanuel; Lemogne, Cédric; Thomas, Frédérique; Perier, Marie-Cécile; Guibout, Catherine; Nabi, Hermann; Laurent, Stéphane; Pannier, Bruno; Boutouyrie, Pierre; Jouven, Xavier; Empana, Jean-Philippe
2016-10-15
The association between psychological factors and cardiovascular diseases may depend upon socio-economic status. The present cross-sectional study examined the potential moderating role of occupational status on the association between perceived stress and intima-media thickness (IMT), using baseline examination data of the Paris Prospective Study III. IMT was measured in the right common carotid artery (CCA-IMT) 1 cm below the bifurcation, in a zone free of discrete plaques, using non-invasive high-resolution echotracking. Perceived stress was measured with the 4-item Perceived Stress Scale. The association between perceived stress and CCA-IMT was explored using linear regression analysis and regression coefficients (b) were given per 1-point increment. The study population included 5140 participants (3539 men) in the labor force aged 55.9 years on average (standard deviation: 3.9), who were free of personal history of cardiovascular disease and not on psychotropic drugs. There was a non-significant trend between perceived stress and CCA-IMT after adjustment for socio-demographic, self-rated health and cardiovascular risk factors (b [95% CI] 1.02 [-0.08;2.12]; p=0.069). However, multivariable stratified analysis indicated a significant and robust association between perceived stress and CCA-IMT in unemployed participants (b [95% CI] 3.30 [0.44;6.17]), and an association of similar magnitude in working participants with low occupational status, although without reaching statistical significance. The association between perceived stress and CCA-IMT may depend upon employment status. These results may explain why psychological stress is more tightly linked to cardiovascular disease among individuals facing social adversity. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Holtman, Gea A; Kranenberg, Justin J; Blanker, Marco H; Ott, Alewijn; Lisman-van Leeuwen, Yvonne; Berger, Marjolein Y
2017-02-01
Dientamoeba fragilis is commonly identified in children in primary care and is suspected to cause gastrointestinal disease. To determine the association between D. fragilis colonization and gastrointestinal symptoms in children. We performed a cross-sectional study with children who presented in primary care with gastrointestinal symptoms. The associations between D. fragilis colonization and specific symptoms were explored by means of logistic regression analyses. Asymptomatic siblings of these cases were invited as control subjects for a case-control analysis, where we explored the association between D. fragilis and gastrointestinal symptoms with conditional logistic regression analysis. In the cross-sectional study, 107 children were included. Their median age was 9 years (interquartile range = 6-12) and 38 (35.5%) were boys. Colonization with D. fragilis was present in 59 children (55.1%). The absence of D. fragilis was associated with soft to watery stool [odds ratio (OR) = 0.29; 95% confidence interval (CI) = 0.10-0.85], chronic diarrhoea (OR = 0.42; 95% CI = 0.18-0.97) and fatigue (OR = 0.45; 95% CI = 0.20-0.99). The case-control analyses included 44 children in each group. Dientamoeba fragilis colonization was not observed more often in cases than in controls after adjustment for age and sex (OR = 1.02; 95% CI = 0.28-3.65). Dientamoeba fragilis is a common parasite in children with and without gastrointestinal symptoms. The anomalous association between the absence of D. fragilis and soft to watery stools, chronic diarrhoea and fatigue remains unexplained. Our study suggests that D. fragilis colonization does not increase the risk for gastrointestinal symptoms. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Ching, SiewMooi; Ramachandran, Vasudevan; Gew, Lai Teck; Lim, Sazlyna Mohd Sazlly; Sulaiman, Wan Aliaa Wan; Foo, Yoke Loong; Zakaria, Zainul Amiruddin; Samsudin, Nurul Huda; Lau, Paul Chih Ming Chih; Veettil, Sajesh K; Hoo, Fankee
2016-01-29
In Malaysia, the number of reported cases of dengue fever demonstrates an increasing trend. Since dengue fever has no vaccine or antiviral treatment available, it has become a burden. Complementary and alternative medicine (CAM) has become one of the alternatives used to treat patients with dengue fever. There are limited studies on the use of CAM among patients with dengue fever, particularly in hospital settings. This study aims to determine the prevalence, types, reasons, expenditure, and sources of information on CAM use among patients with dengue fever. This is a descriptive, cross-sectional study of 306 patients with dengue fever, which was carried out at the dengue clinic of three hospitals. Data were analysed using IBM SPSS Statistics version 21.0 and logistic regression analysis was used to determine the factors associated with CAM use. The prevalence of CAM use was 85.3% among patients with dengue fever. The most popular CAMs were isotonic drinks (85.8%), crab soup (46.7%) and papaya leaf extract (22.2%). The most common reason for CAM use was a good impression of CAM from other CAM users (33.3%). The main source of information on CAM use among patients with dengue fever was family (54.8%). In the multiple logistic regression analysis, patients with dengue fever who had tertiary-level education were 5.8 (95% confidence interval (CI) 1.62-20.45) and 3.8 (95% CI 1.12-12.93) times more likely to use CAM than those with secondary-level education and primary-level education or below, respectively. CAM was commonly used by patients with dengue fever. The predictor of CAM use was a higher level of education.
Ikeya, Yoshimori; Fukuyama, Naoto; Mori, Hidezo
2015-03-01
N-3 fatty acids, including eicosapentaenoic acid (EPA), prevent ischemic stroke. The preventive effect has been attributed to an antithrombotic effect induced by elevated EPA and reduced arachidonic acid (AA) levels. However, the relationship between intracranial hemorrhage and N-3 fatty acids has not yet been elucidated. In this cross-sectional study, we compared common clinical and lifestyle parameters between 70 patients with intracranial hemorrhages and 66 control subjects. The parameters included blood chemistry data, smoking, alcohol intake, fish consumption, and the incidences of underlying diseases. The comparisons were performed using the Mann-Whitney U test followed by multiple logistic regression analysis. Nonparametric tests revealed that the 70 patients with intracerebral hemorrhages exhibited significantly higher diastolic blood pressures and alcohol intakes and lower body mass indices, high-density lipoprotein (HDL) cholesterol levels, EPA concentrations, EPA/AA ratios, and vegetable consumption compared with the 66 control subjects. A multiple logistic regression analysis revealed that higher diastolic blood pressure and alcohol intake and lower body mass index, HDL cholesterol, EPA/AA ratio, and vegetable consumption were relative risk factors for intracerebral hemorrhage. High HDL cholesterol was a common risk factor in both of the sex-segregated subgroups and the <65-year-old subgroup. However, neither EPA nor the EPA/AA ratio was a risk factor in these subgroups. Eicosapentaenoic acid was a relative risk factor only in the ≥65-year-old subgroup. Rather than higher EPA levels, lower EPA concentrations and EPA/AA ratios were found to be risk factors for intracerebral hemorrhage in addition to previously known risk factors such as blood pressure, alcohol consumption, and lifestyle. Copyright © 2015 Elsevier Inc. All rights reserved.
Rude, Tope L; Donin, Nicholas M; Cohn, Matthew R; Meeks, William; Gulig, Scott; Patel, Samir N; Wysock, James S; Makarov, Danil V; Bjurlin, Marc A
2018-06-07
To define the rates of common Hospital Acquired Conditions (HACs) in patients undergoing major urological surgery over a period of time encompassing the implementation of the Hospital Acquired Condition Reduction Program, and to evaluate whether implementation of the HAC reimbursement penalties in 2008 was associated with a change in the rate of HACs. Using American College of Surgeons National Surgical Quality Improvement Program (NSQIP) data, we determined rates of HACs in patients undergoing major inpatient urological surgery from 2005 to 2012. Rates were stratified by procedure type and approach (open vs. laparoscopic/robotic). Multivariable logistic regression was used to determine the association between year of surgery and HACs. We identified 39,257 patients undergoing major urological surgery, of whom 2300 (5.9%) had at least one hospital acquired condition. Urinary tract infection (UTI, 2.6%) was the most common, followed by surgical site infection (SSI, 2.5%) and venous thrombotic events (VTE, 0.7%). Multivariable logistic regression analysis demonstrated that open surgical approach, diabetes, congestive heart failure, chronic obstructive pulmonary disease, weight loss, and ASA class were among the variables associated with a higher likelihood of HACs. We observed a non-significant secular trend of decreasing rates of HACs from 7.4% to 5.8% during the study period, which encompassed the implementation of the Hospital Acquired Condition Reduction Program. HACs occurred at a rate of 5.9% after major urological surgery, and are significantly affected by procedure type and patient health status. The rate of HACs appeared unaffected by the national reduction program in this cohort. Better understanding of the factors associated with HACs is critical in developing effective reduction programs. Copyright © 2018. Published by Elsevier Inc.
Evaluation of Land Use Regression Models for Nitrogen Dioxide and Benzene in Four US Cities
Mukerjee, Shaibal; Smith, Luther; Neas, Lucas; Norris, Gary
2012-01-01
Spatial analysis studies have included the application of land use regression models (LURs) for health and air quality assessments. Recent LUR studies have collected nitrogen dioxide (NO2) and volatile organic compounds (VOCs) using passive samplers at urban air monitoring networks in El Paso and Dallas, TX, Detroit, MI, and Cleveland, OH to assess spatial variability and source influences. LURs were successfully developed to estimate pollutant concentrations throughout the study areas. Comparisons of development and predictive capabilities of LURs from these four cities are presented to address this issue of uniform application of LURs across study areas. Traffic and other urban variables were important predictors in the LURs although city-specific influences (such as border crossings) were also important. In addition, transferability of variables or LURs from one city to another may be problematic due to intercity differences and data availability or comparability. Thus, developing common predictors in future LURs may be difficult. PMID:23226985
Maggin, Daniel M; Swaminathan, Hariharan; Rogers, Helen J; O'Keeffe, Breda V; Sugai, George; Horner, Robert H
2011-06-01
A new method for deriving effect sizes from single-case designs is proposed. The strategy is applicable to small-sample time-series data with autoregressive errors. The method uses Generalized Least Squares (GLS) to model the autocorrelation of the data and estimate regression parameters to produce an effect size that represents the magnitude of treatment effect from baseline to treatment phases in standard deviation units. In this paper, the method is applied to two published examples using common single case designs (i.e., withdrawal and multiple-baseline). The results from these studies are described, and the method is compared to ten desirable criteria for single-case effect sizes. Based on the results of this application, we conclude with observations about the use of GLS as a support to visual analysis, provide recommendations for future research, and describe implications for practice. Copyright © 2011 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
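The core of the approach, regression with autoregressive errors via generalized least squares, can be sketched with statsmodels' GLSAR on a simulated baseline/treatment series. The data and the final scaling into standard deviation units are illustrative approximations, not the authors' exact procedure.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical single-case time series: baseline phase then treatment phase,
# with AR(1) errors. GLSAR fits a regression with AR(p) error structure.
rng = np.random.default_rng(3)
n_base, n_treat = 10, 15
phase = np.r_[np.zeros(n_base), np.ones(n_treat)]          # 0 = baseline, 1 = treatment
errors = np.zeros(n_base + n_treat)
for t in range(1, errors.size):                             # AR(1) errors, rho = 0.4
    errors[t] = 0.4 * errors[t - 1] + rng.normal(scale=1.0)
y = 2.0 + 1.5 * phase + errors

X = sm.add_constant(phase)
model = sm.GLSAR(y, X, rho=1)                               # AR(1) error structure
results = model.iterative_fit(maxiter=10)                   # alternate rho and beta estimation

phase_effect = results.params[1]
effect_size = phase_effect / np.sqrt(results.scale)         # rough SD-unit scaling
```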
Finley, Andrew O.; Banerjee, Sudipto; Cook, Bruce D.; Bradford, John B.
2013-01-01
In this paper we detail a multivariate spatial regression model that couples LiDAR, hyperspectral and forest inventory data to predict forest outcome variables at a high spatial resolution. The proposed model is used to analyze forest inventory data collected on the US Forest Service Penobscot Experimental Forest (PEF), ME, USA. In addition to helping meet the regression model's assumptions, results from the PEF analysis suggest that the addition of multivariate spatial random effects improves model fit and predictive ability, compared with two commonly applied modeling approaches. This improvement results from explicitly modeling the covariation among forest outcome variables and spatial dependence among observations through the random effects. Direct application of such multivariate models to even moderately large datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. We apply a spatial dimension reduction technique to help overcome this computational hurdle without sacrificing richness in modeling.
Jafari, Peyman; Sharafi, Zahra; Bagheri, Zahra; Shalileh, Sara
2014-06-01
Measurement equivalence is a necessary assumption for meaningful comparison of pediatric quality of life rated by children and parents. In this study, differential item functioning (DIF) analysis is used to examine whether children and their parents respond consistently to the items in the KINDer Lebensqualitätsfragebogen (KINDL; in German, Children Quality of Life Questionnaire). Two DIF detection methods, the graded response model (GRM) and ordinal logistic regression (OLR), were applied for comparability. The KINDL was completed by 1,086 school children and 1,061 of their parents. While the GRM revealed that 12 out of the 24 items were flagged with DIF, the OLR identified 14 out of the 24 items with DIF. Seven items with DIF and five items without DIF were common across the two methods, yielding a total agreement rate of 50%. This study revealed that parent proxy-reports cannot be used as a substitute for a child's ratings in the KINDL.
Hanks, Ephraim M.; Schliep, Erin M.; Hooten, Mevin B.; Hoeting, Jennifer A.
2015-01-01
In spatial generalized linear mixed models (SGLMMs), covariates that are spatially smooth are often collinear with spatially smooth random effects. This phenomenon is known as spatial confounding and has been studied primarily in the case where the spatial support of the process being studied is discrete (e.g., areal spatial data). In this case, the most common approach suggested is restricted spatial regression (RSR) in which the spatial random effects are constrained to be orthogonal to the fixed effects. We consider spatial confounding and RSR in the geostatistical (continuous spatial support) setting. We show that RSR provides computational benefits relative to the confounded SGLMM, but that Bayesian credible intervals under RSR can be inappropriately narrow under model misspecification. We propose a posterior predictive approach to alleviating this potential problem and discuss the appropriateness of RSR in a variety of situations. We illustrate RSR and SGLMM approaches through simulation studies and an analysis of malaria frequencies in The Gambia, Africa.
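The central device in restricted spatial regression, constraining the spatial random effects to the orthogonal complement of the fixed-effects design, reduces to a simple projection in the linear-algebra sense. The sketch below shows only that projection step on simulated inputs; the paper's SGLMM and Bayesian inference machinery are not reproduced.

```python
import numpy as np

def restricted_spatial_basis(X, spatial_basis):
    """Project spatial random-effect basis vectors onto the orthogonal
    complement of the fixed-effects design X, so the random effects cannot be
    collinear with the covariates (illustrative RSR building block)."""
    Q, _ = np.linalg.qr(X)                        # orthonormal basis for col(X)
    P_perp = np.eye(X.shape[0]) - Q @ Q.T         # projector onto col(X)'s complement
    return P_perp @ spatial_basis

# Example: 100 locations, intercept + 2 covariates, 10 spatial basis functions
rng = np.random.default_rng(4)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
basis = rng.normal(size=(100, 10))
restricted = restricted_spatial_basis(X, basis)
# Columns of `restricted` are orthogonal to every column of X (up to numerics).
```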
Predicting the Retention Behavior of Specific O-Linked Glycopeptides.
Badgett, Majors J; Boyes, Barry; Orlando, Ron
2017-09-01
O-Linked glycosylation is a common post-translational modification that can alter the overall structure, polarity, and function of proteins. Reverse-phase (RP) chromatography is the most common chromatographic approach to analyze O-glycosylated peptides and their unmodified counterparts, even though this approach often does not provide adequate separation of these two species. Hydrophilic interaction liquid chromatography (HILIC) can be a solution to this problem, as the polar glycan interacts with the polar stationary phase and potentially offers the ability to resolve the peptide from its modified form(s). In this paper, HILIC is used to separate peptides with O-N-acetylgalactosamine (O-GalNAc), O-N-acetylglucosamine (O-GlcNAc), and O-fucose additions from their native forms, and coefficients representing the extent of hydrophilicity were derived using linear regression analysis as a means to predict the retention times of peptides with these modifications.
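A retention-coefficient model of this kind can be expressed as an ordinary least squares fit of retention time on residue and modification counts. The composition matrix, feature set, and retention times below are hypothetical placeholders; the paper derives its coefficients from measured glycopeptide retention data.

```python
import numpy as np

# Rows = peptides, columns = counts of each residue/modification type
# (e.g., [Ser, Gly, Asn, O-GalNAc]); values are made up for illustration.
composition = np.array([
    [2, 3, 1, 0],
    [1, 4, 0, 1],
    [3, 2, 2, 0],
    [2, 2, 1, 1],
    [4, 1, 0, 0],
])
retention_times = np.array([12.3, 15.8, 11.1, 16.9, 9.4])   # minutes (hypothetical)

design = np.column_stack([np.ones(len(retention_times)), composition])  # intercept + counts
coef, *_ = np.linalg.lstsq(design, retention_times, rcond=None)

intercept, residue_coefs = coef[0], coef[1:]   # per-feature hydrophilicity contributions
predicted = design @ coef                      # predicted retention times
```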
Coutinho, Letícia Maria Silva; Matijasevich, Alícia; Scazufca, Márcia; Menezes, Paulo Rossi
2014-09-01
Social context can play an important role in the etiology and prevalence of mental disorders. The aim of the present study was to investigate risk factors for common mental disorders (CMD), considering different contextual levels: individual, household, and census tract. The study used a population-based sample of 2,366 respondents from the São Paulo Ageing & Health Study. Presence of CMD was identified by the SRQ-20. Sex, age, education, and occupation were individual characteristics associated with the prevalence of CMD. Multilevel logistic regression models showed that part of the variance in the prevalence of CMD was associated with the household level, showing associations between crowding, family income, and CMD, even after controlling for individual characteristics. These results suggest that characteristics of the environment where people live can influence their mental health status.
The incidence of phlebitis with intravenous amiodarone at guideline dose recommendations.
Slim, Ahmad M; Roth, Jason E; Duffy, Benjamin; Boyd, Sheri Y N; Rubal, Bernard J
2007-12-01
Postoperative atrial fibrillation following cardiothoracic surgery is common and frequently managed with intravenous (IV) amiodarone. Phlebitis is the most common complication with peripheral infusion of this agent. Current practice guidelines for peripheral IV administration of <2 mg/mL amiodarone were established to reduce the risk of phlebitis. The present study examines the incidence of phlebitis in a postoperative patient population given current dose recommendations. A total of 273 patient charts were reviewed. The incidence of phlebitis in patients given IV amiodarone (n = 36) was 13.9% (95% confidence interval, 2.6-25.2%; p = 0.001). Logistic regression analysis with backward elimination of other therapeutic risk factors suggests that the odds ratio for phlebitis using current dose regimens without IV filters is 19-fold greater than baseline risk in this population. Phlebitis remains a significant complication associated with peripheral infusion of amiodarone within recommended dosing limits.
Comparing the Relationship Between Age and Length of Disability Across Common Chronic Conditions
Jetha, Arif; Besen, Elyssa; Smith, Peter M.
2016-01-01
Objective: The aim of this study was to compare the association between age and disability length across common chronic conditions. Methods: Analysis of 39,915 nonwork-related disability claims with a diagnosis of arthritis, diabetes, hypertension, coronary artery disease, depression, low back pain, chronic pulmonary disease, or cancer. Ordinary least squares regression models examined age-length of disability association across chronic conditions. Results: Arthritis (76.6 days), depression (63.2 days), and cancer (64.9 days) were associated with longest mean disability lengths; hypertension was related to shortest disability lengths (41.5 days). Across chronic conditions, older age was significantly associated with longer work disability. The age–length of disability association was most significant for chronic pulmonary disease and cancer. The relationship between age and length of work disability was linear among most chronic conditions. Conclusions: Work disability prevention strategies should consider both employee age and chronic condition diagnosis. PMID:27164446
De Girolamo, A; Lippolis, V; Nordkvist, E; Visconti, A
2009-06-01
Fourier transform near-infrared spectroscopy (FT-NIR) was used for rapid and non-invasive analysis of deoxynivalenol (DON) in durum and common wheat. The relevance of using ground wheat samples with a homogeneous particle size distribution to minimize measurement variations and avoid DON segregation among particles of different sizes was established. Calibration models for durum wheat, common wheat and durum + common wheat samples, with particle size <500 µm, were obtained by using partial least squares (PLS) regression with an external validation technique. Values of root mean square error of prediction (RMSEP, 306-379 µg kg⁻¹) were comparable and not too far from values of root mean square error of cross-validation (RMSECV, 470-555 µg kg⁻¹). Coefficients of determination (r²) indicated an "approximate to good" level of prediction of the DON content by FT-NIR spectroscopy in the PLS calibration models (r² = 0.71-0.83), and a "good" discrimination between low and high DON contents in the PLS validation models (r² = 0.58-0.63). A "limited to good" practical utility of the models was ascertained by range error ratio (RER) values higher than 6. A qualitative model, based on 197 calibration samples, was developed to discriminate between blank and naturally contaminated wheat samples by setting a cut-off at 300 µg kg⁻¹ DON to separate the two classes. The model correctly classified 69% of the 65 validation samples with most misclassified samples (16 of 20) showing DON contamination levels quite close to the cut-off level. These findings suggest that FT-NIR analysis is suitable for the determination of DON in unprocessed wheat at levels far below the maximum permitted limits set by the European Commission.
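As an illustration of the calibration workflow described in the abstract above, the following is a minimal sketch of PLS regression with cross-validation and external validation, computing RMSECV, RMSEP, and the range error ratio in scikit-learn. The spectra, reference values, and number of latent variables are synthetic placeholders, not data or settings from the study.

```python
# Sketch only: PLS calibration with cross-validation (RMSECV), external
# validation (RMSEP) and range error ratio (RER). Data are synthetic.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict, train_test_split

rng = np.random.default_rng(0)
spectra = rng.normal(size=(120, 500))        # stand-in FT-NIR spectra
don = rng.uniform(100, 3000, size=120)       # stand-in reference DON (µg/kg)

X_cal, X_val, y_cal, y_val = train_test_split(spectra, don, random_state=0)

pls = PLSRegression(n_components=8)          # number of latent variables (assumed)

cv_pred = cross_val_predict(pls, X_cal, y_cal, cv=10).ravel()
rmsecv = np.sqrt(np.mean((y_cal - cv_pred) ** 2))

pls.fit(X_cal, y_cal)
val_pred = pls.predict(X_val).ravel()
rmsep = np.sqrt(np.mean((y_val - val_pred) ** 2))
rer = (y_val.max() - y_val.min()) / rmsep
print(f"RMSECV={rmsecv:.0f}  RMSEP={rmsep:.0f}  RER={rer:.1f}")
```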
Wang, Zhen; Zhang, Hong; Shen, Xu-Hui; Jin, Kui-Li; Ye, Guo-fen; Qian, Li; Li, Bo; Zhang, Yong-Hong; Shi, Guo-Ping
2011-01-01
Background Recent studies have suggested that mast-cell activation and inflammation are important in obesity and diabetes. Plasma levels of mast cell proteases and the mast cell activator immunoglobulin E (IgE) may serve as novel inflammatory markers that associate with the risk of pre-diabetes and diabetes mellitus. Methods and Results A total of 340 subjects 55 to 75 years of age were grouped according to the American Diabetes Association 2003 criteria of normal glucose tolerance, pre-diabetes, and diabetes mellitus. The Kruskal-Wallis test demonstrated significant differences in plasma IgE levels (P = 0.008) among groups with different glucose tolerance status. Linear regression analysis revealed significant correlations between plasma levels of chymase (P = 0.030) or IgE (P = 0.022) and diabetes mellitus. Ordinal logistic regression analysis showed that IgE was a significant risk factor of pre-diabetes and diabetes mellitus (odds ratio [OR]: 1.674, P = 0.034). After adjustment for common diabetes risk factors, including age, sex, hypertension, body-mass index, cholesterol, homeostatic model assessment (HOMA) index, high-sensitivity C-reactive protein (hs-CRP), and mast cell chymase and tryptase, IgE remained a significant risk factor (OR: 1.866, P = 0.015). Two-variable ordinal logistic analysis indicated that interactions between hs-CRP and IgE, or between IgE and chymase, increased further the risks of developing pre-diabetes and diabetes mellitus before (OR: 2.204, P = 0.044; OR: 2.479, P = 0.033) and after (OR: 2.251, P = 0.040; OR: 2.594, P = 0.026) adjustment for common diabetes risk factors. Conclusions Both IgE and chymase associate with diabetes status. While IgE and hs-CRP are individual risk factors of pre-diabetes and diabetes mellitus, interactions of IgE with hs-CRP or with chymase further increased the risk of pre-diabetes and diabetes mellitus. PMID:22194960
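The ordinal logistic step described above can be sketched with statsmodels' OrderedModel: glucose tolerance status (normal, pre-diabetes, diabetes) is regressed on IgE and adjustment covariates, and odds ratios are obtained by exponentiating the coefficients. Variable names and the simulated data are illustrative assumptions, not the study data.

```python
# Sketch only: ordinal (proportional odds) logistic regression of glucose
# tolerance status on IgE plus adjustment covariates; data are simulated.
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(1)
n = 340
df = pd.DataFrame({
    "ige": rng.lognormal(3.0, 1.0, size=n),
    "age": rng.integers(55, 76, size=n).astype(float),
    "bmi": rng.normal(26, 4, size=n),
    "hs_crp": rng.lognormal(0.5, 0.8, size=n),
})
latent = 0.002 * df["ige"] + 0.02 * (df["age"] - 55) + rng.normal(size=n)
status = pd.cut(latent, bins=[-np.inf, 0.5, 1.2, np.inf], labels=[0, 1, 2])

model = OrderedModel(status, df[["ige", "age", "bmi", "hs_crp"]], distr="logit")
res = model.fit(method="bfgs", disp=False)

# Odds ratios for the predictors (threshold parameters excluded)
print(np.exp(res.params[["ige", "age", "bmi", "hs_crp"]]))
```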
Utilizing Infant Cry Acoustics to Determine Gestational Age.
Sahin, Mustafa; Sahin, Suzan; Sari, Fatma N; Tatar, Emel C; Uras, Nurdan; Oguz, Suna S; Korkmaz, Mehmet H
2017-07-01
The date of the last menstruation period and ultrasonography are the most commonly used methods to determine gestational age (GA). However, if these data are not clear, some scoring systems performed after birth can be used. The New Ballard Score (NBS) is a commonly used method for estimation of GA. Cry sound may reflect the developmental integrity of the infant. The aim of this study was to evaluate the connection between infants' GA and some acoustic parameters of the infant cry. A prospective single-blind study was carried out in which medically stable infants without any congenital craniofacial anomalies were evaluated. During routine blood sampling, cry sounds were recorded and acoustic analysis was performed. Stepwise multiple linear regression analysis was performed. The data of 116 infants (57 female, 59 male) with known GA (34.6 ± 3.8 weeks) and Apgar scores higher than 5 were evaluated. The real GA was significantly and well correlated with the estimated GA according to the NBS, F0, Int, Jitt, and latency parameters. The obtained stepwise linear regression model was formulized as GA = (31.169) - (0.020 × F0) + (0.286 × GA according to NBS) - (0.003 × Latency) + (0.108 × Int) - (0.367 × Jitt). The real GA could be determined with a ratio of 91.7% using this model. We determined that adding F0, Int, Jitt, and latency to the NBS increases the power of GA estimation. This simple formula can be used to determine GA in clinical practice, but the validity of such prediction formulas needs to be further tested. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
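The reported prediction equation can be applied directly; the short function below encodes the published coefficients. The example input values and the assumed units (Hz for F0, ms for latency, dB for intensity, % for jitter) are illustrative guesses, not values from the study.

```python
# Direct application of the regression equation reported in the abstract.
# The example inputs and assumed units are illustrative, not study values.
def estimate_ga_weeks(f0, ga_nbs, latency, intensity, jitter):
    """Gestational age (weeks) from cry acoustics and New Ballard Score."""
    return (31.169
            - 0.020 * f0          # fundamental frequency (assumed Hz)
            + 0.286 * ga_nbs      # GA estimated by NBS (weeks)
            - 0.003 * latency     # latency (assumed ms)
            + 0.108 * intensity   # intensity (assumed dB)
            - 0.367 * jitter)     # jitter (assumed %)

print(round(estimate_ga_weeks(f0=450, ga_nbs=35, latency=900, intensity=70, jitter=1.5), 1))
```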
Laptook, Abbot R.; Shankaran, Seetha; Ambalavanan, Namasivayam; Carlo, Waldemar A.; McDonald, Scott A.; Higgins, Rosemary D.; Das, Abhik
2010-01-01
Context Death or severe disability is so common following an Apgar score of 0 at 10 minutes in observational studies that the Neonatal Resuscitation Program suggests considering discontinuation of resuscitation after 10 minutes of effective CPR. Objective To determine if Apgar scores at 10 minutes are associated with death or disability in early childhood following perinatal hypoxic-ischemic encephalopathy (HIE). Design, Setting, and Patients This is a secondary analysis of infants enrolled in the NICHD Neonatal Research Network hypothermia trial. Infants ≥ 36 weeks gestation had clinical and/or biochemical abnormalities at birth, and encephalopathy at < 6 hours. Logistic regression and classification and regression tree (CART) analysis was used to determine associations between Apgar scores at 10 minutes and neurodevelopmental outcome adjusting for covariates. Associations are expressed as odds ratios (OR) and 95% confidence interval (CI). Main Outcome Measure Death or disability (moderate or severe) at 18–22 months of age. Results Twenty of 208 infants were excluded (missing data). More than 90% of infants had Apgar scores of 0–2 at 1 minute and Apgars at 5 and 10 minutes shifted to progressively higher values; at 10 minutes 27% of infants had Apgar scores of 0–2. After adjustment each point decrease in Apgar score at 10 minutes was associated with a 45% increase in the odds of death or disability (OR 1.45, CI 1.22–1.72). Death or disability occurred in 76, 82 and 80% of infants with Apgar scores at 10 minutes of 0, 1 and 2, respectively. CART analysis indicated that Apgar scores at 10 minutes were discriminators of outcome. Conclusion Apgar scores at 10 minutes provide useful prognostic data before other evaluations are available for infants with HIE. Death or moderate/severe disability is common but not uniform with Apgar scores < 3; caution is needed before adopting a specific time interval to guide duration of resuscitation. PMID:19948631
Work and family transitions and the self-rated health of young women in South Africa.
Bennett, Rachel; Waterhouse, Philippa
2018-04-01
Understanding the transition to adulthood has important implications for supporting young adults and understanding the roots of diversity in wellbeing later in life. In South Africa, the end of Apartheid means today's youth are experiencing their transition to adulthood in a changed social and political context which offers opportunities compared to the past but also threats. This paper presents the first national level analysis of the patterning of key transitions (completion of education, entry into the labour force, motherhood and marriage or cohabitation), and the association between the different pathways and health amongst young women. With the use of longitudinal data from the South African National Income Dynamics Study (2008-2015), this paper employs sequence analysis to identify common pathways to adulthood amongst women aged 15-17 years at baseline (n = 429) and logistic regression modelling to examine the association between these pathways and self-rated health. The sequence analysis identified five pathways: 1. 'Non-activity commonly followed by motherhood', 2. 'Pathway from school, motherhood then work', 3. 'Motherhood combined with schooling', 4. 'Motherhood after schooling', and 5. 'Schooling to non-activity'. After controlling for baseline socio-economic and demographic characteristics and health, the regression results show young women who followed pathways characterised by early motherhood and economic inactivity (1, 3 and 4) had poorer self-rated health compared to women whose pathways were characterised by combining motherhood and economic activity (2) and young women who were yet to become economically active or mothers (5). Therefore, policies should seek to prevent adolescent childbearing, support young mothers to continue their educational careers and enable mothers in work and seeking work to balance their work and care responsibilities. Further, the findings highlight the value of taking a holistic approach to health and provide further evidence for the need to consider work-family balance in the development agenda. Copyright © 2018 Elsevier Ltd. All rights reserved.
Zhou, Qing; Yu, Ting; Liu, Yuan; Shi, Ruifen; Tian, Suping; Yang, Chaoxia; Gan, Huaxiu; Zhu, Yanying; Liang, Xia; Wang, Ling; Wu, Zhenhua; Huang, Jinping; Hu, Ailing
2018-02-01
To ascertain the pressure ulcer prevalence in secondary and tertiary general hospitals in different areas of Guangdong Province in China and explore the possible risk factors that are related to pressure ulcers. Few multicentre studies have been conducted on pressure ulcer prevalence in Chinese hospitals. A cross-sectional study design was used. Data from a total of 25,264 patients at 25 hospitals in China were included in the analysis. The investigators were divided into two groups. The investigators in group 1 examined the patients' skin; when a pressure ulcer was found, a pressure ulcer assessment form was completed. The investigators in group 2 provided guidance to the nurses, who assessed all patients and completed another questionnaire. A multivariate logistic regression analysis was used to analyse the relationship between the possible risk factors and pressure ulcers. The overall prevalence rate of pressure ulcers in the 25 hospitals ranged from 0%-3.49%, with a mean of 1.26%. The most common stage of the pressure ulcers was stage II (41.4%); the most common anatomical locations were the sacrum (39.5%) and the feet (16.4%). In the multivariate logistic regression analysis, Braden score (p < .001), expected length of stay (p < .001), incontinence (p < .001), care group (p = .011), hospital location (p < .001), type of hospital (p = .004), and patient age (p < .001) were associated with pressure ulcers. The overall prevalence rate of pressure ulcers in Chinese hospitals was lower than that reported in previous investigations. Specific characteristics of pressure ulcer patients were as follows: low Braden score, longer expected length of stay, double incontinence, an ICU or medical ward, hospital location in the Pearl River Delta, a university hospital, and older age. The survey can help managers know their hospitals' pressure ulcer prevalence and set priorities for clinical nurses. © 2017 John Wiley & Sons Ltd.
Short-term outcome of 1,465 computer-navigated primary total knee replacements 2005-2008.
Gøthesen, Oystein; Espehaug, Birgitte; Havelin, Leif; Petursson, Gunnar; Furnes, Ove
2011-06-01
Background and purpose: Improvement of positioning and alignment by the use of computer-assisted surgery (CAS) might improve longevity and function in total knee replacements, but there is little evidence. In this study, we evaluated the short-term results of computer-navigated knee replacements based on data from the Norwegian Arthroplasty Register. Primary total knee replacements without patella resurfacing, reported to the Norwegian Arthroplasty Register during the years 2005-2008, were evaluated. The 5 most common implants and the 3 most common navigation systems were selected. Cemented, uncemented, and hybrid knees were included. With the risk of revision for any cause as the primary endpoint and intraoperative complications and operating time as secondary outcomes, 1,465 computer-navigated knee replacements (CAS) and 8,214 conventionally operated knee replacements (CON) were compared. Kaplan-Meier survival analysis and Cox regression analysis with adjustment for age, sex, prosthesis brand, fixation method, previous knee surgery, preoperative diagnosis, and ASA category were used. Kaplan-Meier estimated survival at 2 years was 98% (95% CI: 97.5-98.3) in the CON group and 96% (95% CI: 95.0-97.8) in the CAS group. The adjusted Cox regression analysis showed a higher risk of revision in the CAS group (RR = 1.7, 95% CI: 1.1-2.5; p = 0.02). The LCS Complete knee had a higher risk of revision with CAS than with CON (RR = 2.1, 95% CI: 1.3-3.4; p = 0.004). The differences were not statistically significant for the other prosthesis brands. Mean operating time was 15 min longer in the CAS group. With the introduction of computer-navigated knee replacement surgery in Norway, the short-term risk of revision has increased for computer-navigated replacement with the LCS Complete. The mechanisms of failure of these implantations should be explored in greater depth, and in this study we have not been able to draw conclusions regarding causation.
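A minimal sketch of the kind of adjusted Cox regression reported above, using the lifelines package: the hazard ratio for computer navigation (CAS) is estimated while adjusting for covariates. The synthetic data frame and column names are assumptions for illustration only.

```python
# Sketch only: adjusted Cox regression of time to revision on a navigation
# indicator plus covariates, using the lifelines package. Data are synthetic.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2)
n = 2000
df = pd.DataFrame({
    "years": rng.exponential(10, size=n).clip(max=4.0),    # follow-up, censored at 4 years
    "revised": rng.binomial(1, 0.03, size=n),               # 1 = revision observed
    "cas": rng.binomial(1, 0.15, size=n),                   # 1 = computer-navigated
    "age": rng.normal(69, 9, size=n),
    "male": rng.binomial(1, 0.4, size=n),
})

cph = CoxPHFitter()
cph.fit(df, duration_col="years", event_col="revised")
print(cph.hazard_ratios_["cas"])    # adjusted hazard ratio for CAS vs CON
```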
NASA Astrophysics Data System (ADS)
Takahashi, Tomoko; Thornton, Blair
2017-12-01
This paper reviews methods to compensate for matrix effects and self-absorption during quantitative analysis of compositions of solids measured using Laser Induced Breakdown Spectroscopy (LIBS) and their applications to in-situ analysis. Methods to reduce matrix and self-absorption effects on calibration curves are first introduced. The conditions where calibration curves are applicable to quantification of compositions of solid samples and their limitations are discussed. While calibration-free LIBS (CF-LIBS), which corrects matrix effects theoretically based on the Boltzmann distribution law and Saha equation, has been applied in a number of studies, requirements need to be satisfied for the calculation of chemical compositions to be valid. Also, peaks of all elements contained in the target need to be detected, which is a bottleneck for in-situ analysis of unknown materials. Multivariate analysis techniques are gaining momentum in LIBS analysis. Among the available techniques, principal component regression (PCR) analysis and partial least squares (PLS) regression analysis, which can extract composition-related information from all spectral data, are widely established methods and have been applied to various fields including in-situ applications in air and for planetary explorations. Artificial neural networks (ANNs), where non-linear effects can be modelled, have also been investigated as a quantitative method and their applications are introduced. The ability to make quantitative estimates based on LIBS signals is seen as a key element for the technique to gain wider acceptance as an analytical method, especially in in-situ applications. In order to accelerate this process, it is recommended that accuracy be described using common figures of merit which express the overall normalised accuracy, such as the normalised root mean square error (NRMSE), when comparing the accuracy obtained from different setups and analytical methods.
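Of the multivariate methods reviewed above, principal component regression is straightforward to sketch: spectra are compressed with PCA and the scores are regressed on concentration, with accuracy summarised as an NRMSE (here normalised by the observed range, one common convention). The data, dimensions, and number of components are invented for illustration.

```python
# Sketch only: principal component regression (PCA scores + linear regression)
# with accuracy reported as an NRMSE normalised by the observed range.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
spectra = rng.normal(size=(200, 1000))           # stand-in LIBS spectra
conc = rng.uniform(0, 50, size=200)              # stand-in concentrations (wt%)

X_tr, X_te, y_tr, y_te = train_test_split(spectra, conc, random_state=0)

pcr = make_pipeline(PCA(n_components=10), LinearRegression())
pcr.fit(X_tr, y_tr)
pred = pcr.predict(X_te)

rmse = np.sqrt(np.mean((y_te - pred) ** 2))
nrmse = rmse / (y_te.max() - y_te.min())
print(f"NRMSE = {nrmse:.3f}")
```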
Soleimani, Robabeh; Salehi, Zivar; Soltanipour, Soheil; Hasandokht, Tolou; Jalali, Mir Mohammad
2018-04-01
Methylphenidate (MPH) is the most commonly used treatment for attention-deficit hyperactivity disorder (ADHD) in children. However, the response to MPH is not similar in all patients. This meta-analysis investigated the potential role of SLC6A3 polymorphisms in response to MPH in children with ADHD. Clinical trials or naturalistic studies were selected from electronic databases. A meta-analysis was conducted using a random-effects model. Cohen's d effect size and 95% confidence intervals (CIs) were determined. Sensitivity analysis and meta-regression were performed. Q-statistic and Egger's tests were conducted to evaluate heterogeneity and publication bias, respectively. The Grading of Recommendations Assessment, Development and Evaluation (GRADE) system was used to assess the quality of evidence. Sixteen studies with follow-up periods of 1-28 weeks were eligible. The mean treatment acceptability of MPH was 97.2%. In contrast to clinical trials, the meta-analysis of naturalistic studies indicated that children without 10/10 repeat carriers had better response to MPH (Cohen's d: -0.09 and 0.44, respectively). The 9/9 repeat polymorphism had no effect on the response rate (Cohen's d: -0.43). In the meta-regression, a significant association was observed between baseline severity of ADHD, MPH dosage, and combined type of ADHD in some genetic models. Sensitivity analysis indicated the robustness of our findings. No publication bias was observed in our meta-analysis. The GRADE evaluations revealed very low levels of confidence for each outcome of response to MPH. The results of clinical trials and naturalistic studies regarding the effect size between different polymorphisms of SLC6A3 were contradictory. Therefore, further research is recommended. © 2017 Wiley Periodicals, Inc.
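A compact sketch of random-effects pooling of study-level effect sizes, the general approach used in the meta-analysis above, with the DerSimonian-Laird estimator of between-study variance. The effect sizes and variances are invented for illustration; the study's actual models and moderators are not reproduced.

```python
# Sketch only: DerSimonian-Laird random-effects pooling of study-level
# Cohen's d values. Effect sizes and variances are invented for illustration.
import numpy as np

d = np.array([-0.20, 0.35, 0.10, -0.05, 0.50])   # per-study Cohen's d
v = np.array([0.04, 0.06, 0.03, 0.05, 0.08])     # per-study sampling variances

w = 1.0 / v
d_fixed = np.sum(w * d) / np.sum(w)

q = np.sum(w * (d - d_fixed) ** 2)               # Cochran's Q
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (q - (len(d) - 1)) / c)          # between-study variance

w_re = 1.0 / (v + tau2)
d_pooled = np.sum(w_re * d) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))
print(f"pooled d = {d_pooled:.2f} (95% CI {d_pooled - 1.96*se:.2f} to {d_pooled + 1.96*se:.2f})")
```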
Determining degree of optic nerve edema from color fundus photography
NASA Astrophysics Data System (ADS)
Agne, Jason; Wang, Jui-Kai; Kardon, Randy H.; Garvin, Mona K.
2015-03-01
Swelling of the optic nerve head (ONH) is subjectively assessed by clinicians using the Frisén scale. It is believed that a direct measurement of the ONH volume would serve as a better representation of the swelling. However, a direct measurement requires optic nerve imaging with spectral domain optical coherence tomography (SD-OCT) and 3D segmentation of the resulting images, which is not always available during clinical evaluation. Furthermore, telemedical imaging of the eye at remote locations is more feasible with non-mydriatic fundus cameras which are less costly than OCT imagers. Therefore, there is a critical need to develop a more quantitative analysis of optic nerve swelling on a continuous scale, similar to SD-OCT. Here, we select features from more commonly available 2D fundus images and use them to predict ONH volume. Twenty-six features were extracted from each of 48 color fundus images. The features include attributes of the blood vessels, optic nerve head, and peripapillary retina areas. These features were used in a regression analysis to predict ONH volume, as computed by a segmentation of the SD-OCT image. The results of the regression analysis yielded a mean square error of 2.43 mm³ and a correlation coefficient between computed and predicted volumes of R = 0.771, which suggests that ONH volume may be predicted from fundus features alone.
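The regression step described above can be sketched as a cross-validated linear model mapping fundus features to OCT-derived ONH volume, reporting mean squared error and the correlation between computed and predicted volumes. The feature matrix and volumes below are synthetic stand-ins, not the study data.

```python
# Sketch only: leave-one-out cross-validated linear regression from fundus
# features to OCT-derived ONH volume; data are synthetic stand-ins.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(4)
features = rng.normal(size=(48, 26))              # 26 features from 48 fundus images
volume = 5 + features[:, 0] + 0.5 * features[:, 1] + rng.normal(0, 1, size=48)

pred = cross_val_predict(LinearRegression(), features, volume, cv=LeaveOneOut())
mse = np.mean((volume - pred) ** 2)
r, _ = pearsonr(volume, pred)
print(f"MSE = {mse:.2f} mm^3, R = {r:.3f}")
```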
A general framework for the regression analysis of pooled biomarker assessments.
Liu, Yan; McMahan, Christopher; Gallagher, Colin
2017-07-10
As a cost-efficient data collection mechanism, the process of assaying pooled biospecimens is becoming increasingly common in epidemiological research; for example, pooling has been proposed for the purpose of evaluating the diagnostic efficacy of biological markers (biomarkers). To this end, several authors have proposed techniques that allow for the analysis of continuous pooled biomarker assessments. Regrettably, most of these techniques proceed under restrictive assumptions, are unable to account for the effects of measurement error, and fail to control for confounding variables. These limitations are understandably attributable to the complex structure that is inherent to measurements taken on pooled specimens. Consequently, in order to provide practitioners with the tools necessary to accurately and efficiently analyze pooled biomarker assessments, herein, a general Monte Carlo maximum likelihood-based procedure is presented. The proposed approach allows for the regression analysis of pooled data under practically all parametric models and can be used to directly account for the effects of measurement error. Through simulation, it is shown that the proposed approach can accurately and efficiently estimate all unknown parameters and is more computationally efficient than existing techniques. This new methodology is further illustrated using monocyte chemotactic protein-1 data collected by the Collaborative Perinatal Project in an effort to assess the relationship between this chemokine and the risk of miscarriage. Copyright © 2017 John Wiley & Sons, Ltd.
Tyllianakis, Emmanouil; Skuras, Dimitris
2016-11-01
The income elasticity of Willingness-To-Pay (WTP) is ambiguous and results from meta-analyses are disparate. This may be because the environmental good or service to be valued is very broadly defined or because the income measured in individual studies suffers from extensive non-reporting or misreporting. The present study carries out a meta-analysis of WTP to restore Good Ecological Status (GES) under the Water Framework Directive (WFD). This environmental service is narrowly defined and its aims and objectives are commonly understood among the members of the scientific community. Besides income reported by the individual studies, wealth and income indicators collected by Eurostat for the geographic entities covered by the individual studies are used. Meta-regression analyses show that income is statistically significant, explains a substantial proportion of WTP variability, and has a considerable elasticity, ranging from 0.6 to almost 1.7. Results are robust to variations in the sample of the individual studies participating in the meta-analysis, the econometric approach, and the functional form of the meta-regression. The choice of wealth or income measure matters less than whether this measure is Purchasing Power Parity (PPP) adjusted across the individual studies. Copyright © 2016 Elsevier Ltd. All rights reserved.
Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha
2012-05-01
Estimation of stature is an important parameter in identification of human remains in forensic examinations. The present study is aimed to compare the reliability and accuracy of stature estimation and to demonstrate the variability in estimated stature and actual stature using multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements; hand length, hand breadth, foot length and foot breadth taken on the left side in each subject were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. Derived multiplication factors and regression formula were applied to the hand and foot measurements in the study sample. The estimated stature from the multiplication factors and regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in estimation of stature from regression analysis method is less than that of multiplication factor method thus, confirming that the regression analysis method is better than multiplication factor analysis in stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
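The two estimation approaches compared above can be contrasted with a small simulation: a multiplication factor (mean stature divided by mean foot length) versus a fitted least-squares regression line, with the error of each estimate computed against the known stature. The measurements below are simulated, not the study sample.

```python
# Sketch only: multiplication factor versus linear regression for stature
# estimation from foot length, on simulated measurements.
import numpy as np

rng = np.random.default_rng(5)
n = 123
foot = rng.normal(24.0, 1.5, size=n)                     # foot length (cm)
stature = 60 + 4.2 * foot + rng.normal(0, 3, size=n)     # stature (cm)

mf = stature.mean() / foot.mean()                        # multiplication factor
est_mf = mf * foot

slope, intercept = np.polyfit(foot, stature, 1)          # least-squares regression
est_reg = intercept + slope * foot

print("mean absolute error, multiplication factor:", round(np.mean(np.abs(stature - est_mf)), 2))
print("mean absolute error, regression:", round(np.mean(np.abs(stature - est_reg)), 2))
```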
Brezonik, Patrick L; Stadelmann, Teresa H
2002-04-01
Urban nonpoint source pollution is a significant contributor to water quality degradation. Watershed planners need to be able to estimate nonpoint source loads to lakes and streams if they are to plan effective management strategies. To meet this need for the Twin Cities metropolitan area, a large database of urban and suburban runoff data was compiled. Stormwater runoff loads and concentrations of 10 common constituents (six N and P forms, TSS, VSS, COD, Pb) were characterized, and effects of season and land use were analyzed. Relationships between runoff variables and storm and watershed characteristics were examined. The best regression equation to predict runoff volume for rain events was based on rainfall amount, drainage area, and percent impervious area (R² = 0.78). Median event-mean concentrations (EMCs) tended to be higher in snowmelt runoff than in rainfall runoff, and significant seasonal differences were found in yields (kg/ha) and EMCs for most constituents. Simple correlations between explanatory variables and stormwater loads and EMCs were weak. Rainfall amount and intensity and drainage area were the most important variables in multiple linear regression models to predict event loads, but uncertainty was high in models developed with the pooled data set. The most accurate models for EMCs generally were found when sites were grouped according to common land use and size.
Determination of Flavonoids in Wine by High Performance Liquid Chromatography
NASA Astrophysics Data System (ADS)
da Queija, Celeste; Queirós, M. A.; Rodrigues, Ligia M.
2001-02-01
The experiment presented is an application of HPLC to the analysis of flavonoids in wines, designed for students of instrumental methods. It is done in two successive 4-hour laboratory sessions. While the hydrolysis of the wines is in progress, the students prepare the calibration curves with standard solutions of flavonoids and calculate the regression lines and correlation coefficients. During the second session they analyze the hydrolyzed wine samples and calculate the concentrations of the flavonoids using the calibration curves obtained earlier. This laboratory work is very attractive to students because they deal with a common daily product whose components are reported to have preventive and therapeutic effects. Furthermore, students can execute preparative work and apply a more elaborate technique that is nowadays an indispensable tool in instrumental analysis.
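The calibration-curve step the students perform can be sketched as a simple linear regression of peak area on standard concentration, followed by back-calculation of a sample concentration. The standard concentrations and peak areas below are illustrative numbers, not values from the experiment.

```python
# Sketch only: a calibration line fitted to flavonoid standards, its correlation
# coefficient, and back-calculation of a sample concentration from peak area.
import numpy as np

std_conc = np.array([5.0, 10.0, 20.0, 40.0, 80.0])       # standards (mg/L)
peak_area = np.array([12.1, 24.5, 48.0, 97.3, 195.6])    # detector response

slope, intercept = np.polyfit(std_conc, peak_area, 1)
r = np.corrcoef(std_conc, peak_area)[0, 1]

sample_area = 66.4
sample_conc = (sample_area - intercept) / slope
print(f"r = {r:.4f}, sample concentration = {sample_conc:.1f} mg/L")
```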
Mohamed, Dhibi; Lotfi, Belkacem
2016-12-01
In this study, the Manchester Driver Behaviour Questionnaire (DBQ) was used to examine the self-reported driving behaviours of a group of Tunisian drivers (N = 900) and to collect socio-demographic data, driver behaviours, and DBQ items. A sample of Tunisian drivers aged over 18 years was selected. The aim of the present study was to investigate the factorial structure of the DBQ in Tunisia. Principal component analysis identified a three-factor solution: inattention errors, dangerous errors, and dangerous violations. Logistic regression analysis showed that dangerous errors, dangerous violations, and speeding preference factors predicted crash involvement in Tunisia. Speeding is the most common form of aberrant behaviour reported by drivers in the current sample. It remains one of the major road safety concerns.
Parsimonious nonstationary flood frequency analysis
NASA Astrophysics Data System (ADS)
Serago, Jake M.; Vogel, Richard M.
2018-02-01
There is now widespread awareness of the impact of anthropogenic influences on extreme floods (and droughts) and thus an increasing need for methods to account for such influences when estimating a frequency distribution. We introduce a parsimonious approach to nonstationary flood frequency analysis (NFFA) based on a bivariate regression equation which describes the relationship between annual maximum floods, x, and an exogenous variable which may explain the nonstationary behavior of x. The conditional mean, variance and skewness of both x and y = ln (x) are derived, and combined with numerous common probability distributions including the lognormal, generalized extreme value and log Pearson type III models, resulting in a very simple and general approach to NFFA. Our approach offers several advantages over existing approaches including: parsimony, ease of use, graphical display, prediction intervals, and opportunities for uncertainty analysis. We introduce nonstationary probability plots and document how such plots can be used to assess the improved goodness of fit associated with a NFFA.
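A minimal sketch of the regression backbone of this approach, under assumptions chosen for illustration: annual maximum floods are log-transformed and regressed on an exogenous covariate, and the resulting conditional mean and standard deviation of ln(x) are combined with a lognormal assumption to produce a covariate-conditional quantile. The data, covariate, and flow units are synthetic, not from any gauge record.

```python
# Sketch only: ln(annual maximum flood) regressed on an exogenous covariate,
# then combined with a lognormal assumption to give a conditional quantile.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
years = np.arange(1950, 2018)
covariate = 0.01 * (years - 1950) + rng.normal(0, 0.1, size=len(years))  # e.g. an urbanisation index
ln_x = 5.0 + 1.2 * covariate + rng.normal(0, 0.3, size=len(years))       # ln of annual maximum flood

b1, b0 = np.polyfit(covariate, ln_x, 1)          # conditional mean of ln(x)
resid = ln_x - (b0 + b1 * covariate)
sigma = resid.std(ddof=2)                        # conditional standard deviation of ln(x)

future_cov = 0.9                                 # hypothetical future covariate value
q100 = np.exp(b0 + b1 * future_cov + norm.ppf(0.99) * sigma)
print(f"conditional 100-year flood quantile = {q100:.0f} (arbitrary flow units)")
```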
Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis
ERIC Educational Resources Information Center
Williams, Ryan T.
2012-01-01
Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis, however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…
A Quality Assessment Tool for Non-Specialist Users of Regression Analysis
ERIC Educational Resources Information Center
Argyrous, George
2015-01-01
This paper illustrates the use of a quality assessment tool for regression analysis. It is designed for non-specialist "consumers" of evidence, such as policy makers. The tool provides a series of questions such consumers of evidence can ask to interrogate regression analysis, and is illustrated with reference to a recent study published…
Zarb, Francis; McEntee, Mark F; Rainford, Louise
2015-06-01
To evaluate visual grading characteristics (VGC) and ordinal regression analysis during head CT optimisation as a potential alternative to visual grading assessment (VGA), traditionally employed to score anatomical visualisation. Patient images (n = 66) were obtained using current and optimised imaging protocols from two CT suites: a 16-slice scanner at the national Maltese centre for trauma and a 64-slice scanner in a private centre. Local resident radiologists (n = 6) performed VGA followed by VGC and ordinal regression analysis. VGC alone indicated that optimised protocols had similar image quality as current protocols. Ordinal logistic regression analysis provided an in-depth evaluation, criterion by criterion allowing the selective implementation of the protocols. The local radiology review panel supported the implementation of optimised protocols for brain CT examinations (including trauma) in one centre, achieving radiation dose reductions ranging from 24 % to 36 %. In the second centre a 29 % reduction in radiation dose was achieved for follow-up cases. The combined use of VGC and ordinal logistic regression analysis led to clinical decisions being taken on the implementation of the optimised protocols. This improved method of image quality analysis provided the evidence to support imaging protocol optimisation, resulting in significant radiation dose savings. • There is need for scientifically based image quality evaluation during CT optimisation. • VGC and ordinal regression analysis in combination led to better informed clinical decisions. • VGC and ordinal regression analysis led to dose reductions without compromising diagnostic efficacy.
Flood quantile estimation at ungauged sites by Bayesian networks
NASA Astrophysics Data System (ADS)
Mediero, L.; Santillán, D.; Garrote, L.
2012-04-01
Estimating flood quantiles at a site for which no observed measurements are available is essential for water resources planning and management. Ungauged sites have no observations about the magnitude of floods, but some site and basin characteristics are known. The most common technique used is multiple regression analysis, which relates physical and climatic basin characteristics to flood quantiles. Regression equations are fitted from flood frequency data and basin characteristics at gauged sites. Regression equations are a rigid technique that assumes linear relationships between variables and cannot take measurement errors into account. In addition, the prediction intervals are estimated in a very simplistic way from the variance of the residuals in the estimated model. Bayesian networks are a probabilistic computational structure taken from the field of Artificial Intelligence, which has been widely and successfully applied to many scientific fields like medicine and informatics, but whose application to the field of hydrology is recent. Bayesian networks infer the joint probability distribution of several related variables from observations through nodes, which represent random variables, and links, which represent causal dependencies between them. Bayesian networks are more flexible than regression equations, as they capture non-linear relationships between variables. In addition, the probabilistic nature of Bayesian networks allows the different sources of estimation uncertainty to be taken into account, as they give a probability distribution as the result. A homogeneous region in the Tagus Basin was selected as a case study. A regression equation was fitted taking the basin area, the annual maximum 24-hour rainfall for a given recurrence interval, and the mean height as explanatory variables. Flood quantiles at ungauged sites were estimated by Bayesian networks. Bayesian networks need to be learnt from a sufficiently large data set; as observational data were scarce, a stochastic generator of synthetic data was developed. Synthetic basin characteristics were randomised, keeping the statistical properties of observed physical and climatic variables in the homogeneous region. The synthetic flood quantiles were stochastically generated using the regression equation as a basis. The learnt Bayesian network was validated by the reliability diagram, the Brier score, and the ROC diagram, which are common measures used in the validation of probabilistic forecasts. In summary, flood quantile estimation through Bayesian networks supplies information about prediction uncertainty, as a probability distribution of discharges is given as the result. Therefore, the Bayesian network model can serve as a decision support tool for water resources planning and management.
VanEngelsdorp, Dennis; Speybroeck, Niko; Evans, Jay D; Nguyen, Bach Kim; Mullin, Chris; Frazier, Maryann; Frazier, Jim; Cox-Foster, Diana; Chen, Yanping; Tarpy, David R; Haubruge, Eric; Pettis, Jeffrey S; Saegerman, Claude
2010-10-01
Colony collapse disorder (CCD), a syndrome whose defining trait is the rapid loss of adult worker honey bees, Apis mellifera L., is thought to be responsible for a minority of the large overwintering losses experienced by U.S. beekeepers since the winter 2006-2007. Using the same data set developed to perform a monofactorial analysis (PloS ONE 4: e6481, 2009), we conducted a classification and regression tree (CART) analysis in an attempt to better understand the relative importance and interrelations among different risk variables in explaining CCD. Fifty-five exploratory variables were used to construct two CART models: one model with and one model without a cost of misclassifying a CCD-diagnosed colony as a non-CCD colony. The resulting model tree that permitted for misclassification had a sensitivity and specificity of 85 and 74%, respectively. Although factors measuring colony stress (e.g., adult bee physiological measures, such as fluctuating asymmetry or mass of head) were important discriminating values, six of the 19 variables having the greatest discriminatory value were pesticide levels in different hive matrices. Notably, coumaphos levels in brood (a miticide commonly used by beekeepers) had the highest discriminatory value and were highest in control (healthy) colonies. Our CART analysis provides evidence that CCD is probably the result of several factors acting in concert, making afflicted colonies more susceptible to disease. This analysis highlights several areas that warrant further attention, including the effect of sublethal pesticide exposure on pathogen prevalence and the role of variability in bee tolerance to pesticides on colony survivorship.
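The cost-sensitive tree described above can be approximated with class weights in scikit-learn's decision tree; the sketch below fits such a tree and reports sensitivity and specificity from the confusion matrix. The features, labels, tree depth, and the weight placed on misclassifying CCD colonies are illustrative assumptions, not the study's settings.

```python
# Sketch only: a classification tree with a heavier penalty (via class weights)
# for misclassifying CCD colonies, reporting sensitivity and specificity.
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 55))                 # 55 exploratory variables (stand-ins)
y = rng.binomial(1, 0.4, size=300)             # 1 = CCD-diagnosed colony

tree = DecisionTreeClassifier(max_depth=4, class_weight={0: 1, 1: 5}, random_state=0)
tree.fit(X, y)
pred = tree.predict(X)

tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
print(f"sensitivity = {tp / (tp + fn):.2f}, specificity = {tn / (tn + fp):.2f}")
```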
Ushida, Keisuke; McGrath, Colman P; Lo, Edward C M; Zwahlen, Roger A
2015-07-24
Even though oral cavity cancer (OCC; ICD 10 codes C01, C02, C03, C04, C05, and C06) ranks eleventh among the world's most common cancers, accounting for approximately 2 % of all cancers, a trend analysis of OCC in Hong Kong is lacking. Hong Kong has experienced rapid economic growth with socio-cultural and environmental change after the Second World War. This together with the collected data in the cancer registry provides interesting ground for an epidemiological study on the influence of socio-cultural and environmental factors on OCC etiology. A multidirectional statistical analysis of the OCC trends over the past 25 years was performed using the databases of the Hong Kong Cancer Registry. The age, period, and cohort (APC) modeling was applied to determine age, period, and cohort effects on OCC development. Joinpoint regression analysis was used to find secular trend changes of both age-standardized and age-specific incidence rates. The APC model detected that OCC development in men was mainly dominated by the age effect, whereas in women an increasing linear period effect together with an age effect became evident. The joinpoint regression analysis showed a general downward trend of age-standardized incidence rates of OCC for men during the entire investigated period, whereas women demonstrated a significant upward trend from 2001 onwards. The results suggest that OCC incidence in Hong Kong appears to be associated with cumulative risk behaviors of the population, despite considerable socio-cultural and environmental changes after the Second World War.
Tanaka, Tomohiro; Voigt, Michael D
2018-03-01
Non-melanoma skin cancer (NMSC) is the most common de novo malignancy in liver transplant (LT) recipients; it behaves more aggressively and it increases mortality. We used decision tree analysis to develop a tool to stratify and quantify risk of NMSC in LT recipients. We performed Cox regression analysis to identify which predictive variables to enter into the decision tree analysis. Data were from the Organ Procurement Transplant Network (OPTN) STAR files of September 2016 (n = 102984). NMSC developed in 4556 of the 105984 recipients, a mean of 5.6 years after transplant. The 5/10/20-year rates of NMSC were 2.9/6.3/13.5%, respectively. Cox regression identified male gender, Caucasian race, age, body mass index (BMI) at LT, and sirolimus use as key predictive or protective factors for NMSC. These factors were entered into a decision tree analysis. The final tree stratified non-Caucasians as low risk (0.8%), and Caucasian males > 47 years, BMI < 40 who did not receive sirolimus, as high risk (7.3% cumulative incidence of NMSC). The predictions in the derivation set were almost identical to those in the validation set (r 2 = 0.971, p < 0.0001). Cumulative incidence of NMSC in low, moderate and high risk groups at 5/10/20 year was 0.5/1.2/3.3, 2.1/4.8/11.7 and 5.6/11.6/23.1% (p < 0.0001). The decision tree model accurately stratifies the risk of developing NMSC in the long-term after LT.
REGRESSION ANALYSIS OF SEA-SURFACE-TEMPERATURE PATTERNS FOR THE NORTH PACIFIC OCEAN.
SEA WATER, *SURFACE TEMPERATURE, *OCEANOGRAPHIC DATA, PACIFIC OCEAN, REGRESSION ANALYSIS, STATISTICAL ANALYSIS, UNDERWATER EQUIPMENT, DETECTION, UNDERWATER COMMUNICATIONS, DISTRIBUTION, THERMAL PROPERTIES, COMPUTERS.
ERIC Educational Resources Information Center
Rudner, Lawrence
2016-01-01
In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…
The process and utility of classification and regression tree methodology in nursing research
Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda
2014-01-01
Aim This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design Discussion paper. Data sources English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048
Parameter estimation in Cox models with missing failure indicators and the OPPERA study.
Brownstein, Naomi C; Cai, Jianwen; Slade, Gary D; Bair, Eric
2015-12-30
In a prospective cohort study, examining all participants for incidence of the condition of interest may be prohibitively expensive. For example, the "gold standard" for diagnosing temporomandibular disorder (TMD) is a physical examination by a trained clinician. In large studies, examining all participants in this manner is infeasible. Instead, it is common to use questionnaires to screen for incidence of TMD and perform the "gold standard" examination only on participants who screen positively. Unfortunately, some participants may leave the study before receiving the "gold standard" examination. Within the framework of survival analysis, this results in missing failure indicators. Motivated by the Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study, a large cohort study of TMD, we propose a method for parameter estimation in survival models with missing failure indicators. We estimate the probability of being an incident case for those lacking a "gold standard" examination using logistic regression. These estimated probabilities are used to generate multiple imputations of case status for each missing examination that are combined with observed data in appropriate regression models. The variance introduced by the procedure is estimated using multiple imputation. The method can be used to estimate both regression coefficients in Cox proportional hazard models as well as incidence rates using Poisson regression. We simulate data with missing failure indicators and show that our method performs as well as or better than competing methods. Finally, we apply the proposed method to data from the OPPERA study. Copyright © 2015 John Wiley & Sons, Ltd.
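A hedged sketch of the imputation scheme described above: a logistic regression estimates each unexamined participant's probability of being a case, case status is imputed multiple times, the analysis model is refit on each completed data set, and estimates are pooled with Rubin's rules. For brevity the completed-data analysis here is a simple logistic model rather than the Cox or Poisson models used in the paper; variable names and data are synthetic.

```python
# Sketch only: multiple imputation of missing failure indicators. P(case) is
# estimated by logistic regression among examined participants, case status is
# imputed M times, and estimates are pooled with Rubin's rules. For brevity the
# completed-data analysis is a logistic model rather than a Cox/Poisson model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 1000
screen = rng.normal(size=n)                              # questionnaire screening score
examined = rng.binomial(1, 0.7, size=n).astype(bool)     # received gold-standard exam
true_case = rng.binomial(1, 1 / (1 + np.exp(-screen)))
case = np.where(examined, true_case, np.nan)             # missing failure indicators
exposure = rng.binomial(1, 0.5, size=n)                  # covariate of interest

# Step 1: model P(case | screen) among examined participants
fit_p = sm.Logit(case[examined], sm.add_constant(screen[examined])).fit(disp=False)
p_missing = fit_p.predict(sm.add_constant(screen[~examined]))

# Steps 2-3: impute M times, refit the analysis model, pool with Rubin's rules
M = 20
betas, variances = [], []
for _ in range(M):
    imputed = case.copy()
    imputed[~examined] = rng.binomial(1, p_missing)
    fit = sm.Logit(imputed, sm.add_constant(exposure)).fit(disp=False)
    betas.append(fit.params[1])
    variances.append(fit.bse[1] ** 2)

beta = np.mean(betas)
total_var = np.mean(variances) + (1 + 1 / M) * np.var(betas, ddof=1)
print(f"pooled log-odds = {beta:.3f}, SE = {np.sqrt(total_var):.3f}")
```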
CADDIS Volume 4. Data Analysis: Basic Analyses
Use of statistical tests to determine if an observation is outside the normal range of expected values. Details of CART, regression analysis, use of quantile regression analysis, CART in causal analysis, simplifying or pruning resulting trees.
Characterization of Microbiota in Children with Chronic Functional Constipation.
de Meij, Tim G J; de Groot, Evelien F J; Eck, Anat; Budding, Andries E; Kneepkens, C M Frank; Benninga, Marc A; van Bodegraven, Adriaan A; Savelkoul, Paul H M
2016-01-01
Disruption of the intestinal microbiota is considered an etiological factor in pediatric functional constipation. Scientifically based selection of potential beneficial probiotic strains in functional constipation therapy is not feasible due to insufficient knowledge of microbiota composition in affected subjects. The aim of this study was to describe microbial composition and diversity in children with functional constipation, compared to healthy controls. Fecal samples from 76 children diagnosed with functional constipation according to the Rome III criteria (median age 8.0 years; range 4.2-17.8) were analyzed by IS-pro, a PCR-based microbiota profiling method. Outcome was compared with intestinal microbiota profiles of 61 healthy children (median 8.6 years; range 4.1-17.9). Microbiota dissimilarity was depicted by principal coordinate analysis (PCoA), diversity was calculated by Shannon diversity index. To determine the most discriminative species, cross validated logistic ridge regression was performed. Applying total microbiota profiles (all phyla together) or per phylum analysis, no disease-specific separation was observed by PCoA and by calculation of diversity indices. By ridge regression, however, functional constipation and controls could be discriminated with 82% accuracy. Most discriminative species were Bacteroides fragilis, Bacteroides ovatus, Bifidobacterium longum, Parabacteroides species (increased in functional constipation) and Alistipes finegoldii (decreased in functional constipation). None of the commonly used unsupervised statistical methods allowed for microbiota-based discrimination of children with functional constipation and controls. By ridge regression, however, both groups could be discriminated with 82% accuracy. Optimization of microbiota-based interventions in constipated children warrants further characterization of microbial signatures linked to clinical subgroups of functional constipation.
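The supervised step described above, cross-validated ridge (L2-penalised) logistic regression that both discriminates groups and ranks discriminative features, can be sketched as follows. The abundance matrix, the log transform, and the feature count are illustrative assumptions rather than the IS-pro data.

```python
# Sketch only: cross-validated L2-penalised (ridge) logistic regression used to
# discriminate cases from controls and to rank discriminative features.
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(9)
abundances = rng.lognormal(size=(137, 60))            # 137 children x 60 taxa (stand-ins)
constipated = np.r_[np.ones(76), np.zeros(61)].astype(int)
X = np.log1p(abundances)

clf = LogisticRegressionCV(penalty="l2", Cs=10, cv=5, max_iter=5000)
accuracy = cross_val_score(clf, X, constipated, cv=5).mean()
clf.fit(X, constipated)

top = np.argsort(np.abs(clf.coef_.ravel()))[::-1][:5]
print(f"cross-validated accuracy = {accuracy:.2f}; top feature indices: {top}")
```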
NASA Astrophysics Data System (ADS)
Takayama, T.; Iwasaki, A.
2016-06-01
Above-ground biomass prediction of tropical rain forest using remote sensing data is of paramount importance to continuous large-area forest monitoring. Hyperspectral data can provide rich spectral information for biomass prediction; however, prediction accuracy is affected by the small-sample-size problem, which commonly manifests as overfitting when high-dimensional data are used and the number of training samples is smaller than the dimensionality of the samples, owing to the time, cost, and human resources required for field surveys. A common approach to addressing this problem is reducing the dimensionality of the dataset. In addition, acquired hyperspectral data usually have a low signal-to-noise ratio due to narrow bandwidths, and exhibit local or global peak shifts due to instrumental instability or small differences in practical measurement conditions. In this work, we propose a methodology based on fused lasso regression that selects optimal bands for the biomass prediction model while encouraging sparsity and grouping; the sparsity addresses the small-sample-size problem through dimensionality reduction, and the grouping mitigates the noise and peak-shift problems. The prediction model provided higher accuracy, with a root-mean-square error (RMSE) of 66.16 t/ha in cross-validation, than other methods: multiple linear analysis, partial least squares regression, and lasso regression. Furthermore, fusion of spectral and spatial information derived from a texture index increased the prediction accuracy, with an RMSE of 62.62 t/ha. This analysis demonstrates the efficiency of fused lasso and image texture in biomass estimation of tropical forests.
NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel
2017-08-01
Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.
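The weighted least squares comparison proposed above can be sketched as an individual-level regression of the outcome on a first-stage regimen indicator and a baseline covariate, with weights reflecting known randomisation probabilities and cluster-robust standard errors. Cluster sizes, weights, and variable names are illustrative; the full embedded-regimen contrasts and the paper's sample size formulas are not reproduced.

```python
# Sketch only: weighted least squares regression of a patient-level outcome on
# a cluster-level first-stage treatment indicator and a baseline covariate,
# with cluster-robust standard errors. Weights mimic known randomisation
# probabilities in a prototypical SMART (2 for responders, 4 for re-randomised
# non-responders); all values are illustrative.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(10)
rows = []
for c in range(40):                               # 40 clusters (e.g. clinics)
    a1 = rng.integers(0, 2)                       # first-stage cluster randomisation
    responder = rng.binomial(1, 0.5)
    weight = 2.0 if responder else 4.0
    for _ in range(25):                           # 25 patients per cluster
        baseline = rng.normal()
        y = 1.0 + 0.5 * a1 + 0.8 * baseline + rng.normal()
        rows.append((c, a1, baseline, y, weight))

df = pd.DataFrame(rows, columns=["cluster", "a1", "baseline", "y", "w"])
X = sm.add_constant(df[["a1", "baseline"]])
fit = sm.WLS(df["y"], X, weights=df["w"]).fit(
    cov_type="cluster", cov_kwds={"groups": df["cluster"]})
print(fit.params["a1"], fit.bse["a1"])
```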
Population heterogeneity in the salience of multiple risk factors for adolescent delinquency.
Lanza, Stephanie T; Cooper, Brittany R; Bray, Bethany C
2014-03-01
To present mixture regression analysis as an alternative to more standard regression analysis for predicting adolescent delinquency. We demonstrate how mixture regression analysis allows for the identification of population subgroups defined by the salience of multiple risk factors. We identified population subgroups (i.e., latent classes) of individuals based on their coefficients in a regression model predicting adolescent delinquency from eight previously established risk indices drawn from the community, school, family, peer, and individual levels. The study included N = 37,763 10th-grade adolescents who participated in the Communities That Care Youth Survey. Standard, zero-inflated, and mixture Poisson and negative binomial regression models were considered. Standard and mixture negative binomial regression models were selected as optimal. The five-class regression model was interpreted based on the class-specific regression coefficients, indicating that risk factors had varying salience across classes of adolescents. Standard regression showed that all risk factors were significantly associated with delinquency. Mixture regression provided more nuanced information, suggesting a unique set of risk factors that were salient for different subgroups of adolescents. Implications for the design of subgroup-specific interventions are discussed. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Kakoly, Nadira Sultana; Earnest, Arul; Moran, Lisa J; Teede, Helena J; Joham, Anju E
2017-11-06
Obesity is common in young women, increasing insulin resistance (IR) and worsening pregnancy complications, including gestational diabetes (GDM). Women with polycystic ovary syndrome (PCOS) are commonly obese, which aggravates the severity of PCOS clinical expression. Relationships between these common insulin-resistant conditions, however, remain unclear. We conducted a secondary analysis of the Australian Longitudinal Study on Women's Health (ALSWH) database, including data from 8009 women aged 18-36 years across six surveys. We used latent-curve growth modelling to identify distinct body mass index (BMI) trajectories and multinomial logistic regression to explore sociodemographic and health variables characterizing BMI group membership. Logistic regression was used to assess independent risk of GDM. A total of 662 women (8.29%, 95% CI 7.68-8.89) reported PCOS. Three distinct BMI trajectories emerged, namely low stable (LSG) (63.8%), defined as an average trajectory remaining at ~25 kg/m²; moderately rising (MRG) (28.8%), a curvilinear trajectory commencing in a healthy BMI and terminating in the overweight range; and high-rising (HRG) (7.4%), a curvilinear trajectory starting and terminating in the obese range. A high BMI in early reproductive life predicted membership in higher trajectories. The HRG BMI trajectory was independently associated with GDM (OR 2.50, 95% CI 1.80-3.48) and was a stronger correlate than PCOS (OR 1.89, 95% CI 1.41-2.54), maternal age, socioeconomic status, or parity. Our results suggest heterogeneity in BMI change among Australian women of reproductive age, with and without PCOS. Reducing early adult life weight represents an ideal opportunity to intervene at an early stage of reproductive life and decreases the risk of long-term metabolic complications such as GDM.
Polack, Sarah; Adams, Mel; O'banion, David; Baltussen, Marjolein; Asante, Sandra; Kerac, Marko; Gladstone, Melissa; Zuurmond, Maria
2018-05-07
To assess feeding difficulties and nutritional status among children with cerebral palsy (CP) in Ghana, and whether severity of feeding difficulties and malnutrition are independently associated with caregiver quality of life (QoL). This cross-sectional survey included 76 children with CP (18mo-12y) from four regions of Ghana. Severity of CP was classified using the Gross Motor Function Classification System and anthropometric measures were taken. Caregivers rated their QoL (using the Pediatric Quality of Life Inventory Family Impact Module) and difficulties with eight aspects of child feeding. Logistic regression analysis explored factors (socio-economic characteristics, severity of CP, and feeding difficulties) associated with being underweight. Linear regression was undertaken to assess the relationship between caregiver QoL and child malnutrition and feeding difficulties. Poor nutritional status was common: 65% of children aged under 5 years were categorized as underweight, 54% as stunted, and 58% as wasted. Reported difficulties with child's feeding were common and were associated with the child being underweight (odds ratio 10.7, 95% confidence interval 2.3-49.6) and poorer caregiver QoL (p<0.001). No association between caregiver QoL and nutritional status was evident. Among rural, low resource populations in Ghana, there is a need for appropriate, accessible caregiver training and support around feeding practices of children with CP, to improve child nutritional status and caregiver well-being. What this paper adds Malnutrition is very common among children with cerebral palsy in this rural population in Ghana. Feeding difficulties in this population were strongly associated with being underweight. Feeding difficulties were associated with poorer caregiver quality of life (QoL). Child nutritional status was not associated with caregiver QoL. © 2018 Mac Keith Press.
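Adjusted odds ratios with 95% confidence intervals, such as the feeding-difficulty estimate reported above, are typically read off a fitted logistic model. A minimal sketch with simulated data and hypothetical variable names (not the study's dataset):

```python
# Hedged sketch: logistic regression yielding an adjusted odds ratio and 95% CI,
# analogous to the reported feeding difficulty / underweight association.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 400
df = pd.DataFrame({
    "feeding_difficulty": rng.binomial(1, 0.4, n),   # hypothetical exposure
    "gmfcs_level": rng.integers(1, 6, n),            # hypothetical CP severity
})
logit_p = -1.5 + 1.2 * df["feeding_difficulty"] + 0.2 * df["gmfcs_level"]
df["underweight"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(df[["feeding_difficulty", "gmfcs_level"]])
fit = sm.Logit(df["underweight"], X).fit(disp=False)
or_table = pd.DataFrame({"OR": np.exp(fit.params),
                         "CI_low": np.exp(fit.conf_int()[0]),
                         "CI_high": np.exp(fit.conf_int()[1])})
print(or_table)
```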
Hu, Wen-Long; Chen, Hsuan-Ju; Li, Tsai-Chung; Tsai, Pei-Yuan; Chen, Hsin-Ping; Huang, Meng-Hsuan; Su, Fang-Yen
2015-01-01
Objective Combinations of Chinese herbal products (CHPs) are widely used for ischemic heart disease (IHD) in Taiwan. We analyzed the usage and frequency of CHPs prescribed for patients with IHD. Methods A nationwide population-based cross-sectional study was conducted; 53,531 patients from a random sample of one million in the National Health Insurance Research Database (NHIRD) from 2000 to 2010 were enrolled. Descriptive statistics, the multiple logistic regression method and Poisson regression analysis were employed to estimate the adjusted odds ratios (aORs) and adjusted risk ratios (aRRs) for utilization of CHPs. Results The mean age of traditional Chinese medicine (TCM) nonusers was significantly higher than that of TCM users. Zhi-Gan-Cao-Tang (24.85%) was the most commonly prescribed formula CHP, followed by Xue-Fu-Zhu-Yu-Tang (16.53%) and Sheng-Mai-San (16.00%). The most commonly prescribed single CHPs were Dan Shen (29.30%), Yu Jin (7.44%), and Ge Gen (6.03%). After multivariate adjustment, patients with IHD younger than 29 years had 2.62 times higher odds of using TCM than those 60 years or older. Residents living in Central Taiwan and those with hyperlipidemia or cardiac dysrhythmias also had higher odds of using TCM. In contrast, males and those with diabetes mellitus (DM), hypertension, stroke, or myocardial infarction (MI) were less likely to use TCM. Conclusions Zhi-Gan-Cao-Tang and Dan Shen are the most commonly prescribed CHPs for IHD in Taiwan. Our results should be taken into account by physicians when devising individualized therapy for IHD. Further large-scale, randomized clinical trials are warranted in order to determine the effectiveness and safety of these herbal medicines. PMID:26322893
He, Zhifei; Bishwajit, Ghose; Zou, Dongsheng; Yaya, Sanni; Cheng, Zhaohui; Zhou, Yan
2018-06-12
Having access to improved water, sanitation, and hygiene (WASH) facilities constitutes a key component of healthy living and quality of life. Prolonged exposure to insanitary living conditions can significantly enhance the burden of infectious diseases among children and affect nutritional status and growth. In this study we examined the prevalence of some common infectious diseases/disease symptoms of childhood among under-five children in Nigeria, and the association between the occurrence of these diseases and households' access to WASH facilities. Types of diseases used as outcome variables included diarrhoea and acute respiratory infections (fever and cough). Access to WASH facilities was defined by the WHO classification. The association of diarrhoea, fever and chronic cough with sanitation and hygiene was analyzed by logistic regression techniques. Results showed that the prevalence of diarrhoea, fever and cough was respectively 10.5% (95% CI = 9.7–2.0), 13.4% (95% CI = 11.9–14.8), and 10.4% (95% CI = 9.2–11.5). In the regression analysis, children in households that lacked all three types of facilities were found to have respectively 1.32 [AOR = 1.329, 95% CI = 1.046–1.947], 1.24 [AOR = 1.242, 95% CI = 1.050–1.468] and 1.43 [AOR = 1.432, 95% CI = 1.113–2.902] times higher odds of suffering from diarrhoea, fever and cough. The study concludes that unimproved WASH conditions are an important contributor to ARIs and diarrhoeal morbidities among Nigerian children. In light of these findings, it is recommended that programs aiming to reduce childhood morbidity and mortality from common infectious diseases should leverage equitable provision of WASH interventions.
Nieuwenhuijsen, Karen; Verbeek, Jos H A M; de Boer, Angela G E M; Blonk, Roland W B; van Dijk, Frank J H
2006-02-01
This study attempted to determine the factors that best predict the duration of absence from work among employees with common mental disorders. A cohort of 188 employees, of whom 102 were teachers, on sick leave with common mental disorders was followed for 1 year. Only information potentially available to the occupational physician during a first consultation was included in the predictive model. The predictive power of the variables was tested using Cox's regression analysis with a stepwise backward selection procedure. The hazard ratios (HR) from the final model were used to deduce a simple prediction rule. The resulting prognostic scores were then used to predict the probability of not returning to work after 3, 6, and 12 months. Calculating the area under the curve from the ROC (receiver operating characteristic) curve tested the discriminative ability of the prediction rule. The final Cox's regression model produced the following four predictors of a longer time until return to work: age older than 50 years [HR 0.5, 95% confidence interval (95% CI) 0.3-0.8], expectation of duration absence longer than 3 months (HR 0.5, 95% CI 0.3-0.8), higher educational level (HR 0.5, 95% CI 0.3-0.8), and diagnosis depression or anxiety disorder (HR 0.7, 95% CI 0.4-0.9). The resulting prognostic score yielded areas under the curves ranging from 0.68 to 0.73, which represent acceptable discrimination of the rule. A prediction rule based on four simple variables can be used by occupational physicians to identify unfavorable cases and to predict the duration of sickness absence.
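A sketch of the Cox modelling step, assuming the lifelines package (not necessarily what the authors used), with simulated return-to-work times and hypothetical predictor names; the stepwise backward selection and ROC assessment are omitted:

```python
# Hedged sketch (lifelines assumed available): Cox regression for time to
# return to work, yielding hazard ratios analogous to the prediction rule.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(3)
n = 200
df = pd.DataFrame({
    "age_over_50": rng.binomial(1, 0.3, n),
    "expects_long_absence": rng.binomial(1, 0.4, n),
    "higher_education": rng.binomial(1, 0.5, n),
    "depression_or_anxiety": rng.binomial(1, 0.4, n),
})
# Simulated time to return to work (days); risk factors slow the return (HR < 1).
rate = 0.02 * np.exp(-0.7 * df.sum(axis=1) / 4)
df["days_to_rtw"] = rng.exponential(1 / rate)
df["returned"] = (df["days_to_rtw"] <= 365).astype(int)   # censor at 1 year
df["days_to_rtw"] = df["days_to_rtw"].clip(upper=365)

cph = CoxPHFitter()
cph.fit(df, duration_col="days_to_rtw", event_col="returned")
cph.print_summary()   # exp(coef) column gives the hazard ratios with 95% CIs
```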
Nam, Kijoeng; Henderson, Nicholas C; Rohan, Patricia; Woo, Emily Jane; Russek-Cohen, Estelle
2017-01-01
The Vaccine Adverse Event Reporting System (VAERS) and other product surveillance systems compile reports of product-associated adverse events (AEs), and these reports may include a wide range of information including age, gender, and concomitant vaccines. Controlling for possible confounding variables such as these is an important task when utilizing surveillance systems to monitor post-market product safety. A common method for handling possible confounders is to compare observed product-AE combinations with adjusted baseline frequencies where the adjustments are made by stratifying on observable characteristics. Though approaches such as these have proven to be useful, in this article we propose a more flexible logistic regression approach which allows for covariates of all types rather than relying solely on stratification. Indeed, a main advantage of our approach is that the general regression framework provides flexibility to incorporate additional information such as demographic factors and concomitant vaccines. As part of our covariate-adjusted method, we outline a procedure for signal detection that accounts for multiple comparisons and controls the overall Type 1 error rate. To demonstrate the effectiveness of our approach, we illustrate our method with an example involving febrile convulsion, and we further evaluate its performance in a series of simulation studies.
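The screening logic, fitting one covariate-adjusted logistic regression per adverse event and then controlling the familywise error rate, can be sketched as follows on simulated data; the Holm correction here is only a stand-in for the article's specific multiple-comparison procedure:

```python
# Hedged sketch: covariate-adjusted logistic screening of product-AE pairs with
# a familywise error correction (Holm used as a stand-in procedure).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(4)
n_reports, n_aes = 5000, 20
vaccine = rng.binomial(1, 0.3, n_reports)   # report mentions the vaccine of interest
age = rng.normal(40, 15, n_reports)         # hypothetical demographic covariate

pvals = []
for j in range(n_aes):
    beta = 0.8 if j == 0 else 0.0           # only AE 0 carries a true signal
    logit_p = -2.0 + beta * vaccine + 0.01 * (age - 40)
    ae = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))
    X = sm.add_constant(np.column_stack([vaccine, age]))
    res = sm.Logit(ae, X).fit(disp=False)
    pvals.append(res.pvalues[1])            # p-value for the vaccine term

reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="holm")
print("flagged AEs:", np.where(reject)[0])
```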
Kronholm, Scott C.; Capel, Paul D.; Terziotti, Silvia
2016-01-01
Accurate estimation of total nitrogen loads is essential for evaluating conditions in the aquatic environment. Extrapolation of estimates beyond measured streams will greatly expand our understanding of total nitrogen loading to streams. Recursive partitioning and random forest regression were used to assess 85 geospatial, environmental, and watershed variables across 636 small (<585 km2) watersheds to determine which variables are fundamentally important to the estimation of annual loads of total nitrogen. Initial analysis led to the splitting of watersheds into three groups based on predominant land use (agricultural, developed, and undeveloped). Nitrogen application, agricultural and developed land area, and impervious or developed land in the 100-m stream buffer were commonly extracted variables by both recursive partitioning and random forest regression. A series of multiple linear regression equations utilizing the extracted variables were created and applied to the watersheds. As few as three variables explained as much as 76 % of the variability in total nitrogen loads for watersheds with predominantly agricultural land use. Catchment-scale national maps were generated to visualize the total nitrogen loads and yields across the USA. The estimates provided by these models can inform water managers and help identify areas where more in-depth monitoring may be beneficial.
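A minimal sketch of the screen-then-regress workflow on simulated stand-in data: random-forest importances rank the candidate watershed variables, and a small multiple linear regression is then fitted with the top-ranked ones:

```python
# Hedged sketch: random-forest screening of candidate watershed variables,
# followed by a small multiple linear regression on the top-ranked ones.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
n, p = 636, 85
X = rng.normal(size=(n, p))                 # stand-in geospatial/watershed variables
y = 2.0 * X[:, 0] + 1.2 * X[:, 1] + 0.8 * X[:, 2] + rng.normal(scale=1.0, size=n)

rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)
top3 = np.argsort(rf.feature_importances_)[::-1][:3]
print("top-ranked variables:", top3)

lin = LinearRegression().fit(X[:, top3], y)
print("R^2 with three variables:", round(lin.score(X[:, top3], y), 2))
```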
ERIC Educational Resources Information Center
Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.
2004-01-01
We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…
Linear regression models for solvent accessibility prediction in proteins.
Wagner, Michael; Adamczak, Rafał; Porollo, Aleksey; Meller, Jarosław
2005-04-01
The relative solvent accessibility (RSA) of an amino acid residue in a protein structure is a real number that represents the solvent exposed surface area of this residue in relative terms. The problem of predicting the RSA from the primary amino acid sequence can therefore be cast as a regression problem. Nevertheless, RSA prediction has so far typically been cast as a classification problem. Consequently, various machine learning techniques have been used within the classification framework to predict whether a given amino acid exceeds some (arbitrary) RSA threshold and would thus be predicted to be "exposed," as opposed to "buried." We have recently developed novel methods for RSA prediction using nonlinear regression techniques which provide accurate estimates of the real-valued RSA and outperform classification-based approaches with respect to commonly used two-class projections. However, while their performance seems to provide a significant improvement over previously published approaches, these Neural Network (NN) based methods are computationally expensive to train and involve several thousand parameters. In this work, we develop alternative regression models for RSA prediction which are computationally much less expensive, involve orders-of-magnitude fewer parameters, and are still competitive in terms of prediction quality. In particular, we investigate several regression models for RSA prediction using linear L1-support vector regression (SVR) approaches as well as standard linear least squares (LS) regression. Using rigorously derived validation sets of protein structures and extensive cross-validation analysis, we compare the performance of the SVR with that of LS regression and NN-based methods. In particular, we show that the flexibility of the SVR (as encoded by metaparameters such as the error insensitivity and the error penalization terms) can be very beneficial to optimize the prediction accuracy for buried residues. We conclude that the simple and computationally much more efficient linear SVR performs comparably to nonlinear models and thus can be used in order to facilitate further attempts to design more accurate RSA prediction methods, with applications to fold recognition and de novo protein structure prediction methods.
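A hedged sketch contrasting epsilon-insensitive linear SVR with ordinary least squares on a generic real-valued target; it stands in for the RSA setting but uses none of the authors' sequence features:

```python
# Hedged sketch: linear support vector regression vs ordinary least squares on a
# generic real-valued target (a stand-in for relative solvent accessibility).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVR
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

X, y = make_regression(n_samples=3000, n_features=60, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svr = make_pipeline(StandardScaler(),
                    LinearSVR(epsilon=1.0, C=1.0, max_iter=10000))
ls = LinearRegression()

for name, model in [("linear SVR", svr), ("least squares", ls)]:
    model.fit(X_tr, y_tr)
    print(name, "MAE:", round(mean_absolute_error(y_te, model.predict(X_te)), 2))
```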
NASA Technical Reports Server (NTRS)
Mccormick, M. P.; Chiou, E. W.; Mcmaster, L. R.; Chu, W. P.; Larsen, J. C.; Rind, D.; Oltmans, S.
1993-01-01
Data collected by the Stratospheric Aerosol and Gas Experiment II are presented, showing annual variations of water vapor in the stratosphere and the upper troposphere. The altitude-time cross sections of water vapor were found to exhibit annually repeatable patterns in both hemispheres, with a yearly minimum in water vapor appearing in both hemispheres at about the same time, supporting the concept of a common source for stratospheric dry air. A linear regression analysis was applied to the three-year data set to elucidate global values and variations of water vapor ratio.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brink, A.; Kilpinen, P.; Hupa, M.
1996-01-01
Two methods to improve the modeling of NOx emissions in numerical flow simulation of combustion are investigated. The models used are a reduced mechanism for nitrogen chemistry in methane combustion and a new model based on regression analysis of perfectly stirred reactor simulations using detailed comprehensive reaction kinetics. The applicability of the methods to numerical flow simulation of practical furnaces, especially in the near burner region, is tested against experimental data from a pulverized coal fired single burner furnace. The results are also compared to those obtained using a commonly used description for the overall reaction rate of NO.
Gender-Blind Sexism and Rape Myth Acceptance.
Stoll, Laurie Cooper; Lilley, Terry Glenn; Pinter, Kelly
2017-01-01
The purpose of this article is to explore whether gender-blind sexism, as an extension of Bonilla-Silva's racialized social system theory, is an appropriate theoretical framework for understanding the creation and continued prevalence of rape myth acceptance. Specifically, we hypothesize that individuals who hold attitudes consistent with the frames of gender-blind sexism are more likely to accept common rape myths. Data for this article come from an online survey administered to the entire undergraduate student body at a large Midwestern institution (N = 1,401). Regression analysis showed strong support for the effects of gender-blind sexism on rape myth acceptance. © The Author(s) 2016.
Qian, S.S.; Anderson, Chauncey W.
1999-01-01
We analyzed available concentration data of five commonly used herbicides and three pesticides collected from small streams in the Willamette River Basin in Oregon to identify factors that affect the variation of their concentrations in the area. The emphasis of this paper is the innovative use of classification and regression tree models for exploratory data analysis as well as analyzing data with a substantial amount of left-censored values. Among variables included in this analysis, land-use pattern in the watershed is the most important for all but one (simazine) of the eight pesticides studied, followed by geographic location, intensity of agriculture activities in the watershed (represented by nutrient concentrations in the stream), and the size of the watershed. The significant difference between urban sites and agriculture sites is the variability of stream concentrations. While all 16 nonurban watersheds have significantly higher variation than urban sites, the same is not necessarily true for the mean concentrations. Seasonal variation accounts for only a small fraction of the total variance in all eight pesticides.
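A minimal regression-tree sketch in the spirit of the exploratory analysis above, on simulated watershed data with hypothetical variable names; the paper's handling of left-censored concentrations is not reproduced here:

```python
# Hedged sketch: a regression tree as an exploratory tool for pesticide
# concentrations, splitting on land use and watershed size (simulated data).
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(14)
n = 120
df = pd.DataFrame({
    "pct_agriculture": rng.uniform(0, 100, n),
    "pct_urban": rng.uniform(0, 60, n),
    "watershed_km2": rng.uniform(1, 500, n),
})
log_conc = 0.03 * df["pct_agriculture"] + 0.02 * df["pct_urban"] + rng.normal(scale=0.5, size=n)

tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=10, random_state=0)
tree.fit(df, log_conc)
print(export_text(tree, feature_names=list(df.columns)))  # readable split rules
```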
Archfield, Stacey A.; Pugliese, Alessio; Castellarin, Attilio; Skøien, Jon O.; Kiang, Julie E.
2013-01-01
In the United States, estimation of flood frequency quantiles at ungauged locations has been largely based on regional regression techniques that relate measurable catchment descriptors to flood quantiles. More recently, spatial interpolation techniques of point data have been shown to be effective for predicting streamflow statistics (i.e., flood flows and low-flow indices) in ungauged catchments. Literature reports successful applications of two techniques, canonical kriging, CK (or physiographical-space-based interpolation, PSBI), and topological kriging, TK (or top-kriging). CK performs the spatial interpolation of the streamflow statistic of interest in the two-dimensional space of catchment descriptors. TK predicts the streamflow statistic along river networks taking both the catchment area and nested nature of catchments into account. It is of interest to understand how these spatial interpolation methods compare with generalized least squares (GLS) regression, one of the most common approaches to estimate flood quantiles at ungauged locations. By means of a leave-one-out cross-validation procedure, the performance of CK and TK was compared to GLS regression equations developed for the prediction of 10, 50, 100 and 500 yr floods for 61 streamgauges in the southeast United States. TK substantially outperforms GLS and CK for the study area, particularly for large catchments. The performance of TK over GLS highlights an important distinction between the treatments of spatial correlation when using regression-based or spatial interpolation methods to estimate flood quantiles at ungauged locations. The analysis also shows that coupling TK with CK slightly improves the performance of TK; however, the improvement is marginal when compared to the improvement in performance over GLS.
Potential pitfalls when denoising resting state fMRI data using nuisance regression.
Bright, Molly G; Tench, Christopher R; Murphy, Kevin
2017-07-01
In resting state fMRI, it is necessary to remove signal variance associated with noise sources, leaving cleaned fMRI time-series that more accurately reflect the underlying intrinsic brain fluctuations of interest. This is commonly achieved through nuisance regression, in which the fit is calculated of a noise model of head motion and physiological processes to the fMRI data in a General Linear Model, and the "cleaned" residuals of this fit are used in further analysis. We examine the statistical assumptions and requirements of the General Linear Model, and whether these are met during nuisance regression of resting state fMRI data. Using toy examples and real data we show how pre-whitening, temporal filtering and temporal shifting of regressors impact model fit. Based on our own observations, existing literature, and statistical theory, we make the following recommendations when employing nuisance regression: pre-whitening should be applied to achieve valid statistical inference of the noise model fit parameters; temporal filtering should be incorporated into the noise model to best account for changes in degrees of freedom; temporal shifting of regressors, although merited, should be achieved via optimisation and validation of a single temporal shift. We encourage all readers to make simple, practical changes to their fMRI denoising pipeline, and to regularly assess the appropriateness of the noise model used. By negotiating the potential pitfalls described in this paper, and by clearly reporting the details of nuisance regression in future manuscripts, we hope that the field will achieve more accurate and precise noise models for cleaning the resting state fMRI time-series. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
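A toy sketch of the basic nuisance-regression step, regressing a noise model out of a single time-series by ordinary least squares and keeping the residuals; the pre-whitening, temporal filtering, and regressor-shift optimisation recommended above are deliberately omitted:

```python
# Toy sketch of nuisance regression: fit a noise model to one voxel's
# time-series and keep the residuals (pre-whitening and shifting omitted).
import numpy as np

rng = np.random.default_rng(6)
n_vols = 300
motion = rng.normal(size=(n_vols, 6))               # six head-motion parameters
drift = np.linspace(-1, 1, n_vols)[:, None]         # low-frequency drift term
signal = np.sin(np.linspace(0, 20, n_vols))         # "intrinsic" fluctuation of interest
voxel = signal + motion @ rng.normal(size=6) * 0.5 + 2.0 * drift[:, 0] \
        + rng.normal(scale=0.3, size=n_vols)

design = np.column_stack([np.ones(n_vols), motion, drift])   # noise model
beta, *_ = np.linalg.lstsq(design, voxel, rcond=None)
cleaned = voxel - design @ beta                      # residuals used downstream

print("corr(cleaned, signal):", round(np.corrcoef(cleaned, signal)[0, 1], 2))
```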
Patino, Reynaldo; VanLandeghem, Matthew M.; Goodbred, Steven L.; Orsak, Erik; Jenkins, Jill A.; Echols, Kathy R.; Rosen, Michael R.; Torres, Leticia
2015-01-01
Adult male Common Carp were sampled in 2007/08 over a full reproductive cycle at Lake Mead National Recreation Area. Sites sampled included a stream dominated by treated wastewater effluent, a lake basin receiving the streamflow, an upstream lake basin (reference), and a site below Hoover Dam. Individual body burdens for 252 contaminants were measured, and biological variables assessed included physiological [plasma vitellogenin (VTG), estradiol-17β (E2), 11-ketotestosterone (11KT)] and organ [gonadosomatic index (GSI)] endpoints. Patterns in contaminant composition and biological condition were determined by Principal Component Analysis, and their associations modeled by Principal Component Regression. Three spatially distinct but temporally stable gradients of contaminant distribution were recognized: a contaminant mixture typical of wastewaters (PBDEs, methyl triclosan, galaxolide), PCBs, and DDTs. Two spatiotemporally variable patterns of biological condition were recognized: a primary pattern consisting of reproductive condition variables (11KT, E2, GSI), and a secondary pattern including general condition traits (condition factor, hematocrit, fork length). VTG was low in all fish, indicating low estrogenic activity of water at all sites. Wastewater contaminants associated negatively with GSI, 11KT and E2; PCBs associated negatively with GSI and 11KT; and DDTs associated positively with GSI and 11KT. Regression of GSI on sex steroids revealed a novel, nonlinear association between these variables. Inclusion of sex steroids in the GSI regression on contaminants rendered wastewater contaminants nonsignificant in the model and reduced the influence of PCBs and DDTs. Thus, the influence of contaminants on GSI may have been partially driven by organismal modes-of-action that include changes in sex steroid production. The positive association of DDTs with 11KT and GSI suggests that lifetime, sub-lethal exposures to DDTs have effects on male carp opposite of those reported by studies where exposure concentrations were relatively high. Lastly, this study highlighted advantages of multivariate/multiple regression approaches for exploring associations between complex contaminant mixtures and gradients and reproductive condition in wild fishes.
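A hedged sketch of Principal Component Regression on simulated data: a correlated block of contaminant measurements is reduced to a few components, and a biological endpoint (a stand-in for GSI) is regressed on the component scores:

```python
# Hedged sketch of principal component regression: compress correlated
# contaminant measures into a few components, then regress an endpoint on them.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n_fish, n_contaminants = 120, 40
latent = rng.normal(size=(n_fish, 3))                       # three hidden gradients
loadings = rng.normal(size=(3, n_contaminants))
contaminants = latent @ loadings + rng.normal(scale=0.5, size=(n_fish, n_contaminants))
gsi = 5 - 1.0 * latent[:, 0] + 0.5 * latent[:, 2] + rng.normal(scale=0.5, size=n_fish)

pcr = make_pipeline(StandardScaler(), PCA(n_components=3), LinearRegression())
pcr.fit(contaminants, gsi)
print("PCR R^2:", round(pcr.score(contaminants, gsi), 2))
print("coefficients on PC scores:", np.round(pcr.named_steps["linearregression"].coef_, 2))
```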
Jia, De-An; Zhou, Yu-Jie; Shi, Dong-Mei; Liu, Yu-Yang; Wang, Jian-Long; Liu, Xiao-Li; Wang, Zhi-Jian; Yang, Shi-Wei; Ge, Hai-Long; Hu, Bin; Yan, Zhen-Xian; Chen, Yi; Gao, Fei
2010-04-05
Radial artery spasm (RAS) is the most common complication in transradial coronary angiography and intervention. In this study, we aimed to investigate the incidence of RAS during transradial procedures in Chinese patients, identify its independent predictors through multiple regression, and analyze the clinical effect of RAS during follow-up. Patients arranged to receive transradial coronary angiography and intervention were consecutively enrolled. The incidence of RAS was recorded. Univariate analysis was performed to identify factors associated with RAS, and logistic regression analysis was performed to identify the independent predictors of RAS. The patients were asked to return 1 month later for assessment of the radial access. The incidence of RAS was 7.8% (112/1427) among all patients who received transradial procedures. Univariate analysis indicated that younger age (P = 0.038), female sex (P = 0.026), small radial artery diameter (P < 0.001), diabetes (P = 0.026), smoking (P = 0.019), moderate or severe pain during radial artery cannulation (P < 0.001), unsuccessful access at first attempt (P = 0.002), large sheath size (P = 0.004), number of catheters (> 3) (P = 0.048), rapid baseline heart rate (P = 0.032) and long operation time (P = 0.021) were associated with RAS. Logistic regression showed that female sex (OR = 1.745, 95%CI: 1.148 - 3.846, P = 0.024), small radial artery diameter (OR = 4.028, 95%CI: 1.264 - 12.196, P = 0.008), diabetes (OR = 2.148, 95%CI: 1.579 - 7.458, P = 0.019) and unsuccessful access at first attempt (OR = 1.468, 95%CI: 1.212 - 2.591, P = 0.032) were independent predictors of RAS. Follow-up at (28 ± 7) days after the procedure showed that, compared with non-spasm patients, RAS patients had a higher proportion of pain (11.8% vs. 6.2%, P = 0.043). The occurrences of hematoma (7.3% vs. 5.6%, P = 0.518) and radial artery occlusion (3.6% vs. 2.6%, P = 0.534) were similar. The incidence of RAS during transradial coronary procedures was 7.8%. Logistic regression analysis showed that female sex, small radial artery diameter, diabetes and unsuccessful access at first attempt were the independent predictors of RAS.
Ma, W; Zhang, T-F; Lu, P; Lu, S H
2014-01-01
Breast cancer is categorized into two broad groups: estrogen receptor positive (ER+) and ER negative (ER-) groups. A previous study proposed that under trastuzumab-based neoadjuvant chemotherapy, tumor-initiating-cell (TIC)-featured ER- tumors respond better than ER+ tumors. Exploring the molecular differences between these two groups may help in developing new therapeutic strategies, especially for ER- patients. Using gene expression profiles from the Gene Expression Omnibus (GEO) database, we performed a partial least squares (PLS)-based analysis, which is more sensitive than common variance/regression analysis. We identified 512 differentially expressed genes. Four pathways were found to be enriched with differentially expressed genes, involving the immune system, metabolism, and genetic information processing. Network analysis identified five hub genes with degrees higher than 10, including APP, ESR1, SMAD3, HDAC2, and PRKAA1. Our findings provide a new understanding of the molecular differences between TIC-featured ER- and ER+ breast tumors, which we hope will support further therapeutic studies.
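A minimal sketch of a PLS-based group comparison on a simulated expression matrix, ranking genes by their weight on the first PLS component; this illustrates the general technique only, not the study's pipeline:

```python
# Hedged sketch: partial least squares relating a high-dimensional expression
# matrix to group membership (ER+ vs ER-), then ranking genes by PLS weights.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(8)
n_samples, n_genes = 60, 500
expr = rng.normal(size=(n_samples, n_genes))
group = rng.binomial(1, 0.5, n_samples)              # 0 = ER+, 1 = ER- (hypothetical)
expr[:, :20] += group[:, None] * 1.5                 # first 20 genes truly differ

pls = PLSRegression(n_components=2)
pls.fit(expr, group)
gene_scores = np.abs(pls.x_weights_[:, 0])           # weight on the first component
print("top-ranked genes:", np.argsort(gene_scores)[::-1][:10])
```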
Multivariate meta-analysis using individual participant data
Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.
2016-01-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment–covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. PMID:26099484
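The bootstrap idea for within-study correlations can be sketched for one trial with two continuous outcomes: resample participants, re-estimate both treatment effects, and correlate the replicates. Simulated data with hypothetical effect sizes:

```python
# Hedged sketch: bootstrapping one trial's IPD to estimate the within-study
# correlation between two treatment-effect estimates.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 300
treat = rng.binomial(1, 0.5, n)
sbp = -8.0 * treat + rng.normal(0, 15, n)             # continuous outcome 1
dbp = -4.0 * treat + 0.4 * sbp + rng.normal(0, 8, n)  # correlated outcome 2
X = sm.add_constant(treat)

effects = []
for _ in range(1000):
    idx = rng.integers(0, n, n)                       # resample participants
    b1 = sm.OLS(sbp[idx], X[idx]).fit().params[1]
    b2 = sm.OLS(dbp[idx], X[idx]).fit().params[1]
    effects.append((b1, b2))
effects = np.array(effects)
print("within-study correlation:", round(np.corrcoef(effects.T)[0, 1], 2))
```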
Ha, Diep H; Spencer, A John; Thomson, W Murray; Scott, Jane A; Do, Loc G
2018-04-01
Objective The association between and commonality of risk factors for poor self-rated oral health (SROH) and general health (SRGH) among new mothers has not been reported. The purpose of this paper is to assess the commonality of risk factors for poor SROH and SRGH, and self-reported obesity and dental pain, among a population-based sample of new mothers in Australia. It also investigated health conditions affecting new mothers' general health. Methods Data collected at baseline of a population-based birth cohort were used. Mothers of newborns in Adelaide were approached to participate. Mothers completed a questionnaire collecting data on socioeconomic status (SES), health behaviours, dental pain, SROH, self-reported height and weight, and SRGH. Analysis was conducted sequentially from bivariate to multivariable regression to estimate the prevalence rate (PR) of reporting poor/fair SROH and SRGH. Results Of the 1895 new mothers, 21% and 6% rated their SROH and SRGH as poor/fair, respectively. Dental pain was associated with low income and smoking status, while being obese was associated with low SES, low education and infrequent tooth brushing. SROH and SRGH were both associated with low SES, smoking, and dental pain. SROH was also associated with SRGH [PR: 3.06 (2.42-3.88)]. Conclusion for practice There was a commonality of factors associated with self-rated oral health and general health. Strong associations between OH and GH were also observed. Given the importance of maternal health for future generations, there would be long-term societal benefit from addressing common risk factors for OH and GH in integrated programs.
Drivers of metacommunity structure diverge for common and rare Amazonian tree species.
Bispo, Polyanna da Conceição; Balzter, Heiko; Malhi, Yadvinder; Slik, J W Ferry; Dos Santos, João Roberto; Rennó, Camilo Daleles; Espírito-Santo, Fernando D; Aragão, Luiz E O C; Ximenes, Arimatéa C; Bispo, Pitágoras da Conceição
2017-01-01
We analysed the flora of 46 forest inventory plots (25 m x 100 m) in old growth forests from the Amazonian region to identify the role of environmental (topographic) and spatial variables (obtained using PCNM, Principal Coordinates of Neighbourhood Matrix analysis) for common and rare species. For the analyses, we used multiple partial regression to partition the specific effects of the topographic and spatial variables on the univariate data (standardised richness, total abundance and total biomass) and partial RDA (Redundancy Analysis) to partition these effects on composition (multivariate data) based on incidence, abundance and biomass. The different attributes (richness, abundance, biomass and composition based on incidence, abundance and biomass) used to study this metacommunity responded differently to environmental and spatial processes. Considering standardised richness, total abundance (univariate) and composition based on biomass, the results for common species differed from those obtained for all species. On the other hand, for total biomass (univariate) and for compositions based on incidence and abundance, there was a correspondence between the data obtained for the total community and for common species. Our data also show that in general, environmental and/or spatial components are important to explain the variability in tree communities for total and common species. However, with the exception of the total abundance, the environmental and spatial variables measured were insufficient to explain the attributes of the communities of rare species. These results indicate that predicting the attributes of rare tree species communities based on environmental and spatial variables is a substantial challenge. As the spatial component was relevant for several community attributes, our results demonstrate the importance of using a metacommunities approach when attempting to understand the main ecological processes underlying the diversity of tropical forest communities.
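A hedged sketch of the variance-partitioning logic for one univariate attribute: adjusted R² from the environmental-only, spatial-only, and combined regressions yields the pure and shared fractions. Simulated stand-in data, not the forest-plot dataset:

```python
# Hedged sketch: partitioning variance in a community attribute between an
# environmental (topographic) block and a spatial (PCNM-like) block of predictors.
import numpy as np
from sklearn.linear_model import LinearRegression

def adj_r2(X, y):
    n, p = X.shape
    r2 = LinearRegression().fit(X, y).score(X, y)
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

rng = np.random.default_rng(10)
n = 46
env = rng.normal(size=(n, 3))                        # topographic variables
spa = rng.normal(size=(n, 4)) + 0.5 * env[:, :1]     # spatial filters, partly shared
richness = 1.0 * env[:, 0] + 0.8 * spa[:, 0] + rng.normal(scale=0.5, size=n)

r_env = adj_r2(env, richness)
r_spa = adj_r2(spa, richness)
r_all = adj_r2(np.hstack([env, spa]), richness)

pure_env = r_all - r_spa                             # environment alone
pure_spa = r_all - r_env                             # space alone
shared = r_env + r_spa - r_all                       # jointly explained
print(f"pure env {pure_env:.2f}, pure spatial {pure_spa:.2f}, "
      f"shared {shared:.2f}, unexplained {1 - r_all:.2f}")
```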
Saleh, F; Renno, W; Klepacek, I; Ibrahim, G; Dashti, H; Asfar, S; Behbehani, A; Al-Sayer, H; Dashti, A; Kerry, Crotty
2005-01-01
To develop an effective pharmaceutical treatment for a disease, we need to fully understand the biological behavior of that disease, especially when dealing with cancer. The current available treatment for cancer may help in lessening the burden of the disease or, on certain occasions, in increasing the survival of the patient. However, a total eradication of cancer remains the researchers' hope. Some of the discoveries in the field of medicine relied on observations of natural events. Among these events is the spontaneous regression of cancer. It has been argued that such regression could be immunologically-mediated, but no direct evidence has been shown to support such an argument. We, hereby, provide compelling evidence that spontaneous cancer regression in humans is immunologically-mediated, hoping that the results from this study would stimulate the pharmaceutical industry to focus more on cancer vaccine immunotherapy. Our results showed that patients with >3 primary melanomas (very rare group among cancer patients) develop significant histopathological spontaneous regression of further melanomas that they could acquire during their life (P=0.0080) as compared to patients with single primary melanoma where the phenomenon of spontaneous regression is absent or minimal. It seems that such regression resulted from the repeated exposure to the tumor which mimics a self-immunization process. Analysis of the regressing tumors revealed heavy infiltration by T lymphocytes as compared to non-regressing tumors (P<0.0001), the predominant of which were T cytotoxic rather than T helper. Mature dendritic cells were also found in significant number (P<0.0001) in the regressing tumors as compared to the non regressing ones, which demonstrate an active involvement of the different arms of the immune system in the multiple primary melanoma patients in the process of tumor regression. Also, MHC expression was significantly higher in the regressing versus the non-regressing tumors (P <0.0001), which reflects a proper tumor antigen expression. Associated with tumor regression was also loss of the melanoma common tumor antigen Melan A/ MART-1 in the multiple primary melanoma patients as compared to the single primary ones (P=0.0041). Furthermore, loss of Melan A/ MART-1 in the regressing tumors significantly correlated with the presence of Melan A/ MART-1-specific CTLs in the peripheral blood of these patients (P=0.03), which adds to the evidence that the phenomenon of regression seen in these patients was immunologically-mediated and tumor-specific. Such correlation was also seen in another rare group of melanoma patients, namely those with occult primary melanoma. The lesson that we could learn from nature in this study is that inducing cancer regression using the different arms of the immune system is possible. Also, developing a novel cancer vaccine is not out of reach.
Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity
Kraha, Amanda; Turner, Heather; Nimon, Kim; Zientek, Linda Reichwein; Henson, Robin K.
2012-01-01
While multicollinearity may increase the difficulty of interpreting multiple regression (MR) results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret MR effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret MR effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses. PMID:22457655
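For the two-predictor case, the commonality coefficients mentioned above follow directly from the R² values of the full and single-predictor models. A minimal sketch with simulated correlated predictors:

```python
# Hedged sketch: commonality analysis with two predictors, decomposing the
# regression R^2 into two unique components and one common component.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(11)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)        # correlated predictors
y = 0.5 * x1 + 0.4 * x2 + rng.normal(size=n)

def r2(*cols):
    X = np.column_stack(cols)
    return LinearRegression().fit(X, y).score(X, y)

r2_full, r2_x1, r2_x2 = r2(x1, x2), r2(x1), r2(x2)
unique_x1 = r2_full - r2_x2
unique_x2 = r2_full - r2_x1
common = r2_x1 + r2_x2 - r2_full
print(f"R2 = {r2_full:.3f}; unique(x1) = {unique_x1:.3f}, "
      f"unique(x2) = {unique_x2:.3f}, common = {common:.3f}")
```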
Effects of spatial disturbance on common loon nest site selection and territory success
McCarthy, K.P.; DeStefano, S.
2011-01-01
The common loon (Gavia immer) breeds during the summer on northern lakes and water bodies that are also often desirable areas for aquatic recreation and human habitation. In northern New England, we assessed how the spatial nature of disturbance affects common loon nest site selection and territory success. We found through classification and regression analysis that distance to and density of disturbance factors can be used to classify observed nest site locations versus random points, suggesting that these factors affect loon nest site selection (model 1: Correct classification = 75%, null = 50%, K = 0.507, P < 0.001; model 2: Correct classification = 78%, null = 50%, K = 0.551, P < 0.001). However, in an exploratory analysis, we were unable to show a relation between spatial disturbance variables and breeding success (P = 0.595, R² = 0.436), possibly because breeding success was so low during the breeding seasons of 2007-2008. We suggest that by selecting nest site locations that avoid disturbance factors, loons thereby limit the effect that disturbance will have on their breeding success. Still, disturbance may force loons to use sub-optimal nesting habitat, limiting the available number of territories, and overall productivity. We advise that management efforts focus on limiting disturbance factors to allow breeding pairs access to the best nesting territories, relieving disturbance pressures that may force sub-optimal nest placement. © 2011 The Wildlife Society.
A review of population data utilization in beef cattle research.
Jones, R; Langemeier, M
2010-04-01
Controlled experimentation has been the most common source of research data in most biological sciences. However, many research questions lend themselves to the use of population data, or combinations of population data and data resulting from controlled experimentation. Studies of important economic outcomes, such as efficiency, profits, and costs, lend themselves particularly well to this type of analysis. Analytical methods that have been most commonly applied to population data in studies related to livestock production and management include statistical regression and mathematical programming. In social sciences, such as applied economics, it has become common to utilize more than one method in the same study to provide answers to the various questions at hand. Of course, care must be taken to ensure that the methods of analysis are appropriately applied; however, a wide variety of beef industry research questions are being addressed using population data. Issues related to data sources, aggregation levels, and consistency of collection often surface when using population data. These issues are addressed by careful consideration of the questions being addressed and the costs of data collection. Previous research across a variety of cattle production and marketing issues provides a broad foundation upon which to build future research. There is tremendous opportunity for increased use of population data and increased collaboration across disciplines to address issues of importance to the cattle industry.
Multivariate regression model for predicting lumber grade volumes of northern red oak sawlogs
Daniel A. Yaussy; Robert L. Brisbin
1983-01-01
A multivariate regression model was developed to predict green board-foot yields for the seven common factory lumber grades processed from northern red oak (Quercus rubra L.) factory grade logs. The model uses the standard log measurements of grade, scaling diameter, length, and percent defect. It was validated with an independent data set. The model...
Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs
Andrew F. Howard; Daniel A. Yaussy
1986-01-01
A multivariate regression model was developed to predict green board-foot yields for the common grades of factory lumber processed from yellow birch factory-grade logs. The model incorporates the standard log measurements of scaling diameter, length, proportion of scalable defects, and the assigned USDA Forest Service log grade. Differences in yields between band and...
Sample Size Determination for Regression Models Using Monte Carlo Methods in R
ERIC Educational Resources Information Center
Beaujean, A. Alexander
2014-01-01
A common question asked by researchers using regression models is, What sample size is needed for my study? While there are formulae to estimate sample sizes, their assumptions are often not met in the collected data. A more realistic approach to sample size determination requires more information such as the model of interest, strength of the…
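The article works in R; a comparable hedged sketch in Python simulates data under an assumed effect size, refits the regression repeatedly, and reports empirical power for the coefficient of interest at several candidate sample sizes:

```python
# Hedged sketch (Python stand-in for the article's R workflow): Monte Carlo
# power for a single regression coefficient across candidate sample sizes.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)

def power(n, beta=0.25, n_sims=500, alpha=0.05):
    hits = 0
    for _ in range(n_sims):
        x1, x2 = rng.normal(size=n), rng.normal(size=n)
        y = beta * x1 + 0.3 * x2 + rng.normal(size=n)
        res = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
        hits += res.pvalues[1] < alpha          # test of the x1 coefficient
    return hits / n_sims

for n in (50, 100, 150, 200):
    print(n, round(power(n), 2))                # pick the smallest n near 0.80
```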
Sierra/SolidMechanics 4.46 Example Problems Manual.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Plews, Julia A.; Crane, Nathan K; de Frias, Gabriel Jose
Presented in this document are tests that exist in the Sierra/SolidMechanics example problem suite, which is a subset of the Sierra/SM regression and performance test suite. These examples showcase common and advanced code capabilities. A wide variety of other regression and verification tests exist in the Sierra/SM test suite that are not included in this manual.
Applicability of digital analysis and imaging technology in neuropathology assessment.
Dunn, William D; Gearing, Marla; Park, Yuna; Zhang, Lifan; Hanfelt, John; Glass, Jonathan D; Gutman, David A
2016-06-01
Alzheimer's disease (AD) is a progressive neurological disorder that affects more than 30 million people worldwide. While various dementia-related losses in cognitive functioning are its hallmark clinical symptoms, ultimate diagnosis is based on manual neuropathological assessments using various schemas, including Braak staging, CERAD (Consortium to Establish a Registry for Alzheimer's Disease) and Thal phase scoring. Since these scoring systems are based on subjective assessment, there is inevitably some degree of variation between readers, which could affect ultimate neuropathology diagnosis. Here, we report a pilot study investigating the applicability of computer-driven image analysis for characterizing neuropathological features, as well as its potential to supplement or even replace manually derived ratings commonly performed in medical settings. In this work, we quantitatively measured amyloid beta (Aβ) plaque in various brain regions from 34 patients using a robust digital quantification algorithm. We next verified these digitally derived measures to the manually derived pathology ratings using correlation and ordinal logistic regression methods, while also investigating the association with other AD-related neuropathology scoring schema commonly used at autopsy, such as Braak and CERAD. In addition to successfully verifying our digital measurements of Aβ plaques with respective categorical measurements, we found significant correlations with most AD-related scoring schemas. Our results demonstrate the potential for digital analysis to be adapted to more complex staining procedures commonly used in neuropathological diagnosis. As the efficiency of scanning and digital analysis of histology images increases, we believe that the basis of our semi-automatic approach may better standardize quantification of neuropathological changes and AD diagnosis, ultimately leading to a more comprehensive understanding of neurological disorders and more efficient patient care. © 2015 Japanese Society of Neuropathology.
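A hedged sketch of the ordinal-logistic step, relating a continuous digitally derived plaque burden to an ordinal manual rating, using statsmodels' OrderedModel on simulated data with hypothetical category labels:

```python
# Hedged sketch: ordinal logistic regression linking a continuous digital plaque
# burden measure to an ordinal manual rating (none/sparse/moderate/frequent).
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(13)
n = 200
digital_burden = rng.gamma(shape=2.0, scale=1.0, size=n)   # e.g., % area occupied by plaques
latent = 1.2 * digital_burden + rng.logistic(size=n)
rating = pd.Series(pd.cut(latent, bins=[-np.inf, 1.5, 3.0, 4.5, np.inf],
                          labels=["none", "sparse", "moderate", "frequent"]))

model = OrderedModel(rating, digital_burden[:, None], distr="logit")
res = model.fit(method="bfgs", disp=False)
print(res.summary())    # slope for the digital measure plus threshold cut-points
```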